
[DRAFT] Display VTT transcripts in audio/video players#7418

Draft
eltiffster wants to merge 24 commits into main from av-transcripts

Conversation

@eltiffster
Contributor

Summary

Allow users to select and display WebVTT files as transcripts/captions for audio/video files.

Guidance for testing, such as acceptance criteria or new user interface behaviors:

  • Create a work with an audio/video representative file set and a VTT file set.
  • Once both files have been uploaded, go to the edit page for the audio/video file set. There should be a form field titled "Transcripts" that lists your VTT file as an option. Select your VTT file and click Save.
  • If IIIF/AV Support is not enabled (via the Features Dashboard):
    • Go to a file set or work page
    • Click the 3 dots in the audio/video player and click "Captions" to enable captions.
  • If IIIF/AV Support is enabled, you'll need to first install the Clover IIIF viewer since UV doesn't support captions at time of writing:
    • In your webapp (.dassie or /app/samvera/hyrax-webapp in Docker), run `rails generate hyrax:iiif_viewer clover` to install it.
    • NOTE: If you previously installed Clover, you'll need to run `rails destroy hyrax:iiif_viewer clover`, then `rails generate hyrax:iiif_viewer clover` to reinstall fresh copies of the viewer files. The clover.js file was altered to fix this issue and an extra class was added to a div in clover.html.erb.
    • Once Clover is installed and the transcript has been saved to the audio/video file, go to the work page. You should be able to display the captions by clicking the 3 dots in the audio/video player toolbar and enabling them. If you click on the "Annotations" tab, you should also see the interactive transcript with timestamps (example pictured below).
[Screenshot: the interactive transcript in Clover's Annotations tab, side-by-side with the video player]

Type of change (for release notes)

  • notes-major (I think) due to needing to install a new gem for converting language field values into a language code readable by the IIIF viewer

Detailed Description

This is a continuation of work started before/during the March 2026 Community Sprint. More context/discussion on implementation was recorded on the Sprint Board.

After uploading VTT file(s) to a work with an audio/video (AV) file, users can choose to use the VTT file as the subtitles/captions file for the corresponding audio/video. This is done by editing the AV file set, selecting the VTT file by title, and saving the AV file set (see screenshot below). Under the hood, the VTT file set ids are saved to the transcript_ids attribute of the AV file and indexed as transcript_ids_ssim in the AV file's Solr document.

The transcript(s) form is populated by a Solr query that searches for "sibling" file sets (i.e. file sets of the same parent work as the AV file set) with a text/vtt mime type. Currently, text/vtt is the only accepted mime type, but other mime types could be added in future. Users can select multiple VTT files per audio/video file. This approach also supports a nested work structure where a child work has a different transcript than other child works or its parent work.
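The sibling lookup described above can be sketched in plain Ruby against in-memory stand-ins for Solr documents. This is only an illustration, not the query used in this PR: the `sibling_transcripts` helper is hypothetical, and the field names `member_ids_ssim`, `mime_type_ssi`, and `title_tesim` are assumptions about how the documents are indexed.

```ruby
# Illustrative only: hashes stand in for Solr documents, and the field names
# (member_ids_ssim, mime_type_ssi, title_tesim) are assumptions.
ACCEPTED_TRANSCRIPT_MIME_TYPES = ['text/vtt'].freeze

# Hypothetical helper: returns the documents of file sets that share a parent
# work with the given AV file set and have an accepted transcript mime type.
def sibling_transcripts(av_file_set_id, docs)
  # The parent work is the document whose member list contains the AV file set.
  parent = docs.find { |doc| Array(doc['member_ids_ssim']).include?(av_file_set_id) }
  return [] if parent.nil?

  sibling_ids = parent['member_ids_ssim'] - [av_file_set_id]
  docs.select do |doc|
    sibling_ids.include?(doc['id']) &&
      ACCEPTED_TRANSCRIPT_MIME_TYPES.include?(doc['mime_type_ssi'])
  end
end

docs = [
  { 'id' => 'work1', 'member_ids_ssim' => %w[av1 vtt1 img1] },
  { 'id' => 'av1',   'mime_type_ssi' => 'video/mp4' },
  { 'id' => 'vtt1',  'mime_type_ssi' => 'text/vtt', 'title_tesim' => ['English captions'] },
  { 'id' => 'img1',  'mime_type_ssi' => 'image/png' },
  { 'id' => 'work2', 'member_ids_ssim' => %w[vtt2] },
  { 'id' => 'vtt2',  'mime_type_ssi' => 'text/vtt' }
]

# Build [label, value] pairs, as a select field in a form might expect.
options = sibling_transcripts('av1', docs).map do |doc|
  [Array(doc['title_tesim']).first || doc['id'], doc['id']]
end
```

Because the lookup starts from the AV file set's own parent, the VTT file attached to `work2` is not offered, which matches the nested-work behaviour described above. Supporting another mime type would only mean extending the accepted list.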

[Screenshot: captions_edit_interface]

Selected transcripts are displayed via a <track> element in default audio/video partials. When using a IIIF AV viewer, the transcript is displayed in a IIIF manifest via an annotation, following the pattern of this IIIF cookbook recipe:

"annotations": [
  {
    "type": "AnnotationPage",
    "items": [
      {
        "type": "Annotation",
        "motivation": "supplementing",
        "body": {
          "id": "http://localhost:3000/transcripts/file_id.vtt",
          "type": "Text",
          "format": "text/vtt",
          "label": {
            "none": ["title_or_label.vtt"]
          }
        },
        "id": "http://localhost:3000/concern/generic_works/id/manifest/canvas/id",
        "target": "http://localhost:3000/concern/generic_works/id/manifest/canvas/id"
      }
    ]
  }
]
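
As a rough sketch of how a structure like the one above could be assembled, the following plain-Ruby helper builds one "supplementing" annotation per transcript. The helper name, its arguments, and the example URLs are hypothetical and do not reflect the manifest builder code in this PR; the annotation's own "id" is left out to keep the sketch short.

```ruby
require 'json'

# Hypothetical helper: builds an AnnotationPage containing one "supplementing"
# annotation per transcript. Names and URLs are illustrative assumptions.
def transcript_annotation_page(canvas_id:, transcripts:)
  items = transcripts.map do |transcript|
    {
      'type' => 'Annotation',
      'motivation' => 'supplementing',
      'body' => {
        'id' => transcript[:url],
        'type' => 'Text',
        'format' => 'text/vtt',
        'label' => { 'none' => [transcript[:label]] }
      },
      # Each transcript targets the canvas that holds the audio/video.
      'target' => canvas_id
    }
  end

  { 'type' => 'AnnotationPage', 'items' => items }
end

page = transcript_annotation_page(
  canvas_id: 'http://localhost:3000/concern/generic_works/id/manifest/canvas/id',
  transcripts: [
    { url: 'http://localhost:3000/transcripts/file_id.vtt', label: 'title_or_label.vtt' }
  ]
)

puts JSON.pretty_generate('annotations' => [page])
```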

Thanks to @kirkkwang and @trmccormick for their previous work on this!

Changes proposed in this pull request:

  • Add transcript_ids and language properties/attributes to ActiveFedora and Hyrax file set classes, their respective form classes, and their respective indexers
  • Add UI hints/guidance for transcript_ids and language form fields in the file set edit form
  • Pass a Hyrax::FileSetPresenter to render_media_display_partial instead of a Solr document (in app/views/hyrax/file_sets/edit.html.erb)
  • Add VTT transcripts as annotations to the IIIF manifest and sort them by language/locale by @kirkkwang
  • Add homepage property to IIIF manifest by @kirkkwang
  • Add route and controller for serving transcripts to IIIF AV viewers by @kirkkwang
  • Add LanguageList gem for parsing 2-letter language code from a Solr document's language field
  • Add <track> elements to app/views/hyrax/file_sets/media_display/_video.html.erb and app/views/hyrax/file_sets/media_display/_audio.html.erb: initial idea and technical plan by @trmccormick
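
For readers unfamiliar with the `<track>` approach mentioned in the last item, here is a minimal standalone ERB sketch. It is not the markup from the partials in this PR: the template, the `kind="captions"` attribute, the paths, and the `srclang`/`label` values are all assumptions chosen for illustration.

```ruby
require 'erb'

# Illustrative template only; attribute choices and paths are assumptions.
TRACK_TEMPLATE = ERB.new(<<~HTML, trim_mode: '-')
  <video controls>
    <source src="<%= ERB::Util.html_escape(video_src) %>" type="video/mp4">
  <%- transcripts.each do |t| -%>
    <track kind="captions" src="<%= ERB::Util.html_escape(t[:src]) %>" srclang="<%= ERB::Util.html_escape(t[:srclang]) %>" label="<%= ERB::Util.html_escape(t[:label]) %>">
  <%- end -%>
  </video>
HTML

# One <track> is rendered per selected transcript; values are HTML-escaped.
html = TRACK_TEMPLATE.result_with_hash(
  video_src: '/downloads/av1',
  transcripts: [
    { src: '/downloads/vtt1', srclang: 'en', label: 'English' },
    { src: '/downloads/vtt2', srclang: 'fr', label: 'French "draft"' }
  ]
)

puts html
```

Browsers generally use `srclang` and `label` to populate the captions menu, which is why a short language code per transcript is useful here.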

Possible Future Work

  • Localization/translation for UI hints for the language and transcript fields
  • The file set edit form simply renders the views/hyrax/file_sets/_form.html.erb partial for ActiveFedora file sets, while the Valkyrie version uses the hydra-editor gem, which renders partials in views/records/edit_fields. Is there a reason for this difference? Should the form be refactored?
  • Ideally, Clover should be easier to configure, maybe with a JSON file like with Universal Viewer. This would require using a JS framework to modify it and rebuild the clover.js file.

@samvera/hyrax-code-reviewers

…notations tab of Clover IIIF viewer.

"l" is supposed to refer to the <ItemStyled> object in Annotation (https://github.com/samvera-labs/clover-iiif/blob/main/src/components/Viewer/InformationPanel/Annotation/Item.styled.tsx). In the minified version of this file, this should actually be uppercase L, not lowercase l.

`l("span",{style:{backgroundImage` was changed to `L("span",{style:{backgroundImage`
`return l(n9,{dir:P,"data-format"` was changed to `return L(n9,{dir:P,"data-format"`

There appears to be an unnecessary call to `l()` in the following switch/case statement:
```
case "text/vtt":
  return l(HQ, {
    inlineCues: k,
    label: A,
    vttUri: ((D = y[0]) == null ? void 0 : D.id) || void 0
  });
```

In fact, `return l(HQ,{inlineCues:k` can be changed to `return HQ({inlineCues:k` since HQ is a function in the minified file.
…ion for other file types seems unfinished and doesn't work.
…f the solr document. Do not make language a required field for a file set.
… Remove tests for code that was already tested/covered elsewhere.
@github-actions

github-actions bot commented Apr 17, 2026

Test Results

    17 files  ±  0      17 suites  ±0   3h 27m 35s ⏱️ - 1m 42s
 7 252 tests + 44   6 937 ✅ + 36  306 💤 ±0   9 ❌ + 8 
24 296 runs  +153  23 682 ✅ +131  593 💤 +2  21 ❌ +20 

For more details on these failures, see this check.

Results for commit b7dfbac. ± Comparison against base commit b5f81f9.

This pull request removes 448 and adds 492 tests. Note that renamed tests count towards both.
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplate:0x00007f0c96e8a8d8>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplate:0x00007f17e6051cc0>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplate:0x00007f8a2461d4a0>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplate:0x00007faae0fe4b00>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplateAccess:0x00007f0c96e97948>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplateAccess:0x00007f17e63653d0>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplateAccess:0x00007f8a2462aa10>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplateAccess:0x00007faae10d2e40>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to destroy AdminSet: bf1c01fd-8c4d-426c-bf5e-f1490c26bb6d
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to destroy Hyrax::AdministrativeSet: 16edf10c-dfb0-4c30-8e0a-4d459dbc1ef5
…
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplate:0x00007f1d8e2b12d8>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplate:0x00007f60f1167458>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplate:0x00007fa7f1513ad8>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplate:0x00007fc8a90f1cd8>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplateAccess:0x00007f1d8e3cfb60>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplateAccess:0x00007f60f1174388>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplateAccess:0x00007fa7f1652598>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplateAccess:0x00007fc8a8f7e4c8>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to destroy AdminSet: cdc1b11a-cd1f-4bf5-9083-a07c0ac511f6
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to destroy Hyrax::AdministrativeSet: 06683831-b048-4adf-8919-4b2b6414f1fe
…

♻️ This comment has been updated with latest results.

Development

Successfully merging this pull request may close these issues.

Annotations tab fails in Vanilla JS
