[DRAFT] Display VTT transcripts in audio/video players by eltiffster · Pull Request #7418 · samvera/hyrax

eltiffster · 2026-04-17T18:23:07Z

Summary

Allow users to select and display WebVTT files as transcripts/captions for audio/video files.

Guidance for testing, such as acceptance criteria or new user interface behaviors:

Create a work with an audio/video representative file set and a VTT file set.
Once both files have been uploaded, go to the edit page for the audio/video file set. There should be a form field titled "Transcripts" that lists your VTT file as an option. Select your VTT file and click Save.
If IIIF/AV Support is not enabled (via the Features Dashboard):
- Go to a file set or work page
- Click the 3 dots in the audio/video player and click "Captions" to enable captions.
If IIIF/AV Support is enabled, you'll need to first install the Clover IIIF viewer since UV doesn't support captions at time of writing:
- In your webapp (.dassie or /app/samvera/hyrax-webapp in Docker), run `rails generate hyrax:iiif_viewer clover` to install it.
- NOTE: If you previously installed Clover, you'll need to run `rails destroy hyrax:iiif_viewer clover`, then `rails generate hyrax:iiif_viewer clover`, to reinstall fresh copies of the viewer files. The clover.js file was altered to fix this issue and an extra class was added to a div in clover.html.erb.
- Once Clover is installed and the transcript has been saved to the audio/video file, go to the work page. You should be able to display the captions by clicking the 3 dots in the audio/video player toolbar and enabling them. If you click on the "Annotations" tab, you should also see the interactive transcript with timestamps (example pictured below).

screenshot of the interactive transcript in Clover's Annotations tab, side-by-side with the video player

Type of change (for release notes)

notes-major (I think) due to needing to install a new gem for converting language field values into a language code readable by the IIIF viewer

Detailed Description

This is a continuation of work started before/during the March 2026 Community Sprint. More context/discussion on implementation was recorded on the Sprint Board.

After uploading VTT file(s) to a work with an audio/video (AV) file, users can choose to use the VTT file as the subtitles/captions file for the corresponding audio/video. This is done by editing the AV file set, selecting the VTT file by title, and saving the AV file set (see screenshot below). Under the hood, the VTT file set ids are saved to the transcript_ids attribute of the AV file and indexed as transcript_ids_ssim in the AV file's Solr document.
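
To make the indexing step above concrete, here is a minimal sketch of how transcript_ids might be mapped onto the transcript_ids_ssim Solr field. FileSetStub and to_solr_fields are hypothetical names used only for illustration; they are not Hyrax classes, and the indexers in this PR may be structured differently.

```ruby
# Hypothetical stand-in for an AV file set; NOT a Hyrax class.
FileSetStub = Struct.new(:id, :title, :transcript_ids, keyword_init: true)

# Builds the transcript-related portion of a Solr document.
# The `_ssim` suffix follows the Solr dynamic-field naming convention for
# stored, indexed, multivalued strings.
def to_solr_fields(file_set)
  # Tolerate nil, drop blank entries, and avoid indexing the same id twice.
  ids = Array(file_set.transcript_ids).map(&:to_s).reject(&:empty?).uniq
  { 'id' => file_set.id.to_s, 'transcript_ids_ssim' => ids }
end

av = FileSetStub.new(id: 'av-1', title: ['Interview'],
                     transcript_ids: ['vtt-en', 'vtt-fr', 'vtt-en'])
doc = to_solr_fields(av)
puts doc.inspect
```

Keeping the ids as plain strings in a multivalued field is what lets one AV file set point at several VTT file sets.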

The transcript(s) form is populated by a Solr query that searches for "sibling" file sets (i.e. file sets of the same parent work as the AV file set) with a text/vtt mime type. Currently, text/vtt is the only accepted mime type, but other mime types could be added in future. Users can select multiple VTT files per audio/video file. This approach also supports a nested work structure where a child work has a different transcript than other child works or its parent work.
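
The sibling lookup described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the query used in this PR: sibling_transcript_query is a hypothetical helper, it assumes the parent work's member ids are already known, and the field names mime_type_ssi, title_tesim and label_ssi are assumptions about the Solr schema.

```ruby
# Hypothetical helper: builds Solr params that select "sibling" file sets
# (other members of the same parent work) with an accepted mime type.
# Field names (mime_type_ssi, title_tesim, label_ssi) are ASSUMPTIONS.
def sibling_transcript_query(member_ids:, av_file_set_id:, mime_types: ['text/vtt'])
  # A file set is never its own transcript, so exclude the AV file set itself.
  sibling_ids = member_ids.map(&:to_s) - [av_file_set_id.to_s]
  return nil if sibling_ids.empty?

  mime_clause = mime_types.map { |m| %("#{m}") }.join(' OR ')
  {
    # Solr's terms query parser matches any of the comma-separated ids.
    q: "{!terms f=id}#{sibling_ids.join(',')}",
    fq: ["mime_type_ssi:(#{mime_clause})"],
    fl: 'id,title_tesim,label_ssi'
  }
end

params = sibling_transcript_query(member_ids: %w[av-1 vtt-1 vtt-2],
                                  av_file_set_id: 'av-1')
puts params.inspect
```

Passing the accepted mime types as an argument reflects the note above that other types could be added later without changing the shape of the query.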

Selected transcripts are displayed via a <track> element in default audio/video partials. When using a IIIF AV viewer, the transcript is displayed in a IIIF manifest via an annotation, following the pattern of this IIIF cookbook recipe:

"annotations": [
  {
    "type": "AnnotationPage",
    "items": [
      {
        "type": "Annotation",
        "motivation": "supplementing",
        "body": {
          "id": "http://localhost:3000/transcripts/file_id.vtt",
          "type": "Text",
          "format": "text/vtt",
          "label": {
            "none": ["title_or_label.vtt"]
          }
        },
        "id": "http://localhost:3000/concern/generic_works/id/manifest/canvas/id",
        "target": "http://localhost:3000/concern/generic_works/id/manifest/canvas/id"
      }
    ]
  }
]
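
For readers who want to see how a structure like the one above could be assembled, here is a small sketch in plain Ruby. transcript_annotation and annotation_page are hypothetical helper names, not methods from this PR or from a IIIF gem. The sketch mirrors the sample above, including its reuse of the canvas URL for both "id" and "target", and it treats the body's "language" as optional.

```ruby
require 'json'

# Hypothetical helper: builds one "supplementing" annotation for a VTT file.
# Mirrors the sample above, where "id" and "target" share the canvas URL.
def transcript_annotation(canvas_id:, transcript_url:, label:, language: nil)
  body = {
    'id' => transcript_url,
    'type' => 'Text',
    'format' => 'text/vtt',
    'label' => { 'none' => [label] }
  }
  # Only emit a language when one is known; it is treated as optional here.
  body['language'] = language if language

  {
    'type' => 'Annotation',
    'motivation' => 'supplementing',
    'body' => body,
    'id' => canvas_id,
    'target' => canvas_id
  }
end

# Hypothetical helper: wraps annotations in an AnnotationPage.
def annotation_page(annotations)
  { 'type' => 'AnnotationPage', 'items' => annotations }
end

canvas = 'http://localhost:3000/concern/generic_works/id/manifest/canvas/id'
page = annotation_page([
  transcript_annotation(canvas_id: canvas,
                        transcript_url: 'http://localhost:3000/transcripts/file_id.vtt',
                        label: 'title_or_label.vtt')
])
puts JSON.pretty_generate('annotations' => [page])
```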

Thanks to @kirkkwang and @trmccormick for their previous work on this!

Changes proposed in this pull request:

Add transcript_ids and language properties/attributes to ActiveFedora and Hyrax file set classes, their respective form classes, and their respective indexers
Add UI hints/guidance for transcript_ids and language form fields in the file set edit form
Pass a Hyrax::FileSetPresenter to render_media_display_partial instead of a Solr document (in app/views/hyrax/file_sets/edit.html.erb)
Add VTT transcripts as annotations to the IIIF manifest and sort them by language/locale by @kirkkwang
Add homepage property to IIIF manifest by @kirkkwang
Add route and controller for serving transcripts to IIIF AV viewers by @kirkkwang
Add LanguageList gem for parsing 2-letter language code from a Solr document's language field
Add <track> elements to app/views/hyrax/file_sets/media_display/_video.html.erb and app/views/hyrax/file_sets/media_display/_audio.html.erb: initial idea and technical plan by @trmccormick
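
As a rough illustration of the last two items, the sketch below normalizes a language value to a 2-letter code and renders a <track> element from it. Everything here is hypothetical: LANGUAGE_CODES is a tiny hand-written table standing in for a gem-backed lookup, and language_code and track_tag are illustrative names rather than helpers from this PR.

```ruby
# Tiny hypothetical lookup table standing in for a gem-backed language
# lookup; the real PR uses a gem for this step.
LANGUAGE_CODES = { 'english' => 'en', 'french' => 'fr', 'spanish' => 'es' }.freeze

# Returns a 2-letter code, or nil when the value is blank or unknown.
def language_code(value)
  text = value.to_s.strip.downcase
  return nil if text.empty?
  return text if text.match?(/\A[a-z]{2}\z/) # already looks like a 2-letter code

  LANGUAGE_CODES[text]
end

# Hypothetical helper: renders a <track> element for a transcript.
# String#encode(xml: :attr) escapes the value and wraps it in double quotes.
def track_tag(src:, label:, language: nil, kind: 'captions')
  attrs = { 'kind' => kind, 'src' => src, 'label' => label }
  code = language_code(language)
  attrs['srclang'] = code if code # omitted when no code can be determined
  rendered = attrs.map { |name, val| "#{name}=#{val.to_s.encode(xml: :attr)}" }
  "<track #{rendered.join(' ')}>"
end

puts track_tag(src: '/transcripts/abc.vtt', label: 'English captions',
               language: 'English')
```

Leaving srclang off when the language is unknown matches the decision in this PR not to make language a required field.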

Possible Future Work

Localization/translation for UI hints for the language and transcript fields
The file set edit form simply renders the views/hyrax/file_sets/_form.html.erb partial for ActiveFedora file sets, while the Valkyrie version uses the hydra-editor gem, which renders partials in views/records/edit_fields. Is there a reason for this difference? Should the form be refactored?
Ideally, Clover should be easier to configure, maybe with a JSON file like with Universal Viewer. This would require using a JS framework to modify it and rebuild the clover.js file.

@samvera/hyrax-code-reviewers

…notations tab of Clover IIIF viewer.

"l" is supposed to refer to the <ItemStyled> object in Annotation (https://github.com/samvera-labs/clover-iiif/blob/main/src/components/Viewer/InformationPanel/Annotation/Item.styled.tsx). In the minified version of this file, this should actually be uppercase L, not lowercase l.

`l("span",{style:{backgroundImage` was changed to `L("span",{style:{backgroundImage`
`return l(n9,{dir:P,"data-format"` was changed to `return L(n9,{dir:P,"data-format"`

There appears to be an unnecessary call to `l()` in the following switch/case statement:

```
case "text/vtt":
  return l(HQ, {
    inlineCues: k,
    label: A,
    vttUri: ((D = y[0]) == null ? void 0 : D.id) || void 0
  });
```

In fact, `return l(HQ,{inlineCues:k` can be changed to `return HQ({inlineCues:k` since HQ is a function in the minified file.

github-actions · 2026-04-17T19:08:35Z

Test Results

17 files ±0 | 17 suites ±0 | 3h 27m 35s ⏱️ -1m 42s
7 252 tests +44 | 6 937 ✅ +36 | 306 💤 ±0 | 9 ❌ +8
24 296 runs +153 | 23 682 ✅ +131 | 593 💤 +2 | 21 ❌ +20

For more details on these failures, see this check.

Results for commit b7dfbac. ± Comparison against base commit b5f81f9.

This pull request removes 448 and adds 492 tests. Note that renamed tests count towards both.

spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplate:0x00007f0c96e8a8d8>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplate:0x00007f17e6051cc0>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplate:0x00007f8a2461d4a0>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplate:0x00007faae0fe4b00>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplateAccess:0x00007f0c96e97948>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplateAccess:0x00007f17e63653d0>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplateAccess:0x00007f8a2462aa10>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplateAccess:0x00007faae10d2e40>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to destroy AdminSet: bf1c01fd-8c4d-426c-bf5e-f1490c26bb6d
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to destroy Hyrax::AdministrativeSet: 16edf10c-dfb0-4c30-8e0a-4d459dbc1ef5
…

spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplate:0x00007f1d8e2b12d8>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplate:0x00007f60f1167458>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplate:0x00007fa7f1513ad8>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplate:0x00007fc8a90f1cd8>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplateAccess:0x00007f1d8e3cfb60>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplateAccess:0x00007f60f1174388>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplateAccess:0x00007fa7f1652598>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplateAccess:0x00007fc8a8f7e4c8>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to destroy AdminSet: cdc1b11a-cd1f-4bf5-9083-a07c0ac511f6
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to destroy Hyrax::AdministrativeSet: 06683831-b048-4adf-8919-4b2b6414f1fe
…

♻️ This comment has been updated with latest results.

eltiffster added 22 commits March 27, 2026 16:38

Add transcript_ids property to file set model(s)

a35ae68

Add transcript_ids to file set form

9d63a58

Copy changes from @kirkkwang's draft PR, WIP: Implement Transcription…

6c55ba0

… Support in Hyrax #7380 https://github.com/samvera/hyrax/pull/7380/changes

Support transcriptions for ActiveFedora works

2a4df1d

Fix broken specs

ec91a73

Normalize VTT file set language values to 2-letter language codes.

b4a9419

Simplify iiif_manifest_presenter_spec#annotates_content

757101b

Add comments and rubocop fixes

6a5da1d

Add language field to file set form and move transcript ids form hint…

1d60f21

… into simple form

Add VTT transcripts to default audio/video partials

3da19f1

Fix "l is not a function" error for other file types. The implementat…

1371460

…ion for other file types seems unfinished and doesn't work.

Use the file set presenter to render the file set edit form instead o…

5ed7239

…f the solr document. Do not make language a required field for a file set.

Allow transcriptions_controller to serve file types other than VTT. F…

148273e

…ix some formatting and specs.

Fix syntax error in clover.js

3067885

Increase the height of Clover IIIF viewer so that thumbnails for work…

6dd9b6e

…s with multiple files are visible.

Rename "transcriptions" to "transcripts" for consistency

5319280

Account for ActiveTriples::Resource in a transcript's language field.…

26de256

… Remove tests for code that was already tested/covered elsewhere.

Remove unused fallback label, since a file set should always have a t…

ff06bb5

…itle or label

Make language optional again in IIIF manifest annotations, especially…

1a10000

… since it's not required by the viewer.

Delete a stray, unnecessary comment

d0074b0

Rename .valid_transcripts to .available_transcripts

8368731

eltiffster added 2 commits April 17, 2026 12:29

Update file_set_form_helper_spec.rb

7ff49e6

Merge branch 'main' into av-transcripts

b7dfbac

eltiffster wants to merge 24 commits into main from av-transcripts
