Asset processing: transcode non-H.264/H.265 video, normalize exotic image formats, render PDFs via pdf.js #2812

@vpetersson

Summary

Add a server-side processing step (Celery) that normalizes uploads into formats the existing viewer plays directly, plus client-side pdf.js so PDFs become a viewer-renderable asset. This broadens the set of files an end user can drop in and have play correctly, without expanding the viewer's runtime media stack.

Three workstreams in one PR (or three small PRs landed together):

  1. Video — probe on upload, transcode anything that isn't H.264/H.265 to H.264 + AAC in MP4.
  2. Image — convert HEIC, HEIF, TIFF to lossless WebP on upload.
  3. PDF — vendor pdf.js so PDFs upload as a webpage-style asset that paginates client-side at runtime.

Why now

Today Anthias accepts a narrow set of extensions (anthias_app/helpers.py:43, static/src/components/add-asset-modal/file-upload-utils.ts:10-50) and routes them straight to mpv/vlc or the Qt webview. Anything outside that list either errors at upload or fails silently at playback. The fix is small and bounded: a Celery normalization pass at upload time, mirroring the existing download_video_from_youtube async pattern (api/serializers/mixins.py:67).

Workstreams

1. Video normalization (Celery)

On every video upload, run a fast ffprobe to detect codec and container. The output of the probe decides whether the file passes through or is transcoded.

Directly playable (no transcode, just rename .original → final URI):

  • Container ∈ {mp4, mkv, mov, webm, ts, mpg/mpeg, flv, avi}
  • Video codec ∈ {h264, hevc}
  • Audio codec ∈ {aac, mp3, opus, vorbis, ac3, none}

Otherwise, transcode:

ffmpeg -threads 2 -i <input> -c:v libx264 -preset medium -crf 23 \
       -c:a aac -b:a 192k -movflags +faststart <out>.mp4

ffmpeg is already a runtime dependency for lib.utils.get_video_duration and the YouTube ingest path. No new system packages.
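
The pass-through/transcode decision above could be sketched as follows. This is a minimal illustration, not the proposed implementation: the function names (probe, needs_transcode, transcode_command) are hypothetical, and the PLAYABLE_FORMATS tokens are an assumption about what ffprobe reports in format_name for the containers listed (mp4/mov and mkv/webm each share a demuxer).

```python
import json
import subprocess

# Codec names as ffprobe reports them in each stream's `codec_name`.
PLAYABLE_VIDEO = {"h264", "hevc"}
PLAYABLE_AUDIO = {"aac", "mp3", "opus", "vorbis", "ac3"}
# Assumption: tokens ffprobe emits in `format_name` for the containers above.
PLAYABLE_FORMATS = {"mp4", "mov", "matroska", "webm", "mpegts", "mpeg", "flv", "avi"}


def probe(path: str) -> dict:
    """Run ffprobe and return its JSON report (container format plus streams)."""
    out = subprocess.check_output(
        ["ffprobe", "-v", "error", "-print_format", "json",
         "-show_format", "-show_streams", path]
    )
    return json.loads(out)


def needs_transcode(report: dict) -> bool:
    """Return True unless container, video codec and audio codec are all playable."""
    formats = set(report.get("format", {}).get("format_name", "").split(","))
    if not formats & PLAYABLE_FORMATS:
        return True
    streams = report.get("streams", [])
    video = [s for s in streams if s.get("codec_type") == "video"]
    audio = [s for s in streams if s.get("codec_type") == "audio"]
    # Assumption: a "video" upload with no video stream is not passed through.
    if not video:
        return True
    if any(s.get("codec_name") not in PLAYABLE_VIDEO for s in video):
        return True
    # An empty audio list (the "none" case) is allowed.
    return any(s.get("codec_name") not in PLAYABLE_AUDIO for s in audio)


def transcode_command(src: str, dst: str) -> list[str]:
    """Build the ffmpeg argv from the command line quoted above."""
    return [
        "ffmpeg", "-threads", "2", "-i", src,
        "-c:v", "libx264", "-preset", "medium", "-crf", "23",
        "-c:a", "aac", "-b:a", "192k", "-movflags", "+faststart", dst,
    ]
```

Keeping needs_transcode a pure function over the probe report means the unit tests can feed it canned JSON without needing ffprobe or real media fixtures.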

Implementation:

  • New Celery task normalize_video_asset(asset_id) in celery_tasks.py.
  • Upload path sets Asset.is_processing=True and enqueues the task.
  • On success: update Asset.uri, set Asset.duration via get_video_duration, flip is_processing=False, write metadata['original_ext'] + metadata['transcoded'].
  • On failure: write metadata['error_message'] and reset is_processing=False (no stuck rows).
  • time_limit=1800 (30 min ceiling).
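
The success/failure contract in the list above can be illustrated with a small sketch. AssetStub and run_normalization are hypothetical names standing in for the Django Asset row and the body of the Celery task; the real task would load and save the model and carry the time_limit=1800 option.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AssetStub:
    """Stand-in for the Asset row; only the fields this sketch touches."""
    uri: str
    is_processing: bool = True
    metadata: dict = field(default_factory=dict)


def run_normalization(asset: AssetStub, convert: Callable[[str], str]) -> AssetStub:
    """Run `convert` and always leave the row with is_processing=False.

    `convert` takes the current URI and returns the final URI (unchanged
    for a pass-through); it raises on failure.
    """
    original_ext = asset.uri.rsplit(".", 1)[-1].lower()
    try:
        new_uri = convert(asset.uri)
        asset.metadata["original_ext"] = original_ext
        asset.metadata["transcoded"] = new_uri != asset.uri
        asset.uri = new_uri
    except Exception as exc:
        # Failure: record why, keep the original URI untouched.
        asset.metadata["error_message"] = str(exc)
    finally:
        # Runs on both paths, so no row stays stuck in processing.
        asset.is_processing = False
    return asset
```

One caveat worth checking during implementation: Celery's hard time_limit terminates the worker process, so a finally block like this would not run in that case. Pairing it with a slightly lower soft_time_limit (which raises SoftTimeLimitExceeded inside the task) is one way to keep the no-stuck-rows guarantee.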

2. Image normalization (Celery)

Convert HEIC, HEIF, and TIFF uploads to lossless WebP so the Qt webview only ever sees a format it can render. Lossless WebP keeps source fidelity, preserves alpha (which HEIC/TIFF can carry and JPEG would drop), is typically smaller than PNG for photographic content, and is already accepted by Anthias's upload list (static/src/components/add-asset-modal/file-upload-utils.ts:13), so the viewer/Qt webview path is already exercised.

Toolchain:

  • Pillow (already used elsewhere in the stack); WebP support comes from the libwebp that Pillow links against on Debian/Pi.
  • pillow-heif (Python, ARM- and x86-friendly libheif wrapper).
  • libheif1 (apt, ~1 MB) — runtime dep for pillow-heif.

Implementation:

  • New Celery task normalize_image_asset(asset_id):
    Image.open(<input>).convert('RGBA').save(<out>.webp, 'WEBP', lossless=True)
    (Use RGBA rather than RGB so transparency is preserved when present.)
  • Upload path sets is_processing=True for {heic, heif, tif, tiff} and enqueues the task.
  • On success: set Asset.uri to the .webp path, Asset.mimetype='image', metadata['original_ext']=ext, is_processing=False.
  • Failure path mirrors video.
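
A minimal sketch of the conversion step, assuming Pillow with WebP support is installed. The function name to_lossless_webp is hypothetical. pillow-heif has to register its opener with Pillow before Image.open can read HEIC/HEIF, so the sketch does that at import time and falls back to TIFF-only if the package is absent.

```python
from pathlib import Path

from PIL import Image

try:
    # pillow-heif plugs a HEIC/HEIF decoder into Pillow's Image.open().
    from pillow_heif import register_heif_opener

    register_heif_opener()
except ImportError:
    # Without pillow-heif, TIFF still converts; HEIC/HEIF inputs fail to open.
    pass


def to_lossless_webp(src: str) -> str:
    """Convert one image to lossless WebP next to the source; return the new path."""
    dst = str(Path(src).with_suffix(".webp"))
    with Image.open(src) as img:
        # RGBA keeps the alpha channel when the source has one.
        img.convert("RGBA").save(dst, "WEBP", lossless=True)
    return dst
```

Note that this converts only the first frame of a multi-page TIFF; whether multi-page TIFFs need separate handling is left open here.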

3. PDF rendering (pdf.js)

PDFs become a webpage-type asset that loads a self-hosted pdf.js page. The viewer's existing view_webpage branch handles runtime display; no viewer-side code changes.

Toolchain:

  • pdfjs-dist (Mozilla PDF.js), pinned to current LTS major.
  • Vendored at build time under static/vendor/pdfjs/, served by WhiteNoise per anthias_django/settings.py.

Implementation:

  • New Django route GET /docs/<asset_id>/pdf/?page_duration=<s> in a new anthias_app/document_views.py that streams a small HTML shell. The shell loads pdf.js, opens /anthias_assets/<asset_id>.pdf, renders page-by-page on a <canvas>, advances every page_duration seconds, and loops.
  • Page count read at upload time via a small Celery task that uses pdfjs-dist under Node. Asset.duration = pages * page_duration_s.
  • Encrypted PDFs are rejected at upload with a clear error.
  • Per-asset page_duration is editable; default comes from a new device setting default_document_page_duration (10s).
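
The duration arithmetic and the viewer URL from the list above, as a small sketch. The helper names are hypothetical; the 10-second default mirrors the proposed default_document_page_duration setting.

```python
from urllib.parse import urlencode

# Mirrors the proposed device setting default_document_page_duration.
DEFAULT_PAGE_DURATION_S = 10


def document_duration(pages: int, page_duration_s: int = DEFAULT_PAGE_DURATION_S) -> int:
    """Asset.duration for a PDF: one full pass through every page, in seconds."""
    if pages < 1 or page_duration_s < 1:
        raise ValueError("pages and page_duration_s must both be >= 1")
    return pages * page_duration_s


def pdf_viewer_url(asset_id: str, page_duration_s: int = DEFAULT_PAGE_DURATION_S) -> str:
    """Build the URI the webpage-type asset would point at."""
    query = urlencode({"page_duration": page_duration_s})
    return f"/docs/{asset_id}/pdf/?{query}"
```

For example, a 12-page PDF at the default setting gives document_duration(12) == 120, so the asset occupies two minutes of the playlist per loop.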

Schema groundwork

Add metadata = JSONField(default=dict, blank=True) to the Asset model (anthias_app/models.py:11). New migration. Used by all three workstreams to carry original_ext, transcoded, error_message, and document_pages without inflating the schema.

Expose metadata on AssetSerializerV2 (read+write) in api/serializers/v2.py:22. v1.x serializers expose it read-only for back-compat.

Resource constraints

The Celery worker runs on the same device as the viewer (per CLAUDE.md, the worker container even shares the rootfs with the server). A naive ffmpeg transcode will pin all four cores on a Pi 4/5 and starve the viewer mid-playback. Two layers of throttling:

  1. Lower the worker's process priority at launch — set CPU and I/O priority on the Celery worker itself so every subprocess inherits it. Update the anthias-celery service command in docker-compose.yml.tmpl:

    nice -n 19 ionice -c 3 celery -A celery_tasks worker ...
    

    nice -n 19 is the lowest CPU priority (highest niceness); ionice -c 3 is idle-class I/O priority. Both are no-ops when the system is idle and only kick in under contention — so background sweeps still finish fast on an idle device while never disrupting active playback.

  2. Cap ffmpeg thread count — pass -threads 2 to the transcode invocation so two cores stay free for the viewer on Pi 4/5 (and on Pi 3, ffmpeg will already be slow enough that single-threaded vs multi-threaded matters less than the priority cap).

Pillow's HEIC/TIFF → WebP conversion is light (typically sub-second per image) and inherits the worker's nice level, so no per-call thread cap is needed.
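
The inheritance claim is easy to check on a dev box. The sketch below (hypothetical helper, Linux with the coreutils nice binary assumed) launches a stand-in "worker" under nice, has it spawn a child the way a task would spawn ffmpeg, and reports the niceness the child ends up with.

```python
import subprocess
import sys

# Innermost process: report its own niceness (os.nice(0) changes nothing
# and returns the current value).
REPORT = "import os; print(os.nice(0))"

# Middle process: stands in for the Celery worker and spawns a subprocess,
# as a task would spawn ffmpeg.
WORKER = (
    "import subprocess, sys; "
    f"print(subprocess.check_output([sys.executable, '-c', {REPORT!r}], text=True).strip())"
)


def subprocess_niceness(level: int = 19) -> int:
    """Niceness seen by a subprocess of a worker started with `nice -n level`."""
    out = subprocess.check_output(
        ["nice", "-n", str(level), sys.executable, "-c", WORKER], text=True
    )
    return int(out.strip())
```

On Linux, subprocess_niceness(19) should report 19, since nice adds to the current value and clamps at the maximum. The same check does not cover ionice, which would need to be verified separately.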

Verification: Pi 5 smoke test — start a video playback, then upload a ~100 MB ProRes MOV. Watch for any playback stutter while the transcode runs in the background. Expectation: zero visible stutter.

Acceptance criteria

  • Uploading a ProRes MOV (or any non-H.264/H.265 source) yields a playable H.264 MP4 asset; an already-H.264 MP4 is a no-op rename with no re-encode.
  • Uploading a HEIC yields a playable lossless WebP asset; same for TIFF.
  • Uploading a multi-page PDF yields a webpage asset whose viewer-side display paginates at the configured page duration and loops.
  • Failure paths (corrupt video, encrypted PDF, broken HEIC) leave the row in a clear error state with metadata.error_message, never stuck in is_processing=True.
  • Unit tests under api/tests/test_assets.py cover each Celery task with deterministic fixtures (small ProRes clip, HEIC, 2-page PDF). Tests assert on output codec/extension and on the failure-state contract.
  • uv run pytest -m "not integration" passes; integration suite passes per CLAUDE.md.
  • New metadata field appears in the v2 OpenAPI schema (drf-spectacular).
  • Pi 5 smoke test: upload one video, one HEIC, one PDF — confirm playback in the live dev environment with no observable stutter while a transcode runs in the background.

Critical files

  • anthias_app/models.py: add metadata field, new migration
  • celery_tasks.py: normalize_video_asset, normalize_image_asset, process_pdf_asset
  • api/views/v2.py, api/views/mixins.py: upload path branches by extension; sets is_processing=True and enqueues the right task
  • api/serializers/v2.py: expose metadata; add default_document_page_duration to device settings
  • static/src/components/add-asset-modal/file-upload-utils.ts: accept heic, heif, tif, tiff, pdf
  • static/vendor/pdfjs/: vendored pdf.js
  • anthias_app/document_views.py (new): /docs/<id>/pdf/ route
  • pyproject.toml: pillow-heif
  • package.json: pdfjs-dist
  • docker/anthias-server.Dockerfile.j2: libheif1
  • docker-compose.yml.tmpl: wrap anthias-celery command with nice + ionice

Patterns to reuse

  • download_video_from_youtube (api/serializers/mixins.py:67) — model the async upload+convert flow after this: synchronous URI placeholder + is_processing=True + Celery task that flips is_processing=False.
  • get_video_duration (lib/utils.py) — ffprobe duration extraction; reuse for the transcoded MP4.
  • The <asset_id>.tmp rename pattern in celery_tasks.py:cleanup (.original suffix during conversion, atomic rename on success). The 1-hour mtime guard already handles in-flight files correctly.
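
The suffix-and-rename pattern in the last bullet could look roughly like this. It is a sketch under assumptions: convert_in_place is a hypothetical helper, the exact file naming (final path plus .original and .tmp) is illustrative, and deleting the .original file on success is a choice the issue does not specify.

```python
import os
from pathlib import Path
from typing import Callable


def convert_in_place(final_path: str, convert: Callable[[str, str], None]) -> None:
    """Convert `<final>.original` into `<final>` without exposing partial output.

    `convert(src, dst)` writes the converted file to `dst` and raises on failure.
    """
    original = final_path + ".original"
    tmp = final_path + ".tmp"
    try:
        convert(original, tmp)
        # os.replace is atomic on POSIX when tmp and final share a filesystem,
        # so the viewer sees either no file or the complete one.
        os.replace(tmp, final_path)
    except Exception:
        # Drop the partial output; the upload itself stays for inspection.
        Path(tmp).unlink(missing_ok=True)
        raise
    # Assumption: the source is no longer needed once the final file exists.
    Path(original).unlink(missing_ok=True)
```

For the pass-through case described under video normalization, the same helper works with a convert callable that simply copies or hard-links the source to the temporary path.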
