feat(moq-video): native per-platform H.264 codecs + zero-copy capture, drop ffmpeg #1691

Open: kixelated wants to merge 24 commits into dev from claude/confident-kapitsa-93e6f5

@kixelated

Collaborator

Why

moq-video linked ffmpeg-next for both encode and capture. Static-linking ffmpeg drops hardware encoders (they dlopen vendor driver libs); dynamic-linking forces a package per distro release (libav* soname churn). This replaces ffmpeg with native, per-platform codecs + capture so a single statically self-contained binary still reaches the GPU at runtime: one .deb/.rpm/brew bottle per arch.

Encoders

Encoders sit behind a Backend trait selected by encode::Kind (Auto/Hardware/Software/Named), with a linear hardware→software fallback. Every backend emits Annex-B (moq_mux avc3), so the producer and on-demand catalog logic are unchanged.

| Backend | Platform | Crate | Linking |
|---|---|---|---|
| openh264 | all (software fallback) | openh264 (vendored) | static, no runtime dep |
| VideoToolbox | macOS | hand-written on objc2-video-toolbox | system frameworks |
| NVENC | Linux NVIDIA (nvenc feature) | nvidia-video-codec-sdk | dlopen driver |
| VAAPI | Linux Intel/AMD (vaapi feature) | cros-codecs | links libva, dlopen driver |
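
To make the linear hardware→software fallback concrete, here is a minimal sketch of how a selector could walk the backends. It is not the PR's code: `Candidate`, `CANDIDATES`, `select`, the backend id strings, and the shape of `Kind::Named` are all assumptions made for illustration, and the `try_open` closure stands in for whatever "can this backend initialize?" check the real `Backend` trait performs.

```rust
/// Hypothetical mirror of `encode::Kind`; the `Named` payload type is an assumption.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Kind {
    Auto,
    Hardware,
    Software,
    Named(String),
}

/// Hypothetical description of one backend candidate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Candidate {
    pub id: &'static str,
    pub hardware: bool,
}

/// Candidates in preference order: hardware first, software last.
/// The id strings are illustrative, not the crate's real backend ids.
pub const CANDIDATES: &[Candidate] = &[
    Candidate { id: "videotoolbox", hardware: true },
    Candidate { id: "nvenc", hardware: true },
    Candidate { id: "vaapi", hardware: true },
    Candidate { id: "openh264", hardware: false },
];

/// Walk the list once, skipping candidates that `kind` rules out,
/// and return the id of the first candidate that `try_open` accepts.
pub fn select<F>(kind: &Kind, mut try_open: F) -> Option<&'static str>
where
    F: FnMut(&Candidate) -> bool,
{
    CANDIDATES
        .iter()
        .filter(|c| match kind {
            Kind::Auto => true,
            Kind::Hardware => c.hardware,
            Kind::Software => !c.hardware,
            Kind::Named(id) => c.id == id.as_str(),
        })
        .find(|c| try_open(*c))
        .map(|c| c.id)
}
```

With `Kind::Auto` and a `try_open` that rejects every hardware candidate, the walk ends at the software entry, which is the behavior the table's "software fallback" row describes.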

VideoToolbox converts its native AVCC + out-of-band SPS/PPS to Annex-B in its output callback, so it matches the other backends and doesn't disturb the advertise-before-camera-opens model.
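
The AVCC to Annex-B conversion can be sketched as follows. This is an illustration rather than the PR's implementation: the function name `avcc_to_annexb` is hypothetical, and it assumes 4-byte big-endian NAL length prefixes (the length size is actually declared in the stream's avcC configuration) and that the caller passes the out-of-band SPS/PPS only when they should be prepended, for example on keyframes.

```rust
/// Annex-B start code written before every NAL unit.
const START_CODE: [u8; 4] = [0, 0, 0, 1];

/// Convert one AVCC sample to Annex-B.
///
/// Assumes 4-byte big-endian length prefixes. `parameter_sets` (SPS, PPS)
/// are emitted first, each with its own start code.
/// Returns `None` if a length prefix is truncated or runs past the sample.
pub fn avcc_to_annexb(sample: &[u8], parameter_sets: &[&[u8]]) -> Option<Vec<u8>> {
    let mut out = Vec::with_capacity(sample.len() + 16);

    for ps in parameter_sets {
        out.extend_from_slice(&START_CODE);
        out.extend_from_slice(ps);
    }

    let mut rest = sample;
    while !rest.is_empty() {
        if rest.len() < 4 {
            return None; // truncated length prefix
        }
        let (len_bytes, tail) = rest.split_at(4);
        let len =
            u32::from_be_bytes([len_bytes[0], len_bytes[1], len_bytes[2], len_bytes[3]]) as usize;
        if len > tail.len() {
            return None; // NAL claims more bytes than remain
        }
        let (nal, next) = tail.split_at(len);
        // Replace the length prefix with a start code; the NAL payload is unchanged.
        out.extend_from_slice(&START_CODE);
        out.extend_from_slice(nal);
        rest = next;
    }

    Some(out)
}
```

Because the output already carries in-band parameter sets, a consumer sees the same byte layout it would get from the other backends.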

Capture + the Frame boundary (zero-copy)

Frame is now an enum: a platform GPU surface (macOS CVPixelBuffer, Linux dmabuf) for the zero-copy fast path, or CPU I420 for software. A hardware backend takes the surface as-is; a software backend downloads it to I420 only when needed.
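
A rough sketch of that boundary, under stated assumptions: the `Surface` and `I420` structs, the variant names, and the `into_i420` and `is_zero_copy` helpers are hypothetical stand-ins, not the PR's actual `Frame` definition. The point is only that the download is lazy and happens on the software path alone.

```rust
/// Hypothetical stand-in for a platform GPU surface handle
/// (a CVPixelBuffer on macOS, a dmabuf on Linux).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Surface {
    pub width: u32,
    pub height: u32,
}

/// Planar CPU I420 frame: Y plane, then U, then V.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct I420 {
    pub width: u32,
    pub height: u32,
    pub data: Vec<u8>,
}

/// Either a GPU surface (zero-copy fast path) or CPU pixels (software path).
#[derive(Debug)]
pub enum Frame {
    Surface(Surface),
    Cpu(I420),
}

impl Frame {
    /// True when a hardware backend could consume the frame without a copy.
    pub fn is_zero_copy(&self) -> bool {
        matches!(self, Frame::Surface(_))
    }

    /// Software path: run `download` only if the frame is still a GPU surface.
    pub fn into_i420(self, download: impl FnOnce(&Surface) -> I420) -> I420 {
        match self {
            Frame::Cpu(i420) => i420,
            Frame::Surface(surface) => download(&surface),
        }
    }
}
```

A hardware backend would match on `Frame::Surface` and pass the handle straight to the encoder; only the software backend would call something like `into_i420`.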

  • macOS: AVFoundation camera + ScreenCaptureKit screen capture both deliver IOSurface-backed CVPixelBuffers straight to VideoToolbox: no copy, no color conversion. capture::Source selects camera vs display; moq-cli gains --screen.
  • Other platforms: nokhwa camera (CPU RGBA → I420 via the yuv crate) until a native zero-copy path lands.

Verified vs not

Verified on macOS (just check): openh264 + VideoToolbox encode synthetic frames; a VideoToolbox test drives a real CVPixelBuffer through the zero-copy path and asserts a self-contained IDR (SPS+PPS+slice); the NV12→I420 surface download fallback; dimension/buffer guards. moq-cli --features capture and moq-boy build.
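
Two of the checks above can be sketched in plain Rust. These are illustrations, not the PR's tests: the function names are hypothetical, the NV12 conversion assumes even dimensions and tightly packed rows (stride equal to width), and the IDR check relies only on the standard H.264 NAL unit types (7 = SPS, 8 = PPS, 5 = IDR slice).

```rust
/// Convert tightly packed NV12 (Y plane + interleaved UV plane) to I420
/// (Y, U, V planes). Returns `None` for odd dimensions or a wrong buffer size.
pub fn nv12_to_i420(nv12: &[u8], width: usize, height: usize) -> Option<Vec<u8>> {
    if width % 2 != 0 || height % 2 != 0 {
        return None;
    }
    let y_len = width * height;
    let chroma_len = y_len / 4; // samples per chroma plane at 4:2:0
    if nv12.len() != y_len + 2 * chroma_len {
        return None;
    }
    let (y, uv) = nv12.split_at(y_len);
    let mut out = Vec::with_capacity(nv12.len());
    out.extend_from_slice(y);
    out.extend(uv.iter().step_by(2)); // U samples sit at even offsets
    out.extend(uv.iter().skip(1).step_by(2)); // V samples sit at odd offsets
    Some(out)
}

/// Return the NAL unit types found in an Annex-B buffer, in order.
/// Handles both 3-byte and 4-byte start codes.
pub fn nal_types(annexb: &[u8]) -> Vec<u8> {
    let mut types = Vec::new();
    let mut i = 0;
    while i + 3 <= annexb.len() {
        if annexb[i..i + 3] == [0u8, 0, 1] {
            if let Some(&header) = annexb.get(i + 3) {
                types.push(header & 0x1F); // low 5 bits hold nal_unit_type
            }
            i += 3;
        } else {
            i += 1;
        }
    }
    types
}

/// True if the buffer holds SPS and PPS ahead of an IDR slice.
pub fn is_self_contained_idr(annexb: &[u8]) -> bool {
    let types = nal_types(annexb);
    let pos = |ty: u8| types.iter().position(|&t| t == ty);
    match (pos(7), pos(8), pos(5)) {
        (Some(sps), Some(pps), Some(idr)) => sps < idr && pps < idr,
        _ => false,
    }
}
```

A decoder that joins mid-stream can start from such a frame without any out-of-band configuration, which is why the test asserts it.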

Compiled out on macOS, so a Linux box is needed to validate: NVENC, VAAPI, and the Linux dmabuf paths. They are written against the real crate APIs (names checked against source) but are not compiled or run here. AVFoundation + ScreenCaptureKit capture compile but can't run headless (camera/screen TCC).

Not implemented: the V4L2 dmabuf capture that feeds VAAPI (raw V4L2 ioctls; the v4l crate has no high-level dmabuf export). Until then Linux uses nokhwa+openh264 (CPU) with the vaapi feature off. Spec is in DESIGN-native-codecs.md.

Branch / breaking changes

Targets dev: removes ffmpeg-next, adds many deps, and makes breaking moq-video API changes (Error variants removed, capture::Config reshaped, encode_rgba requires matching dims, Kind::Named now means a backend id).
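
For readers wondering what "requires matching dims" could look like in practice, here is a small sketch of a dimension and buffer-length guard. It is hypothetical: `GuardError`, `check_rgba`, and the tuple-based signature are invented for illustration and are not the actual `encode_rgba` API or the crate's `Error` type.

```rust
/// Hypothetical error type for the guard below.
#[derive(Debug, PartialEq, Eq)]
pub enum GuardError {
    /// The frame's size differs from the size the encoder was configured with.
    Dimensions { expected: (u32, u32), got: (u32, u32) },
    /// The RGBA buffer is not exactly width * height * 4 bytes.
    BufferLen { expected: u64, got: u64 },
}

/// Reject a frame whose dimensions or buffer length do not match the encoder.
pub fn check_rgba(
    configured: (u32, u32),
    frame: (u32, u32),
    rgba: &[u8],
) -> Result<(), GuardError> {
    if frame != configured {
        return Err(GuardError::Dimensions { expected: configured, got: frame });
    }
    // Saturating math so absurd dimensions cannot overflow the comparison.
    let expected = u64::from(frame.0)
        .saturating_mul(u64::from(frame.1))
        .saturating_mul(4);
    let got = rgba.len() as u64;
    if got != expected {
        return Err(GuardError::BufferLen { expected, got });
    }
    Ok(())
}
```

Failing early like this keeps a mismatched frame from reaching a backend that would otherwise read past the buffer or silently rescale.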

Cross-package sync

  • doc/bin/cli.md is not updated here: the capture device-string semantics changed broadly (macOS is now an AVFoundation uniqueID) and --screen is new; worth a dedicated docs pass rather than a one-line tweak.

Test plan

  • just check on macOS (10 unit tests, clippy -D warnings, fmt)
  • moq-cli --features capture + moq-boy build
  • Linux: compile + run NVENC / VAAPI / V4L2 dmabuf capture (needs a GPU box)
  • macOS: live camera + screen capture publish (needs TCC grant)

🤖 Generated with Claude Code

(Written by Claude)

kixelated and others added 24 commits June 8, 2026 18:55
…talog through it (#1655)

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…1657)

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Claude <noreply@anthropic.com>
…iminants (#1661)

Co-authored-by: Claude <noreply@anthropic.com>
DecodeFailed was overloaded: it covered both client JWT/JWK decode errors
(genuinely 401) and unparseable auth-API response bodies. With the new
fail-closed mTLS path, a broken upstream /auth response would surface as a
client-credential 401 instead of a 502, diverging from how network and 5xx
failures are reported.

Split out AuthError::ApiInvalidResponse (#[from] serde_json::Error) for the
auth-API JSON parse failures in fetch_auth_api and fetch_public_response, and
map it to BAD_GATEWAY alongside ApiUnavailable. The JWT/JWK decode sites keep
DecodeFailed -> 401.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
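
The status split described in this commit message can be sketched as below. This is a simplified illustration: the variant names come from the message, but the real `ApiInvalidResponse` wraps a `serde_json::Error`, the real enum likely has more variants, and the `status` helper returning a bare `u16` is an assumption made to keep the example dependency-free.

```rust
/// Simplified stand-in for the auth error type; payloads are omitted.
#[derive(Debug, PartialEq, Eq)]
pub enum AuthError {
    /// Client JWT/JWK could not be decoded: the client's fault.
    DecodeFailed,
    /// The auth API answered with a body that could not be parsed: upstream's fault.
    ApiInvalidResponse,
    /// The auth API could not be reached or returned a 5xx.
    ApiUnavailable,
}

impl AuthError {
    /// HTTP status code reported to the client for this error.
    pub fn status(&self) -> u16 {
        match self {
            AuthError::DecodeFailed => 401, // Unauthorized
            AuthError::ApiInvalidResponse | AuthError::ApiUnavailable => 502, // Bad Gateway
        }
    }
}
```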
Commit 24d2560 was pushed to main by mistake: it captured an in-progress
moq-native Client::connect / reconnect refactor under an unrelated commit
message ("classify malformed auth-API JSON ...") and bypassed review. This
reverts those changes so the refactor can land properly via its own PR. The
original change is preserved in history at 24d2560 for re-use.

This reverts commit 24d2560.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ror (#1663)

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…1658)

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…and announce race (#1668)

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: moq-bot[bot] <186640430+moq-bot[bot]@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…, drop ffmpeg

Replace ffmpeg-next with native encoders and capture so a single statically
self-contained binary still reaches the GPU at runtime. This is the packaging
fix: one .deb/.rpm/brew per arch instead of one per distro release.

Encoders sit behind a `Backend` trait chosen by `encode::Kind` (Auto/Hardware/
Software/Named), falling back through hardware candidates to the software
fallback:

- openh264 (vendored, static): software fallback, all platforms.
- VideoToolbox (hand-written on objc2-video-toolbox): macOS hardware. Converts
  its native AVCC + out-of-band SPS/PPS to Annex-B so every backend emits the
  same wire format (moq_mux avc3), keeping the producer and on-demand catalog
  logic unchanged.
- NVENC (nvidia-video-codec-sdk): Linux NVIDIA, behind the `nvenc` feature.
- VAAPI (cros-codecs): Linux Intel/AMD, behind the `vaapi` feature. Imports a
  dmabuf as a VA surface (zero-copy).

Capture and the frame boundary are reworked for zero-copy. `Frame` is now an
enum: a platform GPU surface (macOS `CVPixelBuffer`, Linux dmabuf) for the
zero-copy fast path, or CPU I420 for the software path. A hardware backend takes
the surface as-is; a software backend downloads it to I420 only when needed.

- macOS: AVFoundation camera and ScreenCaptureKit screen capture both deliver
  IOSurface-backed CVPixelBuffers straight to VideoToolbox, no copy and no color
  conversion. `capture::Source` selects camera vs display; moq-cli gains
  `--screen`.
- Other platforms: nokhwa camera (CPU RGBA -> I420 via the yuv crate) until a
  native zero-copy path lands.

moq-cli `--features capture` keeps working; `--software` now means openh264.

Tested on macOS via just check: openh264 and VideoToolbox encode synthetic
frames (a VideoToolbox test drives a real CVPixelBuffer through the zero-copy
path and asserts a self-contained IDR), the NV12 surface download fallback, and
the dimension/buffer guards. NVENC, VAAPI, and the Linux V4L2 dmabuf capture are
Linux-only and compile-out on macOS; they are written against the real crate
APIs but need a Linux box to compile and validate. The V4L2 dmabuf capture that
feeds VAAPI is not yet implemented (raw V4L2 ioctls); the spec is in
DESIGN-native-codecs.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>