Add motion‑vector extraction API (MV‑only mode + frame types) for ML workflows #1218

@bhack

🚀 The feature

The idea is to expose compressed-domain motion vectors from VideoDecoder.

  • Add an API to fetch per‑frame motion vectors returned by FFmpeg via AV_FRAME_DATA_MOTION_VECTORS.
  • Support MV‑only mode that skips RGB decoding for higher throughput.
  • Optionally return frame type (I/P/B) alongside vectors.
  • NVDEC: if motion vectors are available via the CUDA/NVDEC path, consider parity with the CPU API; otherwise it can be CPU‑only initially.

Possible API shapes:

  • VideoDecoder.get_motion_vectors_at(indices=...) -> MotionVectorBatch
  • VideoDecoder.get_frames_at(..., with_motion_vectors=True)
  • VideoDecoder(..., decode_frames=False, return_motion_vectors=True)
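
One point the shapes above leave open is that each frame carries a different number of motion vectors (I-frames typically carry none), so a batch cannot be a dense per-frame tensor. The following is a minimal sketch of how a `MotionVectorBatch` could handle this with a flat row list plus per-frame offsets. Everything here is hypothetical: `MotionVectorBatch` as written, `pack_batch`, `vectors_for`, and the field names are illustrations for discussion, not existing TorchCodec API, and plain Python lists stand in for tensors.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

# Column order proposed in this issue (one row per motion vector).
MV_FIELDS = ("source", "w", "h", "src_x", "src_y",
             "dst_x", "dst_y", "motion_x", "motion_y", "motion_scale")
MVRow = Tuple[int, ...]


@dataclass
class MotionVectorBatch:
    """Hypothetical container; not an existing TorchCodec class."""
    rows: List[MVRow]    # vectors of all requested frames, concatenated
    offsets: List[int]   # frame i owns rows[offsets[i]:offsets[i + 1]]
    frame_types: Optional[List[str]] = None  # e.g. "I", "P", "B" per frame

    def __len__(self) -> int:
        # Number of frames in the batch, not number of vectors.
        return len(self.offsets) - 1

    def vectors_for(self, i: int) -> List[MVRow]:
        # Slice out the rows that belong to frame i (may be empty).
        return self.rows[self.offsets[i]:self.offsets[i + 1]]


def pack_batch(per_frame: Sequence[Sequence[MVRow]],
               frame_types: Optional[Sequence[str]] = None) -> MotionVectorBatch:
    """Flatten ragged per-frame vector lists into rows plus offsets."""
    if frame_types is not None and len(frame_types) != len(per_frame):
        raise ValueError("frame_types must have one entry per frame")
    rows: List[MVRow] = []
    offsets = [0]
    for frame_rows in per_frame:
        for row in frame_rows:
            if len(row) != len(MV_FIELDS):
                raise ValueError(f"expected {len(MV_FIELDS)} values per row")
            rows.append(tuple(row))
        offsets.append(len(rows))
    types = list(frame_types) if frame_types is not None else None
    return MotionVectorBatch(rows, offsets, types)
```

With this layout, `pack_batch([[], [mv_a], [mv_b, mv_c]], ["I", "P", "B"])` yields offsets `[0, 0, 1, 3]`, and `vectors_for(0)` is empty for the I-frame. A tensor-based version could keep the same idea with an (N x 10) tensor and an offsets tensor.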

Motion vector format could mirror the AVMotionVector fields as an (N x 10) int32 tensor on CPU, one row per vector, with columns:
source, w, h, src_x, src_y, dst_x, dst_y, motion_x, motion_y, motion_scale.
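
To make the row layout concrete, here is a minimal sketch that mirrors FFmpeg's `AVMotionVector` struct (from `libavutil/motion_vector.h`) with `ctypes` and converts a raw side-data buffer into rows of the ten proposed columns. The helper names `side_data_to_rows` and `displacement` are hypothetical, and the sketch assumes the buffer was produced on the same platform so that native struct alignment matches. Note that the struct also has a `uint64_t flags` field, which the proposed ten-column format omits; every remaining field fits in int32.

```python
import ctypes
from typing import List, Tuple


class AVMotionVector(ctypes.Structure):
    """Field-for-field mirror of FFmpeg's AVMotionVector struct."""
    _fields_ = [
        ("source", ctypes.c_int32),   # negative: past reference, positive: future
        ("w", ctypes.c_uint8),        # block width
        ("h", ctypes.c_uint8),        # block height
        ("src_x", ctypes.c_int16),
        ("src_y", ctypes.c_int16),
        ("dst_x", ctypes.c_int16),
        ("dst_y", ctypes.c_int16),
        ("flags", ctypes.c_uint64),   # present in FFmpeg, dropped from the rows
        ("motion_x", ctypes.c_int32),
        ("motion_y", ctypes.c_int32),
        ("motion_scale", ctypes.c_uint16),
    ]


# Column order proposed in this issue.
ROW_FIELDS = ("source", "w", "h", "src_x", "src_y",
              "dst_x", "dst_y", "motion_x", "motion_y", "motion_scale")


def side_data_to_rows(buf: bytes) -> List[List[int]]:
    """Hypothetical helper: raw side-data bytes -> list of 10-int rows."""
    size = ctypes.sizeof(AVMotionVector)
    if len(buf) % size != 0:
        raise ValueError("buffer length is not a multiple of the struct size")
    count = len(buf) // size
    if count == 0:
        return []
    mvs = (AVMotionVector * count).from_buffer_copy(buf)
    return [[getattr(mv, name) for name in ROW_FIELDS] for mv in mvs]


def displacement(row: List[int]) -> Tuple[float, float]:
    """Displacement in pixels; FFmpeg documents
    src_x = dst_x + motion_x / motion_scale (and likewise for y)."""
    motion_x, motion_y, motion_scale = row[7], row[8], row[9]
    return motion_x / motion_scale, motion_y / motion_scale
```

The `displacement` helper shows why `motion_scale` needs to travel with each row: `motion_x` and `motion_y` are stored in sub-pixel units, so a consumer such as a flow warm-start has to divide by the scale to get pixel offsets.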

Motivation, pitch

TorchCodec is a great fit for video ML pipelines, but it currently returns only decoded frames. Many models can benefit from compressed‑domain motion vectors for fast motion cues, tracking, or flow warm‑starts.
Exposing AV_FRAME_DATA_MOTION_VECTORS (and an MV‑only fast path) would enable high‑throughput training/inference without full optical flow.

Related work:
https://arxiv.org/html/2510.17427v1

Reference implementation:
https://github.com/LukasBommes/mv-extractor
