🚀 The feature
The idea is to expose compressed-domain motion vectors from VideoDecoder.
- Add an API to fetch per‑frame motion vectors returned by FFmpeg via AV_FRAME_DATA_MOTION_VECTORS.
- Support MV‑only mode that skips RGB decoding for higher throughput.
- Optionally return frame type (I/P/B) alongside vectors.
NVDEC: if motion vectors are available via the CUDA/NVDEC path, consider parity with the CPU API; otherwise it can be CPU‑only initially.
Possible API shapes:
VideoDecoder.get_motion_vectors_at(indices=...) -> MotionVectorBatch
VideoDecoder.get_frames_at(..., with_motion_vectors=True)
VideoDecoder(..., decode_frames=False, return_motion_vectors=True)
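To make the first shape more concrete, here is a minimal sketch of what a MotionVectorBatch container might hold. Everything in it is an assumption for illustration: TorchCodec has no such class today, and the field names (frame_indices, vectors, frame_types) are hypothetical. It uses plain Python lists instead of tensors so that it runs without dependencies.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class MotionVectorBatch:
    """Hypothetical return type for get_motion_vectors_at(); not an existing TorchCodec class."""

    # Frame indices that were requested, in request order.
    frame_indices: List[int]
    # One entry per frame; each entry is a list of 10-integer rows (one row per vector).
    vectors: List[List[Sequence[int]]]
    # Optional picture type per frame: "I", "P", or "B".
    frame_types: Optional[List[str]] = None

    def __post_init__(self) -> None:
        # Every requested frame needs exactly one (possibly empty) list of vectors.
        if len(self.vectors) != len(self.frame_indices):
            raise ValueError("vectors and frame_indices must have the same length")
        if self.frame_types is not None and len(self.frame_types) != len(self.frame_indices):
            raise ValueError("frame_types and frame_indices must have the same length")

    def __len__(self) -> int:
        return len(self.frame_indices)


# Illustrative data only: an intra-coded frame carries no motion vectors,
# so its entry is an empty list; the P-frame has a single made-up vector.
batch = MotionVectorBatch(
    frame_indices=[0, 1],
    vectors=[[], [(-1, 16, 16, 6, 8, 8, 8, -8, 0, 4)]],
    frame_types=["I", "P"],
)
```

Keeping the vectors as one variable-length entry per frame avoids padding, since the number of vectors differs from frame to frame. A real implementation would more likely return tensors.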
Motion vector format could mirror the AVMotionVector fields, returned as an N x 10 int32 tensor on CPU with one row per vector and these columns:
source, w, h, src_x, src_y, dst_x, dst_y, motion_x, motion_y, motion_scale.
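As a rough illustration of how a consumer might read such rows, the sketch below names the ten columns and recovers the displacement in pixels. It relies on the relation documented for AVMotionVector, src_x = dst_x + motion_x / motion_scale (and likewise for y). The names MV_COLUMNS, COL, and displacement are hypothetical helpers, not part of TorchCodec or FFmpeg, and the example row is made up. Note that FFmpeg's struct also has a flags field that this 10-column layout leaves out.

```python
# Column order of the proposed N x 10 layout (hypothetical helper names).
MV_COLUMNS = (
    "source", "w", "h",
    "src_x", "src_y", "dst_x", "dst_y",
    "motion_x", "motion_y", "motion_scale",
)
COL = {name: i for i, name in enumerate(MV_COLUMNS)}


def displacement(row):
    """Return (dx, dy) in pixels for one motion-vector row.

    motion_x and motion_y are stored in units of 1/motion_scale pixels,
    so dividing by motion_scale gives the displacement from dst to src.
    """
    scale = row[COL["motion_scale"]]
    if scale == 0:
        # Defensive guard against division by zero in malformed rows.
        return (0.0, 0.0)
    return (row[COL["motion_x"]] / scale, row[COL["motion_y"]] / scale)


# Made-up row: source=-1 (reference frame in the past), a 16x16 block,
# quarter-pixel precision (motion_scale=4).
row = (-1, 16, 16, 6, 8, 8, 8, -8, 0, 4)
dx, dy = displacement(row)

# Consistency check against the documented relation src = dst + motion / scale.
assert row[COL["dst_x"]] + dx == row[COL["src_x"]]
assert row[COL["dst_y"]] + dy == row[COL["src_y"]]
```

Storing everything as int32 widens the narrower native fields (for example w and h), which costs some memory but gives a single homogeneous tensor.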
Motivation, pitch
TorchCodec is a great fit for video ML pipelines, but it currently returns only decoded frames. Many models can benefit from compressed‑domain motion vectors for fast motion cues, tracking, or flow warm‑starts.
Exposing AV_FRAME_DATA_MOTION_VECTORS (and an MV‑only fast path) would enable high‑throughput training/inference without full optical flow.
Related work:
https://arxiv.org/html/2510.17427v1
Reference implementation:
https://github.com/LukasBommes/mv-extractor