ViMo — Video-Motion Face Liveness Detection

ViMo is a multimodal face liveness detection system for mobile devices. It combines IMU sensor data with video embeddings to distinguish genuine face presentations from spoofing attacks such as screen and cardboard spoofs.

The core idea: distill knowledge from a video encoder (teacher) into a lightweight IMU encoder (student) using cross-modal contrastive learning, so that at inference time only IMU data from the device's accelerometer and gyroscope is needed.

Architecture

Video encoder (teacher): InternVideo2.5 or VideoMAE
IMU encoder (student): Mantis-8M with a projection MLP head
Loss: COMODO — cross-modal distillation with an instance queue of negative samples and forward KL divergence
Evaluation metrics: AUC, EER, APCER/BPCER at multiple thresholds, TPR @ FPR=0
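
The distillation objective listed above can be sketched in a few lines. This is a dependency-free illustration, not the code in utils/loss.py. It assumes that the queue stores past video (teacher) embeddings, that both modalities are compared to the queue by cosine similarity, and that the loss is KL(teacher || student) over the resulting softmax distributions. All function names are hypothetical; the default temperatures and queue size mirror train.teacher_temp, train.student_temp, and train.queue_size.

```python
import math
from collections import deque


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def queue_distribution(embedding, queue, temperature):
    """Softmax over cosine similarities between one embedding and every queue entry."""
    e = l2_normalize(embedding)
    sims = [sum(a * b for a, b in zip(e, l2_normalize(q))) for q in queue]
    return softmax([s / temperature for s in sims])


def distillation_loss(video_emb, imu_emb, queue, teacher_temp=0.1, student_temp=0.05):
    """Forward KL(teacher || student) between the two queue distributions."""
    p_teacher = queue_distribution(video_emb, queue, teacher_temp)
    p_student = queue_distribution(imu_emb, queue, student_temp)
    return sum(
        pt * (math.log(pt) - math.log(ps))
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )


# Toy usage: a FIFO queue that evicts the oldest teacher embedding once full.
queue = deque([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]], maxlen=2048)
loss = distillation_loss(video_emb=[0.9, 0.1], imu_emb=[0.2, 0.8], queue=queue)
queue.append([0.9, 0.1])  # assumed update rule: enqueue the current teacher embedding
```

The student is pushed to rank the queue entries the same way the teacher does, so it never needs video input once training is finished.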

Project Structure

├── train_vimo.py               # Main ViMo training (IMU ↔ video distillation)
├── train_vimo_mmea.py          # ViMo distillation on the UESTC-MMEA-CL dataset
├── train_imu_encoder.py        # Standalone IMU encoder fine-tuning
├── train_imu_encoder_mmea.py   # IMU encoder training on MMEA dataset
├── train_eval_vimo_head.py     # Train & evaluate SVM head on IMU embeddings
├── eval_imu_encoder.py         # Evaluate fine-tuned IMU encoder
├── get_imu_embeds.py           # Extract IMU embeddings for downstream tasks
├── configs/                    # Training & evaluation configs (YAML)
├── utils/
│   ├── model_utils.py          # IMUEncoder, VideoEncoder, VideoEncoderMAE
│   ├── dataset_utils.py        # ViMoDataset, IMUDataset, S3 data loading
│   ├── loss.py                 # COMODOLoss
│   └── utils.py                # Metrics, plotting, evaluation helpers
└── tools/
    ├── imu_viz.py              # IMU trajectory visualization orchestrator
    ├── trajectory_calculator.py # Trajectory from quaternion + accelerometer
    ├── trajectory_visualizer.py # 3D trajectory rendering with video overlay
    └── web_motion_check.py     # Motion statistics analysis

Setup

Requires Python 3.12+. Install the dependencies with uv:

uv sync

Training

All training scripts use ClearML for experiment tracking, dataset management, and remote execution.

ViMo (cross-modal distillation)

uv run python train_vimo.py --config configs/vimo_train.yaml

Trains the IMU encoder to match video encoder embeddings using COMODO loss. Video embeddings are pre-computed and cached as .npy files.

ViMo on MMEA dataset

uv run python train_vimo_mmea.py --config configs/vimo_mmea_train.yaml --dataset_path /path/to/UESTC-MMEA-CL/

Trains ViMo distillation on the UESTC-MMEA-CL dataset. Expects train.txt and val.txt split files inside --dataset_path.

IMU encoder (standalone classification)

uv run python train_imu_encoder.py --config configs/imu_encoder_train.yaml

Fine-tunes Mantis-8M for binary genuine/spoof classification with BCE or cross-entropy loss.

ViMo head (SVM on embeddings)

uv run python train_eval_vimo_head.py --config configs/vimo_head_train_eval.yaml

Extracts IMU embeddings and trains an SVM classifier with grid search.

Evaluation

uv run python eval_imu_encoder.py --config configs/imu_encoder_eval.yaml

Reports AUC, EER, accuracy, APCER/BPCER curves, and a confusion matrix.
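
For readers unfamiliar with the presentation-attack metrics, the sketch below shows how APCER, BPCER, EER, and TPR @ FPR=0 can be computed from raw scores. It is an illustration under stated assumptions rather than the code in utils/utils.py: higher scores are taken to mean "more likely genuine", label 1 is genuine and 0 is spoof, genuine is treated as the positive class, and the EER is approximated by scanning the observed scores as thresholds. The function names are hypothetical.

```python
def apcer_bpcer(scores, labels, threshold):
    """APCER: share of spoofs accepted as genuine. BPCER: share of genuine samples rejected."""
    attacks = [s for s, y in zip(scores, labels) if y == 0]
    bona_fide = [s for s, y in zip(scores, labels) if y == 1]
    apcer = sum(s >= threshold for s in attacks) / len(attacks)
    bpcer = sum(s < threshold for s in bona_fide) / len(bona_fide)
    return apcer, bpcer


def equal_error_rate(scores, labels):
    """Return (EER, threshold) at the observed score where |APCER - BPCER| is smallest."""
    best = None
    for t in sorted(set(scores)):
        apcer, bpcer = apcer_bpcer(scores, labels, t)
        gap = abs(apcer - bpcer)
        if best is None or gap < best[0]:
            best = (gap, (apcer + bpcer) / 2, t)
    return best[1], best[2]


def tpr_at_zero_fpr(scores, labels):
    """Share of genuine samples scoring strictly above the highest-scoring spoof."""
    max_attack = max(s for s, y in zip(scores, labels) if y == 0)
    bona_fide = [s for s, y in zip(scores, labels) if y == 1]
    return sum(s > max_attack for s in bona_fide) / len(bona_fide)


# Toy usage with four genuine and four spoof samples.
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
eer, eer_threshold = equal_error_rate(scores, labels)
```

TPR @ FPR=0 is the strictest of these: it only counts genuine samples that would still pass if the threshold were raised until no spoof in the evaluation set is accepted.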

Configuration

Configs are in configs/ as YAML files. Key parameters:

| Parameter | Default | Description |
|---|---|---|
| `train.lr` | 3e-4 | Learning rate |
| `train.epochs` | 50 | Number of training epochs |
| `train.batch_size` | 128 | Batch size |
| `train.mlp_hidden_dim` | 2048 | Hidden dimension of the projection MLP |
| `train.mlp_output_dim` | 128 | Embedding dimension |
| `train.queue_size` | 2048 | COMODO instance queue size |
| `train.teacher_temp` | 0.1 | Teacher softmax temperature |
| `train.student_temp` | 0.05 | Student softmax temperature |
| `train.use_quat` | true | Use quaternion input (7 channels) instead of accelerometer + gyroscope only (6 channels) |
| `train.num_of_frames` | 16 | Video frames per sample |
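
The dotted parameter names suggest a nested layout with a top-level train section. The sketch below mirrors the defaults from the table as a nested dictionary and shows one way to read and override values by dotted key. The nesting, the DEFAULTS constant, and both helper functions are assumptions made for illustration; the repository's actual config loader may work differently.

```python
from copy import deepcopy

# Defaults copied from the table above; the nesting under "train" is assumed.
DEFAULTS = {
    "train": {
        "lr": 3e-4,
        "epochs": 50,
        "batch_size": 128,
        "mlp_hidden_dim": 2048,
        "mlp_output_dim": 128,
        "queue_size": 2048,
        "teacher_temp": 0.1,
        "student_temp": 0.05,
        "use_quat": True,
        "num_of_frames": 16,
    }
}


def get_param(config, dotted_key):
    """Look up a value such as 'train.lr' in a nested dictionary."""
    node = config
    for part in dotted_key.split("."):
        node = node[part]
    return node


def with_overrides(config, overrides):
    """Return a copy of config with dotted-key overrides applied."""
    merged = deepcopy(config)
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split(".")
        node = merged
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return merged


# Toy usage: halve the batch size without touching the defaults.
small_batch = with_overrides(DEFAULTS, {"train.batch_size": 64})
```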

Tools

Visualization and analysis utilities in tools/:

# IMU trajectory visualization
uv run python tools/imu_viz.py --imu_csv <path> --video <path> --config tools/config.yaml

Computes 3D trajectory from IMU data (Savitzky-Golay smoothing, gravity compensation, stationary detection) and renders it overlaid on video frames.
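
The gravity compensation and stationary detection steps can be sketched as follows. This is a simplified illustration, not the code in tools/trajectory_calculator.py. It assumes unit quaternions in (w, x, y, z) order that rotate device-frame vectors into a world frame with z pointing up, an accelerometer that reads +g on z when the device lies flat and still, and a plain magnitude threshold for stationary detection. Savitzky-Golay smoothing is omitted, and the function names and threshold value are hypothetical.

```python
import math

GRAVITY = 9.81  # m/s^2, assumed constant


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def rotate(q, v):
    """Rotate vector v by unit quaternion q = (w, x, y, z)."""
    w, x, y, z = q
    u = (x, y, z)
    uv = cross(u, v)
    uuv = cross(u, uv)
    return tuple(v[i] + 2.0 * (w * uv[i] + uuv[i]) for i in range(3))


def integrate_trajectory(quats, accels, dt, stationary_threshold=0.05):
    """Double-integrate gravity-compensated acceleration into positions."""
    velocity = [0.0, 0.0, 0.0]
    position = [0.0, 0.0, 0.0]
    trajectory = []
    for q, a_body in zip(quats, accels):
        a_world = rotate(q, a_body)
        # Gravity compensation: remove the constant pull along world z.
        linear = (a_world[0], a_world[1], a_world[2] - GRAVITY)
        if math.sqrt(sum(c * c for c in linear)) < stationary_threshold:
            velocity = [0.0, 0.0, 0.0]  # stationary: reset velocity to limit drift
        else:
            velocity = [v + c * dt for v, c in zip(velocity, linear)]
        position = [p + v * dt for p, v in zip(position, velocity)]
        trajectory.append(tuple(position))
    return trajectory


# Toy usage: a level device accelerating along x for one second at 10 Hz.
identity = (1.0, 0.0, 0.0, 0.0)
path = integrate_trajectory([identity] * 10, [(1.0, 0.0, GRAVITY)] * 10, dt=0.1)
```

Resetting the velocity whenever the device appears stationary is a common way to keep double-integration drift from accumulating between movements.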
