This repository contains a PyTorch implementation of "PerFRDiff: Personalised Weight Editing for Multiple Appropriate Facial Reaction Generation".
🛠️ Dependency Installation
We provide detailed instructions for setting up the environment using conda. First, create and activate a new environment:
conda create -n react python=3.10
conda activate react
Next, check your CUDA version:
nvidia-smi
Visit the official PyTorch website to get the installation command that matches your CUDA version. For example:
conda install pytorch==2.0.0 torchvision==0.15.0 torchaudio==2.0.0 pytorch-cuda=11.8 -c pytorch -c nvidia
Install the following dependencies:
conda install -c fvcore -c iopath -c conda-forge fvcore iopath
For CUDA versions older than 11.7, you will need to install the CUB library. There are two installation options:
Option A: Using conda (Recommended)
conda install -c bottler nvidiacub
Option B: Manual installation
1. Download the CUB library from NVIDIA CUB Releases.
2. Unpack it to a folder of your choice. For example, on Linux/Mac:
cd ~
mkdir CUB
cd CUB
curl -LO https://github.com/NVIDIA/cub/archive/2.1.0.tar.gz
tar xzf 2.1.0.tar.gz
3. Define the environment variable CUB_HOME so that it points to the directory containing CMakeLists.txt for CUB. Add this line to your ~/.bashrc:
export CUB_HOME=~/CUB/cub-2.1.0
To enable Jupyter notebook support, install Jupyter and register the environment:
conda install jupyter
python -m ipykernel install --user --name=react
To install PyTorch3D, first verify the CUDA version of your PyTorch build in Python:
import torch
print(torch.version.cuda)
Download the appropriate PyTorch3D package from Anaconda based on your Python, CUDA, and PyTorch versions. For example, for Python 3.10, CUDA 11.6, and PyTorch 1.12.0:
# linux-64_pytorch3d-0.7.5-py310_cu116_pyt1120.tar.bz2
conda install linux-64_pytorch3d-0.7.5-py310_cu116_pyt1120.tar.bz2
Install all remaining dependencies specified in requirements.txt:
pip install -r requirements.txt
📊 Dataset
Our work is built upon the REACT 2024 Multimodal Challenge Dataset, which leverages two well-established dyadic interaction datasets: NoXI and RECOLA. The dataset can be accessed through the official REACT 2024 Challenge Homepage.
After downloading the dataset, please rename your downloaded folder to data and place it in the root directory of this project.
Example directory structure:
data
├── test
├── val
├── train
    ├── Video_files
        ├── NoXI
            ├── 010_2016-03-25_Paris
                ├── Expert_video
                ├── Novice_video
                    ├── 1.mp4
                    ├── ....
            ├── ....
        ├── RECOLA
    ├── Audio_files
        ├── NoXI
        ├── RECOLA
            ├── group-1
                ├── P25
                ├── P26
                    ├── 1.wav
                    ├── ....
            ├── group-2
            ├── group-3
    ├── Emotion
        ├── NoXI
        ├── RECOLA
            ├── group-1
                ├── P1
                ├── P2
                    ├── 1.csv
                    ├── ....
            ├── group-2
            ├── group-3
    ├── 3D_FV_files
        ├── NoXI
        ├── RECOLA
            ├── group-1
                ├── P25
                ├── P26
                    ├── 1.npy
                    ├── ....
            ├── group-2
            ├── group-3
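Before training, it can help to confirm that the downloaded folder matches this layout. The sketch below is not part of the repository: the function name `missing_dataset_dirs` is hypothetical, and it assumes that every split (train, val, test) contains the four modality folders listed in the example, which the example itself only suggests.

```python
from pathlib import Path

# Folder names are taken from the example directory structure.
# ASSUMPTION: each split holds all four modality folders.
SPLITS = ("train", "val", "test")
MODALITIES = ("Video_files", "Audio_files", "Emotion", "3D_FV_files")


def missing_dataset_dirs(root="data"):
    """Return the expected split/modality folders that are absent under root."""
    root = Path(root)
    return [
        (Path(split) / modality).as_posix()
        for split in SPLITS
        for modality in MODALITIES
        if not (root / split / modality).is_dir()
    ]


if __name__ == "__main__":
    missing = missing_dataset_dirs("data")
    if missing:
        print("Missing folders under data/:")
        for path in missing:
            print("  " + path)
    else:
        print("Dataset layout looks complete.")
```

Run it from the project root. An empty result only means the top two folder levels exist; it does not validate the session folders or the individual files.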
📖 Usage
This project provides several pre-trained models, such as:
- Generic Appropriate Facial Reaction Generator (GAFRG)
- Personalized Weight Shifts Generation (PWSG) Block
- Personalized Style Space Learning (PSSL) Block
You can access and download all the available pre-trained models from the following Google Drive link. After downloading, please unzip the file and place the checkpoints folder into the root directory of this project.
Our framework leverages two key external tools:
- FaceVerse for extraction of 3DMM coefficients
- PIRender (3D-to-2D tool) for facial reaction frame rendering
For convenience, we have compiled all necessary model files into a single package, available via the Google Drive link. After downloading, please extract the external folder and place it in the root directory of this project. This package includes:
- FaceVerse model (Version 2) and auxiliary files (mean_face, std_face, and reference_full)
- Well-trained PIRender model
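The setup steps place three folders in the project root: data (the dataset), checkpoints (the pre-trained models), and external (the FaceVerse and PIRender files). As a minimal pre-flight sketch, assuming those three names and nothing about their contents, a check like the following can catch a misplaced download before a long run starts. The helper name `check_project_root` is hypothetical and not part of the repository.

```python
from pathlib import Path

# Folder names come from the setup instructions in this README.
REQUIRED_DIRS = ("data", "checkpoints", "external")


def check_project_root(root="."):
    """Map each required folder name to True if it exists under root."""
    root = Path(root)
    return {name: (root / name).is_dir() for name in REQUIRED_DIRS}


if __name__ == "__main__":
    for name, present in check_project_root().items():
        print(f"{name:12s} {'ok' if present else 'MISSING'}")
```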
# Training GAFRG for multiple appropriate facial reaction generation
python train_diffusion.py --mode train --writer True --config diffusion_model.yaml
# Training Personalised GAFRG (with Weight Editing) for multiple appropriate facial reaction generation
python train_rewrite_weight.py --mode train --writer True --config rewrite_weight.yaml
# Inference using GAFRG for multiple appropriate facial reaction generation
python evaluate_diffusion.py --mode test --config diffusion_model.yaml
# Inference using Personalised GAFRG (with Weight Editing) for multiple appropriate facial reaction generation
python evaluate_rewrite_weight.py --mode test --config rewrite_weight.yaml
Demo video: video_demo.mp4
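The four commands share one pattern: a script, a mode, and a config file, with --writer True passed only when training. For scripted runs, a small wrapper along these lines may be convenient. It is a sketch rather than part of the repository: `build_command` is a hypothetical helper, and it uses only the script names, flags, and config names shown in the commands.

```python
import sys

# (mode, personalised) -> (script, config), copied from the README commands.
SCRIPTS = {
    ("train", False): ("train_diffusion.py", "diffusion_model.yaml"),
    ("train", True): ("train_rewrite_weight.py", "rewrite_weight.yaml"),
    ("test", False): ("evaluate_diffusion.py", "diffusion_model.yaml"),
    ("test", True): ("evaluate_rewrite_weight.py", "rewrite_weight.yaml"),
}


def build_command(mode, personalised=False):
    """Return the argument list for one of the four README commands."""
    script, config = SCRIPTS[(mode, personalised)]
    cmd = [sys.executable, script, "--mode", mode]
    if mode == "train":
        # Only the training commands in the README pass --writer.
        cmd += ["--writer", "True"]
    return cmd + ["--config", config]


if __name__ == "__main__":
    # Print each command instead of executing it.
    for key in SCRIPTS:
        print(" ".join(build_command(*key)))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the project root to launch that stage.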
We extend our sincere gratitude to the following open-source projects:
@inproceedings{zhu2024perfrdiff,
title={{PerFRDiff}: Personalised Weight Editing for Multiple Appropriate Facial Reaction Generation},
author={Zhu, Hengde and Kong, Xiangyu and Xie, Weicheng and Huang, Xin and Shen, Linlin and Liu, Lu and Gunes, Hatice and Song, Siyang},
booktitle={Proceedings of the 32nd ACM International Conference on Multimedia},
pages={9495--9504},
year={2024}
}
