PerFRDiff: Personalised Weight Editing for Multiple Appropriate Facial Reaction Generation

This repository contains a PyTorch implementation of "PerFRDiff: Personalised Weight Editing for Multiple Appropriate Facial Reaction Generation".

🛠️ Dependency Installation

We provide detailed instructions for setting up the environment using conda. First, create and activate a new environment:

conda create -n react python=3.10
conda activate react

1. Install PyTorch

First, check your CUDA version:

nvidia-smi

Visit the official PyTorch website to get the installation command that matches your CUDA version. For example:

conda install pytorch==2.0.0 torchvision==0.15.0 torchaudio==2.0.0 pytorch-cuda=11.8 -c pytorch -c nvidia
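Note that the `CUDA Version` field in the `nvidia-smi` header is the highest CUDA version the installed driver supports, not the version of any installed toolkit. If you want to read it from a script, the sketch below shows one way to do it. The function names are our own and not part of this repository.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_driver_cuda_version(smi_output: str) -> Optional[str]:
    """Extract the 'CUDA Version' field from nvidia-smi's header text, or None."""
    match = re.search(r"CUDA Version:\s*(\d+\.\d+)", smi_output)
    return match.group(1) if match else None


def driver_cuda_version() -> Optional[str]:
    """Run nvidia-smi if it is on PATH; return None when it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi"], capture_output=True, text=True, check=False
    )
    return parse_driver_cuda_version(result.stdout)
```

Calling `driver_cuda_version()` on a machine without an NVIDIA driver returns `None`, so the check can also be used on CPU-only hosts.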

2. Install PyTorch3D Dependencies

Install the following dependencies:

conda install -c fvcore -c iopath -c conda-forge fvcore iopath

For CUDA versions older than 11.7, you will need to install the CUB library. There are two installation options:

Option A: Using conda (Recommended)

conda install -c bottler nvidiacub

Option B: Manual installation

  1. Download the CUB library from NVIDIA CUB Releases.
  2. Unpack it to a folder of your choice. For example, on Linux/Mac:
cd ~
mkdir -p CUB && cd CUB
curl -LO https://github.com/NVIDIA/cub/archive/2.1.0.tar.gz
tar xzf 2.1.0.tar.gz

  3. Define the environment variable CUB_HOME so that it points to the directory that contains CMakeLists.txt for CUB. Add this line to your ~/.bashrc:

export CUB_HOME=~/CUB/cub-2.1.0
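To decide whether the CUB step applies to your machine, you can compare your CUDA version against the 11.7 threshold mentioned above. The helper below is a minimal sketch; `needs_cub` is a hypothetical name and is not part of this repository.

```python
def needs_cub(cuda_version: str) -> bool:
    """Return True when the given CUDA version is older than 11.7."""
    parts = cuda_version.split(".")
    major = int(parts[0])
    # Treat a bare major version such as "11" as "11.0".
    minor = int(parts[1]) if len(parts) > 1 else 0
    return (major, minor) < (11, 7)
```

For example, `needs_cub("11.6")` is true, while `needs_cub("11.8")` is false, so the PyTorch example above (CUDA 11.8) would not need a separate CUB installation.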

To enable Jupyter notebook support, install Jupyter and register the environment:

conda install jupyter
python -m ipykernel install --user --name=react

3. Install PyTorch3D

First, check in Python which CUDA version your PyTorch build targets:

import torch
print(torch.version.cuda)

Download the PyTorch3D package from Anaconda that matches your Python, CUDA, and PyTorch versions; the file name encodes all three. For example, for Python 3.10, CUDA 11.6, and PyTorch 1.12.0:

# linux-64_pytorch3d-0.7.5-py310_cu116_pyt1120.tar.bz2
conda install linux-64_pytorch3d-0.7.5-py310_cu116_pyt1120.tar.bz2
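The example file name suggests the pattern `py<python>_cu<cuda>_pyt<pytorch>` with the dots removed. The sketch below builds that tag from version strings so you know which file to look for. The function names are hypothetical, and the default package version and platform are simply copied from the example above; whether a build exists for your combination still has to be checked on Anaconda.

```python
def pytorch3d_build_tag(python_version: str, cuda_version: str, torch_version: str) -> str:
    """Build a tag such as 'py310_cu116_pyt1120' from plain version strings."""
    py = "".join(python_version.split(".")[:2])          # "3.10.12" -> "310"
    cu = cuda_version.replace(".", "")                   # "11.6" -> "116"
    pyt = torch_version.split("+")[0].replace(".", "")   # "1.12.0+cu116" -> "1120"
    return f"py{py}_cu{cu}_pyt{pyt}"


def pytorch3d_package_name(tag: str, version: str = "0.7.5", platform: str = "linux-64") -> str:
    """Assemble the archive name following the pattern of the example above."""
    return f"{platform}_pytorch3d-{version}-{tag}.tar.bz2"
```

With a working PyTorch install, the inputs can be taken from `sys.version_info`, `torch.version.cuda`, and `torch.__version__`.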

4. Install Additional Dependencies

Install all remaining dependencies specified in requirements.txt:

pip install -r requirements.txt

📊 Dataset

Our work is built upon the REACT 2024 Multimodal Challenge Dataset, which leverages two well-established dyadic interaction datasets: NoXI and RECOLA. The dataset can be accessed through the official REACT 2024 Challenge Homepage.

After downloading the dataset, please rename your downloaded folder to data and place it in the root directory of this project.

Data Structure

Example directory structure:

data
├── test
├── val
└── train
    ├── Video_files
    │   ├── NoXI
    │   │   ├── 010_2016-03-25_Paris
    │   │   │   ├── Expert_video
    │   │   │   └── Novice_video
    │   │   │       ├── 1.mp4
    │   │   │       └── ....
    │   │   └── ....
    │   └── RECOLA
    ├── Audio_files
    │   ├── NoXI
    │   └── RECOLA
    │       ├── group-1
    │       │   ├── P25
    │       │   └── P26
    │       │       ├── 1.wav
    │       │       └── ....
    │       ├── group-2
    │       └── group-3
    ├── Emotion
    │   ├── NoXI
    │   └── RECOLA
    │       ├── group-1
    │       │   ├── P1
    │       │   └── P2
    │       │       ├── 1.csv
    │       │       └── ....
    │       ├── group-2
    │       └── group-3
    └── 3D_FV_files
        ├── NoXI
        └── RECOLA
            ├── group-1
            │   ├── P25
            │   └── P26
            │       ├── 1.npy
            │       └── ....
            ├── group-2
            └── group-3
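Before training, it can help to confirm that the folder layout matches the structure above. The following sketch lists any expected directories that are missing. It only checks the folders shown in the example tree; the assumption that `val` and `test` mirror `train` is ours, so deeper checks are applied to `train` only by default. The function and constant names are hypothetical.

```python
from pathlib import Path
from typing import Iterable, List

SPLITS = ("train", "val", "test")
MODALITIES = ("Video_files", "Audio_files", "Emotion", "3D_FV_files")
DATASETS = ("NoXI", "RECOLA")


def missing_data_dirs(root: str = "data", deep_splits: Iterable[str] = ("train",)) -> List[str]:
    """Return the expected directories that do not exist under `root`."""
    root_path = Path(root)
    # Every split folder should exist at the top level.
    missing = [str(root_path / s) for s in SPLITS if not (root_path / s).is_dir()]
    # For the selected splits, check each modality/dataset pair from the tree.
    for split in deep_splits:
        for modality in MODALITIES:
            for dataset in DATASETS:
                path = root_path / split / modality / dataset
                if not path.is_dir():
                    missing.append(str(path))
    return missing
```

An empty list means every folder from the example tree was found; pass `deep_splits=SPLITS` if your copy of the dataset uses the same sub-structure in all three splits.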

📖 Usage

Pre-trained Models

This project provides several pre-trained models, such as:

  • Generic Appropriate Facial Reaction Generator (GAFRG)
  • Personalized Weight Shifts Generation (PWSG) Block
  • Personalized Style Space Learning (PSSL) Block

You can download all available pre-trained models from the Google Drive link. After downloading, unzip the file and place the checkpoints folder in the root directory of this project.

External Dependencies

Our framework leverages two key external tools:

  • FaceVerse for extraction of 3DMM coefficients
  • PIRender (3D-to-2D tool) for facial reaction frame rendering

For convenience, we have compiled all necessary model files into a single package, available via the Google Drive link. After downloading, extract the external folder and place it in the root directory of this project. This package includes:

  • FaceVerse model (Version 2) and auxiliary files (mean_face, std_face, and reference_full)
  • Well-trained PIRender model

Training

# Training GAFRG for multiple appropriate facial reaction generation
python train_diffusion.py --mode train --writer True --config diffusion_model.yaml
# Training Personalised GAFRG (with Weight Editing) for multiple appropriate facial reaction generation
python train_rewrite_weight.py --mode train --writer True --config rewrite_weight.yaml

Inference

# Inference using GAFRG for multiple appropriate facial reaction generation
python evaluate_diffusion.py --mode test --config diffusion_model.yaml
# Inference using Personalised GAFRG (with Weight Editing) for multiple appropriate facial reaction generation
python evaluate_rewrite_weight.py --mode test --config rewrite_weight.yaml
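If you prefer to launch these steps from Python (for example, inside a job script), the commands above can be assembled programmatically. This is a minimal sketch: the script names, config names, and flags are taken from the commands above, while `build_command` and `run` are hypothetical helpers of our own.

```python
import subprocess
import sys
from typing import List

# (mode, personalised) -> (script, config), as listed in the commands above.
SCRIPTS = {
    ("train", False): ("train_diffusion.py", "diffusion_model.yaml"),
    ("train", True): ("train_rewrite_weight.py", "rewrite_weight.yaml"),
    ("test", False): ("evaluate_diffusion.py", "diffusion_model.yaml"),
    ("test", True): ("evaluate_rewrite_weight.py", "rewrite_weight.yaml"),
}


def build_command(mode: str, personalised: bool) -> List[str]:
    """Return the argument list for one training or inference run."""
    script, config = SCRIPTS[(mode, personalised)]
    cmd = [sys.executable, script, "--mode", mode]
    if mode == "train":
        # The training commands above also pass --writer True.
        cmd += ["--writer", "True"]
    cmd += ["--config", config]
    return cmd


def run(mode: str, personalised: bool) -> None:
    """Execute the command from the project root; raises if the script fails."""
    subprocess.run(build_command(mode, personalised), check=True)
```

For instance, `run("train", personalised=False)` launches GAFRG training, and `run("test", personalised=True)` evaluates the personalised model.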

📽 Visualization of Facial Reactions

Qualitative Results

Qualitative comparison between our method and existing baselines.

Dynamic Visualization

video_demo.mp4

🤝 Acknowledgement

We extend our sincere gratitude to the following open-source projects:

🖊️ Citation

@inproceedings{zhu2024perfrdiff,
  title={{PerFRDiff}: Personalised Weight Editing for Multiple Appropriate Facial Reaction Generation},
  author={Zhu, Hengde and Kong, Xiangyu and Xie, Weicheng and Huang, Xin and Shen, Linlin and Liu, Lu and Gunes, Hatice and Song, Siyang},
  booktitle={Proceedings of the 32nd ACM International Conference on Multimedia},
  pages={9495--9504},
  year={2024}
}

About

[ACMMM 2024] The official PyTorch implementation of "PerFRDiff: Personalised Weight Editing for Multiple Appropriate Facial Reaction Generation"
