This repository provides the implementation of our paper, Causal Time Series Generation via Diffusion Models. The code is implemented using PyTorch 1.13.0 and the PyTorch Lightning 1.4.2 framework, on a server with an NVIDIA A100 80GB PCIe GPU.
In the paper, we introduce causal time series generation as a new family of time series generation tasks, formalized within Pearl’s causal ladder and including interventional and counterfactual settings.
To instantiate these tasks, we develop CaTSG, a unified diffusion-based generative framework with backdoor-adjusted guidance that steers sampling toward desired interventional and individual counterfactual distributions.
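For reference, the standard backdoor adjustment from causal inference expresses an interventional distribution in terms of observational quantities. The symbols below are assumptions chosen for illustration and may differ from the paper's notation: $X$ is the generated series, $C$ the context being intervened on, and $E$ a set of confounders satisfying the backdoor criterion.

$$P(X \mid do(C = c)) = \sum_{e} P(X \mid C = c, E = e)\, P(E = e)$$

How CaTSG estimates these terms and uses them to guide diffusion sampling is described in the paper.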
CaTSG uses the following dependencies:
- PyTorch 1.13.0 and PyTorch Lightning 1.4.2
- NumPy and SciPy
- Python 3.8
- CUDA 11.7 or later, cuDNN
Please first clone the TimeCraft repository and then set up the environment for CaTSG.
# Clone the repository
git clone https://github.com/microsoft/TimeCraft.git
cd TimeCraft/CaTSG
# Create and activate conda environment
conda env create -f environment.yml
conda activate catsg

This project supports both synthetic datasets for controlled experiments and real-world datasets for practical evaluation.
- Synthetic datasets

  We construct two synthetic datasets which simulate a class of damped mechanical oscillators governed by second-order differential equations $m \cdot \ddot{x}(t) + \gamma \cdot \dot{x}(t) + k \cdot x(t) = 0$. Details are presented in the appendix of our paper.
  - Harmonic-VM: Harmonic Oscillator with Variable Mass
  - Harmonic-VP: Harmonic Oscillator with Variable Parameters
- Real-world datasets
  - Air Quality: Four years of hourly air quality and meteorological measurements from 12 monitoring stations in Beijing, China.
  - Traffic: Hourly traffic volume recorded on Interstate 94 near Minneapolis–St Paul, USA, including weather and holiday indicators.
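To make the oscillator equation behind the synthetic datasets concrete, here is a minimal sketch that integrates $m \ddot{x} + \gamma \dot{x} + k x = 0$ and records position, velocity, and acceleration. The function name, the default parameter values, and the choice of a semi-implicit Euler integrator are assumptions for illustration; the repository's own generator (`utils/tsg_dataset_creator.py`) may use a different integrator and parameter sampling.

```python
import math


def simulate_damped_oscillator(m, gamma, k, x0=1.0, v0=0.0, dt=0.001, steps=5000):
    """Integrate m*x'' + gamma*x' + k*x = 0 with semi-implicit Euler.

    Hypothetical helper, not part of the repository.
    Returns three lists: position, velocity, and acceleration at each step.
    """
    xs, vs, accs = [], [], []
    x, v = x0, v0
    for _ in range(steps):
        a = -(gamma * v + k * x) / m  # acceleration given by the ODE
        xs.append(x)
        vs.append(v)
        accs.append(a)
        v += a * dt  # update velocity first ...
        x += v * dt  # ... then position, using the new velocity
    return xs, vs, accs


# Example: an underdamped oscillator; varying m between runs mimics
# the "variable mass" idea in spirit only.
positions, velocities, accelerations = simulate_damped_oscillator(m=1.0, gamma=0.5, k=4.0)
```

Changing `m`, `gamma`, or `k` between runs changes the decay rate and oscillation frequency of the resulting trajectories.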
You can also create the datasets from scratch:
# Harmonic-VM
python utils/tsg_dataset_creator.py --config configs/dataset_config/harmonic_vm.yaml
# Harmonic-VP
python utils/tsg_dataset_creator.py --config configs/dataset_config/harmonic_vp.yaml

- Air Quality: Download here and unzip the dataset. Place all `.csv` files from `PRSA_Data_20130301-20170228` (data for 12 stations) into the `./data_raw/AQ/` folder.
- Traffic: Download here and unzip the dataset. Place the single CSV file at `./data_raw/Metro_Interstate_Traffic_Volume.csv`.
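The file placement described above can be checked with a few lines of Python before preprocessing. This is a minimal sketch: `check_raw_layout` is a hypothetical helper, not part of the repository, and it only assumes the 12 Air Quality files follow the `PRSA_Data_*.csv` naming pattern.

```python
from pathlib import Path


def check_raw_layout(root="data_raw", expected_stations=12):
    """Return a list of human-readable problems found in the raw-data layout."""
    root = Path(root)
    problems = []

    # Air Quality: one CSV per monitoring station is expected under AQ/.
    aq_files = sorted((root / "AQ").glob("PRSA_Data_*.csv"))
    if len(aq_files) != expected_stations:
        problems.append(
            f"expected {expected_stations} Air Quality CSV files in "
            f"{root / 'AQ'}, found {len(aq_files)}"
        )

    # Traffic: a single CSV directly under the root folder.
    traffic = root / "Metro_Interstate_Traffic_Volume.csv"
    if not traffic.is_file():
        problems.append(f"missing {traffic}")

    return problems
```

Calling `check_raw_layout()` from the repository root returns an empty list when both datasets are in place.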
After downloading, the directory should look like:
data_raw
├── AQ
│ ├── PRSA_Data_Aotizhongxin_20130301-20170228.csv
│ ├── ...
│ └── PRSA_Data_Wanshouxigong_20130301-20170228.csv
└── Metro_Interstate_Traffic_Volume.csv

Run the following commands to generate processed datasets:
# Air Quality dataset
python utils/tsg_dataset_creator.py --config configs/dataset_config/aq.yaml
# Traffic dataset
python utils/tsg_dataset_creator.py --config configs/dataset_config/traffic.yaml

The default dataset splits used in our experiments are listed below.
You can modify them in configs/dataset_config/{dataset}.yaml.
For the Air Quality dataset split, an interactive map of station locations is available here.
| Type | Dataset | Target | Context | Default split strategy | Samples (Train/Val/Test) |
|---|---|---|---|---|---|
| Synthetic | Harmonic-VM | Acceleration | Velocity, Position | | 3,000/1,000/1,000 |
| Synthetic | Harmonic-VP | Acceleration | Velocity, Position | Combination-based: Train: | 3,000/1,000/1,000 |
| Real-world | Air Quality | | TEMP, PRES, DEWP, WSPM, RAIN, wd | Station-based: Train (Dongsi, Guanyuan, Tiantan, Wanshouxigong, Aotizhongxin, Nongzhanguan, Wanliu, Gucheng); Val (Changping, Dingling); Test (Shunyi, Huairou) | 11,664/2,916/2,916 |
| Real-world | Traffic | traffic_volume | rain_1h, snow_1h, clouds_all, weather_main, holiday | Temperature-based: Train (<12°C); Val ([12,22]°C); Test (>22°C) | 26,477/16,054/5,578 |
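The two real-world split rules in the table can be written as simple lookups. The sketch below is illustrative only: the function names are hypothetical, and how the repository assigns a whole time window to a split (for example, by which temperature reading) is not specified here. The station lists and temperature thresholds are copied from the table.

```python
# Station-based split for Air Quality, copied from the table above.
AQ_SPLIT = {
    "train": ["Dongsi", "Guanyuan", "Tiantan", "Wanshouxigong",
              "Aotizhongxin", "Nongzhanguan", "Wanliu", "Gucheng"],
    "val": ["Changping", "Dingling"],
    "test": ["Shunyi", "Huairou"],
}


def station_split(station):
    """Return 'train', 'val', or 'test' for an Air Quality station name."""
    for split, stations in AQ_SPLIT.items():
        if station in stations:
            return split
    raise KeyError(f"unknown station: {station}")


def temperature_split(temp_celsius):
    """Return the Traffic split for a temperature in degrees Celsius.

    Train: below 12, Val: 12 to 22 inclusive, Test: above 22.
    """
    if temp_celsius < 12:
        return "train"
    if temp_celsius <= 22:
        return "val"
    return "test"
```

If the raw Traffic file reports temperature in Kelvin, which is an assumption worth verifying against the CSV, a call would look like `temperature_split(temp_kelvin - 273.15)`.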
Train CaTSG on the Harmonic-VM dataset and test both the interventional and counterfactual tasks:
# 1) Training (automatically runs both int and cf evaluation after training)
python main.py --base configs/catsg.yaml --dataset harmonic_vm --train
# 2) Testing specific tasks
python main.py --base configs/catsg.yaml --dataset harmonic_vm --test int
python main.py --base configs/catsg.yaml --dataset harmonic_vm --test cf_harmonic

Outputs
- Logs: saved under `logs/<dataset>/CaTSG/<exp_name>/`
- Results: saved as `.csv` under `results/<dataset>/`
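To inspect the saved results programmatically, something like the following could be used. It is a minimal sketch: `load_results` is a hypothetical helper, and since the column names of the result files are not documented here, it simply reads whatever header each CSV contains.

```python
import csv
from pathlib import Path


def load_results(dataset, root="results"):
    """Read every CSV under <root>/<dataset>/ into a dict of row lists.

    Keys are file names; values are lists of dicts, one per CSV row.
    """
    tables = {}
    for path in sorted((Path(root) / dataset).glob("*.csv")):
        with path.open(newline="") as f:
            tables[path.name] = list(csv.DictReader(f))
    return tables
```

For example, `load_results("harmonic_vm")` would collect every result file written for the Harmonic-VM runs, assuming the dataset folder name matches the `--dataset` argument.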
If you find our work useful, please cite:
@article{xia2025causal,
title={Causal Time Series Generation via Diffusion Models},
author={Xia, Yutong and Xu, Chang and Liang, Yuxuan and Wen, Qingsong and Zimmermann, Roger and Bian, Jiang},
journal={arXiv preprint arXiv:2509.20846},
year={2025}
}