This project implements a Conditional Denoising Diffusion Probabilistic Model (DDPM) for image denoising tasks using PyTorch. It is designed to learn the mapping from noisy or low-quality images (condition) to clean ground truth images.
diffusion/
├── config/ # Configuration files
├── core/ # Core utilities (logger, metrics)
├── model/ # Model definitions
│ ├── conditional_ddpm_modules/
│ │ ├── diffusion.py # Main DDPM logic
│ │ └── unet.py # U-Net architecture
│ ├── base_model.py
│ ├── model.py # Model wrapper
│ └── networks.py # Network initialization
├── dataset.py # Dataset loading and augmentation
├── train.py # Training and evaluation script
├── requirements.txt # Python dependencies
└── README.md # Project documentation
Install the required dependencies:
pip install -r requirements.txt
Common dependencies include:
- torch
- torchvision
- numpy
- tqdm
- tensorboardX
- wandb (optional, for logging)
- Pillow
The dataset is managed via text files listing the image filenames.
- Data Organization: Place your Ground Truth (GT) and Condition (Noisy) images in corresponding directories.
  Example:
  dataset/gt_images/
  dataset/cond_images/
- File Lists: Create train.txt, val.txt, and test.txt containing the filenames (e.g., image_001.png) to be used for each phase, one filename per line.
- Configuration: Update config/config.json with the paths to your data:
  "datasets": {
    "train": {
      "gt_dataset_path": "/path/to/gt_images",
      "cond_dataset_path": "/path/to/cond_images",
      "dataroot": "train.txt",
      ...
    },
    "val": { ... },
    "test": { ... }
  }
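The pairing described above can be sketched in a few lines of standard-library Python. This is only an illustration, not the code in dataset.py: the helper names read_file_list, pair_paths, and missing_files are hypothetical, and the sketch assumes that a GT image and its condition image share the same filename in their respective directories.

```python
from pathlib import Path


def read_file_list(list_path):
    """Return the non-empty, stripped lines of a file list such as train.txt."""
    with open(list_path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def pair_paths(names, gt_dir, cond_dir):
    """Build (gt_path, cond_path) pairs.

    Assumption: both directories use identical filenames for matching images.
    """
    return [(Path(gt_dir) / name, Path(cond_dir) / name) for name in names]


def missing_files(pairs):
    """List every path in the pairs that does not exist on disk."""
    return [path for pair in pairs for path in pair if not path.exists()]
```

Running missing_files on the pairs before training is a cheap way to catch a filename that appears in a list but is absent from one of the two directories.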
The model and training parameters are defined in config/config.json. Key parameters include:
- model: Defines the U-Net structure (channels, attention, etc.) and the diffusion beta schedule.
- train: Training iterations, learning rate, and validation frequency.
- wandb: WandB project name (optional).
To start training the model:
python train.py --config config/config.json --phase train
Arguments:
- --config: Path to the configuration file.
- --phase: train or val.
- --gpu_ids: Specify GPU IDs (e.g., 0 or 0,1).
- --debug: Enable debug mode (runs fewer iterations for testing).
- -enable_wandb: Enable Weights & Biases logging.
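For readers who want to see how such flags fit together, here is a minimal argparse sketch. It is an assumption about how a script like train.py could declare these options, not a copy of the actual parser: the defaults, the help strings, and the parse_gpu_ids helper are hypothetical.

```python
import argparse


def build_parser():
    """Declare the command-line flags listed above (defaults are assumed)."""
    parser = argparse.ArgumentParser(description="Conditional DDPM training and evaluation")
    parser.add_argument("--config", type=str, default="config/config.json",
                        help="Path to the configuration file")
    parser.add_argument("--phase", type=str, choices=["train", "val"], default="train",
                        help="Run training or validation")
    parser.add_argument("--gpu_ids", type=str, default=None,
                        help="Comma-separated GPU IDs, e.g. 0 or 0,1")
    parser.add_argument("--debug", action="store_true",
                        help="Run fewer iterations for testing")
    # Single-dash spelling kept exactly as documented above.
    parser.add_argument("-enable_wandb", action="store_true",
                        help="Enable Weights & Biases logging")
    return parser


def parse_gpu_ids(gpu_ids):
    """Turn a string such as '0,1' into [0, 1]; an empty or missing value gives []."""
    if not gpu_ids:
        return []
    return [int(part) for part in gpu_ids.split(",")]
```

For example, build_parser().parse_args(["--phase", "val", "--gpu_ids", "0,1"]) yields a namespace whose phase is "val" and whose gpu_ids string can be passed to parse_gpu_ids.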
To evaluate the model using the test dataset defined in config:
python train.py --config config/config.json --phase val
This loads the resume state defined in config.json (under path.resume_state), or the latest checkpoint if configured, and writes the results to the results/ directory.
- Network: A U-Net with attention mechanisms and time embedding injection.
- Diffusion: Gaussian Diffusion with linear beta schedule.
- Conditioning: The condition image is concatenated with the noisy input at the channel dimension (Input channels = 6 -> Output channels = 3).
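The schedule and the conditioning step can be illustrated without any framework. The sketch below uses plain Python lists in place of tensors. The schedule endpoints (1e-4 and 2e-2), the concatenation order (condition first), and the function names are assumptions made for the example; the project itself reads its schedule from config/config.json and operates on PyTorch tensors.

```python
def linear_beta_schedule(n_timestep, beta_start=1e-4, beta_end=2e-2):
    """Evenly spaced betas from beta_start to beta_end (endpoints are assumed values)."""
    if n_timestep == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (n_timestep - 1)
    return [beta_start + i * step for i in range(n_timestep)]


def cumulative_alphas(betas):
    """Return alpha_bar_t, the running product of (1 - beta_s) up to each timestep t."""
    result, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        result.append(running)
    return result


def concat_channels(cond, x_t):
    """Join two channel-first images (lists of H x W channels) along the channel axis.

    Assumption: the condition channels come first, then the noisy-image channels.
    """
    return cond + x_t


# A 3-channel condition and a 3-channel noisy image give a 6-channel network input.
height, width = 4, 4
cond = [[[0.0] * width for _ in range(height)] for _ in range(3)]
x_t = [[[1.0] * width for _ in range(height)] for _ in range(3)]
net_input = concat_channels(cond, x_t)
```

With tensors of shape (batch, channels, height, width), the same concatenation would be done along the channel dimension, which is why the U-Net takes 6 input channels while predicting only 3.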
Recent updates addressed the following issues in diffusion.py:
- Corrected predict_start_from_noise: Fixed the mathematical formula for reconstructing $x_0$ from $x_t$ and noise.
- Fixed Conditional Input Handling: Resolved an issue where the concatenated input (6 channels) was incorrectly passed to functions expecting only the latent state (3 channels), causing shape mismatches.
- Variable Naming: Improved clarity in sampling loops.
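As a reference point for the first item, the standard DDPM relation is $x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon$, which inverts to $x_0 = \sqrt{1/\bar\alpha_t}\,x_t - \sqrt{1/\bar\alpha_t - 1}\,\epsilon$. The sketch below checks that round trip on a flat list of pixel values. It is a framework-free illustration of the textbook formula, not the code in diffusion.py, and this README does not state what the earlier error was. Note that the function receives only the noisy state $x_t$, never the 6-channel concatenated input.

```python
import math
import random


def q_sample(x0, noise, alpha_bar_t):
    """Forward process: x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise."""
    scale_x0 = math.sqrt(alpha_bar_t)
    scale_noise = math.sqrt(1.0 - alpha_bar_t)
    return [scale_x0 * x + scale_noise * n for x, n in zip(x0, noise)]


def predict_start_from_noise(x_t, noise, alpha_bar_t):
    """Invert q_sample: x0 = sqrt(1/alpha_bar_t) * x_t - sqrt(1/alpha_bar_t - 1) * noise."""
    sqrt_recip = math.sqrt(1.0 / alpha_bar_t)
    sqrt_recipm1 = math.sqrt(1.0 / alpha_bar_t - 1.0)
    return [sqrt_recip * x - sqrt_recipm1 * n for x, n in zip(x_t, noise)]


rng = random.Random(0)
x0 = [rng.uniform(-1.0, 1.0) for _ in range(8)]   # stand-in for clean pixel values
noise = [rng.gauss(0.0, 1.0) for _ in range(8)]   # the noise the network should predict
alpha_bar_t = 0.37  # arbitrary value in (0, 1), chosen only for this demonstration

x_t = q_sample(x0, noise, alpha_bar_t)
x0_rec = predict_start_from_noise(x_t, noise, alpha_bar_t)
max_err = max(abs(a - b) for a, b in zip(x0, x0_rec))  # should be at rounding-error level
```

A round-trip check of this kind makes a useful unit test: if the reconstruction error is not near floating-point precision when the true noise is supplied, the coefficients are wrong.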