prasunroy/dsketch: [ICPR 2024] d-Sketch: Improving Visual Fidelity of Sketch-to-Image Translation with Pretrained Latent Diffusion Models without Retraining (official code).

Official code for d-Sketch: Improving Visual Fidelity of Sketch-to-Image Translation with Pretrained Latent Diffusion Models without Retraining.

Accepted at the International Conference on Pattern Recognition (ICPR) 2024.

⚡ Getting Started

Note: This release is tested on Python 3.9.16.

git clone https://github.com/prasunroy/dsketch.git
cd dsketch
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -r requirements.txt
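
Since the release is tested on Python 3.9.16, it can help to confirm which interpreter the virtual environment picked up before installing anything heavy. The following is a minimal sketch of such a check; the helper name `matches_tested_version` is hypothetical and not part of this repository, and it only compares the major and minor version.

```python
import sys

# Major/minor of the Python release named in the note above (3.9.16).
TESTED_VERSION = (3, 9)


def matches_tested_version(version_info=sys.version_info):
    """Return True when the interpreter's major.minor matches the tested release."""
    return tuple(version_info[:2]) == TESTED_VERSION


if not matches_tested_version():
    # Other versions may still work; this only flags the difference.
    print(f"Warning: running Python {sys.version.split()[0]}; this release is tested on 3.9.16.")
```

Other Python versions are not necessarily unsupported; the check only makes a mismatch visible early.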

🔥 Training

Download the Flickr20 dataset and extract it into the datasets/flickr20 directory.
Run lctn_train.py with the following options.

lctn_train.py [-h] [--sd_path SD_PATH] [--mixed_precision {no,fp16,bf16,fp8}] [--force_cpu]
              [--data_root DATA_ROOT] [--image_size IMAGE_SIZE] [--batch_size BATCH_SIZE]
              [--shuffle] [--num_workers NUM_WORKERS] [--lr LR] [--steps STEPS]
              [--output_freq OUTPUT_FREQ] [--output_root OUTPUT_ROOT]

Example

python lctn_train.py --sd_path stabilityai/stable-diffusion-2-1 --mixed_precision fp16 --data_root ./datasets/flickr20/ --image_size 768 --batch_size 4 --shuffle --num_workers 8 --lr 0.001 --steps 50000 --output_freq 100 --output_root ./output/
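
For repeated runs with different settings, the same command line can be assembled from a dictionary. The sketch below is one possible way to do that and is not part of this repository: `build_command` and `train_options` are hypothetical names, and the code assumes that options shown without a value in the usage text (`--shuffle`, `--force_cpu`) are plain on/off flags. The option values mirror the example above.

```python
import shlex
import sys


def build_command(script, options):
    """Turn an options dict into an argv list for one of the repository scripts."""
    cmd = [sys.executable, script]
    for name, value in options.items():
        flag = f"--{name}"
        if value is True:
            cmd.append(flag)  # on/off flags such as --shuffle take no value
        elif value is False or value is None:
            continue  # leave disabled flags out of the command
        else:
            cmd.extend([flag, str(value)])
    return cmd


# Same settings as the training example above.
train_options = {
    "sd_path": "stabilityai/stable-diffusion-2-1",
    "mixed_precision": "fp16",
    "data_root": "./datasets/flickr20/",
    "image_size": 768,
    "batch_size": 4,
    "shuffle": True,
    "num_workers": 8,
    "lr": 0.001,
    "steps": 50000,
    "output_freq": 100,
    "output_root": "./output/",
}

train_cmd = build_command("lctn_train.py", train_options)
print(shlex.join(train_cmd))  # inspect the command before launching it
```

To start training from Python, the resulting list can be passed to `subprocess.run(train_cmd, check=True)` from the repository root.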

✨ Sampling

Download the sample sketches and extract them into the result/sample_sketches directory.
(Optional) Copy the best checkpoint <OUTPUT_ROOT>/<TIMESTAMP>/lctn.pth into the checkpoints directory.
Run lctn_sample.py with the following options.

lctn_sample.py [-h] [--seed SEED] [--prompt PROMPT] [--sketch SKETCH] [--image_size IMAGE_SIZE]
               [--guidance_scale GUIDANCE_SCALE] [--noising_scale NOISING_SCALE] [--steps STEPS]
               [--sd_path SD_PATH] [--lctn_path LCTN_PATH] [--mixed_precision {no,fp16,bf16,fp8}]
               [--force_cpu] [--output_dir OUTPUT_DIR]

Example

python lctn_sample.py --seed 11111111 --prompt "photo of a fox" --sketch ./result/sample_sketches/fox.png --image_size 768 --guidance_scale 8.0 --noising_scale 0.8 --steps 50 --sd_path stabilityai/stable-diffusion-2-1 --lctn_path ./checkpoints/lctn_flickr20.pth --mixed_precision fp16 --output_dir ./result/fox/
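
To sample from every sketch in a directory, one command per file can be generated in a loop. The sketch below is an illustration rather than part of this repository: `sampling_commands` is a hypothetical helper, and it assumes each sketch is a PNG file named after its subject (as with fox.png above), so that the prompt and output directory can be derived from the file name. The remaining option values are copied from the example above. The demo at the end uses a throwaway directory so that it runs without the real sketches.

```python
import sys
import tempfile
from pathlib import Path


def sampling_commands(sketch_dir, lctn_path, sd_path="stabilityai/stable-diffusion-2-1"):
    """Build one lctn_sample.py argv list per PNG sketch found in sketch_dir."""
    commands = []
    for sketch in sorted(Path(sketch_dir).glob("*.png")):
        name = sketch.stem  # assumption: the file name is the subject, e.g. "fox"
        commands.append([
            sys.executable, "lctn_sample.py",
            "--seed", "11111111",
            "--prompt", f"photo of a {name}",
            "--sketch", str(sketch),
            "--image_size", "768",
            "--guidance_scale", "8.0",
            "--noising_scale", "0.8",
            "--steps", "50",
            "--sd_path", sd_path,
            "--lctn_path", str(lctn_path),
            "--mixed_precision", "fp16",
            "--output_dir", str(Path("result") / name),
        ])
    return commands


# Demo on a temporary directory holding one empty placeholder sketch.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "fox.png").touch()
    (Path(tmp) / "notes.txt").touch()  # non-PNG files are ignored
    demo_commands = sampling_commands(tmp, "./checkpoints/lctn_flickr20.pth")

print(len(demo_commands), "command(s) generated")
```

With the real data, `sampling_commands("./result/sample_sketches", "./checkpoints/lctn_flickr20.pth")` returns the list, and each entry can be run with `subprocess.run(cmd, check=True)`. Prompts derived from file names are only a starting point and may need manual adjustment.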

❤️ Citation

@inproceedings{roy2022dsketch,
  title     = {d-Sketch: Improving Visual Fidelity of Sketch-to-Image Translation with Pretrained Latent Diffusion Models without Retraining},
  author    = {Roy, Prasun and Bhattacharya, Saumik and Ghosh, Subhankar and Pal, Umapada and Blumenstein, Michael},
  booktitle = {The International Conference on Pattern Recognition (ICPR)},
  month     = {December},
  year      = {2024}
}

📄 License

Copyright 2024 by the authors

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

Made with ❤️ and 🍕 on Earth.
