bnvai/CUDA-From-Scratch


📖 References

Official Documentation

Learning Resources

Books

  • Professional CUDA C Programming - John Cheng, Max Grossman, Ty McKercher
  • CUDA by Example - Jason Sanders, Edward Kandrot
  • Programming Massively Parallel Processors - David Kirk, Wen-mei Hwu

🤝 Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/NewOptimization)
  3. Commit your changes (git commit -m 'Add new optimization technique')
  4. Push to the branch (git push origin feature/NewOptimization)
  5. Open a Pull Request with a clear description

Guidelines

  • Follow CUDA coding best practices
  • Include benchmarking results for optimizations
  • Add comments explaining the technique
  • Verify correctness before submitting
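
The benchmarking and correctness guidelines can be covered by one small harness. The sketch below is not taken from this repository: `scale_kernel` is a stand-in kernel and `check_and_time` is a hypothetical helper. It times a single launch with CUDA events and compares the GPU result against a CPU reference within a tolerance.

```cuda
#include <cmath>
#include <vector>
#include <cuda_runtime.h>

// Stand-in kernel (hypothetical, not from this repository): y[i] = a * x[i].
__global__ void scale_kernel(const float* x, float* y, float a, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i];
}

// Runs the kernel once, writes the elapsed kernel time (milliseconds) to *ms,
// and returns true if every element matches the CPU reference within tol.
bool check_and_time(int n, float a, float tol, float* ms) {
    std::vector<float> h_x(n), h_y(n);
    for (int i = 0; i < n; ++i) h_x[i] = 0.001f * i;

    float *d_x, *d_y;
    cudaMalloc(&d_x, n * sizeof(float));
    cudaMalloc(&d_y, n * sizeof(float));
    cudaMemcpy(d_x, h_x.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    cudaEventRecord(start);
    scale_kernel<<<blocks, threads>>>(d_x, d_y, a, n);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);            // wait until the kernel has finished
    cudaEventElapsedTime(ms, start, stop);

    cudaMemcpy(h_y.data(), d_y, n * sizeof(float), cudaMemcpyDeviceToHost);

    // Fail on any CUDA error, then compare element by element with the CPU result.
    bool ok = (cudaGetLastError() == cudaSuccess);
    for (int i = 0; i < n && ok; ++i)
        ok = std::fabs(h_y[i] - a * h_x[i]) <= tol;

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_x);
    cudaFree(d_y);
    return ok;
}
```

A single timed launch is noisy. For numbers worth reporting in a pull request, it is common to run a few warm-up launches first and average over many timed iterations.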

📝 License

This project is licensed under the MIT License. See the LICENSE file for details.

📧 Contact & Support

🎓 Learning Path

Beginner:

  1. Start with 01_vec_add.py - Understand basic CUDA workflow
  2. Study 02_softmax.cu - Learn kernel structure and memory management
  3. Explore atomicAdd.cu - Understand atomic operations and race-free updates to shared data
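
As a rough illustration of what atomicAdd.cu is about, here is a minimal sketch (not the repository's file; `count_threads` and `count_with_atomics` are hypothetical names) in which every thread increments one shared counter. With `atomicAdd` the final value equals the number of threads; a plain `*counter += 1` would be a data race and could lose updates.

```cuda
#include <cuda_runtime.h>

// Each in-range thread adds 1 to a single global counter. atomicAdd makes the
// read-modify-write indivisible, so no increment is lost.
__global__ void count_threads(int* counter, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) atomicAdd(counter, 1);
}

// Launches enough blocks to cover n threads and returns the final count.
int count_with_atomics(int n) {
    int* d_counter;
    cudaMalloc(&d_counter, sizeof(int));
    cudaMemset(d_counter, 0, sizeof(int));

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    count_threads<<<blocks, threads>>>(d_counter, n);

    int h_counter = 0;
    // A blocking cudaMemcpy waits for the kernel to finish before copying.
    cudaMemcpy(&h_counter, d_counter, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_counter);
    return h_counter;
}
```

Calling `count_with_atomics(100000)` should return 100000 on a working device. Funnelling every thread through one address serializes the updates, which is why real kernels usually reduce within a block first.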

Intermediate:

  1. Analyze naive_matmul.cu - Basic matrix operations
  2. Study tiled_matmul.cu - Shared memory optimization
  3. Benchmark unrolling_example.cu - Understand loop optimization
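
The shared memory optimization behind a tiled matrix multiply can be sketched as follows. This is an illustrative version written for this section, not the contents of tiled_matmul.cu; the names `tiled_matmul`, `matmul_host`, and the tile size `TILE` are assumptions. Each block stages one tile of A and one tile of B in shared memory so that each global element is read once per tile instead of once per multiply.

```cuda
#include <cuda_runtime.h>

#define TILE 16  // assumed tile width; one thread per output element of a tile

// C = A * B for square N x N row-major matrices.
__global__ void tiled_matmul(const float* A, const float* B, float* C, int N) {
    __shared__ float As[TILE][TILE];
    __shared__ float Bs[TILE][TILE];

    int row = blockIdx.y * TILE + threadIdx.y;
    int col = blockIdx.x * TILE + threadIdx.x;
    float acc = 0.0f;

    for (int t = 0; t < (N + TILE - 1) / TILE; ++t) {
        int a_col = t * TILE + threadIdx.x;
        int b_row = t * TILE + threadIdx.y;
        // Out-of-range elements load as 0, so N need not be a multiple of TILE.
        As[threadIdx.y][threadIdx.x] = (row < N && a_col < N) ? A[row * N + a_col] : 0.0f;
        Bs[threadIdx.y][threadIdx.x] = (b_row < N && col < N) ? B[b_row * N + col] : 0.0f;
        __syncthreads();  // both tiles fully loaded before any thread reads them

        for (int k = 0; k < TILE; ++k)
            acc += As[threadIdx.y][k] * Bs[k][threadIdx.x];
        __syncthreads();  // all reads done before the next iteration overwrites the tiles
    }
    if (row < N && col < N) C[row * N + col] = acc;
}

// Host wrapper: copies inputs to the device, launches the kernel, copies C back.
void matmul_host(const float* A, const float* B, float* C, int N) {
    size_t bytes = (size_t)N * N * sizeof(float);
    float *dA, *dB, *dC;
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dB, bytes);
    cudaMalloc(&dC, bytes);
    cudaMemcpy(dA, A, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, B, bytes, cudaMemcpyHostToDevice);

    dim3 block(TILE, TILE);
    dim3 grid((N + TILE - 1) / TILE, (N + TILE - 1) / TILE);
    tiled_matmul<<<grid, block>>>(dA, dB, dC, N);

    cudaMemcpy(C, dC, bytes, cudaMemcpyDeviceToHost);  // blocks until the kernel is done
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
}
```

Both `__syncthreads()` calls are needed: the first keeps threads from reading a half-loaded tile, the second keeps fast threads from overwriting a tile that slower threads are still using.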

Advanced:

  1. Profile with NVTX in nvtx_matmul.cu
  2. Implement stream_advanced.cu - Asynchronous execution
  3. Create custom kernels for your use cases
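
For the asynchronous execution step, a minimal stream sketch may help. It is not the repository's stream_advanced.cu; `add_one` and `run_in_streams` are hypothetical names. The array is split into chunks, and each chunk gets its own stream so that the copies for one chunk can overlap with the kernel of another on hardware that supports it.

```cuda
#include <vector>
#include <cuda_runtime.h>

__global__ void add_one(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

// Processes n floats in num_streams chunks, one stream per chunk.
// Returns true if every element ends up equal to its index + 1.
bool run_in_streams(int n, int num_streams) {
    float* h_data;
    cudaMallocHost(&h_data, n * sizeof(float));  // pinned memory, needed for truly async copies
    for (int i = 0; i < n; ++i) h_data[i] = (float)i;

    float* d_data;
    cudaMalloc(&d_data, n * sizeof(float));

    std::vector<cudaStream_t> streams(num_streams);
    for (auto& s : streams) cudaStreamCreate(&s);

    int chunk = (n + num_streams - 1) / num_streams;
    for (int s = 0; s < num_streams; ++s) {
        int offset = s * chunk;
        int count = (offset + chunk <= n) ? chunk : n - offset;
        if (count <= 0) break;
        size_t bytes = count * sizeof(float);
        // Work queued on the same stream runs in order: copy in, compute, copy out.
        cudaMemcpyAsync(d_data + offset, h_data + offset, bytes,
                        cudaMemcpyHostToDevice, streams[s]);
        add_one<<<(count + 255) / 256, 256, 0, streams[s]>>>(d_data + offset, count);
        cudaMemcpyAsync(h_data + offset, d_data + offset, bytes,
                        cudaMemcpyDeviceToHost, streams[s]);
    }
    cudaDeviceSynchronize();  // wait for all streams before reading results on the host

    bool ok = true;
    for (int i = 0; i < n && ok; ++i) ok = (h_data[i] == (float)i + 1.0f);

    for (auto& s : streams) cudaStreamDestroy(s);
    cudaFree(d_data);
    cudaFreeHost(h_data);
    return ok;
}
```

Whether the chunks actually overlap depends on the device and the chunk size; a timeline view in a profiler is the usual way to confirm it.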

Made with ❤️ for GPU Computing Enthusiasts

Happy CUDA Learning! 🚀🎓
