CUDA C++ Optimization: Coding Faster GPU Kernels (Generative AI LLM Programming) by David Spuler
English | October 14, 2024 | ISBN: N/A | ASIN: B0DJT5JKM9 | 233 pages | EPUB | 1.03 MB
Increase the efficiency of CUDA C++ kernels for AI and high-performance computing on NVIDIA GPUs. Get more from your GPU investment with an efficient software layer.
Main Topics
- Speeding up CUDA C++ kernels
- Parallelization and vectorization
- Compute optimizations
- Memory access optimizations
Table of Contents:
1. Parallel Programming
2. Optimizing CUDA Programs
3. Vectorization
4. AI Kernel Optimization
5. Profiling Tools
6. Compilers and Optimizers
7. Timing CUDA C++ Programs
8. Memory Optimizations
9. Coalescing and Striding
10. Data Transfer Optimizations
11. Heap Memory Allocation
12. Compute Optimizations
13. Warp Divergence
14. Grid Optimizations
15. Compile-Time Optimizations
16. Arithmetic Optimizations
17. Floating-Point Bit Tricks
18. Advanced Techniques
Appendix: CUDA C++ Slugs