Generative AI in C++: Coding Transformers and LLMs (Generative AI LLM Programming) by David Spuler, Michael Sharpe, Cameron Gregory
English | March 30, 2024 | ISBN: N/A | ASIN: B0CXJKCWX9 | 1206 pages | EPUB | 1.15 MB
Do you know C++ but not AI? Do you dream of writing your own Generative AI engine in C++? From beginner to advanced, this book covers the internals of GPT-style Transformer engines and Large Language Models (LLMs) in C++, with source code examples and research paper citations.
Key Features
- Transformer components in C++
- Faster and smarter AI
- Open source LLMs
- Advanced software development
- Cutting-edge research optimizations
- Just C++ code without all the math
- Research papers literature survey
Part I: AI Projects in C++
1. Introduction to AI in C++
2. Transformers & LLMs
3. AI Phones
4. AI on Your Desktop
5. Design Choices & Architectures
6. Training, Fine-Tuning & RAG
7. Deployment Architecture
Part II: Basic C++ Optimizations
8. Bitwise Operations
9. Floating Point Arithmetic
10. Arithmetic Optimizations
11. Compile-Time Optimizations
12. Pointer Arithmetic
13. Algorithm Speedups
14. Memory Optimizations
Part III: Parallel C++ Optimizations
15. Loop Vectorization
16. Hardware Acceleration
17. AVX Intrinsics
18. Parallel Data Structures
Part IV: Transformer Components in C++
19. Encoders & Decoders
20. Attention
21. Activation Functions
22. Vector Algorithms
23. Tensors
24. Normalization
25. Softmax
26. Decoding Algorithms
27. Tokenizer and Vocabulary
Part V: Optimizing Transformers in C++
28. Deslugging AI Engines
29. Caching Optimizations
30. Vectorization
31. Kernel Fusion
32. Quantization
33. Pruning
34. MatMul/GEMM
35. Lookup Tables & Precomputation
36. AI Memory Optimizations
Part VI: Enterprise AI in C++
37. Tuning, Profiling & Benchmarking
38. Platform Portability
39. Quality
40. Reliability
41. Self-Testing Code
42. Debugging
Part VII: Research on AI Optimization
43. Overview of AI Research
44. Advanced Quantization
45. Knowledge Distillation
46. Structured Pruning
47. Early Exit and Layer Pruning
48. Width Pruning
49. Length Pruning
50. Adaptive Inference
51. Zero-Multiplication Models
52. Logarithmic Models
53. Arithmetic Optimization Research
54. Ensemble Multi-Model Architectures
55. Advanced Number Systems
56. Neural Architecture Search
Appendix 1: C++ Slug Catalog