My personal self-study course on the papers Ilya Sutskever recommended to John Carmack to help him grok Language Models
This reading list gives a comprehensive overview of fundamental AI concepts and the breakthrough papers that shaped the field. It is well curated, covering key areas such as:
- Neural network architectures (Transformers, LSTMs, ResNets)
- Scaling and optimization techniques
- Natural language processing
- Computer vision
- Theoretical foundations
- Practical applications

The readings:

- The Annotated Transformer
- Understanding LSTM Networks
- ImageNet Classification with Deep Convolutional Neural Networks
- Attention Is All You Need
- Deep Residual Learning for Image Recognition
- Neural Turing Machines
- Pointer Networks
- Identity Mappings in Deep Residual Networks
- GPipe: Easy Scaling with Micro-Batch Pipeline Parallelism
- Scaling Laws for Neural Language Models
- Recurrent Neural Network Regularization
- The First Law of Complexodynamics
- Kolmogorov Complexity and Algorithmic Randomness
- A Tutorial Introduction to the Minimum Description Length Principle
- The Unreasonable Effectiveness of Recurrent Neural Networks
- CS231n: Convolutional Neural Networks for Visual Recognition
The CS231n course notes provide an excellent practical foundation before diving into the papers. For someone starting out, I recommend beginning with the blog posts (Karpathy's RNN post, Olah's LSTM post) before tackling the more technical papers.
Working from high-level concepts toward the mathematical details is a sound approach: it builds intuition first, which makes the deeper theoretical foundations easier to absorb.
- Original list forked from: https://github.com/dzyim/ilya-sutskever-recommended-reading/
- Links merged from: https://github.com/dzyim/ilya-sutskever-recommended-reading/pull/1/commits/2cf1f4117abaf182589656b9b1219dbd3d380162