# Intro
I'm curating this list partly for myself and partly for others, as I have often seen people asking for online resources to study AI and ML.
Notes:
- I don't accept any form of payment for endorsements, and the information below is of course purely my own opinion. You may well disagree, and that's totally fine. I'm probably wrong, as anyone who knows me will testify is frequently the case.
- I intend to work through all or most of the below as I find time. I have worked through some in depth, started others, and only skimmed a few, so I haven't yet done them all in detail; take them as things I view as potentially interesting rather than recommendations from personal experience.
# Resources
## Basic Material
### Videos
- [Karpathy's "Intro to LLMs" video](https://youtu.be/zjkBMFhNj_g)
- [3Blue1Brown's "Neural Networks from the Ground Up" playlist](https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi)
### Online courses
- ["AI for Everyone" from deeplearning.ai](http://deeplearning.ai) - a non-technical intro suitable as "team background", including for non-engineers
- [Prompt Engineering for Developers](https://www.deeplearning.ai/short-courses/chatgpt-prompt-engineering-for-developers/) - short course by deeplearning.ai. It's *extremely basic* from a development point of view, but for something that can be done in a day it radically improved my ability to prompt things.
### Other
- [Enhancing LLMs with Retrieval Augmented Generation](https://scale.com/blog/retrieval-augmented-generation-to-enhance-llms) - Scale AI blog post that does a decent job of explaining RAG, which is a worthwhile concept to understand
- [The State of Open Source AI (Book)](https://github.com/premAI-io/state-of-open-source-ai)
- [What Is ChatGPT Doing … and Why Does It Work?](https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/) - Good basic practical explanation of how transformer models work. It's Stephen Wolfram, so if you're not familiar, some standard Wolfram-related disclaimers apply (specifically, take his more [grandiose](http://bactra.org/reviews/wolfram/) and [non-specific](https://arxiv.org/abs/quant-ph/0206089) claims with a pinch of salt).
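To make the RAG idea mentioned above concrete, here is a minimal sketch of its retrieval step: rank documents by similarity to the query, then prepend the best match to the prompt sent to the LLM. A real system would use a learned embedding model and a vector database; the bag-of-words "embedding" and the document list here are purely illustrative stand-ins.

```python
# Minimal sketch of the RAG retrieval step. The embed() function is a toy
# stand-in for a real embedding model (word-count vectors, not learned).
from collections import Counter
import math

def embed(text):
    # Hypothetical embedding: a sparse word-count vector.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Illustrative document store; a real system would hold many chunks in a vector DB.
documents = [
    "The Transformer architecture relies on self-attention.",
    "RAG retrieves relevant documents and adds them to the prompt.",
    "Convolutional networks are widely used for image recognition.",
]

def retrieve(query, docs, k=1):
    # Rank documents by similarity to the query and return the top k.
    scored = sorted(docs, key=lambda d: cosine(embed(query), embed(d)), reverse=True)
    return scored[:k]

query = "How does RAG use retrieved documents"
context = retrieve(query, documents)
# The retrieved context is then stuffed into the prompt for the LLM:
prompt = f"Context: {context[0]}\n\nQuestion: {query}"
```

The key design point, which the Scale blog post above explains properly, is that retrieval lets the model answer from up-to-date or private data without retraining it.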
## Intermediate
### Reading
- [Reading list for Karpathy Intro to LLMs Video](https://blog.oxen.ai/reading-list-for-andrej-karpathys-intro-to-large-language-models-video/)
- [Anti-hype LLM reading list](https://gist.github.com/veekaybee/be375ab33085102f9027853128dc5f0e) - Big online list of useful LLM-related learning materials, ranging from relatively introductory to more advanced.
- [A Mathematical Framework for Transformer Circuits](https://transformer-circuits.pub/2021/framework/index.html)
### Coding
- [An Introduction to Statistical Learning (Book)](https://www.statlearning.com/) - Book with variants showing code in R or Python
- [ML Recipes](https://github.com/rougier/ML-Recipes) - Standalone Python examples of machine learning algorithms
- [Understanding Deep Learning (Preprint Book)](https://udlbook.github.io/udlbook/) - a preprint book with a set of accompanying Python notebooks to fill in/complete
- [Deep learning with Python notebooks](https://github.com/fchollet/deep-learning-with-python-notebooks)
### Online/Videos
- [Luis Serrano's Transformers Playlist](https://www.youtube.com/watch?v=OxCpWwDCDFQ&list=PLs8w1Cdi-zva4fwKkl9EK13siFvL9Wewf)
- [Maths for ML](https://www.deeplearning.ai/courses/mathematics-for-machine-learning-and-data-science-specialization/) - Online course "Specialization", also taught by Luis Serrano, giving background in the calculus, probability, and linear algebra directly relevant to ML. If your maths isn't rusty there's probably no need, but I found it useful and fun.
## Serious study
### Reading
- [Probabilistic Machine Learning](https://probml.github.io/pml-book/) - three-book series by Kevin Murphy presenting a very serious mathematical approach to the topic
- Russell and Norvig’s _Artificial Intelligence: A Modern Approach_.
- [Physics-based deep learning (Online Book)](https://physicsbaseddeeplearning.org/intro.html)
### Coding
- [Neural Networks: Zero to Hero by Karpathy](https://karpathy.ai/zero-to-hero.html) - A really exceptional worked example of building neural networks from scratch
- [Fine tuning your own Llama 2 from start to finish](https://news.ycombinator.com/item?id=37484135)
### Online Course Materials
- [Imperial College maths for ML specialization](https://coursera.org/specializations/mathematics-machine-learning)
- [UC Berkeley CS intro to AI course materials](http://ai.berkeley.edu/)
- [MIT open courseware on AI](https://ocw.mit.edu/search/?q=artificial%20intelligence) and [ML](https://ocw.mit.edu/search/?q=machine%20learning)
- [Stanford AI Course materials](https://ai.stanford.edu/courses/)
- Yann LeCun's NYU 2021 course on [Deep Learning](https://atcold.github.io/NYU-DLSP21/)
## Unclassified/other
- [Resources for AI to assist mathematical reasoning](https://docs.google.com/document/d/1kD7H4E28656ua8jOGZ934nbH2HcBLyxcRgFDduH5iQ0/) - Big list of AI resources specifically targeted at use cases for AI within mathematics (e.g. AI proof assistants)
## Backup Material
This is a list of things that might be useful for the above but are not directly AI- or ML-focussed.
- [Introduction to Modern Statistics (book)](https://openintro-ims2.netlify.app/foundations-of-inference) - Book that seems a pretty solid intro to stats, with examples in R and Python
## [30 papers that contain 90% of what matters](https://arc.net/folder/D0472A20-9C20-4D3F-B145-D2865C0A9FEE)
...according to Ilya Sutskever, who seems not to have noticed that there are actually only 26 items in this list of 30. Perhaps the remaining four have yet to be published.
- [The Annotated Transformer](https://nlp.seas.harvard.edu/annotated-transformer/)
- [The First Law of Complexodynamics](https://scottaaronson.blog/?p=762 "Permanent Link: The First Law of Complexodynamics")
- [The Unreasonable Effectiveness of Recurrent Neural Networks](https://karpathy.github.io/2015/05/21/rnn-effectiveness/)
- [Understanding LSTM Networks](https://colah.github.io/posts/2015-08-Understanding-LSTMs/)
- [Recurrent Neural Network Regularization](https://arxiv.org/pdf/1409.2329.pdf)
- [Keeping Neural Networks Simple](https://www.cs.toronto.edu/~hinton/absps/colt93.pdf)
- [Pointer Networks](https://arxiv.org/pdf/1506.03134.pdf)
- [ImageNet Classification with Deep Convolutional Neural Networks](https://proceedings.neurips.cc/paper_files/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf)
- [Order Matters: Sequence to Sequence for Sets](https://arxiv.org/pdf/1511.06391.pdf)
- [GPipe: Easy Scaling with Micro-Batch Pipeline Parallelism](https://arxiv.org/pdf/1811.06965.pdf)
- [Deep Residual Learning for Image Recognition](https://arxiv.org/pdf/1512.03385.pdf)
- [Multi-scale Context Aggregation by Dilated Convolutions](https://arxiv.org/pdf/1511.07122.pdf)
- [Neural Message Passing for Quantum Chemistry](https://arxiv.org/pdf/1704.01212.pdf)
- [Attention Is All You Need](https://arxiv.org/pdf/1706.03762.pdf)
- [Neural Machine Translation by Jointly Learning to Align and Translate](https://arxiv.org/pdf/1409.0473.pdf)
- [Identity Mappings in Deep Residual Networks](https://arxiv.org/pdf/1603.05027.pdf)
- [A simple neural network module for relational reasoning](https://arxiv.org/pdf/1706.01427.pdf)
- [Variational Lossy Autoencoder](https://arxiv.org/pdf/1611.02731.pdf)
- [Relational recurrent neural networks](https://arxiv.org/pdf/1806.01822.pdf)
- [Quantifying the Rise and Fall of Complexity in Closed Systems](https://arxiv.org/pdf/1405.6903.pdf)
- [Neural Turing Machines](https://arxiv.org/pdf/1410.5401.pdf)
- [Deep Speech 2: End-to-End Speech Recognition in English and Mandarin](https://arxiv.org/pdf/1512.02595.pdf)
- [Scaling Laws for Neural Language Models](https://arxiv.org/pdf/2001.08361.pdf)
- [A Tutorial Introduction to the Minimum Description Length Principle](https://arxiv.org/pdf/math/0406077.pdf)
- [Machine Super Intelligence](https://www.vetta.org/documents/Machine_Super_Intelligence.pdf)
- [Kolmogorov Complexity and Algorithmic Randomness (Book)](https://www.lirmm.fr/~ashen/kolmbook-eng-scan.pdf)
- [Stanford course on convolutional nets for image recognition](https://cs231n.github.io/)