# Stanford - The Modern Algorithmic Toolbox

[The Modern Algorithmic Toolbox (CS168), Spring 2022](https://web.stanford.edu/class/cs168/index.html) ([archived copy](https://www.evernote.com/shard/s6/u/0/sh/ff8210c2-5e78-49cc-ace9-f06a9f39832e/1bff7fd9adb42204cacc1535a3b8e442))

Hacker News discussion: [CS 168: The Modern Algorithmic Toolbox | Hacker News](https://news.ycombinator.com/item?id=32788475)

This course goes through a lot of modern [[Algorithms]] and [[Data Structures]] for different domains:

- modern [[Hashing]]
    - [[Consistent Hashing]]
        - See also [Consistent Hashing | Algorithms You Should Know #1 - YouTube](https://m.youtube.com/watch?v=UF9Iqmg94tk)
    - property-preserving lossy [[Compression Algorithm]]
        - [[Bloom Filters]]
        - [[count-min sketch]] (sketched below)
- similarity [[Search]]
    - [[Jaccard similarity]]
    - [[kd-trees]]
- distance-preserving compression
    - [[Jaccard similarity]] using MinHash
    - [[Dimension Reduction]]
    - [[Optimal Hashing]]
    - [[Locality Sensitive Hashing]]
- Generalization and [[Regularization]]
    - [[L1 regularization]]
    - [[L2 regularization]]
    - [[Training error]]
    - [[Test error]]
- [[Principal Components Analysis]]
    - Videos around these areas: [Steve Brunton - YouTube](https://www.youtube.com/c/Eigensteve)
    - [[Eigenfaces]]
- [[Single Value Decomposition|Singular Value Decomposition]]
- Spectral graph theory
    - [[Laplacian of a graph]]
    - [[Eigenvectors]] and [[Eigenvalues]]
    - [[Graph coloring]]
- Sampling and estimation
    - [[Reservoir Sampling]] (sketched below)
    - [[Importance Sampling]]
    - [[Markov Chains]]
    - [[Stationary Distributions]]
    - [[Markov Chain Monte Carlo]]
    - [[Monte Carlo Methods]]
- [[Fourier Methods]]
    - [[Fourier Transform]]
    - [[Convolution]]
- [[Sparse Vector]]/[[Sparse Matrix]] methods ([[Linear Algebra]])
    - See [BOOK - Generalized Low Rank Models](https://web.stanford.edu/~boyd/papers/pdf/glrm.pdf)
    - See also [Introduction to Applied Linear Algebra: Vectors, Matrices, and Least Squares | Hacker News](https://news.ycombinator.com/item?id=18678314)
        - This book focuses on concrete implementations of Linear Algebra concepts
    - [[Compressive Sensing]]
- [[Linear Programming]]
- [[Convex Optimization]]
- Privacy-Preserving Computation
    - [[Differential Privacy]]
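
Since the [[count-min sketch]] shows up both in the topic list and in Project 1, here is a minimal Python sketch of the idea. It is not the course's reference implementation: the salted built-in `hash` stands in for independent hash functions, and the `width`/`depth` values are illustrative.

```python
import random

class CountMinSketch:
    """Approximate frequency counts in sublinear space (overestimates only)."""

    def __init__(self, width=2048, depth=5, seed=0):
        rng = random.Random(seed)
        self.width = width
        # One random salt per row simulates `depth` independent hash functions.
        self.salts = [rng.getrandbits(64) for _ in range(depth)]
        self.rows = [[0] * width for _ in range(depth)]

    def _col(self, salt, item):
        return hash((salt, item)) % self.width

    def update(self, item, count=1):
        for salt, row in zip(self.salts, self.rows):
            row[self._col(salt, item)] += count

    def estimate(self, item):
        # Collisions can only inflate a cell, so the minimum over rows
        # is the tightest (still one-sided) estimate of the true count.
        return min(row[self._col(salt, item)]
                   for salt, row in zip(self.salts, self.rows))

cms = CountMinSketch()
for word in ["to", "be", "or", "not", "to", "be"]:
    cms.update(word)
print(cms.estimate("to"))  # >= 2; equals 2 unless a collision occurred
```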
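
In the same spirit, [[Reservoir Sampling]] (Algorithm R) fits in a few lines; this is a minimal sketch under my own naming, not the course's code.

```python
import random

def reservoir_sample(stream, k, seed=None):
    """Keep a uniform sample of k items from a stream of unknown length.

    Algorithm R: after n items have passed, each one is in the
    reservoir with probability k/n.
    """
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)      # fill the reservoir first
        else:
            j = rng.randint(0, i)       # uniform over [0, i]
            if j < k:
                reservoir[j] = item     # evict with probability k/(i+1)
    return reservoir

print(reservoir_sample(range(1_000_000), k=5, seed=42))
```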

## Projects

- Project 1
    - randomized [[Load Balancing]]
    - [[count-min sketch]] implementation
- Project 2
    - Similarity metrics
    - [[Nearest Neighbor]] [[Classification]]
    - [[Dimension Reduction]]
    - [[Locality Sensitive Hashing]]
- Project 3
    - [[Regression (statistics)]]
    - [[Least-Squares Regression]]
    - [[Gradient Descent]]
    - [[Stochastic Gradient Descent]]
    - [[Generalization (statistics)]]
    - [[L2 regularization]]
    - [[Double Descent]]
    > [!quote] you will explore “double-descent”, a surprising phenomena whereby the performance of a model increases, then decreases, then increases, as the amount of training data increases.
- Project 4
    - [[Principal Components Analysis]]
    - [[Least-Squares Regression]]
- Project 5
    - [[Single Value Decomposition|Singular Value Decomposition]] for [[Word Embeddings]] ([[Natural Language Processing]])
    - [[Single Value Decomposition|Singular Value Decomposition]] for [[Image Processing]]
- Project 6
    - [[Spectral Methods]]
    - [[Laplacian of a graph]]
    - [[Laplacian Matrix]]
    - [[Eigenvectors]]
    - [[Connected Components]]
- Project 7
    - [[Markov Chains]]
    - [[MCMC]], [[Traveling Salesman]]
    - [[Optimization (algorithms)]]
- Project 8
    - [[Fourier Transform]]

## Books

- [[BOOK - Machine Learning - Kevin P Murphy]] has a lot of material on the statistical parts of the lectures
- [[BOOK - Mining of Massive Datasets - Jure Leskovec Anand Rajaraman Jeffrey David Ullman]]
    - [Chapter 3 - Finding Similar Items](http://infolab.stanford.edu/~ullman/mmds/ch3.pdf)
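
Chapter 3 above centers on estimating [[Jaccard similarity]] with MinHash. A minimal sketch of that trick, with the caveat that the salted built-in `hash` stands in for the random permutations the chapter uses, and `num_hashes=100` is an arbitrary choice:

```python
import random

def minhash_signature(items, num_hashes=100, seed=0):
    """One signature entry per hash function: the minimum hash over the set."""
    rng = random.Random(seed)
    salts = [rng.getrandbits(64) for _ in range(num_hashes)]
    return [min(hash((salt, x)) for x in items) for salt in salts]

def estimate_jaccard(sig_a, sig_b):
    # Pr[two sets share a minimum under a random hash] equals their
    # Jaccard similarity, so the match rate across entries estimates it.
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

A = set("the quick brown fox".split())
B = set("the quick brown dog".split())
print(estimate_jaccard(minhash_signature(A), minhash_signature(B)))  # true value: 3/5
```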