An introductory data science course following the textbook and videos from [An Introduction to Statistical Learning](https://www.statlearning.com/)

# Playlist of All Recorded Lectures

https://www.youtube.com/playlist?list=PLOpo-gE90mdshx4UEZ01y_uzR3tMK08EA

# List of Lecture Topics

## Lecture 1 Topics #lecture1

- Practice thinking about [[Supervised vs Unsupervised Learning]] and [[Regression vs Classification]] with an example of [[Types of Learning Problems - Example]]
- What is [[Self-supervised Learning]]?
- What is [[Irreducible Error]]?
- Why do we use the mean as an estimate? [[Minimizing the Square Loss - Dice Example]]
- Recording of Lecture 1: https://youtu.be/oGgadmHG4JE and the blackboard from this class is the first half of [[Week1_Blackboard_DATA6100_F23.pdf]]

## Lecture 2 Topics #lecture2

- Why does the number of data points we have affect things? [[How Variance Contributes to Loss - Dice Example]]
- A first example with both $X$s and $Y$s: [[Nearby Neighbour Averaging - Cannonball Example]]
- This example illustrates [[Overfitting vs Underfitting]]
- Recording of Lecture 2: https://youtu.be/zgsxpRvlbiY and the blackboard from that class is the second half of [[Week1_Blackboard_DATA6100_F23.pdf]]

## Lecture 3 Topics #lecture3

- Some Python tricks to make numpy code fast using [[Vector Operations]] and [[Array Broadcasting]] (see the sketch after this section)
- Generalize our nearest-neighbour averaging to use [[More General Weight Functions]], for example the Gaussian weight function
- Related technique: [[Kernel density estimation]]
- How do we measure the error in practice? [[True vs Test vs Train]] sets
- Variants of the classic train/test split: $k$-fold cross-validation and leave-one-out error
- Lecture Recording: https://youtu.be/2F0T-CTIdGE
- See [[Week2-6-blackboard_DATA6100_F23.pdf]] for the PDF of the "blackboards" from the classes
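As a companion to the Lecture 3 topics above, here is a minimal Python/numpy sketch of Gaussian-weighted neighbour averaging written with vector operations and array broadcasting instead of Python loops. The function name, bandwidth value, and toy data are illustrative placeholders, not code from the lectures.

```python
import numpy as np

def gaussian_weighted_average(x_train, y_train, x_query, bandwidth=0.5):
    """Predict y at each query point as a Gaussian-weighted average of the training y values."""
    # Broadcasting: (n_query, 1) minus (n_train,) gives an (n_query, n_train)
    # matrix of pairwise differences with no explicit Python loop.
    diffs = x_query[:, None] - x_train[None, :]
    weights = np.exp(-0.5 * (diffs / bandwidth) ** 2)  # Gaussian weight function
    weights /= weights.sum(axis=1, keepdims=True)      # normalize each row of weights
    return weights @ y_train                            # one weighted average per query point

# Toy usage on made-up data
rng = np.random.default_rng(0)
x_train = rng.uniform(0, 10, size=200)
y_train = np.sin(x_train) + 0.3 * rng.normal(size=200)
x_query = np.linspace(0, 10, 50)
y_hat = gaussian_weighted_average(x_train, y_train, x_query, bandwidth=0.5)
```

Shrinking the bandwidth makes the fit wigglier (towards overfitting), while growing it smooths the prediction (towards underfitting), which connects back to the train/test error discussion.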
## Lecture 4 Topics #lecture4

- [[True vs Test vs Train]] errors and how they help us see [[Overfitting vs Underfitting]]
- [[Confidence intervals]] for the coefficients in linear regression!
- What does [[Statistically Significant]] mean?
- Lecture Recording: https://youtu.be/rMbDw8Zzl8k
- See [[Week2-6-blackboard_DATA6100_F23.pdf]] for the PDF of the "blackboards" from the classes

## Lecture 5 Topics

- Some common pitfalls of Multiple Linear Regression: what do the coefficients actually mean?
- Lecture Recording: https://youtu.be/ehf58GbQrks
- See [[Week2-6-blackboard_DATA6100_F23.pdf]] for the PDF of the "blackboards" from the classes

## Lecture 6 Topics

- Examples of variable selection using a test set
- Some other common practical issues (like how to deal with categorical variables)
- Lecture Recording: https://youtu.be/Rf8mGvwGEqQ
- See [[Week2-6-blackboard_DATA6100_F23.pdf]] for the PDF of the "blackboards" from the classes

## Lecture 7 Topics

- Intro to classification; maximizing Top-1 label accuracy
- $K$-Nearest Neighbour (KNN) examples, including on MNIST
- How to do "non-linear regression"
- Lecture Recording: https://youtu.be/iNINGnH-97s
- See [[Week2-6-blackboard_DATA6100_F23.pdf]] for the PDF of the "blackboards" from the classes

## Lecture 8 Topics

- Why do we use the mean squared error [[Loss Function]]?
- What is the cross-entropy loss function? [[Normal (aka Gaussian) vs Coin-Flip (aka Bernoulli) Random Variables]]
- What is the [[Sigmoid Function]]?
- Derivation of how finding the optimal parameters works in linear regression and logistic regression. [[Linear vs. Logistic Regression]]
- Lecture Recording: https://youtu.be/TaQjjr7TC30
- See [[Week2-6-blackboard_DATA6100_F23.pdf]] for the PDF of the "blackboards" from the classes

## Lecture 9 Topics

- How does the cross-entropy loss generalize to multiple classes?
- How does the sigmoid function generalize to multiple classes? The softmax function
- How does gradient descent work? Code implementation
- Code example of linear regression
- Lecture Recording: https://youtu.be/Pun2aj0z_zY
- See [[Week2-6-blackboard_DATA6100_F23.pdf]] for the PDF of the "blackboards" from the classes

## Lecture 10 Topics

- What is Bayes' theorem and how does it help us classify things?
- Kernel Density Estimation for classification
- Gaussian Linear Discriminant Analysis in 1D and 2D
- Lecture Recording: https://youtu.be/4JG823PhL5k
- See [[Week2-6-blackboard_DATA6100_F23.pdf]] for the PDF of the "blackboards" from the classes

## Lecture 11 Topics

- Fancier versions of Bayesian classification
- Quadratic Discriminant Analysis
- Modifying the prior probabilities
- ROC curves and false positives vs false negatives
- Lecture Recording: https://youtu.be/rOnDRVahDgk
- See [[Week2-6-blackboard_DATA6100_F23.pdf]] for the PDF of the "blackboards" from the classes

## Lecture 12 Topics

- Naive Bayes: an example with coin flips and the Titanic example
- See [[Week2-6-blackboard_DATA6100_F23.pdf]] for the PDF of the "blackboards" from the classes

## Lecture 13 Topics

- Multiple Hypothesis Testing
- Bootstrapping
- Lecture Recording: https://youtu.be/P0sbHX9_BTQ

## Lecture 14 Topics

- Shrinkage methods and the James-Stein estimator
- Ridge Regression / $L^2$ regularization
- Lecture Recording: https://youtu.be/1AylBPZLKO8

## Lecture 15 Topics

- Normalizing/standardizing and why it's important for regularization
- Overfitting in real life and Goodhart's Law: https://sohl-dickstein.github.io/2022/11/06/strong-Goodhart.html
- Lecture Recording: https://youtu.be/D2IuSDrMEsg

## Lecture 16 Topics

- $L_p$ norms
- Lasso (aka $L_1$ regularization) and why it finds "corner" solutions (i.e., why it sets some coefficients to zero)

## Lecture 17 Topics

- Eigenvalues and diagonalization
- Fibonacci numbers example
- Mathematics behind PCA: what is it really?
- Lecture Recording: https://youtu.be/uZIRQkoz4K0

## Lecture 18 Topics

- PCA examples:
    - Clustering Scotland, Wales, England and Northern Ireland
    - Eigenfaces
    - MNIST digits
- Lecture Recording: https://youtu.be/qHgjJodVIGg

## Lecture 19 Topics

- Intro to Neural Networks via an MNIST example in JAX (a minimal sketch follows this section):
    - Linear Regression
    - f(Linear Regression)
    - Logistic Regression
    - Adding in more hidden layers
- Lecture Recording: https://youtu.be/PYSz8NAj73g
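To make the Lecture 19 outline concrete, here is a minimal JAX sketch of "logistic regression plus one hidden layer" trained by gradient descent on the cross-entropy loss. The layer sizes, learning rate, and random stand-in batch are assumptions for illustration and are not taken from the lecture notebook.

```python
import jax
import jax.numpy as jnp

def init_params(key, n_in=784, n_hidden=64, n_out=10):
    """Small random weights for a one-hidden-layer network; sizes are placeholders."""
    k1, k2 = jax.random.split(key)
    return {
        "W1": 0.01 * jax.random.normal(k1, (n_in, n_hidden)),
        "b1": jnp.zeros(n_hidden),
        "W2": 0.01 * jax.random.normal(k2, (n_hidden, n_out)),
        "b2": jnp.zeros(n_out),
    }

def forward(params, x):
    """Hidden layer is f(linear); the output layer is another linear map to 10 class scores."""
    h = jax.nn.relu(x @ params["W1"] + params["b1"])
    return h @ params["W2"] + params["b2"]

def loss(params, x, y):
    """Cross-entropy loss against integer class labels y."""
    log_probs = jax.nn.log_softmax(forward(params, x))
    return -jnp.mean(log_probs[jnp.arange(y.shape[0]), y])

@jax.jit
def sgd_step(params, x, y, lr=0.1):
    """One gradient-descent step on the loss."""
    grads = jax.grad(loss)(params, x, y)
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)

# Toy usage with random stand-in data (real MNIST batches would replace these)
key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (32, 784))
y = jax.random.randint(key, (32,), 0, 10)
params = init_params(key)
params = sgd_step(params, x, y)
```

Dropping the hidden layer, so that `x` goes through a single linear map before the softmax, recovers plain multiclass logistic regression, which mirrors the progression in the bullet list above.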