An introductory data science course that follows the textbook and videos from [An Introduction to Statistical Learning](https://www.statlearning.com/).
# Playlist of All Recorded Lectures
https://www.youtube.com/playlist?list=PLOpo-gE90mdshx4UEZ01y_uzR3tMK08EA
# List of Lecture Topics
## Lecture 1 Topics
#lecture1
- Practice thinking about [[Supervised vs Unsupervised Learning]] and [[Regression vs Classification]] with an example of [[Types of Learning Problems - Example]]
- What is [[Self-supervised Learning]]?
- What is [[Irreducible Error]]?
- Why do we use the mean as an estimate? [[Minimizing the Square Loss - Dice Example]]
- Recording of Lecture 1: https://youtu.be/oGgadmHG4JE and blackboard from this class is the first half of [[Week1_Blackboard_DATA6100_F23.pdf]]
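
A minimal sketch of the dice idea, assuming a fair six-sided die and a grid of candidate guesses (the names `expected_square_loss` and `candidates` are illustrative, not taken from the lecture). It checks numerically that the guess with the smallest average square loss is the mean, and that the loss remaining at that guess equals the variance of the die, which no guess can remove.

```python
# Assumption: a fair six-sided die, each face equally likely.
faces = [1, 2, 3, 4, 5, 6]


def expected_square_loss(guess):
    """Average of (face - guess)^2 over all equally likely faces."""
    return sum((face - guess) ** 2 for face in faces) / len(faces)


# Hypothetical grid of candidate guesses: 1.0, 1.1, ..., 6.0
candidates = [k / 10 for k in range(10, 61)]
best_guess = min(candidates, key=expected_square_loss)

mean = sum(faces) / len(faces)
variance = sum((face - mean) ** 2 for face in faces) / len(faces)

print("best guess on the grid:", best_guess)
print("mean of the die:", mean)
print("loss at the best guess:", expected_square_loss(best_guess))
print("variance of the die:", variance)
```

The loss at any guess splits into the variance plus the squared distance between the guess and the mean, so only the second part can be reduced by choosing a better guess.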
## Lecture 2 Topics
#lecture2
- Why does the number of data points we have affect things? [[How Variance Contributes to Loss - Dice Example]]
- A first example with both $X$s and $Y$s [[Nearby Neighbour Averaging - Cannonball Example]]
- This example illustrates [[Overfitting vs Underfitting]]
- Recording of Lecture 2: https://youtu.be/zgsxpRvlbiY and blackboard from that class is the second half of [[Week1_Blackboard_DATA6100_F23.pdf]]
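
A rough sketch of nearby neighbour averaging, assuming one-dimensional inputs and a fixed window `radius` (the toy data and the function name `neighbour_average` are made up for illustration and are not the cannonball data from class). A tiny radius reproduces single training points, which leans towards overfitting, while a huge radius returns the global average, which leans towards underfitting.

```python
def neighbour_average(x_train, y_train, x0, radius):
    """Predict y at x0 by averaging the y values of training points within `radius` of x0."""
    nearby = [y for x, y in zip(x_train, y_train) if abs(x - x0) <= radius]
    if not nearby:
        return None  # no training point is close enough to make a prediction
    return sum(nearby) / len(nearby)


# Hypothetical toy data (y = x^2 without noise), for illustration only.
x_train = [0.0, 1.0, 2.0, 3.0, 4.0]
y_train = [0.0, 1.0, 4.0, 9.0, 16.0]

for radius in (0.1, 1.0, 10.0):
    print(radius, neighbour_average(x_train, y_train, 2.0, radius))
```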
## Lecture 3 Topics
#lecture3
- Some Python tricks to make numpy code fast by using [[Vector Operations]] and [[Array Broadcasting]]
- Generalize our nearest neighbour averaging to use [[More General Weight Functions]], for example the Gaussian weight function
- Related technique: [[Kernel density estimation]]
- How do we measure the error in practice? [[True vs Test vs Train]] sets
- Variants of the classic train/test split: $k$-fold cross validation or leave-one-out error
- Lecture Recording: https://youtu.be/2F0T-CTIdGE
- See [[Week2-6-blackboard_DATA6100_F23.pdf]] for the pdf of the "blackboards" from the classes
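
One possible way to write the Gaussian-weighted version of neighbour averaging, kept in plain Python for readability rather than speed (a numpy version would use vector operations and broadcasting instead of the loops). The `bandwidth` parameter, the function names, and the toy data are assumptions made for this sketch.

```python
import math


def gaussian_weight(distance, bandwidth):
    """Unnormalized Gaussian weight: 1 at distance 0, decaying as the distance grows."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)


def weighted_average(x_train, y_train, x0, bandwidth):
    """Predict y at x0 as a weighted average of all training y values."""
    weights = [gaussian_weight(x - x0, bandwidth) for x in x_train]
    total = sum(weights)
    # Assumption: at least one weight is non-zero, i.e. x0 is not extremely
    # far from every training point relative to the bandwidth.
    return sum(w * y for w, y in zip(weights, y_train)) / total


# Hypothetical toy data, for illustration only.
x_train = [-1.0, 0.0, 1.0]
y_train = [2.0, 5.0, 2.0]

for bandwidth in (0.01, 1.0, 1000.0):
    print(bandwidth, weighted_average(x_train, y_train, 0.0, bandwidth))
```

A very small bandwidth behaves like picking the single nearest point, and a very large bandwidth behaves like the plain mean of all the training values.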
## Lecture 4 Topics
#lecture4
- [[True vs Test vs Train]] errors and how this helps us see [[Overfitting vs Underfitting]]
- [[Confidence intervals]] for the coefficients in linear regression!
- What does [[Statistically Significant]] mean?
- Lecture Recording: https://youtu.be/rMbDw8Zzl8k
- See [[Week2-6-blackboard_DATA6100_F23.pdf]] for the pdf of the "blackboards" from the classes
## Lecture 5 Topics
- Some common pitfalls in multiple linear regression: what do the coefficients actually mean?
- Lecture Recording: https://youtu.be/ehf58GbQrks
- See [[Week2-6-blackboard_DATA6100_F23.pdf]] for the pdf of the "blackboards" from the classes
## Lecture 6 Topics
- Examples of Variable Selection by using a test set
- Some other common practical issues (like how to deal with categorical variables)
- Lecture Recording: https://youtu.be/Rf8mGvwGEqQ
- See [[Week2-6-blackboard_DATA6100_F23.pdf]] for the pdf of the "blackboards" from the classes
## Lecture 7 Topics
- Intro to classification; maximizing top-1 label accuracy
- K-Nearest Neighbour (KNN) examples, including on MNIST
- How to do "non-linear regression"
- Lecture Recording: https://youtu.be/iNINGnH-97s
- See [[Week2-6-blackboard_DATA6100_F23.pdf]] for the pdf of the "blackboards" from the classes
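
A small sketch of K-Nearest Neighbour classification by majority vote, assuming squared Euclidean distance and a tiny made-up 2D dataset (not MNIST). The function name `knn_predict` and the labels are hypothetical.

```python
from collections import Counter


def knn_predict(points, labels, query, k):
    """Return the most common label among the k training points closest to `query`."""
    def squared_distance(i):
        return sum((a - b) ** 2 for a, b in zip(points[i], query))

    nearest = sorted(range(len(points)), key=squared_distance)[:k]
    votes = Counter(labels[i] for i in nearest)
    # Ties are broken in favour of the label that was counted first,
    # which here means the label of the closest point among those tied.
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (0.5, 0.5), k=3))
print(knn_predict(points, labels, (5.5, 5.5), k=3))
```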
## Lecture 8 Topics
- Why do we use the mean squared error [[Loss Function]]?
- What is the cross entropy loss function? [[Normal (aka Gaussian) vs Coin-Flip (aka Bernoulli) Random Variables]]
- What is the [[Sigmoid Function]]?
- Derivation of the optimal parameters for linear regression and logistic regression: [[Linear vs. Logistic Regression]]
- Lecture Recording: https://youtu.be/TaQjjr7TC30
- See [[Week2-6-blackboard_DATA6100_F23.pdf]] for the pdf of the "blackboards" from the classes
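
As a quick illustration of the last two ideas, here is a sketch of the sigmoid function and the cross entropy loss for a single coin-flip (Bernoulli) label. The function names are illustrative, and `cross_entropy` assumes the predicted probability `p` lies strictly between 0 and 1.

```python
import math


def sigmoid(z):
    """Map any real number to a probability in (0, 1), avoiding overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def cross_entropy(y, p):
    """Negative log-likelihood of label y (0 or 1) under a Bernoulli with P(y = 1) = p.

    Assumption: 0 < p < 1, otherwise the logarithm is undefined.
    """
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


for z in (-2.0, 0.0, 2.0):
    p = sigmoid(z)
    print(z, p, cross_entropy(1, p))
```

A confident correct prediction gives a loss near zero, while a confident wrong prediction is penalized heavily.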
## Lecture 9 Topics
- How does the cross-entropy loss generalize to multiple classes?
- How does the sigmoid function generalize to multiple classes? Softmax function
- How does gradient descent work? Code implementation.
- Code example of linear regression.
- Lecture Recording: https://youtu.be/Pun2aj0z_zY
- See [[Week2-6-blackboard_DATA6100_F23.pdf]] for the pdf of the "blackboards" from the classes
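
A minimal sketch of gradient descent applied to linear regression with one input, assuming a mean squared error loss, a fixed learning rate, and noise-free toy data. This is not the code from the lecture; `fit_line`, `learning_rate`, and `steps` are hypothetical names and defaults.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = slope * x + intercept by gradient descent on the mean squared error."""
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        residuals = [slope * x + intercept - y for x, y in zip(xs, ys)]
        # Partial derivatives of (1/n) * sum(residual^2) with respect to each parameter
        grad_slope = 2.0 / n * sum(r * x for r, x in zip(residuals, xs))
        grad_intercept = 2.0 / n * sum(residuals)
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Hypothetical toy data generated from y = 2x + 1, for illustration only.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

The learning rate has to be small enough for the updates to settle; with inputs on a much larger scale the same value could diverge.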
## Lecture 10 Topics
- What is Bayes' theorem and how does it help us classify things?
- Kernel Density Estimation for classification
- Gaussian Linear Discriminant Analysis in 1D and 2D
- Lecture Recording: https://youtu.be/4JG823PhL5k
- See [[Week2-6-blackboard_DATA6100_F23.pdf]] for the pdf of the "blackboards" from the classes
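
A sketch of how Bayes' theorem turns class priors and class-conditional densities into posterior class probabilities, assuming 1D Gaussian classes that share one standard deviation (the setting of linear discriminant analysis in 1D). The names `gaussian_pdf` and `posterior` and all numbers are illustrative.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


def posterior(x, priors, means, std):
    """P(class | x) for each class via Bayes' theorem: prior * likelihood, then normalize."""
    joint = [prior * gaussian_pdf(x, mean, std) for prior, mean in zip(priors, means)]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical setup: two equally likely classes centred at 0 and 2.
priors = [0.5, 0.5]
means = [0.0, 2.0]

for x in (0.0, 1.0, 2.0):
    print(x, posterior(x, priors, means, std=1.0))
```

With equal priors and a shared standard deviation, the decision boundary sits at the midpoint between the two class means.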
## Lecture 11 Topics
- Fancier versions of Bayesian classification
- Quadratic discriminant analysis
- Modifying the prior probabilities
- ROC curves and false positives vs false negatives
- Lecture Recording: https://youtu.be/rOnDRVahDgk
- See [[Week2-6-blackboard_DATA6100_F23.pdf]] for the pdf of the "blackboards" from the classes
## Lecture 12 Topics
- Naive Bayes: an example with coin flips and the Titanic example
- See [[Week2-6-blackboard_DATA6100_F23.pdf]] for the pdf of the "blackboards" from the classes
## Lecture 13 Topics
- Multiple Hypothesis Testing
- Bootstrapping
- Lecture Recording: https://youtu.be/P0sbHX9_BTQ
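
A rough sketch of bootstrapping, assuming we want the uncertainty of a sample mean: resample the data with replacement many times, recompute the mean each time, and read a standard error and a percentile interval off the resulting values. The data, the seed, and the name `bootstrap_means` are made up for illustration.

```python
import random
import statistics


def bootstrap_means(data, n_resamples, seed=0):
    """Means of `n_resamples` resamples of `data`, each drawn with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [
        statistics.mean(rng.choices(data, k=len(data)))
        for _ in range(n_resamples)
    ]


# Hypothetical measurements, for illustration only.
data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7]

n_resamples = 2000
means = sorted(bootstrap_means(data, n_resamples))

# Percentile interval: cut off 2.5% of the bootstrap means on each side.
lower = means[n_resamples * 25 // 1000]
upper = means[n_resamples * 975 // 1000 - 1]
standard_error = statistics.stdev(means)

print("sample mean:", statistics.mean(data))
print("bootstrap standard error:", standard_error)
print("approximate 95% interval:", (lower, upper))
```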
## Lecture 14 Topics
- Shrinkage methods and the James-Stein estimator
- Ridge Regression / $L^2$ regularization
- Lecture Recording: https://youtu.be/1AylBPZLKO8
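
To make the shrinkage effect of ridge regression concrete, here is a sketch for the simplest case: one input and no intercept, where minimizing $\sum_i (y_i - \beta x_i)^2 + \lambda \beta^2$ gives the closed form $\beta = \sum_i x_i y_i / (\sum_i x_i^2 + \lambda)$. The function name `ridge_slope` and the toy data are assumptions for illustration.

```python
def ridge_slope(xs, ys, penalty):
    """Slope minimizing sum((y - slope * x)^2) + penalty * slope^2 (no intercept)."""
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    return sum_xy / (sum_xx + penalty)


# Hypothetical toy data from y = 2x, for illustration only.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

for penalty in (0.0, 1.0, 14.0, 1000.0):
    print(penalty, ridge_slope(xs, ys, penalty))
```

With no penalty this is ordinary least squares; as the penalty grows the slope shrinks towards zero but never reaches it exactly.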
## Lecture 15 Topics
- Normalizing/Standardizing and why it's important for regularization
- Overfitting in real life and Goodhart's Law https://sohl-dickstein.github.io/2022/11/06/strong-Goodhart.html
- Lecture Recording: https://youtu.be/D2IuSDrMEsg
## Lecture 16 Topics
- $L_p$ norms
- Lasso (aka $L_1$ regularization) and why it finds "corner" solutions (aka why it sets some coefficients exactly to zero)
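
A sketch of why the $L_1$ penalty produces exact zeros, restricted to the one-coefficient case. Minimizing $\tfrac{1}{2}(\beta - z)^2 + \lambda|\beta|$ gives the soft-thresholding rule below, while the $L_2$ analogue $\tfrac{1}{2}(\beta - z)^2 + \tfrac{\lambda}{2}\beta^2$ only rescales $z$. Here $z$ stands for the unpenalized estimate, and the function names are illustrative.

```python
def soft_threshold(z, penalty):
    """Minimizer of 0.5 * (b - z)^2 + penalty * |b|: move z towards 0 by `penalty`, stopping at 0."""
    if z > penalty:
        return z - penalty
    if z < -penalty:
        return z + penalty
    return 0.0


def ridge_shrink(z, penalty):
    """Minimizer of 0.5 * (b - z)^2 + 0.5 * penalty * b^2: rescale z, never exactly 0 unless z is 0."""
    return z / (1.0 + penalty)


for z in (-3.0, -0.5, 0.5, 3.0):
    print(z, soft_threshold(z, 1.0), ridge_shrink(z, 1.0))
```

Any unpenalized estimate whose magnitude is at most the penalty is set exactly to zero by the lasso rule, which is the one-dimensional version of landing on a corner.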
## Lecture 17 Topics
- Eigenvalues and Diagonalization
- Fibonacci numbers example
- Mathematics behind PCA: What is it really?
- Lecture Recording: https://youtu.be/uZIRQkoz4K0
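
A sketch connecting the Fibonacci example to eigenvalues, assuming the convention $F_0 = 0$, $F_1 = 1$. The matrix $\begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$ that advances the sequence has eigenvalues $\varphi = (1+\sqrt{5})/2$ and $\psi = (1-\sqrt{5})/2$, and diagonalizing it yields the closed form $F_n = (\varphi^n - \psi^n)/\sqrt{5}$. The code compares that formula with the plain recurrence; the function names are illustrative.

```python
import math

SQRT5 = math.sqrt(5)
PHI = (1 + SQRT5) / 2  # larger eigenvalue of [[1, 1], [1, 0]]
PSI = (1 - SQRT5) / 2  # smaller eigenvalue of [[1, 1], [1, 0]]


def fib_iterative(n):
    """F_n from the recurrence F_{n+1} = F_n + F_{n-1}, with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def fib_closed_form(n):
    """F_n from the eigenvalue formula; rounding removes floating-point error for moderate n."""
    return round((PHI ** n - PSI ** n) / SQRT5)


for n in (1, 2, 10, 30):
    print(n, fib_iterative(n), fib_closed_form(n))
```

The floating-point formula is only reliable for moderate `n`; for large `n` the integer recurrence is the safer choice.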
## Lecture 18 Topics
- PCA examples
- Clustering Scotland, Wales, England and Northern Ireland
- Eigenfaces
- MNIST digits
- Lecture Recording: https://youtu.be/qHgjJodVIGg
## Lecture 19 Topics
- Intro to Neural Networks by MNIST example in JAX
- Linear Regression
- f( Linear Regression )
- Logistic Regression
- Adding in more hidden layers
- Link to lecture recording: https://youtu.be/PYSz8NAj73g