Source: [Discrete choice and machine learning - two complementary methodologies, Michel Bierlaire](https://youtu.be/pUDA21B4x38), Aug 5, 2021 --- Bierlaire talks about how, when machine learning practitioners use panel data (repeated observations over time for multiple individuals), they often split the data set into training and validation sets by observation. That is incorrect, because all observations for a given individual are highly correlated: the validation set then contains near-duplicates of the training data, so the validation score overestimates how well the model will do on unseen individuals. The data should therefore be split by individual, not by observation. This is the mistake I initially made when training my classifier on the JIGSAWS dataset for my Surgical Data Science course project.
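
To make the difference concrete, here is a minimal sketch of a split by individual, using only the Python standard library. The function name `split_by_individual`, its parameters, and the toy `subject_*` ids are hypothetical illustrations; they are not taken from the talk or from the JIGSAWS data.

```python
import random
from collections import defaultdict


def split_by_individual(individual_ids, val_fraction=0.2, seed=0):
    """Return (train_idx, val_idx) so that no individual appears in both sets.

    individual_ids[i] is the id of the individual that produced observation i.
    """
    # Collect the observation indices belonging to each individual.
    obs_by_individual = defaultdict(list)
    for idx, ind in enumerate(individual_ids):
        obs_by_individual[ind].append(idx)

    # Shuffle the individuals (not the observations) and hold out a fraction.
    individuals = sorted(obs_by_individual)
    random.Random(seed).shuffle(individuals)
    n_val = max(1, round(val_fraction * len(individuals)))
    val_individuals = set(individuals[:n_val])

    # Every observation follows its individual into train or validation.
    train_idx, val_idx = [], []
    for ind, indices in obs_by_individual.items():
        (val_idx if ind in val_individuals else train_idx).extend(indices)
    return sorted(train_idx), sorted(val_idx)


# Toy panel: 5 hypothetical individuals with 4 observations each.
ids = [f"subject_{i}" for i in range(5) for _ in range(4)]
train_idx, val_idx = split_by_individual(ids, val_fraction=0.2, seed=0)

train_ids = {ids[i] for i in train_idx}
val_ids = {ids[i] for i in val_idx}
print(len(train_idx), len(val_idx), train_ids & val_ids)
```

The intersection of `train_ids` and `val_ids` is empty by construction, which is exactly the property a per-observation shuffle does not guarantee. If scikit-learn is available, `GroupShuffleSplit` and `GroupKFold` in `sklearn.model_selection` provide the same kind of grouped split when the individual ids are passed as `groups`.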