## Overview

Decision Tree Regression is a non-linear regression technique that uses a decision tree to predict continuous values. The model splits the data into smaller subsets based on the most informative features and assigns a prediction value to each leaf node.

## Key Components

1. **Root Node** - The topmost node, where the data is first split.
2. **Internal Nodes** - Nodes where further splits are made based on feature values.
3. **Branches** - Connections between nodes representing the possible outcomes of each decision.
4. **Leaf Nodes** - Final nodes that hold the predicted continuous values for the data points that reach them.

## How It Works

1. **Feature Selection** - At each node, the tree chooses the feature and split point that minimise the variance of the target variable within the resulting subsets (a small sketch of this appears under *Worked Examples* below).
2. **Recursive Splitting** - The tree continues to split the data recursively until stopping criteria (e.g., maximum depth) are met.
3. **Prediction** - For new data, the tree routes each data point to a leaf node. The prediction is the mean target value of the training points in that leaf.

## Implementation Example

```python
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Example data (synthetic); replace with your own dataset
X, y = make_regression(n_samples=500, n_features=5, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Initialise Decision Tree Regressor model
dt_regressor = DecisionTreeRegressor(max_depth=5, random_state=0)

# Train model
dt_regressor.fit(X_train, y_train)

# Make predictions and evaluate
predictions = dt_regressor.predict(X_test)
mse = mean_squared_error(y_test, predictions)
print(f'MSE: {mse:.2f}')
```

## Advantages

- Can model non-linear relationships in data.
- No need for feature scaling or normalisation.
- Simple to interpret and visualise.
- Can handle both numerical and categorical features (though scikit-learn's implementation requires categorical features to be encoded as numbers).
- Automatically captures feature interactions.

## Disadvantages

- Prone to overfitting, especially with deep trees.
- Unstable: small changes in the training data can produce a very different tree (sensitive to noise).
- Can create overly complex trees if not properly pruned.
- Not suitable for extrapolation: a tree can only predict the leaf averages learned during training, so it cannot predict outside the training range.

## Hyperparameters

1. **Max Depth (`max_depth`)**
   - Controls the maximum depth of the tree.
   - Limits the number of splits to prevent overfitting.
2. **Min Samples Split (`min_samples_split`)**
   - The minimum number of samples required to split an internal node.
   - Helps control tree growth and prevent overfitting.
3. **Min Samples Leaf (`min_samples_leaf`)**
   - The minimum number of samples required at a leaf node.
   - Ensures that leaf nodes contain enough data, helping prevent overfitting.

## Best Practices

1. **Pruning Techniques** - Apply pre-pruning (limit tree depth) or post-pruning (prune nodes after the tree is built) to reduce overfitting; see the pruning sketch under *Worked Examples*.
2. **Feature Selection** - Remove irrelevant features before building the tree to reduce complexity.
3. **Cross-Validation** - Use cross-validation to select the best hyperparameters (e.g., max depth).

## Common Applications

- Stock price prediction
- Housing price prediction
- Medical data regression
- Predicting sales in retail
- Predicting fuel consumption in vehicles

## Performance Optimisation

1. **Grid Search** - Use grid search to find the optimal hyperparameters (e.g., max depth, min samples split); a cross-validated grid search sketch appears under *Worked Examples*.
2. **Ensemble Methods** - Use [[Random Forest Regression]] or [[Gradient Boosting Regression]] to improve model performance and reduce overfitting.

## Evaluation Metrics

- Mean Squared Error (MSE)
- Root Mean Squared Error (RMSE)
- R² Score
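
## Worked Examples

To make the split-selection step in *How It Works* concrete, here is a minimal sketch of choosing the threshold that minimises the weighted variance of the target, on a toy one-dimensional dataset (the data values are invented for illustration):

```python
import numpy as np

# Toy 1-D data: targets cluster around 1.0 for small x and 5.0 for large x
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.1, 0.9, 1.0, 5.2, 4.8, 5.0])

def weighted_variance(y_left, y_right):
    # Variance of each side, weighted by the fraction of samples it holds
    n = len(y_left) + len(y_right)
    return (len(y_left) / n) * y_left.var() + (len(y_right) / n) * y_right.var()

# Candidate thresholds are the midpoints between consecutive sorted values
thresholds = (X[:-1] + X[1:]) / 2
best = min(thresholds, key=lambda t: weighted_variance(y[X <= t], y[X > t]))
print(f'Best split: x <= {best}')  # expect 3.5, separating the two clusters
```

A real tree repeats this search over every feature at every node, which is why each split picks the feature and threshold jointly.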
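
For the post-pruning best practice, scikit-learn exposes minimal cost-complexity pruning via the `ccp_alpha` parameter. The sketch below (again on a synthetic dataset, chosen only for illustration) computes the pruning path and keeps the alpha that scores best on held-out data:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, n_features=5, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# cost_complexity_pruning_path returns the effective alphas at which
# subtrees would be pruned away; refit at each alpha and compare
path = DecisionTreeRegressor(random_state=0).cost_complexity_pruning_path(X_train, y_train)
scores = [
    DecisionTreeRegressor(ccp_alpha=a, random_state=0)
    .fit(X_train, y_train)
    .score(X_test, y_test)  # R^2 on the held-out set
    for a in path.ccp_alphas
]
best_alpha = path.ccp_alphas[int(np.argmax(scores))]
print(f'Best ccp_alpha: {best_alpha:.4f}, test R^2: {max(scores):.3f}')
```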
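
The grid search mentioned under *Performance Optimisation* combines naturally with cross-validation via scikit-learn's `GridSearchCV`. A sketch, on the same kind of synthetic data, tuning the three hyperparameters listed earlier:

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, n_features=5, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

param_grid = {
    'max_depth': [3, 5, 10, None],
    'min_samples_split': [2, 10, 20],
    'min_samples_leaf': [1, 5, 10],
}
grid = GridSearchCV(
    DecisionTreeRegressor(random_state=0),
    param_grid,
    cv=5,                              # 5-fold cross-validation
    scoring='neg_mean_squared_error',  # maximising negative MSE = minimising MSE
)
grid.fit(X_train, y_train)
print('Best parameters:', grid.best_params_)
print(f'Test R^2 of best model: {grid.best_estimator_.score(X_test, y_test):.3f}')
```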
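
A short sketch computing the three evaluation metrics above with scikit-learn and NumPy (the `y_true`/`y_pred` values are invented for illustration):

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.8, 5.4, 2.9, 6.6])

mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)            # same units as the target
r2 = r2_score(y_true, y_pred)  # 1.0 is a perfect fit
print(f'MSE: {mse:.3f}, RMSE: {rmse:.3f}, R^2: {r2:.3f}')
```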
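
Finally, a quick illustration of the ensemble suggestion: averaging many trees in a random forest typically generalises better than a single unconstrained tree. The comparison below reuses the synthetic-data setup from the earlier sketches:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, n_features=5, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A single deep tree vs. an averaged ensemble of trees
tree = DecisionTreeRegressor(random_state=0).fit(X_train, y_train)
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)
print(f'Single tree test R^2:   {tree.score(X_test, y_test):.3f}')
print(f'Random forest test R^2: {forest.score(X_test, y_test):.3f}')
```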