## Overview
Polynomial Regression is an extension of Linear Regression that models the relationship between the independent and dependent variables as an $n$-th degree polynomial. It is useful when the data exhibits a non-linear relationship.
## Key Components
1. **Polynomial Equation**
- Instead of a simple linear equation ($Y = wX + b$), polynomial regression fits a higher-degree equation: $Y = w_0 + w_1X + w_2X^2 + w_3X^3 + \dots + w_nX^n$
- Where:
- $w_0, w_1, w_2, \dots, w_n$ are the coefficients
- $X$ is the input feature
- $n$ is the polynomial degree
2. **Feature Transformation**
- Converts original features into polynomial features before fitting a linear model
3. **Loss Function (Mean Squared Error - MSE)**
- Measures how well the model fits the data: $\text{MSE} = \frac{1}{m}\sum_{i=1}^{m}(y_i - \hat{y}_i)^2$
4. **Optimization (Gradient Descent or Normal Equation)**
- Finds the best coefficients by minimizing the MSE (a numpy sketch of these components follows this list)
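To make these components concrete, here is a minimal numpy sketch on made-up cubic data (illustrative only): it builds the polynomial features by hand, solves the least-squares problem for the coefficients in closed form, and reports the MSE.

```python
import numpy as np

# Made-up 1-D data roughly following a cubic curve (illustrative only)
rng = np.random.RandomState(0)
x = rng.uniform(-2, 2, size=50)
y = 1.5 * x**3 - x + rng.normal(0, 0.3, size=50)

# Feature transformation: columns [1, x, x^2, x^3] for a degree-3 polynomial
degree = 3
X_poly = np.vander(x, N=degree + 1, increasing=True)

# Optimization: solve the least-squares problem for the coefficients (closed-form fit)
w, *_ = np.linalg.lstsq(X_poly, y, rcond=None)

# Loss: mean squared error of the fitted polynomial
y_hat = X_poly @ w
mse = np.mean((y - y_hat) ** 2)
print("coefficients:", w)
print("MSE:", mse)
```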
## How It Works
1. **Transform Features**
- Convert the input $X$ into $X, X^2, X^3, \dots, X^n$
2. **Fit a Linear Model**
- Apply Linear Regression on the transformed features
3. **Make Predictions**
- Use the polynomial equation to predict the output
## Implementation Example
A minimal end-to-end example with scikit-learn; the data below is synthetic, generated only so the snippet runs on its own.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Synthetic cubic data with noise (illustrative stand-in for a real dataset)
rng = np.random.RandomState(42)
X = rng.uniform(-3, 3, size=(200, 1))
y = 0.5 * X[:, 0] ** 3 - X[:, 0] ** 2 + 2 * X[:, 0] + rng.normal(0, 2, size=200)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Transform input features into polynomial features
poly = PolynomialFeatures(degree=3)
X_train_poly = poly.fit_transform(X_train)
X_test_poly = poly.transform(X_test)

# Train a linear model on the transformed features
poly_reg = LinearRegression()
poly_reg.fit(X_train_poly, y_train)

# Make predictions and evaluate with MSE
predictions = poly_reg.predict(X_test_poly)
mse = mean_squared_error(y_test, predictions)
print(f'MSE: {mse:.2f}')
```
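Note that `PolynomialFeatures` adds a bias (all-ones) column by default; since `LinearRegression` already fits an intercept, you can pass `include_bias=False` to avoid the redundant column.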
## Advantages
- Captures non-linear relationships in data
- More flexible than simple linear regression
- Can model complex real-world scenarios
## Disadvantages
- Higher-degree polynomials can lead to overfitting
- Computationally expensive for large datasets, since the number of polynomial features grows rapidly with the degree and the number of input features
- Sensitive to outliers
## Hyperparameters
1. **Degree of Polynomial (`degree`)**
- Controls complexity of the model
- Higher degree → More flexibility, but risk of overfitting
2. **Regularization (`Ridge` and `Lasso`)**
- Prevents overfitting by penalizing large coefficients (a short Ridge/Lasso sketch follows this list)
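As a rough illustration of the regularization point, the snippet below (on hypothetical synthetic data) fits the same degree-5 polynomial with plain least squares, Ridge (L2), and Lasso (L1); with a suitable penalty strength the regularized fits typically shrink the higher-order coefficients.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.linear_model import LinearRegression, Ridge, Lasso

# Hypothetical noisy 1-D data (for illustration only)
rng = np.random.RandomState(1)
X = rng.uniform(-2, 2, size=(80, 1))
y = X[:, 0] ** 3 - 2 * X[:, 0] + rng.normal(0, 0.5, size=80)

for name, reg in [("ols", LinearRegression()),
                  ("ridge", Ridge(alpha=1.0)),
                  ("lasso", Lasso(alpha=0.1))]:
    # Scale the polynomial features so the penalty treats each term comparably
    model = make_pipeline(PolynomialFeatures(degree=5), StandardScaler(), reg)
    model.fit(X, y)
    print(name, model[-1].coef_.round(2))
```

The `alpha` values here are arbitrary; in practice they are tuned, e.g. by cross-validation.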
## Best Practices
1. **Choose the Right Degree**
- Start with a low-degree polynomial and increase if needed
2. **Regularization**
- Use L2 (Ridge) or L1 (Lasso) regularization to reduce overfitting
3. **Feature Scaling**
- Standardize features for better performance (see the pipeline sketch after this list)
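A sketch combining these practices: wrap the polynomial expansion and scaling in a pipeline, then compare candidate degrees with cross-validation before committing to one. The data here is a synthetic stand-in; in practice use your own `X` and `y`.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data (replace with the real X and y)
rng = np.random.RandomState(2)
X = rng.uniform(-3, 3, size=(150, 1))
y = 0.5 * X[:, 0] ** 2 + X[:, 0] + rng.normal(0, 1.0, size=150)

# Start with low degrees and only increase while the validation error keeps improving
for degree in range(1, 6):
    model = make_pipeline(PolynomialFeatures(degree=degree),
                          StandardScaler(),
                          LinearRegression())
    scores = cross_val_score(model, X, y, cv=5,
                             scoring="neg_mean_squared_error")
    print(f"degree={degree}  CV MSE={-scores.mean():.3f}")
```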
## Common Applications
- Stock price prediction
- Growth rate modelling
- Weather forecasting
- Sales trend analysis
- Medical dose-response modelling
## Performance Optimization
1. **Reduce Overfitting**
- Use cross-validation to find the optimal polynomial degree (see the grid-search sketch after this list)
2. **Feature Selection**
- Avoid using too many high-degree terms
3. **Use [[Ridge Regression]] and [[Lasso Regression]]**
- Regularization prevents extreme coefficient values
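One way to combine these ideas, sketched below on placeholder data: a grid search that cross-validates over both the polynomial degree and the Ridge penalty strength (parameter names follow scikit-learn's `step__param` convention).

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Placeholder data; replace with the real X and y
rng = np.random.RandomState(3)
X = rng.uniform(-2, 2, size=(120, 1))
y = X[:, 0] ** 3 + rng.normal(0, 0.4, size=120)

pipe = Pipeline([
    ("poly", PolynomialFeatures()),
    ("scale", StandardScaler()),
    ("ridge", Ridge()),
])
param_grid = {
    "poly__degree": [1, 2, 3, 4, 5],
    "ridge__alpha": [0.01, 0.1, 1.0, 10.0],
}
search = GridSearchCV(pipe, param_grid, cv=5, scoring="neg_mean_squared_error")
search.fit(X, y)
print("best params:", search.best_params_)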
## Evaluation Metrics
- Mean Squared Error (MSE)
- Root Mean Squared Error (RMSE)
- R² Score
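A small sketch of computing these metrics with scikit-learn; the arrays below are placeholders, and in practice you would pass `y_test` and `predictions` from the implementation example above.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Placeholder arrays; in practice use y_test and predictions from the example above
y_true = np.array([3.1, -1.2, 0.5, 2.4])
y_pred = np.array([2.9, -0.8, 0.7, 2.0])

mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)              # RMSE is in the same units as the target
r2 = r2_score(y_true, y_pred)    # R^2 = 1 means a perfect fit
print(f"MSE={mse:.3f}  RMSE={rmse:.3f}  R2={r2:.3f}")
```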