## Overview
Polynomial Regression is an extension of Linear Regression that models the relationship between the independent and dependent variables as an $n$-th degree polynomial. It is useful when the data exhibits a non-linear relationship.
## Key Components
1. **Polynomial Equation**
- Instead of a simple linear equation ($Y = wX + b$), polynomial regression fits a higher-degree equation: $Y = w_0 + w_1X + w_2X^2 + w_3X^3 + \dots + w_nX^n$
- Where:
- $w_0, w_1, w_2, \dots, w_n$ are the coefficients
- $X$ is the input feature
- $n$ is the polynomial degree
2. **Feature Transformation**
- Converts original features into polynomial features before fitting a linear model
3. **Loss Function (Mean Squared Error - MSE)**
- Measures how well the model fits the data: $\text{MSE} = \frac{1}{m}\sum_{i=1}^{m}(y_i - \hat{y}_i)^2$
4. **Optimization (Gradient Descent or Normal Equation)**
- Finds the best coefficients by minimizing the MSE (a numpy sketch of these components follows this list)
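To make these components concrete, here is a minimal numpy sketch on made-up cubic data (illustrative only): it builds the polynomial features by hand, solves the least-squares problem for the coefficients in closed form, and reports the MSE.

```python
import numpy as np

# Made-up 1-D data roughly following a cubic curve (illustrative only)
rng = np.random.RandomState(0)
x = rng.uniform(-2, 2, size=50)
y = 1.5 * x**3 - x + rng.normal(0, 0.3, size=50)

# Feature transformation: columns [1, x, x^2, x^3] for a degree-3 polynomial
degree = 3
X_poly = np.vander(x, N=degree + 1, increasing=True)

# Optimization: solve the least-squares problem for the coefficients (closed-form fit)
w, *_ = np.linalg.lstsq(X_poly, y, rcond=None)

# Loss: mean squared error of the fitted polynomial
y_hat = X_poly @ w
mse = np.mean((y - y_hat) ** 2)
print("coefficients:", w)
print("MSE:", mse)
```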
## How It Works
1. **Transform Features**
- Convert the input $X$ into $X, X^2, X^3, \dots, X^n$
2. **Fit a Linear Model**
- Apply Linear Regression on the transformed features
3. **Make Predictions**
- Use the polynomial equation to predict the output
## Implementation Example
A minimal end-to-end example with scikit-learn; the data below is synthetic, generated only so the snippet runs on its own.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Synthetic cubic data with noise (illustrative stand-in for a real dataset)
rng = np.random.RandomState(42)
X = rng.uniform(-3, 3, size=(200, 1))
y = 0.5 * X[:, 0] ** 3 - X[:, 0] ** 2 + 2 * X[:, 0] + rng.normal(0, 2, size=200)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Transform input features into polynomial features
poly = PolynomialFeatures(degree=3)
X_train_poly = poly.fit_transform(X_train)
X_test_poly = poly.transform(X_test)

# Train a linear model on the transformed features
poly_reg = LinearRegression()
poly_reg.fit(X_train_poly, y_train)

# Make predictions and evaluate with MSE
predictions = poly_reg.predict(X_test_poly)
mse = mean_squared_error(y_test, predictions)
print(f'MSE: {mse:.2f}')
```
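Note that `PolynomialFeatures` adds a bias (all-ones) column by default; since `LinearRegression` already fits an intercept, you can pass `include_bias=False` to avoid the redundant column.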
## Advantages
- Captures non-linear relationships in data
- More flexible than simple linear regression
- Can model complex real-world scenarios
## Disadvantages
- Higher-degree polynomials can lead to overfitting
- Computationally expensive for large datasets, since the number of polynomial features grows rapidly with the degree and the number of input features
- Sensitive to outliers
## Hyperparameters
1. **Degree of Polynomial (`degree`)**
- Controls complexity of the model
- Higher degree → More flexibility, but risk of overfitting
2. **Regularization (`Ridge` and `Lasso`)**
- Prevents overfitting by penalizing large coefficients (a short Ridge/Lasso sketch follows this list)
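As a rough illustration of the regularization point, the snippet below (on hypothetical synthetic data) fits the same degree-5 polynomial with plain least squares, Ridge (L2), and Lasso (L1); with a suitable penalty strength the regularized fits typically shrink the higher-order coefficients.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.linear_model import LinearRegression, Ridge, Lasso

# Hypothetical noisy 1-D data (for illustration only)
rng = np.random.RandomState(1)
X = rng.uniform(-2, 2, size=(80, 1))
y = X[:, 0] ** 3 - 2 * X[:, 0] + rng.normal(0, 0.5, size=80)

for name, reg in [("ols", LinearRegression()),
                  ("ridge", Ridge(alpha=1.0)),
                  ("lasso", Lasso(alpha=0.1))]:
    # Scale the polynomial features so the penalty treats each term comparably
    model = make_pipeline(PolynomialFeatures(degree=5), StandardScaler(), reg)
    model.fit(X, y)
    print(name, model[-1].coef_.round(2))
```

The `alpha` values here are arbitrary; in practice they are tuned, e.g. by cross-validation.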
## Best Practices
1. **Choose the Right Degree**
- Start with a low-degree polynomial and increase if needed
2. **Regularization**
- Use L2 (Ridge) or L1 (Lasso) regularization to reduce overfitting
3. **Feature Scaling**
- Standardize features for better performance (see the pipeline sketch after this list)
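A sketch combining these practices: wrap the polynomial expansion and scaling in a pipeline, then compare candidate degrees with cross-validation before committing to one. The data here is a synthetic stand-in; in practice use your own `X` and `y`.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data (replace with the real X and y)
rng = np.random.RandomState(2)
X = rng.uniform(-3, 3, size=(150, 1))
y = 0.5 * X[:, 0] ** 2 + X[:, 0] + rng.normal(0, 1.0, size=150)

# Start with low degrees and only increase while the validation error keeps improving
for degree in range(1, 6):
    model = make_pipeline(PolynomialFeatures(degree=degree),
                          StandardScaler(),
                          LinearRegression())
    scores = cross_val_score(model, X, y, cv=5,
                             scoring="neg_mean_squared_error")
    print(f"degree={degree}  CV MSE={-scores.mean():.3f}")
```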
## Common Applications
- Stock price prediction
- Growth rate modelling
- Weather forecasting
- Sales trend analysis
- Medical dose-response modelling
## Performance Optimization
1. **Reduce Overfitting**
- Use cross-validation to find the optimal polynomial degree (see the grid-search sketch after this list)
2. **Feature Selection**
- Avoid using too many high-degree terms
3. **Use [[Ridge Regression]] and [[Lasso Regression]]**
- Regularization prevents extreme coefficient values
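One way to combine these ideas, sketched below on placeholder data: a grid search that cross-validates over both the polynomial degree and the Ridge penalty strength (parameter names follow scikit-learn's `step__param` convention).

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Placeholder data; replace with the real X and y
rng = np.random.RandomState(3)
X = rng.uniform(-2, 2, size=(120, 1))
y = X[:, 0] ** 3 + rng.normal(0, 0.4, size=120)

pipe = Pipeline([
    ("poly", PolynomialFeatures()),
    ("scale", StandardScaler()),
    ("ridge", Ridge()),
])
param_grid = {
    "poly__degree": [1, 2, 3, 4, 5],
    "ridge__alpha": [0.01, 0.1, 1.0, 10.0],
}
search = GridSearchCV(pipe, param_grid, cv=5, scoring="neg_mean_squared_error")
search.fit(X, y)
print("best params:", search.best_params_)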
## Evaluation Metrics
- Mean Squared Error (MSE)
- Root Mean Squared Error (RMSE)
- R² Score
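A small sketch of computing these metrics with scikit-learn; the arrays below are placeholders, and in practice you would pass `y_test` and `predictions` from the implementation example above.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Placeholder arrays; in practice use y_test and predictions from the example above
y_true = np.array([3.1, -1.2, 0.5, 2.4])
y_pred = np.array([2.9, -0.8, 0.7, 2.0])

mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)              # RMSE is in the same units as the target
r2 = r2_score(y_true, y_pred)    # R^2 = 1 means a perfect fit
print(f"MSE={mse:.3f}  RMSE={rmse:.3f}  R2={r2:.3f}")
```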