Recurrent Neural Networks (RNNs) are a type of neural network designed for **sequential data** such as **text, speech, and time series**. They have a **memory mechanism**, the hidden state, that lets them carry information from earlier steps while processing a sequence.
![[RNN1.png]]
### Key Features of RNNs
- **Handles Sequential Data** – Maintains context from previous inputs.
- **Loops (Recurrence)** – At each time step, the network combines the **current input** with the **previous hidden state**.
- **Shared Weights** – The same weights are applied across all time steps.
### How Do RNNs Work?
1. The network processes an input **one step at a time**.
2. The **hidden state** stores information from previous steps.
3. The final output is generated based on the sequential context.
Mathematically:
$h_t = f(W_h h_{t-1} + W_x x_t + b)$
Where:
- $h_t$ = Hidden state at time $t$.
- $x_t$ = Input at time $t$.
- $W_h$, $W_x$ = Weight matrices for the previous hidden state and the current input.
- $b$ = Bias.
- $f$ = Activation function (typically **Tanh**).
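To make the recurrence concrete, here is a minimal NumPy sketch of a forward pass (the dimensions, weight scales, and function name are illustrative assumptions, not from the note). The same `W_h`, `W_x`, and `b` are reused at every step, which is exactly the weight sharing described above.

```python
import numpy as np

def rnn_forward(x_seq, W_h, W_x, b, h0):
    """Run a vanilla RNN over a sequence, reusing the same weights at every step."""
    h = h0
    hidden_states = []
    for x_t in x_seq:                            # one input vector per time step
        h = np.tanh(W_h @ h + W_x @ x_t + b)     # h_t = f(W_h h_{t-1} + W_x x_t + b)
        hidden_states.append(h)
    return np.stack(hidden_states)

# Toy dimensions: 3-dim inputs, 4-dim hidden state, sequence of 5 steps
rng = np.random.default_rng(0)
x_seq = rng.normal(size=(5, 3))
W_h = rng.normal(size=(4, 4)) * 0.1
W_x = rng.normal(size=(4, 3)) * 0.1
b = np.zeros(4)
h0 = np.zeros(4)

states = rnn_forward(x_seq, W_h, W_x, b, h0)
print(states.shape)  # (5, 4): one hidden state per time step
```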
---
### Problems with RNNs
- **Vanishing Gradient** – Gradients shrink as they are backpropagated through long sequences, making learning difficult (illustrated in the sketch below).
- **Long-Term Dependencies** – The network struggles to retain information from many steps earlier.
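The sketch below gives a toy numeric illustration of the vanishing-gradient effect (random small weights, purely for demonstration): backpropagating through many tanh steps multiplies many Jacobian factors whose norms are below one, so the accumulated gradient decays toward zero.

```python
import numpy as np

# Backprop through T tanh steps multiplies T Jacobians of the form
# diag(tanh'(h)) @ W_h; with small weights their norms are < 1, so the
# product (the gradient w.r.t. the first hidden state) shrinks rapidly.
rng = np.random.default_rng(0)
W_h = rng.normal(size=(4, 4)) * 0.3     # small recurrent weights
grad = np.eye(4)
for t in range(50):
    h = rng.normal(size=4)              # stand-in for the step's pre-activation
    d_tanh = np.diag(1 - np.tanh(h) ** 2)
    grad = d_tanh @ W_h @ grad          # chain rule through one more step
print(np.linalg.norm(grad))             # vanishingly small after 50 steps
```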
**Solutions:**
- **LSTMs (Long Short-Term Memory)** – Introduce memory cells with **gates** that control what information is kept, updated, or forgotten (a usage sketch for both follows this list).
- **GRUs (Gated Recurrent Units)** – A simpler version of LSTMs with fewer parameters.
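Both come off the shelf in modern frameworks. Here is a minimal PyTorch usage sketch (the sizes are arbitrary assumptions): the LSTM carries a separate cell state alongside its hidden state, while the GRU makes do with a single hidden state and fewer gates.

```python
import torch
import torch.nn as nn

# Hypothetical sizes: 10-dim inputs, 20-dim hidden state, batch of 2, 7 steps
lstm = nn.LSTM(input_size=10, hidden_size=20, batch_first=True)
gru = nn.GRU(input_size=10, hidden_size=20, batch_first=True)

x = torch.randn(2, 7, 10)           # (batch, seq_len, input_size)

lstm_out, (h_n, c_n) = lstm(x)      # LSTM returns hidden state AND cell state
gru_out, gru_h_n = gru(x)           # GRU keeps only a hidden state

print(lstm_out.shape, gru_out.shape)  # both: torch.Size([2, 7, 20])
```

Both modules return the full sequence of hidden states plus the final state(s), so they drop into the same places a vanilla RNN would.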
### Applications of RNNs
- **Natural Language Processing (NLP)** – Sentiment analysis, machine translation.
- **Speech Recognition** – Converts spoken words to text.
- **Time-Series Prediction** – Stock market trends, weather forecasting.
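As a concrete instance of the time-series case, here is a sketch of a many-to-one PyTorch model that reads a window of values and predicts the next one (the `SequenceRegressor` name, layer sizes, and sine-wave input are illustrative assumptions, not from the note).

```python
import torch
import torch.nn as nn

class SequenceRegressor(nn.Module):
    """Many-to-one RNN: read a sequence, predict the next value."""
    def __init__(self, input_size=1, hidden_size=16):
        super().__init__()
        self.rnn = nn.RNN(input_size, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):
        _, h_n = self.rnn(x)       # h_n: final hidden state, (1, batch, hidden)
        return self.head(h_n[-1])  # predict one value from the last hidden state

model = SequenceRegressor()
x = torch.sin(torch.linspace(0, 6.28, 20)).reshape(1, 20, 1)  # toy sine window
print(model(x).shape)  # torch.Size([1, 1]): the next-value prediction
```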