LSTMs & GRUs

Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks are gated RNN architectures designed to capture long-term dependencies in sequential data.

The Problem with Standard RNNs

Standard RNNs suffer from vanishing (and sometimes exploding) gradients: as the error signal is backpropagated through many time steps, it is repeatedly multiplied by the recurrent weights, so it shrinks toward zero and long-range dependencies become very hard to learn. LSTMs and GRUs mitigate this with gating mechanisms that give gradients a more direct path through time.
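
As a rough, illustrative sketch (not part of the original text), the snippet below repeatedly multiplies a gradient vector by the same recurrent weight matrix, mimicking backpropagation through time in a linearized RNN; the matrix size and scale are arbitrary choices.

python
import numpy as np

# Illustrative only: repeated multiplication by the recurrent weights during
# backpropagation through time shrinks the gradient exponentially when the
# weights' spectral radius is below 1 (and blows it up when it is above 1).
np.random.seed(0)

W = 0.3 * np.random.randn(8, 8)   # small recurrent weight matrix (arbitrary scale)
grad = np.ones(8)                 # gradient arriving at the final time step

for step in range(1, 51):
    grad = W.T @ grad             # one step of backpropagation through time
    if step % 10 == 0:
        print(f"after {step:2d} steps: gradient norm = {np.linalg.norm(grad):.2e}")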

LSTM

  • Forget Gate: Decides what information to discard from the cell state
  • Input Gate: Decides what new information to write to the cell state
  • Output Gate: Decides how much of the cell state to expose as the hidden state
  • Cell State: Long-term memory carried across time steps (see the sketch after this list)
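
A minimal NumPy sketch of a single LSTM step, assuming hidden size H and matrices W, U, b that stack the forget/input/output/candidate blocks; the function name and shapes are illustrative, and practical implementations (for example torch.nn.LSTM) add batching, initialization, and multiple layers.

python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. W, U, b stack the forget/input/output/candidate blocks."""
    z = W @ x + U @ h_prev + b                    # all four pre-activations at once
    f, i, o, g = np.split(z, 4)
    f, i, o = sigmoid(f), sigmoid(i), sigmoid(o)  # forget, input, output gates in (0, 1)
    g = np.tanh(g)                                # candidate values to write
    c = f * c_prev + i * g                        # cell state: keep some old, add some new
    h = o * np.tanh(c)                            # hidden state: gated view of the cell
    return h, c

# Toy usage: input size 3, hidden size 4 (values are random, for shape-checking only).
rng = np.random.default_rng(0)
H, D = 4, 3
W, U, b = rng.normal(size=(4 * H, D)), rng.normal(size=(4 * H, H)), np.zeros(4 * H)
h, c = lstm_step(rng.normal(size=D), np.zeros(H), np.zeros(H), W, U, b)
print(h.shape, c.shape)  # (4,) (4,)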

GRU

  • Update Gate: Controls how much of the previous state is replaced by the new candidate state
  • Reset Gate: Decides how much of the previous state to use when forming the candidate
  • Simpler than LSTM (fewer parameters, no separate cell state)
  • Often performs comparably to LSTM in practice (see the sketch after this list)
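
A comparable sketch of one GRU step (again illustrative, with separate weight matrices per gate and biases omitted for brevity):

python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU time step with separate (illustrative) weight matrices per gate."""
    z = sigmoid(Wz @ x + Uz @ h_prev)              # update gate: how much to change
    r = sigmoid(Wr @ x + Ur @ h_prev)              # reset gate: how much history to reuse
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h_prev))  # candidate state
    return (1 - z) * h_prev + z * h_tilde          # blend old state with the candidate

Because a GRU has only two gates and no separate cell state, it needs fewer parameters than an LSTM with the same hidden size.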

Gate Mechanism Example

The simplified sketch below shows how a sigmoid gate scales a vector of values to control how much information passes through. The numbers and variable names are illustrative only.

python
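import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative values: candidate information and a raw "gate" score per element.
values = np.array([0.9, -0.5, 2.0, 0.1])
gate_scores = np.array([4.0, -4.0, 0.0, 2.0])

gate = sigmoid(gate_scores)   # squashed to (0, 1): how much of each value may pass
passed = gate * values        # element-wise gating, as inside LSTM/GRU cells

print("gate:   ", np.round(gate, 2))
print("passed: ", np.round(passed, 2))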