LSTMs & GRUs
Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) are advanced RNN architectures designed to handle long-term dependencies in sequential data.
The Problem with Standard RNNs
Standard RNNs suffer from vanishing (and exploding) gradients when backpropagating through many time steps, which makes it hard to learn long-range dependencies. LSTMs and GRUs address this with gating mechanisms that control what information is kept, updated, or discarded.
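A quick numeric sketch of why gradients vanish: backpropagating through many time steps multiplies many per-step gradient factors, and if those factors are below 1 the product collapses toward zero. The 0.9 per-step factor below is purely illustrative.

```python
# Illustrative only: backpropagating through many RNN time steps multiplies
# many per-step gradient factors; if each factor is below 1, the product
# collapses toward zero (the 0.9 value here is hypothetical).
factor = 0.9
for t in [1, 10, 50, 100]:
    print(f"after {t:3d} steps: gradient factor ~ {factor ** t:.6f}")
```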
LSTM
- Forget Gate: decides what to discard from the cell state
- Input Gate: decides what new information to add to the cell state
- Output Gate: decides what part of the cell state to expose as the hidden state
- Cell State: the long-term memory carried across time steps (a single-step sketch follows this list)
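A minimal NumPy sketch of one LSTM step following the standard gate equations; the weight shapes, random seed, and layer sizes are illustrative assumptions, not a production implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step. Each W[k] maps the concatenation [h_prev, x] to a gate pre-activation."""
    z = np.concatenate([h_prev, x])
    f = sigmoid(W["f"] @ z + b["f"])        # forget gate: how much of c_prev to keep
    i = sigmoid(W["i"] @ z + b["i"])        # input gate: how much new content to add
    c_tilde = np.tanh(W["c"] @ z + b["c"])  # candidate cell content
    o = sigmoid(W["o"] @ z + b["o"])        # output gate: what to expose as h
    c = f * c_prev + i * c_tilde            # new cell state (long-term memory)
    h = o * np.tanh(c)                      # new hidden state
    return h, c

# Tiny example with random (hypothetical) weights
rng = np.random.default_rng(0)
hidden, inputs = 3, 2
W = {k: rng.normal(size=(hidden, hidden + inputs)) for k in "fico"}
b = {k: np.zeros(hidden) for k in "fico"}
h, c = lstm_step(rng.normal(size=inputs), np.zeros(hidden), np.zeros(hidden), W, b)
print("h:", h)
print("c:", c)
```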
GRU
- Update Gate: controls how much of the previous hidden state is kept versus replaced
- Reset Gate: decides how much of the past to forget when forming the new candidate state
- Simpler than the LSTM (no separate cell state, so fewer parameters)
- Often performs comparably to the LSTM in practice (a single-step sketch follows this list)
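A matching sketch of one GRU step. GRU formulations vary slightly between papers and libraries; the blend convention, weights, and sizes below are assumptions for illustration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, W, b):
    """One GRU step, using the common convention where the update gate z
    blends the old hidden state with the new candidate."""
    zin = np.concatenate([h_prev, x])
    z = sigmoid(W["z"] @ zin + b["z"])           # update gate
    r = sigmoid(W["r"] @ zin + b["r"])           # reset gate
    cand_in = np.concatenate([r * h_prev, x])    # reset gate scales the past
    h_tilde = np.tanh(W["h"] @ cand_in + b["h"]) # candidate hidden state
    return (1 - z) * h_prev + z * h_tilde        # blend old state and candidate

# Tiny example with random (hypothetical) weights
rng = np.random.default_rng(1)
hidden, inputs = 3, 2
W = {k: rng.normal(size=(hidden, hidden + inputs)) for k in "zrh"}
b = {k: np.zeros(hidden) for k in "zrh"}
h = gru_step(rng.normal(size=inputs), np.zeros(hidden), W, b)
print("h:", h)
```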
Gate Mechanism Example
This demonstrates how gates control information flow (simplified).
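Below is a minimal sketch of the idea, with illustrative values: a sigmoid gate produces numbers between 0 and 1 that scale how much of a signal passes through, so values near 0 block the signal and values near 1 let it flow.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# A "gate" is just a sigmoid output between 0 and 1 that multiplies a signal.
signal = np.array([2.0, -1.0, 0.5, 3.0])       # illustrative memory contents
gate_logits = np.array([-4.0, 4.0, 0.0, 2.0])  # illustrative gate pre-activations
gate = sigmoid(gate_logits)

print("gate values: ", np.round(gate, 3))
print("gated signal:", np.round(gate * signal, 3))
# Element 0 is mostly blocked (gate ~0.02); element 1 passes almost unchanged (~0.98).
```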