Recurrent Neural Networks
Networks That Remember: Processing Sequences Over Time
What You'll Discover
Understand how RNNs process sequential data with memory
Hidden State Memory
See how RNNs maintain context across time steps by passing hidden states forward.
Vanishing Gradients
Understand why vanilla RNNs struggle with long sequences and how gradients decay.
LSTM & GRU Gates
Learn how gating mechanisms control what to remember, forget, and output.
Real-World Applications
Match RNN architectures to tasks from sentiment analysis to machine translation.
Key Concepts
Sequential Data
Data where order matters: text, time series, audio, video
Hidden States
The RNN's memory that carries context between time steps
Vanishing Gradients
Gradients shrink exponentially with sequence length, limiting long-range learning; see the numeric sketch after this list
LSTM
Long Short-Term Memory with forget, input, and output gates; a cell-level sketch follows this list
GRU
Simplified gating with update and reset gates
Sequence-to-Sequence
Encoder-decoder architecture for translation and summarization
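To make the vanishing-gradients entry concrete, here is a minimal numeric sketch. Backpropagation through time multiplies one Jacobian of the recurrence per step; the 0.9 factor below is a hypothetical stand-in for a per-step Jacobian norm slightly under 1, not a value measured from any real network.

```python
# Hypothetical illustration: if each backward step scales the gradient
# by a factor below 1, the signal decays exponentially with distance.
T = 50            # number of time steps to backpropagate through
per_step = 0.9    # assumed per-step factor (e.g., weight norm * tanh' < 1)
grad = 1.0
for _ in range(T):
    grad *= per_step
print(f"gradient signal after {T} steps: {grad:.1e}")  # ~5.2e-03, nearly gone
```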
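And here is a rough NumPy sketch of the gating idea from the LSTM entry. The weight names, dictionary layout, and toy dimensions are illustrative assumptions, not a reference implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step: gates decide what to forget, what to write, what to output."""
    z = np.concatenate([h_prev, x])        # previous context + current input
    f = sigmoid(W["f"] @ z + b["f"])       # forget gate: keep or erase old cell state
    i = sigmoid(W["i"] @ z + b["i"])       # input gate: how much new content to write
    o = sigmoid(W["o"] @ z + b["o"])       # output gate: how much memory to expose
    c_new = f * c_prev + i * np.tanh(W["c"] @ z + b["c"])  # updated cell state
    h_new = o * np.tanh(c_new)             # new hidden state
    return h_new, c_new

# Toy shapes assumed for illustration: hidden size 4, input size 3.
rng = np.random.default_rng(0)
H, X = 4, 3
W = {k: rng.normal(0, 0.1, (H, H + X)) for k in "fioc"}
b = {k: np.zeros(H) for k in "fioc"}
h, c = lstm_step(rng.normal(size=X), np.zeros(H), np.zeros(H), W, b)
```

A GRU follows the same pattern with only two gates (update and reset) and no separate cell state, which makes it cheaper per step.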
Why Sequential Data Needs Special Networks
Many real-world data types have a natural order that matters:
- Text: "Dog bites man" vs. "Man bites dog" (same words, opposite meaning)
- Time series: stock prices, weather data, sensor readings
- Audio: speech is a sequence of sounds over time
- Video: a sequence of image frames
A regular feedforward network treats all inputs independently — it has no concept of order. If you feed it the words "the cat sat" as three separate inputs, it doesn't know which came first, second, or third.
Recurrent Neural Networks (RNNs) solve this by processing inputs one at a time in order, maintaining a hidden state that carries information from previous steps. Think of it like reading a sentence: your understanding of each word is shaped by the words that came before it.
The key idea: instead of processing the entire sequence at once, process it step by step, building up context as you go.
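Here is a minimal sketch of that loop in NumPy. The dimensions, weight names, and random toy inputs are assumptions for illustration; a real model would learn the weights and feed in actual word embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)
H, X = 8, 4                          # hidden size and input size (toy values)
W_xh = rng.normal(0, 0.1, (H, X))    # input-to-hidden weights
W_hh = rng.normal(0, 0.1, (H, H))    # hidden-to-hidden weights (the recurrence)
b_h = np.zeros(H)

def rnn_step(x_t, h_prev):
    """One time step: blend the current input with the carried-over context."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

# Stand-ins for embeddings of "the", "cat", "sat".
sequence = [rng.normal(size=X) for _ in range(3)]
h = np.zeros(H)                      # empty memory before the first word
for x_t in sequence:
    h = rnn_step(x_t, h)             # h now summarizes everything read so far
```

The same `rnn_step` runs at every position with the same weights; only the hidden state changes, and that changing state is what carries context forward.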
Sequential Data in the Real World
| Data Type | Example | Why Order Matters | Typical Task |
|---|---|---|---|
| Text | "I love this movie" | Word order determines meaning | Sentiment analysis |
| Time Series | Stock: 100, 105, 103, 110 | Trends depend on ordering | Price prediction |
| Audio | Speech waveform | Sounds must be in sequence | Speech recognition |
| DNA | ATCGATCG... | Gene sequences encode proteins | Protein structure prediction |
| Music | Notes over time | Melody = notes in order | Music generation |
| Video | Frame 1, Frame 2, ... | Actions unfold over time | Activity recognition |
Feedforward vs Recurrent Networks
| Aspect | Feedforward Network | Recurrent Network |
|---|---|---|
| Input processing | All at once (parallel) | One step at a time (sequential) |
| Memory | None — each input independent | Hidden state carries context forward |
| Word order | "cat sat the" = "the cat sat" | "the cat sat" ≠ "cat sat the" |
| Variable length | Fixed input size only | Handles any sequence length |
| Best for | Images, tabular data | Text, time series, audio |
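The variable-length row deserves emphasis: because an RNN reuses one step function, the loop simply runs longer for longer inputs and still produces a fixed-size summary. A small sketch, again with assumed toy sizes and random stand-in inputs:

```python
import numpy as np

rng = np.random.default_rng(0)
H, X = 8, 4
W_xh = rng.normal(0, 0.1, (H, X))
W_hh = rng.normal(0, 0.1, (H, H))

def encode(seq):
    """Fold a sequence of any length into one fixed-size hidden state."""
    h = np.zeros(H)
    for x_t in seq:
        h = np.tanh(W_xh @ x_t + W_hh @ h)
    return h

print(encode([rng.normal(size=X) for _ in range(3)]).shape)   # (8,)
print(encode([rng.normal(size=X) for _ in range(20)]).shape)  # (8,): same-size summary
```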