Memory Interleaving & Organization
Memory is slow compared to the CPU. How do we keep the CPU from waiting on it? One technique is memory interleaving: organizing memory into multiple banks that can be accessed in parallel, which raises throughput even though each individual access is no faster.
Why Interleaving?
Without interleaving, accessing sequential memory addresses requires waiting for each access to complete before starting the next. With interleaving, we can overlap these accesses!
Without Interleaving (Sequential)
+---------------------------------------------+
|  Address 0     Address 1     Address 2      |
|      |             |             |          |
|      v             v             v          |
|  [Access 1]    [Access 2]    [Access 3]     |
|  (wait...)     (wait...)     (wait...)      |
|                                             |
|  Total time = 3 × Memory Access Time        |
+---------------------------------------------+
With Interleaving (Parallel)
+---------------------------------------------+
|  Bank 0:  Address 0   |   Address 3         |
|               |               |             |
|               v               v             |
|  Bank 1:  Address 1   |   Address 4         |
|               |               |             |
|               v               v             |
|  Bank 2:  Address 2   |   Address 5         |
|               |               |             |
|               v               v             |
|        [Access 0,1,2]  [Access 3,4,5]       |
|         (overlapped)    (overlapped)        |
|                                             |
|  Time per group ≈ Memory Access Time        |
|  (one group = 3 consecutive accesses)       |
+---------------------------------------------+
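The two diagrams above can be turned into a small timing model. The Python sketch below is an idealized illustration, not a model of any real memory controller: it assumes consecutive addresses map to consecutive banks, every bank has the same access time, and the time to move data over the bus is ignored. The function names and the 100 ns access time are made up for the example.

```python
import math


def sequential_time(n_accesses: int, access_time: int) -> int:
    """No interleaving: each access starts only after the previous one ends."""
    return n_accesses * access_time


def interleaved_time(n_accesses: int, access_time: int, n_banks: int) -> int:
    """Idealized interleaving: up to n_banks consecutive accesses overlap.

    Accesses are served in groups of n_banks; each group takes one
    access time because its accesses hit different banks.
    """
    groups = math.ceil(n_accesses / n_banks)
    return groups * access_time


ACCESS_TIME_NS = 100  # assumed value, for illustration only

print(sequential_time(3, ACCESS_TIME_NS))      # 300
print(interleaved_time(3, ACCESS_TIME_NS, 3))  # 100
print(interleaved_time(6, ACCESS_TIME_NS, 3))  # 200
```

With three banks, a burst of three accesses costs one access time instead of three, and six accesses cost two access times, one per group.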
Types of Interleaving
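- High-Order Interleaving: The high-order address bits select the module, so each module stores one contiguous address range and sequential accesses stay within a single module
- Low-Order Interleaving: The low-order address bits select the module, so consecutive addresses go to different modules and sequential accesses can overlap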
High-Order Interleaving
+---------------------------------------------+
|  Module 0: Addresses 0-255                  |
|  Module 1: Addresses 256-511                |
|  Module 2: Addresses 512-767                |
|  Module 3: Addresses 768-1023               |
+---------------------------------------------+
Low-Order Interleaving
+---------------------------------------------+
|  Module 0: Addresses 0, 4, 8, 12, ...       |
|  Module 1: Addresses 1, 5, 9, 13, ...       |
|  Module 2: Addresses 2, 6, 10, 14, ...      |
|  Module 3: Addresses 3, 7, 11, 15, ...      |
+---------------------------------------------+
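Both schemes boil down to a choice of which address bits select the module. The following Python sketch reproduces the two diagrams above for 4 modules of 256 addresses each; the function names are illustrative, and the code assumes word-granular addresses exactly as drawn.

```python
NUM_MODULES = 4    # matches the diagrams above
MODULE_SIZE = 256  # addresses per module, as in the high-order diagram


def high_order(addr: int) -> tuple[int, int]:
    """High-order bits pick the module; low-order bits pick the offset."""
    return addr // MODULE_SIZE, addr % MODULE_SIZE


def low_order(addr: int) -> tuple[int, int]:
    """Low-order bits pick the module; high-order bits pick the offset."""
    return addr % NUM_MODULES, addr // NUM_MODULES


# Modules touched by four consecutive addresses under each scheme.
print([high_order(a)[0] for a in range(4)])  # [0, 0, 0, 0]
print([low_order(a)[0] for a in range(4)])   # [0, 1, 2, 3]

# (module, offset within module) for single addresses.
print(high_order(300))  # (1, 44)
print(low_order(13))    # (1, 3)
```

Four consecutive addresses hit one module under high-order interleaving but four different modules under low-order interleaving, which is why only the latter lets sequential accesses overlap.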
Memory Organization
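Modern memory systems are organized hierarchically, from the largest unit to the smallest:
- Channels: Independent data paths between the memory controller and the memory modules
- Slots/DIMMs: Physical memory modules plugged into a channel
- Ranks: Sets of memory chips on a module that are accessed together
- Banks: Independent memory arrays within a rank; different banks can work on different requests at the same time
- Rows and Columns: The two-dimensional organization within a bank; a row is opened first, then columns are read from it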
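One way to picture this hierarchy is as a set of bit fields carved out of a physical address. The Python sketch below uses an invented layout: the field order and widths are assumptions chosen for illustration, and real memory controllers use their own (often hashed) mappings. DIMM slots are omitted because they are physical packaging rather than an address field in this simplified picture.

```python
# Hypothetical field layout, least significant field first.
# Widths are illustrative and do not describe any specific controller.
FIELDS = [
    ("byte_offset", 6),  # byte within a 64-byte block
    ("channel", 1),      # 2 channels
    ("bank", 3),         # 8 banks per rank
    ("column", 7),       # 128 blocks per row
    ("rank", 1),         # 2 ranks per channel
    ("row", 14),         # 16384 rows per bank
]


def decode(addr: int) -> dict[str, int]:
    """Split a physical address into the fields listed in FIELDS."""
    fields = {}
    for name, width in FIELDS:
        fields[name] = addr & ((1 << width) - 1)  # take the low `width` bits
        addr >>= width                            # then drop them
    return fields


print(decode(64))   # channel is 1, every other field is 0
print(decode(128))  # channel is back to 0, bank is 1
```

In this assumed layout the channel and bank bits sit just above the block offset, so consecutive 64-byte blocks land on different channels and banks. That is low-order interleaving applied at the channel and bank level.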
Burst Mode
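Modern DRAM supports burst mode: after the initial access delay (row activation followed by the CAS latency), consecutive data from the open row can be read back-to-back with no further delay. This pairs naturally with the typical 64-byte cache line: a burst of 8 transfers on a 64-bit data bus delivers exactly 64 bytes, so one burst fills one cache line.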
DRAM Burst Read (CL = CAS latency)
+---------------------------------------------+
| Time          | Action                      |
|---------------|-----------------------------|
| 0             | Row Address Strobe (RAS)    |
| tRCD          | Column Address Strobe (CAS) |
| tRCD + CL     | Data[0] available           |
| tRCD + CL + 1 | Data[1] available (burst)   |
| tRCD + CL + 2 | Data[2] available (burst)   |
| tRCD + CL + 3 | Data[3] available (burst)   |
+---------------------------------------------+
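The burst timeline can be computed directly. The Python sketch below follows the simplified model of the table: time 0 is the row activation, the column command is issued after the row-to-column delay, the first data beat arrives one CAS latency later, and each further beat follows one cycle after the previous one. The function names and the timing values are assumptions for illustration; real DDR memory transfers two beats per clock and has more timing parameters than shown here.

```python
def burst_schedule(t_rcd: int, cas_latency: int, burst_length: int) -> list[int]:
    """Cycle at which each data beat becomes available.

    Time 0 is the row activation (RAS). The column command (CAS) is issued
    at t_rcd, the first beat arrives cas_latency cycles later, and each
    further beat follows one cycle after the previous one.
    """
    first_beat = t_rcd + cas_latency
    return [first_beat + i for i in range(burst_length)]


def bytes_per_burst(bus_width_bits: int, burst_length: int) -> int:
    """Amount of data delivered by one burst."""
    return bus_width_bits // 8 * burst_length


# Assumed timings, for illustration only.
print(burst_schedule(t_rcd=14, cas_latency=14, burst_length=4))  # [28, 29, 30, 31]
print(bytes_per_burst(bus_width_bits=64, burst_length=8))        # 64
```

The initial delay of 28 cycles is paid once, after which each additional beat costs a single cycle. This is why fetching a whole burst is far cheaper than issuing separate accesses for each word.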