Labs ICT
⭐ Pro Login

Data Warehousing

OLAP, star schemas, and analytical databases.

Analytical Databases

While OLTP (Online Transaction Processing) databases handle day-to-day operations, data warehouses are optimized for OLAP (Online Analytical Processing) β€” complex queries that analyze large volumes of historical data for business intelligence.

OLTP vs. OLAP


  OLTP (Transaction)              OLAP (Analytical)
  ─────────────────────           ─────────────────────
  Current data                    Historical data
  Small, fast operations          Large, complex queries
  Many concurrent users           Few analysts
  Row-oriented storage            Column-oriented storage
  Normalized schema               Denormalized schema
  e.g., MySQL, PostgreSQL         e.g., Redshift, BigQuery

Star Schema


  The most common data warehouse design:

              β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
              β”‚  Dim_Date   β”‚
              β”‚  DateKey    β”‚
              β”‚  Year       β”‚
              β”‚  Month      β”‚
              β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜
                     β”‚
  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
  β”‚ Dim_Product│────│ Fact_Sales   │────│ Dim_Customerβ”‚
  β”‚ ProductKey β”‚    β”‚ DateKey      β”‚    β”‚ CustomerKey β”‚
  β”‚ Name       β”‚    β”‚ ProductKey   β”‚    β”‚ Name        β”‚
  β”‚ Category   β”‚    β”‚ CustomerKey  β”‚    β”‚ Region      β”‚
  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β”‚ Quantity     β”‚    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                    β”‚ Amount       β”‚
                    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                          β”‚
                    β”Œβ”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”
                    β”‚ Dim_Store  β”‚
                    β”‚ StoreKey   β”‚
                    β”‚ Location   β”‚
                    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

  Fact table = measurements (sales, quantities)
  Dimension tables = context (who, what, when, where)

πŸ§ͺ Quick Quiz

What does OLAP stand for?