Labs ICT
Pro Login

Pivot Tables

Reshaping data for analysis.

Pivot Tables

Let me give you the Excel pivot table equivalent in Pandas. If you've ever used pivot tables in Excel, you'll feel right at home. If not, you're about to discover your new favorite tool.

The Basic Pivot Table

Here's a simple pivot table in action:


import pandas as pd

data = {
    'Department': ['Sales', 'Sales', 'HR', 'HR', 'Engineering', 'Engineering'],
    'Level': ['Senior', 'Junior', 'Senior', 'Junior', 'Senior', 'Junior'],
    'Salary': [60000, 50000, 65000, 55000, 80000, 65000]
}
df = pd.DataFrame(data)

pivot = pd.pivot_table(df, index='Department',
                       columns='Level', values='Salary')
print(pivot)
    

This reshapes your data. Think of it like rearranging rows into a grid. Departments become rows, Levels become columns, and Salaries fill the cells.

Custom Aggregation

Want to control how values are aggregated?


pivot = pd.pivot_table(df, index='Department',
                       columns='Level', values='Salary',
                       aggfunc='mean', fill_value=0)
print(pivot)
    

The `aggfunc` parameter specifies the aggregation function. `fill_value` handles missing combinations — super handy when not every department has every level.

Multiple Values

Need to pivot multiple columns at once?


data = {
    'Department': ['Sales', 'Sales', 'HR', 'HR'],
    'Year': [2023, 2024, 2023, 2024],
    'Revenue': [100000, 120000, 80000, 90000],
    'Cost': [60000, 70000, 50000, 55000]
}
df = pd.DataFrame(data)

pivot = pd.pivot_table(df, index='Department',
                       columns='Year', values=['Revenue', 'Cost'])
print(pivot)
    

One thing that confused me at first was that `index` becomes rows, `columns` becomes columns, and `values` fills the cells. Once you remember that, pivot tables are easy.

Try it Yourself →

Key Takeaways

  • `pd.pivot_table()` reshapes data like Excel pivot tables
  • `index` creates rows, `columns` creates columns
  • `values` specifies what fills the cells
  • `aggfunc` and `fill_value` control aggregation and missing data