describe()

Statistical summary in one line.

Let me give you the fastest way to understand your numeric data. The `describe()` method calculates key statistics for you in one shot.

The Basics

Run `describe()` on any DataFrame with numbers and you'll get a beautiful summary:


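import pandas as pd

# Small sample dataset with two numeric columns
data = {'Age': [25, 30, 35, 28, 32],
        'Salary': [50000, 60000, 75000, 55000, 65000]}
df = pd.DataFrame(data)

# One summary column per numeric column, one row per statistic
print(df.describe())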

For each numeric column, this gives you count (the number of non-missing values), mean, standard deviation, min, max, and the 25th/50th/75th percentiles. It's like having a statistics professor on demand.
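Because `describe()` returns an ordinary DataFrame, you can also pull single numbers out of it instead of only printing it. Here is a minimal sketch that reuses the sample data above; the variable names `summary`, `mean_age`, and `max_salary` are just for illustration:

```python
import pandas as pd

data = {'Age': [25, 30, 35, 28, 32],
        'Salary': [50000, 60000, 75000, 55000, 65000]}
df = pd.DataFrame(data)

# describe() returns a DataFrame: one row per statistic, one column per numeric column
summary = df.describe()

# Look up a single statistic with .loc[row_label, column_label]
mean_age = summary.loc['mean', 'Age']
max_salary = summary.loc['max', 'Salary']

print(mean_age)    # 30.0
print(max_salary)  # 75000.0
```

Note that the values come back as floats, even for columns that started out as integers.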

Why Percentiles Matter

One thing that confused me at first was the percentiles. The 50th percentile is the median, the middle value once the data is sorted. The 25th and 75th percentiles mark the edges of the middle 50% of your data. If the gap between them (the interquartile range) is large compared with the values themselves, your data is spread out.
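To make that concrete, here is a small sketch that computes the same percentiles by hand with `quantile()` and measures the gap between the 25th and 75th. It uses the Age values from the earlier example; the names `q1`, `q3`, and `iqr` are only illustrative:

```python
import pandas as pd

ages = pd.Series([25, 30, 35, 28, 32], name='Age')

q1 = ages.quantile(0.25)      # 25th percentile
median = ages.quantile(0.50)  # 50th percentile, same value as ages.median()
q3 = ages.quantile(0.75)      # 75th percentile

# Width of the middle 50% of the data
iqr = q3 - q1
print(q1, median, q3, iqr)  # 28.0 30.0 32.0 4.0

# describe() accepts a percentiles argument if you want different cut points
print(ages.describe(percentiles=[0.1, 0.9]))
```

Here the middle half of the ages sits between 28 and 32, a gap of only 4 years, so this column is fairly tightly grouped.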

Including Non-Numeric Data

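Want stats on string columns too? Pass `include='all'`. The sample DataFrame above has only numeric columns, so you'll need a DataFrame with at least one string column to see a difference: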


print(df.describe(include='all'))

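For string columns, this reports `count` plus three extra rows: `unique` (the number of distinct values), `top` (the most frequent value), and `freq` (how many times the top value appears). Any statistic that doesn't apply to a column shows up as NaN. Super useful when you want the complete picture.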
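Here is a short sketch of what that looks like with a mixed DataFrame. The `Dept` column and its values are made up for illustration and are not part of the original dataset:

```python
import pandas as pd

# 'Dept' is a hypothetical string column added so there is something non-numeric to summarize
df = pd.DataFrame({
    'Age': [25, 30, 35, 28, 32],
    'Dept': ['Eng', 'Ops', 'Eng', 'HR', 'Eng'],
})

full = df.describe(include='all')
print(full)

# The string-specific rows are filled in for 'Dept' and are NaN for 'Age'
print(full.loc['unique', 'Dept'])  # 3
print(full.loc['top', 'Dept'])     # Eng
print(full.loc['freq', 'Dept'])    # 3
```

The numeric rows work the other way around: `mean`, `std`, and the percentiles are filled in for `Age` and NaN for `Dept`.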

Key Takeaways

`describe()` gives a statistical summary for numeric columns
Shows count, mean, std, min, max, and percentiles
The 50th percentile is the median
Use `include='all'` to include non-numeric columns
