Labs ICT

describe()

Statistical summary in one line.

describe() — Statistical Summary

Let me give you the fastest way to understand your numeric data. The `describe()` method calculates the key summary statistics for every numeric column in a single call.

The Basics

Run `describe()` on any DataFrame with numbers and you'll get a beautiful summary:


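import pandas as pd

# Small sample dataset with two numeric columns
data = {'Age': [25, 30, 35, 28, 32],
        'Salary': [50000, 60000, 75000, 55000, 65000]}
df = pd.DataFrame(data)

# Print count, mean, std, min, quartiles, and max for each numeric column
print(df.describe())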

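This gives you count, mean, standard deviation, min, max, and the 25th/50th/75th percentiles for each numeric column. The standard deviation is the sample standard deviation (it divides by n - 1), and missing values are left out of every statistic. It's like having a statistics professor on demand.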
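
Since the result of `describe()` is itself a DataFrame, you can pull individual numbers out of it instead of only printing it. Here is a minimal sketch that reuses the sample data from above; the variable names `summary`, `mean_age`, and `median_salary` are just illustrative choices:

```python
import pandas as pd

df = pd.DataFrame({'Age': [25, 30, 35, 28, 32],
                   'Salary': [50000, 60000, 75000, 55000, 65000]})

# describe() returns a DataFrame: statistics are the rows, columns stay columns
summary = df.describe()

# Look up a single statistic by row label and column name
mean_age = summary.loc['mean', 'Age']          # 30.0
median_salary = summary.loc['50%', 'Salary']   # 60000.0

# Row labels of the summary table
print(summary.index.tolist())
print(mean_age, median_salary)
```

The row labels are plain strings such as `'mean'` and `'50%'`, so `.loc` works the same way it does on any other DataFrame.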

Why Percentiles Matter

One thing that confused me at first was the percentiles. The 50th percentile is the median, the middle value. The 25th and 75th percentiles mark the boundaries of the middle 50% of your data. The gap between them is called the interquartile range (IQR); if it is large relative to the values themselves, your data is spread out.
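
To make the "gap" concrete, you can subtract the 25th percentile row from the 75th percentile row of the summary. This is a small sketch on the same sample data, not a built-in feature of `describe()`; the name `iqr` is my own:

```python
import pandas as pd

df = pd.DataFrame({'Age': [25, 30, 35, 28, 32],
                   'Salary': [50000, 60000, 75000, 55000, 65000]})

summary = df.describe()

# Gap between the 75th and 25th percentiles, computed per column
iqr = summary.loc['75%'] - summary.loc['25%']

# Age: 32 - 28 = 4, Salary: 65000 - 55000 = 10000
print(iqr)
```

For Age, the middle half of the values sits within a 4-year window, which tells you the ages are tightly clustered.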

Including Non-Numeric Data

Want stats on string columns too?


# include='all' summarizes every column, not only the numeric ones
print(df.describe(include='all'))

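For string columns this reports count plus three extra rows: unique (the number of distinct values), top (the most frequent value), and freq (how many times the top value appears). Statistics that do not apply to a column's type show up as NaN. Super useful when you want the complete picture.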
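
The sample DataFrame from earlier has only numeric columns, so `include='all'` has nothing extra to show there. The sketch below adds a string column to make the difference visible; the `Department` column and its values are made up for illustration:

```python
import pandas as pd

# 'Department' is a hypothetical string column added for this example
df = pd.DataFrame({'Age': [25, 30, 35, 28, 32],
                   'Salary': [50000, 60000, 75000, 55000, 65000],
                   'Department': ['Sales', 'Sales', 'IT', 'HR', 'Sales']})

# Summarize numeric and string columns in one table
summary = df.describe(include='all')
print(summary)

# The statistics that apply to the string column
print(summary.loc[['count', 'unique', 'top', 'freq'], 'Department'])
```

In the printed table, `Department` has NaN for rows like mean and std, while `Age` and `Salary` have NaN for unique, top, and freq.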


Key Takeaways

  • `describe()` gives a statistical summary for numeric columns
  • Shows count, mean, std, min, max, and percentiles
  • The 50th percentile is the median
  • Use `include='all'` to include non-numeric columns