Pandas Introduction
If NumPy is the engine, pandas is the car. Pandas is the most widely used library for working with tabular data in Python. Its central structure is the DataFrame: think of it as a supercharged spreadsheet you can program.
Series and DataFrames
A Series is a one-dimensional labeled array, like a single column of data. A DataFrame is a two-dimensional table whose columns are Series that share one index. Together, they form the backbone of data work in Python.
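import pandas as pd
# A Series: one-dimensional data with an index label for each value
s = pd.Series([10, 20, 30, 40], index=["a", "b", "c", "d"])
print(s)
# A DataFrame: each dictionary key becomes a column name
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35],
    "city": ["NYC", "LA", "Chicago"]
})
print(df)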
Notice how the DataFrame looks just like a table? Each column is a Series with a name, and each row has an index label (0, 1, 2 by default, since none was given). This is exactly how you'd think about data in a spreadsheet.
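To make the relationship between columns and Series concrete, here is a small sketch that rebuilds the same DataFrame and inspects one column and one row. The variable names `ages` and `first_row` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35],
    "city": ["NYC", "LA", "Chicago"],
})

ages = df["age"]             # a single column is a Series
print(type(ages).__name__)   # Series
print(ages.name)             # age (the Series carries its column name)
print(list(df.index))        # [0, 1, 2] (default row labels)

first_row = df.loc[0]        # a single row selected by label is also a Series
print(first_row["name"])     # Alice
```

Both the column and the row come back as a Series; the difference is that the column is indexed by row labels, while the row is indexed by column names.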
Reading Data
In the real world, data comes from many sources: CSVs, Excel spreadsheets, JSON files, databases. Pandas has a reader function for each of them, such as read_csv(), read_excel(), read_json(), and read_sql().
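import pandas as pd
# Load a CSV file (assumes data.csv exists in the working directory)
df = pd.read_csv("data.csv")
print(df.head())
# Load one sheet from an Excel workbook (.xlsx files need the openpyxl package installed)
df_excel = pd.read_excel("report.xlsx", sheet_name="Sales")
print(df_excel.shape)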
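The .head() method shows the first 5 rows by default (pass a number, as in df.head(10), to see more or fewer), which makes it a good first move when loading new data. The .shape attribute is a (rows, columns) tuple that tells you how many rows and columns you have.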
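If you don't have a CSV file on hand, the following sketch may help you try the same calls. It feeds read_csv() an in-memory string through io.StringIO instead of a file path; the CSV content is made up for illustration and is not the data.csv mentioned above.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a real file
csv_text = """name,age,city
Alice,25,NYC
Bob,30,LA
Charlie,35,Chicago
"""

# read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

print(df.head(2))        # first 2 rows instead of the default 5
rows, cols = df.shape    # shape is a (rows, columns) tuple
print(rows, cols)        # 3 3
```

The header row becomes the column names, so it is not counted among the 3 data rows.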
Selecting Data
You'll spend most of your time selecting, filtering, and transforming data. Pandas makes this intuitive once you get the syntax.
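import pandas as pd
df = pd.DataFrame({
    "product": ["A", "B", "C", "A", "B"],
    "sales": [100, 200, 150, 120, 180],
    "region": ["East", "West", "East", "West", "East"]
})
# Select one column (returns a Series)
print(df["sales"])
# Keep only the rows where sales exceed 150
print(df[df["sales"] > 150])
# Select rows labeled 0 through 2 and two columns by name
print(df.loc[0:2, ["product", "sales"]])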
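Selecting a single column with df["sales"] returns a Series. Filtering with a boolean condition returns a new DataFrame containing only the rows where the condition is True. The .loc accessor selects rows and columns by label; unlike ordinary Python slices, a .loc slice includes its end label, so df.loc[0:2] returns three rows (labels 0, 1, and 2).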
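Label-based and position-based slicing are easy to mix up, so here is a small sketch that contrasts .loc with its positional counterpart .iloc and then combines two filter conditions. The variable names (`by_label`, `by_position`, `east_big`) are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "product": ["A", "B", "C", "A", "B"],
    "sales": [100, 200, 150, 120, 180],
    "region": ["East", "West", "East", "West", "East"],
})

by_label = df.loc[0:2, ["product", "sales"]]   # labels 0, 1, 2 (end included)
by_position = df.iloc[0:2, [0, 1]]             # positions 0, 1 (end excluded)
print(len(by_label), len(by_position))         # 3 2

# Combine conditions with & (and) or | (or).
# Each condition needs its own parentheses.
east_big = df[(df["region"] == "East") & (df["sales"] > 150)]
print(east_big)                                # one row: B, 180, East
```

Python's `and` and `or` keywords do not work element-wise on a Series, which is why the filter uses `&` and `|` instead.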
Key Takeaways
- DataFrames are table-like structures — the core of pandas
- read_csv() and read_excel() load data from common file formats
- Use .head() to preview data and .shape to check dimensions
- Selecting columns, filtering rows with boolean conditions, and slicing with .loc are among the most common operations