Choosing the Right Chart
Having a dozen chart types is great — until you have to pick one. Choosing the wrong chart can confuse your audience or hide important insights. Let's break down when to use what.
Comparison Charts
When you're comparing categories, bar charts are your default choice. They're easy to read and hard to misinterpret.
import matplotlib.pyplot as plt
products = ["Widget", "Gadget", "Doohickey"]
sales = [450, 380, 520]
plt.barh(products, sales, color=["#3498db", "#e74c3c", "#2ecc71"])
plt.title("Sales by Product")
plt.xlabel("Units Sold")
plt.show()
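Use horizontal bars when category names are long, since the labels stay readable without rotation. Use vertical bars for time-based comparisons, so that time runs left to right. Avoid pie charts with more than three or four categories; readers judge angles less accurately than bar lengths, so similar slices are hard to compare.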
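As a minimal sketch of the vertical-bar advice, the snippet below plots a time-based comparison with bar instead of barh. The quarterly figures and the names quarters and units_sold are invented for illustration and are not part of the original dataset.

```python
import matplotlib.pyplot as plt

# Hypothetical quarterly figures, invented for this example.
quarters = ["Q1", "Q2", "Q3", "Q4"]
units_sold = [410, 450, 390, 520]

fig, ax = plt.subplots()
# bar draws vertical bars; barh (used above) draws horizontal ones.
bars = ax.bar(quarters, units_sold, color="#3498db")
ax.set_title("Units Sold by Quarter")
ax.set_xlabel("Quarter")
ax.set_ylabel("Units Sold")

if __name__ == "__main__":
    plt.show()
```

Because the quarters are plotted in the order they appear in the list, time reads left to right, which matches how most readers expect a timeline to run.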
Trend Charts
Line charts show trends over time. They're the standard for time-series data because our eyes naturally follow connected lines.
import matplotlib.pyplot as plt
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
revenue = [12000, 15000, 13500, 17000, 16000, 19500]
plt.plot(months, revenue, marker="o", linewidth=2, color="#3498db")
plt.fill_between(months, revenue, alpha=0.1, color="#3498db")
plt.title("Revenue Trend")
plt.ylabel("Revenue ($)")
plt.show()
The fill_between call shades the area between the line and the x-axis (its default baseline is y = 0) for subtle visual emphasis. The marker parameter highlights individual data points.
Distribution Charts
When you want to understand how data is spread out, use histograms or density plots. They reveal the shape of your data — normal, skewed, bimodal, or uniform.
import matplotlib.pyplot as plt
import numpy as np
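np.random.seed(42)  # fix the seed so the histogram is reproducible
data = np.random.normal(70, 15, 1000)  # 1,000 samples: mean 70, standard deviation 15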
plt.hist(data, bins=30, edgecolor="black", alpha=0.7, color="#9b59b6")
plt.axvline(np.mean(data), color="red", linestyle="--", label="Mean")
plt.legend()
plt.title("Score Distribution")
plt.show()
The vertical line at the mean provides a quick reference point. Use bins to control granularity: too few bins hide patterns, and too many create noise.
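To see the effect of the bins setting, one option is to draw the same data several times with different bin counts. The sketch below assumes synthetic, seeded data and three arbitrary bin counts (5, 30, 200) chosen only for contrast; suitable values depend on your own data.

```python
import matplotlib.pyplot as plt
import numpy as np

# Seeded generator so the figure is reproducible; the data is synthetic.
rng = np.random.default_rng(0)
data = rng.normal(70, 15, 1000)

# Arbitrary bin counts chosen to contrast coarse, moderate, and fine binning.
bin_counts = [5, 30, 200]
fig, axes = plt.subplots(1, len(bin_counts), figsize=(12, 3), sharex=True)

for ax, bins in zip(axes, bin_counts):
    ax.hist(data, bins=bins, edgecolor="black", alpha=0.7, color="#9b59b6")
    ax.set_title(f"bins={bins}")

fig.tight_layout()

if __name__ == "__main__":
    plt.show()
```

With 5 bins the bell shape collapses into a few blocks, while with 200 bins many bars hold only a handful of samples, so random variation dominates the picture.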
Relationship Charts
Scatter plots reveal relationships between two numeric variables. Add size and color dimensions to show more information.
import matplotlib.pyplot as plt
import numpy as np
np.random.seed(42)
x = np.random.randn(100)
y = 2 * x + np.random.randn(100) * 0.5
plt.scatter(x, y, alpha=0.6, c="#e74c3c", edgecolors="black")
plt.xlabel("Feature X")
plt.ylabel("Feature Y")
plt.title("Scatter Plot with Correlation")
plt.show()
The alpha parameter controls transparency, which is crucial when points overlap. To represent a third variable, pass an array of values to c instead of a single color.
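A minimal sketch of encoding a third variable with both color and size follows. The variable z, its label "Feature Z", and the size formula are assumptions made up for this example; the x and y data are generated the same way as above, but with NumPy's newer Generator API.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(size=100)
y = 2 * x + rng.normal(scale=0.5, size=100)
z = rng.uniform(0, 1, size=100)  # hypothetical third variable in [0, 1)

fig, ax = plt.subplots()
points = ax.scatter(
    x, y,
    c=z,              # color each point by its z value
    s=20 + 80 * z,    # marker area grows with z (arbitrary scaling)
    cmap="viridis",
    alpha=0.6,
    edgecolors="black",
)
fig.colorbar(points, ax=ax, label="Feature Z")
ax.set_xlabel("Feature X")
ax.set_ylabel("Feature Y")
ax.set_title("Scatter Plot with a Third Variable")

if __name__ == "__main__":
    plt.show()
```

Mapping the same variable to both color and size is redundant on purpose here; in practice you might reserve size for a fourth variable, at the cost of a busier chart.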
Quick Reference
- Bar chart — comparing categories
- Line chart — trends over time
- Scatter plot — relationships between variables
- Histogram — distribution of a single variable
- Heatmap — correlations or matrix data
- Box plot — distributions with outliers
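The last two entries in the list have no example in the sections above, so here is a rough sketch of each, side by side. All data is synthetic, and the group names, means, and column count are invented for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)

# Three synthetic groups with different means (hypothetical data).
names = ["Group A", "Group B", "Group C"]
groups = [rng.normal(loc, 10, 200) for loc in (60, 70, 80)]

# Four synthetic variables; column 1 is made to correlate with column 0.
samples = rng.normal(size=(200, 4))
samples[:, 1] += samples[:, 0]
corr = np.corrcoef(samples, rowvar=False)

fig, (ax_box, ax_heat) = plt.subplots(1, 2, figsize=(10, 4))

# Box plot: median, quartiles, whiskers, and outliers for each group.
box = ax_box.boxplot(groups)
ax_box.set_xticks([1, 2, 3])  # boxplot places boxes at positions 1..n by default
ax_box.set_xticklabels(names)
ax_box.set_title("Box Plot")

# Heatmap: the correlation matrix drawn as a colored grid.
image = ax_heat.imshow(corr, cmap="coolwarm", vmin=-1, vmax=1)
fig.colorbar(image, ax=ax_heat, label="Correlation")
ax_heat.set_title("Heatmap")

fig.tight_layout()

if __name__ == "__main__":
    plt.show()
```

Fixing vmin and vmax at -1 and 1 keeps the color scale centered on zero, so positive and negative correlations get visually comparable weight.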