Labs ICT
โญ Pro Login

Basic Statistics

R is, at heart, a statistics language. So it's no surprise that computing averages, spreads, correlations, and summaries is as simple as calling a single function. Whether you're checking the mean of a column or the correlation between two variables, R has you covered.

Central Tendency: mean() and median()

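mean() gives you the average: the sum divided by the count. median() gives you the middle value when the data is sorted, or the average of the two middle values when the count is even. Both return NA if the data contains any NA, unless you set na.rm = TRUE.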

ages <- c(23, 27, 31, 29, 35, 28, 42, 26, 30, 33)

mean(ages)
median(ages)

# With missing values
scores <- c(85, 92, NA, 78, 90)
mean(scores, na.rm = TRUE)
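
To see why na.rm matters, here is a small sketch that reuses the scores vector from above. The incomes vector is made up for illustration (it is not part of the original lesson) and shows how a single extreme value pulls the mean while the median stays put.

```r
# Same scores vector as above: one value is missing
scores <- c(85, 92, NA, 78, 90)

mean_with_na   <- mean(scores)                  # NA: the missing value propagates
median_with_na <- median(scores)                # NA for the same reason
mean_clean     <- mean(scores, na.rm = TRUE)    # 86.25
median_clean   <- median(scores, na.rm = TRUE)  # 87.5

# Hypothetical incomes (illustrative only) with one extreme value
incomes <- c(30, 32, 35, 38, 400)
mean(incomes)    # 107: dragged upward by the outlier
median(incomes)  # 35: still the middle value
```

When the data is skewed or has outliers, the median is often the more representative "typical" value.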

Spread: sd() and var()

Standard deviation (sd()) tells you how spread out the numbers are, in the same units as the data. Variance (var()) is the square of that. Both use the sample formula, which divides by n - 1 rather than n. Low values mean the data clusters tightly around the mean; high values mean it's all over the place.

prices <- c(15, 22, 19, 28, 35, 18, 25)

sd(prices)
var(prices)
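
To make the relationship between the two concrete, the sketch below recomputes both by hand for the same prices vector and checks the results against sd() and var(). The helper names (deviations, manual_var, manual_sd) are just illustrative.

```r
prices <- c(15, 22, 19, 28, 35, 18, 25)

n <- length(prices)
deviations <- prices - mean(prices)        # distance of each price from the mean

manual_var <- sum(deviations^2) / (n - 1)  # sample variance (divides by n - 1)
manual_sd  <- sqrt(manual_var)             # standard deviation is its square root

# all.equal() compares numbers while tolerating tiny floating-point differences
all.equal(manual_var, var(prices))  # TRUE
all.equal(manual_sd, sd(prices))    # TRUE
```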

summary() and quantile()

summary() gives you the five-number summary plus the mean: minimum, first quartile, median, mean, third quartile, and maximum. quantile() lets you ask for specific percentiles, given as probabilities between 0 and 1.

data <- c(10, 15, 18, 22, 25, 28, 30, 35, 40, 50)

summary(data)
quantile(data, probs = c(0.25, 0.5, 0.75))
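
As a rough illustration of how the two functions relate, the sketch below shows that quantile() with its default probabilities reproduces every summary() value except the mean. It uses the same numbers as above under a different, arbitrary name (values), and assumes a reasonably recent R version in which summary() returns unrounded values.

```r
# Same numbers as above, stored under a different (arbitrary) name
values <- c(10, 15, 18, 22, 25, 28, 30, 35, 40, 50)

# With no probs argument, quantile() returns the 0%, 25%, 50%, 75%, 100% points
five_numbers <- quantile(values)

# summary() puts the mean in fourth position; drop it before comparing
summary_no_mean <- as.numeric(summary(values))[-4]
all.equal(as.numeric(five_numbers), summary_no_mean)  # TRUE

# Any other percentiles can be requested the same way
tails <- quantile(values, probs = c(0.1, 0.9))

# IQR() is the distance between the third and first quartiles
iqr_value <- IQR(values)  # 14.75
```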

Correlation with cor()

cor() measures how strongly two variables move together. By default it computes the Pearson correlation coefficient. A value near 1 means they increase together, near -1 means one goes up as the other goes down, and near 0 means no linear relationship.

hours_studied <- c(2, 4, 6, 8, 10, 12)
exam_score    <- c(55, 60, 70, 75, 85, 90)

cor(hours_studied, exam_score)
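
The sketch below unpacks what cor() is doing: it rebuilds the default (Pearson) value from cov() and sd(), then tries the rank-based Spearman method. The final x and y vectors are hypothetical, added only to show that a correlation near 0 rules out a linear relationship, not every relationship.

```r
hours_studied <- c(2, 4, 6, 8, 10, 12)
exam_score    <- c(55, 60, 70, 75, 85, 90)

# Pearson correlation by hand: covariance scaled by both standard deviations
manual_cor <- cov(hours_studied, exam_score) /
  (sd(hours_studied) * sd(exam_score))
all.equal(manual_cor, cor(hours_studied, exam_score))  # TRUE

# Spearman correlation works on ranks; here both vectors rise in lockstep,
# so the result equals 1 (up to floating-point rounding)
rank_cor <- cor(hours_studied, exam_score, method = "spearman")

# Hypothetical example: y is fully determined by x, but not linearly
x <- -3:3
y <- x^2
curve_cor <- cor(x, y)  # effectively 0 despite the perfect curved relationship
```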

🧪 Quick Quiz

Which function calculates the average of a vector?