Learning with Labels
Supervised learning is the most common type of ML. The "supervised" part means you're providing the algorithm with labeled training data: each input comes with the correct output. The algorithm learns the mapping from input to output.
Think of it like studying with an answer key. You practice problems, check your answers, and gradually learn the patterns that lead to correct results.
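To make "labeled data" concrete, here is a minimal sketch of what a tiny labeled dataset might look like in Python. The field names and numbers are made up for illustration; the only point is that every input (the features) is paired with its known correct output (the label).

```python
# Hypothetical labeled examples: (features, label) pairs.
# The feature names and values are invented for illustration only.
labeled_examples = [
    ({"size_sqft": 1200, "age_years": 30}, 210_000),
    ({"size_sqft": 1800, "age_years": 12}, 340_000),
    ({"size_sqft": 2400, "age_years": 5}, 455_000),
]

# By convention, inputs are collected in X and the matching labels in y.
X = [features for features, label in labeled_examples]
y = [label for features, label in labeled_examples]

for features, label in zip(X, y):
    print(features, "->", label)
```

The "answer key" in the analogy is the list y: during training, the algorithm can always check its guess against it.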
Two Types of Supervised Learning
┌─────────────────────────────────────────────────────┐
│                 SUPERVISED LEARNING                 │
│                                                     │
│  ┌────────────────────┐  ┌──────────────────────┐   │
│  │ REGRESSION         │  │ CLASSIFICATION       │   │
│  │                    │  │                      │   │
│  │ Predict a          │  │ Predict a category   │   │
│  │ continuous         │  │ or class label       │   │
│  │ number             │  │                      │   │
│  │                    │  │ Examples:            │   │
│  │ Examples:          │  │ • Spam or not spam   │   │
│  │ • House price      │  │ • Cat or dog         │   │
│  │ • Temperature      │  │ • Disease or healthy │   │
│  │ • Stock price      │  │ • Fraud or legit     │   │
│  └────────────────────┘  └──────────────────────┘   │
└─────────────────────────────────────────────────────┘
Regression: Predicting Numbers
In regression problems, the output is a continuous value. You're predicting a quantity. Common algorithms include Linear Regression, Polynomial Regression, and Regression Trees.
Example: Given the size, location, and age of a house, predict its sale price. The output is a dollar amount, a number on a continuous scale.
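As a rough illustration of the house-price example, the sketch below fits a straight line to a handful of made-up (size, price) pairs using ordinary least squares, then predicts the price of an unseen house. It uses only one feature (size) to stay short; the data, the function names fit_line and predict_price, and the single-feature setup are all assumptions made for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up training data: house size in square feet and sale price in dollars.
sizes_sqft = [1000, 1500, 2000, 2500]
prices_usd = [200_000, 275_000, 350_000, 425_000]

slope, intercept = fit_line(sizes_sqft, prices_usd)


def predict_price(size_sqft):
    """Predict a continuous value (a price) for a house of the given size."""
    return slope * size_sqft + intercept


print(round(predict_price(1800)))
```

The prediction is a number on a continuous scale, not a category, which is what makes this a regression problem.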
Classification: Predicting Categories
In classification problems, the output is a discrete label. You're sorting inputs into categories. Common algorithms include Logistic Regression (a classifier, despite its name), Decision Trees, Random Forests, and Support Vector Machines.
Example: Given the text of an email, predict whether it's spam or not spam. The output is one of two (or more) categories.
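The spam example can be sketched with a very small classifier. The code below trains a one-feature logistic regression with plain gradient descent. It assumes each email has already been reduced to a single hypothetical feature, the number of "suspicious" words it contains; a real spam filter would use far richer features. The data, learning rate, and epoch count are illustrative choices, not recommendations.

```python
import math


def sigmoid(z):
    """Squash any real number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, learning_rate=0.1, epochs=5000):
    """Fit weight w and bias b by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y  # predicted probability minus label
            grad_w += error * x
            grad_b += error
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b


# Made-up training data: suspicious-word count per email, 1 = spam, 0 = not spam.
suspicious_word_counts = [0, 1, 1, 2, 4, 5, 6, 7]
is_spam = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = train_logistic(suspicious_word_counts, is_spam)


def classify(count):
    """Return a discrete label rather than a number."""
    probability = sigmoid(w * count + b)
    return "spam" if probability >= 0.5 else "not spam"


print(classify(0), "|", classify(7))
```

Internally the model produces a probability, but the final output is one of a fixed set of labels, which is the defining feature of classification.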
The Training Process
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Training   │     │    Model    │     │ Prediction  │
│    Data     │────►│   Learns    │────►│   on New    │
│   (X, y)    │     │  f(X) ≈ y   │     │    Data     │
└─────────────┘     └─────────────┘     └─────────────┘
       │                   │                   │
  Features +           Algorithm             Input
    Labels             minimizes           → Output
                         error
The model is trained to minimize the difference between its predictions and the actual labels. It adjusts its internal parameters step by step to get closer to the correct answers. A loss function measures how wrong the model's predictions are, and an optimizer uses that measurement to decide how to update the parameters.
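The loop described above can be sketched in a few lines. This example assumes a linear model f(x) = w * x + b, mean squared error as the loss function, and plain gradient descent as the optimizer; these are common choices picked for illustration, and the data and hyperparameters are made up.

```python
def mse_loss(w, b, xs, ys):
    """Loss function: mean squared difference between predictions and labels."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_step(w, b, xs, ys, learning_rate):
    """Optimizer: move w and b a small step against the gradient of the loss."""
    n = len(xs)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return w - learning_rate * grad_w, b - learning_rate * grad_b


# Made-up training data generated from y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

w, b = 0.0, 0.0  # internal parameters, starting from an uninformed guess
loss_history = []
for epoch in range(2000):
    loss_history.append(mse_loss(w, b, xs, ys))
    w, b = gradient_step(w, b, xs, ys, learning_rate=0.05)

print(f"first loss: {loss_history[0]:.3f}, last loss: {loss_history[-1]:.6f}")
print(f"learned w = {w:.3f}, b = {b:.3f}")
```

Each pass measures the error and then nudges the parameters to reduce it, so the recorded loss shrinks as training proceeds and the learned w and b approach the values used to generate the data.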
When to Use Supervised Learning
Use supervised learning when you have labeled data, that is, when you know what the right answer looks like. Most business ML problems are supervised: predicting customer churn, classifying support tickets, estimating revenue, detecting fraud. If you have historical data with outcomes, supervised learning is probably your starting point.