Artificial Neural Networks
Neural networks are the foundation of deep learning. They are loosely inspired by the structure of the human brain: layers of interconnected neurons that process information. Each neuron receives inputs, multiplies them by weights, adds a bias, and passes the result through an activation function. Stack enough of these neurons in layers, and the network can learn highly complex patterns.
Structure of a Neural Network
+-----------------------------------------------------+
|             NEURAL NETWORK ARCHITECTURE             |
|                                                     |
|   Input          Hidden Layers          Output      |
|   Layer                                 Layer       |
|                                                     |
|   (x₁) ----+---> [ h₁ ] ----+                       |
|            |                +---> [ y₁ ]            |
|   (x₂) ----+---> [ h₂ ] ----+                       |
|            |                +---> [ y₂ ]            |
|   (x₃) ----+---> [ h₃ ] ----+                       |
|                                                     |
|   Each connection has a weight                      |
|   Each neuron has an activation function            |
+-----------------------------------------------------+
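To make the diagram concrete, the sketch below counts the learnable parameters of a network with 3 inputs, 3 hidden neurons, and 2 outputs. It assumes every neuron connects to every neuron in the next layer and that each non-input neuron has one bias; the function name `count_parameters` is made up for this illustration.

```python
def count_parameters(layer_sizes):
    """Return (weights, biases) for a fully connected network.

    layer_sizes lists the neurons per layer, input layer first.
    Assumption: dense connections between consecutive layers and
    one bias per neuron in every layer except the input layer.
    """
    weights = sum(n_in * n_out
                  for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))
    biases = sum(layer_sizes[1:])
    return weights, biases


# 3 inputs -> 3 hidden -> 2 outputs, as in the diagram
weights, biases = count_parameters([3, 3, 2])
print(weights, biases)  # 3*3 + 3*2 = 15 weights, 3 + 2 = 5 biases
```

Even this tiny network has 20 numbers to learn, and the count grows quickly as layers get wider.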
How a Single Neuron Works
Inputs    Weights     Sum + Bias         Activation     Output
------    -------     ----------         ----------     ------
x₁ --w₁--\
          \
x₂ --w₂----> Σ (wᵢxᵢ) + b ----> σ(·) ----> output
          /
x₃ --w₃--/
Σ = weighted sum of inputs + bias
σ = activation function (introduces non-linearity)
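The same computation can be written in a few lines of Python. This is a minimal sketch that assumes sigmoid as the activation σ; the function name `neuron` and the example numbers are chosen only for illustration.

```python
import math


def neuron(inputs, weights, bias):
    """Compute one neuron's output: activation(weighted sum + bias)."""
    # Weighted sum of the inputs plus the bias
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Sigmoid activation (an assumption; any activation could go here)
    return 1.0 / (1.0 + math.exp(-z))


# z = 0.5*1.0 + (-0.25)*2.0 + 0.1*3.0 + 0.2 = 0.5
output = neuron([1.0, 2.0, 3.0], [0.5, -0.25, 0.1], bias=0.2)
print(round(output, 4))  # sigmoid(0.5) is about 0.6225
```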
Activation Functions
- Sigmoid: Squashes values into the range (0, 1). Commonly used in the output layer for binary classification.
- ReLU (Rectified Linear Unit): Returns max(0, x). A common default for hidden layers because it is simple and cheap to compute.
- Tanh: Squashes values into the range (-1, 1). Its output is zero-centered, which can help training.
- Softmax: Converts a vector of outputs into probabilities that sum to 1. Used in the output layer for multi-class classification.
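The four functions above are short enough to write out directly. The following is a plain-Python sketch using only the standard library; subtracting the maximum inside `softmax` is a common numerical-stability trick added here, not something the list above requires.

```python
import math


def sigmoid(x):
    """Map any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    """Return x for positive inputs and 0 otherwise."""
    return max(0.0, x)


def tanh(x):
    """Map any real number into (-1, 1), centered on 0."""
    return math.tanh(x)


def softmax(values):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(values)  # shift by the max to avoid overflow in exp
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


print(sigmoid(0.0))              # 0.5
print(relu(-3.0), relu(2.5))     # 0.0 2.5
print(tanh(0.0))                 # 0.0
print(softmax([1.0, 2.0, 3.0]))  # roughly [0.09, 0.245, 0.665]
```

Sigmoid, ReLU, and tanh act on one value at a time, while softmax needs the whole output vector because each probability depends on all the scores.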
Training: Forward and Backward Pass
Training happens in two phases:
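- Forward Pass: Input flows through the network layer by layer to produce a prediction.
- Backward Pass (Backpropagation): The error between the prediction and the target is measured by a loss function, and its gradient with respect to each weight is computed by propagating backward through the layers. Gradient descent then adjusts each weight a small step in the direction that reduces the error.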
This process repeats for many epochs (complete passes through the training data) until the model converges.
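The two phases can be sketched end to end on a toy problem. Everything specific below is an assumption made for illustration: a 2-2-1 network, tanh in the hidden layer, sigmoid at the output, binary cross-entropy as the loss, the logical AND function as training data, and weight updates after every example (stochastic gradient descent) rather than once per epoch.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(params, x):
    """Forward pass: hidden activations, then the output prediction."""
    W1, b1, W2, b2 = params
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    y_hat = sigmoid(sum(w * hj for w, hj in zip(W2, h)) + b2)
    return h, y_hat


def loss_fn(y_hat, y):
    """Binary cross-entropy for one example (clamped to avoid log(0))."""
    y_hat = min(max(y_hat, 1e-12), 1.0 - 1e-12)
    return -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))


def backward(params, x, y):
    """Backward pass: gradient of the loss for every parameter."""
    W1, b1, W2, b2 = params
    h, y_hat = forward(params, x)
    d_out = y_hat - y  # gradient at the output pre-activation
    gW2 = [d_out * hj for hj in h]
    gb2 = d_out
    # Chain rule through tanh: d tanh(z)/dz = 1 - tanh(z)^2
    d_hid = [d_out * w * (1 - hj * hj) for w, hj in zip(W2, h)]
    gW1 = [[d * xi for xi in x] for d in d_hid]
    gb1 = d_hid
    return gW1, gb1, gW2, gb2


def train(data, epochs=2000, lr=0.5, seed=0):
    """Repeat forward + backward passes; return params and loss per epoch."""
    rng = random.Random(seed)
    W1 = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
    b1 = [0.0, 0.0]
    W2 = [rng.uniform(-1, 1) for _ in range(2)]
    b2 = 0.0
    history = []
    for _ in range(epochs):
        total = 0.0
        for x, y in data:
            params = (W1, b1, W2, b2)
            total += loss_fn(forward(params, x)[1], y)
            gW1, gb1, gW2, gb2 = backward(params, x, y)
            # Gradient descent step: move each weight against its gradient
            for j in range(2):
                for k in range(2):
                    W1[j][k] -= lr * gW1[j][k]
                b1[j] -= lr * gb1[j]
                W2[j] -= lr * gW2[j]
            b2 -= lr * gb2
        history.append(total / len(data))
    return (W1, b1, W2, b2), history


AND_DATA = [([0.0, 0.0], 0.0), ([0.0, 1.0], 0.0),
            ([1.0, 0.0], 0.0), ([1.0, 1.0], 1.0)]
params, history = train(AND_DATA)
print("mean loss, first epoch:", history[0])
print("mean loss, last epoch: ", history[-1])
```

The learning rate `lr` and the number of epochs are arbitrary choices here; in practice both are tuned, and libraries compute the gradients automatically instead of by hand.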
Why Deep Learning Works
The "deep" in deep learning refers to the multiple hidden layers. Each layer tends to learn increasingly abstract features. In image recognition, for example, early layers typically detect edges, later layers combine them into shapes and object parts, and the final layers respond to whole objects. This hierarchical feature learning is what makes deep networks so powerful, and so different from traditional ML, where features had to be engineered manually.
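Structurally, depth just means that each layer's output becomes the next layer's input. The sketch below shows that composition and keeps every intermediate result. The weights are hand-picked placeholders, so the intermediate values are not meaningful features; in a trained network those values would be the learned representations. The names `dense` and `deep_forward` are hypothetical, and ReLU is assumed as the activation for every layer.

```python
def relu(vector):
    return [max(0.0, v) for v in vector]


def dense(x, W, b):
    """One fully connected layer: a weighted sum plus bias per neuron."""
    return [sum(w * xi for w, xi in zip(row, x)) + bias
            for row, bias in zip(W, b)]


def deep_forward(x, layers):
    """Pass x through each (W, b) layer in turn, keeping every activation."""
    activations = [x]
    for W, b in layers:
        x = relu(dense(x, W, b))
        activations.append(x)
    return activations


# Two stacked layers with placeholder weights (not learned)
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),  # 2 inputs -> 2 hidden
    ([[1.0, 1.0]], [-1.0]),                   # 2 hidden -> 1 output
]
for depth, values in enumerate(deep_forward([2.0, 1.0], layers)):
    print("layer", depth, values)
# layer 0 [2.0, 1.0]
# layer 1 [1.0, 1.5]
# layer 2 [1.5]
```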