Neural Networks
Neural networks are loosely inspired by the human brain. They are computational graphs made of interconnected nodes (neurons) that can learn remarkably complex patterns, given enough layers and enough training data.
The Neuron
A single artificial neuron does three things:
1. Multiplies each input by a weight and sums the results: x₁w₁ + x₂w₂ + x₃w₃
2. Adds a bias term: x₁w₁ + x₂w₂ + x₃w₃ + b
3. Applies an activation function: output = f(x₁w₁ + x₂w₂ + x₃w₃ + b)
The weights and bias are learned from data during training. The activation function adds non-linearity: without it, stacking layers would be no different from a single linear equation.
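As a minimal sketch, the neuron computation above can be written directly in PyTorch. The weights and bias below are made-up illustrative values, not learned ones:

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0])     # inputs x1, x2, x3
w = torch.tensor([0.5, -0.25, 0.1])   # illustrative weights (normally learned)
b = 0.2                               # illustrative bias (normally learned)

s = (x * w).sum() + b                 # weighted sum: x1*w1 + x2*w2 + x3*w3 + b
output = torch.relu(s)                # activation f(sum), here ReLU

print(float(s), float(output))
```

With these numbers the weighted sum is 0.5, and since it is positive, ReLU passes it through unchanged.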
Activation Functions
Activation functions determine whether a neuron "fires" and introduce non-linearity:
ReLU
Most common for hidden layers. Fast, simple, avoids vanishing gradients.
✅ Use for: Hidden layers in most networks
Sigmoid
Squashes output to (0, 1). Useful for binary probability output.
✅ Use for: Binary classification output layer
Softmax
Converts logits to a probability distribution summing to 1.
✅ Use for: Multi-class classification output
Tanh
Output in (-1, 1). Better than sigmoid for hidden layers in RNNs.
✅ Use for: RNN hidden states
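A quick way to build intuition for these four functions is to apply them to the same pre-activation values, a small sketch using torch's built-ins:

```python
import torch

z = torch.tensor([-2.0, 0.0, 3.0])  # example pre-activation values (logits)

print(torch.relu(z))                # negatives clipped to 0
print(torch.sigmoid(z))             # each value squashed into (0, 1)
print(torch.tanh(z))                # each value squashed into (-1, 1)
print(torch.softmax(z, dim=0))      # probability distribution over the 3 values, sums to 1
```

Note that ReLU, sigmoid, and tanh act on each value independently, while softmax normalises across the whole vector.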
Layers: Input → Hidden → Output
Neural networks are organised into layers:
Input layer: One neuron per feature. No computation, just passes data in.
Hidden layers: Where learning happens. Each layer extracts increasingly abstract features.
Output layer: Final prediction. Neuron count = number of classes (or 1 for regression).
Forward Pass
A forward pass is when data flows from input → output to generate a prediction. At each layer, each neuron computes its weighted sum + bias, then applies its activation function.
import torch
import torch.nn as nn

class SimpleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(4, 16),  # Input: 4 features → 16 neurons
            nn.ReLU(),
            nn.Linear(16, 8),  # Hidden: 16 → 8 neurons
            nn.ReLU(),
            nn.Linear(8, 1),   # Output: 8 → 1 (regression)
        )

    def forward(self, x):
        return self.layers(x)

model = SimpleNet()
x = torch.randn(32, 4)  # Batch of 32 samples, 4 features each
output = model(x)       # Forward pass
print(output.shape)     # → torch.Size([32, 1])

Backpropagation & Gradient Descent
Training a neural network means finding the right weights. This happens through:
Forward pass: compute predictions from the current weights
Loss: measure how wrong the predictions are (MSE, Cross-Entropy)
Backpropagation: use the chain rule to compute the gradient of the loss w.r.t. each weight
Weight update: w = w - lr × ∇w
The learning rate controls how big each weight update is. Too high → weights diverge (exploding gradients). Too low → training takes forever. Typical starting values: 0.001 or 0.0001. Use a learning rate scheduler to decay it over time.
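The update rule above can be seen in miniature with autograd and a single parameter. The starting value, learning rate, and toy loss here are invented for illustration:

```python
import torch

w = torch.tensor(3.0, requires_grad=True)  # single weight
lr = 0.1                                   # learning rate

loss = (w - 1.0) ** 2                      # toy loss, minimised at w = 1
loss.backward()                            # chain rule: d(loss)/dw = 2*(w - 1) = 4

with torch.no_grad():
    w -= lr * w.grad                       # w = w - lr × ∇w → 3 - 0.1*4 = 2.6

print(w.item())
```

Repeating this step moves w closer to 1 each time, which is exactly what the training loop below does at scale.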
Complete Training Loop in PyTorch
import torch
import torch.nn as nn
import torch.optim as optim

model = SimpleNet()
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

for epoch in range(100):
    # Forward pass (X_train, y_train: training tensors defined elsewhere)
    predictions = model(X_train)
    loss = criterion(predictions, y_train)

    # Backward pass
    optimizer.zero_grad()  # Clear previous gradients
    loss.backward()        # Compute gradients
    optimizer.step()       # Update weights

    if epoch % 10 == 0:
        print(f"Epoch {epoch}: Loss = {loss.item():.4f}")

Key Hyperparameters
Frequently Asked Questions
How many layers do I need?
Start with 1–3 hidden layers. Modern deep learning uses tens or hundreds of layers (ResNet has 152!). For tabular data, 2–3 layers is usually enough. Add more only if you have enough data and the simpler model underfits.
What is the vanishing gradient problem?
In very deep networks, gradients shrink as they travel backwards through many layers. Layers close to the input receive tiny gradient updates and stop learning. ReLU activations and residual connections (ResNets) largely solve this.
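One way to see the effect is a toy sketch (not a rigorous benchmark): push a value through a stack of sigmoid layers versus ReLU layers and compare the gradient that reaches the input. The depth of 20 is an arbitrary choice:

```python
import torch

def input_grad(act, depth=20):
    """Gradient at the input after `depth` activation-only layers."""
    x = torch.tensor(0.5, requires_grad=True)
    h = x
    for _ in range(depth):  # no weights, activation only
        h = act(h)
    h.backward()
    return x.grad.item()

# Each sigmoid layer multiplies the gradient by at most 0.25, so it shrinks fast
print(input_grad(torch.sigmoid))
# ReLU passes the gradient through unchanged for positive inputs
print(input_grad(torch.relu))
```

The sigmoid gradient is vanishingly small after 20 layers, while the ReLU gradient is still exactly 1.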
PyTorch or TensorFlow — which should I learn?
Both are excellent. PyTorch is more popular in research (imperative style, easier debugging). TensorFlow/Keras is strong in production deployment (TFLite, TFServing). We recommend starting with PyTorch — the syntax is more Pythonic and beginner-friendly.