Welcome to NeuralNets! The core objective of this repository is to build and understand the fundamental concepts behind neural networks, backpropagation, and deep learning from the ground up.
This project is inspired by and follows Andrej Karpathy's highly acclaimed Zero to Hero deep learning series.
In this section, we implement Micrograd, a tiny scalar-valued autograd engine, and use it to construct a Multi-Layer Perceptron (MLP) trained to perform binary classification.
- Autograd DAG: Building a directed acyclic graph (DAG) dynamically as operations are performed.
- Backpropagation & Chain Rule: Automating gradient calculation through local derivatives using the multivariate chain rule.
- Gradient Accumulation: Resolving the classic "gradient overwriting" bug by using `+=`, so that a node reused in several places accumulates the gradient from every path (the multivariate case).
- Neural Net API: Building modular abstractions for:
  - Neuron: A single artificial neuron ($y = \tanh(\sum_i w_i x_i + b)$).
  - Layer: A fully connected layer of neurons.
  - MLP: A stack of sequential layers.
- Optimization Loop: Implementing the complete training cycle:
  - Forward pass (predictions and Mean Squared Error loss).
  - Zero gradients.
  - Backward pass (autograd).
  - Parameter updates (gradient descent step).
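
The pieces listed above can be sketched end to end in plain Python. This is a minimal illustration, not the notebook's actual code: the trimmed-down `Value` class, the `parameters()` helper, the toy dataset, the `[4, 4, 1]` layer sizes, the learning rate, and the step count are all assumptions chosen for the example.

```python
import math
import random

class Value:
    """Minimal scalar autograd node (hypothetical; just enough for a tanh MLP)."""
    def __init__(self, data, _children=()):
        self.data, self.grad = data, 0.0
        self._prev = set(_children)
        self._backward = lambda: None  # leaves have nothing to propagate

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,))
        def _backward():
            self.grad += (1 - t * t) * out.grad  # d tanh(x)/dx = 1 - tanh(x)^2
        out._backward = _backward
        return out

    def backward(self):
        topo, visited = [], set()
        def build(v):  # depth-first topological sort of the graph
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for node in reversed(topo):
            node._backward()

class Neuron:
    def __init__(self, nin):
        self.w = [Value(random.uniform(-1, 1)) for _ in range(nin)]
        self.b = Value(random.uniform(-1, 1))
    def __call__(self, x):
        act = self.b
        for wi, xi in zip(self.w, x):
            act = act + wi * xi
        return act.tanh()  # y = tanh(sum_i w_i x_i + b)
    def parameters(self):
        return self.w + [self.b]

class Layer:
    def __init__(self, nin, nout):
        self.neurons = [Neuron(nin) for _ in range(nout)]
    def __call__(self, x):
        return [n(x) for n in self.neurons]
    def parameters(self):
        return [p for n in self.neurons for p in n.parameters()]

class MLP:
    def __init__(self, nin, nouts):
        sizes = [nin] + nouts
        self.layers = [Layer(a, b) for a, b in zip(sizes, sizes[1:])]
    def __call__(self, x):
        for layer in self.layers:
            x = layer(x)
        return x[0]  # this sketch assumes a single output neuron
    def parameters(self):
        return [p for layer in self.layers for p in layer.parameters()]

random.seed(0)
xs = [[2.0, 3.0, -1.0], [3.0, -1.0, 0.5], [0.5, 1.0, 1.0], [1.0, 1.0, -1.0]]  # toy inputs
ys = [1.0, -1.0, -1.0, 1.0]  # toy binary targets encoded as +1 / -1
model = MLP(3, [4, 4, 1])
losses = []
for step in range(100):
    loss = Value(0.0)  # 1. forward pass: mean squared error over the dataset
    for x, y in zip(xs, ys):
        diff = model(x) + (-y)
        loss = loss + diff * diff
    loss = loss * (1.0 / len(xs))
    for p in model.parameters():  # 2. zero gradients left over from the last step
        p.grad = 0.0
    loss.backward()  # 3. backward pass
    for p in model.parameters():  # 4. gradient descent update
        p.data -= 0.05 * p.grad
    losses.append(loss.data)
print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

Step 2 matters because every `_backward` uses `+=`: without resetting `p.grad`, gradients from earlier iterations would leak into the current update.
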

The repository is organized as follows:

- Zero-to-hero/: Dedicated directory for lecture chapters.
  - Chapter-1/:
    - 📓 NeuralNets_BackProp.ipynb: The main notebook, containing a detailed step-by-step walkthrough, visualized computational graphs, bug resolution, and the MLP training loop.
    - 📓 Micrograd_Part1.ipynb: Scratchpad (part 1) notebook for basic functions.

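To run these notebooks locally, clone the repository, set up a Python virtual environment, and install the required dependencies:

    git clone <your-repo-url>
    cd NeuralNets

Create a virtual environment named `.venv` (for example with `python -m venv .venv`), then activate it:

- Windows: `.venv\Scripts\activate`
- macOS / Linux: `source .venv/bin/activate`
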
Make sure the following libraries are installed in your environment:

- `numpy`
- `matplotlib`
- `graphviz` (requires the Graphviz binary to be installed on your system)
- `torch` (used to cross-reference and verify gradients)

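
Before opening the notebooks, a quick check like the one below can confirm that the environment is ready. It is only a sketch: the helper name `missing_packages` is made up for this example, and it assumes the Graphviz binary is exposed on your `PATH` as the `dot` executable.

```python
import shutil
from importlib.util import find_spec

# Import names of the libraries listed above.
REQUIRED = ["numpy", "matplotlib", "graphviz", "torch"]


def missing_packages(names):
    """Return the names that cannot be found as importable top-level modules."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing Python packages:", ", ".join(missing))
else:
    print("All required Python packages are importable.")

# The graphviz Python package only wraps the system binary, so check that too.
if shutil.which("dot") is None:
    print("Graphviz 'dot' executable not found on PATH.")
```
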
The heart of backpropagation in this custom autograd engine is the `Value` class, which wraps a scalar and implements the local math operations. Each forward operation returns a new `Value` that stores a `_backward` closure, which applies that operation's local derivative:

    # Addition implementation in the Value class
    def __add__(self, other):
        # Wrap plain Python numbers so that expressions like `a + 1` work
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other), '+')

        def _backward():
            # d(out)/d(self) = d(out)/d(other) = 1; accumulate with += so that
            # a Value used more than once sums the gradient from every path
            self.grad += 1.0 * out.grad
            other.grad += 1.0 * out.grad

        out._backward = _backward
        return out

Calling `.backward()` on the final output performs a topological sort of the graph and runs each node's `_backward` in reverse topological order, applying the chain rule to compute gradients for every parameter automatically.
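
To make the traversal concrete, here is a minimal, self-contained sketch of a `Value` class with `backward()`. It supports only `+` and `*`; the attribute names `_prev` and `_op` and the body of `backward()` are assumptions for illustration and may differ from the notebook's code.

```python
class Value:
    """Minimal scalar autograd node (a sketch; only + and * are supported)."""

    def __init__(self, data, _children=(), _op=''):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None  # leaf nodes have nothing to propagate
        self._prev = set(_children)    # the nodes this value was computed from
        self._op = _op

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other), '+')

        def _backward():
            self.grad += 1.0 * out.grad
            other.grad += 1.0 * out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other), '*')

        def _backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # Topological sort: a node is appended only after all of its inputs.
        topo, visited = [], set()

        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)
        build(self)

        self.grad = 1.0  # d(self)/d(self)
        for node in reversed(topo):  # outputs first, inputs last
            node._backward()


a = Value(3.0)
b = Value(2.0)
c = a * b + a  # `a` feeds two different operations
c.backward()
print(c.data, a.grad, b.grad)  # prints: 9.0 3.0 3.0

d = Value(1.5)
e = d + d      # the same node appears as both operands
e.backward()
print(d.grad)  # prints: 2.0
```

Here `a.grad` is 3.0 because dc/da = b + 1: one contribution arrives through the product and one through the sum. If `_backward` used `=` instead of `+=`, whichever contribution was written last would silently overwrite the other.
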