Linear Regression from Scratch

1. What problem am I solving?

Given a set of data points, find the line that best fits them, without using any ML libraries like sklearn.

2. Why is it interesting?

Most people learn ML by calling model.fit() without ever understanding what happens inside. This project implements everything manually: the math, the gradients, the update rule, so the black box becomes transparent.

3. How does my solution work?

Built gradient descent from scratch using only NumPy:

  • predict(): computes y = wX + b
  • loss(): measures error using Mean Squared Error
  • gradients(): computes ∂L/∂w and ∂L/∂b using calculus
  • update(): moves w and b downhill using the gradient

The model starts with w=0, b=0 and learns w≈3, b≈2 from noisy data after 10,000 epochs.
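The four functions can be sketched roughly as follows. This is a minimal NumPy sketch, not the repository's exact code; the synthetic data and hyperparameters simply mirror the numbers quoted in this README (true w = 3, b = 2, learning rate 0.005, 10,000 epochs):

```python
import numpy as np

def predict(X, w, b):
    """Compute y_hat = w*X + b."""
    return w * X + b

def loss(y, y_hat):
    """Mean Squared Error."""
    return np.mean((y_hat - y) ** 2)

def gradients(X, y, y_hat):
    """Partial derivatives of MSE with respect to w and b."""
    n = len(y)
    error = y_hat - y
    dw = (2 / n) * np.sum(error * X)  # each residual is scaled by X
    db = (2 / n) * np.sum(error)      # bare residuals: b learns more slowly
    return dw, db

def update(w, b, dw, db, lr):
    """Step downhill against the gradient."""
    return w - lr * dw, b - lr * db

# Synthetic noisy data around the ground truth y = 3x + 2
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, 200)
y = 3.0 * X + 2.0 + rng.normal(0.0, 1.0, 200)

w, b = 0.0, 0.0
for epoch in range(10_000):
    y_hat = predict(X, w, b)
    dw, db = gradients(X, y, y_hat)
    w, b = update(w, b, dw, db, lr=0.005)

print(f"learned w = {w:.3f}, b = {b:.3f}")  # close to w=3, b=2
```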

4. What did I learn?

  • Gradient descent updates parameters in the direction that reduces loss fastest
  • Learning rate controls step size: too large and the model diverges
  • Noisy data means you can never perfectly recover the true parameters
  • Why b converges slower than w (no X amplification in db)
  • How to track model evolution using Git commits
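
The asymmetry in the third and fourth points can be read directly off the standard MSE gradients (the same quantities gradients() computes). Each residual in ∂L/∂w is multiplied by x_i, while ∂L/∂b sums the bare residuals, so when the inputs are large the effective step taken for b is much smaller:

```latex
L(w, b) = \frac{1}{n}\sum_{i=1}^{n}\bigl(w x_i + b - y_i\bigr)^2

\frac{\partial L}{\partial w} = \frac{2}{n}\sum_{i=1}^{n}\bigl(w x_i + b - y_i\bigr)\,x_i

\frac{\partial L}{\partial b} = \frac{2}{n}\sum_{i=1}^{n}\bigl(w x_i + b - y_i\bigr)
```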

5. How to run it?

Requirements: pip install numpy matplotlib

Run: python main.py

This will train the model and display the learned line over the data.

Project Structure

main.py   — training loop and visualization
model.py  — predict, loss, gradients, update functions
utils.py  — plotting helper functions
results/  — output plots saved here

Experiments

  • Tested learning rates: 0.001 (slow), 0.005 (stable), 0.1 (diverges)
  • Observed divergence at α = 0.1: loss exploded to inf, then nan
  • Confirmed convergence at α=0.005 after 10,000 epochs
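
The learning-rate comparison can be reproduced with a short sketch. The run() helper here is hypothetical (not part of the repo) and uses synthetic data matching the README's ground truth:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, 200)
y = 3.0 * X + 2.0 + rng.normal(0.0, 1.0, 200)

def run(lr, epochs=1000):
    """Train with plain gradient descent and return the final MSE."""
    w = b = 0.0
    n = len(y)
    for _ in range(epochs):
        error = w * X + b - y
        w -= lr * (2 / n) * np.sum(error * X)
        b -= lr * (2 / n) * np.sum(error)
    return np.mean((w * X + b - y) ** 2)

print(run(0.005))  # converges: small loss near the noise floor
print(run(0.1))    # diverges: loss becomes non-finite
```

At α = 0.1, NumPy emits overflow RuntimeWarnings as the loss blows up before becoming non-finite, which matches the inf-then-nan behaviour observed above.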

Why this matters

This project builds the foundation for understanding:

  • Neural networks (same gradient descent principle)
  • Deep learning frameworks (what model.fit() does internally)

Key Insight

Linear regression is simply an optimization problem solved using gradient descent, not a black-box ML algorithm. This project proves that by building every component from first principles.

Output

  • Fit visualization: the learned line plotted over the data
  • Loss curve: training loss over epochs

Plots are saved to results/.

Results

The model converges close to ground truth:

  • Learned: w ≈ 3.015, b ≈ 1.811
  • Truth: w = 3.000, b = 2.000

Loss drops from 401.84 at epoch 0 to 0.81 at epoch 9900, confirming a correct gradient descent implementation. The small gap in b is expected: it is irreducible error caused by noise in the data, not a bug.
