This repository contains a fully from-scratch neural network implementation using only Python and NumPy, applied to the MNIST handwritten digits dataset.
mnist_cracker_neural_net/
├── data/
│ ├── train/ # Training images (MNIST format)
│ └── test/ # Test images (MNIST format)
│
├── models/
│ └── 40k_v1_model.pkl # Pre-trained model (40k parameters)
│
├── network/ # Core neural network logic (built from scratch)
│ ├── __init__.py
│ ├── neuron.py # Neuron class (with forward/backward passes)
│ ├── layer.py # Layer class (collection of neurons)
│ └── neural_network.py # Complete network (built from layers)
│
├── training/
│ ├── config.py # 🔧 Adjustable training parameters (epochs, LR, decay...)
│ ├── data_loader.py # Handles loading and preprocessing of MNIST data
│ └── train.py # Main training logic
│
├── testing/
│ └── test.py # Script to evaluate a trained model
│
├── main.py # Run this to test the model on custom images
└── README.md # This file
Each Neuron is implemented as a Python class with its own weights, bias, ReLU activation, gradients, and update logic.
A Layer is a simple container of Neurons that performs forward/backward propagation in sequence.
The NeuralNetwork class combines layers, manages forward and backward passes, and updates parameters.
No TensorFlow, PyTorch, or other high-level abstractions. This is bare-metal NumPy, written to show what’s under the hood of deep learning.
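A minimal sketch of how these three classes might compose (illustrative only: everything beyond the names Neuron, Layer, and NeuralNetwork is an assumption, and the real classes in network/ also carry gradients and update logic):

import numpy as np

class Neuron:
    def __init__(self, n_inputs, use_relu=True):
        self.weights = np.random.randn(n_inputs) * 0.01  # one weight per input
        self.bias = 0.0
        self.use_relu = use_relu                          # ReLU applied in the forward pass

class Layer:
    def __init__(self, n_inputs, n_neurons):
        # A layer is simply a collection of neurons sharing the same input size
        self.neurons = [Neuron(n_inputs) for _ in range(n_neurons)]

class NeuralNetwork:
    def __init__(self, layer_sizes):
        # e.g. layer_sizes = [784, 64, 10] for 28x28 MNIST images and 10 digit classes
        self.layers = [Layer(n_in, n_out)
                       for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]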
git clone https://github.com/your-username/MNIST-NeuralNet-Scratch.git
cd MNIST-NeuralNet-Scratch
Only NumPy needs to be installed; pickle (used for saving/loading models) is part of the Python standard library:
pip install numpy
You can configure your training run in training/config.py (e.g., learning rate, epochs, decay):
learning_rate = 0.01
epochs = 15
batch_size = 64
decay = 0.99
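For context, decay here is typically a per-epoch multiplier on the learning rate. A hedged sketch of how the training loop might apply it (the actual loop lives in training/train.py; the import assumes training/ is importable as a package):

# Sketch only: exponential learning-rate decay across epochs
from training import config

lr = config.learning_rate
for epoch in range(config.epochs):
    # ... iterate over the training set in mini-batches of config.batch_size ...
    lr *= config.decay  # shrink the learning rate after every epoch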
Then run:
python training/train.py
A pre-trained model is included (models/40k_v1_model.pkl). Evaluate it with:
python testing/test.py
Or run the model on a custom image via:
python main.py
Inside network/neuron.py, the heart of each neuron's forward pass is:
z = np.dot(inputs, self.weights) + self.bias   # weighted sum of inputs plus bias
if self.use_relu:
    return self.relu(z)                        # apply ReLU if this neuron uses it
return z                                       # linear output otherwise
Each neuron tracks its last input and gradient, and applies ReLU conditionally so that backpropagation can account for the activation.
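To make that backpropagation note concrete, the matching backward step for such a neuron could look roughly like this (a sketch, not the repository's exact code; last_input and last_z are assumed names for the values cached during the forward pass):

def backward(self, grad_output, learning_rate):
    """Sketch of a single neuron's backward pass (assumed attribute names)."""
    # ReLU derivative: gradient only flows where the pre-activation was positive
    if self.use_relu:
        grad_output = grad_output * (self.last_z > 0)
    # Parameter gradients use the input cached during the forward pass
    grad_weights = self.last_input * grad_output   # one gradient per weight
    grad_bias = grad_output
    # Gradient to hand back to the previous layer
    grad_input = self.weights * grad_output
    # Plain SGD update
    self.weights -= learning_rate * grad_weights
    self.bias -= learning_rate * grad_bias
    return grad_input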