Linear Algebra Projects

A collection of three hands-on projects exploring core linear algebra concepts and their real-world applications, implemented in Python using Jupyter Notebooks.

Projects Overview

Project 1 — Gram-Schmidt Process & Random Projections

Gram-Schmidt Process (Gram-Schmidt process.ipynb)

Implements the Gram-Schmidt orthogonalization algorithm from scratch. Given a set of linearly independent vectors, the algorithm constructs an orthonormal basis for the spanned subspace. Key concepts covered:

Orthogonal projection and vector decomposition
Iterative orthonormalization
Detecting the dimension of a spanned subspace
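
As a rough illustration of the ideas listed above, here is a minimal NumPy sketch of Gram-Schmidt. It is not the notebook's code: the function name `gram_schmidt`, the tolerance `tol`, and the example matrix are assumptions made for this example. It uses the "modified" variant (projecting the running residual), and it drops vectors whose residual is numerically zero, so the number of rows returned gives the dimension of the span.

```python
import numpy as np

def gram_schmidt(vectors, tol=1e-10):
    """Return an orthonormal basis (as rows) for the span of `vectors`.

    `tol` is an assumed threshold below which a residual counts as zero.
    """
    basis = []
    for v in np.asarray(vectors, dtype=float):
        w = v.copy()
        # Remove the component of w along every basis vector found so far.
        for q in basis:
            w -= (q @ w) * q
        norm = np.linalg.norm(w)
        # A (numerically) zero residual means v was already in the span.
        if norm > tol:
            basis.append(w / norm)
    return np.array(basis)

# Hypothetical input: the third row is the sum of the first two.
V = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [2.0, 1.0, 1.0]])
Q = gram_schmidt(V)
dim = len(Q)  # dimension of the spanned subspace
```

Because the third vector is dropped, `Q` has two orthonormal rows and `dim` reports a two-dimensional span.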

Random Projections (random_projections.ipynb)

Explores orthogonal projections in 1D and N-dimensional subspaces, applied to real Persian digit image data (TrainData.txt). Tasks include:

Implementing 1D and ND projection matrices using the formula $\pi_U(\mathbf{x}) = \mathbf{B}(\mathbf{B}^T\mathbf{B})^{-1}\mathbf{B}^T\mathbf{x}$
Projecting image vectors onto random subspaces
Finding the minimum subspace dimension that preserves image recognizability
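
The projection formula above can be sketched as follows. This is an illustrative example under stated assumptions, not the notebook's implementation: `projection_matrix`, the dimensions `D` and `M`, and the random vector standing in for a flattened image are all hypothetical, and TrainData.txt is not loaded here.

```python
import numpy as np

def projection_matrix(B):
    """P = B (B^T B)^{-1} B^T for a basis matrix B with independent columns."""
    B = np.asarray(B, dtype=float)
    # Solve (B^T B) X = B^T rather than forming the inverse explicitly.
    return B @ np.linalg.solve(B.T @ B, B.T)

rng = np.random.default_rng(0)
D, M = 16, 4                      # assumed ambient and subspace dimensions
B = rng.standard_normal((D, M))   # random basis of an M-dimensional subspace
P = projection_matrix(B)

x = rng.standard_normal(D)        # stand-in for a flattened image vector
x_proj = P @ x                    # orthogonal projection of x onto span(B)
```

A quick sanity check for any such matrix is that it is symmetric and idempotent (`P @ P` equals `P`), and that the residual `x - x_proj` is orthogonal to every column of `B`.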

Project 2 — Principal Component Analysis (PCA)

PCA (PCA.ipynb)

Implements PCA step by step on the Fashion-MNIST dataset (fashion_MNIST.zip). The pipeline covers:

Data standardization — zero mean, unit variance normalization
Covariance matrix computation — capturing pairwise feature relationships
Eigendecomposition — extracting principal components from the symmetric covariance matrix
Dimensionality reduction — projecting data onto the top-k eigenvectors
Reconstruction — recovering images from compressed representations and finding the minimum number of components needed for recognizable reconstruction
Visualization — 2D and 3D scatter plots of projected data
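
The pipeline steps above (without the plotting) might look roughly like this in NumPy. It is a sketch on synthetic data, not the notebook's code: the helper names `pca_fit` and `pca_reconstruct`, the choice to divide the covariance by N, and the toy dataset are assumptions, and Fashion-MNIST is not loaded here.

```python
import numpy as np

def pca_fit(X, k):
    """Standardize X, then return (mean, std, top-k eigenvectors, top-k eigenvalues)."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0                      # avoid dividing by zero on constant features
    Z = (X - mean) / std
    cov = (Z.T @ Z) / Z.shape[0]             # covariance of the standardized data
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh suits symmetric matrices; ascending order
    order = np.argsort(eigvals)[::-1][:k]    # indices of the k largest eigenvalues
    return mean, std, eigvecs[:, order], eigvals[order]

def pca_reconstruct(X, mean, std, W):
    """Project onto the columns of W, then map back to the original feature scale."""
    codes = ((X - mean) / std) @ W
    return (codes @ W.T) * std + mean, codes

# Synthetic stand-in: 200 samples of 5 features lying close to a 2D subspace.
rng = np.random.default_rng(0)
latent = rng.standard_normal((200, 2))
mixing = rng.standard_normal((2, 5))
X = latent @ mixing + 0.01 * rng.standard_normal((200, 5))

mean, std, W, top_vals = pca_fit(X, k=2)
X_hat, codes = pca_reconstruct(X, mean, std, W)
rel_err = np.linalg.norm(X - X_hat) / np.linalg.norm(X)
```

Repeating the reconstruction for increasing `k` and watching `rel_err` (or the images themselves) is one way to look for the smallest number of components that still gives a recognizable result.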

Project 3 — PageRank & TextRank

PageRank (PageRank.ipynb)

Implements Google's PageRank algorithm in four ways, a library baseline plus three independent implementations, applied to a real web-graph dataset:

NetworkX built-in — baseline using networkx.algorithms.pagerank
Eigendecomposition — computing the L1-normalized eigenvector of the Google matrix corresponding to eigenvalue 1
Power Method — iterative approximation via $b_{k+1} = \frac{A b_k}{\lVert A b_k \rVert}$ until convergence
Random Walk — stochastic simulation of a surfer traversing the graph

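The Google matrix mixes the link matrix $A$ with a teleportation matrix $B$ (commonly the uniform matrix with every entry $1/n$) using a factor $p$ (typically 0.15, which corresponds to the conventional damping factor of 0.85): $M = (1-p) \cdot A + p \cdot B$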

Results are visualized as directed graphs with node sizes proportional to rank.
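
To make the Google matrix and two of the methods above concrete, here is a small NumPy sketch. It is not taken from page_rank.py: the function names, the four-page toy graph, the choice of a uniform $B$, and the handling of pages without outgoing links are assumptions for illustration. The power iteration is applied to $M$ and uses the L1 norm so the result sums to 1.

```python
import numpy as np

def google_matrix(adj, p=0.15):
    """Build M = (1 - p) * A + p * B, where adj[i, j] = 1 if page j links to page i."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    col_sums = adj.sum(axis=0)
    safe = np.where(col_sums > 0, col_sums, 1.0)
    # Column-normalize; a page with no outgoing links is assumed to link everywhere.
    A = np.where(col_sums > 0, adj / safe, 1.0 / n)
    B = np.full((n, n), 1.0 / n)   # assumed uniform teleportation matrix
    return (1 - p) * A + p * B

def power_method(M, tol=1e-10, max_iter=1000):
    """Iterate b <- M b / ||M b||_1 until the L1 change falls below tol."""
    b = np.full(M.shape[0], 1.0 / M.shape[0])
    for _ in range(max_iter):
        b_next = M @ b
        b_next /= np.abs(b_next).sum()
        if np.abs(b_next - b).sum() < tol:
            return b_next
        b = b_next
    return b

def eigen_rank(M):
    """L1-normalized eigenvector for the largest eigenvalue (1 for a stochastic M)."""
    vals, vecs = np.linalg.eig(M)
    v = np.real(vecs[:, np.argmax(np.real(vals))])
    return v / v.sum()

# Hypothetical graph with edges 0->1, 0->2, 1->2, 2->0, 3->2.
adj = np.zeros((4, 4))
for src, dst in [(0, 1), (0, 2), (1, 2), (2, 0), (3, 2)]:
    adj[dst, src] = 1.0

M = google_matrix(adj)
r_power = power_method(M)
r_eig = eigen_rank(M)
```

On this toy graph the two vectors agree to within the iteration tolerance, and page 2, which receives links from the other three pages, gets the highest rank.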

TextRank (TextRank.ipynb)

Applies the PageRank idea to NLP for unsupervised keyword and sentence extraction, based on the Mihalcea & Tarau (2004) paper. Applied to two fairy tale texts (Cinderella and Beauty and the Beast):

Tokenization and part-of-speech tagging using NLTK
Co-occurrence graph construction with a sliding word window
Weighted PageRank via the Power Method for keyword ranking
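
The last two steps above can be sketched in plain Python. This is a simplified illustration, not the code in text_rank.py: it skips NLTK tokenization and part-of-speech filtering, the token list is a made-up toy input, and the names `cooccurrence_graph` and `weighted_pagerank` are hypothetical. The parameter `d` is the damping factor in the Mihalcea & Tarau sense (`d = 1 - p` in the notation of the PageRank section), and the teleport term is divided by the number of nodes so that the scores sum to 1.

```python
from collections import defaultdict

def cooccurrence_graph(tokens, window=2):
    """Undirected weighted graph: edge weight counts co-occurrences
    of two distinct words inside a sliding window of `window` tokens."""
    graph = defaultdict(lambda: defaultdict(float))
    for i, w in enumerate(tokens):
        for u in tokens[i + 1 : i + window]:
            if u != w:
                graph[w][u] += 1.0
                graph[u][w] += 1.0
    return graph

def weighted_pagerank(graph, d=0.85, tol=1e-8, max_iter=200):
    """Power iteration on the weighted graph; returns {word: score}."""
    nodes = list(graph)
    n = len(nodes)
    score = {v: 1.0 / n for v in nodes}
    out_weight = {v: sum(graph[v].values()) for v in nodes}
    for _ in range(max_iter):
        new = {}
        for v in nodes:
            # Each neighbor m passes on a share of its score proportional to the edge weight.
            incoming = sum(graph[m][v] / out_weight[m] * score[m] for m in graph[v])
            new[v] = (1 - d) / n + d * incoming
        delta = sum(abs(new[v] - score[v]) for v in nodes)
        score = new
        if delta < tol:
            break
    return score

# Toy token stream (assumed already tokenized and filtered).
tokens = ["graph", "rank", "graph", "node", "rank", "graph", "edge", "node", "graph"]
graph = cooccurrence_graph(tokens, window=2)
scores = weighted_pagerank(graph)
top_word = max(scores, key=scores.get)
```

Sorting `scores` by value gives a keyword ranking; in this toy input the most connected word, "graph", comes out on top.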

Repository Structure

Linear_Algebra_Projects/
├── Project1/
│   ├── Gram-Schmidt process.ipynb
│   ├── random_projections.ipynb
│   └── TrainData.txt              # Persian digit image vectors
├── Project2/
│   ├── PCA.ipynb
│   └── fashion_MNIST.zip          # Fashion-MNIST dataset
└── Project3/
    ├── PageRank.ipynb
    ├── TextRank.ipynb
    ├── graphs.py                  # Graph loading & visualization utilities
    ├── page_rank.py               # PageRank helper functions
    ├── text_rank.py               # TextRank / NLP utilities
    └── data/
        ├── Beauty_and_the_Beast.txt
        └── Cinderalla.txt

Requirements

Python 3.9+
numpy
pandas
matplotlib
networkx
nltk
scikit-learn (for PCA comparison / data loading)
Jupyter Notebook / JupyterLab

Install all dependencies:

pip install numpy pandas matplotlib networkx nltk scikit-learn jupyter

For TextRank, also download the required NLTK data:

import nltk
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')

Usage

Clone the repository and launch Jupyter:

git clone https://github.com/Sepovsky/Linear_Algebra_Projects.git
cd Linear_Algebra_Projects
jupyter notebook

Then open any .ipynb file inside Project1/, Project2/, or Project3/ and run the cells sequentially.
