Sepovsky/Linear_Algebra_Projects

Linear Algebra Projects

A collection of three hands-on projects exploring core linear algebra concepts and their real-world applications, implemented in Python using Jupyter Notebooks.


Projects Overview

Project 1 — Gram-Schmidt Process & Random Projections

Gram-Schmidt Process (Gram-Schmidt process.ipynb)

Implements the Gram-Schmidt orthogonalization algorithm from scratch. Given a set of linearly independent vectors, the algorithm constructs an orthonormal basis for the spanned subspace. Key concepts covered:

  • Orthogonal projection and vector decomposition
  • Iterative orthonormalization
  • Detecting the dimension of a spanned subspace
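The steps above can be sketched roughly as follows. This is a minimal illustration using the modified Gram-Schmidt variant and NumPy, not the notebook's actual code; the function name `gram_schmidt` and the tolerance `tol` are assumptions made for this example.

```python
import numpy as np

def gram_schmidt(vectors, tol=1e-10):
    """Return an orthonormal basis (as rows) for the span of `vectors`.

    Hypothetical helper: vectors whose remainder has norm <= tol are
    treated as linearly dependent on the earlier ones and skipped.
    """
    basis = []
    for v in np.asarray(vectors, dtype=float):
        w = v.copy()
        # Remove the component of w along each basis vector found so far
        for q in basis:
            w -= np.dot(q, w) * q
        norm = np.linalg.norm(w)
        # A (near-)zero remainder means v adds no new direction
        if norm > tol:
            basis.append(w / norm)
    return np.array(basis)

# The third vector is the sum of the first two, so the span is 2-dimensional
vectors = [[1, 1, 0], [1, 0, 1], [2, 1, 1]]
Q = gram_schmidt(vectors)
dimension = Q.shape[0]
```

The number of rows returned is the dimension of the spanned subspace, which is how dependent input vectors can be detected.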

Random Projections (random_projections.ipynb)

Explores orthogonal projections in 1D and N-dimensional subspaces, applied to real Persian digit image data (TrainData.txt). Tasks include:

  • Implementing 1D and ND projection matrices using the formula $\pi_U(\mathbf{x}) = \mathbf{B}(\mathbf{B}^T\mathbf{B})^{-1}\mathbf{B}^T\mathbf{x}$
  • Projecting image vectors onto random subspaces
  • Finding the minimum subspace dimension that preserves image recognizability
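As a rough illustration of the projection formula, the sketch below builds $\mathbf{B}(\mathbf{B}^T\mathbf{B})^{-1}\mathbf{B}^T$ for a random basis. It uses synthetic vectors rather than `TrainData.txt`, and the name `projection_matrix`, the vector length 64, and the subspace dimension 10 are assumptions chosen for the example.

```python
import numpy as np

def projection_matrix(B):
    """Projection matrix onto the column space of B.

    Assumes the columns of B are linearly independent. A 1D array is
    treated as a single basis vector (the 1D projection case).
    """
    B = np.asarray(B, dtype=float)
    if B.ndim == 1:
        B = B.reshape(-1, 1)
    # Solve (B^T B) X = B^T instead of forming the inverse explicitly
    return B @ np.linalg.solve(B.T @ B, B.T)

rng = np.random.default_rng(0)
x = rng.normal(size=64)        # stand-in for a flattened image vector
B = rng.normal(size=(64, 10))  # basis of a random 10-dimensional subspace
P = projection_matrix(B)
x_proj = P @ x                 # projection of x onto the subspace
residual = x - x_proj          # component orthogonal to the subspace
```

A valid projection matrix is symmetric and idempotent, and the residual is orthogonal to every basis vector; these properties make convenient sanity checks.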

Project 2 — Principal Component Analysis (PCA)

PCA (PCA.ipynb)

Implements PCA step by step on the Fashion-MNIST dataset (fashion_MNIST.zip). The pipeline covers:

  1. Data standardization — zero mean, unit variance normalization
  2. Covariance matrix computation — capturing pairwise feature relationships
  3. Eigendecomposition — extracting principal components from the symmetric covariance matrix
  4. Dimensionality reduction — projecting data onto the top-k eigenvectors
  5. Reconstruction — recovering images from compressed representations and finding the minimum number of components needed for recognizable reconstruction
  6. Visualization — 2D and 3D scatter plots of projected data
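Steps 1 to 5 of the pipeline could look something like the sketch below. It runs on synthetic low-rank data instead of Fashion-MNIST, and the helper names (`pca_fit`, `pca_project`, `pca_reconstruct`) are hypothetical, not taken from the notebook.

```python
import numpy as np

def pca_fit(X, k):
    """Standardize X, then return (mean, std, top-k eigenvectors, top-k eigenvalues)."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0                     # constant features: avoid dividing by zero
    Z = (X - mean) / std                    # step 1: standardization
    cov = np.cov(Z, rowvar=False)           # step 2: covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # step 3: eigh suits symmetric matrices
    order = np.argsort(eigvals)[::-1][:k]   # indices of the k largest eigenvalues
    return mean, std, eigvecs[:, order], eigvals[order]

def pca_project(X, mean, std, W):
    """Step 4: project standardized data onto the principal components W."""
    return ((X - mean) / std) @ W

def pca_reconstruct(codes, mean, std, W):
    """Step 5: map compressed codes back to the original feature space."""
    return (codes @ W.T) * std + mean

# Synthetic data: 200 samples in 20 dimensions that really live in 5 dimensions
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 20))
mean, std, W, top_vals = pca_fit(X, k=5)
codes = pca_project(X, mean, std, W)
X_hat = pca_reconstruct(codes, mean, std, W)
```

Because the synthetic data has rank 5, five components reconstruct it almost exactly. On real images the reconstruction error shrinks gradually as `k` grows, which is what makes the search for a minimum recognizable `k` meaningful.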

Project 3 — PageRank & TextRank

PageRank (PageRank.ipynb)

Implements Google's PageRank algorithm using four independent methods, applied to a real web-graph dataset:

  1. NetworkX built-in — baseline using networkx.pagerank
  2. Eigendecomposition — computing the L1-normalized eigenvector of the Google matrix corresponding to eigenvalue 1
  3. Power Method — iterative approximation via $b_{k+1} = \frac{A b_k}{\lVert A b_k \rVert}$ until convergence
  4. Random Walk — stochastic simulation of a surfer traversing the graph

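The Google matrix mixes the link structure of the graph with uniform teleportation: $M = (1-p) \cdot A + p \cdot B$. Here $p$ is the teleportation probability (typically 0.15, which corresponds to a damping factor of $1 - p = 0.85$). In the standard formulation, $A$ is the stochastic link matrix of the graph and $B$ is the matrix with every entry equal to $1/n$ for a graph with $n$ nodes.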
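The eigendecomposition and power-method approaches can be compared on a toy graph with a sketch like the one below. It assumes a column-stochastic convention (`adj[i, j] = 1` means page `j` links to page `i`), a uniform $B$ with entries $1/n$, and L1 normalization in the power iteration; the function names and the 4-node graph are made up for illustration and are not the helpers in `page_rank.py`.

```python
import numpy as np

def google_matrix(adj, p=0.15):
    """Build M = (1 - p) * A + p * B from a 0/1 adjacency matrix.

    Assumed convention: adj[i, j] = 1 means page j links to page i.
    Pages with no outgoing links are treated as linking to every page.
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    col_sums = adj.sum(axis=0)
    dangling = col_sums == 0
    A = np.empty_like(adj)
    A[:, ~dangling] = adj[:, ~dangling] / col_sums[~dangling]
    A[:, dangling] = 1.0 / n
    B = np.full((n, n), 1.0 / n)  # uniform teleportation matrix
    return (1 - p) * A + p * B

def pagerank_eig(M):
    """L1-normalized eigenvector of M for the eigenvalue closest to 1."""
    eigvals, eigvecs = np.linalg.eig(M)
    idx = np.argmin(np.abs(eigvals - 1.0))
    v = np.real(eigvecs[:, idx])
    return v / v.sum()            # dividing by the sum also fixes the sign

def pagerank_power(M, tol=1e-12, max_iter=1000):
    """Power iteration b_{k+1} = M b_k / ||M b_k||_1, started from uniform."""
    n = M.shape[0]
    b = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        b_next = M @ b
        b_next /= np.abs(b_next).sum()
        if np.abs(b_next - b).sum() < tol:
            return b_next
        b = b_next
    return b

# Toy graph with links 0->1, 0->2, 1->2, 2->0, 3->2
adj = np.zeros((4, 4))
for src, dst in [(0, 1), (0, 2), (1, 2), (2, 0), (3, 2)]:
    adj[dst, src] = 1.0

M = google_matrix(adj)
r_eig = pagerank_eig(M)
r_power = pagerank_power(M)
```

Both vectors should agree to within numerical tolerance, and each sums to 1, so the entries can be read as the long-run fraction of time a random surfer spends on each page.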

Results are visualized as directed graphs with node sizes proportional to rank.

TextRank (TextRank.ipynb)

Applies the PageRank idea to NLP for unsupervised keyword and sentence extraction, following Mihalcea & Tarau (2004). The notebook works on two fairy-tale texts (Cinderella and Beauty and the Beast) and covers:

  • Tokenization and part-of-speech tagging using NLTK
  • Co-occurrence graph construction with a sliding word window
  • Weighted PageRank via the Power Method for keyword ranking
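The keyword-ranking steps might be sketched as follows. To stay self-contained, this example replaces NLTK tokenization and part-of-speech filtering with a regular expression and a small hand-written stopword set, both of which are simplifying assumptions; the function names, the window size, and the sample sentence are likewise illustrative and not taken from `text_rank.py`.

```python
import re
from collections import defaultdict

import numpy as np

# Hypothetical stand-in for POS filtering: drop a few function words
STOPWORDS = {"the", "and", "for", "a", "an", "of", "to", "whose"}

def cooccurrence_weights(tokens, window=3):
    """Count how often two distinct words appear within `window` consecutive tokens."""
    weights = defaultdict(float)
    for i, a in enumerate(tokens):
        for b in tokens[i + 1 : i + window]:
            if a != b:
                weights[(a, b)] += 1.0  # undirected graph: store both directions
                weights[(b, a)] += 1.0
    return weights

def textrank_keywords(tokens, window=3, d=0.85, tol=1e-10, max_iter=200):
    """Rank words by weighted PageRank, computed with power iteration."""
    vocab = sorted(set(tokens))
    index = {word: i for i, word in enumerate(vocab)}
    n = len(vocab)
    W = np.zeros((n, n))
    for (a, b), w in cooccurrence_weights(tokens, window).items():
        W[index[a], index[b]] = w
    col_sums = W.sum(axis=0)
    col_sums[col_sums == 0] = 1.0       # isolated words keep an all-zero column
    T = W / col_sums                    # column-normalized edge weights
    scores = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        new_scores = (1 - d) / n + d * (T @ scores)
        converged = np.abs(new_scores - scores).sum() < tol
        scores = new_scores
        if converged:
            break
    return sorted(zip(vocab, scores), key=lambda pair: -pair[1])

text = ("The prince searched the kingdom for the girl whose foot fit "
        "the glass slipper and the prince married the girl")
tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]
ranked = textrank_keywords(tokens)
```

Here `d` is the damping factor in the normalized form of the update, so the scores sum to 1 whenever every word has at least one neighbor. Words that co-occur with many different neighbors end up near the top of `ranked`.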

Repository Structure

Linear_Algebra_Projects/
├── Project1/
│   ├── Gram-Schmidt process.ipynb
│   ├── random_projections.ipynb
│   └── TrainData.txt              # Persian digit image vectors
├── Project2/
│   ├── PCA.ipynb
│   └── fashion_MNIST.zip          # Fashion-MNIST dataset
└── Project3/
    ├── PageRank.ipynb
    ├── TextRank.ipynb
    ├── graphs.py                  # Graph loading & visualization utilities
    ├── page_rank.py               # PageRank helper functions
    ├── text_rank.py               # TextRank / NLP utilities
    └── data/
        ├── Beauty_and_the_Beast.txt
        └── Cinderalla.txt

Requirements

  • Python 3.9+
  • numpy
  • pandas
  • matplotlib
  • networkx
  • nltk
  • scikit-learn (for PCA comparison / data loading)
  • Jupyter Notebook / JupyterLab

Install all dependencies:

pip install numpy pandas matplotlib networkx nltk scikit-learn jupyter

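For TextRank, also download the required NLTK data (recent NLTK releases may instead ask for the punkt_tab and averaged_perceptron_tagger_eng resources):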

import nltk
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')

Usage

Clone the repository and launch Jupyter:

git clone https://github.com/Sepovsky/Linear_Algebra_Projects.git
cd Linear_Algebra_Projects
jupyter notebook

Then open any .ipynb file inside Project1/, Project2/, or Project3/ and run the cells sequentially.
