Graph Extractor

Automatically extracts data points from graph images using computer vision, OCR, and signal processing. The extracted data can be analyzed and exported to CSV for further use.

Features

Detect X and Y axes from graphs
Extract axis values using OCR (Tesseract)
Detect grid lines and map pixel positions to actual values
Digitize graph data points
Highlight deviation points (local maxima/minima)
Compute statistics (mean, min, max, RMS, peak-to-valley, skewness, kurtosis)
Export data to CSV for analysis

Installation

Clone the repository:

git clone https://github.com/meekhumor/Graph-Extractor.git
cd Graph-Extractor

Install dependencies:

pip install opencv-python numpy pytesseract scipy pandas

Make sure Tesseract OCR is installed and added to your PATH.
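
A quick way to confirm the PATH requirement before running the extractor is to look the executable up from Python. This is a minimal sketch using only the standard library; the helper name `tesseract_on_path` is hypothetical and not part of extractor.py.

```python
import shutil


def tesseract_on_path() -> bool:
    """Return True if a `tesseract` executable can be found on PATH."""
    return shutil.which("tesseract") is not None


if tesseract_on_path():
    print("Tesseract found at:", shutil.which("tesseract"))
else:
    print("Tesseract not found; install it or add it to PATH.")
```

If Tesseract is installed but cannot be added to PATH, pytesseract also lets you point at the binary directly by setting `pytesseract.pytesseract.tesseract_cmd` to its full path.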

Usage

Place your graph image in the graph/ directory (e.g., graph/your_graph.png). The script works best on a simple line graph with visible grid lines.

Run the main script:
```
python extractor.py
```
- By default, it processes graph/graph3.png.
- To use a different image, change the path passed to preprocess_image in extractor.py:
```
preprocess_image('graph/your_image.png')
```
- OpenCV windows display visualizations of axis detection, grid lines, data points, and deviations.
- Extracted data is saved as csv/graph3.csv (columns: X, Y).
- Statistics (mean, min, max, RMS, peak-to-valley, skewness, kurtosis) are printed to the console.

Press any key in the OpenCV windows to close them.

Example Output

Console Stats:

Mean: 5.2341
Min: 1.0000
Max: 10.0000
RMS: 5.6789
Peak-to-Valley: 9.0000
Skewness: 0.1234
Kurtosis: -0.5678
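
For reference, statistics of this kind can be computed from the extracted Y values roughly as follows. This is a sketch, not the code in extractor.py: the function name `summarize` is hypothetical, and it assumes population (biased) moments and excess kurtosis, which is also what `scipy.stats.skew` and `scipy.stats.kurtosis` return by default.

```python
import numpy as np


def summarize(y):
    """Compute summary statistics for a 1-D sequence of Y values."""
    y = np.asarray(y, dtype=float)
    mean = y.mean()
    centered = y - mean
    # Central moments (population form, dividing by N).
    m2 = np.mean(centered**2)
    m3 = np.mean(centered**3)
    m4 = np.mean(centered**4)
    return {
        "Mean": mean,
        "Min": y.min(),
        "Max": y.max(),
        "RMS": np.sqrt(np.mean(y**2)),
        "Peak-to-Valley": y.max() - y.min(),
        "Skewness": m3 / m2**1.5,
        "Kurtosis": m4 / m2**2 - 3.0,  # excess kurtosis (normal distribution = 0)
    }


for name, value in summarize([1.0, 2.0, 3.0, 4.0, 5.0]).items():
    print(f"{name}: {value:.4f}")
```

Skewness and kurtosis are undefined for a constant series (zero variance), so a flat trace would produce NaN here.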

CSV File (csv/graph3.csv):

X,Y
0.0,4.5
1.0,5.2
2.0,3.8
...
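
A file in this two-column format can be loaded back for further analysis with the standard library. The sketch below parses the three sample rows shown above from an in-memory string; the helper name `load_xy` is hypothetical.

```python
import csv
import io


def load_xy(csv_file):
    """Read X,Y rows from an open CSV file object into two lists of floats."""
    reader = csv.DictReader(csv_file)
    xs, ys = [], []
    for row in reader:
        xs.append(float(row["X"]))
        ys.append(float(row["Y"]))
    return xs, ys


# The three sample rows from the example, used in place of a real file.
sample = io.StringIO("X,Y\n0.0,4.5\n1.0,5.2\n2.0,3.8\n")
xs, ys = load_xy(sample)
print(xs, ys)

# With a real export, open the file instead:
# with open("csv/graph3.csv", newline="") as f:
#     xs, ys = load_xy(f)
```

Since pandas is already a dependency, `pandas.read_csv("csv/graph3.csv")` would do the same job in one call.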

Screenshots

Value Extraction: [screenshot]

Grid Lines: [screenshot]

Data Points: [screenshot]

How It Works

Preprocessing: Loads the image, converts it to grayscale, and applies thresholding so that lines and edges stand out for the detection steps that follow.
Axis Detection: Uses Hough Line Transform to identify horizontal (X-axis) and vertical (Y-axis) lines.
Value Extraction: Crops text regions from axes and uses Tesseract OCR to parse numerical values.
Grid Detection: Scans for vertical and horizontal lines, clusters them, and maps them to interpolated axis values.
Digitization: Finds non-zero pixels (data points) and interpolates their pixel coordinates to real values.
Analysis: Detects local extrema, computes statistics, and exports to CSV.
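
The digitization and analysis steps above can be sketched as follows. This is an illustration under assumptions, not the code in extractor.py: the function names are hypothetical, the axis is assumed to be linear, and the reference pairs stand in for tick positions and the values OCR read from them. Image rows grow downward, so the Y mapping has a negative slope.

```python
import numpy as np


def pixel_to_value(pixels, ref_pixels, ref_values):
    """Map pixel coordinates to axis values with a least-squares linear fit
    through reference (pixel, value) pairs, e.g. OCR-read tick labels."""
    slope, intercept = np.polyfit(ref_pixels, ref_values, 1)
    return slope * np.asarray(pixels, dtype=float) + intercept


def local_extrema(y):
    """Return indices of strict local maxima and minima of a 1-D sequence."""
    y = np.asarray(y, dtype=float)
    d = np.diff(y)
    # A maximum is a rise followed by a fall; a minimum is the reverse.
    maxima = np.where((d[:-1] > 0) & (d[1:] < 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] > 0))[0] + 1
    return maxima, minima


# Hypothetical Y ticks: pixel rows 400, 300, 200 labelled 0, 5, 10.
curve_rows = [350, 280, 320, 240, 260]
values = pixel_to_value(curve_rows, [400, 300, 200], [0.0, 5.0, 10.0])
maxima, minima = local_extrema(values)
print(values, maxima, minima)
```

Fitting a line through all ticks, rather than interpolating between two of them, averages out small OCR or pixel-position errors. For noisy traces, `scipy.signal.find_peaks` offers more control over which peaks count.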

Directory Structure

meekhumor-graph-extractor/
├── README.md
├── extractor.py
├── graph/          # Input graph images (e.g., graph3.png)
└── csv/            # Output CSV files (auto-created)

Limitations

Best for simple 2D line plots with visible grid lines.
OCR accuracy depends on text clarity; may fail on handwritten or stylized labels.
No support for logarithmic scales or multi-line graphs, though the script could be extended to handle them.
Requires manual image path updates; consider adding command-line arguments for production use.
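
The command-line arguments suggested above could look something like this. It is a sketch only: the argument names, the `--csv-dir` option, and the rule that the CSV takes the image's base name are assumptions, chosen to match the graph/graph3.png to csv/graph3.csv default described in Usage.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse the image path and output directory from the command line."""
    parser = argparse.ArgumentParser(
        description="Extract data points from a graph image."
    )
    parser.add_argument(
        "image",
        nargs="?",
        default="graph/graph3.png",
        help="path to the input graph image",
    )
    parser.add_argument(
        "--csv-dir",
        default="csv",
        help="directory where the output CSV is written",
    )
    return parser.parse_args(argv)


def output_csv_path(image_path, csv_dir):
    """Derive the CSV path from the image file name (assumed convention)."""
    return Path(csv_dir) / (Path(image_path).stem + ".csv")


# Demonstration with an explicit argument list; a real script would call
# parse_args() with no arguments so that sys.argv is used.
args = parse_args(["graph/your_graph.png"])
print("Input image:", args.image)
print("Output CSV:", output_csv_path(args.image, args.csv_dir))
# preprocess_image(args.image)  # the call that is currently edited by hand
```

With this in place the script could be run as `python extractor.py graph/your_graph.png` without editing the source.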

Contributing

Contributions are welcome! Fork the repo, make changes, and submit a pull request. For major changes, please open an issue first.
