Smart Data Entry Killer

A Python script that parses complex text, images, and tables from bulk PDFs and writes them to Excel

About The Project

The main goal of this project is to enter data into Excel from complex PDFs. A PDF is considered complex if it spans multiple pages and contains tables of varying shapes and dimensions, chemical structure images, drawings, diagrams, and similar content.
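The PDF-to-Excel flow described above could be sketched roughly as follows. This is only an illustration, not the logic in `BurstProcessor.py`: the function names `clean_table`, `extract_tables`, and `write_tables` are hypothetical, only table extraction is covered (no images or drawings), and writing `.xlsx` files through pandas is assumed to use `openpyxl`, which is not in the dependency list.

```python
from typing import Iterable, List, Optional

Row = List[Optional[str]]


def clean_table(table: Iterable[Row]) -> List[List[str]]:
    """Normalise one extracted table (hypothetical helper).

    Empty cells (None) become "", runs of whitespace and line breaks
    collapse to single spaces, and fully empty rows are dropped.
    """
    cleaned = []
    for row in table:
        cells = [" ".join((cell or "").split()) for cell in row]
        if any(cells):
            cleaned.append(cells)
    return cleaned


def extract_tables(pdf_path: str) -> List[List[List[str]]]:
    """Collect every non-empty table from every page of one PDF."""
    import pdfplumber  # third-party; imported lazily so clean_table works without it

    tables = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            for table in page.extract_tables():
                cleaned = clean_table(table)
                if cleaned:
                    tables.append(cleaned)
    return tables


def write_tables(tables: List[List[List[str]]], xlsx_path: str) -> None:
    """Write each table to its own sheet of one workbook."""
    import pandas as pd  # assumption: an .xlsx engine such as openpyxl is installed

    if not tables:
        return  # a workbook needs at least one sheet
    with pd.ExcelWriter(xlsx_path) as writer:
        for index, rows in enumerate(tables, start=1):
            pd.DataFrame(rows).to_excel(
                writer, sheet_name=f"table_{index}", index=False, header=False
            )
```

With these helpers, `write_tables(extract_tables("input.pdf"), "output.xlsx")` would produce one sheet per detected table; both file names are placeholders.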

Built With

  • beautifulsoup4==4.11.1
  • cryptography==37.0.4
  • html5lib==1.1
  • lxml==4.9.1
  • numpy==1.23.1
  • pandas==1.4.3
  • pdfminer.six==20220524
  • pdfplumber==0.7.4
  • Pillow==9.2.0
  • pipreqs==0.4.11
  • PyMuPDF==1.20.1
  • urllib3==1.26.11
  • Wand==0.6.9
  • xlrd==2.0.1

Prerequisites

You need Python 3.7 or later and pip 20.0 or later for this project. It was developed with Python 3.9.13 and pip 22.2.1.
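As a quick sanity check before installing, the version requirements above could be verified with a few lines of Python. This is a minimal sketch; `meets_minimum` and `check_environment` are hypothetical helpers, not part of this repository.

```python
import sys

MIN_PYTHON = (3, 7)  # minimum versions taken from the prerequisites above
MIN_PIP = (20, 0)


def meets_minimum(version: str, minimum: tuple) -> bool:
    """Compare the leading numeric parts of a dotted version string."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break  # stop at suffixes such as "0rc1"
        parts.append(int(piece))
    while len(parts) < len(minimum):
        parts.append(0)  # treat "20" the same as "20.0"
    return tuple(parts) >= minimum


def check_environment() -> list:
    """Return a list of human-readable problems (empty if everything is fine)."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python 3.7 or later is required")
    try:
        import pip
    except ImportError:
        problems.append("pip is not importable from this interpreter")
    else:
        if not meets_minimum(pip.__version__, MIN_PIP):
            problems.append("pip 20.0 or later is required")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

Running the file prints nothing when both requirements are satisfied.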

Installation

Follow these steps to set up and run the script. Steps 2 and 3 are run from inside the cloned SmartDataEntryKiller directory.

  1. Clone the repo
     git clone https://github.com/akifislam/SmartDataEntryKiller.git
  2. Install dependencies
     pip install -r requirements.txt
  3. Run the script
     python3 BurstProcessor.py

Contact

Akif Islam - [email protected]

Project Link: https://github.com/akifislam/SmartDataEntryKiller

Special Thanks

  • Mohammad Ruhul Ameen Bhai, for encouraging me to complete this seemingly impossible task
  • Stack Overflow, for saving my life and earning me recognition among outsiders as a Python developer (though I know nothing about it)