Olive Scrapper

Introduction

Olive Scrapper is a web scraping tool that extracts Odia-language data from websites and collects relevant information for further analysis and processing. It is written in Python and uses several libraries to fetch, parse, and store the extracted data.

Features

Extract data from multiple websites by providing a list of URLs or using a sitemap.
Handle different types of documents, including PDF, TXT, and DOCX.
Export the extracted data in various formats, such as JSONL (JSON Lines) or text files (.TXT), for easy storage and analysis.
Handle errors gracefully and provide informative messages in case of unsuccessful extractions.
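
As a rough illustration of how the sitemap input, JSONL export, and error handling listed above could fit together, here is a minimal sketch using only the Python standard library. It is not the project's actual code: the function names `urls_from_sitemap` and `scrape_to_jsonl`, the `fetch_text` callable, and the `url`/`text` record fields are all hypothetical and chosen only for this example.

```python
import json
import xml.etree.ElementTree as ET
from typing import Callable, Iterable, List, Tuple

# Namespace used by standard sitemap.xml files (sitemaps.org protocol).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def urls_from_sitemap(sitemap_xml: str) -> List[str]:
    """Return every non-empty <loc> URL found in a sitemap XML string."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.iter(f"{SITEMAP_NS}loc")
        if loc.text and loc.text.strip()
    ]


def scrape_to_jsonl(
    urls: Iterable[str],
    fetch_text: Callable[[str], str],  # hypothetical: caller supplies the fetcher
    out_path: str,
) -> List[Tuple[str, str]]:
    """Write one JSON object per line for each URL that could be fetched.

    Returns a list of (url, error message) pairs for the URLs that failed,
    so the caller can report them instead of the whole run crashing.
    """
    failures: List[Tuple[str, str]] = []
    with open(out_path, "w", encoding="utf-8") as out:
        for url in urls:
            try:
                text = fetch_text(url)
            except Exception as exc:  # record the failure and keep going
                failures.append((url, str(exc)))
                continue
            # ensure_ascii=False keeps Odia script readable in the output file.
            record = {"url": url, "text": text}
            out.write(json.dumps(record, ensure_ascii=False) + "\n")
    return failures
```

The fetcher is passed in as a parameter so the same loop could be reused for HTML pages, PDF, TXT, or DOCX sources, each with its own extraction function. How the tool itself dispatches between document types is not described here, so that part is left out of the sketch.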

Acknowledgments

This web scraper is inspired by and built upon various open-source libraries and tutorials available on the web. We thank the contributors of those projects for their valuable work.

Contact

For any issues, suggestions, or contributions, please contact OdiaGenAI at [email protected]. Feel free to submit bug reports or feature requests on the repository's issue tracker.

Snapshots

Happy scraping!
