OdiaGenAI/Olive_Scrapper

Olive Scrapper

Introduction

Olive Scrapper is a web scraping tool that extracts Odia-language data from websites and collects it for further analysis and processing. It is written in Python and uses several libraries to fetch, parse, and store the extracted data.

Features

  • Extract data from multiple websites by providing a list of URLs or using a sitemap.
  • Handle different types of documents, including PDF, TXT, and DOCX.
  • Export the extracted data in various formats, such as JSONL (JSON Lines) or text files (.TXT), for easy storage and analysis.
  • Handle errors gracefully and provide informative messages in case of unsuccessful extractions.
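
As a rough illustration of the sitemap and JSONL features listed above, here is a minimal sketch using only the Python standard library. It is not the tool's actual implementation: the function names `urls_from_sitemap` and `write_jsonl`, the record fields `url` and `text`, and the example URLs are all assumptions made for this example. It assumes the sitemap follows the sitemaps.org format, and it writes JSONL with `ensure_ascii=False` so Odia characters stay readable in the output file.

```python
import io
import json
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(sitemap_xml: str) -> list[str]:
    """Return every <loc> URL found in a sitemap XML string (hypothetical helper)."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.iterfind(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


def write_jsonl(records, fp) -> int:
    """Write one JSON object per line to fp and return the number written."""
    count = 0
    for record in records:
        # ensure_ascii=False keeps Odia text as-is instead of \u escapes.
        fp.write(json.dumps(record, ensure_ascii=False) + "\n")
        count += 1
    return count


if __name__ == "__main__":
    # Placeholder sitemap; a real one would be downloaded from the target site.
    sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url><loc>https://example.com/a</loc></url>
      <url><loc>https://example.com/b</loc></url>
    </urlset>"""
    urls = urls_from_sitemap(sample)
    buffer = io.StringIO()
    # The "text" field would hold the scraped page content; left empty here.
    write_jsonl(({"url": u, "text": ""} for u in urls), buffer)
    print(buffer.getvalue(), end="")
```

Writing to any file-like object keeps the export step independent of where the data ends up; passing `open("output.jsonl", "w", encoding="utf-8")` instead of the in-memory buffer would produce a file on disk.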

Acknowledgments

This web scraper is inspired by and built upon various open-source libraries and tutorials available on the web. We thank the contributors of those projects for their valuable work.

Contact

For any issues, suggestions, or contributions, please contact OdiaGenAI at [email protected]. Feel free to submit bug reports or feature requests on the repository's issue tracker.

Snapshots

[Images: Snapshot 1, Snapshot 2, Snapshot 3]

Happy scraping!
