This repository showcases a collection of Python-based scraping projects developed during my Python stack internship at Infosys Springboard. The projects demonstrate browser automation, dynamic user interfaces, and data extraction from the web, providing practical solutions to real-world problems.
- A web scraping application built with Streamlit and BeautifulSoup to extract deals from DealsHeaven (a scraping sketch follows this list).
- Provides an intuitive user interface with status tracking, enhanced visuals, and a built-in help section.
- Includes category filtering and dynamic display of deal information such as product details, images, and prices.
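A minimal sketch of the extraction step, assuming DealsHeaven serves paginated, card-style listings; the URL pattern and CSS selectors here are illustrative placeholders, not the repository's actual code:

```python
import requests
from bs4 import BeautifulSoup

def fetch_deals(page=1):
    """Scrape one listing page and return the deals as dictionaries."""
    url = f"https://dealsheaven.in/?page={page}"  # assumed URL pattern
    soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")
    deals = []
    for card in soup.select(".product-item-detail"):  # placeholder card selector
        title = card.select_one("h3")
        price = card.select_one(".price")
        deals.append({
            "title": title.get_text(strip=True) if title else None,
            "price": price.get_text(strip=True) if price else None,
            "image": card.img["src"] if card.img else None,
        })
    return deals
```

The Streamlit layer can then render each dictionary as a deal card and re-run the fetch whenever the user switches category or page.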
- Scrapes library data from Public Libraries using Selenium.
- Data is stored in an SQLite database with two relational tables (see the schema sketch after this list):
- States: Contains state IDs and names.
- Libraries: Stores library details, including city, address, zip, and phone, linked to the respective state ID.
- The GUI, built with Streamlit, allows users to select a state and view its libraries dynamically.
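A minimal sketch of the two-table schema described above; the column and file names are inferred from this README rather than copied from the repository:

```python
import sqlite3

conn = sqlite3.connect("libraries.db")  # hypothetical database file name
conn.executescript("""
CREATE TABLE IF NOT EXISTS states (
    state_id INTEGER PRIMARY KEY,
    name     TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS libraries (
    library_id INTEGER PRIMARY KEY AUTOINCREMENT,
    state_id   INTEGER NOT NULL REFERENCES states(state_id),  -- links each library to its state
    city       TEXT,
    address    TEXT,
    zip        TEXT,
    phone      TEXT
);
""")
conn.commit()
```

With this layout the UI needs only one parameterized query, for example `SELECT city, address, zip, phone FROM libraries WHERE state_id = ?`, to list the libraries of whichever state the user selects.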
- Automates job scraping from Behance Jobs using Selenium (see the sketch after this list).
- Displays job cards with company details, descriptions, and categories.
- Implements a dynamic search bar with suggestions for filtering job listings based on organizations.
- Enhances the user experience with light and dark themes and a sidebar for category selection.
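A minimal sketch of the Selenium step, assuming the job board renders card-style listings after the page loads; the URL and selectors below are illustrative placeholders:

```python
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://www.behance.net/joblist")
driver.implicitly_wait(10)  # poll up to 10 s when locating elements on the dynamic page

jobs = []
for card in driver.find_elements(By.CSS_SELECTOR, ".job-card"):  # placeholder selector
    jobs.append({
        "company": card.find_element(By.CSS_SELECTOR, ".company").text,
        "title": card.find_element(By.CSS_SELECTOR, ".title").text,
        "category": card.find_element(By.CSS_SELECTOR, ".category").text,
    })
driver.quit()
```

The collected list can then back both the Streamlit job cards and the organization suggestions offered by the search bar.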
- Web Scraping: Extracts data from diverse web sources using BeautifulSoup and Selenium.
- Data Storage: Stores structured data in SQLite databases for efficient querying and manipulation.
- Interactive GUIs: Dynamic and responsive user interfaces built with Streamlit for intuitive navigation.
- Theming: Light and dark theme options for better visual customization (see the sketch after this list).
- Automation: Seamless integration with automation tools for the scraping processes.
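As a hedged illustration of the theming feature, one common Streamlit pattern is an in-app toggle that injects CSS overrides; the actual apps may implement their themes differently:

```python
import streamlit as st

# Hypothetical toggle; the widget label and colors are placeholders.
theme = st.sidebar.radio("Theme", ["Light", "Dark"])
if theme == "Dark":
    # Override Streamlit's default palette with injected CSS.
    st.markdown(
        "<style>.stApp { background-color: #111; color: #eee; }</style>",
        unsafe_allow_html=True,
    )
```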
- Ensure Python (version 3.8 or above) is installed on your system.
- Clone this repository:
git clone https://github.com/VarshiniShreeV/Scrapers.git
- Navigate to the project directory:
cd Scrapers
- Install dependencies:
pip install -r Requirements.txt
- Navigate to the respective project folder.
- DealsHeaven scraper: run the main application file:
streamlit run ui.py
- Public Libraries scraper: run the library scraper to populate the database:
streamlit run ui.py
- Behance Jobs scraper: run the main UI file (scraper and filters integrated):
streamlit run ui.py