
Scraping Projects

This repository showcases a collection of Python-based scraping projects developed during my Python stack internship at Infosys Springboard. The projects demonstrate web automation, dynamic user interfaces, and data extraction from web sources, offering practical solutions to real-world problems.


Overview

1. Deals Hunter

  • A web scraping application built using Streamlit and BeautifulSoup to extract deals from DealsHeaven.
  • Provides an intuitive user interface with status tracking, enhanced visuals, and a built-in help section.
  • Includes category filtering and dynamic display of deal information such as product details, images, and prices.
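
For orientation, here is a minimal sketch of how a BeautifulSoup-based deal scraper for DealsHeaven could look. The URL pattern, CSS selectors, and the `fetch_deals` helper are illustrative assumptions, not the repository's actual code.

```python
# Minimal sketch of the Deals Hunter scraping step (not the repository's exact code).
# The URL pattern and CSS selectors are assumptions about DealsHeaven's markup.
import requests
from bs4 import BeautifulSoup

def fetch_deals(page=1):
    """Fetch one page of deals and return a list of dictionaries."""
    url = f"https://dealsheaven.in/?page={page}"          # assumed URL scheme
    response = requests.get(url, timeout=10)
    response.raise_for_status()

    soup = BeautifulSoup(response.text, "html.parser")
    deals = []
    for card in soup.select(".product-item-detail"):      # assumed card selector
        title = card.select_one(".title")                 # assumed field selectors
        price = card.select_one(".price")
        image = card.select_one("img")
        deals.append({
            "title": title.get_text(strip=True) if title else None,
            "price": price.get_text(strip=True) if price else None,
            "image": image.get("src") if image else None,
        })
    return deals

if __name__ == "__main__":
    for deal in fetch_deals(page=1)[:5]:
        print(deal)
```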

2. Libraries Near You

  • Scrapes library data from Public Libraries using Selenium.
  • Data is stored in an SQLite database with two relational tables:
    • States: Contains state IDs and names.
    • Libraries: Stores library details, including city, address, zip, and phone, linked to the respective state ID.
  • The GUI, built with Streamlit, allows users to select a state and view its libraries dynamically.
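
The two-table layout described above can be sketched with Python's built-in sqlite3 module; column names beyond those listed in this README are assumptions.

```python
# Minimal sketch of the two-table SQLite schema described above; column names
# beyond those listed in this README are assumptions.
import sqlite3

def create_schema(db_path="libraries.db"):
    """Create the States and Libraries tables if they do not already exist."""
    conn = sqlite3.connect(db_path)
    cur = conn.cursor()
    cur.execute("""
        CREATE TABLE IF NOT EXISTS States (
            state_id INTEGER PRIMARY KEY AUTOINCREMENT,
            name     TEXT UNIQUE NOT NULL
        )
    """)
    cur.execute("""
        CREATE TABLE IF NOT EXISTS Libraries (
            library_id INTEGER PRIMARY KEY AUTOINCREMENT,
            state_id   INTEGER NOT NULL,
            name       TEXT,
            city       TEXT,
            address    TEXT,
            zip        TEXT,
            phone      TEXT,
            FOREIGN KEY (state_id) REFERENCES States(state_id)
        )
    """)
    conn.commit()
    conn.close()

if __name__ == "__main__":
    create_schema()
```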

3. Behance Job Listings Scraper

  • Automates job scraping from Behance Jobs using Selenium.
  • Displays job cards with company details, descriptions, and categories.
  • Implements a dynamic search bar with suggestions for filtering job listings by organization.
  • Enhanced user experience with light and dark themes and a sidebar for category selection.
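
A minimal Selenium sketch for collecting job cards might look like the following; the URL and CSS selectors are assumptions about Behance's markup rather than the project's exact logic.

```python
# Minimal Selenium sketch for collecting Behance job cards. The URL and CSS
# selectors are assumptions about the page structure, not the project's exact code.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options

def scrape_behance_jobs(limit=20):
    """Open the Behance job list and collect company/title pairs."""
    options = Options()
    options.add_argument("--headless=new")        # run without opening a browser window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get("https://www.behance.net/joblist")
        driver.implicitly_wait(10)                # allow dynamically loaded cards to appear

        jobs = []
        cards = driver.find_elements(By.CSS_SELECTOR, "div[class*='JobCard']")  # assumed selector
        for card in cards[:limit]:
            jobs.append({
                "company": card.find_element(By.CSS_SELECTOR, "p[class*='company']").text,  # assumed
                "title": card.find_element(By.CSS_SELECTOR, "a[class*='jobTitle']").text,   # assumed
            })
        return jobs
    finally:
        driver.quit()

if __name__ == "__main__":
    for job in scrape_behance_jobs(limit=5):
        print(job)
```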

Key Features

  • Web Scraping:
    Extracts data from diverse web sources using BeautifulSoup and Selenium.
  • Data Storage:
    Stores structured data in SQLite databases for efficient querying and manipulation.
  • Interactive GUIs:
    Dynamic and responsive user interfaces built with Streamlit for intuitive navigation.
  • Theming:
    Light and dark theme options for better visual customization.
  • Automation:
    Browser automation with Selenium drives the scraping workflows end to end.
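
The sidebar-driven Streamlit pattern shared by these apps can be sketched as follows; the categories and placeholder data are illustrative, not the repositories' actual values. Save the snippet as a .py file and launch it with streamlit run.

```python
# Minimal sketch of the Streamlit pattern shared by these apps: a sidebar filter
# drives what the main page displays. The categories and data are placeholders.
import streamlit as st

# Placeholder records standing in for the scraper's output.
DEALS = [
    {"title": "Sample deal", "price": "₹499", "image": None},
    {"title": "Another deal", "price": "₹1,299", "image": None},
]

st.title("Deals Hunter")
category = st.sidebar.selectbox("Category", ["All", "Electronics", "Fashion", "Grocery"])  # assumed categories

st.caption(f"Showing deals for: {category}")
for deal in DEALS:
    st.subheader(deal["title"])
    st.write(deal["price"])
    if deal["image"]:
        st.image(deal["image"])
```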

Installation Guide

  1. Ensure Python (version 3.8 or above) is installed on your system.
  2. Clone this repository:
    git clone https://github.com/VarshiniShreeV/Scrapers.git
  3. Navigate to the project directory:
    cd Scrapers
  4. Install dependencies:
    pip install -r Requirements.txt
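
If needed, the core libraries named in this README can also be installed directly, for example:
    pip install streamlit beautifulsoup4 selenium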

How to Run the Applications

Deals Hunter

  1. Navigate to the respective folder.
  2. Run the main application file:
    streamlit run ui.py

Public Libraries Data

  1. Run the library scraper to populate the database:
    streamlit run ui.py

Behance Job Listings Scraper

  1. Run the main UI file (scraper and filters integrated):
    streamlit run ui.py

Screenshots

DealsHunter UI

(Screenshots 1–4 available in the repository)

Public Libraries Data Viewer

(Screenshots 1–5 available in the repository)

Behance Job Listings Viewer

(Screenshots 1–7 available in the repository)