avivbenami/instagram-post-scraper

Instagram Post Scraper

A Selenium-based scraper for checking if Instagram posts contain sensitive content.

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Contact
  6. Acknowledgments

About The Project

This project is a Selenium-based scraper that checks whether Instagram posts contain sensitive content. It uses headless Chrome to load each Instagram post URL and checks the page source for the presence of sensitive content.
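As a rough illustration of the approach described above, the sketch below loads a page in headless Chrome and searches its source for a marker phrase. It is not the project's actual implementation: the function names `fetch_page_source` and `page_source_is_sensitive` are hypothetical, and the marker phrase in `ASSUMED_MARKERS` is an assumption, since the strings the scraper really looks for are not documented here.

```python
from typing import Sequence

# Assumption: placeholder marker phrase. The text that the real scraper
# searches for in the page source is not documented in this README.
ASSUMED_MARKERS = ("sensitive content",)


def page_source_is_sensitive(
    page_source: str, markers: Sequence[str] = ASSUMED_MARKERS
) -> bool:
    """Return True if any marker phrase occurs in the page source (case-insensitive)."""
    lowered = page_source.lower()
    return any(marker.lower() in lowered for marker in markers)


def fetch_page_source(url: str) -> str:
    """Load a URL in headless Chrome and return the rendered page source."""
    # Imported lazily so the pure check above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless=new")  # run Chrome without a visible window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        # Always close the browser, even if loading the page fails.
        driver.quit()
```

Keeping the string check separate from the browser call means the detection logic can be exercised on saved HTML without launching Chrome.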

Built With

Getting Started

To get a local copy up and running, follow these simple steps.

Prerequisites

Installation

  1. Clone the repo

    git clone https://github.com/avivbenami/instagram-post-scraper.git
  2. Navigate to the project directory

    cd instagram-post-scraper
  3. Create a virtual environment

    python -m venv .venv
  4. Activate the virtual environment

    # Windows
    .\.venv\Scripts\activate
    # macOS / Linux
    source .venv/bin/activate
  5. Install dependencies

    pip install -r requirements.txt

Usage

    import time

    # InstagramPostScraper is defined in this repository; import it from
    # the module that provides it (the module path is not shown here).


    def get_sens(url_list: list) -> list:
        """Return the URLs in url_list whose posts are reported as sensitive."""
        sensitive_urls = []

        for url in url_list:
            # Create an instance of InstagramPostScraper for the current URL
            scraper = InstagramPostScraper(url)

            # Sleep for 2 seconds between requests
            time.sleep(2)

            # Check if the post is sensitive
            if scraper.is_sensitive():
                sensitive_urls.append(url)

        return sensitive_urls


    if __name__ == "__main__":
        urls = [
            "https://www.instagram.com/cristiano/reel/C09VyjZtoyx/",
            "https://www.instagram.com/reel/C0tWyIktI1Y",
        ]
        print(get_sens(urls))
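In the loop above, an exception raised for one URL propagates and ends the whole run. A minimal sketch of a more tolerant variant follows, offered as one possible approach rather than the project's own. The names `check_urls`, `check`, `delay`, and `sleep` are hypothetical; `check` stands in for a callable such as `lambda url: InstagramPostScraper(url).is_sensitive()`.

```python
import time
from typing import Callable, Iterable, List, Tuple


def check_urls(
    urls: Iterable[str],
    check: Callable[[str], bool],
    delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Run check on each URL, collecting sensitive URLs and per-URL failures."""
    sensitive: List[str] = []
    failed: List[Tuple[str, str]] = []

    for index, url in enumerate(urls):
        # Pause between requests, but not before the first one.
        if index > 0:
            sleep(delay)
        try:
            if check(url):
                sensitive.append(url)
        except Exception as exc:
            # Record the failure and continue with the remaining URLs.
            failed.append((url, repr(exc)))

    return sensitive, failed
```

Passing `sleep` as a parameter is a convenience for testing: a stub can replace `time.sleep` so the function can be exercised without real delays.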

Roadmap

  • Improve error handling
  • Proxy management

Contact

Project Link: https://github.com/avivbenami/instagram-post-scraper

Acknowledgments
