Web Iota

Iota is a web scraper that finds all of the images and links (sub-URLs) on a webpage. It is built on the Python libraries Selenium, Requests, and Beautiful Soup.

Iota 1

  • Supports scraping images and links
  • Uses the Requests library and Beautiful Soup
  • Cannot handle content rendered by JavaScript
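
The approach listed above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of iota1.py: the function names extract_images_and_links and scrape are hypothetical, and the sketch assumes the requests and beautifulsoup4 packages are installed.

```python
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup


def extract_images_and_links(html, base_url):
    """Return (image_urls, link_urls) found in html, resolved against base_url.

    Hypothetical helper for illustration; not taken from iota1.py.
    """
    soup = BeautifulSoup(html, "html.parser")
    # Only tags that actually carry a src/href attribute are considered.
    images = [urljoin(base_url, img["src"]) for img in soup.find_all("img", src=True)]
    links = [urljoin(base_url, a["href"]) for a in soup.find_all("a", href=True)]
    return images, links


def scrape(url):
    """Download url with Requests and extract its images and links."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return extract_images_and_links(response.text, url)
```

Because only the HTML returned by the server is parsed, anything a page adds later through JavaScript will not show up in the results.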

Iota 2

  • Requires Selenium with the PhantomJS driver
  • Uses Requests, Selenium, and Beautiful Soup
  • Can handle content rendered by JavaScript
  • Can scrape most websites that use anti-scraping measures
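
A browser-driven variant might look like the sketch below. It is an assumption-laden illustration rather than the code in iota2.py: the names make_driver and scrape_rendered are hypothetical, and headless Chrome is swapped in for PhantomJS, since recent Selenium releases no longer ship a PhantomJS driver.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def make_driver():
    """Create a headless browser (assumption: Chrome stands in for PhantomJS)."""
    # Imported here so the parsing helper below works without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    return webdriver.Chrome(options=options)


def scrape_rendered(driver, url):
    """Load url in the browser, then parse the page source the browser reports.

    Hypothetical helper; driver is any object with get() and page_source.
    """
    driver.get(url)
    soup = BeautifulSoup(driver.page_source, "html.parser")
    images = [urljoin(url, img["src"]) for img in soup.find_all("img", src=True)]
    links = [urljoin(url, a["href"]) for a in soup.find_all("a", href=True)]
    return images, links
```

In use, the driver would be created once with make_driver(), passed to scrape_rendered for each page, and closed with driver.quit() when finished. Pages that load content well after the initial page load may still need an explicit wait before the source is read.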

Usage

Run python iota1.py -h to see the available options:

usage: iota.py [-h] [-img] [-all_img] [-link] [url]

positional arguments:
  url         The URL of the target website/webpage

optional arguments:
  -h, --help  show this help message and exit
  -img        Find all of the image on the webpage
  -all_img    Find all of the image on the webpage and subwebpages
  -link       Find all of the suburls/links on the webpage

Example: python iota2.py -img https://www.w3schools.com/html/html_classes.asp
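
For readers curious how such a command line could be declared, here is a rough argparse sketch that mirrors the options in the help text above. The function name build_parser is hypothetical, and the real scripts may define their arguments differently.

```python
import argparse


def build_parser():
    """Build a parser with the same options as the help text (illustrative only)."""
    parser = argparse.ArgumentParser(prog="iota.py")
    # nargs="?" makes the URL optional, matching "[url]" in the usage line.
    parser.add_argument("url", nargs="?",
                        help="The URL of the target website/webpage")
    parser.add_argument("-img", action="store_true",
                        help="Find all of the images on the webpage")
    parser.add_argument("-all_img", action="store_true",
                        help="Find all of the images on the webpage and subwebpages")
    parser.add_argument("-link", action="store_true",
                        help="Find all of the suburls/links on the webpage")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Each flag is a simple on/off switch, so a script can check args.img, args.all_img, and args.link to decide which kind of scrape to run against args.url.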