Web-Scraping-with-Beautiful-Soup-master

Web-Scraping

A collection of small programs that extract data from websites and package it into a useful form using BeautifulSoup, a Python package for parsing HTML and XML documents. Once you retrieve the raw HTML of a site, you can select and extract elements with BeautifulSoup, which parses the raw HTML string and produces an object that mirrors the document's structure.
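As a minimal sketch of that parse-then-extract flow, the snippet below feeds a small HTML string to BeautifulSoup and pulls out the page title and a list of links. The HTML content, the `item` class name, and the `extract_links` helper are hypothetical and only stand in for a real downloaded page; the built-in `html.parser` backend is assumed so that no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for the raw text of a downloaded page.
RAW_HTML = """
<html>
  <head><title>Example Store</title></head>
  <body>
    <h1>Products</h1>
    <ul id="products">
      <li class="item"><a href="/widgets/1">Blue Widget</a></li>
      <li class="item"><a href="/widgets/2">Red Widget</a></li>
    </ul>
  </body>
</html>
"""


def extract_links(raw_html):
    """Return (link text, href) pairs for every <a> inside an <li class="item">."""
    soup = BeautifulSoup(raw_html, "html.parser")
    links = []
    for li in soup.find_all("li", class_="item"):
        anchor = li.find("a")
        if anchor is not None and anchor.has_attr("href"):
            links.append((anchor.get_text(strip=True), anchor["href"]))
    return links


# The parsed object mirrors the document tree, so tags can be reached by name.
soup = BeautifulSoup(RAW_HTML, "html.parser")
page_title = soup.title.string
links = extract_links(RAW_HTML)

print(page_title)
for text, href in links:
    print(text, href)
```

In a real scraper, `RAW_HTML` would be replaced by the text of an HTTP response; the extraction logic stays the same.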

The Rules of Scraping

  1. Check a website's Terms and Conditions before scraping it and read the statements about legal use of the data.
  2. Do not request data from the website too aggressively; ensure that your program behaves in a reasonable manner.
  3. Revisit the website periodically and rewrite your code as needed, since the layout of the site may change.
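Parts of these rules can be supported in code. The sketch below is one possible approach, not something this repository prescribes: it checks URLs against `robots.txt` rules with the standard library's `urllib.robotparser` and enforces a minimum delay between requests. The `PoliteFetcher` class, the `example-scraper` user agent, the sample rules, and the one-second default delay are all hypothetical. Note that `robots.txt` is only a machine-readable hint and does not replace reading the site's Terms and Conditions.

```python
import time
from urllib.robotparser import RobotFileParser


class PoliteFetcher:
    """Hypothetical helper: checks robots.txt rules and spaces out requests."""

    def __init__(self, robots_lines, user_agent="example-scraper", min_delay=1.0):
        self.user_agent = user_agent
        self.min_delay = min_delay  # minimum seconds between two requests
        self._robots = RobotFileParser()
        # Parse rules that were already downloaded, given as a list of lines.
        self._robots.parse(robots_lines)
        self._last_request = None

    def allowed(self, url):
        """Return True if the parsed robots.txt rules permit fetching this URL."""
        return self._robots.can_fetch(self.user_agent, url)

    def wait_turn(self):
        """Sleep until at least min_delay seconds have passed since the last call."""
        if self._last_request is not None:
            remaining = self.min_delay - (time.monotonic() - self._last_request)
            if remaining > 0:
                time.sleep(remaining)
        self._last_request = time.monotonic()


# Hypothetical robots.txt content for illustration only.
SAMPLE_RULES = ["User-agent: *", "Disallow: /private/"]

fetcher = PoliteFetcher(SAMPLE_RULES)
print(fetcher.allowed("https://example.com/products"))
print(fetcher.allowed("https://example.com/private/data"))
```

A scraper would call `allowed(url)` before each request and `wait_turn()` immediately before sending it. Layout changes (rule 3) still need a human to revisit the site, though a scraper can at least raise an error when an expected element is missing rather than silently returning empty data.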