long-island-datasets

Scrapers that collect Long Island datasets from several federal, state, and county websites.

Scrapers to be refactored

  • cms-data.py
  • cms-provider-data.py
  • college-scorecard.py
  • epa-echo.py
  • epa-sems-envirofacts.py
  • usace-fuds-arcgis.py
  • suffolk-county-food-establishment-inspections.py
  • socrata.py
  • open-fda.py
  • nyscjc-determinations.py
  • ntsb-carol.py
  • nhtsa-fars.py
  • irs_exempt_organizations.py
  • hhs_oig_exclusions.py
  • fhwa_nbi_arcgis.py
  • fra_nsrt.py
  • dol_osha.py

To-do

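  • move all configuration to config.py
  • assess whether we need a wrapper around requests
  • refactor li_scraper.py itself
  • add a utility function to check for Long Island (LI) ZIP codes
  • possibly consolidate hhs_oig_exclusions.py, irs_exempt_organizations.py, and nyscjc-determinations.py into a standardized, config-based scraper
  • rename scrapers to be more Pythonic (for example, underscores instead of hyphens in file names)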
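
The renaming item could be planned with a small helper like the one below. This is only a sketch: `pythonic_name` and `plan_renames` are hypothetical names that do not exist in the repository, and the sketch assumes "more Pythonic" means lowercase snake_case file names, since a hyphenated name such as cms-data.py cannot be imported with a plain `import` statement.

```python
from pathlib import PurePath


def pythonic_name(filename: str) -> str:
    """Return a lowercase snake_case version of a scraper file name.

    Assumption: only hyphens and letter case need to change; the
    extension is kept exactly as given.
    """
    path = PurePath(filename)
    stem = path.stem.lower().replace("-", "_")
    return f"{stem}{path.suffix}"


def plan_renames(filenames):
    """Map each file name that would change to its proposed new name."""
    plan = {}
    for name in filenames:
        new_name = pythonic_name(name)
        if new_name != name:  # skip files that are already snake_case
            plan[name] = new_name
    return plan


if __name__ == "__main__":
    # A few names taken from the list of scrapers to be refactored.
    scrapers = ["cms-data.py", "epa-sems-envirofacts.py", "dol_osha.py"]
    for old, new in plan_renames(scrapers).items():
        print(f"{old} -> {new}")
```

The helper only computes a rename plan and does not touch the file system, so the proposed names can be reviewed before any `git mv` is run.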