ISO-3166 Countries with Subdivisions (Regions)

Overview

A simple web crawler built with Scrapy. It crawls Wikipedia's list of all countries and extracts each country's name and its ISO-3166-1 alpha-2 and alpha-3 codes. It then follows the link to each country and extracts its subdivisions (regions) and their corresponding ISO-3166-2 codes.

All of this is exported to a JSON file (country_codes.json).

Requirements

  • Python 3.5+
  • Scrapy

Install

Running

  • From the repo directory run scrapy crawl codes

Note that the crawler does not overwrite the output file country_codes.json; it appends to it. You may therefore want to back up the existing output file first by renaming it.
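
The backup step could be scripted along these lines. This is a minimal sketch and not part of the repository: the helper name backup_output and the timestamped naming scheme are assumptions made for illustration. Only the file name country_codes.json comes from the note above.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def backup_output(path: str = "country_codes.json") -> Optional[Path]:
    """Rename an existing output file so the next crawl starts with a fresh one.

    Hypothetical helper: returns the backup path, or None if there was
    nothing to back up.
    """
    source = Path(path)
    if not source.exists():
        return None

    # Assumed naming scheme: country_codes.<timestamp>.json
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = source.with_name(f"{source.stem}.{stamp}{source.suffix}")

    # Avoid clobbering a backup made within the same second.
    counter = 1
    while target.exists():
        target = source.with_name(f"{source.stem}.{stamp}-{counter}{source.suffix}")
        counter += 1

    source.rename(target)
    return target


if __name__ == "__main__":
    backup = backup_output()
    print(f"Backed up to {backup}" if backup else "No existing output file found")
```

Run from the repo directory before starting the crawl, this moves any previous output aside so that new results are not appended to old ones.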