Code repository for the paper [Bajardi et al. EPJ Data Science 2015]
The following IPython notebooks contain the code used to perform the data analysis for the paper "Unveiling patterns of international communities in a global city using mobile phone data" and make it possible to reproduce its main results.
The notebooks are included with their output; clicking on a link opens the corresponding notebook in the nbviewer service.
-
Parsing of the original dataset into an HDF5 store.
-
Aggregation of the dataset on wider time intervals.
-
Extraction of daily, monthly and yearly entropy time series.
-
03_census_rank_comparison.ipynb
Correlation between mobile call volume and the foreign resident population reported by the census.
-
Classifier of points of interest based on entropy and phone activity.
-
Associations between country names and international calling codes, source: github/mledoze. See the accompanying notebook for the parsing procedure.
-
Census data for the city of Milan aggregated by NIL (Nuclei di Identità Locale), source: http://dati.comune.milano.it.
-
List of points of interest in Milan as defined by TripAdvisor, source: http://www.tripadvisor.com/Attractions-g187849-Activities-Milan_Lombardy.html
-
Remittance data used to compare clusters of countries based on persistent homology, source: http://data.worldbank.org/indicator/BX.TRF.PWKR.DT.GD.ZS.
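As a rough illustration of the aggregation and entropy steps listed above, the sketch below sums call volume per time bin and country code, then computes one entropy value per bin. It is not taken from the notebooks: it assumes that "entropy" means the Shannon entropy of the call-volume distribution across countries within each bin, and the function names, record layout and toy values are all hypothetical.

```python
import math
from collections import defaultdict
from datetime import datetime


def aggregate(records, bin_format="%Y-%m-%d"):
    """Sum call volume per (time bin, country code).

    `records` is an iterable of (timestamp, country_code, volume) tuples;
    this layout is an assumption, not the notebooks' actual schema.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for timestamp, country_code, volume in records:
        totals[timestamp.strftime(bin_format)][country_code] += volume
    return totals


def shannon_entropy(volumes):
    """Shannon entropy (in bits) of a list of non-negative volumes."""
    total = sum(volumes)
    if total <= 0:
        return 0.0
    probabilities = [v / total for v in volumes if v > 0]
    return -sum(p * math.log2(p) for p in probabilities)


def entropy_series(records, bin_format="%Y-%m-%d"):
    """Return {time bin: entropy across countries}, ordered by bin."""
    totals = aggregate(records, bin_format)
    return {
        bin_key: shannon_entropy(list(by_country.values()))
        for bin_key, by_country in sorted(totals.items())
    }


# Made-up toy records: (timestamp, international calling code, call volume)
records = [
    (datetime(2013, 11, 1, 8, 0), 39, 2.0),
    (datetime(2013, 11, 1, 8, 10), 33, 2.0),
    (datetime(2013, 11, 2, 9, 0), 39, 4.0),
]

daily = entropy_series(records)               # one value per day
monthly = entropy_series(records, "%Y-%m")    # one value per month
```

Changing `bin_format` (for example to `"%Y"`) switches between daily, monthly and yearly bins without touching the entropy computation.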
To execute the notebooks you will need additional data and a working Python environment containing the required dependencies.
Download the complete dataset "Telecommunications - SMS, Call, Internet - MI" into the project sub-directory data/telco.
Install the required submodules by running:
git submodule init
git submodule update
Python dependencies can be installed in a virtual environment using the following instructions inside the project root directory:
virtualenv virtualenv
. virtualenv/bin/activate
pip install -r requirements.txt && pip install tables
To start an IPython notebook server using the newly created virtual environment:
. virtualenv/bin/activate
ipython notebook
Select a notebook using the IPython notebook interface and execute its cells. To reproduce the analysis pipeline correctly, the notebooks must be executed in the order given by their numeric prefixes.
You can report any bugs or issues using the GitHub issue tracker.
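The numbering convention can be turned into an explicit execution order with a few lines of Python. This is only a sketch: it assumes every pipeline notebook is named with a numeric prefix followed by an underscore, as in 03_census_rank_comparison.ipynb, and the other file names in the example are hypothetical placeholders.

```python
import re


def execution_order(filenames):
    """Return the numbered notebooks sorted by their numeric prefix.

    Files that do not match the assumed `<number>_<name>.ipynb` pattern
    are ignored.
    """
    numbered = []
    for name in filenames:
        match = re.match(r"(\d+)_.*\.ipynb$", name)
        if match:
            numbered.append((int(match.group(1)), name))
    return [name for _, name in sorted(numbered)]


# Only the 03_ name comes from this repository; the rest are placeholders.
example_files = [
    "03_census_rank_comparison.ipynb",
    "README.md",
    "01_hypothetical_step.ipynb",
    "00_hypothetical_first_step.ipynb",
]

ordered = execution_order(example_files)
```

In practice the list of file names could come from `os.listdir(".")` run in the project root directory.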
See LICENSE.txt.