Latin American Women Writers

This repository contains the code and data behind the story Una constelación de escritoras latinoamericanas (nacidas en el siglo XX).

The analysis scrapes Wikipedia entries for Latin American women writers and presents them as a network graph visualization in a Streamlit web application.
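
As a rough illustration of the network idea, and not the repository's actual pipeline, the sketch below links writers who share an attribute into an undirected graph using only the Python standard library. The placeholder records, the field names `name` and `country`, and the `build_graph` helper are all hypothetical; the real data lives under `data/processed`, and `docs/data-dictionary.md` holds information about it.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical placeholder records standing in for scraped rows.
WRITERS = [
    {"name": "Writer A", "country": "Mexico"},
    {"name": "Writer B", "country": "Mexico"},
    {"name": "Writer C", "country": "Chile"},
    {"name": "Writer D", "country": "Mexico"},
]


def build_graph(records, key="country"):
    """Return an adjacency dict linking records that share the same `key` value."""
    # Group writer names by the chosen attribute.
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record["name"])

    # Every writer becomes a node, even if she ends up with no neighbors.
    graph = {record["name"]: set() for record in records}

    # Connect every pair of writers inside the same group.
    for names in groups.values():
        for a, b in combinations(names, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


graph = build_graph(WRITERS)
```

An adjacency dict like this can be handed to a graph library for layout and rendering; which library the project uses is not stated here.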


Directory Structure

├── app.py                              # Streamlit app file
├── assets                              # Resources for the project
│   ├── datacritica
│   ├── imgs
│   ├── imgs_processed
│   ├── mosaics
│   ├── targets
│   └── targets_processed
├── data                                # Categorized data
│   ├── processed                       # Cleaned data
│   │   ├── escritoras_wiki.csv
│   │   └── escritores_destacados.csv
│   └── raw                             # Original data
│       └── escritoras.csv
├── Dockerfile                          # Commands to build a docker image
├── docs                                # Explanatory materials
│   ├── data-dictionary.md              # Information about the data
│   └── references                      # Papers, manuals, articles, etc.
├── escritoras_latinas                  # Python package
│   ├── data                            # Functions to manipulate data
│   │   ├── analyze.py                  # Module to analyze data
│   │   ├── export.py                   # Module to save exports
│   │   ├── load.py                     # Module to load data and paths
│   │   └── process.py                  # Module to process data
│   └── utils                           # Functions to make common patterns
│       └── paths.py                    # Module to generate relative paths
├── LICENSE                             # Project license
├── notebooks                           # Jupyter notebooks
│   ├── 0.0-scrapping-text.ipynb
│   ├── 0.1-scrapping-text.ipynb
│   ├── 0.2-scrapping-images.ipynb
│   ├── 1.0-annotate-data.ipynb
│   ├── 1.1-process-images.ipynb
│   ├── 2.0-visualize-network.ipynb
│   ├── 2.1-visualize-network.ipynb
│   └── 2.2-visualize-donut-chart.ipynb
├── outputs                             # Exports generated by notebooks
│   ├── figures                         # Generated graphics, maps, etc.
│   │   └── index.html
│   ├── networks                        # Generated graph network
│   │   └── index.html
│   └── tables                          # Generated pivot tables
│   ├── LICENSE
│   ├── photomosaics
│   │   ├── photomosaics.py
│   │   ├── run.py
│   │   └── scrape.py
│   ├── README.md
│   └── requirements.txt
├── Pipfile                             # Project dependencies
├── Pipfile.lock                        # Specific versions of packages on Pipfile
├── README.md                           # Top-level README for this project
├── README-ES.md                        # README in Spanish
├── requirements.txt                    # Project dependencies
├── setup.py                            # Import project as a python module
└── style.css                           # Styles for streamlit app
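
The tree lists `escritoras_latinas/utils/paths.py` as a module to generate relative paths. A minimal sketch of what such a helper could look like with `pathlib` follows; the function name `project_path` and its explicit `root` argument are assumptions for illustration, not the module's real interface.

```python
from pathlib import Path


def project_path(root, *parts):
    """Join `parts` onto `root` and return the resulting path.

    Hypothetical helper: the real module may locate the project root differently.
    """
    return Path(root).joinpath(*parts)


# Build a path to one of the processed files named in the tree above,
# relative to the current directory.
processed = project_path(".", "data", "processed", "escritoras_wiki.csv")
```

Building paths from one root keeps notebooks and the app independent of the directory they are launched from.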

License

This project is released under the MIT License.


This repository was generated with cookiecutter using a data-journalism template for Python.