This is part of the hands-on project for the Master in Data & Decision Science course, by Sirius School of Technology - Module 1 - Statistical Analysis and Business Intelligence.

joaomorossini/Sirius_Module-1_Startup_Genome

The Project

This repository is part of the hands-on project for the Master in Data & Decision Science course at Sirius School of Technology (Module 1: Statistical Analysis and Business Intelligence).


Startup Genome

Startup Genome’s Global Startup Ecosystem Report (GSER) is powered by the world’s most comprehensive and quality-controlled dataset on startup ecosystems. Informed by data on 3.5 million startups across 290 global ecosystems, our data and insights are the product of over a decade of independent research and policy work.

GSER 2023 ranks the top 30 and 10 runner-up global ecosystems, and includes a top 100 ranking of emerging ecosystems. It also takes a look at startup communities from a regional perspective, separately ranking ecosystems in Africa, Asia, Europe, Latin America, MENA, North America, and Oceania.


Disclaimer

This repository shows the chosen approach to cleaning, organizing and structuring the data using Python, but it DOES NOT provide access to the data itself. The data used in this project is proprietary and can only be accessed by the Startup Genome team and by authorized partners.


Developers


Folder Structure

  • data/: Folder to store data.
    • raw/: Folder to store raw data.
    • processed/: Folder to store processed data.
    • external/: Folder to store external data.
  • docs/: Folder to store explanatory documents and additional information.
  • notebooks/: Folder to store Jupyter notebooks.
    • exploratory/: Folder for exploratory analysis notebooks.
    • preprocessing/: Folder for preprocessing notebooks.
    • analysis/: Folder for data analysis notebooks.
  • reports/: Folder to store reports and results.
    • figures/: Folder to store generated figures.
    • results/: Folder to store project results.
  • src/: Folder to store source code.
    • data/: Folder for data manipulation modules.
    • models/: Folder for machine learning model modules.
    • utils/: Folder for utility modules.
    • scripts/: Folder for auxiliary scripts.
  • tests/: Folder to store automated tests.
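
The layout above can be recreated on a fresh checkout with a few lines of Python. This is only a minimal sketch: the function name `create_project_skeleton` and the constant `PROJECT_DIRS` are hypothetical and are not part of this repository's code.

```python
from pathlib import Path

# Sub-folders as listed in the Folder Structure section.
# PROJECT_DIRS is a hypothetical name used only for this sketch.
PROJECT_DIRS = [
    "data/raw",
    "data/processed",
    "data/external",
    "docs",
    "notebooks/exploratory",
    "notebooks/preprocessing",
    "notebooks/analysis",
    "reports/figures",
    "reports/results",
    "src/data",
    "src/models",
    "src/utils",
    "src/scripts",
    "tests",
]


def create_project_skeleton(root):
    """Create every folder in PROJECT_DIRS under `root` and return the paths.

    Folders that already exist are left untouched.
    """
    root = Path(root)
    created = []
    for rel in PROJECT_DIRS:
        path = root / rel
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling `create_project_skeleton(".")` from the project root would create any missing folders without touching existing files.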

Dependencies

pip install -r requirements.txt

  • Place the datasets in "genome\data\raw"
    • abstract.csv
    • IPC Titles.xlsx
    • ListOfCompanies.csv
    • raw_patents.csv
    • table_for_applicants.csv
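
Because the datasets are not distributed with the repository, it can help to confirm they are in place before running any notebook. The check below is a minimal sketch, not code from this project: `missing_raw_files` and `EXPECTED_RAW_FILES` are hypothetical names, and the caller supplies the path to the raw data folder.

```python
from pathlib import Path

# File names taken from the list above.
# EXPECTED_RAW_FILES is a hypothetical name used only for this sketch.
EXPECTED_RAW_FILES = [
    "abstract.csv",
    "IPC Titles.xlsx",
    "ListOfCompanies.csv",
    "raw_patents.csv",
    "table_for_applicants.csv",
]


def missing_raw_files(raw_dir):
    """Return the expected dataset names that are not present in `raw_dir`."""
    raw_dir = Path(raw_dir)
    return [name for name in EXPECTED_RAW_FILES if not (raw_dir / name).is_file()]
```

An empty list means all five files were found. Otherwise the returned names show which datasets still need to be copied into the raw data folder.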
