joaomorossini/Sirius_Module-1_Startup_Genome

The Project

This repository is part of the hands-on project for the Master in Data & Decision Science course at Sirius School of Technology (Module 1: Statistical Analysis and Business Intelligence).

Startup Genome

Startup Genome’s Global Startup Ecosystem Report (GSER) is powered by the world’s most comprehensive and quality-controlled dataset on startup ecosystems. Drawing on data from 3.5 million startups across 290 global ecosystems, its data and insights are the product of over a decade of independent research and policy work.

GSER 2023 ranks the top 30 and 10 runner-up global ecosystems, and includes a top 100 ranking of emerging ecosystems. It also takes a look at startup communities from a regional perspective, separately ranking ecosystems in Africa, Asia, Europe, Latin America, MENA, North America, and Oceania.

Disclaimer

This repository shows the chosen approach to cleaning, organizing and structuring the data using Python, but it DOES NOT provide access to the data itself. The data used in this project is proprietary and can only be accessed by the Startup Genome team and by authorized partners.

Developers

Folder Structure

data/: Folder to store data.
- raw/: Folder to store raw data.
- processed/: Folder to store processed data.
- external/: Folder to store external data.
docs/: Folder to store explanatory documents and additional information.
notebooks/: Folder to store Jupyter notebooks.
- exploratory/: Folder for exploratory analysis notebooks.
- preprocessing/: Folder for preprocessing notebooks.
- analysis/: Folder for data analysis notebooks.
reports/: Folder to store reports and results.
- figures/: Folder to store generated figures.
- results/: Folder to store project results.
src/: Folder to store source code.
- data/: Folder for data manipulation modules.
- models/: Folder for machine learning model modules.
- utils/: Folder for utility modules.
- scripts/: Folder for auxiliary scripts.
tests/: Folder to store automated tests.
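
The layout above can be recreated locally with a short script. This is only a minimal sketch, assuming the folders are created under a root directory of your choice; the names `LAYOUT` and `create_layout` are hypothetical and are not part of this repository.

```python
from pathlib import Path

# Folder layout as listed above: top-level folder -> subfolders.
LAYOUT = {
    "data": ["raw", "processed", "external"],
    "docs": [],
    "notebooks": ["exploratory", "preprocessing", "analysis"],
    "reports": ["figures", "results"],
    "src": ["data", "models", "utils", "scripts"],
    "tests": [],
}


def create_layout(root):
    """Create every folder in LAYOUT under root and return the relative paths."""
    root = Path(root)
    created = []
    for top, subfolders in LAYOUT.items():
        # Each top-level folder first, then its subfolders.
        for rel in [Path(top)] + [Path(top) / sub for sub in subfolders]:
            # exist_ok=True makes the function safe to run more than once.
            (root / rel).mkdir(parents=True, exist_ok=True)
            created.append(rel.as_posix())
    return created
```

Calling `create_layout(".")` from an empty project directory would produce the folders listed above; existing folders are left untouched.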

Dependencies

Install the required Python packages:

pip install -r requirements.txt

Place the following datasets in "genome\data\raw":
- abstract.csv
- IPC Titles.xlsx
- ListOfCompanies.csv
- raw_patents.csv
- table_for_applicants.csv
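
Since the data is not shipped with the repository, it may help to check that all five files are in place before running any notebook. The following is a minimal sketch, not part of the project code: the function name `missing_datasets` is hypothetical, and the directory you pass in is assumed to be your local copy of the raw data folder.

```python
from pathlib import Path

# File names exactly as listed above.
REQUIRED_FILES = [
    "abstract.csv",
    "IPC Titles.xlsx",
    "ListOfCompanies.csv",
    "raw_patents.csv",
    "table_for_applicants.csv",
]


def missing_datasets(raw_dir):
    """Return the required file names that are not present in raw_dir."""
    raw_dir = Path(raw_dir)
    return [name for name in REQUIRED_FILES if not (raw_dir / name).is_file()]
```

For example, `missing_datasets(Path("data") / "raw")` returns an empty list when every dataset is present, and otherwise the names that still need to be copied in.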

Folders and files

Latest commit: joaomorossini, "Delete .idea directory", 06a7bc1, Aug 24, 2023 (7 commits)

Name	Last commit message	Last commit date
data	first upload to GitHub on this repo	Aug 24, 2023
docs	first upload to GitHub on this repo	Aug 24, 2023
notebooks	first upload to GitHub on this repo	Aug 24, 2023
reports	first upload to GitHub on this repo	Aug 24, 2023
src	first upload to GitHub on this repo	Aug 24, 2023
tests	first upload to GitHub on this repo	Aug 24, 2023
.gitignore	Initial commit	Aug 24, 2023
LICENSE	Initial commit	Aug 24, 2023
README.md	updated README file	Aug 24, 2023
requirements.txt	first upload to GitHub on this repo	Aug 24, 2023
