The goal of this project is to provide a robust yet easy way to search GitHub for OpenAPI and Swagger definitions. There is a lot of noise out there: we only care about OpenAPI documents that actually validate, and the GitHub API has rate limits that require the crawling to be automated over time. This project is an open-source solution that crawls public GitHub repositories for machine-readable API definitions. It consists of an open-source API that accepts search parameters and uses the GitHub API to perform the search, simplifying the search interface and running searches asynchronously: the user makes one call to initiate a search, then separate calls to retrieve results as they come in over time.
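The initiate-then-poll interaction described above can be sketched as follows. The `pollSearch` helper, the `SearchStatus` shape, and the injected fetcher are all hypothetical, purely to illustrate the asynchronous pattern, not the project's real types or routes:

```typescript
// Illustrative client-side polling loop: one call starts the search, then
// partial results are collected until the server reports completion.
// SearchStatus is an assumed shape, not the project's actual API.
type SearchStatus = { done: boolean; results: string[] };

// Collects results over repeated polls. The fetcher is injected so the loop
// can be exercised without a running server.
async function pollSearch(
  fetchStatus: (searchId: string) => Promise<SearchStatus>,
  searchId: string,
  intervalMs = 1000,
): Promise<string[]> {
  const collected: string[] = [];
  for (;;) {
    const status = await fetchStatus(searchId);
    collected.push(...status.results); // partial results arrive over time
    if (status.done) return collected;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Injecting the fetcher also makes the polling logic easy to unit-test with Jest, which the project already uses.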
- Node.js/Express.js
- TypeScript
- Octokit.js
- Jest (for testing)
- Docker
- Python (scripting)
- Elasticsearch
Dependencies: Node.js 19, npm, a GitHub API key. How to get a GitHub API key: https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens
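As a rough sketch of what the API key enables, here is one way to query GitHub's code search for OpenAPI files in a given organisation. `buildOpenApiQuery` and `searchOrg` are illustrative names and an illustrative query, not the project's actual code:

```typescript
// Build a GitHub code-search query restricted to one organisation's YAML
// files whose content mentions "openapi". The exact qualifiers used by the
// project's crawler may differ.
function buildOpenApiQuery(org: string): string {
  return `openapi org:${org} extension:yaml`;
}

// Call GitHub's REST code-search endpoint; the token is the GITHUB_API_KEY
// described above. Node 18+ provides a global fetch.
async function searchOrg(org: string, token: string): Promise<unknown> {
  const query = encodeURIComponent(buildOpenApiQuery(org));
  const res = await fetch(`https://api.github.com/search/code?q=${query}`, {
    headers: {
      Authorization: `Bearer ${token}`,
      Accept: "application/vnd.github+json",
    },
  });
  return res.json();
}
```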
- Clone the repository to your local setup
- Make sure you have Docker installed locally.
- Run
docker compose up
- Two containers should have started: Elasticsearch (the database) and an instance of the server.
- Now, to load the database with OpenAPI files, run
python scripts/seed_script.py
from the root of the repository. (This takes around 2-3 hours.) To configure which organisations are crawled, edit scripts/assets/organisations1.txt; scripts/assets/organisations2.txt covers the next 1000 organisations.
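The multi-hour runtime comes from GitHub's rate limits: the authenticated REST API allows 5,000 requests per hour, so the crawl has to be spread out. A minimal sketch of that pacing (illustrative only, not the seed script's actual logic):

```typescript
// Crawl a list of items while staying under a requests-per-hour budget by
// sleeping between requests. 5,000/hour is GitHub's documented limit for
// authenticated REST requests.
async function crawlWithRateLimit<T>(
  items: string[],
  fetchOne: (item: string) => Promise<T>,
  requestsPerHour = 5000,
): Promise<T[]> {
  const gapMs = Math.ceil(3_600_000 / requestsPerHour); // ms between requests
  const out: T[] = [];
  for (const item of items) {
    out.push(await fetchOne(item));
    await new Promise((resolve) => setTimeout(resolve, gapMs));
  }
  return out;
}
```

At 5,000 requests per hour this sleeps 720 ms between requests, which is why crawling 1000 organisations' worth of files takes hours.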
- Clone the repository to your local setup
- Run
npm i
- Make a
.env
file in the directory and add the following variables:
  - PORT= (port number you want to host the API on)
  - GITHUB_API_KEY= (your GitHub API key)
  - ES_HOST= (location of the Elasticsearch instance)
- Run
npm run build:watch
in one terminal.
- In another terminal, run
npm run start
to start the server on the port specified in the .env file.
- Now the Node.js server should be running! To test it, just go to
localhost:{{PORT}}
and you will be able to see the admin panel, through which you can interact with some of the APIs.
- Now, to load the database with OpenAPI files, run
python scripts/seed_script.py
from the root of the repository. (This takes around 2-3 hours.)
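A minimal `.env` for the steps above might look like this (all values are placeholders; substitute your own port, token, and Elasticsearch location):

```ini
# Port the API server listens on
PORT=3000
# Personal access token from GitHub (see the link above)
GITHUB_API_KEY=ghp_your_token_here
# Location of the Elasticsearch instance (9200 is its default port)
ES_HOST=http://localhost:9200
```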
1. docker pull docker.elastic.co/elasticsearch/elasticsearch:8.8.2
2. docker network create elastic
3. docker run \
-p 9200:9200 \
-p 9300:9300 \
-e "discovery.type=single-node" \
-e "xpack.security.enabled=false" \
docker.elastic.co/elasticsearch/elasticsearch:8.8.2
Currently, we only index OpenAPI files from the 1000 most popular organisations on GitHub (ranked by stars). More organisations can be indexed by adding them to the scripts/assets/organisations.txt file.
🚧 Under Construction