quantifying impact of github organization #59

Open
MathewBiddle opened this issue Mar 1, 2024 · 7 comments
Labels
metric issue to start a new metric

Comments

@MathewBiddle
Contributor

Can we develop a metric to quantify the impact of the IOOS GitHub organization?

Related to #26, but expanding further into our non-packaged repositories (e.g. documentation).

Number of forks, stars, active contributors, etc.

@MathewBiddle MathewBiddle added the metric issue to start a new metric label Mar 1, 2024
@ocefpaf
Member

ocefpaf commented Mar 5, 2024

Maybe something like this?

import os

import pandas as pd
from github import Github

# Read a personal access token if one is stored locally.
# Without a token the API still works, but with a much lower rate limit.
try:
    with open(os.path.expanduser("~/.ghoauth"), "r") as f:
        access_token = f.read().strip()
except FileNotFoundError:
    access_token = None

g = Github(access_token)

user = g.get_user("ioos")

ioos_gh = {}
for repo in user.get_repos():
    print(repo.name)
    # Skip forks so that only repositories that originate in the org are counted.
    if not repo.fork:
        # Map each contributor's display name to their number of contributions.
        # Note: contributor.name is None for accounts without a display name.
        contributors_contribution = {
            contributor.name: contributor.contributions
            for contributor in repo.get_contributors()
        }

        ioos_gh[repo.name] = {
            "stars": repo.stargazers_count,
            "forks": repo.forks,
            "contributors": contributors_contribution,
        }

# One row per repository, most-starred first.
df = pd.DataFrame(ioos_gh).T.sort_values(by="stars", ascending=False)

You will need a GitHub token to run it, but it should not require elevated permissions; read access should be enough. Here is what I got from the code above:

df.head(n=20)
                             stars forks                                       contributors
compliance-checker              96    51  [{'Benjamin Adams': 713}, {'Luke Campbell': 33...
erddapy                         75    29  [{'Filipe': 740}, {'Vini Salazar': 83}, {'Call...
bio_data_guide                  43    18  [{'Mathew Biddle': 222}, {'Tylar': 81}, {'Bret...
ioos_qc                         39    22  [{'Kyle Wilcox': 199}, {'Filipe': 71}, {'Luke ...
pyoos                           34    33  [{'Filipe': 52}, {'Dave Foster': 24}, {'Emilio...
conda-recipes                   20    29  [{'Filipe': 1186}, {'Rich Signell': 289}, {'IO...
notebooks_demos                 19    19  [{'Filipe': 774}, {'Jennifer Bosch Webster': 7...
gsoc                            16     9  [{'Mathew Biddle': 28}, {'Micah Wengren': 26},...
thredds_crawler                 16    22  [{'Kyle Wilcox': 63}, {'Luke Campbell': 15}, {...
Cloud-Sandbox                    9    11  [{'Patrick Tripp': 113}, {'Jonathan Joyce': 9}...
ioos-python-package-skeleton     9     9  [{'Filipe': 113}, {None: 3}, {'Alex Kerney': 2...
BioData-Training-Workshop        8     8  [{'Don Setiawan': 41}, {'Ben Best': 17}, {'Fil...
ioos_code_lab                    8     7  [{'Filipe': 1140}, {'Mathew Biddle': 96}, {'Je...
ioosngdac                        8    18  [{'John Kerfoot': 80}, {'Luke Campbell': 20}, ...
erddap-gold-standard             8    15  [{'Mathew Biddle': 16}, {'Kyle Wilcox': 6}, {'...
system-test                      7    14  [{'Bob Fratantonio': 69}, {'Filipe': 68}, {'Ri...
ckanext-ioos-theme               7    14  [{'Benjamin Adams': 202}, {'Luke Campbell': 10...
soundcoop                        6     2  [{'Clea Parcerisas': 15}, {None: 6}, {'Carlos ...
glider-dac                       6    12  [{'Benjamin Adams': 295}, {'Luke Campbell': 20...
service-monitor                  6    13  [{'Luke Campbell': 304}, {'Benjamin Adams': 16...
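
The per-repository table above can also be rolled up into organization-level totals. This is a minimal sketch, assuming a DataFrame shaped like `df` above; the three rows are copied from the output shown, and `org_totals` is a hypothetical helper name that is not part of the original script.

```python
import pandas as pd

# First three rows of the table above (stars and forks only).
sample = pd.DataFrame(
    {"stars": [96, 75, 43], "forks": [51, 29, 18]},
    index=["compliance-checker", "erddapy", "bio_data_guide"],
)


def org_totals(frame):
    """Sum stars and forks over all repositories and count the repositories."""
    return {
        "repositories": len(frame),
        "stars": int(frame["stars"].sum()),
        "forks": int(frame["forks"].sum()),
    }


totals = org_totals(sample)
print(totals)
```

Passing the full `df` instead of `sample` would give the totals for every non-fork repository in the organization.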

@MathewBiddle
Contributor Author

I like what you've done here @ocefpaf! Maybe quantifying the number of contributors too. But, that should be easy with the list you developed.

FYI, I just ran across this https://opensource.guide/metrics/

@MathewBiddle
Contributor Author

This is interesting too https://chaoss.community/software/

@MathewBiddle
Contributor Author

MathewBiddle commented Mar 8, 2024

We can get a lot of stuff from GitHub's advanced search:

https://github.com/search?q=org%3Aioos&type=repositories&ref=advsearch

@ocefpaf
Member

ocefpaf commented Mar 9, 2024

I like what you've done here @ocefpaf! Maybe quantifying the number of contributors too. But, that should be easy with the list you developed.

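Yes, we can do something like:

# Build one Series per repository, indexed by contributor name.
contributors = []
for repo, row in df.iterrows():
    s = pd.Series(row["contributors"], name=repo)
    contributors.append(s)

# Rows are contributors, columns are repositories.
contributors_per_repo = pd.concat(contributors, axis=1)
# Order contributors by their total contributions across all repositories.
index = contributors_per_repo.sum(axis=1).sort_values(ascending=False).index
contributors_per_repo = contributors_per_repo.reindex(index)

# Total contributions per contributor across the organization.
contributors_per_repo.sum(axis=1)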
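
The sum above gives contributions per person; counting the contributors themselves is a small variation. The following is a rough sketch on made-up data: the repository names, logins, and counts are hypothetical, and only the table layout (contributors as rows, repositories as columns) is taken from the snippet above.

```python
import pandas as pd

# Hypothetical contribution counts, shaped like the "contributors" column above.
contributors_by_repo = {
    "repo-a": {"alice": 10, "bob": 2},
    "repo-b": {"alice": 1, "carol": 7},
}

# Rows are contributors, columns are repositories; NaN means "no contributions".
table = pd.concat(
    [pd.Series(counts, name=repo) for repo, counts in contributors_by_repo.items()],
    axis=1,
)

# Number of distinct contributors in each repository.
contributors_per_repository = table.notna().sum(axis=0)

# Number of distinct contributors across all repositories.
total_contributors = len(table.index)

# Number of repositories each person has contributed to.
repositories_per_contributor = table.notna().sum(axis=1)

print(contributors_per_repository)
print(total_contributors)
```

Because contributors are keyed by display name in the original script, people who share a name (or have none) would be merged in these counts; keying by login instead would avoid that.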

FYI, I just ran across this https://opensource.guide/metrics/
This is interesting too https://chaoss.community/software/

Those are really nice resources! I knew about CHAOSS but not opensource.guide.

we can get a lot of stuff from github's advanced search

If you are just browsing, yes. But we can get all that info programmatically with PyGithub and create tables, etc. The repo object in the main loop has all the info and, if you are using an elevated token, you can even do fancy things like write/create, but we don't need that for the metrics.

@MathewBiddle
Contributor Author

MathewBiddle commented Apr 8, 2024

It could also be worthwhile to look at the number of participants in issues:
https://gist.github.com/ocefpaf/2ed11e4c977adfe3ffeb5eef9f576c1e

While they might not be directly contributing to a project, they are participating in the conversation.

@ocefpaf
Member

ocefpaf commented Apr 9, 2024

While they might not be directly contributing to a project, they are participating in the conversation.

That indeed made a few repos pop up, like ioos-atn-data and bio_data_guide. See the last two cells in https://gist.github.com/ocefpaf/11a7c4832b23dc3978a1a3fb20783988
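
For readers who do not open the gists, one way to count issue participants could look like the sketch below. It is not necessarily what the linked gists do: `issue_participants` is a hypothetical helper, and the fake repository with made-up logins stands in for a PyGithub `Repository` so the example runs without network access.

```python
from types import SimpleNamespace


def issue_participants(repo):
    """Return the logins of everyone who opened or commented on an issue."""
    participants = set()
    # state="all" includes closed issues; GitHub also returns pull requests here.
    for issue in repo.get_issues(state="all"):
        if issue.user is not None:
            participants.add(issue.user.login)
        for comment in issue.get_comments():
            if comment.user is not None:
                participants.add(comment.user.login)
    return participants


def _fake_issue(author, commenters):
    """Build a stand-in issue with the two members the function relies on."""
    comments = [SimpleNamespace(user=SimpleNamespace(login=c)) for c in commenters]
    return SimpleNamespace(
        user=SimpleNamespace(login=author),
        get_comments=lambda: comments,
    )


# Made-up data: two issues with overlapping participants.
fake_repo = SimpleNamespace(
    get_issues=lambda state="all": [
        _fake_issue("alice", ["bob"]),
        _fake_issue("bob", ["carol", "alice"]),
    ]
)

participants = issue_participants(fake_repo)
print(sorted(participants))
```

Against the real API this makes one extra request per issue to list its comments, so for large repositories the rate limit of the token matters.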


2 participants