This repository has been archived by the owner on Aug 16, 2023. It is now read-only.
(NOTE: I made this ticket a while ago, but performance (the amount of time Starfish takes to run) has never been a problem for us at Indeed. I think a company would have to be checking an extremely large number of employees for this to matter, so I'm going to put this in the backlog. If someone wants to work on it, great, but I don't think it's a particularly useful change at the moment.)
Currently we:
Ping the GitHub API for a person's events
Look at the first page and keep only the events whose types we care about
Do this for every page of event history that GitHub has (they hold up to 300 events at a time, 10 per page)
Then look through that array of events to see if one is in the correct time period, and stop looking when we find one.
However, there's no reason to look through all 30 pages of a person's events if an event on the first page meets both criteria.
So, refactor the code to check for
event type
whether it happened in the time range
BEFORE fetching the next page. If both are true, log the contributor's alternate id and move on to the next person.
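A minimal sketch of the early-exit idea, in Python for illustration only (it is not Starfish's actual code). The names `has_qualifying_event` and `fetch_page` are hypothetical; `fetch_page` stands in for whatever function retrieves one page of a person's events. The sketch assumes each event carries a `type` and an ISO 8601 `created_at` field, and that the time range is inclusive at both ends.

```python
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List, Set

Event = Dict[str, Any]


def has_qualifying_event(
    fetch_page: Callable[[int], List[Event]],  # hypothetical: returns one page of events
    event_types: Set[str],
    start: datetime,
    end: datetime,
    max_pages: int = 30,  # 300 events at 10 per page, per the description above
) -> bool:
    """Return True as soon as one event matches both criteria.

    Pages are fetched lazily, so a match on page 1 means pages 2..30
    are never requested.
    """
    for page in range(1, max_pages + 1):
        events = fetch_page(page)
        if not events:
            # An empty page means there is no more history to look at.
            return False
        for event in events:
            if event["type"] not in event_types:
                continue
            # Assumption: timestamps look like "2019-03-01T12:00:00Z".
            created = datetime.fromisoformat(event["created_at"].replace("Z", "+00:00"))
            if start <= created <= end:
                return True
    return False
```

The caller would log the contributor's alternate id when this returns True and then move on to the next person. `start` and `end` need to be timezone-aware datetimes so they compare cleanly with the parsed timestamps.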
That's a good idea, Jeff. I think I'll make a new issue for that to make it more visible.
Do you have a recommendation between using ETags and the Last-Modified header?
I also saw in one of your other comments that you're using Octokit, and I'm curious what advantage Octokit gives you over plain API calls.
We chose the ETag and never looked back, so I don't have any useful guidance!
I want to say that Octokit has given us better long-term support, but we've still had to deal with a number of breaking changes over the years. So I'm going to give "trying to help our GitHub friends build and bake this great library" as my reason.
There are also some plug-ins already created on retry policies, abuse limiters, etc., that may save time for some.
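For anyone weighing the ETag option discussed above, here is a rough sketch of a conditional request cache, in Python for illustration only. `ETagCache` and `send` are hypothetical names; `send` stands in for whatever HTTP client is in use and is assumed to return a `(status, headers, body)` tuple. The sketch relies only on standard HTTP behavior: the client echoes a stored `ETag` value in an `If-None-Match` header, and the server answers `304 Not Modified` when nothing has changed.

```python
from typing import Any, Callable, Dict, Tuple

# Assumed response shape: (status code, response headers, parsed body).
Response = Tuple[int, Dict[str, str], Any]


class ETagCache:
    """Remembers the last ETag and body per URL and reuses the body on a 304."""

    def __init__(self, send: Callable[[str, Dict[str, str]], Response]) -> None:
        self._send = send  # hypothetical transport: send(url, headers) -> Response
        self._cache: Dict[str, Tuple[str, Any]] = {}

    def get(self, url: str) -> Any:
        headers: Dict[str, str] = {}
        cached = self._cache.get(url)
        if cached is not None:
            # Ask the server to reply 304 if the resource is unchanged.
            headers["If-None-Match"] = cached[0]
        status, resp_headers, body = self._send(url, headers)
        if status == 304 and cached is not None:
            # Unchanged: reuse the body stored from the earlier 200 response.
            return cached[1]
        etag = resp_headers.get("ETag")
        if status == 200 and etag:
            self._cache[url] = (etag, body)
        return body
```

A real client would also need to handle header-name casing, errors, and cache eviction, which this sketch leaves out.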