name: Refresh install + GitHub counts
# Daily refresh of installs-cache.json (and manual-package-counts.json).
# Combines the two halves of the count update into a single job so they
# share a checkout and produce a single PR per day.
#
# Step 1 - registry counters (Node script, scripts/update-registry-counts.mjs):
#   npm     (api.npmjs.org)
#   PyPI    (pypistats.org)
#   crates  (crates.io/api)
#   Writes the per-source high-water marks (HWMs) for npm, pypi, and crates
#   into the cache.
#   Does NOT write `total` or `fetchedAt`; step 2 owns those.
#
# Step 2 - GitHub-side counters (Node script):
#   clones / clonesByRepo      (day-cursor accumulator on /traffic/clones, 14d API window)
#   releases / releasesByRepo  (HWM on release-asset download counts; currently always 0)
#   ghPackages                 (best-effort; manual-package-counts.json is the live source of truth
#                               because GitHub's REST API does not populate container download_count)
#   Reads the cache (now containing the freshly-written registry HWMs from step 1)
#   and writes the canonical `total` + `fetchedAt` based on the installs.data.ts
#   formula. This makes step 2 the single writer for the displayed total, which
#   eliminates the dual-writer race that the previous split workflow had.
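#
# Illustration only: a sketch of the usual meaning of the two update rules
# named above, not code taken from the scripts. Variable names and numbers
# are made up.
#   HWM (high-water mark): keep the larger of the stored and fetched values,
#   so a failed or lower API reading never lowers the count.
#     prev=1200; fetched=1150
#     next=$(( fetched > prev ? fetched : prev ))   # next is 1200
#   Day cursor: add only the days newer than the last day already counted,
#   so overlapping 14-day windows are not double-counted.
#     cursor=2026-05-07; total=300
#     # API days: 2026-05-07 -> 12 (skipped), 2026-05-08 -> 9 (added)
#     total=$(( total + 9 ))                        # total is 309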
#
# Step 3 — open one PR if anything actually changed.
#
# History: replaces update-installs.yml + update-github-counts.yml. Those two
# workflows wrote the same file (installs-cache.json) on staggered crons (06:00
# and 06:30 UTC). Both wrote `total` and `fetchedAt` from different inputs, so
# the second workflow's PR conflicted with the first whenever the first PR
# wasn't yet merged — which was almost always, since merges took 1–2 hours.
# Two days running (2026-05-08 and 2026-05-09) ended with the second PR closed
# without merging, replaced by a manual or duplicate run.
#
# Tokens:
#   ORG_TRAFFIC_TOKEN (PAT) needs scopes:
#     - repo           (for /traffic/clones on each org repo)
#     - read:packages  (for /orgs/<org>/packages/container/<pkg>; if missing,
#                       the GHCR step is skipped with a warning and the
#                       manual-package-counts.json file remains the source
#                       of truth)
#
# Schema preservation: step 1 MERGES its updates into the existing cache,
# leaving every other field intact (the earlier inline bash did this with
# `jq '. + { ... }'`). Step 2's Node script writes through
# readCache()/writeCache() with HWM/cursor semantics. Both are idempotent and
# safe to interleave with the build's installs.data.ts loader, which uses the
# same formulas.
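#
# Illustration only (example values are made up): jq's object addition, as
# named in the schema-preservation note above, keeps every key of the left
# operand and overwrites only the keys named on the right.
#   echo '{"npm":5,"total":9}' | jq -c '. + {npm: 10}'
#   # prints {"npm":10,"total":9}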
on:
  schedule:
    - cron: '0 6 * * *'  # 6 AM UTC daily
  workflow_dispatch:
permissions:
  contents: read
jobs:
  refresh:
    runs-on: ubuntu-latest
    permissions:
      contents: write
      pull-requests: write
    steps:
      # persist-credentials defaults to true on purpose. The PR-open step's
      # git push uses an x-access-token URL with ORG_TRAFFIC_TOKEN, but
      # ORG_TRAFFIC_TOKEN lacks write scope on runcycles/cycles-docs. With creds
      # persisted, git falls back to the persisted GITHUB_TOKEN (which has
      # contents:write from the job's permissions block) and the push succeeds.
      - uses: actions/checkout@v7
      - uses: actions/setup-node@v6
        with:
          node-version: '20'
      - name: Refresh registry counts (npm / PyPI / crates)
        # Per-package HWMs in scripts/update-registry-counts.mjs. Replaces
        # the prior inline bash + curl + jq, which used aggregate-only HWMs
        # and silently masked legitimate growth in one package whenever
        # another package's API call failed on the same run (pypistats.org
        # has known intermittent CDN issues; verified 2026-05-09, when two
        # consecutive runs returned just one of two packages each).
        run: node scripts/update-registry-counts.mjs
      - name: Refresh GitHub-side counts (clones / releases / ghPackages)
        env:
          GH_TOKEN: ${{ secrets.ORG_TRAFFIC_TOKEN || secrets.GITHUB_TOKEN }}
        run: node scripts/update-github-counts.mjs
      - name: Open PR with refreshed counts
        # Uses ORG_TRAFFIC_TOKEN for `gh pr create` (the default GITHUB_TOKEN
        # cannot create PRs unless "Allow Actions to create PRs" is enabled,
        # which it isn't on this repo). The `git push` falls back to the
        # persisted GITHUB_TOKEN; see the actions/checkout note at the top
        # of this job.
        env:
          GH_TOKEN: ${{ secrets.ORG_TRAFFIC_TOKEN }}
        run: |
          set -euo pipefail
          # Skip the PR when the only diff is a freshness-timestamp bump.
          # Step 2's update-github-counts.mjs unconditionally writes
          # `fetchedAt` (and `lastVerifiedAt` in manual-package-counts.json
          # whenever the GHCR sub-step succeeds), so a no-op refresh (fresh
          # data fetched, but no count actually moved) still produces a
          # 1-line diff. Without this filter that would open a PR every day
          # whose only change is a timestamp; PR #605 (2026-05-09) closed
          # without merging for exactly this reason.
          DIFF=$(git diff --no-color -- \
            .vitepress/theme/installs-cache.json \
            .vitepress/theme/manual-package-counts.json)
          if [ -z "$DIFF" ]; then
            echo "No changes to counts; skipping PR."
            exit 0
          fi
          # `^[+-][^+-]` keeps real change lines and excludes the +++/---
          # file-header lines. Then drop any change lines that touch only
          # the freshness-timestamp fields.
          MATERIAL=$(echo "$DIFF" | grep -E '^[+-][^+-]' | grep -vE '"(fetchedAt|lastVerifiedAt)"' || true)
          if [ -z "$MATERIAL" ]; then
            echo "Only freshness timestamps changed (fetchedAt / lastVerifiedAt); no material count updates, skipping PR."
            exit 0
          fi
          echo "Material changes detected; opening PR. Sample lines:"
          echo "$MATERIAL" | head -10
          # Append the run number so re-dispatches in the same UTC day
          # don't collide with a previous run's branch (which may still
          # exist on the remote if its PR was closed without merging,
          # as PR #605 was on 2026-05-09, the trigger that surfaced
          # this bug). Each run gets its own branch and PR.
          BRANCH="chore/refresh-counts-$(date -u +%Y%m%d)-${GITHUB_RUN_NUMBER}"
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git checkout -b "$BRANCH"
          git add .vitepress/theme/installs-cache.json .vitepress/theme/manual-package-counts.json
          git commit -m "chore: refresh install + GitHub counts"
          git push "https://x-access-token:${GH_TOKEN}@github.com/${GITHUB_REPOSITORY}.git" "$BRANCH"
          gh pr create \
            --title "chore: refresh counts $(date -u +%Y-%m-%d)" \
            --body "Automated daily refresh of registry counts (npm / PyPI / crates) and GitHub-side counts (clones / releases / ghPackages). One PR per day from the unified refresh-counts.yml workflow." \
            --head "$BRANCH" \
            --base main