Configure GHA to scan for spelling errors and broken links #157

Draft: wants to merge 61 commits into base: main
Changes from 42 commits

Commits (61):
80ff31e
new branch with recent main changes
kaijli Jan 14, 2025
c31dff9
trying a different spell checker
kaijli Jan 15, 2025
f1a84f3
fix syntax error
kaijli Jan 15, 2025
618082e
update dictionary and json path
kaijli Jan 15, 2025
94e93e7
update dictionary and remove env vars
kaijli Jan 15, 2025
e148242
testing with import
kaijli Jan 15, 2025
349cf26
split yml into two
kaijli Jan 15, 2025
762fa5d
remove the need build
kaijli Jan 15, 2025
7f30a3a
recombined files, changed call order
kaijli Jan 15, 2025
e49d7e4
change spell check output
kaijli Jan 15, 2025
c3d6013
added resource links
kaijli Jan 15, 2025
62f52ce
typo
kaijli Jan 15, 2025
8a7c3f7
fix logic for spell check ticket creation
kaijli Jan 15, 2025
720dafa
trying to grab workflow outputs
kaijli Jan 15, 2025
30b3adc
pipe std out for spell check
kaijli Jan 15, 2025
4df4582
changed allow list name and link check call
kaijli Jan 17, 2025
746549a
removed file used to test
kaijli Jan 17, 2025
f5613c4
remove uses in link check
kaijli Jan 17, 2025
239e7d8
use verbose mode
kaijli Jan 17, 2025
ac9f592
testing npx usage
kaijli Jan 17, 2025
fa4fded
merge main into this branch for testing
kaijli Jan 17, 2025
d5bd79b
Merge branch 'main' into 61-add-link-checker
kaijli Jan 17, 2025
ab1a28e
add quiet mode
kaijli Jan 17, 2025
abb4a97
tee stdout instead of pipe
kaijli Jan 17, 2025
91f526a
change error search
kaijli Jan 17, 2025
4f80cd0
use boolean to activate github issue creation
kaijli Jan 17, 2025
728a56d
check output files
kaijli Jan 17, 2025
d3a8e46
testing json usage
kaijli Jan 17, 2025
4621a31
check syntax
kaijli Jan 17, 2025
20dc8c3
remove commented code
kaijli Jan 17, 2025
27f3287
i don't know why it's not working
kaijli Jan 17, 2025
035303e
change logic for ticket creation
kaijli Jan 17, 2025
1cf9473
try again with github output var
kaijli Jan 17, 2025
02a99c9
testing spell check issue creation
kaijli Jan 17, 2025
1fb1618
all parts working, generating pr
kaijli Jan 17, 2025
795e3f3
commenting out pull request run because logically, it needs some thin…
kaijli Jan 17, 2025
839c64d
retesting checkers
kaijli Jan 28, 2025
e630bab
add second link checker and pause issue creation
kaijli Jan 28, 2025
74a2523
put another link checker in to test
kaijli Jan 29, 2025
8cf2c58
remove other test link checkers and update artifact paths
kaijli Jan 29, 2025
4b32e6d
Merge branch 'main' into 61-add-link-checker
eecavanna Jan 29, 2025
91a06ec
Use `lychee` to scan website file tree for broken links
eecavanna Jan 29, 2025
91088ff
Configure workflow to only run when invoked by another workflow
eecavanna Jan 29, 2025
fce879c
Specify correct website root directory to `lychee` (oops)
eecavanna Jan 29, 2025
e2a393e
Bump `lycheeverse/lychee-action` to version 2
eecavanna Jan 29, 2025
38726a9
Merge branch 'main' into 61-add-link-checker
eecavanna Jan 30, 2025
4fcfd50
Update checker(s) workflow to depend upon `assemble-website` workflow
eecavanna Jan 30, 2025
5597395
Untar the artifact archive into a file tree before scanning it
eecavanna Jan 30, 2025
1999766
Create destination directory before using it (oops)
eecavanna Jan 30, 2025
c1e4c6a
Delete obsolete config file
eecavanna Jan 30, 2025
f80ac6e
Check out commit so job has access to `spellcheck_allow_list.txt`
eecavanna Jan 30, 2025
01a709e
Move the spellcheck allow list into new `supporting_files` subdirectory
eecavanna Jan 30, 2025
0269e8a
Separate the link check and spell check into two GHA workflows
eecavanna Jan 30, 2025
6e733ea
Clarify step name
eecavanna Jan 30, 2025
b0bd752
Invoke `spellchecker-cli` via off-the-shelf action instead of via `npx`
eecavanna Jan 30, 2025
9f30640
Trigger checker workflows from assembler workflow (instead of opposite)
eecavanna Jan 30, 2025
6797ef1
Avoid compiling home docs multiple times per PR update
eecavanna Jan 30, 2025
fe43af5
Bump `spellchecker-cli-action` to latest version
eecavanna Jan 30, 2025
693cfb5
Create directory before using it
eecavanna Jan 30, 2025
1412bbd
Only create GitHub Issue about broken links if processing `main` branch
eecavanna Jan 30, 2025
3c6ee67
Dump path of working directory (for debug)
eecavanna Jan 30, 2025
91 changes: 91 additions & 0 deletions .github/workflows/check-links-and-spelling.yml
@@ -0,0 +1,91 @@
###############################################################################
# Introduction:
# -------------
# This workflow builds the NMDC documentation, checks for broken links,
# and identifies any spelling errors in the documentation.
# If any issues are found (broken links or misspelled words), it creates
# GitHub Issues containing lists of the errors for further review and fixing.
#
###############################################################################

name: Check Links and Spelling in Documentation

on:
  # FIXME: Revert the branch name here to its original value.
  push: { branches: [ 61-add-link-checker ] }
  workflow_dispatch: { }
  workflow_call: { }
  # pull_request:
  #   branches: [main]

jobs:
  check-links:
    name: Check Links in Documentation
    runs-on: ubuntu-latest
    permissions:
      issues: write # required for peter-evans/create-issue-from-file
    steps:
      - name: Get website file tree
        uses: actions/download-artifact@v4
        with:
          path: _build/html

      # Note: Most of this snippet was copied from https://github.com/microbiomedata/nmdc-schema/blob/9c663673f3110a31f3a54a26a0698211c42b6143/.github/workflows/check-links.yaml#L38C1-L58C45.
      #       A few parts were copied from https://github.com/lycheeverse/lychee-action.
      - name: Use Lychee to check for broken links
        # This step will populate `env.lychee_exit_code` with the exit code returned by lychee.
        # Possible exit codes: https://github.com/lycheeverse/lychee?tab=readme-ov-file#exit-codes
        # Reference: https://github.com/lycheeverse/lychee-action
        id: lychee
        uses: lycheeverse/[email protected]
        with:
          # Reference: https://github.com/lycheeverse/lychee#commandline-parameters
          args: --base docs --verbose --no-progress --format markdown --timeout 5 '_build/html/**/*.html'
          debug: true
          output: ./lychee/out.md
          fail: false
      - name: Create GitHub Issue listing broken links
        # This step will only run if lychee returned a non-zero exit code.
        # Reference: https://docs.github.com/en/actions/learn-github-actions/variables#using-the-env-context-to-access-environment-variable-values
        if: steps.lychee.outputs.exit_code != 0
        uses: peter-evans/create-issue-from-file@v5
        with:
          title: Website file tree contains broken links
          content-filepath: ./lychee/out.md

  check-spelling:
    name: Check Spelling in Documentation
    runs-on: ubuntu-latest
    steps:
      - name: Check out commit
        uses: actions/checkout@v4
      - name: Download final build artifact
        uses: actions/download-artifact@v4
        with:
          path: _build/html
      - name: Run Spell Checker
        id: spellcheck
        continue-on-error: true
        # Source: https://github.com/tbroadley/spellchecker-cli and https://github.com/austenstone/spellchecker-cli-action-summary
        # Based off: https://github.com/austenstone/spellchecker-cli-action-summary/blob/main/.github/workflows/spellcheck.yml
        run: |
          mkdir -p spellcheck_reports
          npx -y spellchecker-cli \
            --files _build/html/**/*.html \
            --dictionaries .github/workflows/spellcheck_allow_list.txt \
            --plugins spell indefinite-article repeated-words syntax-mentions syntax-urls \
            --reports spellcheck_reports/spellcheck_report.json |
            tee spellcheck_reports/spellcheck_output.txt
          if grep -q 'misspelt' spellcheck_reports/spellcheck_report.json; then
            echo "misspelt=true" >> $GITHUB_OUTPUT
          else
            echo "misspelt=false" >> $GITHUB_OUTPUT
          fi

      # - name: Create GitHub Issue for Misspellings
      #   if: ${{ steps.spellcheck.outputs.misspelt == 'true' }}
      #   uses: peter-evans/create-issue-from-file@v4
      #   with:
      #     title: Documentation contains misspelled words
      #     content-filepath: spellcheck_reports/spellcheck_output.txt
      #     labels: documentation, spelling-error
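The flag-setting logic at the end of the `Run Spell Checker` step above can be tried outside of GitHub Actions. The following is a minimal sketch, not the PR's code: `set_misspelt_flag` is a hypothetical helper name, and the second argument is a plain file standing in for `$GITHUB_OUTPUT`. Whether the real report actually contains the literal string `misspelt` depends on the output of `spellchecker-cli`, which is assumed here rather than verified.

```shell
#!/usr/bin/env bash
# Sketch only: mirrors the grep-based check from the "Run Spell Checker" step.
# `set_misspelt_flag` is a hypothetical name introduced for illustration.
set_misspelt_flag() {
  local report_file="$1"   # file to search, e.g. the spell check report
  local output_file="$2"   # stand-in for "$GITHUB_OUTPUT"

  # Append a key=value line, the format GitHub Actions reads step outputs from.
  if grep -q 'misspelt' "$report_file"; then
    echo "misspelt=true" >> "$output_file"
  else
    echo "misspelt=false" >> "$output_file"
  fi
}
```

Calling `set_misspelt_flag spellcheck_reports/spellcheck_report.json "$GITHUB_OUTPUT"` inside a workflow step would behave like the inline `if`/`else` in the diff; quoting the output path also guards against paths that contain spaces.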
42 changes: 24 additions & 18 deletions .github/workflows/deploy-to-gh-pages.yml
@@ -3,7 +3,7 @@
 name: Build and deploy to GitHub Pages

 on:
-  push: { branches: [ main ] }
+  push: { branches: [ 61-add-link-checker ] }
   workflow_dispatch: { }

 # Reference: https://docs.github.com/en/actions/using-jobs/using-concurrency
@@ -62,20 +62,26 @@ jobs:
         with:
           path: _build/html

-  deploy:
-    name: Deploy website
-    needs:
-      - build
-    runs-on: ubuntu-latest
-    # Reference: https://github.com/actions/deploy-pages
-    permissions:
-      pages: write
-      id-token: write
-    environment:
-      name: github-pages
-      url: ${{ steps.deployment.outputs.page_url }}
-    steps:
-      # Reference: https://github.com/actions/deploy-pages
-      - name: Deploy to GitHub Pages
-        id: deployment
-        uses: actions/deploy-pages@v4
+  check-links-and-spelling:
+    name: Check Links and Spelling in Documentation
+    uses: ./.github/workflows/check-links-and-spelling.yml
+    needs: build
+
+
+  # deploy:
+  #   name: Deploy website
+  #   needs:
+  #     - build
+  #   runs-on: ubuntu-latest
+  #   # Reference: https://github.com/actions/deploy-pages
+  #   permissions:
+  #     pages: write
+  #     id-token: write
+  #   environment:
+  #     name: github-pages
+  #     url: ${{ steps.deployment.outputs.page_url }}
+  #   steps:
+  #     # Reference: https://github.com/actions/deploy-pages
+  #     - name: Deploy to GitHub Pages
+  #       id: deployment
+  #       uses: actions/deploy-pages@v4
7 changes: 7 additions & 0 deletions .github/workflows/mlc_config.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
{
  "timeout": "20s",
  "retryOn429": true,
  "retryCount": 5,
  "fallbackRetryDelay": "30s",
  "aliveStatusCodes": [200, 206]
}
36 changes: 36 additions & 0 deletions .github/workflows/spellcheck_allow_list.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
Bioinformatics
bioinformatics
Biosample
biosample
Changesheets
changesheets
Community-centric
community-centric
Diátaxis
Globus
IMG
JSON
json
Lipidomics
lipidomics
Metabolomics
metabolomics
MetaG
Metagenome
metagenome
Metagenomic
metagenomic
Metaproteomic
MetaT
Metatranscriptome
metatranscriptome
Metatranscriptomic
metatranscriptomic
Microbiome
microbiome
Mgt
MkDocs
multi-omics
NMDC
NMDC's
nmdc-runtime
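
The allow list above is maintained by hand and pairs many capitalized and lowercase variants, so a quick check for accidental exact duplicates may be useful when adding entries. This is a minimal sketch, assuming one entry per line; `find_duplicate_entries` is a hypothetical helper name and is not part of the PR.

```shell
#!/usr/bin/env bash
# Sketch only: prints every allow-list entry that appears more than once.
# Matching is exact and case-sensitive, so "JSON" and "json" are distinct.
find_duplicate_entries() {
  local allow_list="$1"   # path to a file with one entry per line

  # Sort bytewise so identical entries become adjacent, then keep one copy
  # of each entry that occurs at least twice.
  LC_ALL=C sort "$allow_list" | LC_ALL=C uniq -d
}
```

Running `find_duplicate_entries .github/workflows/spellcheck_allow_list.txt` prints nothing when every entry is unique.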