-
Notifications
You must be signed in to change notification settings - Fork 248
Improve link checker #14871
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Improve link checker #14871
Conversation
asafashirov
commented
Apr 28, 2025
- Add support for processing all sitemap files (not just main sitemap)
- Implement section filtering capability
- Enhance error handling and progress tracking
- Improve reporting with a complete list of broken links
- Add support for processing all sitemap files (not just main sitemap) - Implement section filtering capability - Enhance error handling and progress tracking - Improve reporting with a complete list of broken links
Your site preview for commit 0c8ed6f is ready! 🎉 http://www-testing-pulumi-docs-origin-pr-14871-0c8ed6fc.s3-website.us-west-2.amazonaws.com. |
We can look into this deeper post-launch, but I would be concerned making changes to our CI process pre-launch |
Thanks @shughes26. Can/should we capture #15080 under the same workflow? I.e. does it make sense to make sure all external links open in new windows/tabs when we want them to as part of the link checker workflow? |
@asafashirov: is this still needed? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR enhances the link checker by adding support for processing all sitemap files, introducing an optional section filter, improving error handling and progress tracking, and producing a detailed report of broken links.
- Fetch URLs from multiple sitemaps and additional routes
- Support an optional
SECTION
filter and retry logic with progress logs - Group and list broken links with summary statistics and Slack notification
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
File | Description |
---|---|
scripts/link-checker/check-links.sh | Added max_retries and optional section filter arguments, updated invocation of the Node script |
scripts/link-checker/check-links.js | Implemented multi-sitemap URL collection, section filtering, progress logging, grouped reporting, and enhanced error handling |
Makefile | Added a check_section target to run the link checker with a SECTION variable |
Comments suppressed due to low confidence (1)
scripts/link-checker/check-links.js:32
- New functionality for section filtering and multiple sitemap processing is introduced here; consider adding unit or integration tests to cover section filter logic and multi-sitemap URL fetching.
let [ baseURL, maxRetries, sectionFilter ] = process.argv.slice(2);
@@ -5,5 +5,22 @@ source ./scripts/common.sh | |||
|
|||
echo "Checking links..." | |||
|
|||
# Get required arguments |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Validate that the base_url argument is provided (and optionally display usage instructions) before proceeding to avoid running the script with missing arguments.
# Get required arguments | |
# Get required arguments | |
if [ "$#" -lt 1 ]; then | |
echo "Error: Missing required argument 'base_url'." | |
echo "Usage: $0 <base_url> [section_filter]" | |
exit 1 | |
fi |
Copilot uses AI. Check for mistakes.
const axios = require("axios"); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
axios is imported but never used; consider removing this import and the axios dependency to keep the codebase clean.
const axios = require("axios"); |
Copilot uses AI. Check for mistakes.
|
||
// Add the additional routes | ||
urls = urls.concat(additionalRoutes); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The list of URLs is returned without sorting, which may lead to non-deterministic order. Consider sorting the URLs (e.g., alphabetically) before returning to maintain consistency.
// Sort URLs alphabetically to ensure deterministic order | |
urls.sort(); | |
Copilot uses AI. Check for mistakes.
Yes! The main thing is making sure all sitemaps are processed/handled accordingly. |