Skip to content

Improve link checker #14871

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

asafashirov
Copy link
Contributor

  • Add support for processing all sitemap files (not just main sitemap)
  • Implement section filtering capability
  • Enhance error handling and progress tracking
  • Improve reporting with a complete list of broken links

- Add support for processing all sitemap files (not just main sitemap)
- Implement section filtering capability
- Enhance error handling and progress tracking
- Improve reporting with a complete list of broken links
@pulumi-bot
Copy link
Collaborator

@asafashirov asafashirov requested review from thoward and dirien April 28, 2025 19:13
@shughes26
Copy link
Contributor

We can look into this deeper post-launch, but I would be concerned making changes to our CI process pre-launch

@asafashirov
Copy link
Contributor Author

Thanks @shughes26. Can/should we capture #15080 under the same workflow? I.e. does it make sense to make sure all external links open in new windows/tabs when we want them to as part of the link checker workflow?

@dirien
Copy link
Contributor

dirien commented Jul 3, 2025

@asafashirov: is this still needed?

@dirien dirien requested a review from Copilot July 3, 2025 08:17
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR enhances the link checker by adding support for processing all sitemap files, introducing an optional section filter, improving error handling and progress tracking, and producing a detailed report of broken links.

  • Fetch URLs from multiple sitemaps and additional routes
  • Support an optional SECTION filter and retry logic with progress logs
  • Group and list broken links with summary statistics and Slack notification

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
scripts/link-checker/check-links.sh Added max_retries and optional section filter arguments, updated invocation of the Node script
scripts/link-checker/check-links.js Implemented multi-sitemap URL collection, section filtering, progress logging, grouped reporting, and enhanced error handling
Makefile Added a check_section target to run the link checker with a SECTION variable
Comments suppressed due to low confidence (1)

scripts/link-checker/check-links.js:32

  • New functionality for section filtering and multiple sitemap processing is introduced here; consider adding unit or integration tests to cover section filter logic and multi-sitemap URL fetching.
let [ baseURL, maxRetries, sectionFilter ] = process.argv.slice(2);

@@ -5,5 +5,22 @@ source ./scripts/common.sh

echo "Checking links..."

# Get required arguments
Copy link
Preview

Copilot AI Jul 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Validate that the base_url argument is provided (and optionally display usage instructions) before proceeding to avoid running the script with missing arguments.

Suggested change
# Get required arguments
# Get required arguments
if [ "$#" -lt 1 ]; then
echo "Error: Missing required argument 'base_url'."
echo "Usage: $0 <base_url> [section_filter]"
exit 1
fi

Copilot uses AI. Check for mistakes.

Comment on lines +7 to 8
const axios = require("axios");

Copy link
Preview

Copilot AI Jul 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

axios is imported but never used; consider removing this import and the axios dependency to keep the codebase clean.

Suggested change
const axios = require("axios");

Copilot uses AI. Check for mistakes.


// Add the additional routes
urls = urls.concat(additionalRoutes);

Copy link
Preview

Copilot AI Jul 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The list of URLs is returned without sorting, which may lead to non-deterministic order. Consider sorting the URLs (e.g., alphabetically) before returning to maintain consistency.

Suggested change
// Sort URLs alphabetically to ensure deterministic order
urls.sort();

Copilot uses AI. Check for mistakes.

@asafashirov
Copy link
Contributor Author

Yes! The main thing is making sure all sitemaps are processed/handled accordingly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants