Share logic around effective status codes and bad captures better #175

Open
Mr0grog opened this issue Feb 19, 2025 · 0 comments

Comments

@Mr0grog
Member

Mr0grog commented Feb 19, 2025

Various projects that are part of the web monitoring ecosystem have grown overlapping approaches to two related problems:

  1. Determining the effective status code of a capture (e.g. the server said 200, but it’s really a 404 page).
  2. Indicating that a capture might be bad (either the crawler got blocked, or it was just an intermittent server error and retrying would have gotten a better response). We want to avoid pushing these captures in people’s faces.

There’s already some copy-pasting or porting of largely identical logic between projects. In other places, we have wildly varying approaches that maybe should be more similar. If possible, we should figure out a strategy for sharing implementations, or at least aligning them so it’s clear when work in one place needs to be ported to its copy in another.
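As a rough illustration of what a shared implementation could look like, here is a minimal sketch that answers both questions in one place. Everything in it is an assumption made for the example: the names `assess_capture` and `CaptureAssessment`, the title regex, and the sets of status codes are hypothetical and are not taken from any of the existing projects.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical heuristics, for illustration only. The real projects
# may use different (and more careful) rules.
NOT_FOUND_PATTERN = re.compile(r"\b(404|page not found|not found)\b", re.IGNORECASE)
BLOCKED_STATUSES = {403, 429}
TRANSIENT_STATUSES = {500, 502, 503, 504}


@dataclass(frozen=True)
class CaptureAssessment:
    effective_status: int          # what the response "really" was
    possibly_bad: bool             # True if a retry might have done better
    reason: Optional[str] = None   # human-readable explanation, if any


def assess_capture(status: int, title: str = "") -> CaptureAssessment:
    """Return the effective status and a 'possibly bad' flag for a capture."""
    # Problem 1: the server said 200, but the page looks like a 404.
    if status == 200 and NOT_FOUND_PATTERN.search(title):
        return CaptureAssessment(404, False, "title looks like a not-found page")
    # Problem 2: the crawler may have been blocked...
    if status in BLOCKED_STATUSES:
        return CaptureAssessment(status, True, "crawler may have been blocked")
    # ...or the server may have had an intermittent error.
    if status in TRANSIENT_STATUSES:
        return CaptureAssessment(status, True, "possibly intermittent server error")
    return CaptureAssessment(status, False)
```

Returning one small result object for both concerns would give ports in other languages a single shape to mirror, which might make it easier to see when a copy has drifted.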

Some example spots:

Projects
Status: Backlog