Various projects that are part of the web monitoring ecosystem have grown overlapping approaches to two related problems:
1. Determining the effective status code of a capture (e.g. the server said 200, but it's really a 404 page).
2. Indicating that a capture might be bad (either the crawler got blocked, or it was just an intermittent server error and retrying would have gotten a better response). We want to avoid pushing these captures in people's faces.
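To make the two problems concrete, here is a minimal sketch of what those checks might look like. The function names, phrases, and status-code thresholds here are assumptions for illustration, not the actual logic in any of the projects:

```python
# Hypothetical sketch of the two checks described above; the phrase list
# and status-code choices are illustrative assumptions, not project code.
SOFT_404_PHRASES = ("page not found", "file not found", "404 error")


def effective_status(status_code: int, body_text: str) -> int:
    """Return the status a capture *behaves* like, not what the server said.

    A server may return 200 for a page whose content is really an error
    page (a "soft 404"); scan the body for telltale phrases.
    """
    if status_code == 200:
        lowered = body_text.lower()
        if any(phrase in lowered for phrase in SOFT_404_PHRASES):
            return 404
    return status_code


def maybe_bad_capture(status_code: int) -> bool:
    """Flag captures we'd rather not surface: blocked crawls or
    intermittent server errors where a retry might have succeeded."""
    if status_code in (403, 429):  # crawler likely blocked or rate-limited
        return True
    if 500 <= status_code < 600:   # transient server-side error
        return True
    return False
```

Whatever shape the shared implementation takes, separating "what status does this capture effectively have" from "should we hide this capture" like this would let each project reuse one or both checks independently.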
There's already some copy-pasting or porting of largely identical logic between projects. In other places, we have wildly varying approaches that should probably be more similar. If possible, we should settle on a strategy for sharing implementations, or at least align them so it's clear when and where work in one place should be ported to its copy in another.
Some example spots:

- `maybe_bad_capture()` in task sheets.
- `Page#calculate_status` in the DB. (This could maybe be rewritten entirely in favor of underlying logic more like the `maybe_bad_capture()` stuff in task sheets.)