The link checker sees this as 200, but in reality the link is dead. We could probably download reply bodies whose size (as reported by the HEAD request) is below some sane limit, then parse them and check for HTML redirects. In this case, though, the HEAD reply does not contain a Content-Length header.
PS. It turns out that the Content-Length header in responses obtained with python requests often contains a bogus value of 20. This should be fixed as well.
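As a rough illustration of the "parse and check for HTML redirects" step, here is a minimal sketch that looks for a `<meta http-equiv="refresh">` tag in an already downloaded body. It uses only the Python standard library; the names `find_html_redirect` and `MAX_BODY_SIZE` and the 64 KiB limit are assumptions made for this example, not part of the existing link checker, and JavaScript-based redirects are not covered.

```python
import re
from html.parser import HTMLParser
from typing import Optional

# Hypothetical size limit: bodies larger than this are not inspected at all.
MAX_BODY_SIZE = 64 * 1024

# Matches a meta refresh content value such as "0; url=https://example.com/".
_REFRESH_RE = re.compile(
    r"^\s*\d+(?:\.\d+)?\s*[;,]\s*(?:url\s*=\s*)?(.+?)\s*$", re.IGNORECASE
)


class _MetaRefreshParser(HTMLParser):
    """Remembers the target of the first <meta http-equiv="refresh"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.target: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.target is not None:
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("http-equiv", "").lower() != "refresh":
            return
        match = _REFRESH_RE.match(attr.get("content", ""))
        if match:
            # The URL may be wrapped in quotes inside the content attribute.
            self.target = match.group(1).strip("'\"")


def find_html_redirect(body: str) -> Optional[str]:
    """Return the meta refresh target found in body, or None if there is none."""
    if len(body) > MAX_BODY_SIZE:
        return None
    parser = _MetaRefreshParser()
    parser.feed(body)
    return parser.target
```

A checker could call this only for 200 replies with an HTML content type and then treat a non-empty result like any other redirect.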
A partial but bulletproof solution would be to store a flag for each link indicating whether it is a homepage or a download. Downloads that reply with a text/html content type can then be treated as erroneous.
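The per-link flag idea could look roughly like the sketch below. The function name `is_bad_download` and the string values `"homepage"` and `"download"` are hypothetical; how link types are actually stored is not specified here.

```python
from typing import Optional


def is_bad_download(link_type: str, content_type: Optional[str]) -> bool:
    """Return True if a download link replied with an HTML page.

    link_type is assumed to be either "homepage" or "download";
    content_type is the raw Content-Type header value, if any.
    """
    if link_type != "download" or not content_type:
        return False
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "text/html"
```

Homepage links are never flagged by this check, since an HTML reply is exactly what is expected from them.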
AMDmi3 transferred this issue from repology/repology-updater on Mar 30, 2019
AMDmi3 changed the title from "Deal with html redirects somehow" to "Try to implement HTML redirect processing" on Mar 30, 2019