Right now, I can use http-crawler to tell me about links that return non-20x responses (see the usage sketch after this list). That could happen for one of two reasons:
- The page should exist, and it’s broken (in which case I should fix it)
- The page doesn’t exist, and another page links to it incorrectly (in which case I should fix that link)
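For context, this is roughly how I’m using it today; a minimal sketch based on the `crawl` generator from the README:

```python
from http_crawler import crawl

# Report every crawled URL that didn't come back with a 2xx status.
for rsp in crawl('http://www.example.com'):
    if not 200 <= rsp.status_code < 300:
        # I can see *which* URL failed, but not which page linked to it.
        print('Got {} at {}'.format(rsp.status_code, rsp.url))
```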
In the latter case, it’s hard to find the source of the broken link from http-crawler’s current output. It would be useful if it could tell me how it found a given link, so that I can check the page that contains the broken link.
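To illustrate, here’s a hypothetical sketch of the behaviour I’m after. This is not http-crawler’s API; `crawl_with_referrers` and the `referrers` mapping are names I’ve made up for the example:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

import requests


class LinkParser(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            for name, value in attrs:
                if name == 'href' and value:
                    self.links.append(value)


def crawl_with_referrers(base_url):
    # Hypothetical: map each URL to the page where its link was first seen.
    referrers = {base_url: None}
    queue = [base_url]
    while queue:
        url = queue.pop(0)
        rsp = requests.get(url)
        yield rsp, referrers[url]
        if not rsp.headers.get('Content-Type', '').startswith('text/html'):
            continue
        parser = LinkParser()
        parser.feed(rsp.text)
        for href in parser.links:
            link = urljoin(url, href)
            # Stay on-site, and record only the first referrer for each URL.
            if link.startswith(base_url) and link not in referrers:
                referrers[link] = url
                queue.append(link)


for rsp, referrer in crawl_with_referrers('http://www.example.com'):
    if rsp.status_code != 200:
        print('Got {} at {} (linked from {})'.format(
            rsp.status_code, rsp.url, referrer))
```

With something like that, the output would point straight at the page that needs fixing.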
(Edited: I rushed the first draft of this issue.)