Hi!
First of all, thank you for writing this library; it's very useful!
It seems to have trouble parsing the wget-created .warc.gz files I give it, though:
Traceback (most recent call last):
  File "./find-broken-links.py", line 16, in <module>
    for record in file:
  File "/Library/Python/2.7/site-packages/warc/warc.py", line 393, in __iter__
    record = self.read_record()
  File "/Library/Python/2.7/site-packages/warc/warc.py", line 364, in read_record
    self.finish_reading_current_record()
  File "/Library/Python/2.7/site-packages/warc/warc.py", line 359, in finish_reading_current_record
    self.expect(self.current_payload.fileobj, "\r\n")
  File "/Library/Python/2.7/site-packages/warc/warc.py", line 352, in expect
    raise IOError(message)
IOError: Expected '\r\n', found 'software: Wget/1.14 (linux-gnueabihf)\r\n'
-1
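In case it helps narrow things down, here is a rough sketch of how the records in such a file could be walked using only the standard library, as a cross-check against what the reader in the traceback expects. It is not how the warc library works internally, and I have not confirmed the root cause; the function names `iter_warc_records` and `iter_warc_gz` are made up for illustration. It assumes each record is a `WARC/` version line, header lines, a blank line, then exactly `Content-Length` bytes of payload, with blank CRLF lines between records, and that Python's `gzip` module reads through a multi-member .gz file as one stream.

```python
import gzip


def iter_warc_records(stream):
    """Yield (headers, payload) for each record in an uncompressed WARC byte stream.

    `stream` is any binary file-like object with readline() and read().
    """
    while True:
        line = stream.readline()
        # Skip the blank lines that separate one record from the next.
        while line in (b"\r\n", b"\n"):
            line = stream.readline()
        if not line:
            return  # end of stream
        if not line.startswith(b"WARC/"):
            raise IOError("Expected a WARC version line, found %r" % line)

        # Read "Name: value" header lines up to the blank line.
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break
            name, _, value = line.partition(b":")
            headers[name.strip().decode("utf-8").lower()] = value.strip().decode("utf-8")

        # Read exactly the number of payload bytes the record declares.
        length = int(headers.get("content-length", "0"))
        payload = stream.read(length)
        yield headers, payload


def iter_warc_gz(path):
    """Hypothetical helper: iterate the records of a gzip-compressed WARC file."""
    with gzip.open(path, "rb") as stream:
        for record in iter_warc_records(stream):
            yield record
```

Running something like `for headers, payload in iter_warc_gz("example.warc.gz"): print(headers.get("warc-type"), len(payload))` on one of the failing files (the file name here is a placeholder) should show whether the declared lengths line up with the record boundaries.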
Alas, I suspect fixing this elegantly is probably out of my depth. Is this something you can do?
Thank you,
Zoë.