Commit

Increase SERP download limit to 75k
sheineking committed Mar 29, 2023
1 parent 6eb12de commit 918b254
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion archive_query_log/download/warc.py
@@ -377,7 +377,7 @@ async def download_service(
         archived_urls = self._deduplicate_urls(archived_urls, snippets)
         archived_urls = Random(0).sample(
             archived_urls,
-            min(len(archived_urls), 50_000)
+            min(len(archived_urls), 75_000)
         )

         await self._download(archived_urls)
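
The hunk above draws a seeded random sample of the deduplicated URLs and caps its size, so the same subset is selected on every run. A minimal standalone sketch of that pattern follows; `sample_capped` and its `limit` and `seed` parameters are hypothetical names for illustration and do not appear in the repository.

```python
from random import Random


def sample_capped(urls, limit=75_000, seed=0):
    """Return a reproducible random subset of at most `limit` items.

    Hypothetical helper: mirrors the Random(0).sample(...) call in the diff.
    """
    urls = list(urls)
    # min() keeps the sample size valid when fewer than `limit` items exist;
    # Random.sample raises ValueError if asked for more items than available.
    size = min(len(urls), limit)
    # A fresh Random seeded with a constant yields the same subset each call.
    return Random(seed).sample(urls, size)


if __name__ == "__main__":
    urls = [f"https://example.com/{i}" for i in range(100)]
    subset = sample_capped(urls, limit=10)
    print(len(subset))
```

Because the seed is fixed, raising the cap changes how many URLs are downloaded. Whether the larger sample contains the earlier, smaller one is not guaranteed by this sketch.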
