Produce a scrape of a smallish Wiktionary with dev-1.14 for testing #2098
Comments
Do we need to do this through Zimfarm, or is it okay if I just do the scrape on my local machine and send you the ZIM?
It depends whether it needs wider testing on other clients, I suppose. I only really test thoroughly on KJS Browser Extension and on Kiwix PWA. If I find an issue, I try to corroborate that the issue is also on other clients (Kiwix Desktop and Kiwix Android), or whether it's a problem with the JS client(s) only, so I do test a bit more widely, and then report if I find something significant that needs fixing at scraper level or in other clients. In this case I'll be particularly keen to see whether #2073 is solved or not, and will report back on that. So, whatever is easiest for you!
So I'm unable to produce an es.wiktionary.org ZIM right now because of #2003. I'm getting:
Thanks for trying @audiodude!
Tried again with wiktionary_es on Zimfarm and got the same result; see https://farm.openzim.org/pipeline/b25161de-d764-4337-844e-19ec0a12afe4/debug
Thanks @kelson42! Hmm, weird error. Who would know what it means?! Looks like some empty array was passed somewhere.
I have made a test run with a "small" Wikipedia: wikipedia_ca. Unfortunately it died pretty quickly, see https://farm.openzim.org/pipeline/d75b3ef3-3fec-4914-8bed-050be72960f7/debug. I suspect another small bug here.
I have opened an issue for this last one: #2113. I'm more concerned about Wikipedia not being scrapable (for the moment) than about Wiktionary.
In general, it would be a good idea to do some road-testing of Wikimedia ZIMs other than Wikipedia with the new API (assuming other types also use this), but also there are a few issues where longstanding problems may have been fixed by dev (or others potentially introduced). For example, #2073 and the very similar #1033.
A smallish one with full features might be wiktionary_es_all_max (latest version we have is https://library.kiwix.org/viewer#wiktionary_es_all_maxi_2024-06, was produced by 1.13). This is just 890MB, so seems good for testing.