Create all devdocs recipes/ZIMs #1230

Open
benoit74 opened this issue Dec 13, 2024 · 0 comments

Now that the devdocs scraper is ready, we need to create recipes for every ZIM we want to create with it.

To do this, we need:

  • a script to create all these recipes on the Zimfarm (repurposing the one written for TED)
  • a definition of which devdocs ZIMs we want to create

On the first point, I've started to create a generic tool (for developers) capable of creating and maintaining a set of Zimfarm recipes (Stackoverflow, TED, devdocs, libretexts, ...).

On the second point, I've done a short analysis and I need help. The analysis data is here: https://docs.google.com/spreadsheets/d/1WYVUmYGHdTKKCuTpBXcoCe7XI7yfmefpTI-qpGGHWyI/edit?usp=sharing (mind the two tabs).

The initial idea was to create one ZIM per slug, also because that is what the scraper is capable of. This would give us, for instance, python 3.10, python 3.11, python 3.12, ... The fact is that there are 716 slugs, and I'm not totally convinced anymore that creating so many ZIMs is really the right solution.

Another approach would be to create one ZIM per "Name", e.g. one for Python, one for Lua, ... with all versions inside. At least as an end-user, I can imagine one might prefer to have one ZIM for Python with all versions inside, so that we do not have to switch to another ZIM every time we switch Python version. This would give us only 221 ZIMs.
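To make the two options concrete, here is a minimal Python sketch that counts how many ZIMs each approach would produce. The sample entries, the `name` / `slug` field names and the `group_by_name` helper are assumptions made for illustration; they are not the spreadsheet data or the scraper's actual input.

```python
from collections import defaultdict

# Hypothetical sample of devdocs entries (the real analysis has 716 slugs).
docs = [
    {"name": "Python", "slug": "python~3.10"},
    {"name": "Python", "slug": "python~3.11"},
    {"name": "Python", "slug": "python~3.12"},
    {"name": "Lua", "slug": "lua~5.3"},
    {"name": "Lua", "slug": "lua~5.4"},
]


def group_by_name(entries):
    """Map each "Name" to the list of slugs (versions) it covers."""
    groups = defaultdict(list)
    for entry in entries:
        groups[entry["name"]].append(entry["slug"])
    return dict(groups)


groups = group_by_name(docs)
zims_per_slug = len(docs)    # option 1: one ZIM per slug
zims_per_name = len(groups)  # option 2: one ZIM per "Name"
print(zims_per_slug, zims_per_name)
```

Run against the full list, the same two counts would be expected to match the 716 and 221 figures above.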

But the scraper is not (yet) capable of creating these "mega-ZIMs" (mega does not mean they would take long to create or consume a lot of space, just that there are multiple things inside, and "über-ZIM" looks too German 🤣), and I'm pretty sure it will make searching (via suggestions or full-text search) even harder, because we will often have duplicates across versions.
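The duplicate concern can be illustrated with a small sketch. The page titles below are made up for the example and `duplicated_titles` is a hypothetical helper; the point is only that most entries of a doc set recur in every version, so a merged ZIM would surface the same title several times in suggestions.

```python
from collections import Counter

# Hypothetical page titles per version of one doc set.
titles_by_version = {
    "3.12": ["str.format", "dict.get", "asyncio.run"],
    "3.13": ["str.format", "dict.get", "asyncio.run", "copy.replace"],
}


def duplicated_titles(by_version):
    """Return the titles that appear in more than one version."""
    counts = Counter(
        title for titles in by_version.values() for title in set(titles)
    )
    return sorted(title for title, n in counts.items() if n > 1)


dupes = duplicated_titles(titles_by_version)
print(dupes)
```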

I am not considering creating a ZIM only for the most recent version (e.g. only Python 3.13 for Python), because it does not look very handy (e.g. I might still be forced to use Python 3.10 for whatever reason and need the docs for that version).

My recommendation so far would be to stick to the original idea and create these 716 ZIMs, despite the fact that it is "many ZIMs". But I'm not really sold on the idea.

WDYT?

benoit74 added the task label Dec 13, 2024
benoit74 self-assigned this Dec 13, 2024