Request for hint/help on kiwix-js #1

Open
cptolemy opened this issue Feb 7, 2021 · 2 comments

cptolemy commented Feb 7, 2021

Good afternoon mossroy

How are you today?
I am contacting you because sharun-s thought that would be better than my trying to modify her kiwix-html5 code. I sent this same request for help to mossroy by email.

Basically, I am trying to automatically read a default ZIM file in kiwix-js. I understand the browser doesn't give us direct access to files, but I was thinking of opening and reading the file via a jQuery/AJAX call to a PHP script that would return the contents. But I don't know the exact point where the file is read, whether it is read fully, or whether making this change would let me simplify the reading by skipping the first HTML template.

Could you help me please?

sharun-s's script is quite good, but swapping in another ZIM file has proven to be hard.

Thank you so much.

Best regards, and stay safe

CPtolemy

Owner

Jaifroid commented Feb 7, 2021

@cptolemy Hi, you've posted this on a very old personal clone of Kiwix JS that is no longer being used or updated. I now commit all new code to https://github.com/kiwix/kiwix-js, which is actively being developed, and to https://github.com/kiwix/kiwix-js-windows/, which is a PWA and UWP port of Kiwix JS.

You should be aware that a lot of changes have happened since sharun-s was last involved in this project, including a change of ZIM format (to use ZSTD compression and WebP images), which means that unless he has been updating his codebase, it can no longer read new ZIMs. There is a further change upcoming too (ZIMs with no namespace). We have recently implemented all these changes in Kiwix JS and Kiwix JS Windows.

I have developed the Kiwix JS PWA version to use the File System Access API. In Chromium browsers, this nearly gets you what you want, but the user does have to pick a file or a folder the first time, and thereafter the user simply has to click a prompt: "Allow access to files?". It's a near-native experience. I understand that the API will soon allow installed PWAs to access the FS without further permission prompts (after picking an archive the first time).
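As a rough illustration of that permission flow, here is a minimal sketch, assuming a Chromium browser that exposes the File System Access API. The helper name `ensureReadAccess` is hypothetical and is not taken from the Kiwix JS codebase; `queryPermission` and `requestPermission` are the handle methods in Chromium's implementation, and `requestPermission` generally has to be called from a user gesture such as a click.

```javascript
// Hypothetical helper: returns true if the app may read from a previously
// picked file or directory handle, prompting the user only when needed.
async function ensureReadAccess(handle) {
  const options = { mode: "read" };
  // "granted" means access is already allowed and no prompt is needed.
  if ((await handle.queryPermission(options)) === "granted") return true;
  // Otherwise ask the browser to show its permission prompt.
  return (await handle.requestPermission(options)) === "granted";
}
```

In a browser, the handle would typically come from `window.showOpenFilePicker()` on first use; keeping it in IndexedDB for later sessions is one common approach, though whether that matches the PWA's actual implementation is an assumption here.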

If you want a packaged ZIM "app", e.g. WikiMed, that automatically loads a packaged archive with no prompts, we have those too, using either the NWJS framework or the Electron framework. For Windows 10, we also have a packaged version using the UWP framework (Store app, but it can be side-loaded). I have only compiled Electron and NWJS for Windows, but of course they could be compiled for Mac and Linux too.

Here are some links:

There is a thread on Kiwix JS where another dev has developed a way of accessing ZIM files over https, actually developing sharun-s's work, but it was a bit slow to load and involved a helper index, I believe. I'll ping you on that thread so you can have a read through.

To answer your specific queries, Kiwix JS does not read the whole file into memory: that would be impossible, of course, with the large multi-GB files we often have. It uses the File API and its methods to access specific offsets from the start of the file or file-set. Sharun-s's idea was to emulate the same over https using XHR range requests. It works, but is very slow, and depends on the server's capabilities. I believe he wanted to do it using the file:// protocol, to avoid having a server and network issues, but file access over that protocol is now basically deprecated in all browsers except the obsolete ones: IE11 and Edge Legacy.
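To make the offset-based reading concrete, here is a minimal sketch, not the actual Kiwix JS code. It assumes a runtime where `Blob.prototype.arrayBuffer()` is available (modern browsers, recent Node.js); the function names `readSlice` and `rangeHeader` are hypothetical. The second function shows the equivalent HTTP `Range` header value for the same span of bytes.

```javascript
// Read `length` bytes starting at `offset` from a File or Blob without
// loading the rest of the file into memory.
async function readSlice(blob, offset, length) {
  const buffer = await blob.slice(offset, offset + length).arrayBuffer();
  return new Uint8Array(buffer);
}

// Build the HTTP Range header value for the same byte span.
// Range end positions are inclusive, hence the "- 1".
function rangeHeader(offset, length) {
  return `bytes=${offset}-${offset + length - 1}`;
}
```

Each call to `readSlice` touches only the requested bytes, which is why a multi-GB archive can be browsed locally; over a network, every such call becomes a separate round trip.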

I think if you try to use PHP, you'll end up with similar network issues that have made all these solutions very slow: binary search involves thousands of lookups as it hops from one dirEntry to another across the index. We have introduced a low-level block cache with a Least Recently Used ejection strategy, which helps a lot, but it's still an issue over networks.
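The interaction between binary search and a block cache can be sketched as follows. This is an illustrative toy under stated assumptions, not the Kiwix JS implementation: the class names `LruCache` and `CachedIndex`, the `fetchBlock` callback, and the idea that each block holds a fixed number of sorted string keys are all hypothetical simplifications.

```javascript
// Least Recently Used cache built on Map, which iterates in insertion order.
class LruCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.map = new Map();
  }
  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    // Re-insert so this key becomes the most recently used.
    this.map.delete(key);
    this.map.set(key, value);
    return value;
  }
  set(key, value) {
    this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.capacity) {
      // The first key in iteration order is the least recently used.
      this.map.delete(this.map.keys().next().value);
    }
  }
}

// Serves index entries from cached blocks; `fetchBlock(blockIndex)` is the
// slow operation (disk or network) and resolves to an array of keys.
class CachedIndex {
  constructor(fetchBlock, entriesPerBlock, cacheCapacity) {
    this.fetchBlock = fetchBlock;
    this.entriesPerBlock = entriesPerBlock;
    this.cache = new LruCache(cacheCapacity);
    this.fetches = 0; // counts slow reads, for illustration only
  }
  async entryAt(i) {
    const blockIndex = Math.floor(i / this.entriesPerBlock);
    let block = this.cache.get(blockIndex);
    if (block === undefined) {
      block = await this.fetchBlock(blockIndex);
      this.fetches++;
      this.cache.set(blockIndex, block);
    }
    return block[i % this.entriesPerBlock];
  }
}

// Standard binary search over `count` sorted entries; returns -1 if absent.
async function binarySearch(index, count, target) {
  let lo = 0;
  let hi = count - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >>> 1;
    const key = await index.entryAt(mid);
    if (key === target) return mid;
    if (key < target) lo = mid + 1;
    else hi = mid - 1;
  }
  return -1;
}
```

Every probe that misses the cache costs one slow read, so a cold search still needs roughly log2(count) fetches. The cache mostly pays off on repeated or nearby lookups, which is consistent with the point that it helps a lot locally but cannot remove network latency.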

Electron and NWJS effectively do exactly what you want without any further back-end: they hack the File object and add a path property to it. This means the code only has to know the relative path to a ZIM file, e.g. a packaged WikiMed or Wikivoyage, and it can be accessed with no user interaction whatsoever. But of course they are quite heavy frameworks with a large overhead (though it's small compared to the ZIM file sizes). For the lightest approach I'd recommend the PWA version, because it feels native in Chromium browsers, especially if installed, is very lightweight, updates itself and works offline if no network access is available. However, the user must pick the archive the first time, which can be a problem if you have naive users.
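For comparison, here is a minimal sketch of what path-based access can look like in a runtime with Node.js integration, such as Electron or NWJS. It is not the mechanism described above (a path property on the File object) but the same general idea expressed with Node's `fs` module; the function name `readRangeFromPath` and any archive path passed to it are hypothetical.

```javascript
// Read `length` bytes at `offset` from a file at a known path.
// Needs a Node.js-capable runtime; plain browsers do not expose the
// file system like this.
async function readRangeFromPath(filePath, offset, length) {
  // Dynamic import works from both CommonJS and ES modules.
  const { open } = await import("node:fs/promises");
  const handle = await open(filePath, "r");
  try {
    const buffer = Buffer.alloc(length);
    const { bytesRead } = await handle.read(buffer, 0, length, offset);
    // Near the end of the file, fewer bytes than requested may come back.
    return buffer.subarray(0, bytesRead);
  } finally {
    await handle.close();
  }
}
```

Because the path is known to the app in advance, no file picker or permission prompt is involved, which is the trade-off against the heavier framework.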

Author

cptolemy commented Feb 10, 2021 via email
