-
Notifications
You must be signed in to change notification settings - Fork 19
Base64 decoded string for the Metadata #97
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Tests are failing because of wget? |
|
Just 502 errors? It's been going on since the website rework, try retrying
…On Wed, Sep 17, 2025, 1:49 PM Amit Gupta ***@***.***> wrote:
*ipcamit* left a comment (openkim/kim-api#97)
<#97 (comment)>
Tests are failing because of wget?
—
Reply to this email directly, view it on GitHub
<#97 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AS2TGGATSJ7ZBREZEUSQX733TGUKFAVCNFSM6AAAAACGZGYPH2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZTGMBUGE3TSMBQGA>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
|
Its |
|
So, I guess we introduced a bug when we added the encoding of files. Is that correct? |
|
Yes. When we added encoding for files, we implemented it such that input and output files will be correctly encoded. But metadata is little different as it is stored as a file, but read from memory directly. Therefore, it is still encoded. This patch provides two getter functions which can implicitly convert the file content and provide decoded pointers and length. Raw pointers still behave as they were. |
|
OK, this looks good. I've done a little refactoring and formatted with clang-format. Let me know if you see any issues or don't like what I've done. |
|
Looks good to me, except I guess the |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## devel #97 +/- ##
==========================================
+ Coverage 46.13% 46.14% +0.01%
==========================================
Files 145 145
Lines 13359 13368 +9
Branches 1354 1359 +5
==========================================
+ Hits 6163 6169 +6
+ Misses 6518 6517 -1
- Partials 678 682 +4 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Looks good to me. |
Add lazy in-memory base64 decoding for
SharedLibrary::EmbeddedFileI am starting this PR to get the conversation going. It is fully functional as it is and can be accepted. Unless some issue with the designed solution.
Summary
This PR refactors
SharedLibrary::EmbeddedFileto support on-demand base64 decoding of embedded files. Previously,GetMetadataFile()returned raw base64-encoded pointers and lengths. With this change, callers can transparently obtain the decoded binary/text contents via accessor functions.Issue
Currently KIM-API decode the files when it writes them to the hard drive. But Metadata is accessed directly from the memory. Hence it never got decoded. Now the
EmbeddedFilestructure has the option to instead decode in memory and use that pointers.The reason it is not the default behavior is that for large models, in-memory decoding might be an issue. Although I think it might be a cleaner solution. (Basically move the decoding method to EmbeddedFile, and not where it currently is,
SharedLibrary::WriteParameterFileDirectory.All the
mutableparams are there to ensure that we do not modify any other part of KIM-API.Changes
Added
mutable bool decodedStringAvailableandmutable std::string decodedFileContentto cache decoded data insideEmbeddedFile.Implemented:
decodeFileInMemory() const: lazily decodes the file into memory if not already decoded.getDecodedFileDataPointer() const: returns pointer to decoded bytes.getDecodedFileDataLength() const: returns decoded size.SharedLibrary::GetMetadataFile()to return decoded data instead of raw filePointer/fileLength.EmbeddedFile.Motivation
With this patch, LAMMPS logs show decoded metadata content instead of raw BibTeX source.
For example, currently the metadata section contained Base64 junk.
After this patch, the section now contains decoded LaTeX/BibTeX blocks extracted from the embedded base64.
Show LAMMPS output diff
People Concerned
Adding people who might be/should be interested:
@ellio167
@ilia-nikiforov-umn
@tadmor
@nav-mohan