@ipcamit
Contributor

@ipcamit ipcamit commented Sep 17, 2025

Add lazy in-memory base64 decoding for SharedLibrary::EmbeddedFile

I am opening this PR to get the conversation going. It is fully functional as is and can be accepted, unless there is an issue with the design.

Summary

This PR refactors SharedLibrary::EmbeddedFile to support on-demand base64 decoding of embedded files. Previously, GetMetadataFile() returned raw base64-encoded pointers and lengths. With this change, callers can transparently obtain the decoded binary/text contents via accessor functions.

Issue

Currently, KIM-API decodes embedded files only when it writes them to disk. Metadata, however, is accessed directly from memory, so it was never decoded. With this change, the EmbeddedFile structure can optionally decode its contents in memory and hand out pointers to the decoded data.

In-memory decoding is not the default behavior because it might be an issue for large models, although I think it would be the cleaner solution (basically, move the decoding method into EmbeddedFile instead of where it currently is, SharedLibrary::WriteParameterFileDirectory).
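To make the memory concern concrete, here is a minimal sketch of the size arithmetic. The helper name base64DecodedLength is hypothetical and not part of KIM-API, and it assumes padded base64 text without line breaks. Every 4 encoded characters yield 3 decoded bytes, so a cached decoded copy costs roughly three quarters of the encoded size on top of the encoded data that is already resident.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of KIM-API): number of bytes produced by
// decoding a padded base64 string that contains no line breaks.
// Returns 0 for empty or malformed (length not a multiple of 4) input.
std::size_t base64DecodedLength(std::string const & encoded)
{
  if (encoded.empty() || encoded.size() % 4 != 0) return 0;

  // Each trailing '=' means one byte fewer in the final 3-byte group.
  std::size_t padding = 0;
  if (encoded[encoded.size() - 1] == '=') ++padding;
  if (encoded[encoded.size() - 2] == '=') ++padding;

  return encoded.size() / 4 * 3 - padding;
}
```

For example, the first 12 characters of the metadata string in the log, QENvbW1lbnQK, correspond to the 9 bytes of "@Comment" plus a newline.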

The new members are declared mutable so that the decoded data can be cached from within const accessors, which ensures that we do not have to modify any other part of KIM-API.

Changes

Added mutable bool decodedStringAvailable and mutable std::string decodedFileContent to cache decoded data inside EmbeddedFile.

Implemented:

  • decodeFileInMemory() const: lazily decodes the file into memory if not already decoded.
  • getDecodedFileDataPointer() const: returns pointer to decoded bytes.
  • getDecodedFileDataLength() const: returns decoded size.
  • Updated SharedLibrary::GetMetadataFile() to return decoded data instead of raw filePointer/fileLength.
  • Ensured decoding happens only once per EmbeddedFile.
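The changes listed above can be sketched roughly as follows. This is a simplified illustration, not the merged code: the member and accessor names come from this description, but the member types, the constructor, and the decodeBase64 helper are assumptions made to keep the example self-contained.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the library's base64 decoder (assumption).
// Characters outside the base64 alphabet (e.g. newlines) are skipped and
// the first '=' ends the input.
std::string decodeBase64(char const * data, std::size_t length)
{
  static std::string const alphabet
      = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  std::string out;
  unsigned int buffer = 0;
  int bits = 0;
  for (std::size_t i = 0; i < length; ++i)
  {
    if (data[i] == '=') break;
    std::size_t const value = alphabet.find(data[i]);
    if (value == std::string::npos) continue;
    buffer = (buffer << 6) | static_cast<unsigned int>(value);
    bits += 6;
    if (bits >= 8)
    {
      bits -= 8;
      out.push_back(static_cast<char>((buffer >> bits) & 0xFFu));
    }
  }
  return out;
}

// Simplified sketch of EmbeddedFile; member types are assumed.
struct EmbeddedFile
{
  char const * filePointer;  // raw base64 text, left untouched
  std::size_t fileLength;    // length of the base64 text

  // Cache members; mutable so the const accessors below can fill them.
  mutable bool decodedStringAvailable;
  mutable std::string decodedFileContent;

  EmbeddedFile(char const * pointer, std::size_t length) :
      filePointer(pointer),
      fileLength(length),
      decodedStringAvailable(false),
      decodedFileContent()
  {
  }

  // Decodes at most once; later calls return immediately.
  void decodeFileInMemory() const
  {
    if (decodedStringAvailable) return;
    decodedFileContent = decodeBase64(filePointer, fileLength);
    decodedStringAvailable = true;
  }

  char const * getDecodedFileDataPointer() const
  {
    decodeFileInMemory();
    return decodedFileContent.data();
  }

  std::size_t getDecodedFileDataLength() const
  {
    decodeFileInMemory();
    return decodedFileContent.size();
  }
};
```

A caller such as GetMetadataFile() would then return getDecodedFileDataPointer() and getDecodedFileDataLength() instead of filePointer and fileLength. One design caveat: a mutable cache filled from a const method is not thread-safe by itself, so concurrent first access would need external synchronization.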

Motivation

With this patch, LAMMPS logs show the decoded metadata content instead of the raw base64 string.
Currently, the citation section of the log contains base64 junk; after this patch, it contains the decoded LaTeX/BibTeX blocks extracted from the embedded base64.

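LAMMPS output diff (note the direction: lines marked - are the decoded output with this patch, lines marked + are the base64 output without it)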
@@ -72,92 +72,7 @@ Your simulation uses code contributions which should be cited:
 
 - OpenKIM potential: https://openkim.org/cite/MO_405512056662_006#item-citation
 
-@Comment
-{
-\documentclass{article}
-\usepackage{url}
-\begin{document}
-This Model originally published in \cite{OpenKIM-MO:405512056662:006a, OpenKIM-MO:405512056662:006b, OpenKIM-MO:405512056662:006c} is archived in the OpenKIM repository \cite{tadmor:elliott:2011, elliott:tadmor:2011} at \cite{OpenKIM-MO:405512056662:006, OpenKIM-MD:335816936951:005}.
-\bibliographystyle{vancouver}
-\bibliography{kimcite-MO_405512056662_006.bib}
-\end{document}
-}
-
-@Misc{OpenKIM-MO:405512056662:006,
-  author       = {Amit K. Singh and Frank H. Stillinger and Thomas A. Weber},
-  title        = {{S}tillinger-{W}eber potential for {S}i due to {S}tillinger and {W}eber (1985) v006},
-  doi          = {10.25950/dd263fe3},
-  howpublished = {OpenKIM, \url{https://doi.org/10.25950/dd263fe3}},
-  keywords     = {OpenKIM, Model, MO_405512056662_006},
-  publisher    = {OpenKIM},
-  year         = 2021,
-}
-
-@Misc{OpenKIM-MD:335816936951:005,
-  author       = {Mingjian Wen and Yaser Afshar and Frank H. Stillinger and Thomas A. Weber},
-  title        = {{S}tillinger-{W}eber ({SW}) {M}odel {D}river v005},
-  doi          = {10.25950/934dca3e},
-  howpublished = {OpenKIM, \url{https://doi.org/10.25950/934dca3e}},
-  keywords     = {OpenKIM, Model Driver, MD_335816936951_005},
-  publisher    = {OpenKIM},
-  year         = 2021,
-}
-
-@Article{tadmor:elliott:2011,
-  author    = {E. B. Tadmor and R. S. Elliott and J. P. Sethna and R. E. Miller and C. A. Becker},
-  title     = {The potential of atomistic simulations and the {K}nowledgebase of {I}nteratomic {M}odels},
-  journal   = {{JOM}},
-  year      = {2011},
-  volume    = {63},
-  number    = {7},
-  pages     = {17},
-  doi       = {10.1007/s11837-011-0102-6},
-}
-
-@Misc{elliott:tadmor:2011,
-  author       = {Ryan S. Elliott and Ellad B. Tadmor},
-  title        = {{K}nowledgebase of {I}nteratomic {M}odels ({KIM}) Application Programming Interface ({API})},
-  howpublished = {\url{https://openkim.org/kim-api}},
-  publisher    = {OpenKIM},
-  year         = 2011,
-  doi          = {10.25950/ff8f563a},
-}
-
-@Article{OpenKIM-MO:405512056662:006a,
-  author = {Stillinger, Frank H. and Weber, Thomas A.},
-  doi = {10.1103/PhysRevB.31.5262},
-  issue = {8},
-  journal = {Physical Review B},
-  month = {Apr},
-  pages = {5262--5271},
-  publisher = {American Physical Society},
-  title = {Computer simulation of local order in condensed phases of silicon},
-  volume = {31},
-  year = {1985},
-}
-
-@Book{OpenKIM-MO:405512056662:006b,
-  author = {Tadmor, Ellad B. and Miller, Ronald E.},
-  doi = {10.1017/CBO9781139003582},
-  publisher = {Cambridge University Press},
-  title = {Modeling Materials: {C}ontinuum, Atomistic and Multiscale Techniques},
-  year = {2011},
-}
-
-@Article{OpenKIM-MO:405512056662:006c,
-  author = {Stillinger, Frank H. and Weber, Thomas A.},
-  doi = {10.1103/PhysRevB.33.1451},
-  issue = {2},
-  journal = {Phys. Rev. B},
-  month = {jan},
-  numpages = {0},
-  pages = {1451--1451},
-  publisher = {American Physical Society},
-  title = {Erratum: Computer simulation of local order in condensed phases of silicon [{P}hys. {R}ev. {B} 31, 5262 (1985)]},
-  volume = {33},
-  year = {1986},
-}
-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE
+QENvbW1lbnQKewpcZG9jdW1lbnRjbGFzc3thcnRpY2xlfQpcdXNlcGFja2FnZXt1cmx9ClxiZWdpbntkb2N1bWVudH0KVGhpcyBNb
2Rlb
CBvcmlnaW5hbGx5IHB1Ymxpc2hlZCBpbiBcY2l0ZXtPcGVuS0lNLU1POjQwNTUxMjA1NjY2MjowMDZhLCBPcGVuS0lNLU1POjQwNTU
xMjA
1NjY2MjowMDZiLCBPcGVuS0lNLU1POjQwNTUxMjA1NjY2MjowMDZjfSBpcyBhcmNoaXZlZCBpbiB0aGUgT3BlbktJTSByZXBvc2l0b
3J5I
FxjaXRle3RhZG1vcjplbGxpb3R0OjIwMTEsIGVsbGlvdHQ6dGFkbW9yOjIwMTF9IGF0IFxjaXRle09wZW5LSU0tTU86NDA1NTEyMDU
2NjY
yOjAwNiwgT3BlbktJTS1NRDozMzU4MTY5MzY5NTE6MDA1fS4KXGJpYmxpb2dyYXBoeXN0eWxle3ZhbmNvdXZlcn0KXGJpYmxpb2dyY
XBoe
XtraW1jaXRlLU1PXzQwNTUxMjA1NjY2Ml8wMDYuYmlifQpcZW5ke2RvY3VtZW50fQp9CgpATWlzY3tPcGVuS0lNLU1POjQwNTUxMjA
1NjY
2MjowMDYsCiAgYXV0aG9yICAgICAgID0ge0FtaXQgSy4gU2luZ2ggYW5kIEZyYW5rIEguIFN0aWxsaW5nZXIgYW5kIFRob21hcyBBL
iBXZ
WJlcn0sCiAgdGl0bGUgICAgICAgID0ge3tTfXRpbGxpbmdlci17V31lYmVyIHBvdGVudGlhbCBmb3Ige1N9aSBkdWUgdG8ge1N9dGl
sbGl
uZ2VyIGFuZCB7V31lYmVyICgxOTg1KSB2MDA2fSwKICBkb2kgICAgICAgICAgPSB7MTAuMjU5NTAvZGQyNjNmZTN9LAogIGhvd3B1Y
mxpc
2hlZCA9IHtPcGVuS0lNLCBcdXJse2h0dHBzOi8vZG9pLm9yZy8xMC4yNTk1MC9kZDI2M2ZlM319LAogIGtleXdvcmRzICAgICA9IHt
PcGV
uS0lNLCBNb2RlbCwgTU9fNDA1NTEyMDU2NjYyXzAwNn0sCiAgcHVibGlzaGVyICAgID0ge09wZW5LSU19LAogIHllYXIgICAgICAgI
CA9I
DIwMjEsCn0KCkBNaXNje09wZW5LSU0tTUQ6MzM1ODE2OTM2OTUxOjAwNSwKICBhdXRob3IgICAgICAgPSB7TWluZ2ppYW4gV2VuIGF
uZCB
ZYXNlciBBZnNoYXIgYW5kIEZyYW5rIEguIFN0aWxsaW5nZXIgYW5kIFRob21hcyBBLiBXZWJlcn0sCiAgdGl0bGUgICAgICAgID0ge
3tTf
XRpbGxpbmdlci17V31lYmVyICh7U1d9KSB7TX1vZGVsIHtEfXJpdmVyIHYwMDV9LAogIGRvaSAgICAgICAgICA9IHsxMC4yNTk1MC8
5MzR
kY2EzZX0sCiAgaG93cHVibGlzaGVkID0ge09wZW5LSU0sIFx1cmx7aHR0cHM6Ly9kb2kub3JnLzEwLjI1OTUwLzkzNGRjYTNlfX0sC
iAga
2V5d29yZHMgICAgID0ge09wZW5LSU0sIE1vZGVsIERyaXZlciwgTURfMzM1ODE2OTM2OTUxXzAwNX0sCiAgcHVibGlzaGVyICAgID0
ge09
wZW5LSU19LAogIHllYXIgICAgICAgICA9IDIwMjEsCn0KCkBBcnRpY2xle3RhZG1vcjplbGxpb3R0OjIwMTEsCiAgYXV0aG9yICAgI
D0ge
0UuIEIuIFRhZG1vciBhbmQgUi4gUy4gRWxsaW90dCBhbmQgSi4gUC4gU2V0aG5hIGFuZCBSLiBFLiBNaWxsZXIgYW5kIEMuIEEuIEJ
lY2t
lcn0sCiAgdGl0bGUgICAgID0ge1RoZSBwb3RlbnRpYWwgb2YgYXRvbWlzdGljIHNpbXVsYXRpb25zIGFuZCB0aGUge0t9bm93bGVkZ
2ViY
XNlIG9mIHtJfW50ZXJhdG9taWMge019b2RlbHN9LAogIGpvdXJuYWwgICA9IHt7Sk9NfX0sCiAgeWVhciAgICAgID0gezIwMTF9LAo
gIHZ
vbHVtZSAgICA9IHs2M30sCiAgbnVtYmVyICAgID0gezd9LAogIHBhZ2VzICAgICA9IHsxN30sCiAgZG9pICAgICAgID0gezEwLjEwM
Dcvc
zExODM3LTAxMS0wMTAyLTZ9LAp9CgpATWlzY3tlbGxpb3R0OnRhZG1vcjoyMDExLAogIGF1dGhvciAgICAgICA9IHtSeWFuIFMuIEV
sbGl
vdHQgYW5kIEVsbGFkIEIuIFRhZG1vcn0sCiAgdGl0bGUgICAgICAgID0ge3tLfW5vd2xlZGdlYmFzZSBvZiB7SX1udGVyYXRvbWljI
HtNf
W9kZWxzICh7S0lNfSkgQXBwbGljYXRpb24gUHJvZ3JhbW1pbmcgSW50ZXJmYWNlICh7QVBJfSl9LAogIGhvd3B1Ymxpc2hlZCA9IHt
cdXJ
se2h0dHBzOi8vb3BlbmtpbS5vcmcva2ltLWFwaX19LAogIHB1Ymxpc2hlciAgICA9IHtPcGVuS0lNfSwKICB5ZWFyICAgICAgICAgP
SAyM
DExLAogIGRvaSAgICAgICAgICA9IHsxMC4yNTk1MC9mZjhmNTYzYX0sCn0KCkBBcnRpY2xle09wZW5LSU0tTU86NDA1NTEyMDU2NjY
yOjA
wNmEsCiAgYXV0aG9yID0ge1N0aWxsaW5nZXIsIEZyYW5rIEguIGFuZCBXZWJlciwgVGhvbWFzIEEufSwKICBkb2kgPSB7MTAuMTEwM
y9Qa
HlzUmV2Qi4zMS41MjYyfSwKICBpc3N1ZSA9IHs4fSwKICBqb3VybmFsID0ge1BoeXNpY2FsIFJldmlldyBCfSwKICBtb250aCA9IHt
BcHJ
9LAogIHBhZ2VzID0gezUyNjItLTUyNzF9LAogIHB1Ymxpc2hlciA9IHtBbWVyaWNhbiBQaHlzaWNhbCBTb2NpZXR5fSwKICB0aXRsZ
SA9I
HtDb21wdXRlciBzaW11bGF0aW9uIG9mIGxvY2FsIG9yZGVyIGluIGNvbmRlbnNlZCBwaGFzZXMgb2Ygc2lsaWNvbn0sCiAgdm9sdW1
lID0
gezMxfSwKICB5ZWFyID0gezE5ODV9LAp9CgpAQm9va3tPcGVuS0lNLU1POjQwNTUxMjA1NjY2MjowMDZiLAogIGF1dGhvciA9IHtUY
WRtb
3IsIEVsbGFkIEIuIGFuZCBNaWxsZXIsIFJvbmFsZCBFLn0sCiAgZG9pID0gezEwLjEwMTcvQ0JPOTc4MTEzOTAwMzU4Mn0sCiAgcHV
ibGl
zaGVyID0ge0NhbWJyaWRnZSBVbml2ZXJzaXR5IFByZXNzfSwKICB0aXRsZSA9IHtNb2RlbGluZyBNYXRlcmlhbHM6IHtDfW9udGlud
XVtL
CBBdG9taXN0aWMgYW5kIE11bHRpc2NhbGUgVGVjaG5pcXVlc30sCiAgeWVhciA9IHsyMDExfSwKfQoKQEFydGljbGV7T3BlbktJTS1
NTzo
0MDU1MTIwNTY2NjI6MDA2YywKICBhdXRob3IgPSB7U3RpbGxpbmdlciwgRnJhbmsgSC4gYW5kIFdlYmVyLCBUaG9tYXMgQS59LAogI
GRva
SA9IHsxMC4xMTAzL1BoeXNSZXZCLjMzLjE0NTF9LAogIGlzc3VlID0gezJ9LAogIGpvdXJuYWwgPSB7UGh5cy4gUmV2LiBCfSwKICB
tb25
0aCA9IHtqYW59LAogIG51bXBhZ2VzID0gezB9LAogIHBhZ2VzID0gezE0NTEtLTE0NTF9LAogIHB1Ymxpc2hlciA9IHtBbWVyaWNhb
iBQa
HlzaWNhbCBTb2NpZXR5fSwKICB0aXRsZSA9IHtFcnJhdHVtOiBDb21wdXRlciBzaW11bGF0aW9uIG9mIGxvY2FsIG9yZGVyIGluIGN
vbmR
lbnNlZCBwaGFzZXMgb2Ygc2lsaWNvbiBbe1B9aHlzLiB7Un1ldi4ge0J9IDMxLCA1MjYyICgxOTg1KV19LAogIHZvbHVtZSA9IHszM
30sC
iAgeWVhciA9IHsxOTg2fSwKfQo=
CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE
 

People Concerned

Adding people who might be/should be interested:
@ellio167
@ilia-nikiforov-umn
@tadmor
@nav-mohan

@ipcamit
Contributor Author

ipcamit commented Sep 17, 2025

Tests are failing because of wget?

@ilia-nikiforov-umn
Member

ilia-nikiforov-umn commented Sep 17, 2025 via email

@ipcamit
Contributor Author

ipcamit commented Sep 17, 2025

The message is "'wget' returned error code 8", which means the server issued an error response. So I guess yes, it is a 502.

@ellio167
Member

So, I guess we introduced a bug when we added the encoding of files. Is that correct?

@ipcamit
Contributor Author

ipcamit commented Sep 18, 2025

Yes. When we added encoding for files, we implemented it such that input and output files are correctly encoded. Metadata is a little different: it is stored as a file but read directly from memory, so it was still encoded when accessed. This patch provides two getter functions that implicitly decode the file content and return a pointer to the decoded data and its length. The raw pointers still behave as before.

@ellio167
Member

OK, this looks good. I've done a little refactoring and formatted with clang-format. Let me know if you see any issues or don't like what I've done.

@ipcamit
Contributor Author

ipcamit commented Sep 18, 2025

Looks good to me, except I guess the decodedString variable should be called decodedFileContent, or vice versa.

@codecov

codecov bot commented Sep 18, 2025

Codecov Report

❌ Patch coverage is 94.73684% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 46.14%. Comparing base (ab54495) to head (75d6684).
⚠️ Report is 43 commits behind head on devel.

Files with missing lines      | Patch % | Lines
cpp/src/KIM_SharedLibrary.cpp | 94.73%  | 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##            devel      #97      +/-   ##
==========================================
+ Coverage   46.13%   46.14%   +0.01%     
==========================================
  Files         145      145              
  Lines       13359    13368       +9     
  Branches     1354     1359       +5     
==========================================
+ Hits         6163     6169       +6     
+ Misses       6518     6517       -1     
- Partials      678      682       +4     


@ellio167 ellio167 merged commit 5d53c4b into openkim:devel Sep 18, 2025
10 checks passed
@nav-mohan
Contributor

Looks good to me.
