Migrate to Tebako packager #126

Merged
ronaldtse merged 77 commits into main from maxirmx-tebako-packager
Jan 10, 2022

Conversation

ronaldtse
Contributor

Metanorma PR checklist

@ronaldtse
Contributor Author

@maxirmx here's the real error from https://github.com/metanorma/packed-mn/runs/4419965948?check_suite_focus=true:

2021-12-05T01:23:57.8451319Z [info]: Compiling /home/runner/work/packed-mn/packed-mn/iso-19156/sources/iso-19156.adoc ...
2021-12-05T01:24:04.2376054Z [relaton] Info: detecting backends:
2021-12-05T01:24:05.1042651Z Fatal Error: No such file or directory @ rb_sysopen - /__tebako_memfs__/lib/ruby/gems/2.7.0/gems/rdf-xsd-3.1.1/lib/rdf/xsd/../../../VERSION
2021-12-05T01:24:05.1193801Z /__tebako_memfs__/lib/ruby/gems/2.7.0/gems/rdf-xsd-3.1.1/lib/rdf/xsd/version.rb:3:in `read': No such file or directory @ rb_sysopen - /__tebako_memfs__/lib/ruby/gems/2.7.0/gems/rdf-xsd-3.1.1/lib/rdf/xsd/../../../VERSION (Errno::ENOENT)
2021-12-05T01:24:05.1196934Z 	from /__tebako_memfs__/lib/ruby/gems/2.7.0/gems/rdf-xsd-3.1.1/lib/rdf/xsd/version.rb:3:in `<module:VERSION>'
2021-12-05T01:24:05.1198286Z 	from /__tebako_memfs__/lib/ruby/gems/2.7.0/gems/rdf-xsd-3.1.1/lib/rdf/xsd/version.rb:1:in `<top (required)>'
2021-12-05T01:24:05.1199546Z 	from /__tebako_memfs__/lib/ruby/gems/2.7.0/gems/rdf-xsd-3.1.1/lib/rdf/xsd.rb:14:in `require'
2021-12-05T01:24:05.1201045Z 	from /__tebako_memfs__/lib/ruby/gems/2.7.0/gems/rdf-xsd-3.1.1/lib/rdf/xsd.rb:14:in `block in <top (required)>'
2021-12-05T01:24:05.1204025Z 	from /__tebako_memfs__/lib/ruby/gems/2.7.0/gems/rdf-xsd-3.1.1/lib/rdf/xsd.rb:14:in `glob'

@maxirmx
Contributor

maxirmx commented Dec 5, 2021

Thank you, @ronaldtse !

/__tebako_memfs__/lib/ruby/gems/2.7.0/gems/rdf-xsd-3.1.1/lib/rdf/xsd/../../../VERSION

I can see that the file is in memfs, but Ruby cannot find it, so it must be an issue with libdwarfs-wr. I would rather test it bottom-up:
tamatebako/libdwarfs#51
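
For reference, a minimal sketch of which file the failing path points at once its `..` segments are collapsed. It only normalizes the string from the error message and does not model how libdwarfs-wr or the memfs resolves paths; that the resolution of `..` is where the lookup goes wrong is an assumption, not something confirmed in this thread.

```python
import posixpath

# Path copied from the error message above.
requested = (
    "/__tebako_memfs__/lib/ruby/gems/2.7.0/gems/rdf-xsd-3.1.1"
    "/lib/rdf/xsd/../../../VERSION"
)

# Collapse the ".." segments lexically, without touching any filesystem.
resolved = posixpath.normpath(requested)
print(resolved)
```

The three `..` segments cancel `lib/rdf/xsd`, so the file being read is the `VERSION` file in the gem's root directory.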

@maxirmx
Contributor

maxirmx commented Dec 5, 2021

Also, this is the usual cost paid by people who try to cut corners.
I should have written unit tests as planned (tamatebako/tebako#19), but I skipped them.

@ronaldtse
Contributor Author

ronaldtse commented Dec 5, 2021

Also, this is the usual cost paid by people who try to cut corners. I should have written unit tests as planned (tamatebako/tebako#19), but I skipped them.

Usually it's better to pay the price early on rather than run into a crash later. Agree 😉

@ronaldtse ronaldtse force-pushed the maxirmx-tebako-packager branch from ce4c512 to 35183e5 December 8, 2021 14:44
@maxirmx maxirmx marked this pull request as draft December 12, 2021 20:01
@ronaldtse
Contributor Author

Strange to see usage of so much swap... how are we loading the DwarFS image in the binary?

@maxirmx
Contributor

maxirmx commented Dec 13, 2021

Strange to see usage of so much swap... how are we loading the DwarFS image in the binary?

It happens before the image is loaded into the binary. mkdwarfs needs the memory to create the compressed image. Libzstd

We load the image with INCBIN, but that is unrelated.

@ronaldtse
Contributor Author

It happens before the image is loaded into the binary. mkdwarfs needs the memory to create the compressed image. Libzstd

Isn't it strange to require 25 GB of swap for an image of a few hundred MB...?

@maxirmx
Contributor

maxirmx commented Dec 13, 2021

Actually, it worked with 17 GB.
Please note that the ruby-packer workflow is set up with 15 GB.

@maxirmx
Contributor

maxirmx commented Dec 13, 2021

https://github.com/mhx/dwarfs/blob/main/doc/mkdwarfs.md

Note the memory limit option.
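
A minimal sketch of passing that option when building the image. The `-i`, `-o` and `--memory-limit` flags are taken from the mkdwarfs documentation linked above as I read it; the directory name, output name and the `4g` value (including its unit suffix) are assumptions for illustration. Check the documentation for your mkdwarfs version, in particular for which parts of the memory usage the limit actually bounds.

```python
import shlex
from typing import List


def mkdwarfs_command(source_dir: str, image_path: str, memory_limit: str) -> List[str]:
    """Build (but do not run) an mkdwarfs invocation with a memory limit."""
    return [
        "mkdwarfs",
        "-i", source_dir,                    # directory to pack
        "-o", image_path,                    # image file to write
        f"--memory-limit={memory_limit}",    # value format is an assumption
    ]


# Hypothetical paths and limit, for illustration only.
command = mkdwarfs_command("packaged-root", "fs.bin", "4g")
print(shlex.join(command))
```

The command is only printed here; pass the list to `subprocess.run` to execute it.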

@ronaldtse
Contributor Author

Is the large memory limit used only for read-ahead? If we specify a lower memory limit, we should still be able to package the image, right?

ruby-packer has the 15GB swap issue because it is trying to serialize the binary package as a C array text file...
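
A rough sketch of why the C array route is expensive: every byte of the package becomes several characters of source text that the compiler then has to parse. The function name and the exact formatting are assumptions for illustration, not ruby-packer's actual output, and the sketch measures only text size, not compiler memory.

```python
def to_c_array(data: bytes, name: str = "image") -> str:
    """Serialize binary data as a C array definition in source form."""
    body = ", ".join(f"0x{b:02x}" for b in data)
    return f"const unsigned char {name}[] = {{ {body} }};\n"


# 1 KiB of sample data standing in for the packaged filesystem.
payload = bytes(range(256)) * 4
source = to_c_array(payload)

# Each byte costs about six characters ("0x00, ") in the generated text.
ratio = len(source) / len(payload)
print(f"{len(payload)} bytes -> {len(source)} characters ({ratio:.1f}x)")
```

INCBIN avoids this step by having the assembler include the file as raw bytes, so no source text is generated.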

@maxirmx
Contributor

maxirmx commented Dec 13, 2021

The large memory size can be explained by the history of DwarFS and the author's own explanations:

I started working on DwarFS in 2013 and my main use case and major motivation was that I had several hundred different versions of Perl that were taking up something around 30 gigabytes of disk space, and I was unwilling to spend more than 10% of my hard drive keeping them around for when I happened to need them.
DwarFS is a read-only file system with a focus on achieving very high compression ratios in particular for very redundant data.

I believe he keeps a whole copy of the filesystem to be packaged in memory in order to look for duplicates and similar blocks, optimize across page boundaries, and do other similar things. That is reasonable if you consider 300 sequential versions of something like Perl (or Ruby).
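
To make the memory argument concrete, here is a minimal sketch of block-level duplicate detection. It is not DwarFS's actual algorithm; the function name, the fixed 4-byte block size and the sample file names are made up for illustration. The point is only that the index of blocks seen so far has to stay in memory for the whole run.

```python
import hashlib
from typing import Dict, Tuple


def dedupe_blocks(files: Dict[str, bytes], block_size: int = 4) -> Tuple[int, int]:
    """Return (total blocks, unique blocks) across all files."""
    seen: Dict[bytes, bytes] = {}  # digest -> block, kept for the whole run
    total = 0
    for data in files.values():
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            total += 1
            seen.setdefault(hashlib.sha256(block).digest(), block)
    return total, len(seen)


# Two hypothetical versions of the same file that differ in one block.
versions = {
    "perl-5.30/lib.pm": b"AAAABBBBCCCC",
    "perl-5.32/lib.pm": b"AAAABBBBDDDD",
}
total, unique = dedupe_blocks(versions)
print(f"{total} blocks, {unique} unique")
```

With hundreds of near-identical versions the unique set stays small relative to the input, but the index still has to cover everything scanned so far.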

@maxirmx
Contributor

maxirmx commented Dec 13, 2021

Further he adds:

This alone wouldn't have been enough to get me into writing DwarFS, but at around the same time, I was pretty obsessed with the recent developments and features of newer C++ standards and really wanted a C++ hobby project to work on.

Sure, advanced features of C++ are very cool. However, they are insidious. There is a high risk that the program creates several copies of the data unintentionally. I think that may be the case here: not only the full filesystem to be packaged, but several copies of it, are loaded into memory.

@maxirmx maxirmx marked this pull request as ready for review December 14, 2021 21:45
@maxirmx maxirmx requested review from CAMOBAP and opoudjis December 14, 2021 21:45
@maxirmx
Contributor

maxirmx commented Dec 14, 2021

@ronaldtse @CAMOBAP @opoudjis
Not sure what to do with it next, but we can definitely talk about it.
I am afraid you would expect to see a single executable like ruby-packer and not a build script. However, my approach has one advantage: once the environment is set up, packaging takes 5 minutes, compared to ruby-packer's 1 hour.

Thank you

@CAMOBAP
Contributor

CAMOBAP commented Dec 14, 2021

I am afraid you would expect to see a single executable like ruby-packer and not a build script

Not a problem from my point of view

@maxirmx I have two questions so far:

  1. Currently all setup logic lives in the GHA workflow, which is mostly OK, but sometimes we will need to troubleshoot issues locally, so it would be nice to convert it to a shell script. What do you think, does that make sense?
  2. As far as I can see, we currently support only Linux, right? What is the status of macOS and Windows?

@maxirmx
Contributor

maxirmx commented Dec 15, 2021

  1. Currently all setup logic lives in the GHA workflow, which is mostly OK, but sometimes we will need to troubleshoot issues locally, so it would be nice to convert it to a shell script. What do you think, does that make sense?

Actually, it is a shell script already.
You need two commands:
bin/tebako setup creates reusable binary objects
bin/tebako press creates a package

Everything else is about prerequisites. We can probably use some package manager to turn it into a one-liner.
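
For local troubleshooting, the two commands can be driven from a small wrapper. This is a sketch only: it uses just the two subcommands named above, omits any options that `press` may need (see the development notes linked in this comment), and the `run_steps` and `real_runner` helpers are hypothetical names.

```python
import subprocess
from typing import Callable, List, Sequence

# The two steps named in the comment above, in order.
STEPS: List[List[str]] = [
    ["bin/tebako", "setup"],  # creates reusable binary objects
    ["bin/tebako", "press"],  # creates a package
]


def run_steps(
    steps: Sequence[Sequence[str]],
    runner: Callable[[Sequence[str]], int],
) -> List[Sequence[str]]:
    """Run steps in order and stop at the first non-zero exit code."""
    completed = []
    for step in steps:
        if runner(step) != 0:
            raise RuntimeError(f"step failed: {' '.join(step)}")
        completed.append(step)
    return completed


def real_runner(step: Sequence[str]) -> int:
    """Execute a step for real; requires a tebako checkout and its prerequisites."""
    return subprocess.run(list(step)).returncode


# Dry run with a stub runner, so the sketch works without tebako installed.
done = run_steps(STEPS, lambda step: 0)
print(done)
```

Swap the stub for `real_runner` inside a tebako checkout to execute the steps.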

There is a brief document here:
https://github.com/tamatebako/tebako/blob/master/doc/DEV-NOTES.md

@maxirmx
Contributor

maxirmx commented Dec 15, 2021

2. As far as I can see, we currently support only Linux, right? What is the status of macOS and Windows?

I have preliminary estimates only. I think I will be able to have a macOS version with the same functionality in approximately 2 weeks. Windows will take about 6 weeks, but that is a very preliminary and shaky estimate.

I have not started either of them yet.

@maxirmx
Contributor

maxirmx commented Dec 15, 2021

@ronaldtse

Also need documentation for tebako

There is a brief document here:
https://github.com/tamatebako/tebako/blob/master/doc/DEV-NOTES.md
I encourage everybody to ask additional questions or otherwise criticize it.

Can I confirm that a binary built on Ubuntu 20.04 will work on a vanilla Ubuntu 18.04?

No, the status today is as follows:

  • A binary built on Ubuntu 18.04 will work on Ubuntu 20.04. This is tested now, but I use GHA images. I can probably deploy a clean Ubuntu container to make it "more vanilla".

  • However, a binary built on Ubuntu 20.04 WILL NOT WORK on Ubuntu 18.04, because I link against the shared glibc, as ruby-packer does. There are some options to make a 20.04 image work on 18.04 if we need it.

https://github.com/wheybags/glibc_version_header
https://github.com/crosstool-ng/crosstool-ng
and an overview here: https://stackoverflow.com/questions/2856438/how-can-i-link-to-a-specific-glibc-version

But it will be one more hack that is hard to estimate.
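
The compatibility rule above can be sketched as a version comparison. This is a simplification: the real requirement is the highest versioned glibc symbol the binary references, not the build host's glibc version as such. The glibc versions listed are the ones shipped by the stock releases (2.27 on Ubuntu 18.04, 2.31 on Ubuntu 20.04); the helper names are hypothetical.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '2.31' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def can_run(required_glibc: str, host_glibc: str) -> bool:
    """A dynamically linked binary needs a host glibc at least as new as required."""
    return parse_version(required_glibc) <= parse_version(host_glibc)


# glibc versions shipped with the two Ubuntu releases discussed above.
UBUNTU_GLIBC = {"18.04": "2.27", "20.04": "2.31"}

built_on_1804_runs_on_2004 = can_run(UBUNTU_GLIBC["18.04"], UBUNTU_GLIBC["20.04"])
built_on_2004_runs_on_1804 = can_run(UBUNTU_GLIBC["20.04"], UBUNTU_GLIBC["18.04"])
print(built_on_1804_runs_on_2004, built_on_2004_runs_on_1804)
```

Comparing tuples rather than strings matters here, since a string comparison would rank "2.9" above "2.31".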

@ronaldtse
Contributor Author

Not sure why macOS build is failing here:

/__ruby_packer_memfs__/local/vendor/bundle/ruby/2.6.0/gems/ffi-1.15.4/lib/ffi/library.rb:145:in `block in ffi_lib': Could not open library '/__ruby_packer_memfs__/local/vendor/bundle/ruby/2.6.0/gems/libmspack-0.10.1/ext/x86_64-darwin/liblibmspack.bundle': dlopen(/__ruby_packer_memfs__/local/vendor/bundle/ruby/2.6.0/gems/libmspack-0.10.1/ext/x86_64-darwin/liblibmspack.bundle, 5): image not found (LoadError)
	from /__ruby_packer_memfs__/local/vendor/bundle/ruby/2.6.0/gems/ffi-1.15.4/lib/ffi/library.rb:99:in `map'
	from /__ruby_packer_memfs__/local/vendor/bundle/ruby/2.6.0/gems/ffi-1.15.4/lib/ffi/library.rb:99:in `ffi_lib'
	from /__ruby_packer_memfs__/local/vendor/bundle/ruby/2.6.0/gems/libmspack-0.10.1/lib/libmspack.rb:18:in `<module:LibMsPack>'
	from /__ruby_packer_memfs__/local/vendor/bundle/ruby/2.6.0/gems/libmspack-0.10.1/lib/libmspack.rb:16:in `<top (required)>'
	from /__ruby_packer_memfs__/local/vendor/bundle/ruby/2.6.0/gems/excavate-0.3.0/lib/excavate/extractors/cab_extractor.rb:1:in `require'
	from /__ruby_packer_memfs__/local/vendor/bundle/ruby/2.6.0/gems/excavate-0.3.0/lib/excavate/extractors/cab_extractor.rb:1:in `<top (required)>'

But hopefully we are able to migrate macOS to Tebako soon.

@CAMOBAP
Contributor

CAMOBAP commented Dec 16, 2021

@ronaldtse liblibmspack.bundle looks like the wrong name. It should be either mspack.bundle or libmspack.bundle. Maybe some logic in the ffi gem changed.

@ronaldtse
Contributor Author

Good find. I wonder why that is the case... seems that the name got an additional prefix.
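
One way a doubled prefix like this can arise is when a name that already starts with `lib` is passed through a mapping that prepends `lib` unconditionally. The sketch below only illustrates that mechanism; it is not the ffi gem's actual code, the cause in this build is not established in the thread, and both function names are hypothetical.

```python
def naive_library_file(name: str, prefix: str = "lib", suffix: str = ".bundle") -> str:
    """Always prepend the prefix, even if the name already has it."""
    return f"{prefix}{name}{suffix}"


def guarded_library_file(name: str, prefix: str = "lib", suffix: str = ".bundle") -> str:
    """Prepend the prefix only when the name does not already start with it."""
    if name.startswith(prefix):
        return f"{name}{suffix}"
    return f"{prefix}{name}{suffix}"


print(naive_library_file("libmspack"))    # doubled prefix
print(guarded_library_file("libmspack"))  # single prefix
print(guarded_library_file("mspack"))     # prefix added once
```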

@ronaldtse
Contributor Author

@maxirmx could you help resolve the merge conflicts and merge this PR? Thanks!

@maxirmx
Contributor

maxirmx commented Jan 10, 2022

@maxirmx could you help resolve the merge conflicts ...
Done

@ronaldtse
Contributor Author

Thanks @maxirmx !

Now, technically, the three remaining failures are due to metanorma/isodoc#367. This is a bug in the metanorma-iso gem, which requires the “sassc” gem.

However, it would be useful if we can package the “sassc” gem as well so that users can use custom SASS/SCSS files. Let me make that a separate issue.

@ronaldtse ronaldtse merged commit f6c14dc into main Jan 10, 2022
@ronaldtse ronaldtse deleted the maxirmx-tebako-packager branch January 10, 2022 22:46