should szip always be required, as zlib is? #1591

Closed
edwardhartnett opened this issue Jan 7, 2020 · 5 comments

Comments

@edwardhartnett
Contributor

This is related to some of the issues brought up by @lesserwhirls in Unidata/netcdf#50

The szip example is a good one to discuss, because optional szip support has always been a gap in the universality of the netCDF/HDF5 format.

We could now make the szip library mandatory (as we do the zlib library). Should we?

This would presumably require a netcdf-java implementation of this filter, if a java only netcdf reader is desired. But apparently that is already under consideration.

If this compression method were supported in all installs, that would eliminate the problem of some installs being unable to read the data. Universality would be guaranteed.

(Note that we currently have no way of enforcing our zlib requirement, see #1557, but we could enforce one for szip, which is already detected at netcdf configure time.)

@Dave-Allured
Contributor

My opinion: The extra work to require secondary dependencies is complicated and troublesome. I would prefer these policies:

  • Unidata's role for filter support libraries should be to publish minimum installed library requirements for a desired basic set of feature compatibilities. This is my closest approach to your universality. The basic feature set is a work in progress, of course.

  • It is the user's explicit responsibility to either comply with the feature set requirements, or be prepared to do without some of those features.

  • There are no required filter support libraries at build time, not szip or even zlib.

  • Filter support libraries may be added or updated at any time on a given system, to provide missing filter capabilities, bug fixes, performance upgrades, etc.

  • The netcdf library never directly knows whether any given filter support library is present at build time.

  • At run time, the netcdf library never knows or cares whether any given filter support library is present, until such time as it is invoked by some user-initiated request.

  • When invoked, a missing support library results in a meaningful error message.

  • A valid response to a missing-filter error message is to install the missing library and move on.

  • Generally speaking, leave all filter implementation details up to HDF5.

  • HDF5 build configuration is independent from any netcdf configuration requirements, with the exception of general design features incorporated for the benefit of netcdf.

  • In the modern scenario of all dynamic libraries: filter support libraries and the HDF5 and netcdf libraries should all be independently optional and replaceable, without rebuilding any dependency chains. This assumes full drop-in ABI compatibility in each case.

  • ABI breakage justifies rebuilding.

  • Minimize ABI incompatible changes.

I would like to keep the netcdf libraries and their build processes simple and sweet, within bounds of what is possible and reasonable.
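
The run-time policy in the list above (look for a filter support library only when a user asks for it, and report a meaningful error if it is absent) can be sketched as follows. This is only an illustration, not how netCDF-C or HDF5 is implemented: the `FILTER_LIBRARIES` table, `MissingFilterError`, and `require_filter` are hypothetical names, and the lookup uses Python's standard `ctypes.util.find_library` as a stand-in for whatever mechanism the real libraries use.

```python
import ctypes.util

# Hypothetical table: filter name -> short name of the shared library providing it.
FILTER_LIBRARIES = {"zlib": "z", "szip": "sz"}


class MissingFilterError(RuntimeError):
    """Raised when a requested filter's support library cannot be found."""


def require_filter(name, locate=ctypes.util.find_library):
    """Look up the support library for `name` only at the moment it is requested.

    `locate` is injectable so the behaviour can be exercised without
    depending on what happens to be installed on the current machine.
    """
    if name not in FILTER_LIBRARIES:
        raise ValueError(f"unknown filter: {name!r}")
    lib = FILTER_LIBRARIES[name]
    found = locate(lib)  # library name or path, or None if not found
    if found is None:
        raise MissingFilterError(
            f"filter '{name}' was requested but lib{lib} was not found; "
            f"install it and retry"
        )
    return found
```

Nothing is checked at import or build time here; a caller would only see `MissingFilterError` when it actually calls `require_filter("szip")` on a system without the library, which matches the "install the missing library and move on" response described above.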

@edwardhartnett
Contributor Author

zlib handling is well-established, and I don't think anyone is looking to change it. It's required and always has been. No one is proposing to take it out at this point.

The question is: is the universality of the format worth the extra trouble of always installing libsz?

Libsz is trivial to build or install from package management systems. It will quickly be added to the package manifests of netcdf, which means most installers will never even notice. Furthermore, szip is very popular with Earth scientists. It's already installed as a matter of course on most NOAA and NASA machines.

This change would not have a large impact on build system complexity, and what impact it did have would be to simplify the builds. Instead of dealing with an optionally present libsz, they could always count on it.

Whether libsz is required also has no impact on how shared libraries work. Whether or not it is required, full drop-in ABI compatibility is not changed.

So making szip a required library would not be a large burden for installers, and would not make the build systems more complex. It would ensure that all users could read szip data.

I believe this would set to rest many valid concerns about supporting szip more fully. The additional installation burden is small and seems worth paying.
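
The difference between the two policies discussed here can be illustrated with a small sketch of a configure-style check. It is not taken from the netCDF build system: `ConfigureError` and `configure` are hypothetical names, and the set of detected libraries is passed in rather than probed. With szip optional, the build carries a feature flag that downstream code must test; with szip required, the flag disappears and a missing library stops the build instead.

```python
class ConfigureError(RuntimeError):
    """Raised when a library the build requires was not detected."""


def configure(found, required=("zlib",), optional=("szip",)):
    """Return a feature flag per optional library; fail if a required one is missing.

    `found` is the set of library names detected on the system (an assumption:
    a real build system would probe for headers and libraries itself).
    """
    missing = [lib for lib in required if lib not in found]
    if missing:
        raise ConfigureError("missing required libraries: " + ", ".join(missing))
    return {lib: (lib in found) for lib in optional}


# Optional szip: the build succeeds either way and records a flag.
flags_optional = configure({"zlib"})

# Required szip: no flag remains, and a system without szip fails to configure.
flags_required = configure({"zlib", "szip"}, required=("zlib", "szip"), optional=())
```

In the optional case every reader of szip data has to consult `flags_optional["szip"]`; in the required case there is nothing to consult, at the cost of the hard failure that the replies below object to.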

@WardF
Member

WardF commented Jan 8, 2020

"Trivial" is subjective, and requiring szip adds a burden on end users who may or may not be technically proficient. It also assumes not only that szip is easily available on currently supported platforms, but also that the dev team will continue to support extant platforms and whatever new environments crop up in the future.

I'm really hesitant to add a new required dependency, particularly when the benefit to the end user community is unclear. It adds overhead for users and systems that currently aren't using libsz, and those that are will already have their systems configured for it. I'd hazard a guess that the portion of our community that is using libsz is relatively small. Requiring that those who aren't install it anyway doesn't serve a purpose that I can see, particularly since we already have a build system set up that can handle libsz as an optional install.

Short of our user community requesting this sort of feature, and making a compelling argument for it, I don't anticipate making libsz a required install.

@DennisHeimbigner
Collaborator

I never had the impression that szip was widely used in our community. Is that impression incorrect?

@edwardhartnett
Contributor Author

@DennisHeimbigner szip is used a lot at NASA and NOAA, in my experience. It's faster.

OK, it seems szip is going to remain optional. I will close this issue. Thanks all for the input.
