The NUG "Permitted Characters in NetCDF Names" section includes this sentence:
Names that have trailing space characters are also not permitted.
Some have read this (e.g., here) as implying that internal space characters are allowed.
I'm pretty sure that was not the intent. So we should clarify.
Also, just before the "trailing space characters" sentence (and in a few other places, like the nc-3 file format BNF) it says that UTF-8 encoded Unicode characters are allowed. I think we might want to be a bit more restrictive, perhaps using Unicode character categories to specify any restrictions. [The use of Unicode character categories was also suggested in this comment in the same discussion mentioned above.]
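To make the Unicode character category idea concrete, here is a minimal sketch in Python. The allow-list (letters, marks, numbers, plus connector punctuation such as underscore) is purely an illustrative assumption and not something proposed in this thread; `disallowed_chars` is a hypothetical helper name.

```python
import unicodedata

# Hypothetical allow-list, chosen only for illustration:
# major categories L (letters), M (marks), N (numbers),
# plus the exact category Pc (connector punctuation, e.g. "_").
ALLOWED_MAJOR = {"L", "M", "N"}
ALLOWED_EXACT = {"Pc"}


def disallowed_chars(name):
    """Return (character, category) pairs that fall outside the allow-list."""
    rejected = []
    for ch in name:
        cat = unicodedata.category(ch)  # two-letter general category, e.g. "Lu", "Zs"
        if cat[0] not in ALLOWED_MAJOR and cat not in ALLOWED_EXACT:
            rejected.append((ch, cat))
    return rejected


print(disallowed_chars("sea_surface_temp"))  # []
print(disallowed_chars("sea surface"))       # [(' ', 'Zs')]
```

Under this particular allow-list an internal space (category Zs) would be rejected, which is stricter than what the file format itself requires; any real restriction would need its category set agreed on first.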
@ethanrd, internal spaces are explicitly part of the allowed character set for object names in the netCDF-3 classic file format spec, in the current NUG appendix B. Space is ASCII \x20, which is included in the regular expression for internal characters. If the intent were to exclude spaces, then the first exclusion range would be \x00-\x20, not \x00-\x1F.
name = nelems namestring
// Names a dimension, variable, or attribute.
// Names should match the regular expression
// ([a-zA-Z0-9_]|{MUTF8})([^\x00-\x1F/\x7F-\xFF]|{MUTF8})*
Also, a little lower down, there are the special1 and special2 enumerated subsets of allowed internal ASCII special characters. Space character is explicit at the start of the second set. (Note, typos in several other positions cleaned up here, for clarity.)
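As a rough check of the reading above, here is a minimal Python sketch of the quoted regular expression applied to decoded strings. It assumes that any non-ASCII code point can stand in for {MUTF8}, and it layers the NUG's trailing-space rule on top as a separate check; `is_valid_classic_name` is a hypothetical helper, not a netCDF library function.

```python
import re

# First character: ASCII letter, digit, underscore, or any non-ASCII
# code point (assumed stand-in for {MUTF8}).
# Later characters: anything except ASCII control characters
# (\x00-\x1F), "/", and DEL (\x7F). Space (\x20) is therefore allowed.
_NAME_RE = re.compile(r"[a-zA-Z0-9_\u0080-\U0010FFFF][^\x00-\x1F/\x7F]*")


def is_valid_classic_name(name):
    """Sketch of the classic-format name rule plus the trailing-space rule."""
    if name.endswith(" "):
        # The NUG sentence: names with trailing spaces are not permitted.
        return False
    return _NAME_RE.fullmatch(name) is not None


print(is_valid_classic_name("sea surface temp"))  # True: internal spaces match
print(is_valid_classic_name("temp "))             # False: trailing space
print(is_valid_classic_name("a/b"))               # False: "/" is excluded
```

Changing the excluded range from \x00-\x1F to \x00-\x20 in the pattern would make the first call return False, which is the difference described above.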
@Dave-Allured - Well, interesting. I stand corrected. Thanks for your closer look. I guess the netCDF format and library are quite permissive (which makes sense now that I think on it again) and it is up to conventions like CF to place more stringent limits on the characters allowed if they so desire.
Are there advisories or guidance the NUG should give on maximizing interoperability? Maybe in the Best Practices page.