Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

api: Add global attribute imageinput:strict #4560

Merged
merged 1 commit into from
Dec 19, 2024

Conversation

lgritz
Copy link
Collaborator

@lgritz lgritz commented Dec 6, 2024

We don't do anything with this at the present time, but this PR reserves and documents this attribute for future use.

The intent is to be able to set whether we want image readers to try being as tolerant as possible when reading a file with flaws (press on and see if the rest of the file is ok?), or be more conservative and abandon reading any file as soon as a corruption or invalid data is encountered (because that might be a clue that the file is arbitrarily corrupted or even maliciously constructed).

I documented it as defaulting to 0 (err on the side of being permissive of bad input), with high-security applications being responsible for setting it to 1. But it's open for debate if people think that a better default is to be strict and let applications who want to be more tolerant be responsible for accepting the risk and switching the mode.

We don't do anything with this at the present time, but this PR
reserves and documents this attribute for future use.

The intent is to be able to set whether we want image readers to try
being as tolerant as possible when reading a file with flaws (press on
and see if the rest of the file is ok?), or be more conservative and
abandon reading any file as soon as a corruption or invalid data is
encountered (because that might be a clue that the file is arbitrarily
corrupted or even maliciously constructed).

I documented it as defaulting to 0 (err on the side of being
permissive of bad input), with high-security applications being
responsible for setting it to 1. But it's open for debate of people
think that a better default is to be strict and let applications who
want to be more tolerant be responsible for accepting the risk and
switching the mode.

Signed-off-by: Larry Gritz <[email protected]>
@lgritz
Copy link
Collaborator Author

lgritz commented Dec 16, 2024

Any opinions on this?

@EmilDohne
Copy link
Contributor

I appreciate this introduction and I think a 0 as default is perfectly valid as most of the case in non-critical applications I think people would prefer to get some usable data rather than some obscure error.

I have some overall concerns (although not in the scope of this MR) with the attributes being passed as const void* and reinterpreted based on the passed TypeDesc. I'm guessing around the time that this was written std::variant was not available but seeing as C++17 is the standard we are on now I think it would be valid to start deprecating it in favour of the more type safe std::variant.
I'd love to hear your opinion on this and would be happy to provide a std::variant based version and the other methods can then be slowly deprecated.

@lgritz
Copy link
Collaborator Author

lgritz commented Dec 18, 2024

Yes, we've already been in the process of trying to get out of the practice of passing around "unlengthed" raw pointers wherever possible. So I'm definitely interested in continuing that work, including here.

I'm not sure one way or the other whether std::variant is the right approach in this particular case, but it's definitely a contender, and if you want to prototype a variant-based way to set attributes, I'd love to see that and we can all collectively evaluate it. The three possible issues that come immediately to mind are: (1) how to handle arrays, (2) whether use of variant will incur making extra copies or allocations that we wish to avoid merely to pass the parameters, and (3) whether there could possibly be ABI incompatibilities (e.g., are we 100% sure that a library compiled with gcc and a client app compiled with clang, for example, can pass std::variant through public interfaces with guaranteed compatibility?).

One reason why we hadn't really used variant anywhere before is simply because the 3.0 release (just a month ago) is the first time that C++17 had been the minimum for public interfaces, so it just wasn't a possibility. So that also means we're a little inexperienced with its pros and cons.

@EmilDohne
Copy link
Contributor

EmilDohne commented Dec 18, 2024

I'm happy to have a look into this and provide a tentative MR to have everyone take a look at it. As for the allocation and copying part it's more than possible to construct a variant without copying (i.e. move, emplace etc.). It does have the limitations that it will use up the memory of the widest type but I figure for attributes specifically this is likely a non-issue as these look to be fairly small and not on a hot path.

As for the guarantee of public interfaces I would assume so as they are all based on the same standard but I have to be honest not yet tried it.

EDIT:

To add to this, I definitely don't think variant is the right type for everything. Where e.g. an array is passed a template would be much more fitting than a generic variant. In fact for attributes that may also hold true but I will play around with this a bit

@lgritz
Copy link
Collaborator Author

lgritz commented Dec 18, 2024

Great, I'd love to see a prototype of a std::variant based attribute() call. (As long as you don't mind that it may or may not turn out to fit the bill in the end. But I think there's no way to know without doing it.)

As an alternate design to consider, though, I wonder if it might avoid some pitfalls to use a templated attribute() call that takes a span<T>? That would definitely not have any extra allocations or copies or other unobvious overhead, has a future-proof ABI, and it handles arrays trivially.

@EmilDohne
Copy link
Contributor

After doing some initial testing I don't know if a templated approach is the right one as, unless you explicitly instantiate the template for every variation of input arguments there might be, the function can't really be implemented in the imageio.h header as it relies on the pvt:: externed variables. Perhaps a variant and/or specializations is more appropriate in this case.

I'll come back to this at a later point when I get some more time but it's definitely something to consider

@lgritz
Copy link
Collaborator Author

lgritz commented Dec 19, 2024

Well, what I was thinking about is something like:

// Concrete base case, implemented in a .cpp file and largely identical to the
// raw-pointer-based one we have now, except that its access to the caller's data
// is properly bounds checked:
bool attribute(string_view name, TypeDesc type, cspan<std::byte> data);

// Templated, inlined, type safe
template<typename T>
bool attribute(string_view name, span<T> data) {
    return attribute(name, TypeDesc(BaseTypeFromC<T>::value, data.size() > 1 ? data.size : 0), as_bytes(data));
}

@lgritz
Copy link
Collaborator Author

lgritz commented Dec 19, 2024

I'm going to merge this PR, but we should open a new issue or discussion for any further hashing out of a type- and bounds-safe attribute API.

@lgritz lgritz merged commit dcb38bd into AcademySoftwareFoundation:main Dec 19, 2024
29 checks passed
@EmilDohne
Copy link
Contributor

Well, what I was thinking about is something like:

// Concrete base case, implemented in a .cpp file and largely identical to the
// raw-pointer-based one we have now, except that its access to the caller's data
// is properly bounds checked:
bool attribute(string_view name, TypeDesc type, cspan<std::byte> data);

// Templated, inlined, type safe
template<typename T>
bool attribute(string_view name, span<T> data) {
    return attribute(name, TypeDesc(BaseTypeFromC<T>::value, data.size() > 1 ? data.size : 0), as_bytes(data));
}

That looks like a neat solution given the constraints of the current API! I'll open up a new ticket to track this and any progress I make.

@lgritz lgritz deleted the lg-strict branch December 23, 2024 06:29
lgritz added a commit to lgritz/OpenImageIO that referenced this pull request Dec 23, 2024
…ion#4560)

We don't do anything with this at the present time, but this PR reserves
and documents this attribute for future use.

The intent is to be able to set whether we want image readers to try
being as tolerant as possible when reading a file with flaws (press on
and see if the rest of the file is ok?), or be more conservative and
abandon reading any file as soon as a corruption or invalid data is
encountered (because that might be a clue that the file is arbitrarily
corrupted or even maliciously constructed).

I documented it as defaulting to 0 (err on the side of being permissive
of bad input), with high-security applications being responsible for
setting it to 1. But it's open for debate if people think that a better
default is to be strict and let applications who want to be more
tolerant be responsible for accepting the risk and switching the mode.

Signed-off-by: Larry Gritz <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants