Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pointer subtraction underflows and leads to faulty verification result #8591

Open
AmPaschal opened this issue Feb 16, 2025 · 5 comments
Open

Comments

@AmPaschal
Copy link

When using CBMC to verify a function containing pointer arithmetic, I noticed it is possible for the symbolic object a pointer points to to underflow.

When a pointer is allocated, it is assigned $dynamicObject.
Additions to this pointer is represented as $dynamicObject + X and subsequent subtractions as $dynamicObject + X - Y.
However, if we try to subtract more than we added, the resulting difference underflows and typically leads to an dynamicObject with a large offset and subsequently, incorrect pointer dereferencing errors.

For example, in the code

void harness() {

    uint8_t bufsize;

    __CPROVER_assume(bufsize > 0);

    uint8_t *data = malloc(bufsize);

    __CPROVER_assume(data != NULL);

    uint8_t *data2 = data + 5;

    uint8_t *data3 = data2 - 7;

    // If data is validly allocated, we expect that data3's address should be less than data's address
    __CPROVER_assert(data3 < data, "data3 should be less than data!");

}

data is allocated and points to $dynamicObject.
data2 points to $dynamicObject + 5.
However, we would expect data3 to be smaller than data ($dynamicObject - 2) but instead, it underflows and points to a buffer at a very large offset ($dynamicObject + X where X is a large number).

@kroening
Copy link
Member

Please note that pointer arithmetic outside of the bounds of the object is undefined behaviour.

We should really add a check for this.

@remi-delmas-3000
Copy link
Collaborator

if you use --pointer-overflow-check you'll see that data + 5 and data2 - 7 are flagged

[main.pointer_arithmetic.5] line 16 pointer arithmetic: pointer outside object bounds in data + (signed long int)5: FAILURE
[main.pointer_arithmetic.11] line 18 pointer arithmetic: pointer outside object bounds in data2 - (signed long int)7: FAILURE

driving a pointer out of bounds of its allocation is already enough to trigger UB, even without dereferencing it

@AmPaschal
Copy link
Author

Hi,

I agree that driving a pointer outside its allocated bounds is enough to trigger UB and the pointer arithmetic error reports are good.

This code snippet was gotten from an embedded software and in many of these software, developer add a validation that a pointer is still within valid bound before attempting to dereference or access it. In these cases, CBMC will report a pointer dereferencing error or an OOB read/write error even when there is code that validates that the resulting pointer is still valid.

This leads to the question I asked in a different issue --

To what extent should CBMC follow the standard and meet developer expectations without conflicting with each other?

The problem with not meeting developer expectations is that CBMC reports memory access errors that are considered false positive by the developers (because existing validation already prevents invalid pointers from getting accessed).

@remi-delmas-3000
Copy link
Collaborator

remi-delmas-3000 commented Mar 5, 2025

Hi, internally CBMC models a pointer as a pair (allocation_id, offset) where the offset value is unsigned (cf #6893). Since driving a pointer before the start of an allocation triggers UB the internal representation does not even model such pointers.

This code snippet was gotten from an embedded software and in many of these software, developer add a validation that a pointer is still within valid bound before attempting to dereference or access it. In these cases, CBMC will report a pointer dereferencing error or an OOB read/write error even when there is code that validates that the resulting pointer is still valid.

The underlying problem is that this code is only "valid" if always executed on pointers that are not out of bounds to begin with, which defeats the purpose. If CBMC reports that OOB pointers are involved in the comparison, it means that UB happened before in the execution and program semantics are already gone.

Things can get way worse, since the compiler assumes that your code does not trigger UB in order to perform optimizations like dead code elimination.

For instance, if you used that code to perform bounds checks:

char *start = ... // pointer to the start of an allocation
char *end = ... // pointer to the end of an allocation

char *ptr = ...  // some pointer that is between start and end

size_t offset = nondet(); // arbitrarily offset

ptr += offset; // can potentially overflow the whole address space and wrap around, or underflow

if (ptr < start)
   return error;

if (ptr >= end)
   return error;

// all checks passed, use the pointer
*ptr = 12; // OOB access error !

UB-based optimization will cause the if(ptr < start) return error; statement to be removed from your program. Since it is only reachable after either a wrap-around overflow for large values of offset or triggering UB via a negative offset, and the compiler assumes your program does not trigger UB, the compiler can legitimately think that this code is unreachable and safe to delete.

For if (ptr >= end) return error; the reasoning is more subtle: the statement return error; is actually reachable without triggering UB when ptr == end so it is not erased by dead code elimination. However when ptr is driven strictly beyond end we have UB, so if that check mistakenly used strict comparison if (ptr > end) return error;, the return statement would not be reachable without triggering UB, and since UB is assumed to never occur in the program (under the developer's control) this would be considered as dead code and would potentially be removed.

This is a good source about that type of problems:

Performing pointer bounds checks without triggering UB is possible by keeping around start/end pointers for the allocation and by using pointer differences to estimate the offset and headroom of a given pointer, and refrain from adding or subtracting from that pointer if it would cause it to go out of bounds.

The second and really not recommended way to avoid CBMC flagging these cases is to add pragmas around the bad instruction (but you really should refrain from doing that):

#pragma CPROVER check push
#pragma CPROVER check disable "pointer-overflow"
 .... // pointer overflow that you don't want to see
#pragma CPROVER check pop

@AmPaschal
Copy link
Author

Hi @remi-delmas-3000
Thank you for this detailed explanation and the extra sources you shared.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants