Undefined behavior in circular buffer #53

@vasyl-protsiv

The circular buffer implementation based on page mapping, mentioned in the recent video (which is a great video, btw), behaves inconsistently at different optimization levels, most likely because of undefined behavior. The probable cause is the aliasing assumptions that modern compilers exploit aggressively when optimizing. The following code uses the circular buffer defined in perfaware/part3/listing_0121_circular_buffer_main.cpp:

int main(void)
{
    printf("Circular buffer test:\n");
    
    const size_t BUF_SIZE = 64 * 4096;

    circular_buffer Circular = AllocateCircularBuffer(BUF_SIZE, 3);
    
    if(IsValid(Circular))
    {
        u8 *Data = Circular.Base.Data + BUF_SIZE;

        Data[0] = 1;
        Data[BUF_SIZE] = 2;

        printf("%u\n", Data[0]);

        DeallocateCircularBuffer(&Circular);
    }
    else
    {
        printf("  FAILED\n");
    }
    
    // NOTE(casey): Since we do not use these functions in this particular build, we reference their pointers
    // here to prevent the compiler from complaining about "unused functions".
    (void)&IsInBounds;
    (void)&AreEqual;
    (void)&AllocateBuffer;
    (void)&FreeBuffer;
    
    return 0;
}

With optimizations off (cl /Od, g++ -O0, clang++ -O0), this code prints the expected result on every compiler:

Circular buffer test:
2

But it gives the following output when optimizations are on (cl /O2, g++ -O2, clang++ -O2):

Circular buffer test:
1

It seems that the compilers assume writing to Data[BUF_SIZE] cannot possibly affect the value of Data[0], so they feel free to pass the known value of Data[0] directly to printf instead of reloading it.
Here is the assembly generated with g++ -O2 (g++ version 13.1, mingw-w64):

   140007eba:   c6 80 00 00 04 00 01    mov    BYTE PTR [rax+0x40000],0x1   ; write 1 to Data[0]
   140007ec1:   48 8d 0d 8b 21 00 00    lea    rcx,[rip+0x218b]
   140007ec8:   ba 01 00 00 00          mov    edx,0x1                      ; put 1 directly into printf args
   140007ecd:   c6 80 00 00 08 00 02    mov    BYTE PTR [rax+0x80000],0x2   ; write 2 to Data[BUF_SIZE]
   140007ed4:   e8 f7 fd ff ff          call   140007cd0 <_Z6printfPKcz>    ; call printf

And here is the assembly generated with g++ -O0:

   140001aec:   c6 00 01                mov    BYTE PTR [rax],0x1           ; write 1 to Data[0]
   140001aef:   48 8b 45 f0             mov    rax,QWORD PTR [rbp-0x10]
   140001af3:   48 05 00 00 04 00       add    rax,0x40000
   140001af9:   c6 00 02                mov    BYTE PTR [rax],0x2           ; write 2 to Data[BUF_SIZE]
   140001afc:   48 8b 45 f0             mov    rax,QWORD PTR [rbp-0x10]
   140001b00:   0f b6 00                movzx  eax,BYTE PTR [rax]           ; read Data[0] again
   140001b03:   0f b6 c0                movzx  eax,al
   140001b06:   89 c2                   mov    edx,eax                      ; put the value of Data[0] into printf args
   140001b08:   48 8d 05 6b 85 00 00    lea    rax,[rip+0x856b]
   140001b0f:   48 89 c1                mov    rcx,rax
   140001b12:   e8 39 68 00 00          call   140008350 <_Z6printfPKcz>    ; call printf
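
Read together, the two listings suggest that the -O2 build behaves as if the load of Data[0] had been replaced by the constant that was just stored. A rough source-level sketch of that transformation follows. `PrintedValueAsIfOptimized` is a hypothetical name used only for illustration, and the real optimizer works on its internal representation rather than on source like this.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

typedef uint8_t u8;

// Hypothetical illustration, not code from the listing: what the -O2 build
// effectively computes as the printf argument.
static unsigned PrintedValueAsIfOptimized(u8 *Data, size_t Offset)
{
    assert(Data != nullptr);

    Data[0] = 1;
    unsigned Forwarded = 1;   // the compiler reuses the value it just stored
    Data[Offset] = 2;         // assumed not to touch Data[0]
    return Forwarded;         // Data[0] is never reloaded from memory
}
```

Calling it with an offset of 0 stands in for the aliased mapping, since both stores then hit the same byte: the function still returns 1 even though memory holds 2.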

Sorry if this is not the right place to discuss this, but YouTube comments are disabled and Computerenhance comments are for subscribers only. I believe it should be mentioned somewhere that this kind of circular buffer is not really safe to use with modern compilers unless someone figures out how to reliably tell the compiler that this kind of page manipulation is involved.
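
One candidate, offered as a sketch rather than a verified fix: perform the accesses through volatile-qualified lvalues, which the compiler has to execute as written and cannot satisfy from a previously stored value. The helper names below are hypothetical and not part of the listing. Whether this makes the mapped buffer fully well-defined is an assumption, and it gives up optimizations on every access routed through the helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

typedef uint8_t u8;

// Hypothetical helpers: each call performs exactly one real store or load.
static inline void VolatileStore(u8 *Address, u8 Value)
{
    *(volatile u8 *)Address = Value;
}

static inline u8 VolatileLoad(const u8 *Address)
{
    return *(const volatile u8 *)Address;
}

// Mirrors the sequence from the issue: write 1, write 2 at Data + Stride,
// then read Data[0] back from memory instead of reusing the stored value.
static u8 WriteThenReadBack(u8 *Data, size_t Stride)
{
    assert(Data != nullptr);

    VolatileStore(Data, 1);
    VolatileStore(Data + Stride, 2);
    return VolatileLoad(Data);
}
```

With the circular buffer from the issue, the call would be `WriteThenReadBack(Circular.Base.Data + BUF_SIZE, BUF_SIZE)`. On an ordinary array, a stride of 0 emulates the aliasing (both stores hit the same byte) and the function returns 2, while a nonzero stride returns 1.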
