Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failure when compressing no data #57

Open
BenBE opened this issue May 5, 2020 · 2 comments
Open

Failure when compressing no data #57

BenBE opened this issue May 5, 2020 · 2 comments
Labels
Milestone

Comments

@BenBE
Copy link

BenBE commented May 5, 2020

When calling heatshrink_encoder_finish directly after heatshrink_encoder_reset the subsequent call to heatshrink_encoder_poll indicated by the return value of the heatshrink_encoder_finish call fails with HSER_ERROR_MISUSE. A more robust API should instead either return HSER_FINISH_DONE in heatshrink_encoder_finish directly (there's no pending data to be written) or detect this situation in heatshrink_encoder_poll and return without writing any output to the buffer.

To demonstrate the issue:

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "heatshrink_encoder.h"

__attribute__((nonnull))
static bool run_compress(const char* buf, size_t bufsize)
{
    size_t destsize = 256;
    char* dest = calloc(destsize, sizeof(char));
  
    uint8_t *ptr = (uint8_t *)buf;
    uint8_t *dst = (uint8_t *)dest;

    HSE_poll_res rsepr;

    heatshrink_encoder hse = {0};
    heatshrink_encoder_reset(&hse);

    size_t processed = 0;

    while(bufsize) {
        switch(heatshrink_encoder_sink(&hse, ptr, bufsize, &processed)) {
        case HSER_SINK_OK:
            break;
        case HSER_SINK_ERROR_NULL:
        case HSER_SINK_ERROR_MISUSE:
        default:
            return false;
        }

        ptr += processed;
        bufsize -= processed;

pollagain:
        switch(heatshrink_encoder_poll(&hse, dst, destsize, &processed)) {
        case HSER_POLL_MORE:
            dst += processed;
            destsize -= processed;
            goto pollagain;
        case HSER_POLL_EMPTY:
            dst += processed;
            destsize -= processed;
            break;
        case HSER_POLL_ERROR_NULL:
        case HSER_POLL_ERROR_MISUSE:
        default:
            return false;
        }
    }

    heatshrink_encoder_finish(&hse);

drainagain:
    switch(rsepr = heatshrink_encoder_poll(&hse, dst, destsize, &processed)) {
    case HSER_POLL_MORE:
        dst += processed;
        destsize -= processed;
        goto drainagain;
    case HSER_POLL_EMPTY:
        dst += processed;
        destsize -= processed;
        break;
    case HSER_POLL_ERROR_NULL:
    case HSER_POLL_ERROR_MISUSE:
    default:
        free(dest);
        printf("Failed encoding with %zu bytes remaining at %zu bytes done (%d).\n", bufsize, destsize, rsepr);
        return false;
    }

    destsize = (size_t)((char*)dst - dest);

    free(dest);

    printf("Successfully encoded into %zu bytes.\n", destsize);

    return true;
}

int main (int argc, char** argv)
{
    (void)argc;
    (void)argv;

    const char data[] = "SAMPLEDATA";

    bool overall = true;

    // Succeeds
    overall &= run_compress(data, strlen(data));

    // Fails
    overall &= run_compress(data, 0);
    
    return overall ? 0 : 1;
}

Expected output:

Successfully encoded into 12 bytes.
Successfully encoded into 0 bytes.

Actual output:

Successfully encoded into 12 bytes.
Failed encoding with 0 bytes remainung at 0 bytes done (-2).

The original code I used for compression (when I noticed the underlaying issue) is like this:

typedef struct compression_state {
    bool active;
    message_t *msg;
    size_t parts;
    size_t bufsize;
    size_t bufused;
    char *buffer;
    heatshrink_encoder *encoder;
} compression_state_t;

static char compress_buffer[1536] = { 0 };

static heatshrink_encoder compress_encoder = { 0 };

static compression_state_t compress_state = {
    .active = false,
    .msg = NULL,
    .parts = 0,
    .bufsize = sizeof(compress_buffer),
    .bufused = 0,
    .buffer = &compress_buffer[0],
    .encoder = &compress_encoder
};

bool compress_init(message_t *msg)
{
    if(compress_state.active) {
        // Trying to start compressor while another one already active
        return false;
    }

    compress_state.msg = msg;
    compress_state.parts = 25;
    compress_state.bufused = 0;

    memset(compress_state.buffer, 0, compress_state.bufsize);

    heatshrink_encoder_reset(compress_state.encoder);

    compress_state.active = true;

    return true;
}

static void compress_crank()
{
    HSE_poll_res res;
    do {
        size_t processed = 0;

        res = heatshrink_encoder_poll(
            compress_state.encoder,
            (uint8_t *)(compress_state.buffer + compress_state.bufused),
            compress_state.bufsize - compress_state.bufused,
            &processed);

        if(HSER_POLL_MORE != res && HSER_POLL_EMPTY != res) {
            break;
        }

        compress_state.bufused += processed;

        if(compress_state.bufused == compress_state.bufsize || !processed) {
            process_buf(compress_state, compress_state.buffer, compress_state.bufused);

            compress_state.parts--;
            compress_state.bufused = 0;
        }
    } while(HSER_POLL_MORE == res);
}

bool compress_push(char* buf, size_t bufsize)
{
    if(!compress_state.active) {
        // Trying to compress data while no compressor active
        return false;
    }

    while(bufsize) {
        size_t processed = 0;

        HSE_sink_res res = heatshrink_encoder_sink(compress_state.encoder, (uint8_t *)buf, bufsize, &processed);
        if(res != HSER_SINK_OK) {
            return false;
        }

        if(processed > bufsize) {
            return false;
        }

        bufsize -= processed;
        buf += processed;

        compress_crank();
    }

    return !!compress_state.parts;
}

bool compress_finish()
{
    if(!compress_state.active) {
        // Trying to finalize compression while no compressor active
        return false;
    }

    HSE_finish_res res = HSER_FINISH_DONE;
    do {
        res = heatshrink_encoder_finish(compress_state.encoder);

        compress_crank();
    } while(HSER_FINISH_MORE == res);

    if(compress_state.bufused) {
        process_buf(compress_state, compress_state.buffer, compress_state.bufused);
        compress_state.bufused = 0;
    }

    return HSER_FINISH_DONE == res;
}

Which ran into a buffer overrun when no call to compress_init and compress_finish took place due to this unexpected behaviour of the heatshrink API.

@silentbicycle silentbicycle added this to the v0.5.0 milestone May 20, 2021
@BenBE
Copy link
Author

BenBE commented May 20, 2021

As discussed, here's some more detail plus the proposed patch:

$ git diff
diff --git a/src/heatshrink_encoder.c b/src/heatshrink_encoder.c
index 59088e48..ba2e4a39 100644
--- a/src/heatshrink_encoder.c
+++ b/src/heatshrink_encoder.c
@@ -254,7 +254,7 @@ HSE_finish_res heatshrink_encoder_finish(heatshrink_encoder *hse) {
     if (hse == NULL) { return HSER_FINISH_ERROR_NULL; }
     LOG("-- setting is_finishing flag\n");
     hse->flags |= FLAG_IS_FINISHING;
-    if (hse->state == HSES_NOT_FULL) { hse->state = HSES_FILLED; }
+    if (hse->state == HSES_NOT_FULL) { hse->state = hse->input_size ? HSES_FILLED : HSES_DONE; }
     return hse->state == HSES_DONE ? HSER_FINISH_DONE : HSER_FINISH_MORE;
 }
 

Also just tested this again with the demo encoder/decoder tool and this easily reproduces the issue too:

touch emptyfile.txt
./heatshrink -e emptyfile.txt emptyfile.hs.bin

When compiling this with -fsanitize=address and active tracing you get the following (shortened):

-- allocated encoder with buffer size of 4096 (2048 byte input size)
@ read 2048
-- polling, state 0 (not_full), flags 0x00
@ sink 0
-- setting is_finishing flag
@ drop 0
@ read 2048
-- polling, state 1 (filled), flags 0x01
-- polling, state 2 (search), flags 0x01
## step_search, scan @ +0 (0/4096), input size 0
-- scanning for match of buf[2048:2048] between buf[0:2047] (max 0 bytes)
-- none found
ss Match not found
-- polling, state 3 (yield_tag_bit), flags 0x01
-- adding tag bit: 1
++ push_bits: 1 bits, input of 0x01
-- polling, state 4 (yield_literal), flags 0x01
-- yielded literal byte 0x00 ('.') from +2048
++ push_bits: 8 bits, input of 0x00
 > pushing byte 0x80
-- polling, state 2 (search), flags 0x01
## step_search, scan @ +1 (1/4096), input size 0
-- scanning for match of buf[2049:2065] between buf[1:2064] (max 16 bytes)
-- none found
ss Match not found
-- polling, state 3 (yield_tag_bit), flags 0x01
-- adding tag bit: 1
++ push_bits: 1 bits, input of 0x01
-- polling, state 4 (yield_literal), flags 0x01
-- yielded literal byte 0x00 ('.') from +2049
++ push_bits: 8 bits, input of 0x00
 > pushing byte 0x40
-- polling, state 2 (search), flags 0x01
## step_search, scan @ +2 (2/4096), input size 0
-- scanning for match of buf[2050:2066] between buf[2:2065] (max 16 bytes)
-- none found
ss Match not found
-- polling, state 3 (yield_tag_bit), flags 0x01
-- adding tag bit: 1
++ push_bits: 1 bits, input of 0x01
-- polling, state 4 (yield_literal), flags 0x01
-- yielded literal byte 0x00 ('.') from +2050
++ push_bits: 8 bits, input of 0x00
 > pushing byte 0x20
-- polling, state 2 (search), flags 0x01
## step_search, scan @ +3 (3/4096), input size 0
-- scanning for match of buf[2051:2067] between buf[3:2066] (max 16 bytes)
-- none found
ss Match not found
-- polling, state 3 (yield_tag_bit), flags 0x01
-- adding tag bit: 1
++ push_bits: 1 bits, input of 0x01
-- polling, state 4 (yield_literal), flags 0x01
-- yielded literal byte 0x00 ('.') from +2051
++ push_bits: 8 bits, input of 0x00
 > pushing byte 0x10
-- polling, state 2 (search), flags 0x01
## step_search, scan @ +4 (4/4096), input size 0
-- scanning for match of buf[2052:2068] between buf[4:2067] (max 16 bytes)
-- none found
ss Match not found
-- polling, state 3 (yield_tag_bit), flags 0x01
-- adding tag bit: 1
++ push_bits: 1 bits, input of 0x01
-- polling, state 4 (yield_literal), flags 0x01
-- yielded literal byte 0x00 ('.') from +2052
++ push_bits: 8 bits, input of 0x00
 > pushing byte 0x08
-- polling, state 2 (search), flags 0x01
## step_search, scan @ +5 (5/4096), input size 0
-- scanning for match of buf[2053:2069] between buf[5:2068] (max 16 bytes)
-- none found
ss Match not found
-- polling, state 3 (yield_tag_bit), flags 0x01
-- adding tag bit: 1
++ push_bits: 1 bits, input of 0x01
-- polling, state 4 (yield_literal), flags 0x01
-- yielded literal byte 0x00 ('.') from +2053
++ push_bits: 8 bits, input of 0x00
 > pushing byte 0x04
-- polling, state 2 (search), flags 0x01
## step_search, scan @ +6 (6/4096), input size 0
-- scanning for match of buf[2054:2070] between buf[6:2069] (max 16 bytes)
-- none found
ss Match not found
-- polling, state 3 (yield_tag_bit), flags 0x01
-- adding tag bit: 1
++ push_bits: 1 bits, input of 0x01
-- polling, state 4 (yield_literal), flags 0x01
-- yielded literal byte 0x00 ('.') from +2054
++ push_bits: 8 bits, input of 0x00
 > pushing byte 0x02
-- polling, state 2 (search), flags 0x01

… -->snip<-- …

-- polling, state 2 (search), flags 0x01
## step_search, scan @ +2044 (2044/4096), input size 0
-- scanning for match of buf[4092:4108] between buf[2044:4107] (max 16 bytes)
-- none found
ss Match not found
-- polling, state 3 (yield_tag_bit), flags 0x01
-- adding tag bit: 1
++ push_bits: 1 bits, input of 0x01
-- polling, state 4 (yield_literal), flags 0x01
-- yielded literal byte 0x00 ('.') from +4092
++ push_bits: 8 bits, input of 0x00
 > pushing byte 0x08
-- polling, state 2 (search), flags 0x01
## step_search, scan @ +2045 (2045/4096), input size 0
-- scanning for match of buf[4093:4109] between buf[2045:4108] (max 16 bytes)
-- none found
ss Match not found
-- polling, state 3 (yield_tag_bit), flags 0x01
-- adding tag bit: 1
++ push_bits: 1 bits, input of 0x01
-- polling, state 4 (yield_literal), flags 0x01
-- yielded literal byte 0x00 ('.') from +4093
++ push_bits: 8 bits, input of 0x00
 > pushing byte 0x04
-- polling, state 2 (search), flags 0x01
## step_search, scan @ +2046 (2046/4096), input size 0
-- scanning for match of buf[4094:4110] between buf[2046:4109] (max 16 bytes)
-- none found
ss Match not found
-- polling, state 3 (yield_tag_bit), flags 0x01
-- adding tag bit: 1
++ push_bits: 1 bits, input of 0x01
-- polling, state 4 (yield_literal), flags 0x01
-- yielded literal byte 0x00 ('.') from +4094
++ push_bits: 8 bits, input of 0x00
 > pushing byte 0x02
-- polling, state 2 (search), flags 0x01
## step_search, scan @ +2047 (2047/4096), input size 0
-- scanning for match of buf[4095:4111] between buf[2047:4110] (max 16 bytes)
-- none found
ss Match not found
-- polling, state 3 (yield_tag_bit), flags 0x01
-- adding tag bit: 1
++ push_bits: 1 bits, input of 0x01
 > pushing byte 0x01
-- polling, state 4 (yield_literal), flags 0x01
-- yielded literal byte 0x00 ('.') from +4095
++ push_bits: 8 bits, input of 0x00
-- polling, state 2 (search), flags 0x01
## step_search, scan @ +2048 (2048/4096), input size 0
-- scanning for match of buf[4096:4112] between buf[2048:4111] (max 16 bytes)
=================================================================
==3377913==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x625000002108 at pc 0x55d77f668f19 bp 0x7ffc51d61250 sp 0x7ffc51d61240
READ of size 2 at 0x625000002108 thread T0
    #0 0x55d77f668f18 in find_longest_match /home/user/heatshrink/heatshrink_encoder.c:505
    #1 0x55d77f669a9c in st_step_search /home/user/heatshrink/heatshrink_encoder.c:305
    #2 0x55d77f66ca18 in heatshrink_encoder_poll /home/user/heatshrink/heatshrink_encoder.c:231
    #3 0x55d77f672972 in encoder_sink_read /home/user/heatshrink/heatshrink.c:267
    #4 0x55d77f67331e in encode /home/user/heatshrink/heatshrink.c:300
    #5 0x55d77f6742de in main /home/user/heatshrink/heatshrink.c:471
    #6 0x7f1606c190b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)
    #7 0x55d77f6685ed in _start (/home/user/heatshrink/hs+0x165ed)

0x625000002108 is located 0 bytes to the right of 8200-byte region [0x625000000100,0x625000002108)
allocated by thread T0 here:
    #0 0x7f160785ebc8 in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x10dbc8)
    #1 0x55d77f66bec9 in heatshrink_encoder_alloc /home/user/heatshrink/heatshrink_encoder.c:103

SUMMARY: AddressSanitizer: heap-buffer-overflow /home/user/heatshrink/heatshrink_encoder.c:505 in find_longest_match
Shadow bytes around the buggy address:
  0x0c4a7fff83d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c4a7fff83e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c4a7fff83f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c4a7fff8400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c4a7fff8410: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c4a7fff8420: 00[fa]fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c4a7fff8430: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c4a7fff8440: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c4a7fff8450: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c4a7fff8460: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c4a7fff8470: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==3377913==ABORTING

An alternatvie patch might be adding the hse->input_size check to case HSES_FILLED switching to hse->state = HSES_DONE; on empty input buffer.

I hope these information help with further assessment of the situation.

@silentbicycle silentbicycle modified the milestones: v0.5.0, 0.4.2 Jun 8, 2021
@silentbicycle
Copy link
Collaborator

This is going to go in a buxfix point release in the next few weeks, since the new features for 0.5.0 might take longer. Thanks for the detailed report.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants