Issue with decodeRunRepeated #127
Description
Hi,
I think there is an issue here:
Line 114 in 07fb2fd
value << 8;
I worked on a parquet file where decodeRunRepeated was basically supposed to convert a [18, 1] buffer into 274 as the repeated value but yielded 19 instead.
[18,1] is supposed to be interpreted as 18 * 2^(8 * 0) + 1 * 2^(8 * 1) = 18 + 256 = 274, which would lead to something like this:
value += (cursor.buffer[cursor.offset] << 8*i)
The current code yields the correct result if there is only one byte needed: [18, 0] yields 18 which is expected.
The issue is only visible if the parquet file has repeated values of 256 or more, as those values need more than one byte to be encoded, and the current code yields incorrect results for them.
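To make the difference concrete, here is a minimal standalone sketch, not the library's actual code. The function names `readWithNoOpShift` and `readLittleEndian` and the `byteCount` parameter are hypothetical; only the `cursor.buffer` / `cursor.offset` shape is taken from the snippet above. It uses multiplication instead of `<<` so that a fourth byte of 128 or more does not wrap into a negative 32-bit value, which is a choice of this sketch rather than something the original code does.

```javascript
// Hypothetical reproduction of the reported behaviour: the shift result
// is discarded, so the bytes are simply summed.
function readWithNoOpShift(cursor, byteCount) {
  let value = 0;
  for (let i = 0; i < byteCount; ++i) {
    value << 8; // expression statement: computes a result and throws it away
    value += cursor.buffer[cursor.offset];
    cursor.offset += 1;
  }
  return value;
}

// Hypothetical fix: byte i contributes buffer[offset] * 2^(8 * i),
// i.e. the bytes are read as a little-endian unsigned integer.
function readLittleEndian(cursor, byteCount) {
  let value = 0;
  for (let i = 0; i < byteCount; ++i) {
    value += cursor.buffer[cursor.offset] * Math.pow(2, 8 * i);
    cursor.offset += 1;
  }
  return value;
}

const bytes = new Uint8Array([18, 1]);
console.log(readWithNoOpShift({ buffer: bytes, offset: 0 }, 2)); // 19
console.log(readLittleEndian({ buffer: bytes, offset: 0 }, 2));  // 274
```

With a single significant byte, such as [18, 0], both functions return 18, which matches why the problem only shows up for larger values.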
I think value << 8 without an assignment has no effect, since the shifted result is discarded. There might be a similar problem in the encoding function, but I haven't used it so far:
Line 26 in 07fb2fd
value >> 8;
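For the encoding side, a similar sketch of what a working version might look like, assuming the value should be written as little-endian bytes to mirror the decoding described above. `writeLittleEndian` and `byteCount` are hypothetical names, not the library's API, and the sketch assumes the value is a non-negative integer that fits in 32 bits.

```javascript
// Hypothetical helper: splits `value` into `byteCount` little-endian bytes.
// The key point is that the shifted result is assigned back.
function writeLittleEndian(value, byteCount) {
  const bytes = new Uint8Array(byteCount);
  let remaining = value;
  for (let i = 0; i < byteCount; ++i) {
    bytes[i] = remaining & 0xff;  // lowest 8 bits go into byte i
    remaining = remaining >>> 8;  // assignment, so the next byte is exposed
  }
  return bytes;
}

console.log(Array.from(writeLittleEndian(274, 2))); // [ 18, 1 ]
```

Without the assignment, every iteration would write the same low byte, so 274 would come out as [18, 18] instead of [18, 1].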