before serialization, kaitai was only about decoding from bytes to values
with serialization, kaitai can also do encoding from values to bytes
so far, this encoding works for simple values, but fails for "complex" values
example: vlq_base128_be.ksy
seq:
  - id: groups
    type: group
    repeat: until
    repeat-until: not _.has_next
types:
  group:
    seq:
      - id: has_next
        type: b1
      - id: value
        type: b7
instances:
  last:
    value: groups.size - 1
  value:
    value: |
      (groups[last].value
      + (last >= 1 ? (groups[last - 1].value << 7) : 0)
      + (last >= 2 ? (groups[last - 2].value << 14) : 0)
      + (last >= 3 ? (groups[last - 3].value << 21) : 0)
      + (last >= 4 ? (groups[last - 4].value << 28) : 0)
      + (last >= 5 ? (groups[last - 5].value << 35) : 0)
      + (last >= 6 ? (groups[last - 6].value << 42) : 0)
      + (last >= 7 ? (groups[last - 7].value << 49) : 0)).as<u8>
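for reference, a minimal python sketch of what the value instance computes. the helper names parse_groups and vlq_be_value are made up for illustration and are not part of the generated code. unlike the .ksy expression, which stops at 8 groups, this loop accepts any number of groups.

```python
def parse_groups(data):
    """Split bytes into (has_next, value) pairs, like the `groups` seq field."""
    groups = []
    for byte in data:
        has_next = bool(byte >> 7)       # top bit: continuation flag (b1)
        groups.append((has_next, byte & 0x7F))  # low 7 bits: payload (b7)
        if not has_next:                 # repeat-until: not _.has_next
            break
    return groups


def vlq_be_value(groups):
    """Combine the 7-bit payloads, most significant group first."""
    result = 0
    for _has_next, value in groups:
        result = (result << 7) | value
    return result


print(vlq_be_value(parse_groups(b"\xff\x7f")))  # 16383, i.e. 2**14 - 1
```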
in python, (naively) setting value fails with
$ kaitai-struct-compiler kaitai_struct_formats/common/vlq_base128_be.ksy --target python --read-write --no-auto-read
$ python
>>> import vlq_base128_be
>>> i = vlq_base128_be.VlqBase128Be()
>>> i.value = 123
AttributeError: property 'value' of 'VlqBase128Be' object has no setter
... because instances are always read-only
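the error message suggests the generated class exposes value as a python property without a setter. the same behavior can be reproduced with a small stand-in class (ReadOnlyDemo is hypothetical, not the generated VlqBase128Be):

```python
class ReadOnlyDemo:
    """Hypothetical stand-in for a class with a getter-only property."""

    def __init__(self):
        self._value = 0

    @property
    def value(self):
        # Only a getter is defined, so assignment is rejected.
        return self._value


demo = ReadOnlyDemo()
error = None
try:
    demo.value = 123
except AttributeError as exc:
    error = exc

print(type(error).__name__)  # AttributeError
```

the exact wording of the message depends on the python version, so only the exception type is printed here.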
workaround:
# TODO encode int(2**14-1) to vlq_bytes
vlq_bytes = b"\xff\x7f"
i = vlq_base128_be.VlqBase128Be.from_bytes(vlq_bytes)
assert i.value == 2**14-1
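the TODO in the workaround could be filled in with a hand-written encoder. this is only a sketch: encode_vlq_base128_be is a hypothetical helper, not something kaitai generates, and it assumes a non-negative integer input.

```python
def encode_vlq_base128_be(value):
    """Encode a non-negative int as big-endian base-128 VLQ bytes."""
    if value < 0:
        raise ValueError("value must be non-negative")
    # The last group carries the low 7 bits with the continuation bit clear.
    groups = [value & 0x7F]
    value >>= 7
    # Every earlier group has the continuation bit (0x80) set.
    while value:
        groups.append((value & 0x7F) | 0x80)
        value >>= 7
    # Groups were collected least significant first; emit most significant first.
    return bytes(reversed(groups))


vlq_bytes = encode_vlq_base128_be(2**14 - 1)
print(vlq_bytes)  # b'\xff\x7f'
```

the result can then be passed to VlqBase128Be.from_bytes as in the workaround above.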
possible solution:
i = vlq_base128_be.VlqBase128Be.from_value(2**14-1)
assert i.value == 2**14-1
in the .ksy file, the from_value constructor could be declared like
constructors:
  from_value:
    inputs:
      - value
    outputs:
      bytes: |
        value < 2**7 ? [value] :
        value < 2**14 ? [
          (value >> 7) | 2**7,
          value & (2**7 - 1)
        ] :
        # ...
>>> value = 2**14-1
>>> list(map(lambda b: bin(b).split("b")[1].zfill(8), [(value >> 7) | 2**7, value & (2**7 - 1)]))
['11111111', '01111111']
see also pyvlq.encode
the outputs dict can hold temporary variables needed to produce the final output bytes
every byte is an integer from 0 to 255
todo: cache bytes for serialization. this is faster than deriving bytes from seq values
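a rough sketch of the caching idea, assuming the bytes are computed once in from_value and simply returned on serialization. CachedVlq and its to_bytes method are hypothetical and do not correspond to anything in the kaitai runtime or the generated code.

```python
class CachedVlq:
    """Hypothetical wrapper that keeps the encoded bytes next to the value."""

    def __init__(self, value, raw):
        self.value = value
        self._raw = raw  # cached encoding, computed once

    @classmethod
    def from_value(cls, value):
        if value < 0:
            raise ValueError("value must be non-negative")
        out = bytearray([value & 0x7F])      # last group, no continuation bit
        rest = value >> 7
        while rest:
            out.insert(0, (rest & 0x7F) | 0x80)  # earlier groups, bit 7 set
            rest >>= 7
        return cls(value, bytes(out))

    def to_bytes(self):
        # Serialization returns the cached bytes instead of re-deriving them.
        return self._raw


v = CachedVlq.from_value(2**14 - 1)
print(v.to_bytes())  # b'\xff\x7f'
```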
keywords:
- inverse of instances
- opposite of instances
- reverse instances
- builtin types vs user-defined types