SAT_DataLib : specifications

J.F. Omhover edited this page Mar 2, 2014 · 1 revision

This wiki page presents the specification of the SAT DataLib binary data storage model. This document is still under construction!

The general purpose of SAT_DataLib is to provide a framework for encoding and storing binary data in ArduSat experiments, together with the specification needed to decode that data once it has been retrieved on Earth. Binary data is used to reduce storage consumption: plain text such as CSV would require more space than raw binary values.

This library (and the corresponding data storage framework) has been designed with the following capabilities:

  • store different kinds of data in a single file: the point is to store not just one kind of message but several, so that the different messages can be decoded one after the other. For instance, depending on your experiment, you might want to store raw sensor values periodically, then some computed values of your own, then a series of values or a text message. You have to be able to do all of that in one file.
  • consume as little storage as possible
  • allow extensible structures rather than fixed-size messages

The storage framework structures your data as messages, called "Packets" below. Several packet types are provided, each with a specific purpose (storing raw sensor values, data series, text messages...). Each packet starts with a header that characterizes its content, format and length. The length of the packet can be computed exactly from the content of the packet's header.

1. Packet General Structure

All packets share the same generic structure. They begin with a HEADER CODE (1 byte), which defines the packet's type. Each packet type then has its own HEADER CONTENT, which typically holds parameter values describing the structure of the BODY that follows. Then comes the BODY of the packet, whose structure depends on the packet type and on the parameters stored in the header content.

| Address | 0 | 1 | (HEADER_SIZE) |
| :-- | :-- | :-- | :-- |
| Content | HEADER_CODE | HEADER_CONTENT | BODY |

Depending on the HEADER_CODE, the following table specifies the kind of content the packet is designed to carry, and the size of the header.

| HEADER_CODE | PACKET NAME, HEADER CONTENT | HEADER_SIZE(*) |
| :-- | :-- | :-- |
| 0x23 ['#'] | “CHUNK” packet - Contains only raw values taken from the sensors; each value is assigned to a DATATYPE. The header contains a 2-byte block containing all datatypes. The length of the packet can be calculated from the datatypes and their corresponding lengths. | 3 |
| 0x21 ['!'] | “SERIE” packet - Contains a series of values indexed by a key (time ?). | 5 |
| 0x55 ['U'] | “USER DEFINED” packet - Free-form packet for whatever values you would like to store. | 2 |
| 0x53 ['S'] | “LOG” packet - Contains a string (for logging purposes, or error messages). The header consists of 1 byte that contains the length of the whole packet (CODE+HEADER+BODY). The body contains the string (no need for a '\0' tail char, as the length is given by the header). | 2 |

(*) in bytes; the header code byte is included in this number
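
As a rough sketch of how a decoder might use the table above, the first byte of a packet can be looked up to find the packet type and its header size. The names `HEADER_SIZES`, `PACKET_NAMES` and `identify_packet` are made up for this illustration and are not part of the library.

```python
# Header codes and header sizes copied from the table above.
# All names here are illustrative, not taken from SAT_DataLib itself.
HEADER_SIZES = {
    0x23: 3,  # CHUNK: code + 2-byte DATATYPES
    0x21: 5,  # SERIE: code + KEYSTRUCT + VALSTRUCT + 2-byte COUNT
    0x55: 2,  # USER DEFINED: code + LENGTH
    0x53: 2,  # LOG: code + LENGTH
}

PACKET_NAMES = {
    0x23: "CHUNK",
    0x21: "SERIE",
    0x55: "USER DEFINED",
    0x53: "LOG",
}


def identify_packet(data: bytes, offset: int = 0):
    """Return (packet name, header size) for the packet starting at offset."""
    code = data[offset]
    if code not in HEADER_SIZES:
        raise ValueError(f"unknown HEADER_CODE 0x{code:02X} at offset {offset}")
    return PACKET_NAMES[code], HEADER_SIZES[code]
```

For example, `identify_packet(b"\x21\x1b\x35\x40\x00")` reports a SERIE packet with a 5-byte header.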

2. Packet types specifications

2.1. CHUNK PACKET specifications

| Address | 0 | 1-2 | 3-(LENGTH-1) |
| :-- | :-- | :-- | :-- |
| Content | 0x23 | DATATYPES | CONTENT |

The "DATATYPES" parameter is a 2-byte block that aggregates (bitwise OR) several "DATATYPE" flags. Each "DATATYPE" is a unique identifier of a sensor. There are 16 possible identifiers (one per bit of the 2-byte block), which can be combined with OR. The size and structure of the corresponding data in the CONTENT are given by the following table. To calculate the full LENGTH of the packet, sum the DATA SIZE of every bit that is set in the DATATYPES block, then add the 3 header bytes.

| DATATYPE LOW BYTE | DATATYPE HIGH BYTE | CONTENT / SENSOR | DATA SIZE (bytes) | DATA STRUCTURE |
| :-- | :-- | :-- | :-- | :-- |
| 0x01 (b00000001) | 0x00 (b00000000) | MS : milliseconds | 4 | UINT32 |
| 0x02 (b00000010) | 0x00 (b00000000) | Luminosity sensor 1, VISIBLE + IR | 4 | 2 * INT16 |
| 0x04 (b00000100) | 0x00 (b00000000) | Luminosity sensor 2, VISIBLE + IR | 4 | 2 * INT16 |
| 0x08 (b00001000) | 0x00 (b00000000) | Magnetometer X,Y,Z | 6 | 3 * INT16 |
| 0x10 (b00010000) | 0x00 (b00000000) | Temperature 1 | 2 | INT16 |
| 0x20 (b00100000) | 0x00 (b00000000) | Temperature 2 | 2 | INT16 |
| 0x40 (b01000000) | 0x00 (b00000000) | Temperature 3 | 2 | INT16 |
| 0x80 (b10000000) | 0x00 (b00000000) | Temperature 4 | 2 | INT16 |
| 0x00 (b00000000) | 0x01 (b00000001) | InfraTherm | 2 | INT16 |
| 0x00 (b00000000) | 0x02 (b00000010) | Accelerometer X,Y,Z | 6 | 3 * INT16 |
| 0x00 (b00000000) | 0x04 (b00000100) | Gyroscope X,Y,Z | 6 | 3 * INT16 |
| 0x00 (b00000000) | 0x08 (b00001000) | Geiger 1(*) | ??? | ??? |
| 0x00 (b00000000) | 0x10 (b00010000) | Geiger 2(*) | ??? | ??? |
| 0x00 (b00000000) | 0x20 (b00100000) | User defined block 1 | 5 | depending |
| 0x00 (b00000000) | 0x40 (b01000000) | User defined block 2 | 5 | depending |
| 0x00 (b00000000) | 0x80 (b10000000) | CRC 16(*) | 2 | UINT16 |

(*) not implemented yet
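
The length calculation described above can be sketched as follows. This is only an illustration: `CHUNK_FIELDS`, `chunk_length` and the short labels are made-up names, the third row is taken to be Luminosity sensor 2, and the Geiger sizes are left unspecified because the table gives "???" for them.

```python
# (bit mask in the 16-bit DATATYPES word, label, data size in bytes), in table order.
# The low byte is listed first in the table, so it forms bits 0-7 of the word.
# Sizes shown as "???" in the table are kept as None. Labels are illustrative only.
CHUNK_FIELDS = [
    (0x0001, "MS", 4),
    (0x0002, "LUM1", 4),
    (0x0004, "LUM2", 4),
    (0x0008, "MAG", 6),
    (0x0010, "TEMP1", 2),
    (0x0020, "TEMP2", 2),
    (0x0040, "TEMP3", 2),
    (0x0080, "TEMP4", 2),
    (0x0100, "INFRATHERM", 2),
    (0x0200, "ACCEL", 6),
    (0x0400, "GYRO", 6),
    (0x0800, "GEIGER1", None),
    (0x1000, "GEIGER2", None),
    (0x2000, "USER1", 5),
    (0x4000, "USER2", 5),
    (0x8000, "CRC16", 2),
]

CHUNK_HEADER_SIZE = 3  # HEADER_CODE + 2-byte DATATYPES


def chunk_length(low_byte: int, high_byte: int) -> int:
    """Total CHUNK packet length (header + body) for a DATATYPES block."""
    datatypes = low_byte | (high_byte << 8)
    total = CHUNK_HEADER_SIZE
    for mask, label, size in CHUNK_FIELDS:
        if datatypes & mask:
            if size is None:
                raise ValueError(f"data size of {label} is not specified yet")
            total += size
    return total
```

With only the MS bit set, `chunk_length(0x01, 0x00)` gives 7: 3 header bytes plus one UINT32.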

Let's take an example. Say the DATATYPES block contains 0x06 (addr 1), 0x01 (addr 2). In binary, this gives b00000110 b00000001. The content is therefore: Luminosity sensor 1, Luminosity sensor 2, InfraTherm. The packet's body will contain 2 * INT16 + 2 * INT16 + INT16 = 10 bytes.

The values in the packet's body are always ordered as in the table above (increasing datatype). In our example, that means the packet will look like:

| Address | 0 | 1 | 2 | 3-4 | 5-6 | 7-8 | 9-10 | 11-12 |
| :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- |
| Content | 0x23 | 0x06 | 0x01 | LUM1 VISIBLE | LUM1 IR | LUM2 VISIBLE | LUM2 IR | INFRATHERM |

Be aware that INT16, UINT16 and UINT32 values are stored by the Arduino in little-endian order: the least significant byte comes first, at the lowest memory address.
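
To make the byte order concrete, here is a small sketch that builds and decodes the example packet with Python's `struct` module, where the `<` prefix selects little-endian without padding. It takes the DATATYPES low byte to be b00000110 (0x06), the binary value given in the example; the sensor readings are made-up numbers.

```python
import struct

# CHUNK code, DATATYPES low byte, DATATYPES high byte, then five INT16 values:
# LUM1 visible, LUM1 IR, LUM2 visible, LUM2 IR, INFRATHERM.
# The readings (300, 120, 310, 125, -42) are invented for illustration.
packet = struct.pack("<BBB5h", 0x23, 0x06, 0x01, 300, 120, 310, 125, -42)

# Header: 1-byte code, then DATATYPES read as one little-endian UINT16.
code, datatypes = struct.unpack_from("<BH", packet, 0)

# Body: starts at address 3, five little-endian INT16 values.
lum1_vis, lum1_ir, lum2_vis, lum2_ir, infratherm = struct.unpack_from("<5h", packet, 3)
```

The value 300 is 0x012C, so it appears in the packet as the bytes 0x2C 0x01 at addresses 3 and 4.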

2.2. SERIE PACKET specifications

| Address | 0 | 1 | 2 | 3-4 | 5 - ... |
| :-- | :-- | :-- | :-- | :-- | :-- |
| Content | 0x21 | KEYSTRUCT | VALSTRUCT | COUNT | SERIE CONTENT |

2.2.1. Header coding serie structure

The idea here is that a series is defined by a set of (KEY,VAL) pairs. The size of this set is given by COUNT (a 2-byte unsigned integer, between 0 and 65535). Both KEY and VAL can have a specified UNIT type (you're free to set up what you need!). This UNIT type is specified by KEYSTRUCT and VALSTRUCT.

  • The LOWEST 4 BITS of KEYSTRUCT (or VALSTRUCT) code the type of the data unit (see table below).
  • The HIGHEST 4 BITS of KEYSTRUCT (or VALSTRUCT) code the dimensionality of the data unit (between 0 and 15).

| UNIT | CODE (HEX) | CODE (BIN) | type of values / output |
| :-- | :-- | :-- | :-- |
| HEX8 | 0x00 | b00000000 | hexadecimal 1 byte |
| HEX16 | 0x01 | b00000001 | hexadecimal 2 bytes |
| HEX24 | 0x02 | b00000010 | hexadecimal 3 bytes |
| HEX32 | 0x03 | b00000011 | hexadecimal 4 bytes |
| INT8 | 0x04 | b00000100 | 1 byte signed integer |
| INT16 | 0x05 | b00000101 | 2 bytes signed integer |
| INT24 | 0x06 | b00000110 | 3 bytes signed integer |
| INT32 | 0x07 | b00000111 | 4 bytes signed integer |
| UINT8 | 0x08 | b00001000 | 1 byte unsigned integer |
| UINT16 | 0x09 | b00001001 | 2 bytes unsigned integer |
| UINT24 | 0x0A | b00001010 | 3 bytes unsigned integer |
| UINT32 | 0x0B | b00001011 | 4 bytes unsigned integer |
| | 0x0C | b00001100 | unused |
| STR | 0x0D | b00001101 | 4 chars |
| | 0x0E | b00001110 | unused |
| FLOAT | 0x0F | b00001111 | float (4 bytes) |

NOTE: except for STR (and the unused codes), the size of a unit in bytes can be calculated as (UNIT CODE & 0x03) + 1.
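
A possible way to apply this rule when decoding a KEYSTRUCT or VALSTRUCT byte is sketched below. The function names are hypothetical, and multiplying the unit size by the dimensionality to get the size of one KEY or VAL is an assumption that matches the magnetometer example (3 * INT16 = 6 bytes).

```python
def split_struct(struct_byte: int):
    """Split a KEYSTRUCT/VALSTRUCT byte into (dimensionality, unit code)."""
    return struct_byte >> 4, struct_byte & 0x0F


def unit_size(unit_code: int) -> int:
    """Size in bytes of one unit, from its 4-bit UNIT CODE."""
    if unit_code in (0x0C, 0x0E):
        raise ValueError(f"UNIT CODE 0x{unit_code:02X} is unused")
    if unit_code == 0x0D:
        return 4  # STR is 4 chars, the exception to the formula
    return (unit_code & 0x03) + 1


def element_size(struct_byte: int) -> int:
    """Size in bytes of one KEY or VAL: dimensionality times unit size."""
    dimensionality, unit_code = split_struct(struct_byte)
    return dimensionality * unit_size(unit_code)
```

For instance, `element_size(0x35)` returns 6, since 0x35 means three INT16 values.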

For instance, a data series consisting of 64 magnetometer X,Y,Z values (INT16), indexed by time measured in milliseconds (UINT32):

  • KEY STRUCT = 0x1B (dimensionality 1, unit = UINT32)
  • VAL STRUCT = 0x35 (dimensionality 3, unit = INT16)
  • COUNT = 64 (0x0040, stored in little endian as the bytes 0x40 0x00)
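
The header for this example could be assembled as in the sketch below (variable names are illustrative). Note how COUNT = 64 ends up as the two bytes 0x40 0x00 once written in little endian.

```python
import struct

UNIT_UINT32 = 0x0B  # unit codes from the table above
UNIT_INT16 = 0x05

key_struct = (1 << 4) | UNIT_UINT32  # dimensionality 1, UINT32 -> 0x1B
val_struct = (3 << 4) | UNIT_INT16   # dimensionality 3, INT16  -> 0x35

# "<" = little endian, no padding: code, KEYSTRUCT, VALSTRUCT, 2-byte COUNT.
header = struct.pack("<BBBH", 0x21, key_struct, val_struct, 64)
```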

2.2.2. Structure of the serie in SERIE CONTENT

Now, the content of the series is coded as a sequence of (KEY,VAL) pairs as specified by KEYSTRUCT and VALSTRUCT. Taking the example above (KEY is 4 bytes, VAL is 6 bytes), we get something like:

| Address | 0 | 1 - 4 | 5 | 9 | 15 | 19 | 25 |
| :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- |
| Content | 0x21 | 0x1B 0x35 0x40 0x00 | KEY 1 | VAL 1 | KEY 2 | VAL 2 | ... |

To be sure everything's clear, in our example, that means we would have :

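| Address | 0 | 1 - 4 | 5 | 9 | 11 | 13 | 15 | 19 | 21 | 23 | 25 |
| :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- |
| Content | 0x21 | 0x1B 0x35 0x40 0x00 | MS1 | MAGX1 | MAGY1 | MAGZ1 | MS2 | MAGX2 | MAGY2 | MAGZ2 | ... |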
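
On the decoding side, this layout can be read back record by record. The sketch below is specific to this example (UINT32 key, 3 * INT16 value, so 10 bytes per record) and uses only 2 made-up records instead of 64 to keep it short.

```python
import struct

# Two invented (MS, MAGX, MAGY, MAGZ) records for illustration.
records = [(1000, 12, -7, 3), (1100, 15, -9, 4)]

# Header (5 bytes), then each record as UINT32 + 3 * INT16, all little endian.
packet = struct.pack("<BBBH", 0x21, 0x1B, 0x35, len(records))
for record in records:
    packet += struct.pack("<I3h", *record)

# Decoding: read the header, then walk the body in 10-byte steps.
code, key_struct, val_struct, count = struct.unpack_from("<BBBH", packet, 0)
record_size = struct.calcsize("<I3h")  # 4 + 6 = 10 bytes
body = packet[5:5 + count * record_size]
decoded = list(struct.iter_unpack("<I3h", body))
```

As in the table, the second key (MS2) starts at address 15 and the packet for two records ends at address 24.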

2.3. USER DEFINED PACKET specifications

| Address | 0 | 1 | 2 to LENGTH-1 |
| :-- | :-- | :-- | :-- |
| Content | 0x55 | LENGTH | VALUE BLOCKS |

LENGTH is a 1-byte unsigned integer (UINT8) that gives the total length of the packet, between 0 and 255. That means that the length of VALUE BLOCKS is LENGTH minus 2 (the length of the header).

The BODY that follows is made of VALUE BLOCKS. Each block consists of a first byte coding the unit of the value, followed by the value itself.

This first byte is taken from the table of UNIT CODES and is consistent with the units used in the coding of series.

The one difference is that, in a user defined packet, a dimensionality of zero should be treated as 1 (otherwise you wouldn't have included the block!).

Let's say, for instance, that you want to send just one temperature value (INT16) and the variance of this temperature (FLOAT). The corresponding user packet would read as follows:

| Address | 0 | 1 | 2 | 3-4 | 5 | 6-9 |
| :-- | :-- | :-- | :-- | :-- | :-- | :-- |
| Content | 0x55 | 0x0A (length=10) | 0x05 (INT16) | TEMP | 0x0F (FLOAT) | VARIANCE |
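
A decoder for this kind of packet might look like the sketch below. It rests on a few assumptions: LENGTH counts the whole packet (as in the example, where 0x0A covers all 10 bytes), the high 4 bits of each block's first byte carry the dimensionality as in KEYSTRUCT, and only a handful of unit codes are handled. `UNIT_FORMATS` and `decode_user_packet` are made-up names, and the temperature and variance values are invented.

```python
import struct

# struct format characters for the unit codes handled by this sketch only;
# HEX units, 3-byte units and STR are left out for brevity.
UNIT_FORMATS = {
    0x04: "b", 0x05: "h", 0x07: "i",  # INT8, INT16, INT32
    0x08: "B", 0x09: "H", 0x0B: "I",  # UINT8, UINT16, UINT32
    0x0F: "f",                        # FLOAT
}


def decode_user_packet(packet: bytes):
    """Return the list of value tuples stored in a USER DEFINED packet."""
    if packet[0] != 0x55:
        raise ValueError("not a USER DEFINED packet")
    length = packet[1]  # assumed to be the total packet length
    offset, values = 2, []
    while offset < length:
        first_byte = packet[offset]
        dimensionality = max(first_byte >> 4, 1)  # zero is treated as 1
        fmt = "<" + str(dimensionality) + UNIT_FORMATS[first_byte & 0x0F]
        values.append(struct.unpack_from(fmt, packet, offset + 1))
        offset += 1 + struct.calcsize(fmt)
    return values


# The example above: one INT16 temperature and one FLOAT variance.
packet = struct.pack("<BBBhBf", 0x55, 10, 0x05, 215, 0x0F, 0.5)
values = decode_user_packet(packet)
```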

2.4. LOG PACKET specifications

The purpose of the LOG packets is simply to send verbose ASCII text. It may be a comment, an error or a debug message. You never know...

The format is pretty simple:

| Address | 0 | 1 | 2 to LENGTH-1 |
| :-- | :-- | :-- | :-- |
| Content | 0x53 | LENGTH | CHAR CONTENT |

LENGTH is a 1-byte unsigned integer (UINT8) that gives the total length of the packet, between 0 and 255. That means that the length of CHAR CONTENT is LENGTH minus 2 (the length of the header).

You don't have to use an ending character at the end of your CHAR CONTENT, because the length of the content is defined by LENGTH.
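
As an illustration, a LOG packet could be written and read back as follows. The sketch assumes LENGTH counts the whole packet including the 2 header bytes (CODE + LENGTH), as stated in the header table; `encode_log` and `decode_log` are hypothetical helper names.

```python
LOG_CODE = 0x53
LOG_HEADER_SIZE = 2  # HEADER_CODE + LENGTH


def encode_log(message: str) -> bytes:
    """Build a LOG packet; no terminating character is appended."""
    text = message.encode("ascii")
    length = LOG_HEADER_SIZE + len(text)
    if length > 255:
        raise ValueError("message too long for a 1-byte LENGTH")
    return bytes([LOG_CODE, length]) + text


def decode_log(packet: bytes) -> str:
    """Extract the text of a LOG packet using its LENGTH byte."""
    if packet[0] != LOG_CODE:
        raise ValueError("not a LOG packet")
    length = packet[1]
    return packet[LOG_HEADER_SIZE:length].decode("ascii")
```

Under this assumption the longest message that fits is 253 characters.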