SAT_DataLib : specifications

This wiki page presents the specification of the SAT DataLib binary data storage model. This document is still under construction!

The general purpose of SAT_DataLib is to provide a framework for encoding and storing binary data in ArduSat experiments, along with the specification needed to decode that data once it is retrieved on Earth. The point of using binary data is to reduce storage consumption: plain text such as CSV would require more space than raw binary values.

This library (and the corresponding data storage framework) has been designed with the following specific capabilities:

store different kinds of data in one single file: the point is not to store only one kind of message, but several in a single file, so that the different messages can be decoded one after the other. For instance, depending on your experiment, you might want to store raw sensor values periodically, then some computed values of your own, then a series of values or a text message. You have to be able to do that in one single file.
consume as little storage as possible
enable the use of extensible structures, not fixed-size messages

The general mechanism underlying the storage framework is to structure your data as messages, called "Packets" in the following. Different packet types are proposed, each with a specific purpose (store raw values from sensors, store data series, store text messages...). Each packet starts with a header that characterizes its content, its format and its length. The length of the packet can be computed exactly from the content of the packet's header.

1. Packet General Structure

All packets have the same generic structure. They begin with a HEADER CODE (1 byte), which defines the packet's type. Then each type of packet has a different HEADER CONTENT. The HEADER CONTENT typically contains parameter values that describe the structure of the BODY that follows. Then comes the BODY of the packet, whose structure depends on the packet type and on the parameters stored in the header content.

Address	0	1	(HEADER_SIZE)
Content	HEADER_CODE	HEADER_CONTENT	BODY

Depending on the HEADER_CODE, the following table specifies the kind of content the packet is designed to carry, and the size of the header.

HEADER_CODE	PACKET NAME, HEADER CONTENT	HEADER_SIZE(*)
0x23 ['#']	“CHUNK” packet - Contains only raw values taken from the sensors, each value is assigned to a DATATYPE. The header contains a 2 byte block containing all datatypes. The length of the packet can be calculated from the datatypes and their corresponding length.	3
0x21 ['!']	“SERIE” packet - Contains a series of values indexed by a key (for example, time).	5
0x55 ['U']	“USER DEFINED” packet - Free-form packet for whatever values you need to store.	2
0x53 ['S']	“LOG” packet - Contains a string (for logging purposes, or error messages). The header consists of 1 byte that contains the length of the whole packet (CODE+HEADER+BODY). The body contains the string (no need for a '\0' tail char, as the length is given by the header).	2
(*) in bytes; the header code byte is included in this number.
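
As a rough illustration, the table above can be turned into a lookup from the first byte of a packet to its name and header size. This is only a sketch in Python for ground-side decoding; the names `HEADER_TABLE` and `describe_header` are hypothetical and not part of SAT_DataLib.

```python
# Hypothetical lookup built from the header table above:
# HEADER_CODE -> (packet name, header size in bytes, code byte included).
HEADER_TABLE = {
    0x23: ("CHUNK", 3),
    0x21: ("SERIE", 5),
    0x55: ("USER DEFINED", 2),
    0x53: ("LOG", 2),
}


def describe_header(first_byte):
    """Return (name, header_size) for a HEADER_CODE, or raise on unknown codes."""
    try:
        return HEADER_TABLE[first_byte]
    except KeyError:
        raise ValueError("unknown HEADER_CODE: 0x%02X" % first_byte)
```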

2. Packet types specifications

2.1. CHUNK PACKET specifications

Address	0	1-2	3-(LENGTH)
Content	0x23	DATATYPES	CONTENT

The parameter "DATATYPES" is a 2-byte block that aggregates (bitwise OR) several "DATATYPE" values. Each "DATATYPE" is designed to be a unique identifier of a sensor. There are 16 possible unique identifiers (each corresponds to one bit in the 2-byte block), and they can be combined with OR. The size of the data in the CONTENT and the underlying DATA TYPE are given by the following table. To calculate the length of the CONTENT, sum the DATA SIZE of every bit that is set in the DATATYPES block; the full LENGTH of the packet is that sum plus the 3 header bytes.

DATATYPE LOW BYTE	DATATYPE HIGH BYTE	CONTENT / SENSOR	DATA SIZE	DATA STRUCTURE
0x01 (b00000001)	0x00 (b00000000)	MS : milliseconds	4	UINT32
0x02 (b00000010)	0x00 (b00000000)	Luminosity sensor 1, VISIBLE + IR	4	2 * INT16
0x04 (b00000100)	0x00 (b00000000)	Luminosity sensor 2, VISIBLE + IR	4	2 * INT16
0x08 (b00001000)	0x00 (b00000000)	Magnetometer X,Y,Z	6	3 * INT16
0x10 (b00010000)	0x00 (b00000000)	Temperature 1	2	INT16
0x20 (b00100000)	0x00 (b00000000)	Temperature 2	2	INT16
0x40 (b01000000)	0x00 (b00000000)	Temperature 3	2	INT16
0x80 (b10000000)	0x00 (b00000000)	Temperature 4	2	INT16
0x00 (b00000000)	0x01 (b00000001)	InfraTherm	2	INT16
0x00 (b00000000)	0x02 (b00000010)	Accelerometer X,Y,Z	6	3 * INT16
0x00 (b00000000)	0x04 (b00000100)	Gyroscope X,Y,Z	6	3 * INT16
0x00 (b00000000)	0x08 (b00001000)	Geiger 1(*)	???	???
0x00 (b00000000)	0x10 (b00010000)	Geiger 2(*)	???	???
0x00 (b00000000)	0x20 (b00100000)	User defined block 1	5	depending
0x00 (b00000000)	0x40 (b01000000)	User defined block 2	5	depending
0x00 (b00000000)	0x80 (b10000000)	CRC 16(*)	2	UINT16
(*) not implemented yet
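
The size computation described above can be sketched in Python as follows. The list `DATA_SIZES` and the function `chunk_body_size` are hypothetical names; the sizes are copied from the table, and the two Geiger entries are left unspecified because the table gives no size for them.

```python
# DATA SIZE in bytes for each bit of the 16-bit DATATYPES block, taken from
# the table above. Bit 0 is 0x01 of the low byte, bit 8 is 0x01 of the high byte.
# None marks the entries whose size the table leaves as "???".
DATA_SIZES = [
    4, 4, 4, 6, 2, 2, 2, 2,        # low byte: MS, LUM1, LUM2, MAG, TEMP1..TEMP4
    2, 6, 6, None, None, 5, 5, 2,  # high byte: INFRATHERM, ACCEL, GYRO, GEIGER1/2, USER1/2, CRC16
]


def chunk_body_size(low_byte, high_byte):
    """Sum the DATA SIZE of every bit that is set in the DATATYPES block."""
    datatypes = low_byte | (high_byte << 8)
    total = 0
    for bit, size in enumerate(DATA_SIZES):
        if datatypes & (1 << bit):
            if size is None:
                raise ValueError("size of datatype bit %d is not specified" % bit)
            total += size
    return total
```

For a packet holding only MS and the magnetometer (low byte 0x09, high byte 0x00), `chunk_body_size(0x09, 0x00)` gives 4 + 6 = 10 bytes of content.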

Let's take an example. Let's say the DATATYPES block contains 0x06 (addr 1), 0x01 (addr 2). In binary, this gives b00000110 b00000001. That means the content is: Luminosity sensor 1, Luminosity sensor 2, InfraTherm. The packet's body will contain 2 * INT16 + 2 * INT16 + INT16 = 10 bytes.

The values in the packet's body are always ordered as indicated by the table above (increasing datatype). In our example, that means the packet will look like:

Address	0	1	2	3-4	5-6	7-8	9-10	11-12
Content	0x23	0x06	0x01	LUM1 VISIBLE	LUM1 IR	LUM2 VISIBLE	LUM2 IR	INFRATHERM

Be aware that the INT16, UINT16 or UINT32 values are stored by the Arduino in little-endian order. This means the least significant byte comes first in memory.
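
To make the byte order concrete, here is a small Python sketch that unpacks a CHUNK packet holding the two luminosity sensors and InfraTherm. It relies on the standard `struct` module, where `<` selects little-endian. The function name and the sample readings are made up for illustration.

```python
import struct


def decode_lum_infratherm_chunk(packet):
    """Decode a CHUNK packet carrying LUM1, LUM2 and INFRATHERM only."""
    # Header: 1 code byte, then DATATYPES read as a little-endian 16-bit block
    # (low byte at address 1, high byte at address 2).
    code, datatypes = struct.unpack_from("<BH", packet, 0)
    if code != 0x23 or datatypes != 0x0106:
        raise ValueError("not the expected CHUNK layout")
    # Body: five little-endian INT16 values, in table order, starting at address 3.
    names = ("lum1_visible", "lum1_ir", "lum2_visible", "lum2_ir", "infratherm")
    return dict(zip(names, struct.unpack_from("<5h", packet, 3)))


# Sample packet with made-up readings: 0x2C 0x01 reads as 300, 0xFE 0xFF as -2.
sample = bytes([0x23, 0x06, 0x01,
                0x2C, 0x01, 0x0A, 0x00, 0x90, 0x01, 0x14, 0x00, 0xFE, 0xFF])
decoded = decode_lum_infratherm_chunk(sample)
```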

2.2. SERIE PACKET specifications

Address	0	1	2	3-4	5 - ...
Content	0x21	KEYSTRUCT	VALSTRUCT	COUNT	SERIE CONTENT

2.2.1. Header coding serie structure

The idea here is that a series is defined by a set of (KEY,VAL) pairs. The size of this set is given by COUNT (2-byte unsigned integer, between 0 and 65535). Both KEY and VAL can have specified UNIT types (you're free to set up what you need!). This UNIT type is specified by KEYSTRUCT and VALSTRUCT.

The LOWEST 4 BITS of KEYSTRUCT (or VALSTRUCT) code the type of the data unit (see table below).
The HIGHEST 4 BITS of KEYSTRUCT (or VALSTRUCT) code the dimensionality of the data unit (between 0 and 15).

UNIT	CODE (HEX)	CODE (BIN)	type of values / output
HEX8	0x00	b00000000	hexadecimal 1 byte
HEX16	0x01	b00000001	hexadecimal 2 bytes
HEX24	0x02	b00000010	hexadecimal 3 bytes
HEX32	0x03	b00000011	hexadecimal 4 bytes
INT8	0x04	b00000100	1 byte signed integer
INT16	0x05	b00000101	2 bytes signed integer
INT24	0x06	b00000110	3 bytes signed integer
INT32	0x07	b00000111	4 bytes signed integer
UINT8	0x08	b00001000	1 byte unsigned integer
UINT16	0x09	b00001001	2 bytes unsigned integer
UINT24	0x0A	b00001010	3 bytes unsigned integer
UINT32	0x0B	b00001011	4 bytes unsigned integer
	0x0C	b00001100	unused
STR	0x0D	b00001101	4 chars
	0x0E	b00001110	unused
FLOAT	0x0F	b00001111	float (4 bytes)

NOTE: except for STR, the size of the unit in bytes can be calculated as (UNIT CODE & 0x03) + 1.
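
The nibble split and the size rule from the note can be sketched in Python like this. The function names are hypothetical, and rejecting the two unused codes is an assumption of this sketch rather than something the specification requires.

```python
STR_CODE = 0x0D              # STR is the exception to the size rule: always 4 chars
UNUSED_CODES = (0x0C, 0x0E)  # marked "unused" in the table


def split_struct(struct_byte):
    """Split a KEYSTRUCT/VALSTRUCT byte into (dimensionality, unit code)."""
    return struct_byte >> 4, struct_byte & 0x0F


def unit_size(unit_code):
    """Size in bytes of one unit, using the (UNIT CODE & 0x03) + 1 rule."""
    if unit_code in UNUSED_CODES:
        raise ValueError("unused unit code: 0x%02X" % unit_code)
    if unit_code == STR_CODE:
        return 4
    return (unit_code & 0x03) + 1
```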

For instance, a data series consisting of 64 magnetometer X,Y,Z values (INT16), indexed by time measured in milliseconds (UINT32):

KEY STRUCT = 0x1B (dimensionality 1, unit = UINT32)
VAL STRUCT = 0x35 (dimensionality 3, type = INT16)
COUNT = 64 (0x0040, stored as the bytes 0x40 0x00 in little endian)
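
As a quick check of these values, the 5-byte SERIE header can be assembled with Python's `struct` module. This is only a sketch; `serie_header` is a hypothetical helper, not a SAT_DataLib function.

```python
import struct


def serie_header(key_dim, key_unit, val_dim, val_unit, count):
    """Build the 5-byte SERIE header: code, KEYSTRUCT, VALSTRUCT, COUNT."""
    keystruct = (key_dim << 4) | key_unit   # high nibble: dimensionality, low nibble: unit
    valstruct = (val_dim << 4) | val_unit
    # "<" = little-endian, B = 1 byte, H = 2-byte unsigned COUNT.
    return struct.pack("<BBBH", 0x21, keystruct, valstruct, count)


# Magnetometer example: key = 1 x UINT32 (0x0B), value = 3 x INT16 (0x05), 64 pairs.
header = serie_header(1, 0x0B, 3, 0x05, 64)
```

Here `header` is the byte sequence 0x21 0x1B 0x35 0x40 0x00.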

2.2.2. Structure of the serie in SERIE CONTENT

The content of the series is then coded as a sequence of (KEY,VAL) pairs, as specified by KEYSTRUCT and VALSTRUCT. For instance, let's take the example above (the size of KEY is 4 bytes, the size of VAL is 6 bytes). That means we have something like:

Address	0	1 - 4	5	9	15	19	25
Content	0x21	0x1B 0x35 0x40 0x00	KEY 1	VAL 1	KEY 2	VAL 2	...

To be sure everything's clear, in our example, that means we would have :

Address	0	1 - 4	5	9	11	13	15	19	21	23	25
Content	0x21	0x1B 0x35 0x40 0x00	MS1	MAGX1	MAGY1	MAGZ1	MS2	MAGX2	MAGY2	MAGZ2	...
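
A ground-side decoder for this layout could look like the following Python sketch. The function names are hypothetical. It assumes HEX units are read as unsigned integers, STR as 4 ASCII chars, and that the dimensionality is used exactly as stored in the header; none of these points is spelled out above.

```python
import struct


def decode_unit(unit_code, raw):
    """Decode one unit from its raw little-endian bytes."""
    if unit_code in (0x0C, 0x0E):
        raise ValueError("unused unit code: 0x%02X" % unit_code)
    if unit_code == 0x0F:                    # FLOAT, 4 bytes
        return struct.unpack("<f", raw)[0]
    if unit_code == 0x0D:                    # STR, 4 chars
        return raw.decode("ascii")
    signed = 0x04 <= unit_code <= 0x07       # INT8..INT32 are the signed types
    return int.from_bytes(raw, "little", signed=signed)


def decode_serie(packet):
    """Return a list of (key_tuple, val_tuple) pairs from a SERIE packet."""
    code, keystruct, valstruct, count = struct.unpack_from("<BBBH", packet, 0)
    if code != 0x21:
        raise ValueError("not a SERIE packet")
    offset = 5                               # first byte after the header
    pairs = []
    for pair_index in range(count):
        pair = []
        for struct_byte in (keystruct, valstruct):
            dim, unit = struct_byte >> 4, struct_byte & 0x0F
            size = 4 if unit == 0x0D else (unit & 0x03) + 1
            values = []
            for component in range(dim):
                values.append(decode_unit(unit, packet[offset:offset + size]))
                offset += size
            pair.append(tuple(values))
        pairs.append(tuple(pair))
    return pairs
```

With the magnetometer example, each returned pair has the shape `((ms,), (mag_x, mag_y, mag_z))`.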

2.3. USER DEFINED PACKET specifications

Address	0	1	2 to LENGTH
Content	0x55	LENGTH	VALUE BLOCKS

LENGTH is a 1-byte unsigned integer (UINT8) that indicates the total length of the packet, between 0 and 255. That means the length of VALUE BLOCKS is LENGTH minus 2 (the length of the header).

The BODY that follows is made of VALUE BLOCKS. Each block consists of a first byte coding the unit of the value, followed by the value itself.

This first byte is taken from the table of UNIT CODES and is consistent with the units used in the coding of series.

One exception: in this user defined packet, even if the dimensionality is zero, it should be treated as at least 1 (otherwise you wouldn't have included the block!).

Let's say, for instance, that you only want to send one temperature value (INT16) and a variance of this temperature (FLOAT). The corresponding user packet would read as follows :

Address	0	1	2	3-4	5	6-9
Content	0x55	0x0A (length=10)	0x05 (INT16)	TEMP	0x0F (FLOAT)	VARIANCE
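
The example above can be read back with a sketch like the following. It assumes that LENGTH counts the whole packet including the 2 header bytes (as in the example, where length=10), that the high nibble of each block's first byte carries the dimensionality as in SERIE packets, and that a dimensionality of 0 is read as 1. `decode_user_packet` is a hypothetical name.

```python
import struct


def decode_user_packet(packet):
    """Return a list of (unit_code, values_tuple) from a USER DEFINED packet."""
    if packet[0] != 0x55:
        raise ValueError("not a USER DEFINED packet")
    length = packet[1]                      # total length, header included
    offset = 2                              # first VALUE BLOCK
    blocks = []
    while offset < length:
        struct_byte = packet[offset]
        offset += 1
        unit = struct_byte & 0x0F
        dim = max(struct_byte >> 4, 1)      # dimensionality 0 is read as 1
        size = 4 if unit == 0x0D else (unit & 0x03) + 1
        values = []
        for component in range(dim):
            raw = packet[offset:offset + size]
            if unit == 0x0F:                # FLOAT
                values.append(struct.unpack("<f", raw)[0])
            elif unit == 0x0D:              # STR, 4 chars
                values.append(raw.decode("ascii"))
            else:                           # integers; INT8..INT32 are signed
                values.append(int.from_bytes(raw, "little", signed=0x04 <= unit <= 0x07))
            offset += size
        blocks.append((unit, tuple(values)))
    return blocks


# Temperature -15 (INT16) followed by a variance of 0.5 (FLOAT); values are made up.
example = (bytes([0x55, 0x0A, 0x05]) + struct.pack("<h", -15)
           + bytes([0x0F]) + struct.pack("<f", 0.5))
```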

2.4. LOG PACKET specifications

The purpose of the LOG packets is simply to send verbose ASCII text. The point may be to send a comment, or an error or debug message. You never know...

The format is pretty simple:

Address	0	1	2 to LENGTH
Content	0x53	LENGTH	CHAR CONTENT

LENGTH is a 1-byte unsigned integer (UINT8) that indicates the total length of the packet, between 0 and 255. That means the length of CHAR CONTENT is LENGTH minus 2 (the length of the header).

You don't have to use an ending character at the end of your CHAR CONTENT, because the length of the content is defined by LENGTH.
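
A minimal Python sketch of writing and reading a LOG packet is shown below. It assumes LENGTH counts the whole packet including the 2 header bytes, as stated in the header table; `encode_log` and `decode_log` are hypothetical names.

```python
def encode_log(message):
    """Build a LOG packet from an ASCII string (no terminating NUL byte)."""
    body = message.encode("ascii")
    length = 2 + len(body)                  # LENGTH counts the 2 header bytes too
    if length > 255:
        raise ValueError("message too long for a 1-byte LENGTH")
    return bytes([0x53, length]) + body


def decode_log(packet):
    """Return the string carried by a LOG packet."""
    if packet[0] != 0x53:
        raise ValueError("not a LOG packet")
    length = packet[1]
    return packet[2:length].decode("ascii")
```

With this convention the longest message that fits in a single LOG packet is 253 characters.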
