Komponent
- Description
- BinaryReaderX
- BinaryWriterX
- Attributes
- Tools
- SubStream
- KMP Searcher
- Extensions
- Exceptions
Komponent is an independent base library used by most plugins developed by the Kuriimu 2 dev team. Its main features are BinaryReaderX and BinaryWriterX, which are extensions of System.IO.BinaryReader and System.IO.BinaryWriter.
Namespace: Komponent.IO
BinaryReaderX is an extension of System.IO.BinaryReader with the following changes made.
The BinaryReaderX constructors take optional parameters for endianness, bit order, and the buffer size used for bit reading (explained in Bit Reading).
Endianness, BitOrder, and BlockSize can be changed after the object's creation.
Example for reading with different endianness, based on the following content of a stream ms:
00 00 00 01
using (var brx = new BinaryReaderX(ms, ByteOrder.LittleEndian)) {
var leInt = brx.ReadInt32(); //leInt now contains the value 0x01000000
}
using (var brx = new BinaryReaderX(ms, ByteOrder.BigEndian)) {
var beInt = brx.ReadInt32(); //beInt now contains the value 0x00000001
}
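The difference between the two byte orders can be reproduced without Komponent. The following is a minimal sketch that uses System.Buffers.Binary.BinaryPrimitives from the .NET base class library; the class EndianDemo and its method names are hypothetical and only illustrate how the same four bytes yield two different values.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical helper, not part of Komponent.
public static class EndianDemo
{
    // Interprets the first four bytes with the lowest address as least significant.
    public static int ReadLittle(byte[] data) =>
        BinaryPrimitives.ReadInt32LittleEndian(data);

    // Interprets the first four bytes with the lowest address as most significant.
    public static int ReadBig(byte[] data) =>
        BinaryPrimitives.ReadInt32BigEndian(data);
}
```

For the bytes 00 00 00 01, ReadLittle returns 0x01000000 and ReadBig returns 0x00000001, which matches the comments in the example above.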
The ReadString method allows for reading a string with a specified length and/or encoding. It does advance the stream position.
The PeekString method allows for peeking a string with a specified offset, length and/or encoding. It doesn't advance the stream position.
Example for different ReadString and PeekString scenarios based on the following content of a stream ms
54 68 69 73 20 69 73 20 61 20 73 74 72 69 6e 67 "This is a string" in UTF8
using (var brx = new BinaryReaderX(ms)) {
var str = brx.ReadString(16); //str now contains "This is a string"
brx.BaseStream.Position = 0;
str = brx.ReadString(6); //str now contains "This i"
brx.BaseStream.Position = 0;
str = brx.ReadString(16, Encoding.ASCII); //str now contains "This is a string"
brx.BaseStream.Position = 0;
str = brx.ReadString(6, Encoding.Unicode); //str now contains "桔獩椠", which doesn't make sense, because of the encoding given
}
using (var brx = new BinaryReaderX(ms)) {
//this shortcut method is meant to be used for peeking a magic num,
//at the beginning of a file, which mostly is just 4 bytes long
var str = brx.PeekString(); //str now contains "This"; The default value of length is 4
str = brx.PeekString(16); //str now contains "This is a string" and didn't advance the position
str = brx.PeekString(4, 6); //str now contains " is a "
//the offset value is always relative to the current position of the stream
brx.BaseStream.Position = 4;
str = brx.PeekString(0, 6); //str now contains " is a "
brx.BaseStream.Position = 0;
str = brx.PeekString(0, 16, Encoding.ASCII); //str now contains "This is a string"
str = brx.PeekString(0, 6, Encoding.Unicode); //str now contains "桔獩椠", which doesn't make sense, because of the encoding given
}
There are also shortcut methods for reading null-terminated strings. Each reads until the null terminator is reached or up to a maximum of 999 characters. An exception is thrown if the end of the stream is reached first.
Given is the following stream content 54 68 69 73 20 69 73 20 61 20 73 74 72 69 6e 67 00 00 "This is a string\0\0" in ASCII
using (var brx = new BinaryReaderX(ms)) {
var str = brx.ReadCStringASCII(); //str now contains "This is a string"
brx.BaseStream.Position = 0;
str = brx.ReadCStringUTF16(); //str now contains "桔獩椠瑳楲杮"
brx.BaseStream.Position = 0;
str = brx.ReadCStringSJIS(); //str now contains "This is a string"
}
Many file formats align their data, for example to form blocks for encryption or compression, or to improve performance on certain processor architectures. To make seeking to those aligned positions easier, there is a method SeekAlignment.
Example for seeking alignment based on the following content of a stream ms:
FF FF 00 01 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
FF FF 00 02 00 00 00 00 00 00 00 00 00 00 00 00
using (var brx = new BinaryReaderX(ms)) {
var value = brx.ReadInt32(); //value now contains 0x0100FFFF
brx.SeekAlignment(0x20); //current position 4 will be aligned to the next multiple of 0x20
//position now is 0x20
value = brx.ReadInt32(); //value now contains 0x0200FFFF
}
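The arithmetic behind seeking to an alignment can be sketched as follows. AlignmentDemo.AlignUp is a hypothetical helper, not a Komponent method, and it assumes that a position which is already aligned stays where it is; this page does not state how SeekAlignment treats that case.

```csharp
using System;

// Hypothetical helper, not part of Komponent.
public static class AlignmentDemo
{
    // Returns the smallest multiple of alignment that is >= position.
    // Assumption: an already aligned position is returned unchanged.
    public static long AlignUp(long position, int alignment)
    {
        if (position < 0) throw new ArgumentOutOfRangeException(nameof(position));
        if (alignment <= 0) throw new ArgumentOutOfRangeException(nameof(alignment));

        long remainder = position % alignment;
        return remainder == 0 ? position : position + (alignment - remainder);
    }
}
```

With the values from the example above, AlignUp(4, 0x20) returns 0x20.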
Some file formats contain bitfields, to hold more data in a smaller range of bytes. To coordinate bit reading, the constructors allow the optional parameters bitOrder and blockSize.
To achieve bit reading, a buffer is filled with a certain amount of bytes when reading bits. BlockSize defines how big this buffer is and can only be 1, 2, 4, or 8 bytes, since most file formats are designed to have bitfields with a size of a power of 2.
ResetBitBuffer can reset the internal buffer and allows for a new buffer to be read for the following bit readings.
As with endianness, there are 2 different ways to read bits: starting from the most-significant bit or from the least-significant bit. There are 4 possible values for bitOrder:
- MSBFirst (most-significant bit first)
- LSBFirst (least-significant bit first)
- HighestAddressFirst (in combination with LittleEndian equivalent to MSBFirst, in combination with BigEndian equivalent to LSBFirst)
- LowestAddressFirst (in combination with LittleEndian equivalent to LSBFirst, in combination with BigEndian equivalent to MSBFirst)
There is a 5th BitOrder "Inherit" only usable in combination with BitFieldInfo Attribute.
The 2 address modes for bitOrder are meant to set the bit order dynamically, based on endianness. One can fix the bit order with either MSBFirst or LSBFirst, regardless of endianness.
Example for reading bits based on the following content of a stream ms
FF FF 00 01 00 00 00 11 00 00 00 01 22 22 22 22
FF 01 02 03 00 00 00 00 00 00 00 00 00 00 00 00
using (var brx = new BinaryReaderX(ms, ByteOrder.LittleEndian, BitOrder.MSBFirst, 8)) {
var value = brx.ReadBits(3); //value now contains 0 (due to the 8 buffer bytes being read as little endian and most-significant bit first)
//the stream position is advanced to 8, due to blockSize being 8
//bits are read from the buffer until it's empty
//the private bitBuffer currently contains data 0x110000000100FFFF
//Due to ReadBits returning a long, the result also can only contain up to 64 bits
//Reading >64 bits will shift the first read bits out of the result
var value2 = brx.ReadBits(65); //value2 now contains 0x10000000100FFFF2
//First the remaining 61 bits of the buffer are read; then the buffer is filled up again with 8 bytes read in little endian; and another 4 bits are read
//the very first bit read in this call, was shifted out of the result
// stream position is now 16
//Reading a normal value will reset the bit buffer
var value3 = brx.ReadInt32(); //value3 now contains 0x030201FF
//There also is a generic method, which only takes in numeric primitives
var value4 = brx.ReadBits<byte>(4); //value4 now contains 0 and has the type byte
}
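To make the two bit orders more tangible, here is a minimal sketch of a bit buffer that is already filled with one block. BitBufferDemo is a hypothetical class and not the Komponent implementation: it never refills the buffer from a stream, and it assumes that in LSB-first mode the first bit read becomes the least significant bit of the result.

```csharp
using System;

// Hypothetical single-block bit buffer, not part of Komponent.
public sealed class BitBufferDemo
{
    private readonly ulong _buffer;   // the block, already combined by byte order
    private readonly int _totalBits;  // blockSize * 8
    private readonly bool _msbFirst;
    private int _bitsRead;

    public BitBufferDemo(ulong buffer, int blockSize, bool msbFirst)
    {
        if (blockSize != 1 && blockSize != 2 && blockSize != 4 && blockSize != 8)
            throw new ArgumentOutOfRangeException(nameof(blockSize));
        _buffer = buffer;
        _totalBits = blockSize * 8;
        _msbFirst = msbFirst;
    }

    // Reads count bits from the block; throws instead of refilling the buffer.
    public long ReadBits(int count)
    {
        if (count < 0 || count > _totalBits - _bitsRead)
            throw new ArgumentOutOfRangeException(nameof(count));

        long result = 0;
        for (int i = 0; i < count; i++)
        {
            // MSB-first walks down from the top bit, LSB-first walks up from bit 0.
            int bitIndex = _msbFirst ? _totalBits - 1 - _bitsRead : _bitsRead;
            long bit = (long)((_buffer >> bitIndex) & 1UL);
            result = _msbFirst ? (result << 1) | bit : result | (bit << i);
            _bitsRead++;
        }
        return result;
    }
}
```

With the buffer value 0x110000000100FFFF from the example above and MSB-first order, ReadBits(3) returns 0, because the top three bits of 0x11 are zero.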
Since file formats contain data in different structures, one can use the method ReadType<> to read data directly into a class or struct, defining that structure through fields.
ReadType<> supports every primitive (signed/unsigned), floating point types (float, double, decimal), enums, strings, arrays, List<>, and classes/structs. Every other type will throw an UnsupportedTypeException.
ReadMultiple<> takes in a count and executes ReadType<> multiple times. It returns an IEnumerable of the read generic type.
Fields of type string, array, or List<> need to be decorated with a FixedLength Attribute or VariableLength Attribute to define their length; otherwise their value will be set to null.
Example for reading a class based on the following content of a stream ms
00 11 11 22 22 22 22 33 33 33 33 33 33 33 33 00
public class ExampleClass {
public byte val1;
public short val2;
public int val3;
public long val4;
public bool val5;
}
using (var brx = new BinaryReaderX(ms)) {
var example = brx.ReadType<ExampleClass>();
//example.val3 would now contain 0x22222222
}
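For comparison, the same bytes can be read field by field with the plain System.IO.BinaryReader. This is only a sketch of what such a read amounts to, assuming fields are consumed in declaration order without padding and a bool occupies one byte (as the WriteType<> output further down suggests). ExampleRecord and ManualReadDemo are hypothetical names; the sample values are byte-wise symmetric, so byte order does not affect them.

```csharp
using System.IO;

// Hypothetical copy of the example model, not part of Komponent.
public class ExampleRecord
{
    public byte val1;
    public short val2;
    public int val3;
    public long val4;
    public bool val5;
}

public static class ManualReadDemo
{
    // Reads each field in declaration order; BinaryReader is little-endian.
    public static ExampleRecord Read(byte[] data)
    {
        using var reader = new BinaryReader(new MemoryStream(data));
        return new ExampleRecord
        {
            val1 = reader.ReadByte(),    // 1 byte
            val2 = reader.ReadInt16(),   // 2 bytes
            val3 = reader.ReadInt32(),   // 4 bytes
            val4 = reader.ReadInt64(),   // 8 bytes
            val5 = reader.ReadBoolean()  // 1 byte
        };
    }
}
```

The five fields consume 1 + 2 + 4 + 8 + 1 = 16 bytes, which is exactly the length of the sample stream.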
Namespace: Komponent.IO
BinaryWriterX is an extension of System.IO.BinaryWriter with the following changes made.
The BinaryWriterX constructors take optional parameters for endianness, bit order, and the buffer size used for bit writing (explained in Bit Writing).
Endianness, BitOrder, and BlockSize can be changed after the object's creation.
Example for writing with different endianness to a stream ms:
using (var bwx = new BinaryWriterX(ms, ByteOrder.LittleEndian)) {
bwx.Write(0x00000001); //now the content of the stream is "01 00 00 00"
}
using (var bwx = new BinaryWriterX(ms, ByteOrder.BigEndian)) {
bwx.Write(0x00000001); //now the content of the stream is "00 00 00 01"
}
There are two ways to write strings with BinaryWriterX.
The first is the override of Write(string value). This method encodes the input as ASCII, writes a leading count of the total bytes used by the encoded string before it, and adds a null-terminator. The count can be at most 255; longer strings overflow to a wrong count.
The second is the method WriteString. It takes a string value, an encoding, optionally a boolean to disable the leading count, and optionally a boolean to disable the null-terminator. Both the leading count and the null-terminator are enabled by default.
Example for different string writing scenarios in a stream ms
using (var bwx = new BinaryWriterX(ms)) {
bwx.Write("This is a string");
//now the content of the stream is "10 54 68 69 73 20 69 73 20 61 20 73 74 72 69 6e 67"
bwx.BaseStream.Position = 0; //for convenience, assume the stream to be empty again
bwx.WriteString("This is a string", Encoding.ASCII, false, false);
//now the content of the stream is "54 68 69 73 20 69 73 20 61 20 73 74 72 69 6e 67"
bwx.BaseStream.Position = 0; //for convenience, assume the stream to be empty again
bwx.WriteString("Hör doch auf", Encoding.UTF8, true, false); //tl;dr "Hör doch auf" means "Just stop it"
//now the content of the stream is "0d 48 c3 b6 72 20 64 6f 63 68 20 61 75 66"
}
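The byte layout described above can be sketched with the base class library alone. StringLayoutDemo.Encode is a hypothetical helper, not the Komponent implementation. It assumes the leading count is a single byte that does not include the terminator, and that the terminator is a single zero byte; neither detail is confirmed on this page.

```csharp
using System.IO;
using System.Text;

// Hypothetical helper, not part of Komponent.
public static class StringLayoutDemo
{
    public static byte[] Encode(string value, Encoding encoding,
                                bool leadingCount = true, bool nullTerminator = true)
    {
        byte[] payload = encoding.GetBytes(value);
        using var ms = new MemoryStream();

        // Assumption: one count byte; values above 255 wrap around.
        if (leadingCount) ms.WriteByte((byte)payload.Length);

        ms.Write(payload, 0, payload.Length);

        // Assumption: a single zero byte, regardless of the encoding.
        if (nullTerminator) ms.WriteByte(0);

        return ms.ToArray();
    }
}
```

Encoding "Hör doch auf" as UTF-8 with a leading count and without a terminator yields 14 bytes starting with 0x0d, because "ö" takes two bytes in UTF-8.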
Many file formats align their data, for example to form blocks for encryption or compression, or to improve performance on certain processor architectures. The unused space between the last byte of relevant data and the aligned position is often filled with the same byte (in most cases 0).
WriteAlignment writes the same byte until a multiple of a certain value is reached.
WritePadding writes the same byte a given number of times and is therefore equivalent to Write(byte[] value) with an array of identical bytes.
In both cases the default byte to write is 0.
Example for writing alignment and padding in a stream ms
using (var bwx = new BinaryWriterX(ms)) {
bwx.WritePadding(13);
//now the content of the stream is "00 00 00 00 00 00 00 00 00 00 00 00 00"
bwx.BaseStream.Position = 0;
bwx.WritePadding(13, 0xFF);
//now the content of the stream is "FF FF FF FF FF FF FF FF FF FF FF FF FF"
bwx.BaseStream.Position = 0;
bwx.WriteAlignment(0x20, 0x11); //this writes 0x11 until the stream position reaches a multiple of 0x20
//now the content of the stream is
//11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11
//11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11
}
Some file formats contain bitfields, to hold more data in a smaller range of bytes. To coordinate bit writing, the constructors allow the optional parameters bitOrder and blockSize.
To achieve bit writing, a buffer is filled with the bits to write. BlockSize defines how big this buffer can get and can only be 1, 2, 4, or 8 bytes, since most file formats are designed to have bitfields with a size of a power of 2.
As with endianness, there are 2 different ways to write bits: starting from the most-significant bit or from the least-significant bit. There are 4 possible values for bitOrder:
- MSBFirst (most-significant bit first)
- LSBFirst (least-significant bit first)
- HighestAddressFirst (in combination with LittleEndian equivalent to MSBFirst, in combination with BigEndian equivalent to LSBFirst)
- LowestAddressFirst (in combination with LittleEndian equivalent to LSBFirst, in combination with BigEndian equivalent to MSBFirst)
The 2 address modes for bitOrder are meant to set the bit order dynamically, based on endianness. One can fix the bit order with either MSBFirst or LSBFirst, regardless of endianness.
Example for writing bits in a stream ms
using (var bwx = new BinaryWriterX(ms, ByteOrder.LittleEndian, BitOrder.LSBFirst, 4)) {
bwx.WriteBits(0x00000007,3);
//The stream position isn't advanced until the buffer was completely filled or other write operations are used
//The private bitPosition now is 3
//Due to BitOrder being LSBFirst the 3 least significant bits of 0x00000007 are written
bwx.WriteBits(0x00000007,3);
//The private bitPosition now is 6
//the bitBuffer is a long and now contains value 0x000000000000003F
bwx.FlushBuffer();
//Bits are only written to the stream if the buffer is full, another write operation is used or FlushBuffer is executed
}
Since file formats contain data in different structures, one can use the method WriteType<> to write classes/structs directly to the stream without more code overhead.
WriteType<> supports every primitive (signed/unsigned), floating point types (float, double, decimal), enums, strings, arrays, List<>, and classes/structs. Every other type will throw an UnsupportedTypeException.
WriteMultiple<> takes in a count and executes WriteType<> multiple times.
Fields of type string, array, or List<> need to be decorated with a FixedLength Attribute or VariableLength Attribute to define their length; otherwise their value won't be written.
Example for writing a class in a stream ms
public class ExampleClass {
public byte val1;
public short val2;
public int val3;
public long val4;
public bool val5;
}
using (var bwx = new BinaryWriterX(ms)) {
var example = new ExampleClass {
val1 = 2,
val2 = 45,
val3 = 345,
val4 = 999675435,
val5 = true
};
bwx.WriteType(example);
//stream now contains
//"02 2D 00 59 01 00 00 2B D6 95 3B 00 00 00 00 01"
}
Namespace: Komponent.IO
- Description
- Endianess Attribute
- BitField Attribute
- BitFieldInfo Attribute
- FixedLength Attribute
- VariableLength Attribute
- Alignment Attribute
- TypeChoice Attribute
All attributes covered here are only used for Read/WriteType<> and Read/WriteMultiple<>.
They are used to define specific properties for the fields to be read/written properly into the FieldType.
All following attributes only show examples on how the attributes get applied to a class/struct and explain some use cases.
The Endianess attribute can be applied to classes, structs and fields. It determines the ByteOrder for all following reads/writes on the same or a lower level. Its default ByteOrder is LittleEndian.
Without the attribute, the current ByteOrder of the BinaryReaderX/BinaryWriterX is used.
[Endianess]
public class Example
{
public int var1; //this value will be read as LittleEndian
[Endianess(ByteOrder.BigEndian)]
public int var2; //this value will be read as BigEndian
public int var3; //this value will be read as LittleEndian again
}
The BitField attribute can only be applied to fields. It takes a bitCount and only has an effect in combination with the BitFieldInfo Attribute: if the class/struct isn't decorated with BitFieldInfo, all BitField attributes in it are ignored.
[BitFieldInfo]
public class Example
{
public int var1; //is read normally
[BitField(5)]
public int var2; //5 bits get read, in compliance to BitOrder and BlockSize
//of either BitFieldInfo or BinaryReader/WriterX
public Example2 var3;
public class Example2
{
[BitField(6)]
public int var1; //this BitField attribute gets ignored, since the class wasn't decorated with BitFieldInfo
}
}
The BitFieldInfo attribute can be applied to classes and structs. It defines BlockSize and BitOrder for this class. By default BlockSize is 4 and BitOrder is "Inherit", which means it uses the BitOrder of the BinaryReaderX/BinaryWriterX in use.
[BitFieldInfo]
public class Example
{
[BitField(5)]
public int var1; //5 bits are read/written, using BlockSize 4 and the BitOrder of the reader/writer
[BitField(6)]
public byte var2; //6 bits are read/written
[BitField(5)]
public int var3; //5 bits are read/written
public Example2 var4;
[BitFieldInfo(BlockSize = 2, BitOrder = BitOrder.MSBFirst)]
public class Example2
{
[BitField(6)]
public int var1; //6 bits are read/written, using BlockSize 2 and BitOrder MSBFirst
//since there are still 2 bytes fetched into the bitBuffer, due to BlockSize 4 earlier,
//bits are read out of this buffer, before the changed BlockSize takes effect
//the changed BitOrder however already takes effect, so keep that in mind!
[BitField(10)]
public int var2; //the first bitBuffer is now empty, and BlockSize will take effect
//with the next bit reading/writing
[BitField(1)]
public int var3; //1 bit is read/written, using BlockSize 2 and BitOrder MSBFirst
}
}
The FixedLength attribute can only be applied to fields. It's used for strings, arrays, and List<> to define their length.
It takes a length and optionally a StringEncoding enum, which is only used for string reading/writing and is set to ASCII by default.
When writing, an exception is thrown if the declared length differs from the actual length of the string, array, or List<>.
public class Example
{
[FixedLength(5)]
public byte[] var1; //reads/writes 5 bytes into an array
[FixedLength(3)]
public List<int> var2; //reads/writes 3 ints into a List
[FixedLength(6, StringEncoding = StringEncoding.Unicode)]
public string var3; //reads/writes 6 !bytes! and converts them to a string with Unicode;
//in this case the string will be 3 characters long
}
The VariableLength attribute can only be applied to fields. It's used for strings, arrays, and List<> to define their length dynamically.
It takes a FieldName and optionally a StringEncoding enum and an Offset. FieldName names the field to take the length value from. Offset is a fixed value added to that field's value just for this read/write operation.
When writing, an exception is thrown if the declared length differs from the actual length of the string, array, or List<>.
public class Example
{
public int var1; //assume this field contains the value 5 for this example
[VariableLength("var1")]
public byte[] var2; //reads/writes as many bytes as defined by field "var1"; in this example 5 bytes
[VariableLength("var1", Offset = 2)]
public byte[] var3; //reads/writes as many bytes as defined by field "var1"; in this example 5 + 2 bytes
[VariableLength("var1", Offset = 2, StringEncoding = StringEncoding.Unicode)]
public string var4; //reads/writes as many bytes as defined by field "var1"; in this example 5 + 2 bytes
//those bytes are then converted to a string with the given encoding
// As of commit 3bd8ebbf fields from lower classes are also usable in upper classes
[VariableLength("var5.varEx2")]
public byte[] var6; //assuming a field var5 (not shown here) whose member varEx2 contains 3, this byte[] will be 3 elements long
}
The Alignment attribute can only be applied to classes. It's used to apply an alignment after the decorated class has been completely read/written.
It takes an integer, which defines the alignment to seek to.
[Alignment(16)] //the class will be read/written as usual, but gets aligned to the given integer
public class Example
{
public int var1; //an int takes up 4 bytes,
//but for ReadType the position is simply moved to 16 in this example,
//while for WriteType the method Write(byte[]) is used to expand the stream to the desired alignment
//the alignment is filled with null bytes
}
The TypeChoice attribute can only be applied to fields. It selects a specific type, based on a value read earlier in the model, to inject into the field.
This attribute is taken into account by ReadType. It should be used on fields of type object, dynamic, or any other type that all type choices given through the attributes on the field can be assigned to.
If any TypeChoice can't be assigned to the field's type, an InvalidOperationException is thrown.
The attribute takes in 4 constructor parameters in the following order:
- FieldName: The name of the field whose value determines the chosen type
- TypeChoiceComparer: An enum to declare the comparison between the field value and the following constant. The enum represents the numeric comparisons equal, smaller, greater, smaller equal and greater equal
- Constant: The value to compare the field value with
- Type: The type that should get injected
Example:
public class Example
{
public int flag;
[TypeChoice("flag", TypeChoiceComparer.Equal, 0x00000001, typeof(int))]
[TypeChoice("flag", TypeChoiceComparer.Equal, 0x00000002, typeof(long))]
public object dependant; // this field will be either int (if flag is 1) or long (if flag is 2)
[TypeChoice("flag", TypeChoiceComparer.Greater, 0x00000001, typeof(Dependance1))]
[TypeChoice("flag", TypeChoiceComparer.Smaller, 0x00000002, typeof(Dependance2))]
public IDependance dependant2; // this field will either contain Dependance1 (if flag is >1)
// or Dependance2 (if flag is <2)
}
public interface IDependance {}
public class Dependance1 : IDependance
{
public int var0;
}
public class Dependance2 : IDependance
{
public int var0;
}
Namespace: Komponent.IO
MeasureType is a static method that statically measures the size of a given type. Since the measurement is based on the type information alone, there are some drawbacks.
MeasureType will not work if any "dynamic" type is used in the model. Those types include for example:
- object
- dynamic
- string (if no FixedLength attribute is used)
- IList
- Any similar types that need more context than just the type itself to have a set size
MeasureType will also ignore some attributes, since they are context-dependent, such as:
- Alignment
- TypeChoice
It will also throw an exception if it encounters a VariableLengthAttribute while measuring the given model.
Example:
// This model would be 12 bytes long due to one int and one long
public class Example
{
public int var0;
public long var1;
}
Namespace: Komponent.IO
SubStream is an extension for System.IO.Stream. It takes in an input stream, offset and length.
A SubStream constrains access to data in the baseStream to the area defined by offset and length. One can only read and seek in that area. Writing is not allowed.
That extension can be helpful for file formats or data structures that utilize relative offsets. Instead of making those absolute to the fileStream, one can create a SubStream of an area and apply the relative offset directly.
Example stream ms
00 00 00 04 00 00 00 05 00 00 00 06
using(var sub = new SubStream(ms, 4, 8))
using(var br = new BinaryReaderX(sub, ByteOrder.BigEndian))
{
br.BaseStream.Position = 4; //the SubStream starts at position 4 and is 8 bytes long
//SubStream position 4 is therefore absolute position 8 in ms
var value = br.ReadInt32(); //value now contains 6
}
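The idea of constraining access to a window of another stream can be sketched as follows. WindowStream is a hypothetical, simplified class written for illustration; it is not the Komponent SubStream and assumes a readable, seekable base stream.

```csharp
using System;
using System.IO;

// Hypothetical read-only window over another stream, not part of Komponent.
public sealed class WindowStream : Stream
{
    private readonly Stream _baseStream;
    private readonly long _offset;
    private readonly long _length;
    private long _position;

    public WindowStream(Stream baseStream, long offset, long length)
    {
        if (baseStream == null) throw new ArgumentNullException(nameof(baseStream));
        if (!baseStream.CanRead || !baseStream.CanSeek)
            throw new ArgumentException("Base stream must be readable and seekable.", nameof(baseStream));
        if (offset < 0 || length < 0 || offset + length > baseStream.Length)
            throw new ArgumentOutOfRangeException(nameof(length));
        _baseStream = baseStream;
        _offset = offset;
        _length = length;
    }

    public override bool CanRead => true;
    public override bool CanSeek => true;
    public override bool CanWrite => false;
    public override long Length => _length;

    public override long Position
    {
        get => _position;
        set
        {
            if (value < 0 || value > _length) throw new ArgumentOutOfRangeException(nameof(value));
            _position = value;
        }
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        long remaining = _length - _position;
        if (remaining <= 0) return 0;
        int toRead = (int)Math.Min(count, remaining);
        // Translate the relative position into an absolute one.
        _baseStream.Position = _offset + _position;
        int read = _baseStream.Read(buffer, offset, toRead);
        _position += read;
        return read;
    }

    public override long Seek(long offset, SeekOrigin origin)
    {
        Position = origin switch
        {
            SeekOrigin.Begin => offset,
            SeekOrigin.Current => _position + offset,
            SeekOrigin.End => _length + offset,
            _ => throw new ArgumentOutOfRangeException(nameof(origin))
        };
        return _position;
    }

    public override void Flush() { }
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}
```

The key step is in Read, where the relative position is translated by adding the window's offset, so that position 4 of an 8-byte window starting at offset 4 maps to absolute position 8 of the base stream.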
Namespace: Komponent.Tools
Reference explanation and workings of the Knuth-Morris-Pratt algorithm: Wikipedia
KmpSearcher implements the Knuth-Morris-Pratt string searching algorithm, which finds a shorter byte sequence inside a longer one without re-examining bytes that have already been matched.
The constructor takes a byte[], which represents the string to search for, encoded with an encoding of your choice.
The exposed method Search looks for the pattern given through the constructor in the data passed to this method.
The data can be given either as a byte[] or as a BinaryReader.
var str = "This is a string to search in";
var substr = "is";
var searcher = new KmpSearcher(Encoding.ASCII.GetBytes(substr));
var offset = searcher.Search(Encoding.ASCII.GetBytes(str));
//offset now contains 2, since "is" was found first at position 2 in str
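For readers who want to see the algorithm itself, here is a compact sketch of a Knuth-Morris-Pratt search over byte arrays. KmpDemo is a hypothetical class and not the KmpSearcher source; in particular, returning -1 when nothing is found is an assumption made for this sketch.

```csharp
using System;

// Hypothetical standalone KMP search, not part of Komponent.
public static class KmpDemo
{
    // table[i] = length of the longest proper prefix of pattern[0..i]
    // that is also a suffix of it.
    public static int[] BuildFailureTable(byte[] pattern)
    {
        var table = new int[pattern.Length];
        int k = 0;
        for (int i = 1; i < pattern.Length; i++)
        {
            while (k > 0 && pattern[i] != pattern[k]) k = table[k - 1];
            if (pattern[i] == pattern[k]) k++;
            table[i] = k;
        }
        return table;
    }

    // Returns the index of the first match, or -1 (assumption) if there is none.
    public static int IndexOf(byte[] haystack, byte[] pattern)
    {
        if (pattern.Length == 0) return 0;
        int[] table = BuildFailureTable(pattern);
        int k = 0; // number of pattern bytes matched so far
        for (int i = 0; i < haystack.Length; i++)
        {
            // On a mismatch, fall back in the pattern instead of rewinding i.
            while (k > 0 && haystack[i] != pattern[k]) k = table[k - 1];
            if (haystack[i] == pattern[k]) k++;
            if (k == pattern.Length) return i - k + 1;
        }
        return -1;
    }
}
```

Searching for the ASCII bytes of "is" in "This is a string to search in" returns 2, the same offset as in the example above.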
Namespace: Komponent.Tools
CopyProperties copies all properties of an object instance to another instance of the same type.
public class Example
{
public int var1;
}
var example = new Example();
example.var1 = 5;
var example2 = new Example();
example.CopyProperties(example2); //example2.var1 now contains 5
Hexlify converts a hexadecimal string to a byte[].
var str = "001122334455";
var hex = str.Hexlify(); //hex now contains {0x00, 0x11, 0x22, 0x33, 0x44, 0x55}
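A conversion like this can be sketched with Convert.ToByte from the base class library. HexDemo.FromHex is a hypothetical helper, not the Hexlify source; rejecting strings of odd length is an assumption of this sketch.

```csharp
using System;

// Hypothetical helper, not part of Komponent.
public static class HexDemo
{
    // Converts pairs of hexadecimal digits into bytes.
    public static byte[] FromHex(string hex)
    {
        if (hex == null) throw new ArgumentNullException(nameof(hex));
        // Assumption: an odd number of digits is treated as an error.
        if (hex.Length % 2 != 0)
            throw new ArgumentException("Hex string needs an even number of digits.", nameof(hex));

        var result = new byte[hex.Length / 2];
        for (int i = 0; i < result.Length; i++)
            result[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        return result;
    }
}
```

FromHex("001122334455") returns the six bytes 0x00, 0x11, 0x22, 0x33, 0x44, 0x55.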
Namespace: Komponent.IO
Gets thrown in Read/WriteType<> and Read/WriteMultiple<> if the length given through FixedLength Attribute or VariableLength Attribute differs from the actual length of the string, array, or List<>.
Gets thrown if the given type for Read/WriteBits<> or a FieldType in Read/WriteType<> and Read/WriteMultiple<> is not supported.
Gets thrown in Read/WriteType<> and Read/WriteMultiple<> if the BlockSize of a BitFieldInfo Attribute is neither 1, 2, 4, or 8.