%QlFuGsQKY2nSE5IXPk2c/9lRuTpqX6hpBHi0KJ+XI98=.sha256

@aljoscha, 6 years ago

...continued from previous post

VarU64

The VarU64 encoding for unsigned 64-bit integers works as follows:

first byte	value
0b0000_0000 - 0b1111_0111	The numeric value of the byte
0b1111_1000	The byte is followed by an 8 bit unsigned big-endian integer
0b1111_1001	The byte is followed by a 16 bit unsigned big-endian integer
0b1111_1010	The byte is followed by a 24 bit unsigned big-endian integer
0b1111_1011	The byte is followed by a 32 bit unsigned big-endian integer
0b1111_1100	The byte is followed by a 40 bit unsigned big-endian integer
0b1111_1101	The byte is followed by a 48 bit unsigned big-endian integer
0b1111_1110	The byte is followed by a 56 bit unsigned big-endian integer
0b1111_1111	The byte is followed by a 64 bit unsigned big-endian integer

Each integer must be encoded using the smallest possible number of bytes. When decoding, violations of this constraint must be reported as errors.
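To make the table and the canonicity rule concrete, here is a minimal sketch of an encoder and decoder in Rust. The names `encode_varu64`, `decode_varu64` and `DecodeError` are my own placeholders for illustration, not an existing crate's API, and a real implementation may well look different.

```rust
/// Errors a decoder can report (hypothetical names, chosen for this sketch).
#[derive(Debug, PartialEq, Eq)]
pub enum DecodeError {
    /// The input ended before the full encoding was read.
    UnexpectedEnd,
    /// The value would have fit into fewer bytes.
    NonCanonical,
}

/// Append the VarU64 encoding of `n` to `out`.
pub fn encode_varu64(n: u64, out: &mut Vec<u8>) {
    if n < 248 {
        // Values 0..=247 are stored directly in the first byte.
        out.push(n as u8);
        return;
    }
    // Smallest number of bytes (1..=8) that can hold `n`.
    let len = 8 - n.leading_zeros() as usize / 8;
    // 0b1111_1000 (248) announces 1 byte, 0b1111_1111 (255) announces 8.
    out.push(247 + len as u8);
    out.extend_from_slice(&n.to_be_bytes()[8 - len..]);
}

/// Decode one VarU64 from the front of `input`.
/// Returns the value and the number of bytes consumed.
pub fn decode_varu64(input: &[u8]) -> Result<(u64, usize), DecodeError> {
    let first = *input.first().ok_or(DecodeError::UnexpectedEnd)?;
    if first < 248 {
        return Ok((u64::from(first), 1));
    }
    let len = (first - 247) as usize;
    let payload = input.get(1..1 + len).ok_or(DecodeError::UnexpectedEnd)?;

    // Right-align the big-endian payload in an 8 byte buffer.
    let mut buf = [0u8; 8];
    buf[8 - len..].copy_from_slice(payload);
    let n = u64::from_be_bytes(buf);

    // Reject values that a shorter encoding could have represented.
    let min = if len == 1 { 248 } else { 1u64 << (8 * (len - 1)) };
    if n < min {
        return Err(DecodeError::NonCanonical);
    }
    Ok((n, 1 + len))
}
```

With this sketch, 247 encodes to the single byte `[247]`, 248 to `[248, 248]`, and 256 to `[249, 1, 0]`. Decoding `[248, 5]` fails, because 5 would have fit into the first byte.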

This format (compared to the current ipfs varuint):

restricts the domain to 64 bit unsigned integers
indicates the length of the value in the first byte
admits exactly one valid encoding per number
can be parsed very efficiently
optimizes for small values (can store 248 different values in a single byte, compared to 128 for the ipfs varuint)
pays for these advantages by leaving quite a few byte strings unused (the encodings that do not use the smallest possible number of bytes)
- if it ever becomes absolutely necessary to extend this format to handle integers of larger (or even arbitrary) size, these unused values can enable such an extension
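
As a rough illustration of the size comparison, the following sketch computes encoded lengths for both schemes. It assumes the ipfs varuint behaves like a plain LEB128-style encoding (7 payload bits per byte, no further restrictions); the function names `varu64_len` and `leb128_len` are made up for this example.

```rust
/// Number of bytes VarU64 needs for `n`.
pub fn varu64_len(n: u64) -> usize {
    if n < 248 {
        1
    } else {
        // One tag byte plus the smallest big-endian representation of `n`.
        1 + (8 - n.leading_zeros() as usize / 8)
    }
}

/// Number of bytes a LEB128-style varuint needs for `n`
/// (assumption: 7 payload bits per byte, continuation bit in the 8th).
pub fn leb128_len(n: u64) -> usize {
    // Zero still occupies one byte, so treat it as having one significant bit.
    let bits = (64 - n.leading_zeros() as usize).max(1);
    (bits + 6) / 7
}
```

Under that assumption, values from 128 to 247 take one byte in VarU64 and two in the 7-bit scheme, and `u64::MAX` takes 9 bytes instead of 10. The trade-off goes the other way for values from 248 to 16383, where VarU64 needs three bytes and the 7-bit scheme two.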

I'll revisit the #yamf formats as necessary once I need them in the Rust ssb implementation.