@aljoscha %QlFuGsQKY2nSE5IXPk2c/9lRuTpqX6hpBHi0KJ+XI98=.sha256
Re: %GrYD8mj3w

...continued from previous post

VarU64

The VarU64 encoding for unsigned 64-bit integers works as follows:

first byte                   value
0b0000_0000 - 0b1111_0111    The numeric value of the byte
0b1111_1000                  The byte is followed by an 8 bit unsigned big-endian integer
0b1111_1001                  The byte is followed by a 16 bit unsigned big-endian integer
0b1111_1010                  The byte is followed by a 24 bit unsigned big-endian integer
0b1111_1011                  The byte is followed by a 32 bit unsigned big-endian integer
0b1111_1100                  The byte is followed by a 40 bit unsigned big-endian integer
0b1111_1101                  The byte is followed by a 48 bit unsigned big-endian integer
0b1111_1110                  The byte is followed by a 56 bit unsigned big-endian integer
0b1111_1111                  The byte is followed by a 64 bit unsigned big-endian integer

Each integer may only be encoded using the smallest possible number of bytes. When decoding, violations of this constraint must be reported as errors.
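
To make the rules above concrete, here is a minimal Rust sketch of an encoder and decoder. The names `encode`, `decode` and `DecodeError` are made up for illustration and are not taken from any existing crate. The sketch reads "smallest possible number of bytes" as: values up to 247 must be stored in the first byte itself, and a value followed by k bytes must not fit into k - 1 bytes.

```rust
/// Errors a decoder can report (illustrative names, not from any crate).
#[derive(Debug, PartialEq, Eq)]
pub enum DecodeError {
    /// The input ended before the announced number of bytes was available.
    UnexpectedEnd,
    /// The value was not encoded in the smallest possible number of bytes.
    NonCanonical,
}

/// Returns the VarU64 encoding of `n`.
pub fn encode(n: u64) -> Vec<u8> {
    if n < 248 {
        // 0..=247 are stored directly in the first byte.
        return vec![n as u8];
    }
    // Smallest number of bytes (1..=8) that can hold `n`.
    let len = (8 - n.leading_zeros() / 8) as usize;
    // First byte: 0b1111_1000 for 1 following byte up to 0b1111_1111 for 8.
    let mut out = vec![247 + len as u8];
    // Followed by the `len` low-order bytes of `n`, big-endian.
    out.extend_from_slice(&n.to_be_bytes()[8 - len..]);
    out
}

/// Decodes one VarU64 from the start of `input`.
/// On success returns the value and the number of bytes consumed.
pub fn decode(input: &[u8]) -> Result<(u64, usize), DecodeError> {
    let (&first, rest) = input.split_first().ok_or(DecodeError::UnexpectedEnd)?;
    if first < 248 {
        return Ok((first as u64, 1));
    }
    // The first byte announces how many bytes follow (1..=8).
    let len = (first - 247) as usize;
    if rest.len() < len {
        return Err(DecodeError::UnexpectedEnd);
    }
    // Right-align the big-endian bytes in an 8 byte buffer.
    let mut buf = [0u8; 8];
    buf[8 - len..].copy_from_slice(&rest[..len]);
    let n = u64::from_be_bytes(buf);
    // Smallest value that legitimately needs `len` following bytes.
    let min = if len == 1 { 248 } else { 1u64 << (8 * (len - 1)) };
    if n < min {
        return Err(DecodeError::NonCanonical);
    }
    Ok((n, 1 + len))
}
```

With these definitions, `encode(248)` gives the two bytes `0xF8 0xF8`, and `decode(&[0xF8, 0x05])` is rejected as non-canonical because 5 fits in the first byte.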


This format (compared to the current ipfs varuint):

  • restricts the domain to 64 bit unsigned integers
  • indicates the length of the value in the first byte
  • admits exactly one valid encoding per number
  • can be parsed very efficiently
  • optimizes for small values (can store 248 different values in a single byte, compared to 128 for the ipfs varuint)
  • pays for these advantages by leaving quite a few byte strings unused (the encodings that do not use the smallest possible number of bytes)
    • if it ever becomes absolutely necessary to extend this format to handle integers of larger (or even arbitrary) size, these unused values can enable such an extension
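
The size comparison in the list above can be checked with a small Rust sketch. It assumes the ipfs varuint behaves like a plain LEB128-style encoding (7 payload bits per byte, continuation flag in the high bit) and ignores any length cap the actual spec may impose. Both function names are hypothetical.

```rust
/// Encoded length in bytes of `n` as a VarU64.
pub fn varu64_len(n: u64) -> usize {
    if n < 248 {
        // The value lives in the first byte.
        1
    } else {
        // One tag byte plus the smallest number of bytes that holds `n`.
        1 + (8 - n.leading_zeros() as usize / 8)
    }
}

/// Encoded length in bytes of `n` in a LEB128-style varint.
/// Assumption: this models the ipfs varuint without any length cap.
pub fn leb128_len(n: u64) -> usize {
    // Number of significant bits; zero still takes one byte.
    let bits = (64 - n.leading_zeros() as usize).max(1);
    // Each byte carries 7 payload bits, rounded up.
    (bits + 6) / 7
}
```

Under these assumptions, 200 takes one byte as a VarU64 and two in the LEB128-style model, while 300 takes three bytes as a VarU64 and two in the LEB128-style model.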

I'll revisit the #yamf formats as necessary once I need them in the Rust ssb implementation.
