Simpler Message Metadata
Warning: This is a lot of text, containing a lot of information, as well as strong opinions. Dive in at your own risk.
My proposals around #hsdt focus on a json replacement for schemaless free-form data, i.e. the content field of ssb messages. But that is not the only place where ssb uses json. Another place where json is used for schemaless data is muxrpc. I won't go into muxrpc here, but it is worth keeping in mind that most of the current discussion around data formats applies to muxrpc as well. In this post, I'm going to discuss the metadata of messages.
CC @Dominic @arj @cel @cryptix @Piet
tldr: message metadata can be radically simplified and shrunk, by roughly an order of magnitude
Introducing hsdt will require a new hash suffix. That gives us complete freedom on how to lay out message metadata. And there is some room for improvement over the current situation. @Dominic has just posted some information relevant to these considerations.
A quick quote from the protocol guide for convenience:
- `previous`: Message ID of the latest message posted in the feed. If this is the very first message then use null. See below for how to compute a message's ID.
- `author`: Public key of the feed that the message will be posted in.
- `sequence`: 1 for the first message in a feed, 2 for the second and so on.
- `timestamp`: Time the message was created. Number of milliseconds since 1 January 1970 00:00 UTC.
- `hash`: The fixed string sha256, which is the hash function used to compute the message ID.
- `content`: Free-form JSON. It is up to applications to interpret what it means. It's polite to specify a type field so that applications can easily filter out message types they don't understand.
There's also a "signature" field, containing an ssb cypherlink which depends on the encoding of all the fields listed above.
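For concreteness, a current ssb message with all of this metadata looks roughly like this (values abridged and purely illustrative):

```json
{
  "previous": "%XphMUkWQtomKjXQvFGfsGYpt69sgEY7Y4Vou9cEuJho=.sha256",
  "author": "@FCX/tsDLpubCPKKfIrw4gc+SQkHcaD17s7GI6i/ziWY=.ed25519",
  "sequence": 2,
  "timestamp": 1514517078157,
  "hash": "sha256",
  "content": {
    "type": "post",
    "text": "..."
  },
  "signature": "..."
}
```

Note how every message repeats the same keys and carries its cryptographic primitives as suffixed strings.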
All messages have this same set of metadata, and we can utilize that. In practice, some oversights have crept in, but hsdt can ignore all of them. In this post, I'll discuss the idealized setting that applies to hsdt. I'll do a later post on how this can be adapted to the semi-canonical json replacement.
So how should the metadata be encoded? We can simply concatenate the actual metadata values in a predetermined order, no need for any keys. The encoding becomes `<total size of message in bytes?><feed id?><backlink to previous message><sequence number?><hash?><timestamp?><content><signature>`, where `?` means that we could get rid of the entry. Now on to the details. There are a lot of them!
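As a sketch of what "just concatenate" means, here is a minimal encoder in Python. The field widths are my own assumptions for illustration (32-byte feed id and backlink, 8-byte big-endian integers, 64-byte signature), not a spec, and the optional entries are left out:

```python
import struct

def encode_metadata(feed_id, backlink, sequence, timestamp, content, signature):
    """Concatenate the metadata values in the predetermined order.

    Illustrative sketch only: the exact fields and widths are assumptions.
    feed_id, backlink, content and signature are raw bytes.
    """
    return b"".join([
        feed_id,                       # e.g. 32-byte ed25519 public key
        backlink,                      # e.g. 32-byte hash of the previous message
        struct.pack(">Q", sequence),   # 8-byte big-endian sequence number
        struct.pack(">Q", timestamp),  # 8-byte big-endian timestamp (ms)
        content,                       # already-encoded content bytes
        signature,                     # e.g. 64-byte ed25519 signature
    ])
```

No keys, no delimiters: the decoder knows the order, so the bytes are nothing but the values themselves.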
Message Size In Bytes
This is an interesting one with a few possible ways to go.
The first take on this: Messages have variable length, so to decode them, we need to tell the decoder the length. Thus we need to prefix it. This could be done either with a fixed-width integer (depending on what message size limit we end up settling upon, 2 bytes might be enough), or with a varint.
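For comparison, a varint spends one byte per 7 bits of payload, so lengths up to 127 fit in one byte and lengths up to roughly 16 KiB in two. A protobuf-style (LEB128) encoder is tiny; this is just an illustration of the option, not a commitment to this particular varint flavor:

```python
def encode_varint(n):
    """Encode a non-negative integer as a protobuf-style varint (LEB128):
    7 bits of payload per byte, high bit set on all but the last byte."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # final byte, high bit clear
            return bytes(out)
```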
A second view: The message is already transported somehow, and the transport framing already knows the length, so there is no need to include it again; simply reuse the length as specified by the transport layer. This places some restrictions on the transport protocol though: it can't do something like `<sum of length of three messages><msg1><msg2><msg3>`. The current transports don't do this (afaik), but I dislike the idea of tight coupling with transport protocol details.
And a third perspective: All metadata has to be self-describing in its length, so there's no strict need to prefix the length. A parser can just read through the metadata one piece at a time, allocating memory on the fly, and automatically knowing when to stop. An implementation can choose to do memory-related optimizations such as bulk allocation by taking transport framing data into consideration, but there's no tight logical coupling.
I currently favor the third perspective, but I'm not settled yet. Including a size makes parsing easier and adds to the robustness of the whole thing. If the message size was included, it should be the first entry, so that databases could simply drop it if it wasn't needed. I'd argue against putting it into the hash computation though (it is easy to ignore the first part of the encoding), since it does not add any actual information. It's just meta-meta-data that can be reconstructed from the logical model of the meta-data.
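To make the third perspective concrete, here is a sketch of such a parser in Python. The exact layout (fixed widths for the cryptographic fields, a varint length prefix on the content) is again an assumption for illustration; the point is that each piece tells the parser how far to read, so no outer length is needed:

```python
import io
import struct

def read_varint(stream):
    """Read a protobuf-style varint (LEB128) from a binary stream."""
    shift, result = 0, 0
    while True:
        byte = stream.read(1)[0]
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7

def decode_metadata(stream):
    """Walk the metadata one self-describing piece at a time.

    Illustrative layout assumption: fixed-width fields have known sizes,
    and the content carries its own varint length prefix.
    """
    feed_id   = stream.read(32)                   # fixed width: size is known
    backlink  = stream.read(32)
    sequence  = struct.unpack(">Q", stream.read(8))[0]
    timestamp = struct.unpack(">Q", stream.read(8))[0]
    content   = stream.read(read_varint(stream))  # self-describing length
    signature = stream.read(64)                   # parser knows this is the end
    return feed_id, backlink, sequence, timestamp, content, signature
```

Each `read` is bounded by information already consumed, so the parser terminates on its own without ever consulting the transport framing.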
continued in next post...