You are reading content from Scuttlebutt
@aljoscha %L9m5nHRqpXM4Zkha1ENTk5wNOXQMduve8Hc9+F0RLZI=.sha256

Merkle Trees for SSB Feeds

Or: Enabling partially subscribable feeds, but with a lot of words before I get to that point.

Like offchain content, this is another idea that has been floating through the scuttleverse at various points in time. A compact summary that is only useful if you already know what this is about: By changing the metadata of ssb messages, we could create merkle trees rather than linked lists. This allows verifying the k-th message of a feed by traversing only log(k) many messages, rather than the full k messages of the linked list in the current model.

This is an issue that

  • touches the very core of the protocol
  • seems to be both rather abstract and hammering on pointless detail at first
  • but turns out to have far-reaching consequences

This post will at times be very technical and somewhat computer-sciency, but I'll try to explain enough that anybody interested can follow the gist of it. It's also a pretty long read, but I hope a few of you can still take the time to read it. Some of it may look complicated at first, but once you gain some intuition, it all becomes manageable. It's definitely not too complicated for real-world use - dat uses a data structure like this internally.

Outline:

  • background on ssb security guarantees
  • why would we need merkle trees
  • how merkle trees work
  • implementing it via ssb metadata
  • complexity analysis
  • further optimizations
  • conclusion

CC protocol humans: @Dominic, @dinosaur,
@Piet, @cel, @Christian Bundy, @cryptix, @keks

Background: SSB's Security Guarantees

In order to meet its security guarantees, ssb verifies certain properties for all messages. The most basic one is to check that it indeed belongs to its claimed feed. This is done by checking the signature attached to every message. If that were all the guarantees ssb provided (and such a protocol is interesting in its own right), then there would be no need for the backlinks in each message's metadata. The backlink is simply a secure hash of the previous message. The first property gained through backlinks that might come to mind is immutability. Even the feed owner can not retroactively change the feed, since that would break the linked list of hashes. Once somebody knows the first k messages, the feed owner can not convince that person that any message up to k was different than that person first thought.

At a first glance, sequence numbers seem to give immutability already. If there were no backlinks, but you had a set of k messages, each with a different sequence number between 0 and k - 1, then you could still restore the correct order and you could detect any retroactive changes by just checking whether two different messages claim the same sequence number, at which point you'd declare the feed broken (just like a forked feed in ssb's actual model is considered broken). So why go through the trouble of a linked list?

If ssb only used sequence numbers, then a malicious feed could hand out different copies of itself to different peers, and they could not tell that something weird was going on until they compared feeds. But they'd never have an incentive to compare feeds; why would they request a message they already have?

Why do backlinks solve this, whereas sequence numbers fail? The list of hashes guarantees that the messages have actually been created in the order of their claimed sequence numbers.

Suppose you just asked me for the fifth message in my feed, and there were no backlinks. I could just make up some message content, give it seqnum 5, sign it, and hand it over to you. I would not need to commit to the content of any previous messages. If some time later you asked for the third message in my feed, I could again freely invent something. You have no way of knowing that the third message has actually been authored at a later point in time.

Backlinks prevent this and force everyone to adhere to the linear temporal order of messages. If I made up the fifth message, I'd also need to give it a backlink. But if I freely invented that backlink, and later you asked me for message four, I couldn't come up with a message that has that invented hash (if I could, the hash function would be broken and I could do much more profitable things than lying about ssb messages). So instead, to invent message five, I'd also need to invent the content of message four first. But since that one again has a backlink, I need to commit to the content of message three before that, and so on. As a result, ssb gives the guarantee that the order of messages in a feed corresponds to temporal order in real life. That's pretty amazing once you think about it.

This temporal order guarantee also translates into truly immutable feeds, where it is impossible to hand out more than one version of the feed to different parties.
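To make the chain argument concrete, here is a minimal sketch of backlink verification. This is not ssb's actual message format or hashing (real messages are signed JSON and the hash covers the signed encoding); it only illustrates how a retroactive edit breaks the chain:

```python
import hashlib

def backlink_hash(msg) -> str:
    # stand-in for ssb's message hash (real ssb hashes the signed encoding)
    return hashlib.sha256(repr(msg).encode()).hexdigest()

def verify_chain(messages) -> bool:
    # messages: list of (backlink, content) pairs in feed order;
    # the first message's backlink is None
    prev = None
    for msg in messages:
        backlink, _content = msg
        if backlink != prev:
            return False
        prev = backlink_hash(msg)
    return True

# build a small feed of five messages
feed = []
prev = None
for i in range(5):
    msg = (prev, f"message {i}")
    feed.append(msg)
    prev = backlink_hash(msg)

assert verify_chain(feed)

# retroactively changing message 2 breaks the link from message 3
tampered = list(feed)
tampered[2] = (tampered[2][0], "forged content")
assert not verify_chain(tampered)
```

Note that the check only needs the messages themselves; the forged copy fails even though every individual message in it could carry a valid signature.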

@aljoscha %/DmqNODlO97dAlm7a7jrEQlnF5XhH7dQv1+xjwCyJ0c=.sha256

The Problem

Ssb has that property, so why should there be a problem? The problem is one of complexity. If I gave you the k-th and l-th messages from the same feed, with k < l, how could you verify that message l has indeed been created after message k? You'd have to traverse the full hash chain from l back to k. In the worst case, k would be zero. Then, you'd need to traverse the full feed. So to verify message l, you need to perform l many checks - and to do so, you need to know the data of l other messages. Put into more mathematical terms: the length of a validity certificate is linear in the sequence number of the message.

What if you only cared about a subset of the messages of a feed, but still wanted the full security guarantees? With current ssb, your only chance is to download the full feed, even though you don't care about much of it. This is the problem that merkle trees can solve. But first a few words on a related concept: Out-of-order (ooo) messages.

Ssb provides ooo messages, where a server can ask its peers to deliver a message with a certain hash. That way, you can retrieve messages without downloading their full feed. That's fine, but it forfeits the guarantee about creation order. Instead, you trust that whoever gave you that hash has already verified the validity of that message. While that's fine for relatively inconsequential, one-off retrievals such as fetching a post in a thread you are interested in, it is not a solid foundation for regularly fetching a subset of a feed. And in general, it is somewhat dangerous that there are essentially two types of messages floating around: ones where you know for sure that creation order guarantees hold, and ones where you only trust somebody that they hold. Application developers need to cleanly separate those if things are to stay secure.

More generally, this split between verifiable and unverifiable messages makes the whole system more complex to reason about. This may sound vague and abstract, but this additional complexity has surfaced multiple times already, imposing metadata overhead on all messages.

I don't want to argue that ooo messages are a bad feature per se. But if there was an alternative that managed to preserve security guarantees, it should surely be considered. And merkle trees provide this alternative.

Merkle Trees

(This section is rather technical and assumes some familiarity with basic computer science themes regarding data structures and complexity.)

Merkle trees (see the wikipedia article) are collections that can give a certificate (basically a proof) that they contain a certain item. Crucially, the size of such a certificate is only logarithmic in the number of items, as is the verification time. Proof of membership is not exactly what ssb needs, but it is close enough to be adapted to our needs. There are a few other points that need some tweaking. For one, merkle trees assume a bounded size and a well-known root hash, but ssb feeds have neither. Furthermore, we want to avoid adding additional nodes to our message graph, so we'd rather store non-leaf nodes implicitly.

Let's start with a basic scenario to get a feel for the setting: The following graphic shows an ssb feed of eight messages and a corresponding merkle tree.

m0 to m7 are the messages, h0 to h7 are hashes of those messages, the dotted edges between the first two layers are ssb's regular backlinks. h<x><y> is the hash of the concatenation of h<x> and h<y>. You can imagine a time-axis going from left to right. Instead of the regular symmetric tree layout, I've moved the hash nodes to the first point in time at which they can possibly be computed. A non-dotted edge from node v to node u means that u can not have been computed if v didn't already exist (else, the hash function would be broken).

In the regular merkle tree setting, certificates work as follows: Assuming I already knew the hash of the item I'm interested in (say h6) and the root hash (h01234567), how can someone prove to me that h6 is in the tree? They only need to give me h7, h45, and h0123, and then I can check whether hash(h0123 | hash(h45 | hash(h6 | h7))) == h01234567. If it is, then h6 did indeed contribute to the creation of my well-known root value h01234567. What we care about here is that this also implies that h6 (and thus m6) was created before h7. And if I later asked for a certificate for h3, the correct certificate also implies that m3 was indeed created before m6 (by checking that the h0123 value is the same in both certificates (which happens implicitly since otherwise the final root hash would differ)).
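The certificate check for h6 can be spelled out directly. A sketch with plain sha256 over placeholder message contents (not ssb's real hashing):

```python
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

# leaf hashes h0..h7 of eight placeholder messages
leaves = [H(f"m{i}".encode()) for i in range(8)]

# inner nodes: h<x><y> = hash(h<x> | h<y>)
h01 = H(leaves[0] + leaves[1])
h23 = H(leaves[2] + leaves[3])
h45 = H(leaves[4] + leaves[5])
h67 = H(leaves[6] + leaves[7])
h0123 = H(h01 + h23)
h4567 = H(h45 + h67)
root = H(h0123 + h4567)  # h01234567

# certificate for h6: just the sibling hashes h7, h45 and h0123.
# The verifier rebuilds the path from h6 up to the root:
h6, h7 = leaves[6], leaves[7]
assert H(h0123 + H(h45 + H(h6 + h7))) == root
```

Three hashes certify membership among eight leaves; in general the path has length log2 of the tree size.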

Transferring this to ssb, the first problem is that we don't have a well-known root hash. But that is fine, we just send the root as part of the certificate instead. m6 is signed, so we already know that it is either a member of the correct feed, or the feed owner has been forking the feed. We can't detect forks with only a single message, since for a fork you always need at least two contradictory ones. Rather than collection membership, we only care about being able to verify that any message of seqnum < 6 that we receive at a later point in time must have already existed when we received m6. And that's what the merkle tree gives us. Well, except for a few problems we still need to deal with.

Assume we received m6 and the associated certificate c6 described above (h7, h45, h0123, and h01234567), and now we ask for m1. The certificate c1 we get consists of h0 and h01. The server we are requesting from might not even know about any messages that were published after m1. But now we can't check whether c1 contradicts c6. c6 contains h0123, but does not tell us how h01 must relate to it. Fortunately, there's an easy fix: Each certificate must contain all the diamond-shaped nodes (h0, h01, h0123, and so on) that existed up to that point in time. Each diamond node is the root of the complete binary merkle tree over the first 2^i messages, for some i. Each certificate is created by "working downwards" until one of these roots is reached, so by including them, we ensure that all certificates can be checked against each other.

@aljoscha %LeKA6U4euO23h7VxdRm8ajldUFEMIXLGqXWtoPYMiu4=.sha256

So are we done now? Nope. Our c6 certificate assumed that m7 was available, but what if it didn't exist yet (or the certificate creator didn't know about it yet)? This is depicted in the following graphic:

Here we don't have a single merkle tree, but a merkle forest with three roots: h0123, h45 and h6. This feed has a length that is one less than a power of two; such feeds are the worst-case scenario as they result in the largest number of disjoint trees. The naive certificate attempt consisting of only these roots doesn't work, since the three hashes are completely independent. I could first create m6, and once someone requested it, I could then invent messages m0 to m5, compute h0123 and h45, and then use these hashes in the certificate. To prevent this, we add Hh45h6 (the hash of the concatenation of h45 and h6), and analogously Hh0123h45. Hh45h6 must be published (and signed) together with m6, Hh0123h45 must be published (and signed) together with m5. This gives us a valid certificate h0123, h45, Hh0123h45, h6, Hh45h6, h01, h0 for m6. (credit for this goes to the dat whitepaper's merkle tree and replication sections)
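The bridge-hash idea for the length-7 forest can be sketched like this (again plain sha256 over placeholder contents; the certificate and forgery shapes here are illustrative, not a wire format):

```python
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

# a length-7 feed: a merkle forest with roots h0123, h45 and h6
leaves = [H(f"m{i}".encode()) for i in range(7)]
h01 = H(leaves[0] + leaves[1])
h23 = H(leaves[2] + leaves[3])
h0123 = H(h01 + h23)
h45 = H(leaves[4] + leaves[5])
h6 = leaves[6]

# bridge hashes, published (and signed) together with m5 and m6
Hh0123h45 = H(h0123 + h45)
Hh45h6 = H(h45 + h6)

def verify_m6_certificate(cert) -> bool:
    # the verifier recomputes each signed bridge from the forest
    # roots it covers, tying the otherwise independent roots together
    return (H(cert["h45"] + cert["h6"]) == cert["Hh45h6"]
            and H(cert["h0123"] + cert["h45"]) == cert["Hh0123h45"])

cert = {"h0123": h0123, "h45": h45, "h6": h6,
        "Hh0123h45": Hh0123h45, "Hh45h6": Hh45h6}
assert verify_m6_certificate(cert)

# roots invented after the fact no longer match the signed bridges
forged = dict(cert, h45=H(b"made up later"))
assert not verify_m6_certificate(forged)
```

Since the bridges are signed when m5 and m6 are published, a feed owner cannot later substitute different forest roots without producing contradictory signed data.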

The following graphic depicts the graph where all of these additional hashes are present, marked in blue.

As a consequence, there are two possible certificates for all non-power-of-two'th messages: The one described above that bridges across different merkle trees, and a more compact one that can only be computed once the feed length has advanced to the next power of two. A simple solution is to always use the one that can immediately be computed. But these two don't conflict, so the actual implementation can use the more efficient one whenever possible. Servers just need to be aware that even if they know that the efficient one can be computed, their peer's view of the feed might be outdated, forcing the peer to send the inefficient certificate instead.

Before I move on to the actual implementation, a quick note on correctness. I won't do a full proof here, that would need a bunch of definitions first. But a basic proof sketch is based around three properties of certificates:

  1. A certificate must somehow depend on all previously published messages. In the graph, that means that there must be a directed path from each message to one of the nodes in the certificate.
  2. A certificate must be self-consistent (unlike the naive attempt in the length-7 feed above). In the graph, that means that there must be a directed path from the certified message to each node in the certificate.
  3. It must be possible to check that any two certificates do not contradict each other. In the graph, that means that after taking the union of the two certificates, each vertex from the certificate for the earlier message must be reachable from a vertex from the other certificate.

You can then go on and show that the scheme I presented fulfills these criteria. My intuition tells me that there's also a proof of optimality somewhere in there. But the more difficult part is actually showing that these criteria are exactly what we need to give ssb's security guarantees. Formal proofs in that domain are outside my current knowledge level though.

Implementation in SSB

This is a very direct translation of the above tree construction scheme into ssb. It omits some obvious optimizations, these are discussed later.

To the regular metadata of the message msg_s with sequence number s, append the following (in order, before signing)

  • hash(msg_s) as field "hash0"
  • hash(msg_(s - 1).hash0 | msg_s.hash0) (called "hash1") iff s == (l * 2^1) - 1 for some l
  • hash(msg_(s - 2).hash1 | msg_s.hash1) (called "hash2") iff s == (l * 2^2) - 1 for some l
  • ...
  • hash(msg_(s - 2^(k - 1)).hash<k - 1> | msg_s.hash<k - 1>) (called "hash<k>") iff s == (l * 2^k) - 1 for some l
  • If s is one less than a power of two, we are done. Else we need to add the additional hash that allows bridging between disparate merkle trees:
    • add hash(msg_(s - 2^(k - 1)).hash<k> | msg_s.hash<k - 1>) (called "bridge")
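Which "hash<k>" fields a given message carries follows directly from the condition s == (l * 2^k) - 1, i.e. from how often 2 divides s + 1. A small sketch of just that bookkeeping (field naming as in the list above):

```python
def hash_levels(s: int) -> list:
    """Levels k >= 1 for which msg_s carries a "hash<k>" field,
    i.e. those k with s == l * 2**k - 1 for some l (equivalently,
    2**k divides s + 1). Level 0, the message's own hash, is
    always present."""
    levels = []
    k = 1
    while (s + 1) % (2 ** k) == 0:
        levels.append(k)
        k += 1
    return levels

# message 7 closes the complete trees over 2, 4 and 8 messages:
assert hash_levels(7) == [1, 2, 3]
# message 5 only closes a 2-message tree:
assert hash_levels(5) == [1]
# even-numbered messages carry no inner tree hashes at all:
assert hash_levels(6) == []
```

This also shows why the metadata is unevenly distributed: only every 2^k-th message carries a level-k hash, which is where the log_2(seqnum) bound in the complexity section comes from.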

This may look intimidating, but it does nothing more than constructing the trees described above. Also a note on addressing: I used names like "hash42" to indicate the level of the tree, which is nice for humans. But for machines, there are relatively simple numbering schemes that provide bijections between natural numbers and those inner tree nodes (stealing from the dat whitepaper again). So addressing those is actually very efficient, both in terms of in-memory access time, and address size when sending stuff over the network.
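One such numbering is the "flat tree" in-order scheme from the dat whitepaper (picking this particular scheme is my assumption; the post above doesn't fix one):

```python
def flat_index(depth: int, offset: int) -> int:
    """In-order "flat tree" index as used by dat: leaves sit at the
    even indices, and the inner node at a given depth and
    left-to-right offset lands between its children."""
    return (2 * offset + 1) * 2 ** depth - 1

# leaves (depth 0) are numbered 0, 2, 4, 6, ...
assert [flat_index(0, o) for o in range(4)] == [0, 2, 4, 6]
# their parents (depth 1) are 1, 5, 9, ...
assert [flat_index(1, o) for o in range(3)] == [1, 5, 9]
# the root over the first four leaves (depth 2, offset 0) is node 3
assert flat_index(2, 0) == 3
```

Every node, leaf or inner, gets a single natural number, so references stay compact on the wire and index directly into flat storage.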

@aljoscha %5BBfWSMjmhhXu2ejMXkFTTxXNSH33UdneDRe57kCNxY=.sha256

Complexity

The scheme encodes a binary tree in the feed, the number of inner nodes in a binary tree is linear in the number of leaves, so we end up with only linear space overhead (compare with e.g. this approach that has quasilinear overhead) in the feed. The metadata is not evenly distributed across nodes, but it is bounded at log_2(seqnum).

More concretely: On average, this adds two hashes to each metadata entry (one for the bridge (except for powers of two), one for the regular merkle tree (on average)).

Certificates are paths to the root of the tree and then back to the first message, so they have length at most 2 * log_2(seqnum). Verification time complexity has the same bound.

Optimizations

A few rather trivial ones:

  • no need to store hash(msg_k) in the feed, you can compute that directly from msg_k
  • no need to store hash(hash(msg_<k-1>) | hash(msg_k)) in the feed, you can compute that directly from msg_k as well (msg_k already contains hash(msg_<k-1>) - it's the backlink)
  • we can interpolate between feed size overhead and certificate size by dropping the highest non-leaf layers of the tree, sending a bounded number of leaf nodes instead as part of the certificate. The actual certificate can then be computed from those leaf nodes. This is not really an optimization, it just allows us to keep feeds more compact at the cost of making certificates less performant. Note that this only adjusts constants, it never improves the linear feed size overhead or breaks the logarithmic certificate size/performance.

Another nifty optimization can be done when sending certificates around. Two certificates for messages from the same feed will always share some nodes (if they didn't, they couldn't prove the guarantees we are interested in). So it is not always necessary to send full certificates. Certificates share more data the closer the two sequence numbers are. When requesting a message of seqnum k, the server can also attach the next smaller and the next larger seqnum from the same feed for which it already has certificates. The peer can then easily compute the minimum amount of data it needs to send. <unproven intuition> I think when requesting a full feed in random order that way, the total amount of certificate overhead is only linear in the feed size (rather than quasilinear without that optimization)</unproven intuition>.

Conclusions

We can efficiently allow random-access verification of messages. The current approach for random-access messages (ooo) simply gives up and forfeits some of ssb's security guarantees. Merkle-ooo can fix that.

Another application would be initial sync (CC @andrestaltz): You could first request the newest n messages of a feed, and hand them to the client. Afterwards, you can request the remainder of the feed, doing traditional verification via backlinks.

More generally though, this could provide the basis for partial feed subscriptions. Plugins could expose their own replication rpcs that efficiently determine what kind of messages to send, and the merkle-tree validation would do the rest. The core does not need to care about the criteria by which the subset of the feed is selected. Due to the optimization where it is not necessary to send full certificates, this would allow linear-time verification of the partially subscribed feeds, just like regular ones. And that's the really, really, really powerful feature enabled by merkle trees.

The inability to do partial subscriptions is one of the two weakest points of ssb in my opinion (the other one being the non-hierarchical nature of identities). Merkle trees can fix that.

User has not chosen to be hosted publicly
User has not chosen to be hosted publicly
@aljoscha %0eJLwmZ8uePGPtoTx4PAq8ZwhVOmcMxWwBSQHpobzCM=.sha256

@andrestaltz_phone

Quick question: would it be easy to fetch 'about' messages (which are typically early in the feed)?

That's a separate concern. The whole feed infrastructure is unopinionated about replication rpcs. The obvious ones are still full feed replication and random-access via hash (aka ooo but with security guarantees). But of course it is possible to build others as plugins.

A really sweet spot between efficiency and expressiveness/granularity could be prefixes of message types. So you'd say "I have feed @foo at sequence number 42, but only those messages whose type starts either with about or chess::." The peer could then check whether it has any messages newer than seqnum 42 whose type matches one of the prefixes and send them.

This could be implemented pretty efficiently by sending around radix trees of the type prefixes. And the empty radix tree could be special-cased to mean "I replicate the full feed".

A different, much more expressive (and thus inefficient and complex) plugin could allow sending code in a query language, or even some sort of non-turingcomplete programming language (à la dhall) for computing at the peer's machine what kind of messages to retrieve.

And on the other end of the complexity spectrum, you could just build a plugin that exposes an rpc that does nothing but exchanging about messages.


I haven't made up my mind whether I would want to have something like the prefix-based pubsub in the core rpc set, but I think if we want something in-between full-feed replication and ooo as a core rpc, then it is a very good candidate.

@Dominic %PjcPuxx4P3ZFJn80Btdk7OuTVS59p0OvB+wgNmL8o5E=.sha256

the concern that @andrestaltz brings up:

Quick question: would it be easy to fetch 'about' messages (which are typically early in the feed)?

Is really the most important thing if we are proposing partial replication - the context and meaning of later messages is sometimes related to earlier messages (for example, whether or not you followed or blocked someone, or about messages...) - so if you are replicating only the latest messages... how do you know you haven't missed something, and what sort of impact does that have on the application? what kind of applications works with this, or is broken by this?

regards out-of-order messages: ooo messages are securely verified to be the messages your friend mentioned. Your friend posts a message referring to a message by someone you don't replicate. So you pull that one message in so that you can understand the context of your friend's message, it's the signature on your friend's log that secures it.

If you have two out-of-order messages by the same author, o1 and o2, you can't prove that o2 really signs o1 - but what is that even useful for? you only know or care that these messages exist because of your friend, so I suggest that we've already captured the important thing.

User has not chosen to be hosted publicly
@aljoscha %qdqVSMJQDmd67pkD8qznOsEFyDMQNM2f29VSwQ6zbp8=.sha256

@Dominic

If you have two out-of-order messages by the same author, o1 and o2, you can't prove that o2 really signs o1 - but what is that even useful for? you only know or care that these messages exist because of your friend, so I suggest that we've already captured the important thing.

Doesn't that argumentation apply to ssb as a whole? Why do we even bother with backlinks, we can just use sequence numbers then.


Speaking of ooo: I can't see the message %eDQzaPn... that Dominic has replied to, and patchwork doesn't have ooo yet. Could somebody paste that message and/or link to the author?

@Dominic %n+dL0nfl/gXdMvcqDkWDet+jw2BQS/q4W5QVuBtEXdY=.sha256

@aljoscha it only works because we linked by the hash. If we linked by sequence number ooo wouldn't be secure. And even if we had merkle trees, if we linked by sequence number it would not be secure if the signer's key is compromised, but with hashes it still is.

btw, I think that message was by @Dash

@Dominic %YV3STx50yElDRXEYkg1z6QOIh3S8/C43Hvk1oC9LW5E=.sha256

One approach that might change the story with partial replication (regards missing context) is if you used a merkle tree that had non-membership proofs. I have heard that these exist but I do not understand how they actually work as of this writing.

@aljoscha %8EVkMv/JpHiBOhcUO+qkw01UrVXKl7BJgC/R1uQr/90=.sha256

@Dash Huh, looks like I did nothing but reinvent hypercore. Oh well...

I guess that means it is time to reimplement ssb on top of hypercore feeds, but keeping our own replication protocol? Only semi-joking here.

User has not chosen to be hosted publicly
User has not chosen to be hosted publicly
User has not chosen to be hosted publicly
User has not chosen to be hosted publicly
User has not chosen to be hosted publicly
User has not chosen to be hosted publicly
User has not chosen to be hosted publicly
User has not chosen to be hosted publicly
@aljoscha %fJTyyLK8+9t9/Uhp6nmIr/2o46xnFdnAkYqeIQP8xSU=.sha256

Through this thread I have @mentioned everyone whose replies I can see. If I didn't mention you, then your ssb messages don't reach me.


@nanomonkey

What if each message type had their own log-chain and sequence number. So that I can say, I don't care about your chess and gathering messages, but I'd love to have and replicate your about and post messages. This seems ideal to me. You could start a client out replicating about messages, and then add on other feeds as interest requires.

Ack, that's something to consider. It is fairly orthogonal to the merkle-tree discussion though, so I won't go into this approach right now if that's ok for you. I'll definitely pick this up when the time for the metadata redesign comes. If you (or perhaps अभिनाश) want to flesh this out further, I'd appreciate if you could fork the thread to keep things somewhat structured.


@keks

One note though: please try to make your posts shorter. Specs can be walls of text (to avoid ambiguity), but starting a discussion like that doesn't scale.

Duly noted. I will try to clearly separate these background-knowledge-transfer-posts from protocol-discussion-posts. But the background-knowledge-transfer-posts definitely won't go away, I want to keep this reasonably accessible to the wider, technically-interested scuttleverse (so no worries @jeswin).

First question: In the graphs, what's the difference between the diamond, hexagon and rectangle nodes, and why does the path to m0 consist of diamonds and the others are rectangles? I suppose hex nodes are basically incomplete subtrees?

  • diamond: Root of a complete binary merkle tree that contains the first 2^k messages of the feed for some k. The second half of a path to m0 always runs along these (with the first half going downwards from some m_i until a diamond node is reached).
  • hexagon: roots of complete binary merkle trees that do not span messages from the beginning of the feed and are not inner nodes of one of the diamond trees
    • once the feed grows enough, a new diamond node gets created and the hexagons covered by it become regular rectangles in this visualization
      • this is only visual, the actual data structure does not need to tag node types at all
  • non-blue rectangle: a node that is not a root of one of the relevant (i.e. diamond- or hexagon-rooted) complete binary merkle trees

Second question: I assume some of these hashes are signed by the feed author. I guess it's all but the blue ones? Otherwise I don't see a benefit over the current system, because if I ask you for an individual message, you could just make up some hashes.

All of them need to be signed, even the blue ones. Else a malicious peer could substitute a different blue hash, resulting in conflicting certificates for an otherwise valid feed. Only the feed owner is allowed/able to publish data that results in conflicting certificates (just as currently only the feed owner is allowed/able to fork the feed).

I think each hash needs a separate signature (I got this wrong in the "implementation" section), because otherwise certificate size would blow up to log(n) * log(n) (maybe?) *waves hands and hopes nobody cares about a precise complexity analysis*.

Bonus question: How does all of this relate to Merkle Mountain Ranges?

Another name for the same thing. This whole process would have been easier had I already known what to search for. But at least I had to think through everything myself. For example neither dat nor the mountain-range stuff I found mention the problem that certificates need the full path back to m0 so that all conflicts can be detected immediately. Maybe someone more knowledgeable about #dat could check whether hypercore does indeed not give that guarantee, and whether that conflicts with what dat wants to guarantee.

@aljoscha %q8cLlR5qDQNQ8IRYWgNbk83sezBTrEJ4/eWTcv/OyxU=.sha256

@Piet

The merkle tree seems like a really beautiful idea.

Yeah, binary trees are magical.

I particularly agree with @Dominic's point:

the context and meaning of later messages is sometimes related to earlier messages ... what sort of impact does that have on the application? what kind of applications works with this, or is broken by this?

That's definitely something application designers need to consider, yes. But we'd be giving them more options, not constraining them. An application would choose whether it wants self-contained messages (working with arbitrarily chopped-up feeds), or whether it wants to perform reductions over multiple messages into a single piece of application state (à la event-sourcing). The latter case would break.

But it is possible for applications to get the best of both worlds: Any message that relies on previous messages can link to them, and the application can then instruct the server to (transitively if necessary) fetch those via merkle-ooo. But in any case, all of that is part of the application layer, not the protocol layer. (CC @Dominic)

Now that I think about it, making these event-sourcing-backlinks part of the protocol would save a bunch of roundtrips. Hmm... (I don't think that's going to be a good idea, but it's one to consider nonetheless)

As for older applications: I'm not aware of any current client that uses event-sourcing across multiple message types, so a partial feed subscription mechanism like the type-prefix based one should be sufficient to prevent any breakage. And the type-prefix solution also allows event-sourcing across different message types authored by the same application if they share a common namespace (e.g. the fictional "chess.move" and "chess.annotate" messages). That's a simple middleground where you can assume that the clients have the full log and can thus do event sourcing without any explicit backlinks, yet disinterested clients don't need to download the full feed.

I'm trying to find the killer real world pain in ssb we can solve that justifies the implementation and increase in complexity throughout the stack.

To me, "ooo introduces messages with second-class security guarantees" is more of a "killer real world pain" than any kind of application you could come up with. But then again, I may have a non-standard view of what constitutes "real world" concerns.

The ooo security guarantees are what led me down this rabbit-hole, partially subscribable feeds were just a byproduct. But I guess it's easier to lobby for those. The all-or-nothing approach of current ssb may be the last part of the protocol that doesn't scale well. But again, I feel like that might not be "real worldy" enough for everybody. So I guess someone else should do the marketing =D

@aljoscha %nZDDuGa5bu10f72vW8ZwjRsv6+azROP9vB+HFU4dDVw=.sha256

Here is a strawman proposal for what it could look like if we go (moderately) crazy on partial feed replication rpcs:

In current ssb, a replication request consists of a feed id and the newest sequence number on that feed known to the requesting server. If both peers have that feed, but their newest known seqnum differs, the peer who knows about more messages sends the new ones to the other.

A partial replication request would additionally contain the following values:

  • range: exactly one of
    • all: don't filter by seqnum
    • start(amount): only exchange messages with sequence number below amount
    • end(amount): only exchange messages among the newest amount many
      • note that this results in different ranges for the two peers, but that's by design
  • types_exact: A set of utf-8 strings. Only exchange messages whose type is in the set, unless the set is empty.
    • the set can be stored as a radix tree (or really any compression scheme)
  • types_prefixes: A set of utf-8 strings. Only exchange messages whose type begins with a string in the set, unless the set is empty
    • the set can be stored as a radix tree (or really any compression scheme)

These values act as filters, only messages that pass all filters are replicated.
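
The filters above could be sketched in Rust like this (a minimal sketch; the type and field names are mine, not part of the proposal, and the feed id / known-seqnum parts of the request are omitted):

```rust
/// Minimal sketch of the proposed partial replication request
/// (names are illustrative, not a spec).
enum Range {
    All,        // don't filter by seqnum
    Start(u64), // only messages with sequence number below `amount`
    End(u64),   // only messages among the newest `amount` many
}

struct PartialReplicationRequest {
    range: Range,
    types_exact: Vec<String>,    // empty set = no exact-type filtering
    types_prefixes: Vec<String>, // empty set = no prefix filtering
}

/// A message is replicated only if it passes *all* filters.
/// `newest` is the highest sequence number the sending peer knows about.
fn passes(req: &PartialReplicationRequest, seqnum: u64, newest: u64, msg_type: &str) -> bool {
    let in_range = match req.range {
        Range::All => true,
        Range::Start(amount) => seqnum < amount,
        Range::End(amount) => seqnum + amount > newest,
    };
    let exact_ok = req.types_exact.is_empty()
        || req.types_exact.iter().any(|t| t.as_str() == msg_type);
    let prefix_ok = req.types_prefixes.is_empty()
        || req.types_prefixes.iter().any(|p| msg_type.starts_with(p.as_str()));
    in_range && exact_ok && prefix_ok
}
```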

This scheme allows encoding a reasonable subset of all possible feed partitions. For fully general feed partitions, you can still expose replication rpcs via plugins that use merkle-ooo to do the actual message exchange.

@andrestaltz This scheme would be sufficiently expressive for all your manyverse needs, right?


And a strawman proposal for allowing applications to use message schemes where state is reduced over multiple messages of the same feed:

Each message metadata contains up to eight integers, each smaller than the sequence number of the message. When replicating a message (e.g. via partial replication, merkle-ooo, or transitively), the messages from the same feed whose sequence number is one of those integers are also replicated.

Huh, that turned out surprisingly simple.
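
A sketch of the replication closure this implies, assuming each message's declared dependency seqnums are available locally (function and variable names are mine):

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical sketch: each message may declare up to eight smaller seqnums
/// whose messages must travel with it. `deps` maps a seqnum to its declared
/// dependencies. Returns the full set of seqnums to replicate alongside `start`.
fn replication_set(deps: &BTreeMap<u64, Vec<u64>>, start: u64) -> BTreeSet<u64> {
    let mut result = BTreeSet::new();
    let mut stack = vec![start];
    while let Some(n) = stack.pop() {
        // only recurse into seqnums we haven't visited yet
        if result.insert(n) {
            if let Some(ds) = deps.get(&n) {
                for &d in ds {
                    debug_assert!(d < n); // each dep must be smaller than its message's seqnum
                    stack.push(d);
                }
            }
        }
    }
    result
}
```

Because every declared integer is strictly smaller than its message's sequence number, the dependency graph is acyclic by construction, so this closure is always finite.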

User has not chosen to be hosted publicly
@aljoscha %rzKG1M/B9+KicacpfoVjESG9fcpy2BUnmOZXwMqJRwc=.sha256

@Jan van Brügge Jup, pretty much. The mechanism is fully general though, it is not tied to any decision about whether to actually provide replication-by-type in the core or not. That's a decision for (slightly) higher up in the protocol stack.

The merkle-ooo approach is indeed strictly more powerful, but also strictly more complex. The different sigchains are much less expressive, but they wouldn't require any additional metadata overhead.

User has not chosen to be hosted publicly
User has not chosen to be hosted publicly
User has not chosen to be hosted publicly
@aljoscha %tZulTFxntwnWD88WbePrceczgbjukGbSZ+LOXSTK9VI=.sha256

@rabble

This might be naive, but would using merkel trees for the feeds allow a user to force a rebase of their feed? If for example, somebody needed to delete something. They could post a 'this feed has been rebased at x hash' on their feed, then post something a link to what the new hash for that content would be after each content which references the orphaned hash for the content?

I'm having trouble parsing that paragraph. Are you suggesting a way of deliberately forking and then later merging the feed? If so, I think the fundamental problem with that is that the feed needs to stay valid even if only a subset of its messages are known. So no subset of the feed may look like it was forked in the usual, forbidden way. Also, I don't quite see how this is related to merkle-trees rather than simple linked lists. So I guess I'm missing something?

As usual with deletion, the following caveats apply:

  • you can't reliably delete any data that ever left your machine
  • a whole bunch of things rely on ssb feeds staying immutable
  • application-level "please ignore this message"-messages are already possible today
  • the freedom to locally delete content while retaining the ability to verify a feed is something we could get through #offchain-content, no need for merkle-shenanigans there

And finally: As a German, I find the idea of a merkel tree somewhat disturbing :stuck_out_tongue_closed_eyes:

@mikey %ZBeZLMz/D+Mzi5H6kzbg5BMjAMl1kNKW56yyq+MTqOU=.sha256

FWIW: Seeing the strawman proposal got me over the line. I'm keen to see this idea go further.

@Aljoscha @piet: as the Sunrise Choir navigator :tophat:, for this to be considered in the roadmap, i'd need clarity on how our old feeds could gracefully "upgrade" to this new feed format. or in a general sense, how to migrate feed data to new formats without breaking cypherlinks (my best guess is to package up many feeds which cross-reference each other into a single blob), ideally so we can move beyond old protocol implementations. also, bonus points in my book if we can re-use Dat's hypercore protocol, i reckon that's a big win.

User has not chosen to be hosted publicly
User has not chosen to be hosted publicly
User has not chosen to be hosted publicly
User has not chosen to be hosted publicly
@mikey %VrwLrNIFmCl0A+D6HSp7RpDkCceKP+62QO/gV08/v4A=.sha256

ping @Aljoscha, @Tim Makarios wants your attention on their comment: %oDD2oqV...

Surely the certificate (h7, h45, h0123, h01234567) proves only that the messages supposedly prior to m6 were created before the certificate, not necessarily before the message. Otherwise it would also prove that h7 was known before m6, which is backwards.

To prove that a message was created after certain other messages, I think the certificate has to be part of the message itself, contributing to its hash, doesn't it?

I'm using Manyverse, so I can't see your diagrams.

User has not chosen to be hosted publicly
@aljoscha %5wxUCloLlFkmzsLqODoQusYMx06tKDSEf7KyegGLaec=.sha256

@Tim Makarios (on Manyverse)

I followed your feed, so I can see your posts now. I might unsubscribe at a later point though. No hard feelings, I'm just trying to keep the ssb fire hose manageable. But thank you for reaching out, I value all comments on these posts.

Surely the certificate (h7, h45, h0123, h01234567) proves only that the messages supposedly prior to m6 were created before the certificate, not necessarily before the message. Otherwise it would also prove that h7 was known before m6, which is backwards.

The (h7, h45, h0123, h01234567) certificate for m6 fully validates m6, but it can only be created once m7 exists. It does not declare that h7 existed before m6, it just uses a path through the merkle tree that happens to include m7. This certificate (once you also add h0 and h01) encodes the creation order of the first eight messages. It doesn't say that m7 existed before m6, it says that m6 existed before m7.

If m7 isn't known yet, you need to create a different certificate - that's what the third post is all about.

To prove that a message was created after certain other messages, I think the certificate has to be part of the message itself, contributing to its hash, doesn't it?

Not exactly. A certificate is just a collection of hashes. These hashes refer to the original message, and are indeed signed. There's a large number of possible certificates (at the very least one per message). So if all of these were put into the feed, we'd get a blow-up in feed size. The beauty in this mechanism is that these hashes verify each other, so you can just pick the subset you actually need.

User has not chosen to be hosted publicly
@aljoscha %16F5YV1yju98iy5q3o6x0ljnZ09aQEAHaC20nmYDlN8=.sha256

@Tim Makarios (on Manyverse)

You are right, that scenario actually breaks things. So the remaining question is: "Can you still break it if we always use the certificates that can be created when the message itself is created"? If the answer to that is "yes", then we can probably scrap the whole thing. And if the answer is "no", then I'll need an actual proof for that. Proving that such a scheme works is something I'd do in any case before rolling it out.

I'm also thinking right now about using additional backlinks further back into the list rather than using the additional hashes (the blue nodes) for bridging disparate trees. It seems kinda weird that there are different mechanisms (combined hashes vs single-message backlinks) for establishing creation order.

What's more, if the bogus hash of m5 doesn't contribute to h6, why can't I rebundle m6 with a new certificate, this time based on the real hash of m5? Then even careful verifiers that verify everything won't detect a problem.

m6 creates a backlink to m5, or in your scenario, a bogus backlink. A "careful verifier" can just traverse all backlinks, so there's no way of fooling those.

@aljoscha %8S3Sh3PI3bNSk9/bzVr5nIeewfPg5UDiFm7Y/PslkJc=.sha256

Ok, I think I found the way to make the blue nodes less ugly: Instead of publishing and signing these extra hashes, each message (except for those one less than a power of two) includes the previous "mountaintop" (diamond node, the "left" input to the blue node of that message) in its (signed) metadata. But then, the certificates would need to validate the right spine of each subtree, which would be more data than the scheme with the extra hashes. Still logarithmic though. And who knows, maybe the other one can be somehow circumvented... I really need to sit down and do a correctness proof (or disproof...).


A few general things that are nice to know:

  • the whole scheme (placement of additional hashes etc.) works by describing each number as a sum of distinct powers of two - i.e. its binary representation
  • message k needs to store t certificate hashes, where t is the number of trailing ones in the binary representation of k
  • to compute the sequence number of the mountaintop previous to message k (where k is not itself a mountaintop), set all non-leading ones in the binary representation of k to zero and subtract one
  • the "height" of that mountaintop is the number of places in the binary representation of k minus the number of leading ones
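
My reading of these rules as code (interpreting the representations as binary; function names are mine, and `prev_mountaintop` assumes k is not itself a mountaintop, i.e. not of the form 2^j - 1):

```rust
/// Message k stores one certificate hash per trailing one
/// in k's binary representation.
fn cert_hash_count(k: u64) -> u32 {
    k.trailing_ones()
}

/// Seqnum and height of the mountaintop (diamond node) previous to message k:
/// keep the leading run of ones in k's binary representation, zero the other
/// ones, subtract one. Assumes k > 0 and k is not of the form 2^j - 1.
fn prev_mountaintop(k: u64) -> (u64, u32) {
    let places = 64 - k.leading_zeros();                   // binary digits of k
    let leading = (k << k.leading_zeros()).leading_ones(); // length of the leading run of ones
    let top = (((1u64 << leading) - 1) << (places - leading)) - 1;
    (top, places - leading) // height = places minus leading ones
}
```

For example, k = 6 = 110b has a leading run of two ones, giving mountaintop seqnum 5 with height 1 (the tree over messages 4-5), and k = 4 = 100b gives mountaintop 3 with height 2 (the tree over messages 0-3).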
User has not chosen to be hosted publicly
@aljoscha %KPseOV91dFC8Tk3od3UTb2Z1ufSIgzF37l1N5UBqWlE=.sha256

@Tim Makarios (on Manyverse)

For example, if B is the finite set of powers of 2 whose sum is n (that is, basically, the binary expansion of n), then for each element b of B, include in message n a backlink to the message whose sequence number is 1 less than the sum of all members of B that are greater than or equal to b.

Wow, this is beautiful.
Looks like it is indeed amortized O(1) overhead in signed data, and O(log(n)) in certificate space and validation time.

I've pondered constructing a deterministic skip list over the feed, but I didn't make the jump to use backlinks only. Your approach is skip-list-ish, but not exactly a skip-list. So my take on this theme:

For all k such that 2^k <= n: if n+1 is divisible by 2^k, include the hash of n - 2^k in the message.
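
Both backlink schemes from this exchange can be sketched side by side (a sketch, with function names of my own choosing):

```rust
/// Tim's scheme, as quoted above: let B be the set of powers of two whose sum
/// is n. For each b in B, backlink to (sum of all members of B >= b) - 1.
fn backlinks_binary_expansion(n: u64) -> Vec<u64> {
    let mut targets = Vec::new();
    let mut partial = 0u64;
    for k in (0..64).rev() {
        let b = 1u64 << k;
        if n & b != 0 {
            partial += b; // running sum of the powers of two >= b
            targets.push(partial - 1);
        }
    }
    targets
}

/// My skip-list variant: for all k with 2^k <= n, if n+1 is divisible
/// by 2^k, include the hash of message n - 2^k.
fn backlinks_skiplist(n: u64) -> Vec<u64> {
    let mut targets = Vec::new();
    let mut p = 1u64;
    while p <= n {
        if (n + 1) % p == 0 {
            targets.push(n - p);
        }
        p <<= 1;
    }
    targets
}
```

For n = 13 = 8 + 4 + 1, Tim's scheme links to messages 7, 11 and 12, while the skip-list variant links to 12 and 11; both always include the direct predecessor n - 1.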

@aljoscha %lvWUTql6F4zhglK6QhRsHUhR8f+c17I4Y8cy3T60XJQ=.sha256

There's a drawback to the purely backlink-based approaches though (at least to my naive skip list, but it also applies to your suggestion I think): While there is a certificate of logarithmic length for any two messages x, y that proves that x was published before y, that certificate is specific to that pair of messages. With the merkle-approaches, one of the goals was that there is a certificate per sequence number, and any two of those certificates allow verifying the existed-before relation of those two messages.

When requesting a message by hash, you don't know its sequence number yet. So you can't know beforehand what certificate to request.

Even worse, the peer might only partially replicate that feed itself, and it might not have the messages that make up the particular certificate. I think that's a showstopper and forces us to come up with a correct merkle-tree-based approach?

@aljoscha %qZ5dtIH+w8LVCcpG5NjguNdz9axNtpxV3YnMk0AFSBw=.sha256

Aaand I think none of this actually works.

Past Aljoscha wrote:

Assume we received m6 and the associated certificate c6 described above (h7, h45, h0123, and h01234567), and now we ask for m1. The certificate c1 we get consists of h0 and h01. The server we are requesting from might not even know about any messages that were published after m1. But now we can't check whether c1 contradicts c6. c6 contains h0123, but does not tell us how h01 must relate to it. Fortunately, there's an easy fix: Each certificate must contain all the diamond-shaped nodes (h0, h01, h0123, and so on) that existed up to that point in time.

The same problem also applies to verifying message 8 and 15. And 16 and 23, and 24 and 31. And so on. So we'd need to include the spines of all merkle-subtrees with height four. But it's worse than that: For a feed of length at least 32, the same problem arises for verifying the creation order of messages 16 and 31. So we also need to include the spines of all merkle-subtrees with height five. And so on. As a result, we'd need to include all layers of the tree except the three lowest ones. And that is in O(n).


So either we

  • find the error in my reasoning
  • figure out how to actually get message-specific certificates with size in O(log(n))
  • figure out how to work with an approach that produces logarithmic certificates specific to each pair of messages
  • give up

=(

User has not chosen to be hosted publicly
User has not chosen to be hosted publicly
@aljoscha %3NEa/lHbD0xT5lS1XcY5e6bZr8144HyCYG71ecDZNgY=.sha256

@keks No worries, this didn't feel like blame to me at all, just valuable meta feedback. Now that I get to work full-time on protocol stuff, it is always good to be reminded that I can't expect everybody else to invest as much time as me.

User has not chosen to be hosted publicly
@Dominic %m98tX4W+HAsW41jZmRq8X0s4Pb8l0/UvCyf6Nl0hpy4=.sha256

I was looking for this image:

merkle-tree

Which I drew for @mafintosh when we were hanging out in Copenhagen one time. (just to show I was well aware of merkle trees, but chose not to use them in ssb to avoid other complications, specifically to make indexes easy)

original post

@aljoscha %7lRLhQUrX6Z8FKg7aJzhV9JpVsirONz90VZAKaTwVwE=.sha256

@Tim Makarios (on Manyverse)

I don't think it's possible to do this in less than O(n) space. Otherwise, for sufficiently large n, message number n will have a certificate that proves that it was created after each message with a smaller sequence number, but the certificate will have less than, say, 1 bit per previous message. I don't think that can be cryptographically secure.

Intuitively I want to agree, but since we are dealing with probabilistic guarantees in any case, it might still be possible to cheat the "1 bit per previous message" limit.

The reason I was considering that variant was as a way of avoiding single points of failure. In my earlier suggestion (and your variant, I think), message 2^k-1 is a single point of failure. If that message is lost, you'll never be able to verify that messages with higher sequence numbers were created after messages with lower ones. Actually, every odd sequence number is a minor single point of failure; for example, if message 9 is lost, you'll never be able to verify that message 10 was created after message 8, but you'll still be able to verify that both were created after message 7 (which you wouldn't be able to do with the current simple linked list).

This is funny, I was actively seeking out those points. Instead of viewing them as points of failure, I viewed them as desirable overlap between certificates. Half of the entries in the skiplist scheme are the left spine of the tree, and that is shared between all of them. That makes it very likely that exchanging partial certificates is enough. My hunch is that if this were to form the basis for partial feed subscription, the larger certificates with high overlap would be overall more efficient than a scheme that has shorter certificates and lower (relative) overlap.

But in any sub-O(n) out-of-order scheme, I'm pretty sure that it's possible that two of your peers will each have a different message from the same feed (but each has only that one message from that feed), and you'll get both messages, but neither peer will be able to help you verify their creation order, unless there happens to be a direct backlink from one message to the other.

And I guess that makes the whole thing unsuitable. The property you stated about verifiability of messages received from the same peer is nice, but just not enough.

I think ssb can either drop the creation-order-guarantee, or it won't get verifiable out-of-order messages.

User has not chosen to be hosted publicly
@Dominic %ZqykYtVPRcLQI0pYAqhsxupAiYe1dFqXm8mT6Y2K+48=.sha256

@Tim Makarios @aljoscha it's most certainly possible to verify that any two messages are in the same merkle tree using only O(log(n)) data. You don't need to put the branches of the tree in the message (which would expand each record to log(n) size, making the space for the entire feed O(n*log(n))), you just need to hash the top branches together and put that hash in the record. And if you are missing a single record, knowing its hash is enough to validate messages following that.

However, for partial replication (of say, specific message types) this doesn't tell you whether you have all the messages of that type. I have heard that there are also merkle tree designs that can do non-membership proofs, but I don't understand how they work yet. Could be worth looking into.

It's likely that many of the important messages are encrypted, so if you go back and ask for particular encrypted messages, it will basically reveal who is talking to whom; but if you get every message, that doesn't happen.

That's just an example of why you need to look at the greater impact of protocol design decisions, and think through what sort of thing might be built on top of it in the application layer.

I think this sort of thing really makes sense for something like streaming live video or audio. Or for sensors or say, a weather report where having the latest thing is useful, but having the old things is not as valuable.

@aljoscha %Y4mSXUAguoyqXhyZCiItTJsddY2c07CVZiRBmwEZuHU=.sha256

@Dominic

it's most certainly possible to verify any two messages are in the same merkle tree using only O(log(n)) data.

Yes, but that's not the problem. The problem is verifying that the message of lower sequence number has already existed when the message with the higher sequence number was created. Current ssb has that guarantee, the merkle approaches do not.

@Dominic %BWw0dIDbDOwgDjjqk662E//41//Oudsu41KTuK9R+zk=.sha256

@aljoscha that is incorrect. To prove that the earlier message already existed, you just have to show that the later message signs the hash of that message, or the hash of something containing the hash, or the hash of something containing the hash of something... etc recursively.

So if I can take two messages, one with a low sequence number and one with a high sequence number, and I ask you if you can prove that one signs the other - if you can provide any set of hashes, and values they may be hashed with, such a hash in the value of the later message is produced - that is cryptographic proof that the earlier message really was earlier.

If you look at the picture I posted - the boxes are the messages/leaves (note - leaves are assigned odd numbers, and hash trees assigned even numbers). The numbers written to the right of the box are the hashes that leaf signs - so leaf 25 signs 20 and 8.

Let's say we want to prove that message 25 signs message 5. Message 25 signs hash(tree[20]+tree[8]) (+ is concat) and message 5 is inside tree[8]. So we just have to show that tree[8] contains hash(message[5]); we can do that if we are provided with tree[7] and tree[2].

tree[8] == hash(
  tree[2] +
  hash(
    hash(msg[5]) + // aka tree[5]
    tree[7]
  )
)

and we know that message 25 signs hash(tree[20]+tree[8]), so there you go: proof that message 5 existed before message 25.
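
The verifier side of this example can be sketched as follows. Note that `h` here is a toy stand-in built on std's `DefaultHasher`, purely for illustration; a real implementation would use a cryptographic hash like sha256:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy stand-in for a cryptographic hash over concatenated inputs
/// (illustration only; not collision-resistant like sha256).
fn h(parts: &[u64]) -> u64 {
    let mut hasher = DefaultHasher::new();
    parts.hash(&mut hasher);
    hasher.finish()
}

/// Given hash(msg[5]) plus the sibling hashes tree[2], tree[7] and tree[20],
/// recompute the value that message 25 is claimed to sign:
/// hash(tree[20] + hash(tree[2] + hash(hash(msg[5]) + tree[7]))).
fn recompute_signed_value(msg5_hash: u64, tree2: u64, tree7: u64, tree20: u64) -> u64 {
    let tree8 = h(&[tree2, h(&[msg5_hash, tree7])]);
    h(&[tree20, tree8])
}
```

If the recomputed value matches what message 25 actually signs, the verifier has proof that message 5 existed before message 25; a bogus hash for message 5 fails to reproduce the signed value.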

@aljoscha %31l7NnXcqLVztw0R2O0hnRjaX92xZ1H7ACEBpzTkab4=.sha256

@Dominic

that is incorrect. To prove that the earlier message already existed, you just have to show that the later message signs the hash of that message, or the hash of something containing the hash, or the hash of something containing the hash of something... etc recursively.

With all respect, did you even bother reading this thread before proclaiming that we didn't think about this? This post gives a counterexample to your claim.

@aljoscha %wvo9xm0oCkkX87rE4Zwyx+ghr6gV/H35Qy81u9VTQg4=.sha256

Meh, the full problem is not given in that link, but it is the root cause. We can only use the parts of the tree that already existed at the time of message creation, and that leads to problems with certificate size.

@Dominic %6X1/aDJs0IOtQ/4hfSWPnInZdWB8RJlcwCU9WLHTDiY=.sha256

@aljoscha this thread is already very long and dense! I noticed you making obviously wrong statements - so I did what you would have done. Looking back at the older messages, it appears the mistake you are making is that you are not authenticating the root. It doesn't help much to just share around what your replicating peer claims is the root hash, this should be signed by the author instead - I'd suggest as a part of the message, but some merkle tree implementations just sign the root hash separately.

That way, the example @Tim Makarios gives with a bogus hash for message 5 can't actually work. If message 6 signs the whole tree and there was a random number instead of a hash at 5, then if you later created a new message 5, you could create a proof that message 6 signed something else instead of message 5, and therefore the feed has forked. You could skip a message and just pretend that some random number is the hash, but you couldn't replace it with another message without getting caught out.

Also, a "certificate" is a claim by an authority - what we are talking about here is a "proof".

@aljoscha %9r8sUUj8cLO6KaL1+MRFREkLHDQGxfHwaQvf7pVHpMk=.sha256

@Dominic

(I'll continue to use the term "certificate" - both because changing terminology right now would be confusing, and because it's a standard term from complexity theory.)

I think I get how you determine which leaves sign which nodes, assuming that node 31 erroneously omits signing node 20.

But can you elaborate on what exactly the certificates look like? As argued here, it is not sufficient to provide short certificates per pair of messages. Instead, we need to provide some logarithmically sized certificate per message, such that any two such certificates taken together can be used to check the claimed existed-before relation.

To show why this is needed, let's take a look at your example of verifying that node 5 existed when node 25 was created, but also considering what might lead up to this situation.

Suppose peer C has all messages of feed Z. Peer A only cares about message Z-25, so it asks C for that message. What does C send? We have no idea, since that's something your example brushed over. But we do know that C only sends logarithmically much data (since otherwise we are outside our desired bounds in any case). For the sake of argument I'll just assume C sends "the most sensible" set of data: 25 (duh), 20, 8, 0 (and if we are being charitable, we can also add in 4, 2 and 1, it won't matter).

Meanwhile, B only cares about message Z-5, and it also asks C for it. C answers with 5, 2, 0 (and 1 for good measure).

Now, C wanders off to Siberia, never to be seen again. But after some time, A wants to receive message Z-5, and B happens to have it. So they happily start gossiping, and after a moment of awkwardness, they realize that B is unable to prove to A that 5 did indeed exist before 25, because this would involve Z-7, which neither of them has.

If B had known in advance that it would need to supply a certificate validating the pair (5, 25), it could have requested the necessary data from C. But then again, B might have gotten Z-5 from a peer that didn't know about any messages newer than Z-5. So clearly, we can't place the burden of providing additional information on B. Instead A (i.e. the one who supplies the greater number) should have requested more data from C. But which data? We can't assume A to know in advance which other messages it might later want to have. So the "fix" is that C should have sent along more data to A than just the original set {25, 20, 8, 0} (and possibly 4, 2 and 1).

This is the core problem we failed to solve. The schemes we could come up with where C sent enough data (where "enough" means that it is possible to validate any previous message) had to send more than just a logarithmic amount.

Additionally, you need to ensure that whoever gives message 25 to A also needs to actually have all the data to include in the "super-certificate".

@Dominic %ZX4tJOwgV7O7IhHbQflzby5+Fwf1Qqtt/bqXn8y21BQ=.sha256

@aljoscha node 31 is correct. It signs the previous node, then the previous tree (tree[26], two nodes), then the next previous tree (tree[8]). The hashes you send are all correct, but you can leave off the extras.

I can't remember why I included the pubkey at the start. If that's the identity of the feed it's not necessary to send that. Let's drop that.

If you do decide to do a full replica of a feed, you don't need to send any additional data - you have everything you need to reconstruct a proof.

When you request a one-off message, and it's the first message you have, your peer (who we are assuming has the entire feed) sends that message, plus the preceding full tree hashes - log(n) of them. (Using the numbering system it's easy to figure out which trees with some bitwise stuff, but I don't recall the code... and it's quite late here now.)

If you already have a message in the feed: "give me 5, I have 25". So whoever sends Z-5 to A, let's say D, knows that A has 25, and assumes that they also have the tree hashes preceding 25.
So D sends tree[2], msg[5], tree[7]=hash(msg[7]), and tree[12]. That's enough to reconstruct

tree[8] == hash(
  tree[4]=hash(
    tree[2] +
    hash(
      tree[5]=hash(msg[5]) +
      tree[7]
    )
  ) +
  tree[12]
)

If they have msg[25], we can assume they'll have tree[8], so that should be enough to convince them msg[5] is signed by 25.

and A now has tree[7], which they can provide to C when needed.

Since you can reconstruct the structure if you know the numbers, a proof/certificate is just the lower number, the higher number and a list of hashes.

Also, now that A has two messages, if they want a message in-between those two, they say "I want message 17, I have 5 and 25". Their peer then sends sufficient hashes to tie 5 to 17 (because we know they'll have the trees before 5) and 17 to 25. A sends request(17, 5, 25)
and D returns [proof(5, 17), get(17), proof(17, 25)].

If A requests a range, we only need send extra hashes for the gaps, because the hashes of the messages in the range can be recalculated.

I also made a better picture:

Arrows point to the arguments to a hash function - dotted lines for what a message signs, solid lines for trees. Unfortunately, to make it more confusing, I used the odd numbers for the trees like dat's paper. Looking at this now, I think I prefer even numbers, so that the big trees are easily recognizable as powers of two.

It's possible that a peer has a message - say only 17, but not 25, so can't prove that 17 was part of 25. I think it would be better in this case just not to give out msg[17]. On the other hand, if they'd asked for msg[15], they can tie it back to tree[8], which they know is part of 25 even without knowing 25.

@aljoscha %BJJOeqLdBjHoyrjOP9OLUXg6YKM9sR/FqDzkYb2CRs8=.sha256

@Dominic

Still not sure I get the hashes signed by 31. It signs 29, 26 (which covers 25 and 27) and 8 (which covers 1 - 15), so 17 - 23 are left uncovered (which is why I thought you missed 20, as that would cover exactly the missing leaves). But more importantly:

your peer (who we are assuming has the entire feed)

This is the critical part that everything boils down to. Under that assumption, everything works. But is this a reasonable assumption? It means that only certain peers can successfully provide partial replication, namely those who have previously requested more than just a logarithmic amount of data. This would result in partial replication being second-class.

You could only perform fully general partial replication with a peer who has the complete feed. This violates the principle that it doesn't matter whom you get your data from, with whom you replicate. In particular, full feed replication is transitive (D can get data from C who gets data from B who got it from A). Current ooo also provides transitivity, by dropping security guarantees. D could get a single ooo message from C, who got it via ooo from B, who got it via ooo from A. D might never get to see the remainder of the message's feed. But merkle-based partial replication would not be transitive. The data could travel only one "hop" from someone who has the full feed. So "D gets a subset from C, who got a (not necessarily strict) superset of that set from B, who got a superset of that from A" does not work (if we want to be able to verify all messages); the best you can do (in full generality) is "D gets a subset from C, who got the full feed from B, who got the full feed from A".

I consider this transitivity of passing data around to be crucial, and we failed to come up with a logarithmic scheme that maintains both transitivity and verifiability.

User has not chosen to be hosted publicly
User has not chosen to be hosted publicly
@Dominic %F1lXAFFSWTHPq0n4xMeXYH42GmTUB8zJMpDLLaG1LQo=.sha256

I'm sorry, yes you are correct. It should sign 20.

You are also correct that we do not get general partial replication. Sometimes it works - a partial peer may be able to provide a proof anyway - but sometimes it doesn't. Merkle trees are most typically used in applications where there are, by design, a number of full nodes that have everything, and then lightweight clients who trust but verify.

Although, if the default is to replicate a whole feed, and partial replication is only used around the edges, then it would probably still work anyway, because you'd usually be able to find someone with a full feed. But if everyone was doing sparse partial replication, that probably wouldn't work. The real question is: what proportion of the time does it not work? Which is really an application question - the worst case is probably peers wanting a few random messages, but that's probably not the usual case.
To make a model of what the usual case might be, you need to think about how possible applications are likely to use it.


Security of ooo:

I think the answer here is that out-of-order messages have a different security interpretation. They are only secure if you already know the hash of the message you want. In-order-messages are considered secure if you know the public key they are signed with.
ooo messages actually have a stricter security requirement!

For example, key compromise doesn't affect ooo messages, because their hash doesn't change. After key compromise, we can't trust new messages we see for a feed (which may have fake old timestamps and low sequence numbers), but we can still trust the messages that were replied to (and thus hashed, like ooo messages).
ooo messages will still be just as secure as they are now when quantum computers become available. If we requested ooo messages by key and sequence number, then I'd agree, they'd have relaxed security, but hashes are actually significantly stronger than signatures.

in-order-messages are for when we explicitly care about future messages signed by that key; we're trusting the key. But in ooo we explicitly do not care about future messages, we are not trusting anything, just that there isn't a hash collision.

@aljoscha %/hanLnbb/QOAxdnt6TdtpjeO0Jqu2Wlg5COWr6XyZqI=.sha256

@Dominic I think patchwork somehow swallowed my response. The gist of it:

  • ooo messages actually have a stricter security requirement!

    let's just agree that ooo-messages and regular messages have different security requirements =P

  • which is really an application question

    I'm very wary of putting something into the protocol that works well for some applications, but not for others. The (conceptual) simplicity of ssb's API is a really strong point, and adding a feature that only works well under specific circumstances (which are partly out of the control of the application designer) is detrimental to that simple interface

@aljoscha %7umzE4rQ8TRYkKzIi4kN/l0LKhUEtYgLf1QqcL1Ct0U=.sha256

I came up with a workaround for some of the problems discussed above, I figure I should at least sketch it out here...

Assume a signature-skip-list where each message n contains the hashes of all messages k such that the k-th digit in the binary representation of n is a one.

The problem we looked at so far was to find, for each message n, a logarithmic number of messages older than n, such that the combined sets of any two such messages hold enough information to verify that the newer one (transitively) signs the older one. In this framing, we didn't manage to find a solution. But what if we allow including messages newer than n itself in this set? We can then include the smallest set of messages that shows that x signs n, where x is the smallest power of two greater than or equal to n.

This satisfies our validation criteria: Given two messages n and m (wlog n < m), let x be the smallest power of two greater than or equal to n. If m > x, then m contains the hash of x, and we can then verify n from x. If m < x, it also works out I think (proof by intuition... my gut feeling is that the skiplist of m and the reverse-skiplist between n and x always meet at some point... nobody bothers following this anyways, right?).

For this to make sense, requesting a single message n also involves requesting some newer messages, which may not even exist yet. To guarantee fully transitive replication, implementations need to keep non-selfishly requesting those newer messages even though they may have already received and verified n. Still, I think this is a fairly reasonable assumption to make.
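To make this concrete, here is a small hedged sketch of which extra messages a request for message n pulls in. It assumes the backlink rule described above (message n links to the running prefix sums of its binary decomposition, each minus one); `backlinks` and `cert_to_next_power` are illustrative names, not part of any ssb implementation.

```rust
use std::collections::BTreeSet;

/// The set of messages the n-th message links back to (sketch of the rule above).
fn backlinks(n: u64) -> BTreeSet<u64> {
    let (mut acc, mut rest, mut out) = (0u64, n, BTreeSet::new());
    while rest > 0 {
        let p = 1u64 << (63 - rest.leading_zeros()); // largest power of two <= rest
        acc += p;
        rest -= p;
        out.insert(acc - 1);
    }
    out
}

/// Messages to additionally request so that x = smallest power of two >= n
/// transitively signs n: follow backlinks from x down towards n.
fn cert_to_next_power(n: u64) -> BTreeSet<u64> {
    let x = n.next_power_of_two();
    let mut cert = BTreeSet::new();
    let mut current = x;
    while current > n {
        // smallest backlink target that is still >= n
        current = *backlinks(current).range(n..).next().unwrap();
        cert.insert(current);
    }
    cert
}

fn main() {
    // For n = 5, x = 8: the backlink path is 8 -> 7 -> 5, so besides 8 itself
    // we also request message 7.
    let expected: BTreeSet<u64> = [5u64, 7].iter().copied().collect();
    assert_eq!(cert_to_next_power(5), expected);
    // If n is itself a power of two, no extra intermediate messages are needed.
    assert!(cert_to_next_power(8).is_empty());
}
```

Note that when n is a power of two, x = n and the set is empty, which matches the intuition that powers of two are the "anchors" of the scheme.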

CC @Tim Makarios (on Manyverse) in case you come back to the scuttleverse at some point =)

@aljoscha %+x64NA8QrmIsdxM4Xi+fiBAoAosxCklYXWZXx4yWu3Q=.sha256

It works! Here's a Rust implementation (gist with syntax highlighting and tests):

use std::collections::BTreeSet;

/// Find the largest power of two <= the given number (`None` for 0).
fn prev_power_of_two(n: u64) -> Option<u64> {
    if n == 0 {
        None
    } else if n.is_power_of_two() {
        Some(n)
    } else {
        Some(n.next_power_of_two() >> 1)
    }
}

/// Return the set of sequence numbers whose hash the n-th message has to include.
pub fn link_to(n: u64) -> BTreeSet<u64> {
    // Each natural number can be uniquely decomposed into a sum of powers of two (i.e. a binary
    // representation). The set of messages to link to depends on this decomposition:
    // Begin with an additional accumulator of zero, then for all those powers of two in descending
    // order: Add the power of two to the accumulator and link to the message whose sequence number
    // is one less than the accumulator. Note that this results in the zero-th message not linking
    // to anything, and each other message links to at least its predecessor. Each message links to
    // as many previous messages as the binary representation of the sequence number has ones,
    // resulting in O(n * log(n)) total size of the feed.
    let mut acc = 0;
    let mut out = BTreeSet::new();

    while (n - acc) > 0 {
        match prev_power_of_two(n - acc) {
            None => return out, // unreachable: the loop condition guarantees n - acc > 0
            Some(largest_pow_2) => {
                acc += largest_pow_2;
                out.insert(acc - 1);
            }
        }
    }

    debug_assert!(out.len() == (n.count_ones() as usize));
    return out;
}

/// Compute the smallest possible certificate that proves that m transitively links to n.
pub fn find_min_cert(n: u64, m: u64) -> BTreeSet<u64> {
    assert!(n <= m); // Good luck finding a certificate otherwise.
    let mut cert = BTreeSet::new();

    // The smallest certificate is obtained by iteratively following backlinks starting at m, going
    // to the smallest message greater or equal to n.
    let mut current = m;
    while current > n {
        current = *link_to(current).range(n..).next().unwrap();
        cert.insert(current);
    }

    // Since the backlinks can jump up to the next-smaller power of two in each iteration step,
    // the certificate contains at most O(log_2(n)) entries.
    return cert;
}

/// Check whether a given set of messages proves that message m transitively links to message n.
pub fn check_cert(n: u64, m: u64, cert: &BTreeSet<u64>) -> bool {
    assert!(n <= m);

    // The backlinks have been chosen in a way such that a certificate is valid if and only if it
    // contains the minimal certificate.
    return cert.is_superset(&find_min_cert(n, m));
}

/// Suppose we have message n, and a peer wants to verify that it is signed by some m. While it is
/// easy to find a logarithmically sized certificate for a pair of messages via
/// `find_min_cert(n, m)`, there's still a problem: We don't know in advance which m the peer will
/// provide. Storing the results of find_min_cert for all possible m takes linear space, which we
/// want to avoid.
///
/// Instead, we need to store a logarithmically sized set of messages depending only on n, such
/// that the union of this set and the corresponding set for m is a valid certificate for the pair
/// (n, m) for any n and m. This function computes such a set.
pub fn cert_pool(n: u64) -> BTreeSet<u64> {
    let mut pool = BTreeSet::new();
    // Let (l + 1) be the largest power of two less than or equal to n, and let (o + 1) be the
    // smallest power of two greater than n. The certificate pool needs to contain messages to cover
    // the following cases:
    //
    // m <= l <= n
    // l <= m < n
    // n < m <= o
    // n <= o <= m
    match prev_power_of_two(n) {
        None => {
            // For n == 0, the cert pool is just the zero-th message itself.
            pool.insert(0);
            return pool;
        }
        Some(prev_pow) => {
            let l = prev_pow - 1;
            let o = (n + 1).next_power_of_two() - 1;

            // Intuitively, we make sure that we can connect o to n, and n to l, and l to 0, in a way that
            // we can always "catch" m in-between. The last case where o < m is handled by symmetry (the
            // certificate pool of m "catches" n in-between).
            pool.append(&mut find_min_cert(0, l));
            pool.append(&mut find_min_cert(l, n));
            pool.append(&mut find_min_cert(n, o));
            // All three of those sets are logarithmic in size.

            return pool;
        }
    }
}
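To see the pieces in action, here is a worked usage sketch; `link_to`, `find_min_cert`, and `cert_pool` are restated compactly from the implementation above (same logic, condensed) so the snippet compiles on its own.

```rust
use std::collections::BTreeSet;

// Condensed restatement of `link_to` from above.
fn link_to(n: u64) -> BTreeSet<u64> {
    let (mut acc, mut rest, mut out) = (0u64, n, BTreeSet::new());
    while rest > 0 {
        let p = 1u64 << (63 - rest.leading_zeros()); // largest power of two <= rest
        acc += p;
        rest -= p;
        out.insert(acc - 1);
    }
    out
}

// Condensed restatement of `find_min_cert` from above.
fn find_min_cert(n: u64, m: u64) -> BTreeSet<u64> {
    let (mut cert, mut current) = (BTreeSet::new(), m);
    while current > n {
        current = *link_to(current).range(n..).next().unwrap();
        cert.insert(current);
    }
    cert
}

// Condensed restatement of `cert_pool` from above.
fn cert_pool(n: u64) -> BTreeSet<u64> {
    if n == 0 {
        return [0u64].iter().copied().collect();
    }
    let prev_pow = if n.is_power_of_two() { n } else { n.next_power_of_two() >> 1 };
    let (l, o) = (prev_pow - 1, (n + 1).next_power_of_two() - 1);
    let mut pool = find_min_cert(0, l);
    pool.extend(find_min_cert(l, n));
    pool.extend(find_min_cert(n, o));
    pool
}

fn main() {
    // Message 11 = 0b1011 links back to messages 7, 9 and 10.
    let links: BTreeSet<u64> = [7u64, 9, 10].iter().copied().collect();
    assert_eq!(link_to(11), links);

    // The minimal certificate showing 11 transitively signs 2 follows 11 -> 7 -> 3 -> 2.
    let cert: BTreeSet<u64> = [2u64, 3, 7].iter().copied().collect();
    assert_eq!(find_min_cert(2, 11), cert);

    // The union of two cert pools is always a valid certificate, e.g. for n = 5, m = 11.
    let mut union = cert_pool(5);
    union.extend(cert_pool(11));
    assert!(union.is_superset(&find_min_cert(5, 11)));
}
```

The last assertion is the point of `cert_pool`: each peer stores only its own logarithmically sized pool, yet any two pools combine into a certificate for that pair of messages.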
User has chosen not to be hosted publicly
@aljoscha %+dKbtXJXSTSVxZvH43D6CPj9frIFdz446CFdxiYg6wY=.sha256

Note to future travelers: Lipmaa, Helger. Secure and efficient time-stamping systems. Tartu University Press, 1999 already solved large parts of this discussion (see also here for more links).

User has chosen not to be hosted publicly
@cryptix %qHmmhUx6h7h8X2MMhKc1XHCsY4TcxeMMKvsFM+CqRSk=.sha256
Voted Note to future travelers: [Lipmaa, Helger. Secure and efficient time-stampi
@Linas %LWy5FDEaRh4Vuskpl5zYNexjFAPcA+2b7RdljHpnsQw=.sha256
Voted [@Piet](@U5GvOKP/YUza9k53DSXxT0mk3PIrnyAmessvNfZl5E0=.ed25519) I think the