I thought about self-organizing messages as well, although the ideas I'm presenting here use a more explicit approach. I am new to the scuttleverse and its concepts, so there might be some obvious flaws in my thoughts. Anyways, here we go:
Nearly everything in the scuttleverse is identified by a unique hash. Message types, however, are identified by plain strings. Some quotes from the scuttlebot documentation:
"There is no official mechanism for making sure message-types interoperate, except for the documentation which you're reading here. As it becomes clear that new types are coming into common use, we'll add them to this site."
Manually posting information about the intended use to a central place seems primitive compared to the scuttlebot architecture.
"To avoid accidental collisions with other applications, it's a good idea to add your org or application name (or both!) to the message type."
We are using a system where everything is uniquely identified by a hash, but the fundamental building block of all communication relies on namespacing?
This seems to be such a fundamental decision that I'll just assume you had good reasons to choose the current system instead of hashes as message types. But even with string identifiers, it would be nice to come up with a self-organizing system. The meaning of messages should not be dictated up-front, but should rather emerge through their usage. One way of achieving this would be to represent message types and message consumers (e.g. frontends like patchwork, or maybe patchbay plugins/modules) in the scuttleverse. Just like Smalltalk uses objects to represent language concepts themselves (classes, control-flow, etc.) at runtime, the scuttleverse could use messages to represent message types and consumers.
For example, somebody might add a message-entity to the scuttleverse, which represents a move in a tic-tac-toe game. This entity consists of nothing but its identifier. Next, I might write a GUI frontend for playing tic-tac-toe via scuttlebutt. I'd add a consumer-entity to the scuttleverse, which represents the frontend. This entity can refer to the tic-tac-toe message-entity, and claim that it uses the following fields of such a message: game, activePlayer and position. Additionally, the consumer-entity could provide human-readable information about how it interprets these fields. Of course, nobody forces me to set up the message-entity, or to be truthful about which fields the consumer-entity claims to use.
Other consumers might expect different fields in a tic-tac-toe message, or they might only consume a subset of fields. Every tool can still interpret messages as it sees fit. But by querying which tools accept which messages, and what they expect of these messages, we get a self-description of the system.
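The tic-tac-toe example could be sketched roughly as follows. This is only an illustration in plain JavaScript: the type names `message-entity` and `consumer-entity`, the `uses` and `fields` properties, and the placeholder ids are all made up here, and the in-memory array stands in for real feeds.

```javascript
// Hypothetical message shapes; none of these type or field names are an existing convention.
const messages = [
  // A message-entity: nothing but an identifier (the key stands in for the message hash).
  { key: '%tictactoe-move', content: { type: 'message-entity' } },
  // A consumer-entity describing a GUI frontend and the fields it claims to read.
  {
    key: '%tictactoe-gui',
    content: {
      type: 'consumer-entity',
      name: 'tic-tac-toe GUI',
      uses: [{
        entity: '%tictactoe-move',
        fields: {
          game: 'link to the message that started the game',
          activePlayer: 'feed id of the player making the move',
          position: 'index of the board cell, 0 to 8'
        }
      }]
    }
  },
  // Another consumer that only reads a subset of the fields.
  {
    key: '%game-stats',
    content: {
      type: 'consumer-entity',
      name: 'game statistics',
      uses: [{ entity: '%tictactoe-move', fields: { game: 'used to count moves per game' } }]
    }
  }
];

// Which consumers claim to handle a given message-entity, and which fields do they expect?
function consumersOf(entityKey, msgs) {
  const result = [];
  for (const msg of msgs) {
    if (msg.content.type !== 'consumer-entity') continue;
    for (const use of msg.content.uses || []) {
      if (use.entity === entityKey) {
        result.push({ consumer: msg.content.name, fields: Object.keys(use.fields) });
      }
    }
  }
  return result;
}
```

Calling `consumersOf('%tictactoe-move', messages)` lists both consumers together with the fields each one claims to use, which is the kind of self-description the paragraph above has in mind.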
Another interesting dimension to this is that users could publish statements such as "I use these consumers to handle this message-entity". This gives credibility to the message-consumer's claims on how it deals with messages. At this point, we could build some pretty interesting (and useful) features for scuttleverse clients. If a friend sends me a message, but my client does not know how to handle it, it could look up which consumer the friend would use to handle the message. Then, it could recommend that I install this consumer. This would be especially powerful for plugin-based clients (e.g. patchbay).
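A client-side lookup along these lines might look like the sketch below. The `uses-consumer` statement type and its `entity` and `consumer` fields are invented for illustration; a real client would read such statements from the friend's feed rather than from an array.

```javascript
// Hypothetical statements of the form "I use this consumer to handle this message-entity".
const statements = [
  { author: '@alice', content: { type: 'uses-consumer', entity: '%chess-move', consumer: '%chess-plugin' } },
  { author: '@alice', content: { type: 'uses-consumer', entity: '%tictactoe-move', consumer: '%tictactoe-gui' } },
  { author: '@bob', content: { type: 'uses-consumer', entity: '%tictactoe-move', consumer: '%other-gui' } }
];

// Given an incoming message we cannot handle, suggest the consumers its author uses for that type.
function recommendConsumers(incoming, knownTypes, allStatements) {
  const type = incoming.content.type;
  if (knownTypes.has(type)) return []; // we can already handle it, nothing to recommend
  return allStatements
    .filter(s => s.author === incoming.author)
    .filter(s => s.content.type === 'uses-consumer' && s.content.entity === type)
    .map(s => s.content.consumer);
}

// Example: alice sends a tic-tac-toe move to a client that only knows posts and votes.
const incomingMove = { author: '@alice', content: { type: '%tictactoe-move', position: 4 } };
const suggestions = recommendConsumers(incomingMove, new Set(['post', 'vote']), statements);
```

Only the sender's own statements are considered here, so the recommendation reflects what the friend actually uses rather than what any stranger claims.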
Finally, the message types for describing message-entities and message-consumers would be message-entities themselves. There is no need to add any of this at the protocol level; you would just add some messages which happen to be interpreted as a meta-description.
Nobody would be forced to use this, you could simply ignore the whole thing. I just believe that having a self-aware, reflective ecosystem of message-entities and message-consumers would make for a healthy, robust and extensible system. For example, if you want to write an alternative frontend with overlapping functionality with other frontends, you can query the system itself on how existing messages should be handled, rather than looking up disparate sources of documentation.
@Aljoscha we did consider this actually, but decided that developing a really good system that is suitable in the context was its own rabbit hole, at least as deep as ssb itself.
Although, the type field, which is mandatory, is allowed to be long enough to hold message ids, so that someone could develop one, if they wanted.
Suppose I define a meta-message type, and its type is the hash %123...321=.sha256. Now I can create a message {type: "%123...321=.sha256", displayName: "vote"} to represent the message type vote. So far so good. But I also need to create the representation of the message type %123...321=.sha256 (the type of meta-message descriptions). The type field of this message would need to refer to its own hash, which is, as far as I understand the protocol, impossible, right?
It would be a shame if it was impossible for the message metamodel to describe itself, so let's hope that there can be a workaround via a stringly-typed message.
@aljoscha correct, if it's possible to generate a hash that contains itself then that hash is broken. The solution in this case is just to have a special syntax for a self-reflexive link. So maybe %% refers to the message's "self".
Yep, a special case for the meta-message type itself seems to be the easiest solution. Clients that consume or produce these meta-messages need to be aware of the special message anyway. So expecting them to be aware of the special case would be inelegant, but probably not a problem.
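To make the special case concrete, here is a small sketch of how a client might resolve the type of a message when %% is read as a link to the message itself. The %% marker is only the suggestion from this thread, `resolveType` and `typeChain` are hypothetical helpers, and the placeholder keys are not real message ids.

```javascript
const SELF = '%%'; // proposed marker meaning "this message's own id"; not an existing convention

// Returns the id of the message that describes msg's type.
function resolveType(msg) {
  return msg.content.type === SELF ? msg.key : msg.content.type;
}

// The root meta-message: its type is itself.
const metaType = { key: '%meta', content: { type: SELF, displayName: 'message type' } };
// An ordinary meta-message describing the "vote" type; it is an instance of the root.
const voteType = { key: '%vote-type', content: { type: metaType.key, displayName: 'vote' } };
// A regular message whose type is the id of the "vote" meta-message.
const vote = { key: '%some-vote', content: { type: voteType.key, value: 1 } };

// Follow type links upwards until a message is its own type.
function typeChain(msg, byKey) {
  const chain = [msg.key];
  let current = msg;
  while (resolveType(current) !== current.key) {
    current = byKey.get(resolveType(current));
    if (!current) break; // the type description is not in our local store
    chain.push(current.key);
  }
  return chain;
}

const store = new Map([metaType, voteType, vote].map(m => [m.key, m]));
```

With this, `typeChain(vote, store)` walks from the vote to its meta-message and then to the root, and stops there because the root resolves to itself. A message with a plain string type simply yields a chain containing only its own key.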
Some more questions on messages: Are there any messages that get special treatment from the protocol? The message types docs single out Post, About, Contact, Vote and Pub. Is that because they form the basis of the social graph, or are they relevant at the protocol level? The docs for these message types don't mention any special behaviour, except for Pub messages.
Also a question on links: Does SSB build a search index for any field in any message that contains a link, or only for some messages it knows about? What about arrays of links? Links in nested objects/arrays?
@Aljoscha hey, i tried to find when we've talked about a similar pattern, best i came up with was: %ycm4IqQ..., %hSEmE75..., %PQSNjDI....
in your case, wouldn't the "type" of a message schema be a normal hash, pointing to a message which describes that specification? then that specification wouldn't have a type, since there's no reason any message must have a type property, it's only convention.
@dinosaur Thanks for digging those up.
I wouldn't focus on things like absolute specifications and types at first. I'd rather try to find a minimal model to describe relationships between message types and their interpretation by clients. Once these relationships can be expressed, one could add some more metadata, like types, links, documentation, git-ssb repositories, and so on. This can develop organically - basically you build a client which uses some type of metadata and then encourage everyone to add this specific sort of metadata.
As for type schemata, I'd be interested in exploring the possibilities there. I spent a lot of time in the past year or so thinking about programming languages, which included exploring a bunch of type systems. But I believe finding a minimal vocabulary for self-description of message usage needs to be the first step. Any specific usage of this vocabulary should not determine the details of the design.
I think I'll spend the coming weekend exploring the meta-description rabbit hole and writing a summary, both of designs and of use-cases. Then, I'll start writing a patchbay plugin to actually get something working.
@Aljoscha
SSB indexes links, and allows querying them using the links method. it indexes links to blobs, feeds, and messages, in any message, but only in top-level properties of a message or second-level properties if the rel is "link", i think. so deeply nested links are not detected. but links like these would be:
{
  type: '...',
  thing: Ref,
  things: [Ref, {link: Ref}]
}
where Ref is an ssb ref (blob, feed, or msg id).
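As a rough illustration of the rule described above, the following sketch collects links from top-level properties, from arrays at the top level, and from objects that carry a `link` property. It is not scuttlebot's actual indexing code: `isRef` is a simplified stand-in for real ref validation, `extractLinks` is a hypothetical helper, and the exact behaviour of the real index may differ.

```javascript
// Simplified check for ssb-style refs: sigil, base64 body, and an algorithm suffix.
// An assumption for illustration; real validation is stricter.
function isRef(value) {
  return typeof value === 'string' &&
    /^[@%&][A-Za-z0-9+/]+=*\.(sha256|ed25519)$/.test(value);
}

// Collect { rel, dest } pairs from the first two levels of a message's content.
function extractLinks(content) {
  const links = [];
  const visit = (rel, value) => {
    if (isRef(value)) {
      links.push({ rel, dest: value });
    } else if (value && typeof value === 'object' && isRef(value.link)) {
      links.push({ rel, dest: value.link });
    }
  };
  for (const [rel, value] of Object.entries(content)) {
    if (Array.isArray(value)) value.forEach(item => visit(rel, item));
    else visit(rel, value);
  }
  return links;
}
```

A link hidden deeper, such as `{nested: {deep: {link: Ref}}}`, is skipped by this sketch, matching the remark that deeply nested links are not detected.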
contact messages have special meaning at the protocol level: they are used to determine what feeds you will request from your peers (and what feeds you will not offer, if a contact is blocked). pub messages are used for discovering addresses of pubs to connect to. i don't think any other messages currently are given special meaning in scuttlebot.
IIRC mentions are also special-cased.
links are additionally indexed by the ssb-links ("links2") plugin. this has special-case handling of links of rel mentions, vote, flag, about, image, and contact. i think this plugin is currently only used in patchbay for querying mentions.
Turns out writing is hard and takes time. For anyone willing to read it, here's a somewhat structured brain-dump:
Self-Describing Message Schemas for Scuttlebot
This post explores a way of building self-describing message schemas in the scuttleverse.
Background
Scuttlebot is a protocol for building up an eventually consistent peer-to-peer database. Unlike related projects (e.g. IPFS), each entry in the data store is tied to a user identity. Data is added to the store in the form of messages. Each message contains some metadata, a type string, and arbitrary data. The data may contain links to other messages. The database automatically creates indices based on these links, so the resulting message graph can be traversed in both directions.
For a developer, the primary interface to the database thus consists of the following operations:
- add a message to the database, signed by the current user
- retrieve messages of a certain type
- retrieve messages by a certain user
- look up messages referenced by a certain message
- look up messages referencing a certain message
- the expected mechanisms to filter data and to be notified of new data
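The operations above can be pictured with a toy in-memory model. This is not scuttlebot's API: the `ToyStore` class and its method names are invented for this sketch, signing is omitted, ids are simple counters rather than hashes, and filtering and live notifications are left out.

```javascript
class ToyStore {
  constructor() {
    this.messages = [];         // the append-only log
    this.backlinks = new Map(); // dest id -> ids of messages linking to it
  }

  // Add a message authored by `author`; `links` lists the ids it references.
  add(author, type, data = {}, links = []) {
    const key = '%' + this.messages.length; // stand-in for a content hash
    this.messages.push({ key, author, content: { type, ...data, links } });
    for (const dest of links) {
      if (!this.backlinks.has(dest)) this.backlinks.set(dest, []);
      this.backlinks.get(dest).push(key);
    }
    return key;
  }

  // Retrieve messages of a certain type or by a certain user.
  byType(type) { return this.messages.filter(m => m.content.type === type); }
  byAuthor(author) { return this.messages.filter(m => m.author === author); }

  // Ids referenced by the message `key` (forward direction).
  referencedBy(key) {
    const msg = this.messages.find(m => m.key === key);
    return msg ? msg.content.links : [];
  }

  // Ids of messages referencing `key` (backward direction, served from the index).
  referencing(key) { return this.backlinks.get(key) || []; }
}
```

The backlink map is what makes lookups in the reverse direction cheap: it is filled when a message is added, so nothing has to scan the whole log later.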
The resulting ecosystem of applications has some interesting, unique properties. All data in the scuttleverse inhabits the same graph and is tied together by persistent user identities. This way, any application can use data produced by any other application. For example, an application might combine status updates à la twitter and commits to git repositories into a single newsfeed.
As a consequence of this, any application is free to interpret messages as it sees fit. There is no authority on what messages to produce, or how to consume them or present them to the user. Interoperability happens because several people interpret messages in similar ways, not because a grand authority decides on how things are supposed to work.
Going Meta: Messages Describing Messages
- need for some structure (why?)
  - given a message without context, do something with it
  - type field
  - problems
    - conflicts
    - names as resource
    - difficult to find information on given message type
    - primitive
  - opportunities:
    - tooling, linkability
    - documentation
    - gather information about ecosystem
    - client-recommendations
    - type information
    - serialization
    - automatic API negotiation
    - plan interoperability
- requirements (what?)
  - powerful (should allow everything listed above, and anything one might come up with later)
  - subjective
  - flexible/evolutionary
  - future-proof
- how?
  - as simple as possible
  - minimal setup to allow self-description
    - type can be a hash
    - add meta-messages, and use their hashes as the types of normal messages
    - one special meta-message that is its own instance: %et1Dc8i...
  - no assumptions about what self-description should look like
    - meta-messages don't need any fields
    - but we can add a human-readable name just for good measure
      - optional and no prescribed functionality
      - certainly useful for building interfaces
    - it might turn out to be useful to add more information on meta-messages, who knows
  - doing something useful: post descriptive messages which link to a meta-message
    - what to post?
      - I don't care. Anything useful
      - free to experiment: everything is nothing but messages (which may be ignored by everyone)
    - who posts?
      - one approach: represent message consumers and producers (clients) as messages themselves
        - a usage description could link to a meta-message, a client-message, and contain some other information, e.g.:
          - a human-readable description of how the client deals with the message
          - or which library it uses for dealing with the message
          - or a structured representation of the fields and the types in the message
          - and so on
    - trust in these claims?
      - votes on these messages?
      - trust the author of the message?
      - different versions of clients might do different stuff
      - tie client identity to git-ssb repositories?
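One way to picture the last part of this outline is the sketch below: usage descriptions link a client-message to a meta-message, and a reader weighs them by whether it trusts their authors. All type names (`usage-description`), field names, and ids are hypothetical; votes or other trust signals could be plugged in the same way.

```javascript
// Hypothetical usage descriptions: each links a client-message to a meta-message.
const usageDescriptions = [
  { author: '@alice', content: { type: 'usage-description', metaMessage: '%vote-type', client: '%client-a',
      description: 'shows votes as a counter below each post' } },
  { author: '@mallory', content: { type: 'usage-description', metaMessage: '%vote-type', client: '%client-a',
      description: 'claims something misleading' } },
  { author: '@bob', content: { type: 'usage-description', metaMessage: '%vote-type', client: '%client-b',
      fields: { value: 'number', reason: 'string' } } }
];

// Keep only the descriptions of one meta-message written by authors the reader trusts.
function trustedDescriptions(metaMessage, descriptions, trustedAuthors) {
  return descriptions.filter(d =>
    d.content.type === 'usage-description' &&
    d.content.metaMessage === metaMessage &&
    trustedAuthors.has(d.author));
}
```

Because trust is decided by the reader, two users can look at the same set of descriptions and come away with different views of how a message type is used, which fits the "subjective" requirement.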
Implementation
- not protocol-level, can be organically adopted (or left to die)
- create meta-messages for the currently used message types
  - ideally, apps would start using the corresponding hashes for their produced messages
  - apps should consume both the old, stringly-typed messages and the new hashly-typed ones
  - what about protocol-relevant messages?
    - pub
    - contact
- implement a simple cli for querying meta-messages
- start adding some information, e.g. human-readable documentation
  - no need to enforce this though, it's enough if people turn to this once they have genuine need for some message metadata
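The transition described above, where apps accept both the old string types and the new hash types, might be handled with a small lookup like the following. The mapping and the placeholder ids are invented for illustration; real meta-message ids would be the hashes of the published meta-messages.

```javascript
// Hypothetical mapping from legacy string types to the ids of their meta-messages.
const metaIdByLegacyType = new Map([
  ['post', '%post-meta'],
  ['vote', '%vote-meta']
]);
// Inverse mapping, derived from the one above.
const legacyTypeByMetaId = new Map([...metaIdByLegacyType].map(([name, id]) => [id, name]));

// Normalize a message's type to the legacy name, whichever form the author used.
function canonicalType(msg) {
  const type = msg.content.type;
  return legacyTypeByMetaId.get(type) || type;
}

// A consumer written against the legacy name accepts both forms.
function isVote(msg) {
  return canonicalType(msg) === 'vote';
}
```

Unknown types pass through unchanged, so a client using this kind of helper keeps working with messages that have no meta-message at all.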