Offset log, flumedb, Re-indexing and Re-syncing
I got some questions about flume from @Danie recently in this thread. Since I remember that I was also very confused about flumedb, views, offset logs and all these things, I thought I'd write down a few lines to explain. As usual, a few lines turned out to be a small blog post. So here we go. :)
#flume #flumedb #ssb-learning
What is the "Offset log"?
The offset log (~/.ssb/flume/log.offset, or its equivalent on Windows/Mac) stores the "single source of truth" for your client. It holds all messages that your client has seen and chosen to keep, in the order the client saw them. The offset log is append-only storage: whenever a new message is received, the ssb stack looks at it and decides whether it needs to be stored (depending on which feed it is from, whether that feed is blocked, and of course whether the message is already in the offset log). If it should be stored, it gets added to the end of the offset log.
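To make the "look at it, decide, append" flow concrete, here is a minimal in-memory sketch. It is not the real on-disk format or the actual ssb code: the class `OffsetLog`, its fields `wanted_feeds` and `blocked_feeds`, and the simplified message shape (`key`, `author`, `content`) are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class OffsetLog:
    """Toy stand-in for an append-only log (hypothetical, in memory only)."""

    wanted_feeds: set                                   # feeds we replicate (assumption)
    blocked_feeds: set = field(default_factory=set)     # feeds we refuse to store
    entries: list = field(default_factory=list)         # messages in arrival order
    seen_keys: set = field(default_factory=set)         # keys already in the log

    def should_store(self, msg: dict) -> bool:
        """Decide whether an incoming message belongs in the log."""
        author = msg["author"]
        if author in self.blocked_feeds:
            return False  # feed is blocked
        if author not in self.wanted_feeds:
            return False  # not a feed we keep
        if msg["key"] in self.seen_keys:
            return False  # already stored
        return True

    def receive(self, msg: dict) -> bool:
        """Append the message to the end of the log if it should be kept."""
        if not self.should_store(msg):
            return False
        self.entries.append(msg)  # only ever append, never rewrite earlier entries
        self.seen_keys.add(msg["key"])
        return True


log = OffsetLog(wanted_feeds={"@alice", "@bob"}, blocked_feeds={"@bob"})
incoming = [
    {"key": "%1", "author": "@alice", "content": {"type": "post", "text": "hi"}},
    {"key": "%1", "author": "@alice", "content": {"type": "post", "text": "hi"}},  # duplicate
    {"key": "%2", "author": "@bob", "content": {"type": "post", "text": "yo"}},    # blocked
    {"key": "%3", "author": "@carol", "content": {"type": "post", "text": "hey"}}, # not kept
    {"key": "%4", "author": "@alice", "content": {"type": "post", "text": "bye"}},
]
results = [log.receive(m) for m in incoming]
```

Only the first and last message end up in `log.entries`, in the order they arrived.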
Newer implementations of ssb may not use an offset log. It is an implementation detail that will not be visible to other peers, and so some clients/libraries use "boring tech" like sqlite3 instead.
What is flume or flumedb?
flumedb, or flume for short, makes it fast to answer questions like "who follows feed X?" Using only the offset log, answering such a question would require reading the entire offset log every single time (mine is about 1.2GB at the moment!) to see "okay, feed A followed X last March, but then unfollowed again in June, then followed again in September..."
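Here is roughly what that slow path looks like, as a sketch. The function `followers_by_full_scan` and the simplified message shape are assumptions for illustration (real messages carry more fields); the point is only that every query walks the whole log from the start.

```python
def followers_by_full_scan(log_entries, target):
    """Replay every message in order to find out who currently follows `target`."""
    followers = set()
    for msg in log_entries:  # touches every message, relevant or not
        content = msg["content"]
        # Skip content that is not a dict (for example an encrypted body)
        # and anything that is not a contact message about `target`.
        if not isinstance(content, dict):
            continue
        if content.get("type") != "contact" or content.get("contact") != target:
            continue
        if content.get("following"):
            followers.add(msg["author"])      # follow
        else:
            followers.discard(msg["author"])  # unfollow
    return followers


example_log = [
    {"author": "@A", "content": {"type": "contact", "contact": "@X", "following": True}},
    {"author": "@B", "content": {"type": "post", "text": "hello"}},
    {"author": "@A", "content": {"type": "contact", "contact": "@X", "following": False}},
    {"author": "@C", "content": {"type": "contact", "contact": "@X", "following": True}},
    {"author": "@A", "content": {"type": "contact", "contact": "@X", "following": True}},
]
current_followers = followers_by_full_scan(example_log, "@X")
```

With five messages this is instant; with a log of more than a gigabyte, doing it for every question is not.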
So flumedb is the database that clients like patchwork or patchbay query whenever they want to show stuff to you, the user. It reads the offset log and builds views on top of it.
The views hold this information in aggregate: every time a message arrives, it is processed by all the views that it concerns. For the example above, a view will process all messages with "type": "contact" and store the aggregate information of "Which feeds follow X, given all the messages I've seen from the offset log so far?"
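A view like that could be sketched as follows. This is not flumedb's actual API: `FollowersView`, `process`, `who_follows` and the `processed` counter are hypothetical names, and the message shape is simplified. What it illustrates is that each message is handled once on arrival, and queries afterwards are a lookup rather than a scan.

```python
from collections import defaultdict


class FollowersView:
    """Toy view: keeps 'who follows whom' up to date as messages arrive."""

    def __init__(self):
        self.followers = defaultdict(set)  # feed id -> set of feeds following it
        self.processed = 0                 # how many log messages this view has seen

    def process(self, msg):
        """Fold one message from the log into the aggregate."""
        self.processed += 1
        content = msg["content"]
        if not isinstance(content, dict) or content.get("type") != "contact":
            return  # this view does not care about other message types
        target = content.get("contact")
        if content.get("following"):
            self.followers[target].add(msg["author"])
        else:
            self.followers[target].discard(msg["author"])

    def who_follows(self, feed):
        """Answer from the aggregate without touching the log."""
        return set(self.followers.get(feed, set()))


view = FollowersView()
messages = [
    {"author": "@A", "content": {"type": "contact", "contact": "@X", "following": True}},
    {"author": "@B", "content": {"type": "post", "text": "hello"}},
    {"author": "@A", "content": {"type": "contact", "contact": "@X", "following": False}},
    {"author": "@C", "content": {"type": "contact", "contact": "@X", "following": True}},
]
snapshots = []
for m in messages:
    view.process(m)
    snapshots.append(view.who_follows("@X"))  # cheap lookup after each message
```

Keeping a counter like `processed` is one way a view could remember how far into the log it has read, so it knows where to continue after a restart.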
All these views are stored inside ~/.ssb/flume alongside the offset log.
Another implementation could use different locations inside ~/.ssb to separate the two (or, as noted above, just use a different database/storage mechanism), or, again, not use flume at all.