@Daan %udGeAIq/I36t/TnjtmZfXcb/ZnHBZkLLiP7qndMoWCc=.sha256

Offset log, flumedb, Re-indexing and Re-syncing

I got some questions about flume from @Danie recently in this thread. Since I remember that I was also very confused about flumedb, views, offset logs and all these things, I thought I'd write down a few lines to explain. As usual, a few lines turned out to be a small blog post. So here we go. :)

#flume #flumedb #ssb-learning

What is the "Offset log"?

In the "classic" javascript SSB stack, the "offset log" (stored at ~/.ssb/flume/log.offset or its equivalent on Windows/Mac) stores the "single source of truth" for your client. These are all the messages that your client has seen and chosen to keep, in the order they were seen. That means the offset log is append-only storage: whenever a new message is received, the ssb stack will look at it, decide whether it should be stored (depending on which feed it is from, whether that feed is blocked, and of course whether the message is already in the offset log) and, if so, add it to the end of the offset log.
Newer implementations of ssb may not use an offset log. It is an implementation detail that will not be visible to other peers, and so some clients/libraries use "boring tech" like sqlite3 instead.
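The append-only idea can be sketched in a few lines of javascript. This is an illustration only, not the actual flumelog-offset binary format: records here are newline-delimited JSON in a buffer, and the byte offset where a record starts serves as its stable identifier.

```javascript
// Illustrative sketch of an append-only offset log (NOT the real
// flumelog-offset format): records are only ever appended, and the byte
// offset where a record starts becomes its stable identifier.
class OffsetLog {
  constructor() {
    this.bytes = Buffer.alloc(0); // stands in for the log.offset file
  }

  // Append a message; returns the offset it was written at.
  append(msg) {
    const record = Buffer.from(JSON.stringify(msg) + "\n");
    const offset = this.bytes.length;
    this.bytes = Buffer.concat([this.bytes, record]);
    return offset;
  }

  // Reading means scanning forward from an offset -- there is no
  // in-place update or delete, which is exactly why views are needed.
  readFrom(offset) {
    return this.bytes
      .subarray(offset)
      .toString()
      .split("\n")
      .filter(Boolean)
      .map(JSON.parse);
  }
}

const log = new OffsetLog();
const o1 = log.append({ author: "@A", type: "post", text: "hello" });
const o2 = log.append({ author: "@B", type: "contact", contact: "@A" });
console.log(o1, o2); // later messages always land at higher offsets
```

The "offset" numbers that show up in ssb-server status later in this post are exactly this kind of byte position.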

What is flume or flumedb?

flumedb, or flume for short, makes it fast to answer questions like "who follows feed X?" Using only the offset log, answering such a question would --- for every single such question --- require reading the entire offset log (mine is about 1.2GB at the moment!) to see "okay, feed A followed X last March, but then unfollowed again in June, then followed again in September..."
So flumedb is the database that clients like patchwork or patchbay will query from whenever they want to show stuff to you, the user. It reads the offset log and builds views on top of it.
The views hold this information in aggregate: every time a message arrives, it is processed by all the views it concerns. For the example above, a view will process all messages with "type": "contact" and store the aggregate answer to "Which feeds follow X, given all the messages I've seen from the offset log so far?"
All these views are stored inside ~/.ssb/flume alongside the offset log.
The fact that the offset log is in the same folder as the flume views is, again, a leftover from the past, and hard to change in the current javascript stack. Conceptually, the flume views are a function of (that is, they are entirely determined by) the offset log. If we were to re-implement ssb today, we might put the offset log directly into ~/.ssb to separate the two (or, as noted above, just use a different database/storage mechanism). Or, again, just not use flume at all.
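A minimal sketch of the "who follows X" view described above, assuming a simplified message shape (real flumeview implementations are more involved, but the principle is the same: a fold over the log that keeps only the aggregate needed to answer the query quickly):

```javascript
// Sketch of a flume-style "view": processMessage is called once per
// message, in log order, and only the aggregate survives.
// (Hypothetical shape -- real flumeview code differs.)
function contactsView() {
  const followers = new Map(); // target feed -> Set of follower ids

  return {
    processMessage(msg) {
      // Only "contact" messages matter for this view; skip the rest.
      if (msg.content.type !== "contact") return;
      const { contact, following } = msg.content;
      if (!followers.has(contact)) followers.set(contact, new Set());
      if (following) followers.get(contact).add(msg.author);
      else followers.get(contact).delete(msg.author);
    },
    // Cheap lookup instead of a full scan of the 1.2GB log.
    followersOf(feed) {
      return [...(followers.get(feed) || [])];
    },
  };
}

const view = contactsView();
[
  { author: "@A", content: { type: "contact", contact: "@X", following: true } },
  { author: "@B", content: { type: "contact", contact: "@X", following: true } },
  { author: "@A", content: { type: "contact", contact: "@X", following: false } },
].forEach((m) => view.processMessage(m));

console.log(view.followersOf("@X")); // after the follow/unfollow churn: ["@B"]
```

Because the view is a pure function of the log, deleting it loses nothing: replaying the log from offset zero rebuilds it exactly, which is what "re-indexing" below means.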

@Daan %ucq4ua65iRfYo4ovKdN55VMDdLn2tHrmgP1zPjyntMM=.sha256

What happens when I delete the flume views? What if I delete the flume folder entirely?

First off: Don't do this lightly! Read on to understand the implications!

In many discussions of bugs, stuck ssb clients, and other such issues you will read advice like "Just reset the flume indexes" or "Clear out the flume folder" or, very rarely, "I guess you'll have to do a full re-sync." Re-indexing and re-syncing are two measures that can help unblock a bugged-out ssb client.
I personally use them quite liberally but there is a solid case to be made that they should be used sparingly and much later in the debugging process. To be clear: every time you have to do one of these, something has already gone wrong! That's why from a developer perspective it is often a good idea to look at what actually went wrong first.
Now, what do these things mean?

Deleting the flume views (aka index reset, aka re-index)

As described above, the flume views are essentially "everything inside ~/.ssb/flume except log.offset".
Before deleting the flume views, make sure you stop your ssb client. Upon restart, your ssb client will see that there is an offset log with data, but no views that aggregate that data and make it useful. It will thus start re-indexing the existing data in the offset log, plus any data that arrives while the indexing is happening. During this time patchwork will refuse to publish any new messages. This is a heuristic way of preventing you from "forking" your feed, but it is by no means completely safe. Other clients may not even try, though, so beware. As a general rule: don't publish messages, like posts, or "attend/unattend" gatherings while a full re-index is ongoing! See the checklist below for how to verify it is safe to post again.
Your client is also likely to become unresponsive, which is simply the effect of it being very busy right now.
If the client does show you data, it may be incomplete: posts you know should be there may not show up, threads can appear out of order, the number of "like" reactions on posts may be missing or wrong, things like that. This is because your client does not yet have access to all the information in the offset log. It is still processing it, after all.
You can track the progress of the indexing by calling ssb-server status on the commandline:

{
  "progress": {
    "indexes": {
      "start": 1220581057,
      "current": 1220581057,
      "target": 1220581057
    },
    "ebt": {
      "start": 0,
      "current": 532,
      "target": 1206
    }
  },
  "sync": {
    "since": 1220581057,
    "plugins": {
      "last": 1220581057,
      "keys": 1220581057,
      "clock": 1220581057,
      "time": 1220581057,
      "feed": 1220581057,
      "contacts2": 1220581057,
      "backlinks-MRiJ-CvDn": 1220581057,
      "private-MRiJ-CvDn": 1220581057,
      "query": 1220581057,
      "search": 1220581057,
      "tags": 1220581057,
      "patchwork-subscriptions": 1220581057,
      "patchwork-channels": 1220581057,
      "patchwork-contacts": 1220581057
    },
    "sync": true
  }
}

You can see the overall progress in the progress.indexes object, and the per-view indexing status below that in the sync object. You can safely ignore the ebt part for now.
The numbers indicate the "offset" that each view is currently at. This is literally the number of bytes of the offset log that this view has processed so far. Once all views (and thus the progress.indexes.current value) are equal to the value of progress.indexes.target, the indexing is "complete". This means that your client now has a complete view again of everything currently stored in the offset log. Of course, once new messages arrive via gossip, these need to be indexed again. The target value will increase, your CPU will have to do some work, and then the current value will catch up with target.
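As an illustration, turning that status object into an overall percentage is just bytes-processed over bytes-total. The numbers below are made up for the example (in the real output above, current already equals target, i.e. indexing is done); the field names match the ssb-server status output:

```javascript
// Sketch: given the object printed by `ssb-server status`, compute how
// far re-indexing has progressed as a fraction between 0 and 1.
function indexingProgress(status) {
  const { start, current, target } = status.progress.indexes;
  if (target === start) return 1; // nothing to do, or already done
  return (current - start) / (target - start);
}

// Hypothetical mid-reindex snapshot: roughly half the log processed.
const status = {
  progress: {
    indexes: { start: 0, current: 610290528, target: 1220581057 },
  },
};
console.log(`${(indexingProgress(status) * 100).toFixed(1)}% indexed`);
```

Remember that target is a moving goalpost: new gossip grows the offset log, so the fraction can dip again even after reaching 1.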
How long this indexing will take can vary significantly, depending on a number of factors:

  • Your hardware. I've seen indexing take a day or so on a small, underpowered arm smartphone, but it can also be a matter of 10 minutes on a fast machine with fast storage.
  • The number and kind of indexes you compute. This is mostly a problem of implementation, and Christian Bundy actually ran some experiments that showed that doing the indexing differently will speed things up considerably. But at the moment, building many indexes is significantly slower than rebuilding just a few. How many indexes your client uses depends on the client, its features, and whether you installed and use extra plugins in ~/.ssb/node_modules.
  • The process environment, including the NodeJS version and build of your ssb client. SSB makes heavy use of some cryptography routines that take a very long time to compute in javascript. To speed this up, we use sodium-native, a javascript module that "binds" to a "native library", i.e. a piece of C++ code compiled ahead of time for your operating system. For the non-programmer: we try, if we can, to use fast code to do the cryptography. But this depends on tight control of the software versions being executed in the end. When this fails, you will see messages like error loading sodium bindings: [...] falling back to javascript version. Using the javascript version is possible on any platform, so your client will never just refuse to work, but it will be considerably slower. (If you see this happening with one of the official app releases, don't hesitate to point it out, and drop the devs some notes about which platform you're using (arm, amd64...), which OS (windows, ubuntu, macos, archlinux...), and any other pointers that may help.)
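The "native with javascript fallback" pattern itself looks roughly like this. Module loading is injected as a parameter so the sketch is self-contained and testable; the real sodium/chloride loading code is more involved, but the try/catch shape is the same:

```javascript
// Illustrative version of the native-bindings-with-js-fallback pattern
// described above. requireFn stands in for require() so this sketch can
// simulate a platform where the native build is missing.
function loadCrypto(requireFn) {
  try {
    // Fast path: compiled C bindings, if they built for this platform.
    return { impl: requireFn("sodium-native"), native: true };
  } catch (err) {
    console.error("error loading sodium bindings:", err.message,
                  "falling back to javascript version");
    // Slow but portable path: pure-javascript implementation.
    return { impl: requireFn("sodium-javascript"), native: false };
  }
}

// Simulate a machine where the native module failed to build:
const result = loadCrypto((name) => {
  if (name === "sodium-native") throw new Error("bindings not found");
  return { name }; // stand-in for the pure-js module
});
console.log(result.native); // false -> the slow javascript path was taken
```

This is why the same client can be ten times slower on one machine than another: the code is identical, only the loaded implementation differs.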
@Daan %rQD4Iy7J66llnmcPbm/lJANlLyw6CtEHXyW/juNaG7Y=.sha256

Deleting the flume folder (aka re-sync)

This is riskier than deleting the indexes only and very rarely necessary!

I wrote above that resetting the indexes means deleting everything except the log.offset file. I also wrote that having to do so means something already went wrong. Deleting the entire ~/.ssb/flume folder will naturally delete the views, but also delete the log.offset file. Well, sometimes things don't just go wrong, they go very wrong, if not to say horribly wrong. And in these cases your log.offset may actually be "broken" in that your client won't be able to "make sense" of it anymore.

Some examples:

  • You're recovering from a backup. You kept a copy of your ~/.ssb/secret file holding your identity, and kept or obtained a valid ~/.ssb/conn.json. Probably something did go horribly wrong if you have to do this, but it may be entirely ssb-unrelated.
  • A low-level bug in the code writing the offset log may corrupt the file itself. If I recall correctly, this happened once. In that case, rebuilding the indexes won't help since the "single source of truth" was already bad.
  • Messages may have gotten written to the offset log when they really shouldn't have. I stumbled over this one not so long ago. There was a bug in the message verification code which meant that some messages that should have been rejected as invalid (they had a format that doesn't correspond to the ssb protocol) were in fact written to the offset log. After the bug was identified and fixed, I was left with an offset log that my client was refusing to work with, since some of the messages contained in it were considered invalid. The structure of the logfile was valid, but the data stored in it wasn't. Since the offset log doesn't provide deletion mechanisms (ahem... not really anyhow) this is a problem.
  • You blocked a lot of accounts with many posts, and you want to make sure that not only will you not be hosting their content, but you also want to reclaim those valuable bytes they wasted on your drive.

Since deleting the flume folder means deleting the indexes, they will have to be recomputed as above. But there is a crucial difference of course: when you start your client, it will find... an empty/non-existent offset log. So the progress.indexes.target value will be 0, which makes the initial indexing trivially fast. But your client doesn't actually have anything to display now!
This is where the distributed store & forward nature of ssb becomes useful: your client will now proceed to download data from all the peers it can reach. Typically these will first and foremost be pubs, but really, any peer that replicates you and that you can reach will do just fine. Your client will start by pulling your own feed, which is of course also missing from the offset log. Why your own feed? Because your client doesn't currently know whose feeds you want to replicate, but it is always interested in your own feed. So it reaches out to the first peer it can find and asks "Hey, any chance you have updates for this @feed_id that happens to be my own?" And hopefully the other side will happily send over your entire history. Starting from that, the indexes will be computed. "Oh, so I followed @X last March. Next time I speak to a peer, I shall ask for updates about @X." And so the next gossip exchange will involve updates from @X, which might in turn trigger more messages to be downloaded, and so forth.
How does your client know which peers to reach out to? If you have a second ssb device on the same local network, this is easy. Ssb clients "shout out" to the local network and so they find each other automatically. But if you don't have this, the first peer to contact will be a remote one, either a pub or a peer in an ssb room. These are found through #ssb-conn which stores the list of possible connections in ~/.ssb/conn.json. This is why you can re-sync your entire ssb folder only with ~/.ssb/secret and ~/.ssb/conn.json even though that is a bit risky currently and takes a while. See below for a checklist on how to do it in the safest way I can think of.
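Schematically, the requests your client sends out during that exchange look something like this. The request shape here is hypothetical, for illustration only (the real exchange happens via createHistoryStream/EBT, not this exact object), but the logic is the source text's: for every wanted feed, ask for everything after the latest sequence you already hold.

```javascript
// Sketch of the gossip request described above: for every feed we want,
// ask a peer for messages after the latest sequence number we hold.
// (Hypothetical request shape -- not the actual wire protocol.)
function buildRequests(wantedFeeds, localLatest) {
  return wantedFeeds.map((id) => ({
    id,
    // Right after a full re-sync we hold nothing, so we ask from
    // sequence 1, i.e. "please send everything you have for this feed".
    seq: (localLatest.get(id) || 0) + 1,
  }));
}

const localLatest = new Map(); // empty right after deleting the flume folder
const requests = buildRequests(["@my_feed_id", "@X"], localLatest);
console.log(requests); // both requests start at sequence 1
```

As messages arrive and the contacts view rebuilds, the wanted-feeds list grows, which is why the re-sync snowballs outward from your own feed.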

Why is a full re-sync risky?

As with an index reset, you shut down your ssb client before deleting the flume folder. After the delete, your client "wakes up" to an empty offset log. So it has no idea which (or how many) messages it needs to re-sync to be "done". Since your indexes may be "up to date" with the offset log at many points during the syncing process (until another round of gossip arrives), patchwork will not prevent you from posting anything. In fact, it will encourage you to post messages, at least initially, because it thinks you have never posted anything. So it asks you to set your name and profile description, thinking you are a first-time user. This is dangerous because you probably do have previous posts; your client just doesn't know about them yet.

Let's say, for example, that your client has not processed any messages from your own feed yet. In that case patchwork will actually pop up the "set up your profile" dialog. If you now set your profile description as patchwork suggests, or post something, or even just like someone else's post, your client will think that this is the first message of your feed.
Later it connects to a remote peer and that peer sends over hundreds of messages you had previously posted. Your client now says "But this makes no sense! You're telling me that message number 1 is a cat picture, but I have it on good authority (myself!) that it is actually a dog picture!"
You have now "forked" your feed. In fact, you have forked your feed at the earliest possible location: the first message.
No other peer that has seen your messages before will accept the new version of your feed. They may actually mark your feed as "provably forked" and stop talking to you. It's a whole mess. I won't go into all the gory details of how feeds fork and what that means, but trust me: you don't want that.

There is currently no mechanism to prevent feed forks during a full re-sync, which is why this is risky business!
And just to be clear: having some of your own messages doesn't help with that. You can post your dog picture as the 100th message (because you already have the first 99 messages locally but not the 100th) and fork your feed there.
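The core of the conflict can be sketched like this. It is a deliberate simplification: real ssb verification also checks the previous-hash chain and the signature of every message, but the sequence-number clash alone already captures why the cat/dog disagreement above is fatal.

```javascript
// Simplified fork check: a feed is provably forked when two *different*
// messages both claim the same sequence number on that feed.
// (Real verification also checks the `previous` hash chain and
// signatures; this sketch only captures the sequence clash.)
function detectFork(knownMessages, incoming) {
  const local = knownMessages.find((m) => m.sequence === incoming.sequence);
  if (!local) return false; // nothing at that sequence yet: no conflict
  // Same sequence but different content => two competing histories.
  return JSON.stringify(local.content) !== JSON.stringify(incoming.content);
}

// Your client has it "on good authority" that message 1 is a dog picture...
const known = [{ sequence: 1, content: { type: "post", text: "dog pic" } }];
// ...but a peer sends a cat picture claiming the very same sequence 1:
const fromPeer = { sequence: 1, content: { type: "post", text: "cat pic" } };

console.log(detectFork(known, fromPeer)); // true: the feed is forked
```

Because every message is signed by your key, both versions are "authentic", and peers have no way to pick a winner: that is why they simply stop replicating a forked feed.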

@Daan %hW8m7+X5AmRkTDbMPgrnz8GE5BSdvOf6CA498iZENXM=.sha256

How do I safely do a full re-sync?

As I wrote before, there are (very few, but not zero) situations in which you want to do a full re-sync. So, how do you do it safely? The goal here is to make sure you don't publish any messages before you are done syncing your own feed.
To be clear: "publishing a message" does not only mean writing a post, but covers all other activity on ssb too: liking posts, joining & unjoining gatherings, following & unfollowing feeds and hashtags, writing private messages, updating your profile, playing chess... Essentially, all you can do safely until you're done is reading & lurking.

So, how do we do it in practical terms?

  1. Make sure your network has a copy of your last post. Keep in mind that not every message will show up as an item of a timeline. For example, a "like" is a message but will not get its own timeline entry. The same is true for chess moves and other specialized message types. The easiest & safest way to do this is to post something:

    Hey all, I'mma do a full resync because my flux-compensator is borked and I need to reset the gamma converter. Could someone please confirm they're reading this message?
    Kthxbi

    Once you see a "like" or reply on that post, you can be sure it has been backed up in your network. 👍
    Also, some friendly ssb dev may reach out and offer help to fix the situation without a re-sync, or ask whether you can help understand the issue before you delete the only "evidence" of what went wrong.
    If you cannot post anymore (because your client is severely broken or such) then you need to do this "by hand" or rather "out-of-band". This means you need to contact (by email, phone, whatever) some peers and ask them to run the command in point 2. Contact multiple friends, and verify that they all agree on the last message. Also check with your own brain whether that makes sense.

  2. Check your latest message. Open a terminal and execute this: ssb-server getLatest '@my_feed_id' replacing @my_feed_id with, well... your feed id.
    Sorry, no good way to do this without terminal (patches welcome) but hey, this isn't for the faint of heart anyhow.
    This will allow you to do the things in points 3. and 4.
  3. Note down the sequence number of your last post. This number is now your official finish line!
  4. Verify that that last post is actually the one you just made! If it is not then you probably absentmindedly clicked "like" on some post in the meantime. Start back at the top of the checklist.
  5. Close your ssb client. Remember: no posting anymore until you're done!
  6. Make a backup of ~/.ssb/secret and ~/.ssb/conn.json. As I said: risky business. If you lose your secret, nothing and nobody can help you recover it. If you are restoring from backup then that means you already have one. And aren't you glad you do? Yes you are.
  7. Ready? Here we go: Delete the flume folder. That is ~/.ssb/flume or its equivalent on Windows or Mac.
  8. Launch your ssb client. If this is patchwork, it will come up empty and tell you it's a good idea to update your profile. Don't believe its lies! You've been here before, you've done this before. Be patient!
  9. Wait. If you're in patchwork, you can click the "Connect" button for any remote peers as frantically as you want:
    screenshot of patchwork's sidebar allowing connections to remote peers
    But that's about it! Reading and connecting, but no following.
  10. Check the indexing progress. Notable markers are:
    • Is your CPU being used?
    • Does patchwork keep showing you "Show X new posts" and then increasing the "X"?
    • Do you see disk I/O from patchwork?
    • This is how I do it: watch -n 1 -d 'ssb-server status' This will show the progress as in the example above and highlight which indexes have changed offset in the last second. Change it to -n 10 to get less strobing.
  11. You're done when ssb-server getLatest '@my_feed_id' gives you the same message back as before the reset.
    Make sure that the sequence number matches. If it does, you're home and dry. At this point your client may still be indexing messages and hogging the CPU, but your own feed is now finished and you can safely restart posting.
  12. Make a reply to your post from point 1. Say thank you and celebrate your achievement. Document any pitfalls you found in #ssb-learning.
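The check in step 11 boils down to a comparison like the following. The field names (sequence and key) are assumptions about the shape of the getLatest result, used here purely for illustration:

```javascript
// Sketch of the step-11 "am I done?" check: the re-sync of your own
// feed is finished once the message returned by getLatest matches the
// one you noted down before deleting the flume folder.
// (Field names `sequence` and `key` are illustrative assumptions.)
function resyncDone(noted, latest) {
  return latest != null &&
    latest.sequence === noted.sequence &&
    latest.key === noted.key;
}

const noted = { sequence: 1234, key: "%abc=.sha256" }; // written down in step 3
console.log(resyncDone(noted, null));                           // false: nothing synced yet
console.log(resyncDone(noted, { sequence: 900, key: "%old" })); // false: still catching up
console.log(resyncDone(noted, { sequence: 1234, key: "%abc=.sha256" })); // true: safe to post
```

Checking both the sequence number and the message id guards against the (unlikely) case of a peer handing you a forked copy of your own feed that happens to have the same length.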

I'm not gonna guarantee you that you won't still break your feed somehow, but this is how I do this and so far my feed is fine. I hope this cleared some things up. :)

@enkiv2 %F+eQjuwB6MjMl6ebdKpdK3na/8DQ2w7ZFD05iZgqp3o=.sha256

This is an amazingly accessible & exhaustive explanation! Thank you so much for writing it!

@Daan %P+qxaWAvNMMPFms0bd9z5kH65vKcLEWQFVVafZZGXbE=.sha256

Thanks @enkiv2! Let me know if it leaves questions open!
I'm thinking of aggregating some of these (and other) materials into patchwork at some point. Maybe make a "Help/Learning" tab that just displays some static markdown. That's why I've been tagging a bunch of these posts with faq. 🙂

@enkiv2 %tZ+URxK2XOo/kdjx1Zl4FiqJ7Z5KIxTe1xTnGGOYm64=.sha256

I'm interested in SSB internals in general, & if you don't mind I may bug you later with questions about them, but they're mostly theoretical & not applicable to fixing client problems so I won't clog the thread with them :)

(I'm ex-Xanadu, where working with fast indexes on append-only logs was very important but we didn't have a lot of overlap with the cutting edge of algorithms research past about 1980, so for instance I'm very interested in how flume indexes might work & whether or not there are any neat tricks involved in quickly generating them from the offset log.)

@Danie %he/m5kxyewLrj0tID0Y7ISCVRcjQfh5iSJOtUQ8Y8SM=.sha256

Thanks very much @Daan that was very well explained and I've saved an offline copy for myself. I'm going to try resetting indexes now, but I do always do a full .ssb backup. Will see if Beta-3 then loads OK.

@Pedro %6WH/C0tn++mwMrju64ZLRP9yUPPrB5D3pMjSJ8yb1DY=.sha256

Thanks a lot for this thread! really informative ;)
