@Christian Bundy %OoBqCtaYm6ayBQqCVlHi66vsWfvaK5+t98aqsXlRyZU=.sha256

Deleting feeds and comparing flumelogs

In the past few months I've been increasingly interested in feed deletion, which gives users the ability to delete content from folks that they've blocked. Being unable to delete feeds from your computer is disempowering, and means that bad people can post bad things that have to stay on your computer forever.

That's bad.

I've been working on feed deletion with #verse and making some great progress adding deletion to the following JavaScript modules:

  • aligned-block-file
  • flumelog-offset
  • flumedb
  • ssb-db
  • ssb-server

The gist is pretty simple: flumelog-offset stores all messages in one big file, and deletion works by replacing the target message with something else. My first approach was to replace it with a bunch of 0s, but ssb-db always expects a message and tends to blow up when you hand it empty buffers.

Currently my approach is to add a fake message, which is fine, but I'm feeling unsatisfied by the amount of hackiness required to delete items from the log.

Over time I've been wondering: why flumelog-offset? It's impressively fast, but being double the speed of leveldb doesn't matter unless that's the bottleneck. The benchmarks show flumelog-offset writing at 13 MB/s, but in reality I've never seen it write faster than ~800 KiB/s, because the bottleneck is our peering bandwidth, not the file write speed.

If that's the case, does it matter that flumelog-level only writes at 8 MB/s? I'm not completely sold on leveldb, but it seems to have a lot of positives:

  1. Already-implemented read, write, update, and deletion.
  2. Streaming as a first-class citizen.
  3. Loads of compatibility, from a Raspberry Pi to a browser.
  4. Heaps of documentation, support, and smart folks working on it.
  5. We're already using it for many/most of our flumeviews.
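
For a concrete sense of points 1 and 2 above, the plain level API looks roughly like this (made-up keys and values, not code from any flume module):

```js
// Sketch of points 1 and 2 with the `level` package (made-up data).
const level = require('level')
const db = level('./messages', { valueEncoding: 'json' })

db.put('%some-msg-id', { type: 'post', text: 'hello' }, function (err) {
  if (err) throw err
  db.del('%some-msg-id', function (err) { // deletion is a single call
    if (err) throw err
  })
})

db.createReadStream() // streaming is built in
  .on('data', function (item) { console.log(item.key, item.value) })
```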

Don't get me wrong, I think flumelog-offset is incredibly impressive and unexpectedly fast as an append-only database, but I'm wondering which trade-offs we're making when we optimize for appends and ignore mutability.

What do you all think? I'd love some feedback on your experiences with flumelog-offset, leveldb, and thoughts on what we should look for in a log.

cc: @dominic @arj @mix @regular #flumedb ( #flume ) #javascript #scuttlebot

@Christian Bundy %M6dcw9k4/lriSVvzwAU7j/DQpq5MxqXcs7eBv/7wGJk=.sha256

Also, I found this PR where I think leveldb was replaced with flumedb + flumelog-offset, but I can't find any context for the change. If there are any threads I should read or history I should know, I'd love a pointer!

cc: @mikey 💁

User has not chosen to be hosted publicly
@Christian Bundy %VdBCFBaCjS5Ec9QQ5acUH327MZGM5Yx/83XAcFD+qVE=.sha256

@moid

Thanks! I've deleted my log many times today and am currently in the process of downloading messages, so I'll check out that link once it's available. Right now it's just throwing "not found".

The compaction method is how the current flumedb deletion prototype works, streaming from one log to the other while filtering out the deleted content. The updated code makes deletion a first-class citizen in flumelog-offset and views that support it (only my experimental one), but it relies on replacing the deleted content with a fake message, which causes problems down the line.
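
For reference, the shape of that compaction stream is roughly the following -- a sketch rather than the prototype's actual code, where the stream options and the isDeleted() check are assumptions about how deletion gets marked:

```js
// Sketch of compaction: stream the old offset log into a fresh one,
// dropping deleted entries, then swap the files. Not the prototype's
// actual code; the stream options and isDeleted() are assumptions.
const pull = require('pull-stream')
const OffsetLog = require('flumelog-offset')
const codec = require('flumecodec')

const oldLog = OffsetLog('./log.offset', { codec: codec.json })
const newLog = OffsetLog('./log.compacted', { codec: codec.json })

const isDeleted = (value) => value == null // however deletion is marked

pull(
  oldLog.stream({ seqs: false, values: true }),
  pull.filter((value) => !isDeleted(value)),
  pull.asyncMap((value, cb) => newLog.append(value, cb)),
  pull.onEnd((err) => {
    if (err) throw err
    // once the copy completes, the compacted log replaces the old one
  })
)
```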

Really looking forward to checking out your link, I appreciate the new information.

@ssb.celehner.com %1JflJROvHdqGRK+TUG3Xu9/cyF4lLIRjBy4YjBPFsUk=.sha256

I like flumelog-offset and suggest deleting messages by, e.g., replacing their value or content in the log with null. -cel

User has not chosen to be hosted publicly
@Christian Bundy %5ax6uaE9fiyGRa83chwSca3RQfTq/ldFFA9MNK/8SB4=.sha256

@pub_cel

The problem with 0, null, {}, etc., is that all of our code expects a well-formed message, and tends to throw errors when you don't have one. We can use a fake or modified message, which is what I'm doing now, but I've seen an abundance of peering errors that may suggest that the gossip code is viewing it anyway.

@moid

The way you've done it in filter_log() works, and that's how I'm currently doing feed deletion. The problem is that it's incredibly slow, especially on mobile devices, and so the next step is to do deletion without rebuilding the log or indexes when possible. Anything built on leveldb can do quick and easy deletes with db.del(key), so as long as our log supports deletion then it's simple to say "if deletion is supported, delete, otherwise rebuild".
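
As a sketch of that branch (a hypothetical helper, not actual flumedb code):

```js
// "If deletion is supported, delete, otherwise rebuild" -- hypothetical
// helper, not actual flumedb code.
function deleteFromView (view, key, cb) {
  if (typeof view.del === 'function') {
    view.del(key, cb)  // cheap path, e.g. leveldb's db.del(key)
  } else {
    view.rebuild(cb)   // expensive path: re-stream the whole log into the view
  }
}
```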

@cel %LWSf9tDKr4+UuqYYcjd776rA2t8kYDUhIMoglLLVdE0=.sha256

@christianbundy what about setting the content to an object of a specific type, like {"type":"deleted"}, and adding a property to the msg object like "deleted":true? Then it should not choke up client code, but would still be distinguishable as a deleted message. What do you mean by peering errors?

@Christian Bundy %Lii3FtXSpHi9RJuG0UMCTOAToFlZOn/GAROqMVGouXE=.sha256

@cel

Yep, although the replacement must be less than or equal in length to the message you're deleting, so if the message is "content": {} then we can't add a type unless we remove characters from the signature or something. Also, we don't want to depend on that type, because others could add similar messages, so we'd need another lookup mechanism (e.g. ensure msg.value.author === "@00000000000000000000000000000000000000000000.ed25519") anyway.
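
For what it's worth, the extra lookup I have in mind is just an author check, roughly like this sketch (illustrative only):

```js
// Sketch of the lookup mechanism mentioned above: recognise a deleted
// placeholder by its impossible all-zero author key rather than by type.
const DELETED_AUTHOR = '@00000000000000000000000000000000000000000000.ed25519'

function isDeletedPlaceholder (msg) {
  return msg && msg.value && msg.value.author === DELETED_AUTHOR
}
```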

The errors are all "Error replicating with @${X}: Error: parent stream is closing" or "Error: stream ended with:${Y} but wanted:${Z}". It's possible that they're just your run-of-the-mill network errors from dropped connections and such, but they seemed to get worse after I started deleting.

I'll post an update once I'm at a good stopping place with the new code and be sure to cc you for feedback. Thank you!

@cel %E1d7Dxu+fmyXB7zjOMfUbdLU8GuGLRQXdrCa0+oIajk=.sha256

@christianbundy good point that the message length cannot be increased. So adding "deleted": true at the flumelog layer would not work. "content": {} would not be valid in a feed according to ssb-feed, because it needs a string type of length ≥ 3, ≤ 52 - but if it was box-encrypted it could in fact be shorter, e.g. "0.box", which would not leave room to replace it with an object with a type property. How about using that, or an even shorter string like "content": "X", to mark a deleted message? Client code should treat it as a private message, because the content is a string, and fail to decrypt it just as if it was encrypted to some other recipients. Or, I think it would be okay to borrow bytes from the signature if need be. Zeroing the author field (or any of the other metadata fields) would, I think, be fine; we will see if it breaks anything.
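
For illustration, a deleted entry under that scheme might look something like this (all values made up; the only point is that content becomes a short string while the metadata keeps its shape):

```js
// Illustration only (made-up values): the metadata survives, but content
// is replaced with a short string, so clients treat it like a private
// message they can't decrypt.
const deletedValue = {
  previous: '%previousmessageid.sha256',
  author: '@someauthor.ed25519',
  sequence: 1234,
  timestamp: 1540000000000,
  hash: 'sha256',
  content: 'X',
  signature: 'originalsignature.sig.ed25519'
}
```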

Another situation would be deleting messages from ssb-ooo. ssb-ooo also uses a flumelog to store messages, but it does not validate them, except by hash. So I think it would be possible for ssb-ooo to store a message value that is just {} - although I don't think this would happen now unless there was a broken or malicious implementation. I think it could be fixed in ssb-ooo by just making it ensure that, e.g., a message has a content property which is not equal to the value used to mark deletion, but is at least as long as it in bytes.

That's odd about those stream errors. It reminds me: when replicating feeds, ssb-server should probably abort sending a feed if it encounters a deleted message in the feed, since the peer will not be able to validate it or any of the subsequent messages and will therefore discard them. This could be done for legacy gossip by hooking the createHistoryStream function, the way it is done for blocks in ssb-friends (formerly in scuttlebot/plugins/block.js) ("break off this feed if they suddenly block the recipient"). For ssb-ebt, I think hooking sbot.getAtSequence would be the way to do it.

I don't think getting invalid messages would cause a peer to break the stream, though, so those errors might be from something else. Maybe it's local errors in replication causing the stream to be aborted locally. Do local RPC methods still work when they would output a deleted message, or do they end with an error? E.g. sbot.getAtSequence([feedId, seq]) for a feed id + seqnum of a deleted message, or createHistoryStream({feed: feedId, seq: seq-1, limit: 3}) (to output a deleted message with its previous and next messages).
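
Something in the spirit of that old block hook might look roughly like this -- a sketch only, untested, where isDeleted() is a hypothetical predicate for however deletion ends up being marked:

```js
// Rough sketch of hooking createHistoryStream, modeled on the old block
// plugin (untested; `isDeleted` is a hypothetical predicate, and `sbot`
// is the server instance available inside a plugin).
const pull = require('pull-stream')

sbot.createHistoryStream.hook(function (fn, args) {
  return pull(
    fn.apply(this, args),
    // break off the feed at the first deleted message, since the peer
    // couldn't validate it or anything after it anyway
    pull.take(function (msg) { return !isDeleted(msg) })
  )
})
```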

@Christian Bundy %gt3XoVhlYgEA/2QShHoOy+pTwuSM15viUl8XeN4bSD8=.sha256

Replacement

As discussed above, replacing the item with a fake message should work, although it may be creating errors. The biggest problem right now is that not all flumeviews support rebuild(), so we don't have a way to force rebuilds without restarting ssb-server. I've pasted some of the view errors I'm seeing, so maybe we should be outputting warnings when a flumeview only has partial API support?
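
A check along these lines (hypothetical, not existing flumedb code) could at least make the partial support visible when a view is wired up:

```js
// Hypothetical check, not existing flumedb code: warn when a view is
// registered without the optional methods we end up relying on.
function warnPartialSupport (name, view) {
  const wanted = ['del', 'rebuild']
  wanted.forEach(function (method) {
    if (typeof view[method] !== 'function') {
      console.warn('flumeview "' + name + '" does not implement ' + method + '()')
    }
  })
}
```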

Merge delete

Assuming we have JSON items A, B, and C, one way to delete B would be to add a bunch of whitespace at the end of A so that it overwrites B. I was having problems where my writes weren't being respected, maybe due to the multiple layers of caching, but this seems possible.

Unfortunately it's hacky as hell and only works if you can pad the previous item with whitespace, plus you have to do the opposite with the first item since there isn't a previous item to merge into. Another downside is that if you request B then it seems to just output a corrupted string -- again, I imagine this is a cache thing but I wasn't able to circumvent it while working on the problem yesterday.

Skip delete

When we get/stream an item we're using a getMeta() function that works with pull-cursor to iterate through the file. This function returns the value of the requested item plus the sequence numbers of the items before and after the current item. One hacky and less-performant idea was to read the before/after values, and skip them if they're null.

For example, assuming A - F are arbitrary messages, we delete C and then request D. The function should return the current value of D, notice that C is an empty buffer, and return B and E as the before/after sequences. This is fine, but doesn't work at the beginning or end of the file. It also means that we need to read at least 3x as much data on each read, plus buffer comparisons, which I'd imagine is prohibitively slow (unless we add another layer of caching).
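
The core check itself could be tiny -- something like this sketch, assuming a deleted record reads back as a zero-filled buffer of the original length (the real getMeta()/pull-cursor wiring is more involved):

```js
// Sketch: treat an all-zero buffer as a deleted record so the cursor can
// step past it (assumption about how deletion is marked on disk).
function isDeletedRecord (buf) {
  return Buffer.isBuffer(buf) && buf.every(function (byte) { return byte === 0 })
}
```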

Misc

Again, I'm super impressed by the speed of flumelog-offset, and it's way cool that it can write at 13.8 MB/s, but I've been watching the file size and I've never seen it increase faster than 800 KiB/s because the flumelog isn't the bottleneck -- the network and indexes are.

The offset db is incredibly fast and efficient, but it feels like we've traded away mutability for some speed gains that we don't actually use. I'm sure it's possible to add mutability to flumedb, but when working on it I feel like I'm really cutting across the grain. This might just be another incarnation of boring Scuttlebutt, but since we're nowhere close to the theoretical speed of flumelog-offset, I'm wondering whether it'd make sense to use something more... boring.

cc: @cel

@Christian Bundy %fP/HvVdPPRJWUSp6Q6CB2pavpPxOx45PjE5YDfzsL0w=.sha256

Just one more thought: another problem we're going to have is that you can have views loaded in one client but not another, so while appends are picked up whenever a view is loaded, the same isn't true for the rest of the API.

For example, if you start Patchbay and do a flumedb.append() followed by flumedb.rebuild() then all of the active views will append and then self-destruct. If you switch to Patchwork and it has different views, then they'll see the append but not the rebuild, because from the perspective of flumedb that rebuild has already been completed.

I suppose my concerns are that:

  1. We're attempting full rebuilds on level views that support del() natively.
  2. Some/many views don't support rebuilds in the first place.
  3. Rebuilds are not respected by inactive views.
User has not chosen to be hosted publicly
User has not chosen to be hosted publicly
@Christian Bundy %D+Zyr32df9TUiGr1no1Eqi1eBvENCusGekqTZqHk4sc=.sha256

@keks

You could hack flumelog-offset to add a field of bits per message (e.g. 1-8 bytes) and use one of these bits to say "this message was deleted".

Do you think that would remain backward-compatible with the current flumelog, or would that require a migration between the old format and the new format? I was also a bit surprised by some behavior I found in what I think is the current implementation: if you request an incorrect offset (e.g. 42, when the valid ones are 30 and 48), it will return a message without validating that the same length is appended after the message ends.
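
If I'm reading the framing right -- each record is written as length, data, length -- the missing check could be as small as something like this (a sketch from memory, not the current code):

```js
// Sketch of the missing validation (framing details are from memory and
// may not match flumelog-offset exactly): reject reads at offsets where
// the leading and trailing lengths disagree.
function readRecord (buf, offset) {
  const length = buf.readUInt32BE(offset)
  const trailing = buf.readUInt32BE(offset + 4 + length)
  if (trailing !== length) throw new Error('invalid offset: ' + offset)
  return buf.slice(offset + 4, offset + 4 + length)
}
```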

I have the same feeling that a log of operations, rather than just data, might be useful. It's true that Scuttlebutt has that semantic, where each message has a type, but it might be really nice for flumedb items to have this capability too. I'm still bummed that we don't have rebuilds on all flumeviews, but I suppose that will come with time.

@Dominic %tSrTn48MfAQ+KP2eGtG+p+TDe5YyK4aWU/c777vmvkY=.sha256

do you think that would remain backward-compatible with the current flumelog, or would that require a migration between the old format and the new format?

it would definitely require a migration.

return a message without validating that it also has the same length appended after the message ends.

this is a bug which can be easily fixed.

User has not chosen to be hosted publicly
User has not chosen to be hosted publicly
@Christian Bundy %mj3QVQd+P3W/WRDc5ZWbwKICP1l15w0lbXd4XEAthzk=.sha256

I had a brief conversation with @dominic where he walked me through how he'd go about implementing deletion. My implementation was mostly on the right track, but the gist is that I was messing around with this sort of nonsense instead of just filtering out deleted items before returning.

As far as I can tell it's working, but it requires changes to:

  • aligned-block-file
  • flumelog-offset
  • flumeview-level
  • flumeview-reduce (TODO)
  • flumedb
  • ssb-db
  • ssb-server

I'm thinking that the best way to go about this is probably starting with the leaves of the dependency tree and bubbling my way up from there:

[image: dependency tree (graphviz dot, png)]

Most of this work took a few iterations so I'm sure I'll need some cleanup, but I think we're close. Once this is done then we can teach flumeviews to delete single items so that we can avoid resource-intensive rebuilds, but that's still on the horizon.

@Dominic %5Mty5jmOeHb9j7hcIcUaovCRK7gHfRiNl8YSxe+bC1w=.sha256

@christianbundy can you give a brief description of what changes are needed in each of those modules?

@Christian Bundy %ukMhXyppFmKrt2ceHtTu3DQwdV0stsLcKp44dF+3i4M=.sha256

@dominic

Sure!

  • aligned-block-file: add write() that writes to a specific position rather than appending
  • flumelog-offset: add del() that overwrites an item with null bytes, filter null items from get() and stream()
  • flumeview-level: maybe bugfix, also implement del() so it doesn't have to rebuild (later)
  • flumeview-reduce: fix bug where view doesn't start rebuild until server restart (todo)
  • flumedb: add del() that triggers flumelog.del() and flumeview.del() if available, falling back to rebuild if it's unavailable
  • ssb-db: fix a bug in create.js, export flumedb.del() and flumedb.stream()
  • ssb-server: depend on new version of ssb-db (trivial)
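
To make the first two items concrete, the shape I have in mind looks roughly like this -- a sketch, not the actual patches, and `frames`, `blocks`, and a getMeta() that returns { offset, length } are assumptions about the internals:

```js
// Sketch of the flumelog-offset del() described above (not the actual
// patch): look up where the record lives, then use the new positional
// write() in aligned-block-file to overwrite it with null bytes.
function del (seq, cb) {
  frames.getMeta(seq, function (err, meta) { // assumed { offset, length }
    if (err) return cb(err)
    blocks.write(Buffer.alloc(meta.length), meta.offset, cb)
  })
}
```

The read side then just filters out anything that reads back as null bytes before handing it to ssb-db, per the second bullet.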