Deleting feeds and comparing flumelogs
In the past few months I've been increasingly interested in feed deletion, which gives users the ability to delete content from folks that they've blocked. Being unable to delete feeds from your computer is disempowering, and means that bad people can post bad things that have to stay on your computer forever.
That's bad.
I've been working on feed deletion with #verse and I've been making some great progress adding deletion to the following JavaScript modules:
- aligned-block-file
- flumelog-offset
- flumedb
- ssb-db
- ssb-server
The gist is pretty simple: flumelog-offset stores all messages in one big file, and deletion works by replacing the target message with something else. My first approach was to replace it with a bunch of 0s, but ssb-db always expects a message and tends to blow up when you hand it empty buffers.
Currently my approach is to add a fake message, which is fine, but I'm feeling unsatisfied by the amount of hackiness required to delete items from the log.
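To make the idea concrete, here's a rough sketch of overwrite-in-place deletion. It does not use flumelog-offset's real on-disk format or API: the framing (a 4-byte length prefix followed by a JSON payload) and the names `frame`, `tombstone`, `deleteAt`, and `readAll` are all made up for illustration. The point is just that the replacement has to be a parseable message of exactly the same length, so that every later offset stays valid.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Hypothetical framing, NOT flumelog-offset's format:
// [4-byte big-endian length][JSON payload]
const frame = (payload) => {
  const header = Buffer.alloc(4)
  header.writeUInt32BE(payload.length, 0)
  return Buffer.concat([header, payload])
}

// Build a valid JSON "fake message" that is exactly `length` bytes long,
// padding with spaces so the record keeps its original size.
const tombstone = (length) => {
  const base = Buffer.byteLength(JSON.stringify({ deleted: true, pad: '' }))
  if (length < base) throw new Error('record too small to hold a tombstone')
  const pad = ' '.repeat(length - base)
  return Buffer.from(JSON.stringify({ deleted: true, pad }))
}

// Overwrite the payload of the record that starts at `offset`.
const deleteAt = (file, offset) => {
  const fd = fs.openSync(file, 'r+')
  try {
    const header = Buffer.alloc(4)
    fs.readSync(fd, header, 0, 4, offset)
    const length = header.readUInt32BE(0)
    fs.writeSync(fd, tombstone(length), 0, length, offset + 4)
  } finally {
    fs.closeSync(fd)
  }
}

// Read every record back, which only works if each one still parses.
const readAll = (file) => {
  const data = fs.readFileSync(file)
  const records = []
  for (let pos = 0; pos < data.length;) {
    const length = data.readUInt32BE(pos)
    records.push(JSON.parse(data.subarray(pos + 4, pos + 4 + length).toString()))
    pos += 4 + length
  }
  return records
}

// Demo: write three messages, then delete the middle one in place.
const file = path.join(os.tmpdir(), `log-sketch-${process.pid}.bin`)
const messages = [
  { author: '@alice', text: 'hello' },
  { author: '@mallory', text: 'something awful that should not stay here' },
  { author: '@bob', text: 'hi' }
]
const frames = messages.map((m) => frame(Buffer.from(JSON.stringify(m))))
fs.writeFileSync(file, Buffer.concat(frames))

const sizeBefore = fs.statSync(file).size
deleteAt(file, frames[0].length) // the second record starts right after the first
const after = readAll(file)
const sizeAfter = fs.statSync(file).size
fs.unlinkSync(file)
```

Even in this toy version you can see where the hackiness comes from: the log never shrinks, the tombstone has to fit the old record's size exactly, and every reader has to know what a fake message looks like.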
Over time I've been wondering: why flumelog-offset? It's impressively fast, but being double the speed of leveldb doesn't matter unless write speed is the bottleneck. The benchmarks show flumelog-offset writing at 13 MB/s, but in reality I've never seen it write faster than ~800 KiB/s because the bottleneck is our peering bandwidth, not the file write speed.
If that's the case, does it matter that flumelog-level only writes at 8 MB/s? I'm not completely sold on leveldb, but it seems to have a lot of positives:
- Already-implemented read, write, update, and deletion.
- Streaming as a first-class citizen.
- Loads of compatibility, from a Raspberry Pi to a browser.
- Heaps of documentation, support, and smart folks working on it.
- We're already using it for many/most of our flumeviews.
Don't get me wrong, I think flumelog-offset is incredibly impressive and unexpectedly fast as an append-only database, but I'm wondering which trade-offs we're making when we optimize for appends and ignore mutability.
What do you all think? I'd love some feedback on your experiences with flumelog-offset, leveldb, and thoughts on what we should look for in a log.
cc: @dominic @arj @mix @regular #flumedb ( #flume ) #javascript #scuttlebot