@Dominic %YmVRCpfc+fjXg24FkVq1MRSq3BhWx6tSNQl7A86Dqtw=.sha256

I've been benchmarking parts of ssb, with %bench-ssb

So far, have separate benchmarks for:

  • generating messages
  • validating messages
  • writing messages to a raw flumedb
  • writing messages to minimal secure-scuttlebutt (in progress)
  • writing messages to standard secure-scuttlebutt

The idea is that by writing a bunch of small benchmarks we can see where the problem areas are. I'm gonna get it to generate graphs etc, but currently I can see that secure-scuttlebutt write is weirdly slow. I'm seeing ~500 messages a second. That is way too slow, especially since ssb minimal can do 2.9k per second. Maybe the difference is in views? (because minimal has very few views)
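
A minimal sketch of what one of these small throughput benchmarks could look like. This is not the bench-ssb code: `measureThroughput` and the stand-in workload are hypothetical, and the loop is synchronous, whereas real ssb writes are asynchronous.

```javascript
// Hypothetical helper (not part of bench-ssb): run `fn` `count` times
// and report how many operations completed per second.
function measureThroughput (name, count, fn) {
  const start = process.hrtime.bigint()
  for (let i = 0; i < count; i++) fn(i)
  const seconds = Number(process.hrtime.bigint() - start) / 1e9
  const perSecond = seconds > 0 ? count / seconds : Infinity
  return { name, count, seconds, perSecond }
}

// Stand-in workload: build and serialise a fake message.
// A real benchmark would sign, validate or write the message instead.
const result = measureThroughput('generate', 10000, (i) =>
  JSON.stringify({ sequence: i + 1, content: { type: 'test', text: 'hello ' + i } })
)

console.log(result.name + ': ' + Math.round(result.perSecond) + ' messages/second')
```

Keeping each stage in its own benchmark like this is what makes a gap such as 500 vs 2.9k messages per second visible.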

I haven't even gotten to reads yet, but the short term goal is to get initial replication to be fast.

@Dominic %Cvp2r8vBrSUhSe+t1NFqVl3Z20CESTtlbB3K6GpY/n4=.sha256

this now has

  • reading all messages randomly (very fast!)
  • reading all messages in order by local sequence (much slower, because it depends on level, but still fast)
  • the above, but over muxrpc
  • the above, but writing to another instance (closer to replication)

Some of the benchmarks crash because of stack overflows. This is a good thing, because it means initial sync has probably been crashing and we didn't previously have a test case for this, so I expect this will be fixed soon.

Scuttlebutt needs you!

If you would like to help, clone and run the benchmarks, and see if you can figure out ways to make it faster. Maybe there are some things that can be disabled? It might be hard to actually get it running, but the point here is to get into the nuts and bolts and learn stuff.

@Dominic %HxkP0YHbAsstIFY3ZXpN5lev3NL62DEP6I0AuhXyNVA=.sha256

Hmm, okay I found something. I think https://github.com/flumedb/flumeview-level is slow. If I disable the links index, 05-ssb-legacy.js completes in 72 seconds instead of 104. That is still leaving several simpler flumeview-level instances in place...

My bet is that the culprit is https://www.npmjs.com/package/bytewise
because we are reencoding the object data as bytewise, implemented in javascript. One approach would be to make bytewise faster, or make something that has the same behaviour as bytewise but is faster.

What I want to do is discard it entirely and replace it with something lightweight and fast (also, for websbot)

This is bad news for patchcore, which depends on a bunch of flumeview-level's (some, via flumeview-query) but I believe we can still build viable apps without it. Unless we can make bytewise way faster? (hunch: maybe just rewrite for our usecase, not in general)

@Dominic %pXwzW/RQvXrfo5oOLQMivSdOMxrGJJkIT/7mTVEWkNY=.sha256

made some changes to the way flumelog-offset handles decoding - now it can keep a small cache of decoded objects, which makes having, say, 10 streams in parallel (as is the case when building indexes in an initial sync) much faster. In the flumelog-offset benchmark, it's 4-5 times faster than decoding each record, and it's maybe 10% faster on bench-ssb/05-ssb-legacy.js, but that is mainly slowed by the views. Still, it makes a difference!
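
A rough sketch of the caching idea, assuming records are addressed by their offset. `createDecodeCache` is a hypothetical name and this is not the flumelog-offset implementation, just a small least-recently-used cache in front of a decoder.

```javascript
// Hypothetical sketch: cache decoded records by offset so that several
// streams reading the same part of the log decode each record only once.
function createDecodeCache (decode, maxSize) {
  const cache = new Map() // a Map iterates in insertion order, oldest first
  let decodes = 0
  return {
    get (offset, raw) {
      if (cache.has(offset)) {
        const hit = cache.get(offset)
        cache.delete(offset) // re-insert so this entry counts as most recent
        cache.set(offset, hit)
        return hit
      }
      decodes++
      const value = decode(raw)
      cache.set(offset, value)
      // Evict the least recently used entry once the cache is over capacity.
      if (cache.size > maxSize) cache.delete(cache.keys().next().value)
      return value
    },
    decodeCount () { return decodes }
  }
}

const records = ['{"seq":1}', '{"seq":2}', '{"seq":3}']
const decodeCache = createDecodeCache(JSON.parse, 2)

// Ten "streams" (e.g. ten views being built) read the same records in step.
for (let offset = 0; offset < records.length; offset++) {
  for (let stream = 0; stream < 10; stream++) decodeCache.get(offset, records[offset])
}

console.log('reads: 30, decodes: ' + decodeCache.decodeCount())
```

The cache only helps while the streams stay close together in the log; a stream that falls far behind will miss and decode again.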

@Anders %65OcpUB2qVAMsqfl+cZMtD7JKMT4WwUvPznU5ZvChUs=.sha256

This is great @dominic. Very happy to read these updates. I'll try to look at your repo after I get a bit further with patch-book. Benchmark and optimization is one of my favorite programming sports :soccer:

@Dominic %TQFUb56xiERc2U/SdWdftVff+tKod9N0YzJBUTNgzBk=.sha256

whadayaknow? I found a string based reimplementation of bytewise sitting there in my dev directory.

http://github.com/dominictarr/charwise

it's not complete but the stuff we need is simple (strings, numbers, bool, undefined, null, not sure about nested arrays?)
But for the simple stuff, it's much faster than bytewise!

@Dominic %bjd0zYKzJi7CgglI/TaqWBelp7Q4jJoErGwiqxQJDis=.sha256

challenge issued: https://github.com/deanlandolt/bytewise/issues/23

@cel %dZzW83UZ92o9qYDOknIlNVWRxf6o2FGnZlz1hWTVaF0=.sha256

i think ssb-links needs nested arrays for its indexes like ['source', 'rel', 'dest', 'ts'] where rel is an array

@Dominic %hi2hbixiRe5kg4eeRq8hEVJ0jcZuFbLiSQdU9ll6vao=.sha256

@cel yeah, I think flumeview-query needs that. It's not hard to add that, I think you just need to have a terminator on the encoded array, so that short arrays sort before longer arrays, and arrays sort before other values (that might be inside an array, or after, whatever it was)
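
A minimal sketch of the terminator idea, under several assumptions: the type tags, the `!` terminator and the supported types are all made up for illustration, numbers are limited to non-negative safe integers, and none of this is the actual charwise or bytewise format. The point is only that plain string comparison of the encoded keys gives the ordering described above.

```javascript
// Hypothetical order-preserving string encoding (not the charwise format).
// Tags are chosen so that: array end < array < null < false < true < number < string.
const ARRAY_END = '!' // sorts below every type tag, so shorter arrays sort first

function encode (value) {
  if (Array.isArray(value)) return 'A' + value.map(encode).join('') + ARRAY_END
  if (value === null) return 'B'
  if (value === false) return 'C'
  if (value === true) return 'D'
  if (Number.isSafeInteger(value) && value >= 0) {
    // Fixed width, so numeric order matches string order.
    return 'N' + String(value).padStart(16, '0')
  }
  if (typeof value === 'string') {
    // Escape \u0000 and \u0001 so \u0000 can end the string and a
    // string still sorts before any longer string it is a prefix of.
    const escaped = value.replace(/[\u0000\u0001]/g, (c) =>
      c === '\u0000' ? '\u0001\u0001' : '\u0001\u0002')
    return 'S' + escaped + '\u0000'
  }
  throw new TypeError('unsupported value in this sketch')
}

// Index keys shaped like [source, rel, dest, ts], where rel may be an array.
const keys = [
  ['@alice', ['mentions', '@bob'], '@bob', 20],
  ['@alice', ['mentions'], '@bob', 10],
  ['@alice', [], '@carol', 5],
  ['@alice', 'vote', '@bob', 1]
]

const sorted = keys
  .map((key) => ({ key, encoded: encode(key) }))
  .sort((a, b) => (a.encoded < b.encoded ? -1 : a.encoded > b.encoded ? 1 : 0))
  .map((entry) => entry.key)

sorted.forEach((key) => console.log(JSON.stringify(key)))
```

With these keys, the three entries whose rel is an array come first, shortest rel array first, and the entry with the plain string 'vote' comes last.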

@Dominic %9mmssBHoGx3Et7zJYg+0+U3gW5klDIxqchjqVDGtgDU=.sha256

I also added a benchmark for full, old style replication, and to my surprise I discovered that ssb-friends was a bottleneck!

The problem was obvious - it iterates the entire follow graph twice whenever it sees a contact message! There are some pretty simple ways to improve this - one is to batch these ops, add a bunch of items, then traverse. But the simpler option is just to check if a follow message is gonna move a peer closer! (quite often it doesn't because it's probably just a friend following another friend)
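
A sketch of the "does this follow move a peer closer?" check, assuming the graph tracks each peer's hop distance from your own feed. `createFollowGraph` and its methods are hypothetical names, not the ssb-friends API, and unfollows and blocks are left out.

```javascript
// Hypothetical sketch (not the ssb-friends code): keep hop counts from
// `self` and only traverse when a new follow actually shortens a path.
function createFollowGraph (self) {
  const follows = new Map() // id -> Set of ids that id follows
  const hops = new Map([[self, 0]]) // id -> shortest known distance from self
  let traversals = 0

  function addFollow (from, to) {
    if (!follows.has(from)) follows.set(from, new Set())
    follows.get(from).add(to)

    // Cheap checks: an unreachable follower, or a target that is already
    // at least as close, cannot change any hop count.
    if (!hops.has(from)) return false
    const candidate = hops.get(from) + 1
    if (hops.has(to) && hops.get(to) <= candidate) return false

    // The follow moves `to` closer: update outward from `to` only.
    traversals++
    hops.set(to, candidate)
    const queue = [to]
    while (queue.length > 0) {
      const id = queue.shift()
      const next = hops.get(id) + 1
      for (const followed of follows.get(id) || []) {
        if (!hops.has(followed) || hops.get(followed) > next) {
          hops.set(followed, next)
          queue.push(followed)
        }
      }
    }
    return true
  }

  return { addFollow, hops, traversalCount: () => traversals }
}

const graph = createFollowGraph('me')
graph.addFollow('me', 'alice')
graph.addFollow('me', 'bob')
graph.addFollow('alice', 'bob') // a friend following a friend: nothing to do
graph.addFollow('bob', 'carol')
graph.addFollow('dave', 'erin') // dave is not reachable yet: nothing to do
graph.addFollow('carol', 'dave') // now dave and erin become reachable
graph.addFollow('me', 'carol') // carol, dave and erin all move one hop closer

console.log(graph.hops, 'traversals:', graph.traversalCount())
```

Of the seven follows above, two are skipped by the cheap check, and the rest only touch the part of the graph downstream of the peer that moved.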

@Anders %qtlDuAjSp8kKtoBSC6tylIRnkHbKatLqsfjQBJyd9TU=.sha256

I'm having trouble cloning this repo. Getting stuck with "Fetching blobs: 5/6" and git-ssb-web shows: "Missing blobs for latest 1 update"

@Anders %Oriv8xzmPkuSkUGA7qdrSZm3v128mRGKYyteoP7gmB4=.sha256

Got the blob now after a while.

@Dominic %vKNZG53onXFZeQSOkzJTWkp+bJtzpcRX6eS/PDQgndk=.sha256

@matt do you remember how long it took to do an initial sync when we measured it at art-hack a few weeks back? I've just fixed some stuff in ssb-friends that was making sync performance really bad. ssb-friends should never have been a perf problem, it was just bad code - correct code, but the wrong algorithm - exactly the sort of problem that benchmarks catch!

update ssb-friends@2.3.1 to get it!

@Dominic %WsSbVWJ7Q6B1ygB+5XtQlydJeykxEupd4rW7g5gxC3A=.sha256

oops, no I broke something...

@Dominic %VEa9pJ7dM72ZgG1cVQCmkr41pxacSW80XkusY8BbV0A=.sha256

okay, it's fixed -- use ssb-friends@2.3.2 instead (2.3.1 is unpublished)

@mmckegg %5nceeXN5hHvts2dO7soExzdU7zccdBT4iYpe7ZrKbnU=.sha256

@Dominic

do you remember how long it took to do an initial sync when we measured it at art-hack a few weeks back?

I think it was about 5 and a half minutes or so.

Just tried out ssb-friends@2.3.2. Seems to be working well! Individual feeds are syncing much faster. Not sure about overall time, I think it has improved too. Added this to patchwork!

@Dominic %0vVUDcA3bOn5X+vCXPjIWsC96iiefinXoTQIMcbfyMI=.sha256

@matt the other thing I noticed is that flumeview-level is kinda slow (because of bytewise encoding) ... and there are a bunch of those indexes.

hmm, to merge ssb-ebt we are gonna need to update various pubs to use ebt, and probably need to update the gossip strategies too. about time because that code is a mess

@Dominic %DpBpO8wPnEOV0T8EiGcLeNdFRpYcrsAvrGWVyBIfCWs=.sha256

I noticed that flumeview-reduce was making too many writes. I had actually implemented proper flow control, but hadn't passed the configuration in! It now waits for incoming records to stop for half a second before writing, which causes it to write once early, then again after everything has stopped syncing. (instead of writing many times a second!)
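
A sketch of the "wait for half a second of quiet" part, assuming a simple trailing debounce. `createQuietWriter` is a hypothetical name, this is not the flumeview-reduce code, and it does not reproduce the early write mentioned above. The `timers` argument is only there so the timing can be replaced in tests.

```javascript
// Hypothetical sketch: postpone the write until no new state has arrived
// for `quietMs` milliseconds, then write only the latest state.
function createQuietWriter (write, quietMs, timers) {
  const t = timers || { setTimeout, clearTimeout }
  let pending = null
  let latest
  return function update (state) {
    latest = state
    // Every new record restarts the quiet period.
    if (pending !== null) t.clearTimeout(pending)
    pending = t.setTimeout(() => {
      pending = null
      write(latest)
    }, quietMs)
  }
}

const written = []
const update = createQuietWriter((state) => {
  written.push(state)
  console.log('wrote state after quiet period:', state)
}, 500)

// A burst of incoming records produces one write, not a thousand.
for (let count = 1; count <= 1000; count++) update({ count })
```

During a long sync this trades durability of the view state for far fewer disk writes; the state can be rebuilt from the log if the process stops before the write happens.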

@matt update to secure-scuttlebutt@17.0.2 to get another 20% performance improvement on initial sync! make sure you are using flumeview-reduce@1.3.9
