dev diary - muxrpc
I've been working on rewriting muxrpc for the last few days. This needed to happen because it doesn't have back pressure... and streaming async systems need back pressure. I believe this is a cause of a number of performance problems - currently if you, say, call createHistoryStream for an entire feed, it will load all the messages and write them to the network without waiting for any kind of back pressure (and in initial sync, this will happen thousands of times in a short period).
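To make the problem concrete, here is a minimal sketch. It is not muxrpc code: `makeNetwork`, `sendAllAtOnce`, `sendWithBackPressure` and the `highWaterMark` of 16 are all made up for illustration. It models a network that flushes one message per step and compares how many messages pile up in memory with and without back pressure.

```javascript
// Hypothetical model of a network connection that flushes one message per step.
function makeNetwork() {
  const net = { buffer: [], maxBuffered: 0, sent: 0 }
  net.write = (msg) => {
    net.buffer.push(msg)
    net.maxBuffered = Math.max(net.maxBuffered, net.buffer.length)
  }
  net.flushOne = () => {
    if (net.buffer.length) { net.buffer.shift(); net.sent++ }
  }
  return net
}

// No back pressure: every message is written before anything is flushed.
function sendAllAtOnce(messages, net) {
  for (const msg of messages) net.write(msg)
  while (net.buffer.length) net.flushOne()
}

// Back pressure: only write while fewer than highWaterMark messages are buffered.
function sendWithBackPressure(messages, net, highWaterMark) {
  let i = 0
  while (i < messages.length || net.buffer.length) {
    while (i < messages.length && net.buffer.length < highWaterMark) {
      net.write(messages[i++])
    }
    net.flushOne()
  }
}

const feed = Array.from({ length: 1000 }, (_, i) => ({ sequence: i + 1 }))

const unbounded = makeNetwork()
sendAllAtOnce(feed, unbounded)

const bounded = makeNetwork()
sendWithBackPressure(feed, bounded, 16)

console.log(unbounded.maxBuffered, bounded.maxBuffered) // 1000 16
```

Both versions deliver every message; the difference is only in how much sits in memory at once.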
When I originally started working on ssb I tried to make multiplexing that had back pressure, with pull-streams, but it was too hard... so I decided to come back to it later. In the meantime, I used some hacks to get around the problem (such as, for the UI, using limit and, instead of reading one stream, reading many streams in "pages").
Since muxrpc didn't use back pressure, it didn't use pull-streams either. It used something I called "weird-streams"... Yes, the name was a sign that it was a duct tape solution.
A few months ago I had a discussion with someone calling themselves "SlugFilter" in the comments of my pull-stream blog post that made me realize a case I had missed with pull-streams (and not needed, anyway).
But looking at weird-streams again made me think I should adapt that idea to support back pressure... and I did, and I called it push-stream. (I also remember @creationix working on a model like this too - he also independently came up with pretty much the same api as pull-streams, and used it in js-git. I remember he also explored an object oriented model but wasn't happier with it than with closures.)
Anyway, push-streams aren't quite as code minimal as pull-streams, but are probably easier to understand, and it's also looking like they use less memory. They are much, much simpler than any flavor of node streams, and in a microbenchmark, 3-4x faster than pull-streams, which are 6x faster than node streams (please remember this is a _micro_benchmark about setting up the pipeline; in a real world case, there might be so much other stuff happening that the stream overhead isn't relevant. That said, node streams are very bloated!)
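For readers who have not seen a push-style stream with back pressure, here is a rough sketch of the general idea. It should not be read as the actual push-stream API: `valuesSource`, `batchSink`, `drain` and the `paused` flag are assumptions made for this example. The source pushes into the sink until the sink says it is paused, and the sink calls back into the source when it is ready for more.

```javascript
// Hypothetical push-style source: pushes values until the sink reports `paused`.
function valuesSource(values) {
  return {
    sink: null,
    index: 0,
    ended: false,
    pipe(sink) {
      this.sink = sink
      sink.source = this
      this.resume()
      return sink
    },
    // Called on pipe, and again by the sink whenever it can accept more data.
    resume() {
      while (!this.sink.paused && this.index < values.length) {
        this.sink.write(values[this.index++])
      }
      if (this.index === values.length && !this.ended) {
        this.ended = true
        this.sink.end()
      }
    }
  }
}

// Hypothetical sink: holds at most `capacity` pending items, then pauses.
function batchSink(capacity) {
  return {
    paused: false,
    source: null,
    pending: [],
    received: [],
    maxPending: 0,
    ended: false,
    write(value) {
      this.pending.push(value)
      this.maxPending = Math.max(this.maxPending, this.pending.length)
      if (this.pending.length >= capacity) this.paused = true
    },
    end() { this.ended = true },
    // Consume the pending batch, then ask the source to continue.
    drain() {
      this.received.push(...this.pending)
      this.pending = []
      this.paused = false
      if (this.source) this.source.resume()
    }
  }
}

const sink = valuesSource([1, 2, 3, 4, 5]).pipe(batchSink(2))
sink.drain()
sink.drain()
sink.drain()
console.log(sink.received, sink.maxPending) // [ 1, 2, 3, 4, 5 ] 2
```

The sink never holds more than two items at a time, even though the source would happily push all five at once.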
Okay - so push-mux is not gonna start out with full back pressure; the initial goal will be to reflect tcp back pressure into the substreams. This will stop a whole-database sync from loading the database into memory faster than it can be written to the network. I'll measure this effect on perf and see how it goes!
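One way to picture "reflecting tcp back pressure into the substreams" is sketched below. This is not push-mux code: `fakeSocket`, `createMux`, `addSubstream` and the `highWaterMark` of 4 are invented for the example, and the socket is a stand-in rather than a real TCP connection. Every substream checks the one shared socket's pause state before writing, so when the socket backs up, all substreams stop together.

```javascript
// Stand-in for a TCP socket: counts as paused once `highWaterMark` frames are buffered.
function fakeSocket(highWaterMark) {
  return {
    buffered: [],
    sent: [],
    maxBuffered: 0,
    get paused() { return this.buffered.length >= highWaterMark },
    write(frame) {
      this.buffered.push(frame)
      this.maxBuffered = Math.max(this.maxBuffered, this.buffered.length)
    },
    // Simulates the network accepting everything buffered so far.
    drain() {
      this.sent.push(...this.buffered)
      this.buffered = []
    }
  }
}

// Hypothetical multiplexer: all substreams share the socket's pause state.
function createMux(socket) {
  const sources = []
  return {
    // Attach a list of messages as a substream tagged with `id`.
    addSubstream(id, messages) {
      let index = 0
      const source = {
        get done() { return index >= messages.length },
        // Write only while the shared socket is not paused.
        resume() {
          while (!socket.paused && index < messages.length) {
            socket.write({ id, data: messages[index++] })
          }
        }
      }
      sources.push(source)
      source.resume()
    },
    // Call after the socket drains so unfinished substreams can continue.
    resume() { for (const source of sources) source.resume() },
    get done() { return sources.every((source) => source.done) }
  }
}

const socket = fakeSocket(4)
const mux = createMux(socket)
mux.addSubstream('a', [1, 2, 3, 4, 5, 6])
mux.addSubstream('b', [10, 20, 30])

while (!mux.done || socket.buffered.length) {
  socket.drain()
  mux.resume()
}

console.log(socket.sent.length, socket.maxBuffered) // 9 4
```

Note that substream 'a' gets first go at the socket on every resume, so this sketch makes no attempt at fairness between substreams; it only shows memory staying bounded by the socket's limit.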