@cryptix %1ts45jTyl1H+paGP9iUBa4D1/QX6UqDC+DJfpZ7Fc2U=.sha256

Next episode of my #dev-diary for #ngipointer. E01 about private messages and (re-)indexing was here.

The theme for this one is EBT, live streaming, muxrpc and how abstractions can haunt you. Since my work mostly revolves around #go-ssb, this will be quite #golang specific but I will try to make it entertaining for everyone.

Recapping the current state of go-ssb always strains me. There is a lot of "if I wanted to, it would already do X" nested inside hidden gotchas. Or simply areas where I hoped to get more help from others, or just underestimated the involved complexity myself and couldn't get it done.

One central area for ssb is what we call live streaming. This usually means forwarding messages as they arrive. You can do createHistoryStream --id @theFeed --live (or the JS version of it) and it will keep printing messages of that feed forever. Network outages introduce a delay, but once the connection recovers, new messages are displayed as soon as the local bot has verified and ingested them. The databases also support this concept, which means that you can open a query for a thread or gathering and new updates trickle in as they permeate the network. Building reactive things with this is very neat (if the UI is built so that it doesn't flicker, but let's not go there here and now).

Now, how does this relate to go-ssb? Well, as you might have guessed, it doesn't do it live. Conceptually the flumedb port (margaret) supports live queries, but the legacy replication doesn't use them, which means it has to fire-and-forget the replication calls over and over, essentially making it a polling solution. Mostly because doing it live introduces a hefty overhead of juggling 15-20k individual queries, one for each feed.
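
To illustrate what such a live query looks like conceptually, here is a rough sketch against margaret. The names are from memory and might not match the current API exactly, and receiveLog and lastSeq are just placeholders:

// sketch only: open a query that starts after lastSeq and stays open
src, err := receiveLog.Query(margaret.Gt(lastSeq), margaret.Live(true))
// handle err

for {
  // Next blocks until the next message is appended (or ctx is cancelled)
  v, err := src.Next(ctx)
  if luigi.IsEOS(err) { break } // only non-live queries ever reach the end
  // handle other errors

  // v is the next stored message, or a freshly ingested one once we caught up
}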

About 11 months ago, when I was still working for planetary, I tried to re-write this central piece of infrastructure and to come up with hacks to shrink that overhead. I also wrote a new kind of test suite that connects multiple sbots and lets messages trickle across different network setups, like 1-to-N fan-out scenarios, making sure that messages appear before a certain timeout.

The muxrpc re-write sadly turned into a disaster and its name (SunkenSource) shall never be spoken again. It was the leakiest and most unstable code I ever produced. Deadlines hamper code quality a lot.

Now, a year later, I'm tasked with implementing EBT for NGI. And one central thing about EBT, apart from reducing the "do you have anything newer than X?" overhead, is that it's all live. So... let's see what I came up with this time in the next post.

@cryptix %BALMzpFezTRtg+jA8aVjIF5h0SbkhxUoLMEvy3JDIl0=.sha256

One word of warning and acknowledgement for now: I'm not trying to make legacy replication performant on the back of this. It will be conceptually functional and the tests will pass, but those only cover a small set of feeds. Scaling this to tens of thousands just isn't in scope. That's what EBT is for anyhow. My current assumption is that you can enable certain feature flags to switch this stuff on and off (like prefer EBT but only do legacy non-live/polling) and live with the flakiness of your decisions.

Next, let's talk briefly about muxrpc and our Go port. While I talked about its issues last year, I sadly don't see a replacement in the here and now. My last hope (QUIC) was just ripped out of WebRTC and Chromium and, in full Google spirit, everyone is just shrugging about what's going on and whether it's coming back.

So, with that out of the way... as a small recap of the different kinds of streams that muxrpc offers: we have async, source, sink and duplex calls. Source means the caller reads from it (and the server writes to it). Sink is the reverse (the caller sends something to the server). Async is basically a source with a single element in the stream. Finally, duplex means both sink and source (server and client can send and receive).
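
To make the four kinds a bit more tangible, here is a purely conceptual sketch of what each one hands to the caller (these are not the actual go-muxrpc signatures, and it assumes the usual context, io and encoding/json imports):

// conceptual only, not the real go-muxrpc API
type AsyncCall  func(ctx context.Context, args ...interface{}) (json.RawMessage, error)               // one reply
type SourceCall func(ctx context.Context, args ...interface{}) (io.ReadCloser, error)                 // caller reads
type SinkCall   func(ctx context.Context, args ...interface{}) (io.WriteCloser, error)                // caller writes
type DuplexCall func(ctx context.Context, args ...interface{}) (io.ReadCloser, io.WriteCloser, error) // both at once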

With regard to replication, in legacy mode we had lots of source calls (one createHistoryStream per feed). Under EBT it's one gigantic duplex call that all the messages get stuffed through.

@cryptix %cJ075n28swi6zzuYbfZYzXpysfJTOshnrnpIal5ye5A=.sha256

While go-muxrpc v1 supported all those types, using them was not as performant as I would have liked, and the problems show once you turn it up to eleven and try to sync the whole network.

Our streaming iterator abstraction, luigi, uses too many empty interfaces (go-ssb major refactor issue 48 also outlines this problem).

tl;dr: if you see interface{} ask why, a lot.

In Go everything has a type, even if you don't care about it. If you want to accept all the types in your print function, you tell the compiler func print(arg interface{}) and then you can pass everything to it. Looks neat from the outside and is incredibly easy to use, too.
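
As a tiny self-contained example (our print here shadows the builtin, which is fine):

package main

import "fmt"

// print accepts literally any value, because every type satisfies interface{}
func print(arg interface{}) {
  fmt.Println(arg)
}

func main() {
  print(42)
  print("hello")
  print([]string{"a", "b"})
}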

interface{} is an inline type literal and means "the interface with zero functions". For comparison, here is the common io.Reader interface with one function:

type Reader interface {
  Read([]byte) (int, error)
}

So, to save everyone a big chunk of trouble, the Go team could have added this as a builtin, saving a lot of typing while also having a place to document the gotchas:

type any interface{}

The neat thing about interfaces is that you usually don't care about the concrete type of a thing. It can be a file or a network connection or a buffer. If you just want to read from it, make your function parameter of type io.Reader and the compiler checks that whatever gets passed in actually satisfies it.
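
For example (countLines is just an illustration, using the standard bufio and io packages):

// countLines doesn't care whether r is a file, a TCP connection or an in-memory buffer
func countLines(r io.Reader) (int, error) {
  sc := bufio.NewScanner(r)
  n := 0
  for sc.Scan() {
    n++
  }
  return n, sc.Err()
}

// all of these just work:
//   countLines(os.Stdin)
//   countLines(tcpConn)
//   countLines(strings.NewReader("a\nb\nc"))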

However, some functions have to do things depending on the concrete type. For instance unmarshaling JSON: do you want a struct with some fields filled, or is it a map[string]... of some sort? The list goes on and on, but the signature of the function is just func Unmarshal(data []byte, v interface{}) error. Inside, it does its job with what is commonly known as reflection, which is a fancy way of saying "asking the runtime for type information". The kicker here is runtime. Since using empty interfaces means giving up compile-time guarantees, the only thing you are left with is finding out at runtime whether you can stuff a JSON blob into a hoozit.
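
Both of the following calls go through that same empty-interface parameter; only reflection at runtime tells the two targets apart (assuming encoding/json is imported):

data := []byte(`{"type":"post","text":"hi"}`)

// typed target: the fields are fixed at compile time
var post struct {
  Type string `json:"type"`
  Text string `json:"text"`
}
if err := json.Unmarshal(data, &post); err != nil {
  // handle err
}

// untyped target: everything is figured out at runtime
var blob map[string]interface{}
if err := json.Unmarshal(data, &blob); err != nil {
  // handle err
}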

This long and windy explanation boils down to a single argument: empty interfaces are good for exactly one area, and that is dynamic behavior where you don't know up front what you are dealing with.

IO-heavy data piping does not fall in that area and this is the abstraction that is now coming back to haunt me.

Not only is the onus on the programmer to keep track of what's going in and out of a luigi stream, which is really bonkers in hindsight: go-ssb was an attempt to make ssb less magical, and then we pull the rug out by subjecting coders to a dynamic stream abstraction inside a strongly typed language. It also makes it impossible to do this efficiently, since every value that you pass into an empty interface has to be boxed behind a pointer and escapes to the heap, where it incurs additional garbage collection costs. Even if all you are doing is passing lots of integers around in a tight loop.
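
A toy example of that cost, just to make the point concrete (not go-ssb code):

// building a []interface{} boxes every int behind a pointer (heap allocations,
// more GC work); a plain []int keeps the values unboxed
func sumBoxed(vals []interface{}) int {
  total := 0
  for _, v := range vals {
    total += v.(int) // plus a runtime type assertion on every element
  }
  return total
}

func sumInts(vals []int) int {
  total := 0
  for _, v := range vals {
    total += v
  }
  return total
}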

@cryptix %LmqKf4lB513hCTotX8A06lbuwXS35BYBWU29STMx0DA=.sha256

That being said, when I started out working on EBT, I told myself: "you know luigi is a bad plumber, but if you want to get this done you will have to squint and ignore it. Making it proper will be much simpler once the structure and tests are done." Right?! Well... this worked for a couple of weeks, until I glued the parts together and started sending actual messages.

Due to a fluke in duplex mode, I couldn't get the raw bytes out of a duplex stream and since the crappy message format is really strict, it's impossible to validate once the JSON system turned it into map[string]interface{} (which is the default for an object with unknown fields).

It was this shitty hand that made me finally re-think the luigi abstractions and set out to write go-muxrpc v2, which doesn't use it (working branch: wario). At its core, it replaces luigi sink and source with ByteSink and ByteSource.

So where before you did this to read from a luigi source:

// method name and a type-hint (other args omitted for clarity)
src, err := muxrpc.Source("createHistoryStream", theActualType{}, ...)
// handle err

for {
  // ctx is for timeout control (like if the connection goes away)
  v, err := src.Next(ctx)
  if luigi.IsEOS(err) { break }
  // handle other errors

  // we told the client what type we want the results as, but,
  // as dictated by luigi.Source, the return from Next is always interface{},
  // so we need to runtime-assert it before it can be used
  msg, ok := v.(theActualType)
  if !ok { /* handle unexpected type */ }
  // ... work with msg
}

With a ByteSource you get a Next function and a normal reader.

// method name and expected result encoding flag
src, err := muxv2.Source("createHistoryStream", TypeJSON, args)
// handle err

for src.Next(ctx) {
    var a theActualType
    err = src.Reader(func(r io.Reader) error {
        // decode into the variable a
        return json.NewDecoder(r).Decode(&a)
    })
    // handle err

    // 1) you can now directly use the value without any assertion madness

    // 2) it's possible to re-use the buffer that is read into between the loop calls!

    // this also works if you don't care about allocations
    // pktBody, err := src.Bytes()
}

Similarly, a ByteSink just gives you an io.Writer and you need to know whether you want to fill it with JSON or whatever.
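
A rough sketch of the sink side, mirroring the simplified Source call above (the constructor name, the method name, someValue and Close are assumptions here, not the settled API):

// hypothetical sink constructor, analogous to muxv2.Source above
snk, err := muxv2.Sink("some.sinkMethod", TypeJSON, args)
// handle err

// the sink behaves like an io.Writer, so a plain encoder does the job
err = json.NewEncoder(snk).Encode(someValue)
// handle err

// close the stream once you are done writing
err = snk.Close()
// handle err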

It's more explicit, yes, but it doesn't treat you like it knows what's best and then only go half the way.

Realizing these problems was my Christmas this year. Having converted most of go-ssb to this new API now, I really like where it's at. It's hard to pinpoint a single example; the brave might poke into the commits of my current WIP branch.

@cryptix %GOMsRFQacji0fjlK/xlBNhxNky9avlJXPMPawFzjaFs=.sha256

Coming up next: refactors to enable feature negotiation between legacy and EBT, how to (not) store vector clocks and the current state of testing.

@Anders %6veJYQtFDp2F5fExn23NHCLzaJnrJi5UQhXEL3gmyCE=.sha256

That is a nice writeup. The new interface looks a bit like templates from C++?

@Anders %56a6KCZzy03O6sDBBhGZ0piUE9dZIyu4UaTEWh+XKsM=.sha256

Remote go before:

full feeds: 2.797s
partial feeds: 30.752s

Remote go after:

full feeds: 1.891s
partial feeds: 19.352s

🚀

@cryptix %TISxuQSuF56CD+as4HSzNl/7PrPaGX3pocUnBy2NRhs=.sha256

The numbers arj just posted are from running scripts/full-sync.js from browser-core.

I'm amazed to see them. They clearly demonstrate the overhead that was put on the runtime before, and there is still quite a bit left, since I just focused on the muxrpc side of it to work on EBT. The database queries have the same problem but that will be for another time.

@cryptix %Tfmj5V37Hjfio9NsXkUcLqZ6HD0lHEEy1vJu5Xr0prY=.sha256

Medium-sized update to this mad adventure: I spent the last couple of weeks polishing the muxrpc v2 re-write (still not released, but on this branch), which included discovering (previously) untested behavior. Mostly around errors when creating streams for unhandled methods, since there is no concept of an ACK or similar on stream creation. It's fixed now, but I'm unhappy with the solution because it involves a timer, which might misfire if the other peer has really high latency, is very busy, or both.

Another fun one I first ran into when starting the EBT work: getting early data on a stream that you didn't accept. On a sink or duplex call, where data is sent to the remote party, the stream-creating peer can start writing immediately. The previous code answered the first packet with an EndErr but then removed the request id and assumed the next packet was a new request. The fact that request ids shouldn't be reused might deserve some more documentation, and I will link to this post when I open an issue for a muxrpc ACK packet on stream creation, since it would help with both of these issues and could also serve as a starting vehicle for flow control, because the receiving side could limit stream creation.

@cryptix %t9azDturwroAcd29zDPST5rqBPFhNbhS3Htjxa2i7UY=.sha256

I'm not 100% happy with the state of EBT but it's not in bad shape. I will copy the recent comments from the GitHub PR here:

Works

  • Verifies incoming messages
  • Stores vector clocks of each peer it talked to and updates them
  • Streams messages to peers live
  • Only sends messages of feeds that a peer wants (receive bit on the note)
  • Filters by feed type (just disabled everything that isn't current ssb for now)

TODO

1) Update the vector clock / want list more than just once after connecting
Currently the set of feeds we want to receive is only sent when the connection is established.
The JS documentation is a bit lacking about when to actually re-send those notes and thus confirm received messages.
But it also doesn't send a clock update once we call sbot.Replicate(feed), which means reception is delayed until a reconnect... at least this part should be fixed before merging this.

2) Implement a more scalable solution for canceling individual feed subscriptions
Right now there is a context per feed. This works but is inefficient. I probably need to re-work the multisink so that feeds can be unsubscribed directly.

3) Partition sets of feeds over multiple connections
One of the advantages of EBT, other than not asking about feeds that didn't change, is receiving feeds from the best known source. For this we need a bit of heuristics, like the ping to the peer and who has messages sooner.

...

Preliminary testing results: functional, but with a couple of glitches.

  1. Rebuilding of the own/self network frontier after changes on the replication lister (graph walk)
    The TODO above acknowledged a slight variation of this, but it's actually worse. Right now, it only builds the network frontier for EBT when it doesn't have one... any (un)follows after the first EBT connection have no effect.

  2. Follow-graph walk finishes too soon
    Tangential, and I'm fairly sure it was already the case before these changes. The graph/builder walk along the follow messages sometimes collapses and returns too few results.

  3. Untangle the circular dependency between state matrix and graph builder
    Before, the graph builder was hooked up after the indexing system.
    The way it's set up now, the combined index needs the state matrix to update it with newly received messages... which in turn needs the graph for the first fill.
    This circle needs to be broken so that the server setup doesn't race.

(The mentioned deadlock glitch turned out to be a muxrpc problem, unrelated to EBT but related to the missing ACK and the "how to decide if this stream will error" problem I mentioned in the previous post.)

@cryptix %5URTnw/M8nzo6aD0Jy2ybgahkegqKYt56DQWFNN6ULo=.sha256

Lastly, I did a comparison of the JS test suite with what I have. @arj helped me fill in some of the blanks.

Comparison of the JS test suite:

  • binary.js - a test for encoding binary data instead of json blobs
    not supported at all yet and thus not tested.
    needs support for encoding messages by type/sign algo.

  • block.js - not sending messages to blocked peers
    not tested yet.
    though, blocked peers shouldn't end up on the feed want list.

  • chain.js - three peers connected in series
    translates to sbot/feed_live_test.go where this is done with chains of 2, 3 and 5.

  • live-stall.js - network of 4 peers and one of them stops after a couple of ticks
    not yet tested but should be easy to do with feed_live

  • multiple.js - when connected to multiple peers, a feed should be replicated from only one
    not yet implemented, thus not tested

  • timeout.js - tests similar behavior to multiple.js AFAICT
    not yet implemented, thus not tested

  • simple.js - connect two peers and sync 3 messages
    covered by sbot/feed_*

  • two.js / three.js - connect a couple of peers in series
    covered by sbot/feed_live_test.go

  • fork.js - bob has a forked version of alice's feed, should disable rx

  • client.js - who sends their note first, client or server?
    I slightly misunderstood the protocol at first. I thought it was only important which side starts the duplex stream, and that each side then sends their notes/vector clock.
    But it's a bit more intricate, as highlighted by the comments in this GitHub discussion.

The biggest outstanding issue is fork detection and stopping reception of those feeds.
