Thanks @glyph for the update, and most of all for all the brain cycles spent on figuring out the arcana of ssb encoding. Some of these questions were pretty much what I was asking myself when playing with this, but of course with very limited free time and even more limited knowledge I didn't find the answers. Having this encoded in a nice crate and explained here is super valuable; I still have a script or two that e.g. had trouble with the encoding length. Glad you figured this out!

For reference (i.e. for myself, later), here's the "count UTF-16 code units in a string" method for Python:
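
A minimal sketch of one way to do it, not necessarily the only or best one: encode to UTF-16 without a BOM and halve the byte count. The name `utf16_len` is made up for illustration, and the `surrogatepass` error handler is an assumption on my part so that strings containing lone surrogates are counted instead of raising.

```python
def utf16_len(s: str) -> int:
    """Return the number of UTF-16 code units needed to represent s."""
    # "utf-16-le" emits no byte-order mark, and every code unit is 2 bytes.
    # "surrogatepass" lets lone surrogates through as one code unit each
    # (assumption: we want to count them rather than fail).
    return len(s.encode("utf-16-le", "surrogatepass")) // 2
```

This differs from `len(s)` only for characters outside the Basic Multilingual Plane: `len("a😀")` is 2 in Python, while `utf16_len("a😀")` is 3, because the emoji takes a surrogate pair.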

