thankfully, something sensible is happening here @regular, and you did not just create a message that makes @aljoscha's spec more complicated ;)
[...]
So @aljoscha good news, reject such a message!
It's not that simple: if I sent this message to an sbot during replication, the sbot would not reject it. It would perform the character replacement and then accept the message if the hash is correct. This is the behavior we have to put into the spec, unless you deem it a bug and change sbot to flat-out reject such messages rather than patching them up and then validating them as if nothing had happened.
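To make the two options concrete, here is a minimal sketch of "patch up" versus "reject" decoding. It uses the standard `TextDecoder` API rather than whatever sbot actually calls, and the helper names `decodeLenient` and `decodeStrict` are made up for illustration, so treat it as a picture of the difference and not as the real code path.

```typescript
// Hypothetical helpers for illustration; not taken from sbot or any ssb module.

// Lenient decoding: every invalid byte sequence is silently replaced
// with U+FFFD (the replacement character), and decoding succeeds.
function decodeLenient(bytes: Uint8Array): string {
  return new TextDecoder("utf-8").decode(bytes);
}

// Strict decoding: with `fatal: true` the decoder throws a TypeError on
// invalid UTF-8, which we turn into `null` so the caller can reject.
function decodeStrict(bytes: Uint8Array): string | null {
  try {
    return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
  } catch {
    return null;
  }
}

const valid = new TextEncoder().encode("héllo");
// "hi" followed by 0xff, a byte that can never occur in valid UTF-8.
const invalid = new Uint8Array([0x68, 0x69, 0xff]);

console.log(decodeLenient(invalid) === "hi\uFFFD"); // true: patched up
console.log(decodeStrict(invalid)); // null: rejected
console.log(decodeStrict(valid)); // "héllo"
```

With the lenient variant, whatever gets hashed afterwards is the patched string and not the bytes that were received, which is exactly the behavior the spec would otherwise have to describe.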
I'll take your "good news" as confirmation that a conforming implementation should fully reject such a message, but you'll need to update the JS implementations (apparently here and here, and you might want to double-check whether those are indeed the only places where buffer -> string conversions happen).
Can I get a final ack that unconditional rejection of invalid UTF-8 is the behavior to put into the spec?