You are reading content from Scuttlebutt
@aljoscha %zx0GVnlMl179osXbP2r8lC1a9VsDn8tiZli+1BdA6kA=.sha256
Re: %NeOkZhPNh

Thank you @regular.

Sbot replaced the invalid utf8 with the replacement character U+FFFD. So now we need to decide whether to codify this behavior in the spec, or whether to change sbot.

I see three main reasons for changing sbot to reject invalid utf8 instead of performing lossy replacement:

  • If we allow invalid utf8, then the rpc protocol isn't valid json anymore. That violates the json spec, which js ssb otherwise follows.
  • Any string that actually contains U+FFFD has multiple valid encodings, namely all those that substitute an invalid utf8 sequence for the U+FFFD character. Conforming implementations have to accept and convert all of them.
  • The legacy message format becomes even more ridiculous.
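
To make the second point concrete, here is a minimal Python sketch. It is not sbot code: the names `decode_lossy`, `decode_strict`, `VALID` and `INVALID` are made up for illustration, and I'm assuming sbot's replacement behaves like an ordinary lossy utf8 decoder.

```python
# Hypothetical illustration, not taken from the sbot code base.

# U+FFFD encoded properly as utf8 (bytes EF BF BD) between "a" and "b".
VALID = "a\ufffdb".encode("utf-8")

# The byte 0xFF can never occur in valid utf8.
INVALID = b"a\xffb"


def decode_lossy(data: bytes) -> str:
    """Replace every invalid sequence with U+FFFD (the lossy behavior)."""
    return data.decode("utf-8", errors="replace")


def decode_strict(data: bytes) -> str:
    """Raise UnicodeDecodeError on invalid utf8 (the rejecting behavior)."""
    return data.decode("utf-8", errors="strict")


if __name__ == "__main__":
    # Two different byte strings collapse into the same decoded string.
    print(decode_lossy(VALID) == decode_lossy(INVALID))

    # The strict decoder refuses the invalid input instead.
    try:
        decode_strict(INVALID)
    except UnicodeDecodeError as err:
        print("rejected:", err.reason)
```

With lossy decoding, both inputs end up as the same string, so a receiver can no longer tell which bytes were actually sent. With strict decoding, only the properly encoded input is accepted.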

Reasons for keeping the silent replacement behavior:

  • No need to fix the sbot code base.

My preferred choice is clear, but I guess it is @Dominic who gets to decide this?
