You are reading content from Scuttlebutt
@Dominic %EyGGCcjAbaShKFCMxXKYiZjQe17SR298D0SLTuKmZpo=.sha256

semicanonical-json

Semicanonical JSON is a subset of JSON that avoids the float problems but treats key order in objects {} as significant.

examples of non-canonical floats

As @aljoscha has pointed out recently, there are a number of valid JSON expressions that are non-canonical: they parse to something, but do not stringify back to the same thing. This shows up in particular in the handling of floats.

JSON.parse('2.00000000') => 2
JSON.parse('123e456') => Infinity

But JSON.stringify(Infinity) => "null", so that isn't canonical.

Indeed, even the representation of the exponent is liberal in what it accepts but conservative in what it emits:

JSON.parse('1e123') => 1e+123
JSON.parse('1E123') => 1e+123

Obviously, whether the e is upper or lower case, and whether the + is included, is going to change the signature.

I've tested these cases in both V8 and Firefox, and they behave the same.
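
These cases are easy to reproduce by round-tripping a handful of number literals. This is only a sketch: `roundTrip` and the list of literals are names made up for illustration, not part of any ssb library.

```javascript
// Round-trip a few JSON number literals and report which ones survive unchanged.
// `roundTrip` and `literals` are hypothetical names used only for this sketch.
function roundTrip(json) {
  return JSON.stringify(JSON.parse(json));
}

const literals = ['2', '2.00000000', '1e123', '1E123', '1e+123', '123e456'];

const results = literals.map((json) => {
  const out = roundTrip(json);
  return { json, out, canonical: out === json };
});

for (const { json, out, canonical } of results) {
  console.log(`${json} -> ${out} (${canonical ? 'canonical' : 'not canonical'})`);
}
```

Of these literals, only `2` and `1e+123` come back unchanged.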

handling of these cases in reference node.js implementation

I'm happy to say that these cases are implicitly forbidden in the reference implementation. If you created a message with a capital E in the exponent, its signature would be considered invalid. sbot receives messages on the wire, parses them, and then restringifies them when it checks the signature. This means anything that is not preserved by parsing and stringifying is excluded.

So, only JSON satisfying the following is semicanonical:
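
A rough sketch of why the round trip excludes these messages, using Node's built-in `crypto` module with an ed25519 key pair. The message shape, the compact `JSON.stringify` call, and the function names are assumptions made for illustration; this is not the actual sbot code.

```javascript
const crypto = require('crypto');

// Hypothetical key pair standing in for a feed identity.
const { publicKey, privateKey } = crypto.generateKeyPairSync('ed25519');

// The author signs the exact text they produced.
function sign(wireText) {
  return crypto.sign(null, Buffer.from(wireText, 'utf8'), privateKey);
}

// The receiver parses the wire text, restringifies it, and checks the
// signature against the restringified text rather than the received bytes.
function verifyAfterRoundTrip(wireText, signature) {
  const restringified = JSON.stringify(JSON.parse(wireText));
  return crypto.verify(null, Buffer.from(restringified, 'utf8'), publicKey, signature);
}

const good = '{"sequence":1,"value":1e+123}';
const bad = '{"sequence":1,"value":1E123}'; // capital E does not survive the round trip

const goodOk = verifyAfterRoundTrip(good, sign(good));
const badOk = verifyAfterRoundTrip(bad, sign(bad));

console.log(goodOk); // true
console.log(badOk);  // false
```

The second message is signed correctly by its author, but the receiver ends up verifying a different string, so the signature check fails.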

json === JSON.stringify(JSON.parse(json))

that is a subset of valid json, which we fortunately already use.
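
As a minimal sketch, the check can be wrapped in a predicate. The name `isSemiCanonical` is made up here, and treating unparseable input as non-canonical is an assumption of this sketch.

```javascript
// Returns true only if the text survives JSON.parse followed by JSON.stringify unchanged.
// `isSemiCanonical` is a hypothetical helper, not an existing ssb function.
function isSemiCanonical(json) {
  try {
    return JSON.stringify(JSON.parse(json)) === json;
  } catch (err) {
    return false; // not valid JSON at all
  }
}

console.log(isSemiCanonical('{"b":1,"a":2}')); // true: key order is kept as written
console.log(isSemiCanonical('{ "a": 1 }'));    // false: extra whitespace
console.log(isSemiCanonical('{"a":1E5}'));     // false: restringifies as 100000
console.log(isSemiCanonical('not json'));      // false: does not parse
```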

@aljoscha %91oWEK4ePS7ijWpYofdRTxbtu4fPSNcfWt2CM6VQ/4c=.sha256

That's nice to hear. This also means that sbot enforces canonical string escapes and canonical whitespace (disclaimer: I'm afraid to check whether it treats \n and \r\n identically).

This would also mean that if V8 changed its JSON.stringify defaults, previously valid messages would suddenly get rejected. But, good news: This only applies to objects (which we were already aware of), not to floats and strings (see below).

For those curious enough, here is the specification of JSON.stringify in all its glory. Relevant points for this discussion:

  • stringify is allowed to print object entries in arbitrary order (in 24.3.2.3, step 8 it iterates over EnumerableOwnNames, which helpfully includes a comment that this yields the same iteration order as for ... in, which is unspecified)
  • stringify specifies the escape sequences used when printing strings, so those can't randomly break on us
  • stringify specifies how to print floats. Fun fact: negative zero must be encoded as "0", so it is impossible to have it in the current ssb messages. Our number type is officially "IEEE 754 double precision floating point numbers except the NaNs, positive infinity, negative infinity and negative 0".
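
The points above can be poked at with a few round trips. This is a sketch, with `roundTrip` as a made-up helper; the key-order results are what current V8 does in practice, not something the spec text quoted above promises.

```javascript
// Hypothetical helper: parse, then stringify again.
const roundTrip = (json) => JSON.stringify(JSON.parse(json));

// Key order: string keys come back in the order they were written...
const stringKeys = roundTrip('{"b":1,"a":2}');   // '{"b":1,"a":2}'
// ...but integer-like keys are moved to the front, so this text is not preserved.
const integerKeys = roundTrip('{"b":1,"1":2}');  // '{"1":2,"b":1}'

// String escapes: unnecessary escapes are normalised, required ones are kept.
const unicodeEscape = roundTrip('"\\u0041"');    // '"A"'
const crlf = roundTrip('"a\\r\\nb"');            // '"a\\r\\nb"' (unchanged)

// Whitespace between tokens is dropped.
const spaced = roundTrip('{ "a": 1 }');          // '{"a":1}'

// Negative zero parses to -0 but is printed as "0".
const negativeZero = roundTrip('-0');            // '0'

console.log({ stringKeys, integerKeys, unicodeEscape, crlf, spaced, negativeZero });
```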

@Dominic %g0fifNmOI2iQ1WwSQvyPl6089Am+D5Q3hCh63vqBx2A=.sha256

@aljoscha "\r\n" is preserved.

So, apart from treating key order as significant, does this mean we have canonical JSON?

@aljoscha %v0FMgoxOZuEGmCI/u/qiwl9UBMPepHQW/ApXg5QXR2s=.sha256

Yup, I think so.

@aljoscha %fako/wSdcMgpPbiHizlP9Qgs2Vu4reun2c1kBY9zgws=.sha256

"Fun" realization of the day: Sequence numbers are json numbers adhering to the canonicity requirement, so once a feed reaches length 2^53, the legacy encoding can not correctly append new messages. Feeds are currently of bounded length.

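A small sketch of where the limit comes from: above 2^53, doubles can no longer represent every integer, so a sequence number written as a JSON number stops round-tripping. The variable names are made up for illustration.

```javascript
// 2^53 is the first point where adding 1 no longer changes a double.
const limit = 2 ** 53; // 9007199254740992
const stuck = limit + 1 === limit;

// A sequence number just above the limit does not survive parse + stringify.
const wire = '9007199254740993';
const restringified = JSON.stringify(JSON.parse(wire));

console.log(stuck);                                 // true
console.log(restringified);                         // '9007199254740992'
console.log(Number.MAX_SAFE_INTEGER === limit - 1); // true
```
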
@Dominic %zsLzXxmBi1Wwi0qImtyu1FMSZbcQlTKG+FudGJB975A=.sha256

2^53 is about 9*1000*1000*1000*1000*1000: over 9000 trillion messages. What is the name for 1000 trillion? 9 million billion. Maybe this is like the moment when some IBM guy said "I think there is a world market for maybe 5 computers", or Bill Gates said he couldn't imagine why anyone would need more than 640k of RAM. But I think this will be fine.

To get to that number in a year, you'd have to create roughly 300,000 messages per millisecond. Or one message roughly every 50 seconds for the age of the universe. If that happens and someone is still following you, you have to start a new feed. Sorry!

(In fact, I wouldn't be against setting a lower message number limit, such as uint32.)
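
These estimates can be rechecked with a few lines of arithmetic. The year length (365.25 days) and the age of the universe (13.8 billion years) are assumptions of this sketch, as is the one-message-per-second rate used for the uint32 case.

```javascript
const limit = 2 ** 53;

// Assumption: a year of 365.25 days.
const msPerYear = 365.25 * 24 * 60 * 60 * 1000;
const secondsPerYear = msPerYear / 1000;

// Rate needed to reach the limit within one year.
const perMillisecond = limit / msPerYear;

// Assumption: the universe is about 13.8 billion years old.
const ageOfUniverseSeconds = 13.8e9 * secondsPerYear;
const secondsPerMessage = ageOfUniverseSeconds / limit;

// For comparison: how long a uint32 limit lasts at one message per second.
const uint32Years = 2 ** 32 / secondsPerYear;

console.log(Math.round(perMillisecond));    // messages per millisecond
console.log(secondsPerMessage.toFixed(1));  // seconds between messages
console.log(uint32Years.toFixed(1));        // years until uint32 runs out
```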
