Dev Diary 04/10/2018
Today was somewhat grindy, chasing a lot of errors.
- talked: %6uhFP/G...
- failed to set up afl for rust: https://github.com/rust-fuzz/afl.rs/issues/141
- set up cargo-fuzz: https://github.com/ssbrs/legacy-msg/commit/57b8767b861d4762b1374b9737bd98dc3f0de8e3
Fuzzers are magical: this is all the test code I wrote to weed out errors in the json parser:

```rust
#![no_main]
#[macro_use]
extern crate libfuzzer_sys;
extern crate ssb_legacy_msg;

use ssb_legacy_msg::json::{from_slice, to_vec, Value};

fuzz_target!(|data: &[u8]| {
    // Anything that parses successfully must survive an encode/decode round-trip.
    match from_slice::<Value>(data) {
        Ok(val) => {
            let sign_json = to_vec(&val, true);
            let redecoded = from_slice::<Value>(&sign_json[..]).unwrap();
            assert_eq!(val, redecoded);
        }
        Err(_) => {}
    }
});
```
fixed everything the fuzzer found in a few seconds of running: https://github.com/ssbrs/legacy-msg/commit/f2cb1656d2bcb3283a6775ddd30e2cfac745c5b0
The fuzzer found that f64::from_str erroneously rejects some long numbers such as 11111111111111111111111111111111111111111111111111111111111111111111111111e-323 (which can still be precisely represented as an f64). Guess I'll now have to look into float deserialization...
- strtod to the rescue: there are a few remaining problems, but those are easily fixed
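A minimal stdlib-only sketch of the property at stake (names and inputs chosen for illustration, not taken from the fuzzer's corpus): a decimal string that parses to an f64 must re-serialize and re-parse to the same value, even down in the subnormal range around e-323.

```rust
// Round-trip property in the spirit of the fuzzer: parse, re-serialize in
// exponent notation, parse again, and expect the identical f64.
// 1e-323 is subnormal but exactly representable as an f64.
fn roundtrips(s: &str) -> bool {
    match s.parse::<f64>() {
        Ok(v) => format!("{:e}", v).parse::<f64>().ok() == Some(v),
        Err(_) => false,
    }
}

fn main() {
    assert!(roundtrips("1e-323"));
    // Equivalent spellings of the same number must parse to the same f64.
    assert_eq!("1.0e-323".parse::<f64>().ok(), "1e-323".parse::<f64>().ok());
    println!("ok");
}
```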
- started fuzzing strtod, because why not: https://github.com/ssbrs/strtod/blob/master/fuzz/fuzz_targets/fuzz_target_1.rs
- implemented the weird utf16/latin1 encoding used for hash computation, the one that throws away a bunch of bytes
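My reading of that encoding, as a hedged sketch rather than the crate's actual code: take the string's utf16 code units and keep only the low byte of each, which is what node's Buffer.from(str, 'binary'/'latin1') does.

```rust
// Keep only the low byte of every utf16 code unit, dropping the high byte.
// This mirrors node's Buffer.from(str, 'binary') behavior, which is (as far
// as I can tell) what the legacy hash computation relies on.
fn latin1_ish(s: &str) -> Vec<u8> {
    s.encode_utf16().map(|unit| unit as u8).collect()
}

fn main() {
    assert_eq!(latin1_ish("abc"), b"abc".to_vec());
    assert_eq!(latin1_ish("ÿ"), vec![0xff]); // U+00FF fits in one byte
    assert_eq!(latin1_ish("Ā"), vec![0x00]); // U+0100: high byte dropped
    // U+1F600 encodes as surrogates 0xD83D 0xDE00, so two low bytes survive.
    assert_eq!(latin1_ish("😀"), vec![0x3d, 0x00]);
    println!("ok");
}
```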
- created a script for generating test data for legacy json messages: https://github.com/sunrise-choir/legacy-msg-js/blob/master/index.js
- ran that script over the corpus produced by fuzzing the rust roundtrip encoding/decoding
- created a rust executable that tests conformance of the rust impl with this test data
  - not published yet because it loads some code from the file system for a faster feedback loop
- still a few errors to fix, but not today...
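The conformance check boils down to a simple diff loop. A minimal sketch under stated assumptions: encode_signing_json is a made-up stand-in (stubbed as the identity so the sketch is self-contained), NOT the real API in ssb_legacy_msg, and the inline pairs stand in for the js-generated test data that the real tool loads from disk.

```rust
// Hypothetical conformance loop: for each (input, expected) pair produced by
// the js script, run the Rust encoder and diff the results.
fn encode_signing_json(input: &str) -> String {
    input.to_string() // placeholder for the real parse + canonical re-encode
}

fn main() {
    // In the real tool these pairs are loaded from the file system.
    let cases = [("{}", "{}"), ("[1,2]", "[1,2]")];
    let mut failures = 0;
    for (input, expected) in &cases {
        let actual = encode_signing_json(input);
        if actual != *expected {
            failures += 1;
            eprintln!("mismatch for {}: got {}", input, actual);
        }
    }
    assert_eq!(failures, 0, "conformance failures");
    println!("{} cases pass", cases.len());
}
```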
Next steps:
- fix the remaining errors so that the rs implementation is compliant with the js one
- a lot more fuzzing, then minimize the corpus, generate test data, and publish a test suite
- decide on an order in which to do the remaining legacy message work:
  - spec and impl json metadata
  - clean up the code
  - clean up the specs
  - define and implement cbor encoding for legacy messages
  - js bindings?