You are reading content from Scuttlebutt
@Dominic %gVFvcYy2aknKuYwThI+dD2BKr5LfNN5CA4FglEPqnow=.sha256

encrypted groups paper

I've decided that encrypted groups are complicated enough that I need to write a paper on them, like I did with secret-handshake. This thread is for my notes as I go.


reading pgp spec. ignoring signatures and other cruft - a basic pgp message is:

((public key encrypted session key: receiver.id | zeros, pk_alg.id, pk_alg.encrypt(session_key, receiver.public, sender.private)) | (symmetrically encrypted session key: string-to-key))+,
(symmetrically encrypted data: sym_alg.id, sym_alg.encrypt(message, session_key))

guide to notation: () means a "packet", which is length delimited and has a "tag" (a 1 byte id number). for symmetric algorithms: encrypt(plaintext, key) => ciphertext. for asymmetric algorithms: encrypt(plaintext, public_key, private_key) => ciphertext (note, asymmetric algorithms are only used to encrypt session keys for symmetric algorithms).

if the receiver.id | zeros field is all zeros, then the recipient is not specified, so the receiver should attempt decryption with all of their keys. This is intended to hide metadata.
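
A minimal sketch of that lookup, in Python. Everything here is an assumption for illustration: `toy_wrap` and `toy_unwrap` are a symmetric stand-in for pk_alg.encrypt (not real public key crypto, and not how pgp does it), the 8 byte all-zero id and the keyring layout are made up, and session keys are assumed to be 32 bytes.

```python
import hashlib
import hmac
import secrets
from typing import Dict, Optional

ZERO_ID = bytes(8)  # assumed encoding of "recipient not specified"


def _pad(recipient_key: bytes) -> bytes:
    # Toy pad derived from the key; a real scheme would never reuse a pad.
    return hashlib.sha256(b"pad" + recipient_key).digest()


def toy_wrap(session_key: bytes, recipient_key: bytes) -> bytes:
    """Toy stand-in for pk_alg.encrypt. Expects a 32 byte session key."""
    body = bytes(a ^ b for a, b in zip(session_key, _pad(recipient_key)))
    tag = hmac.new(recipient_key, body, hashlib.sha256).digest()
    return body + tag


def toy_unwrap(blob: bytes, recipient_key: bytes) -> Optional[bytes]:
    """Return the session key, or None if this key does not open the packet."""
    body, tag = blob[:32], blob[32:]
    expected = hmac.new(recipient_key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    return bytes(a ^ b for a, b in zip(body, _pad(recipient_key)))


def find_session_key(key_id: bytes, blob: bytes,
                     keyring: Dict[bytes, bytes]) -> Optional[bytes]:
    """keyring maps key id -> key material."""
    if key_id != ZERO_ID:
        # Recipient is named: only the matching key is tried.
        key = keyring.get(key_id)
        return toy_unwrap(blob, key) if key is not None else None
    # Recipient is hidden: attempt decryption with every key we hold.
    for key in keyring.values():
        session_key = toy_unwrap(blob, key)
        if session_key is not None:
            return session_key
    return None
```

The cost of hiding the recipient is visible in the loop: every zero-id packet costs one decryption attempt per key held.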

The message contains at least one session key packet, either public key encrypted (section 5.1) or symmetrically encrypted (section 5.3), and finally the symmetrically encrypted data (section 5.7).


usually a pgp message identifies the recipient (but it tends to be sent over email, with the to field unencrypted, anyway). even if the recipient.id is replaced with zeros, the number of recipients is still shown.

I didn't know this, but pgp can also encrypt symmetrically. The suggested way, "to a passphrase", sounds dubious, though.

following this description (so it should apply to real pgp), if you are the recipient of a group message, you could reuse the session key to encrypt another message, and reuse the same session key packets (even if you couldn't decrypt some of them).
also, you could replay someone else's encrypted messages to the same receiver, which may have confusing effects. Since pgp signatures are an optional separate packet, you could replay a message and remove the signature.

@Dominic %trG84SwK20EPGIj4MzCuUjaXCpb1cLGz/wvlfXVQg04=.sha256

pgp spec is rfc4880

@Dominic %lO+DZxXnU4C4U+bQvWJSizvdpnelLEAtHGWXHUeyD0Q=.sha256

minilock

minilock is based on nacl and uses curve25519 keys as identities (no signatures). It's essentially a format for encrypting files to multiple recipients. It uses nacl's box with long term keys inside a box with an ephemeral key, which gives some authentication but also has a wildcard capability (as described in my secret-handshake paper).

the message header is as follows.

encrypted_file = encrypt(file, file_key = random(), file_nonce = random())

{ephemeral: ephemeral.public, {
  (nonce = random(): encrypt({
     sender: sender.public,
     recipient: recipient.public,
     file: encrypt({
       key: file_key, nonce: file_nonce,
       hash: hash(encrypted_file)
     })
  }, recipient.public, ephemeral.private, nonce)
)*}}

like in pgp, the number of recipients is not hidden,
also like in pgp, the header encrypts a session key to each recipient.

one point where it differs: not just the key, but also the hash of the ciphertext is encrypted to the recipient. This means you cannot reuse the header on a new body with the same key (but you could do that with pgp).

Although, this depends on the implementation actually checking the fileHash, and the user would not notice if they didn't (unless they intentionally sent a malformed message to test this). As a principle, I prefer security constructions which simply fail if not implemented correctly.
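
Here is a rough sketch of that check, in Python, to show how little code stands between "header is bound to the body" and "header can be reused". It is not minilock's code: the per-recipient header encryption is left out (we pretend file_info has already been decrypted), `toy_stream_xor` is a toy cipher made from SHA-256, and the function names are invented. BLAKE2 is used for the hash only because it is in the standard library.

```python
import hashlib
import hmac
import secrets


def toy_stream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher (SHA-256 in counter mode); encrypt == decrypt."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def seal_file(file: bytes):
    """Return (file_info, encrypted_file); file_info is what each header carries."""
    file_key = secrets.token_bytes(32)
    file_nonce = secrets.token_bytes(16)
    encrypted_file = toy_stream_xor(file_key, file_nonce, file)
    file_info = {
        "key": file_key,
        "nonce": file_nonce,
        "hash": hashlib.blake2b(encrypted_file).digest(),
    }
    return file_info, encrypted_file


def open_file(file_info: dict, encrypted_file: bytes) -> bytes:
    """Refuse to decrypt a body that the header was not made for."""
    actual = hashlib.blake2b(encrypted_file).digest()
    if not hmac.compare_digest(actual, file_info["hash"]):
        raise ValueError("ciphertext hash mismatch")
    return toy_stream_xor(file_info["key"], file_info["nonce"], encrypted_file)
```

If the `if` in `open_file` is deleted, every honest message still decrypts fine, which is exactly the problem: nothing fails when the check is missing.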

There are basically 3 layers of encryption - the body, the outer header, and the inner header. Failure to decrypt at any of these steps results in a (distinct) error message, which is shown to the user. This can be used to extract some information.

For example, if we have an "error oracle" (someone we can send a message to and they'll tell us what error they got, which could also be detected via timing) then we can replay a file that we think they may be a recipient of, and if they respond with "Error 7: Could not validate ciphertext hash" then we know they received that message. To learn which position they were in the recipient list, we can modify a multirecipient message to have just one recipient at a time (until we get the confirming error message).

However, I do think having the hash of the ciphertext is a step towards something worthy, because it does mean we cannot replay a header with a different file and get a successful decryption.

but since I want to be able to use encrypted group messages in automatic systems (where a bot may respond to receiving an encrypted message) it's pretty important it won't be confused by a replay.

@Dominic %qSBGLPrCZ+doLULz1Y55td9i9lyON5UDBQuoytddmcI=.sha256

bitmessage uses something like nacl's seal function. it generates an ephemeral key, then diffiehelmans that with the recipient key (which means the sender is not cryptographically linked)

it uses this library:
https://github.com/yann2192/pyelliptic/blob/master/pyelliptic/ecc.py#L495-L517 (i'm not a python expert, but I think that pubkey_x, pubkey_y = ECC._decode_pubkey(...) is actually a function with two return values (ecc keys are 2d points, so x and y coords))

reading the code, it mentions "Surreptitious Forwarding Attack" as the reason for including the recipient on the inside of the message.
https://tools.ietf.org/html/draft-ietf-smime-sender-auth-00

"In this attack, Alice signs and encrypts data for Bob's eyes, and Bob re-encrypts and forwards Alice's signed data to Charlie, making the document seem to come directly from Alice to Charlie."

we don't have this in ssb because the signature is on the outside.

@Dominic %YWj+39NdDMYkhJb1cORWXpk0bnnzUOi+eUQ4uxZ91EY=.sha256

correction to minilock pseudocode:

encrypted_file = encrypt(file, file_key = random(), file_nonce = random())

{ephemeral: ephemeral.public, {
  (nonce = random(): encrypt({
     sender: sender.public,
     recipient: recipient.public,
     file: encrypt({
       key: file_key, nonce: file_nonce,
       hash: hash(encrypted_file)
     }, recipient.public, sender.private, nonce)
  }, recipient.public, ephemeral.private, nonce)
)*}}

since the recipient is already in the file information, it already has protection against Surreptitious Forwarding.
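
The receiver side of that protection is just a comparison. A tiny Python sketch, with all crypto left out: `SignedPayload` and `accept` are made-up names, and we assume the sender authentication on the payload has already been verified, so the only question left is whether it was meant for us.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignedPayload:
    """What the sender authenticated; verification itself is assumed done."""
    sender: str
    recipient: str
    text: str


def accept(payload: SignedPayload, my_id: str) -> bool:
    """Accept only payloads whose authenticated recipient field names us."""
    return payload.recipient == my_id


# Alice writes to Bob. Bob later re-encrypts the same payload for Charlie.
from_alice = SignedPayload(sender="alice", recipient="bob", text="meet at noon")
```

Bob accepts `from_alice`, but Charlie rejects it even though the sender authentication still checks out, because Bob cannot change the recipient field without invalidating it.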

@Dominic %BsC0ONs+QMHnHktF6Df94B4tMf9strmlStY2o6HnXXk=.sha256

bitmessage is not particularly interesting. messages can only have one recipient (and there is a broadcast, but I think that might be unencrypted?)


ephemeral = random()

content = [
  senderSigningKey.public,
  senderEncryptionKey.public,
  recipientId,
  message
]
//note: we are ignoring a bunch of other fields like version numbers that do not affect the cryptographic properties

signature = sign([timestamp, content], senderSigningKey.private)
//i am not sure about `timestamp`. i think it's something about the context of the message, but bitmessage is not a persistent blockchain, so it's probably not a hash.

encrypt(
  [content, signature],
  recipient.public,
  ephemeral.private
)

// the timestamp is part of the message framing

since bitmessage seeks to hide both sender and recipient it must put the signature inside the encryption. There is nothing really wrong with this given the other design decisions they have made, but it doesn't allow multiple recipients either so it's not very interesting.

note: messages that are more than 2.5 days old are discarded by the network.

@Dominic %9ARDvCQHMRMb7HbygLjjCh6JHbJgeK6cox6TbZ1Wa8Q=.sha256

this paper on "naive sign and encrypt" http://world.std.com/~dtd/sign_encrypt/sign_encrypt7.html

"when signing and encryption are combined, the inner crypto layer must somehow depend on the outer layer, so as to reveal any tampering with the outer layer."

it is mainly about encrypt(sign(msg)) (bitmessage style) and not sign(encrypt(msg)) (ssb style). unfortunately, the current ssb private messages do have a problem: you can replay someone else's ciphertext and it looks like you wrote it. that is not as destructive as forwarding a message to make it look like a different recipient was intended, though.

@Dominic %5UfZSbjWVfy9jmOqyk2GMtXoszb+uArsq/OsnEOBad4=.sha256

@keks yes but i'm looking at the structure of the protocol operations, not the choice of cryptographic primitives. I'll check out the revision, though.

@Dominic %wFXiRRdNZHNgRyde8vW+aAffz442KpbiiOEt6RuRzvw=.sha256

@keks I think it's gonna be about the various ways of doing groups of various kinds. The thing that made me realize that I had to write a paper about this, was things like - what if you have a one-way group (single writer, many reader) and then I create a two-way group (many writer, many reader) with the same key? now it looks like you are posting to the two-way group but you didn't intend that. It's not possible to cryptographically prevent you from leaking a key if it has been entrusted to you, but I think it's possible to make it so you can't leak it surreptitiously, by say, reusing that key in some way.

@Dominic %oRZOLEcBeBr7nlWrgHEbGb75fV/0jlPERM4fNkOp/M8=.sha256

I read robustness-principles-for-public-key-protocols.pdf. It has some good advice, but I don't really think that avoiding a list of "don'ts" is a great way to design protocols. It also suggests things like using a hash before signing, so that your signature algorithm cannot interact with your message. These days, all the libraries do that for you.

@Bob %FIipVCpEVY1Vyk1A5X9dq+F3W6EsBAywZdbXzq58m1E=.sha256

what do you think about ratchet?

@Bob %rM8uubMKmA7DnUuBHhMNJzclmHyg4XWlZ4tM97TzWnE=.sha256

full spec here https://github.com/trevp/double_ratchet/wiki

@Dominic %oY1n82OCZwvX4J8EVFMGP9gd5WDLcnA5Wr5O+mPQ3mY=.sha256

@keks great minds think alike! I am considering both those possibilities. kdf(key, type) would protect one-way groups, but not two-way groups.

If instead it's kdf(key, create_msg) that could work.
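
To make the difference concrete, here is a small Python sketch. The kdf is an assumption (HMAC-SHA256, since nothing in this thread picks one), and the create message ids are hypothetical hashes made up for the example.

```python
import hashlib
import hmac
import secrets


def kdf(key: bytes, info: bytes) -> bytes:
    """Assumed KDF: HMAC-SHA256 keyed with `key` over `info`."""
    return hmac.new(key, info, hashlib.sha256).digest()


key = secrets.token_bytes(32)

# kdf(key, type): the same key gives a different derived key per group type,
# so a one-way group key cannot be reused as a two-way group key...
one_way = kdf(key, b"one-way")
two_way = kdf(key, b"two-way")

# ...but two two-way groups made from the same key still collide.
two_way_again = kdf(key, b"two-way")

# kdf(key, create_msg): each creation message gives its own derived key.
create_msg_a = hashlib.sha256(b"create group A").digest()  # hypothetical ids
create_msg_b = hashlib.sha256(b"create group B").digest()
group_a = kdf(key, create_msg_a)
group_b = kdf(key, create_msg_b)
```

Here `two_way == two_way_again`, while `group_a != group_b` even though both came from the same key.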

But that adds state to the crypto, which can be good (eg in ratchet - and in ssb, a signature is only valid in a precise context in each log)

The other side of the coin, is how do you add people to groups?

Obviously you need to at some level, send them the key for that group. Actually I think the best verb is "entrust" them the key. Because you can't cryptographically prevent them from "sharing" (aka "leaking") it again. even if forward secure, you can't prevent them screenshotting you etc. you can't avoid needing to trust your partners... (remember goal is that abuses of trust be obvious, so it's not easy to abuse trust and get away without being noticed)

I'll add ratchet to the list, but read it specifically thinking about groups of recipients.

@keks forward secure has been done ;) I want flexible groups. Maybe there is a way to do that forwardly secure, but for now, I'm gonna focus on flexibility.

@Dominic %q5/P+knIfGFh7WWJ/gRUuekecQrYxQ2gdJ9LiTAJbso=.sha256

@bobhaugen for situations like that, you'd probably have a "weak" group for the outer circle, then something more exclusive, and the core team might just use regular direct messages without groups. You can boot someone from the group by creating a new key, although then you have the question of who creates it.

@keks yes, that was what I was thinking too. I need to get back to the point where I remember what was difficult about that. It was something about how encrypted messages were indexed that made me feel it was all complicated enough that it needed a paper.

@Dominic %xLwCOf1x5k+7f2YsjD1Nz7VYbvFb8dnBr7KE9ChAsoA=.sha256

I just had an idea, about how reveal messages are indexed - the idea would be to not index them.

basically, you can link to an encrypted message, but if you link to an encrypted message you include the key in the link. (taking care with who you reveal it to, obviously)

then there is a group create message, and you add someone to the group by revealing that message to them. by including the hash of that message in its derived key, you can never have two groups with the same key (of any sort of group!)

Also, if replies to a private thread included the key on the branch then you'd be able to reply to a thread on my one-way group, and other people in my group could see it, once I have responded to it. So, you'd get moderation on your personal threads!

How the indexer would work: if it decrypted a group key, it would add that to its keyring, and if it saw a keyed-link it would decrypt the target and index that if it was a group.

Okay, hmm, this might benefit from out-of-order messages being implemented. Although, if the indexer just remembered the set of incomplete items (maybe because they depend on an async network lookup)... then it can try to process those again and keep them until they are eventually processed. Hmm, even with out-of-order messages I think we'd need this anyway.


If an old group message is revealed to you, you might have to reindex some messages. But it doesn't really matter if those come immediately, so we can reindex them in the background (since it's only tens of seconds to scan the entire log, I don't think it really matters if it's slow, since the UX is already async, as long as it doesn't slow down the UX).
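
A rough Python sketch of the "keyring plus set of incomplete items" idea. None of this is real ssb code: `Indexer`, `toy_decrypt` and the tuple message format are invented for illustration, and a real indexer would persist the pending set and do the retry in the background.

```python
from typing import Callable, List, Optional, Set, Tuple

Message = Tuple[str, str]  # toy message: (key needed to open it, plaintext)


def toy_decrypt(msg: Message, key: str) -> Optional[str]:
    """Toy stand-in for unboxing: succeeds only with the matching key."""
    needed, plaintext = msg
    return plaintext if key == needed else None


class Indexer:
    def __init__(self, decrypt: Callable[[Message, str], Optional[str]] = toy_decrypt):
        self.decrypt = decrypt
        self.keyring: Set[str] = set()
        self.pending: List[Message] = []  # seen, but not yet decryptable
        self.indexed: List[str] = []

    def process(self, msg: Message) -> bool:
        """Index msg if any known key opens it, otherwise remember it."""
        for key in self.keyring:
            plaintext = self.decrypt(msg, key)
            if plaintext is not None:
                self.indexed.append(plaintext)
                return True
        self.pending.append(msg)
        return False

    def add_key(self, key: str) -> None:
        """A group key was revealed: keep it, then retry everything pending."""
        self.keyring.add(key)
        waiting, self.pending = self.pending, []
        for msg in waiting:
            self.process(msg)
```

Messages that arrive before their key just sit in `pending` and get indexed whenever `add_key` finally makes them readable, which is the same behaviour out-of-order messages would need.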

@Dominic %pIdQ2kZvh0TwafMUcyNLVYU96dHMHslYhqkYNCEhCN4=.sha256

also, link to this other discussion: "encrypted groups, lite client and the way forward"

@Dominic %WNcFe3mbJY/0j1Ch8NulDW/3QGLbGAVWhZwOsuX3Jj4=.sha256

Looking at this again, I went digging a bit to see if there was new stuff around.
this is actually old, but bluray has "broadcast encryption" which in some ways has similar properties to what we are working with:

https://en.wikipedia.org/wiki/Advanced_Access_Content_System

it's pretty simple: there is a "title key", a symmetric key that encrypts the content. Then each model of player shares a decryption key. this makes region locking possible, and also lets content producers make new content that is not playable on machines that have been cracked.

To view the movie, the player must first decrypt the content on the disc. The decryption process is somewhat convoluted. The disc contains 4 items—the Media Key Block (MKB), the Volume ID, the Encrypted Title Keys, and the Encrypted Content. The MKB is encrypted in a subset difference tree approach. Essentially, a set of keys are arranged in a tree such that any given key can be used to find every other key except its parent keys. This way, to revoke a given device key, the MKB needs only be encrypted with that device key's parent key.

Once the MKB is decrypted, it provides the Media Key, or the km. The km is combined with the Volume ID (which the program can only get by presenting a cryptographic certificate to the drive, as described above) in a one-way encryption scheme (AES-G) to produce the Volume Unique Key (Kvu). The Kvu is used to decrypt the encrypted title keys, and that is used to decrypt the encrypted content.

sounds like it's mainly concerned with encrypting the key to hundreds of keys, and making that look up efficient. There is no metadata to protect.

@Dominic %VcTTM35jYdIOm9szAnLdJbUrVf00VMwLQJMhdgdJ86Q=.sha256

@keks suggested deriving the key for group messages as group_key = kdf(key, create_msg). if you then give someone key you know that they must also know create_msg. However, that does not guarantee someone who can decrypt the message knows create_msg - they could have modified software and someone gave them group_key.

but instead, if you had group_msg_key = kdf(key, external_nonce + create_msg) (external_nonce is the previous message hash, always unique on every message), then the key used for a particular group would differ on each message, so you'd have to know both key and create_msg. They couldn't modify the software so that they wouldn't have to know create_msg. To give someone the key without telling them create_msg you'd have to give them every individual key, which you can do anyway.
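
Sketched in Python, with the same caveats as before: HMAC-SHA256 stands in for kdf (an assumption, nothing here fixes the choice), and the message hashes are hypothetical values invented for the example. Both external_nonce and create_msg are fixed length hashes here, so plain concatenation is unambiguous.

```python
import hashlib
import hmac
import secrets


def kdf(key: bytes, info: bytes) -> bytes:
    """Assumed KDF: HMAC-SHA256 (the thread does not name one)."""
    return hmac.new(key, info, hashlib.sha256).digest()


def group_msg_key(key: bytes, external_nonce: bytes, create_msg: bytes) -> bytes:
    """group_msg_key = kdf(key, external_nonce + create_msg)."""
    return kdf(key, external_nonce + create_msg)


key = secrets.token_bytes(32)
create_msg = hashlib.sha256(b"hypothetical create message").digest()
# external_nonce is the previous message hash, different for every message.
prev_1 = hashlib.sha256(b"hypothetical previous message 1").digest()
prev_2 = hashlib.sha256(b"hypothetical previous message 2").digest()

k1 = group_msg_key(key, prev_1, create_msg)
k2 = group_msg_key(key, prev_2, create_msg)
```

Since `k1 != k2`, handing someone `k1` only opens that one message. To read the next one they need `key` and `create_msg` themselves.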

This is the sort of cryptographic design that I try to create: a design you can verify just by looking at the protocol. And what's more, you can know that any implementation able to speak the protocol has that property.

@Dominic %WcM04+pudK/TIgU4sKOzBZdGHoSvR/blVA49AosyG2Q=.sha256

when creating a group, you need to pick a key. Someone just needs to know key and create_msg to decrypt the message. They don't need to have the actual create_msg, just its hash. It's possible that someone creates another group with your create message. At least they can't trick you into posting into that group, but they might make someone else think you are involved (since you created create_msg). If there is a way to look at create_msg and verify it's related to key then they could not do this. The most obvious way would be to use the same unbox key. create_msg would be encrypted, and if you have the unbox key you can open it.

Then, if group_key = kdf(unbox, external_nonce + create_msg), to verify the group creation, just retrieve (via ooo) create_msg and see if you can open it with unbox.
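
A sketch of that verification step in Python. `toy_box` and `toy_unbox` are a toy authenticated encryption built from SHA-256 and HMAC, standing in for whatever box construction would really be used; they and `creation_matches_key` are invented names, and fetching create_msg via ooo is left out.

```python
import hashlib
import hmac
import secrets
from typing import Optional


def _stream(key: bytes, nonce: bytes, n: int) -> bytes:
    # Toy keystream: SHA-256 in counter mode.
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(b"enc" + key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]


def toy_box(key: bytes, plaintext: bytes) -> bytes:
    """Toy authenticated encryption: nonce (16) | tag (32) | ciphertext."""
    nonce = secrets.token_bytes(16)
    ct = bytes(a ^ b for a, b in zip(plaintext, _stream(key, nonce, len(plaintext))))
    tag = hmac.new(key, b"mac" + nonce + ct, hashlib.sha256).digest()
    return nonce + tag + ct


def toy_unbox(key: bytes, boxed: bytes) -> Optional[bytes]:
    """Return the plaintext, or None if `key` does not open the box."""
    nonce, tag, ct = boxed[:16], boxed[16:48], boxed[48:]
    expected = hmac.new(key, b"mac" + nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    return bytes(a ^ b for a, b in zip(ct, _stream(key, nonce, len(ct))))


def creation_matches_key(unbox: bytes, boxed_create_msg: bytes) -> bool:
    """True if create_msg opens with `unbox`, i.e. it belongs to this key."""
    return toy_unbox(unbox, boxed_create_msg) is not None
```

Someone who reuses your create_msg with their own key fails this check, because their key will not open your encrypted create_msg.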

@Dominic %tT/82xuLKmX3NS+nJ65qJtDP7nbl5mnEJXg45Uibzlk=.sha256

@Stardust yes I've seen those. While signal etc does hide content very well, it doesn't protect metadata. (the service can see who is talking to who, and they need to know this to operate the service).

ssb is a secure broadcast model. We assume that everybody sees the ciphertext of all your messages, and messages are kept permanently. This is quite contrary to common goals in cryptographic design, but deniability and forward secrecy are not goals. The main goal is metadata privacy. Forward secrecy isn't possible because it requires ephemerality (forgetting messages) and it's more useful to retain old messages. (Okay, there are uses for forgetting, but ssb doesn't target that use case. It would complicate the design overall.) I'm also skeptical of the utility of deniability: it's a neat trick, but I don't think it's actually useful. With a cryptographically deniable system, you can't cryptographically prove to a third party what I said to you, but you can be sure it is me (it's interactive, so participants can know who they are talking to but non-participants can't). However, if this was actually a legal case, and one of your friends is compromised and the police (eg) get your transcript, then even if that isn't evidence on its own (because it's deniable) they can just corroborate what you've said and treat it as a lead instead. So the deniability property is only maintained if you are very, very careful about what you say. So I think it's not useful in practice.

But what is very important is metadata privacy. metadata is easier to analyze than content. Who you are having private messages with is the most vital part. something like ssb wouldn't be viable without metadata privacy, but not having forward secrecy isn't a deal breaker. (btw, if you really want forward secrecy, delete your secret and create a new identity occasionally)
