SAFE communication protocols

For a laugh I looked into compiling the safe-api into wasm and creating a simple SAFE:GET function in a normal browser so I could look at text and pictures on the SAFE network in a browser without needing to proxy it. It’s not that simple of course!

That exercise took me down the path of trying to do a simple SAFE:GET from golang, nodejs, python etc without depending on any maidsafe code, just using quic libraries.

I managed to connect to the vault but quic is still very experimental so I never got past the handshake. Honestly even being able to connect was above my expectations, so please consider this topic very experimental.

It made me wonder, what’s the full spec for doing something like a GET on SAFE, without using rust or maidsafe.

It’s not something I have a clear answer on (or expect to have one any time soon) but here’s some brain dump. If anyone has anything to add that would be cool.

  • Uses quic protocol
  • Vault offers quic versions of 0xa1a2a3a 0xff00001b
  • Looks like messages are serialised with serde using bincode, not sure how cross-language bincode is…?
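On the cross-language question, here is a rough sketch of what another language would have to reproduce by hand. It assumes the bincode 1.x default layout as I understand it (fixed-width little-endian integers, u64 length prefixes, struct fields in declaration order with no names). The `Get` struct and its fields are made up for illustration and are not a real SAFE message.

```rust
// Hand-rolled sketch of a bincode-1.x-style layout (an assumption, see above).
// `Get`, `id` and `name` are hypothetical, not a real SAFE message type.
pub struct Get {
    pub id: u32,
    pub name: String,
}

pub fn encode_get(msg: &Get) -> Vec<u8> {
    let mut out = Vec::new();
    // u32 field: 4 bytes, little-endian.
    out.extend_from_slice(&msg.id.to_le_bytes());
    // String field: byte length as a little-endian u64, then the UTF-8 bytes.
    out.extend_from_slice(&(msg.name.len() as u64).to_le_bytes());
    out.extend_from_slice(msg.name.as_bytes());
    out
}
```

Nothing in those bytes says which field is which, so a golang or python reader needs the rust struct definition (or a spec derived from it) to decode them.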

Anyway, just taking notes on my experiments, if you have any other info or experiments would be cool to hear it.

6 Likes

You’re right on serde + bincode. The comms are all serialised and sent through quic-p2p right now.

A quick google makes it look like bincode is designed for rust only, which probably means we should look to remove it down the line and use something a bit more universal.

qp2p shepherds bytes with a WireMsg format, and everything coming from SCL is a Message type (or will be… this is in the midst of an overhaul to clarify things a bit more).

I’m not suuupper au fait with our network layer comms (yet! digging into qp2p at the moment actually), but definitely an interesting exercise! And happy to help answer any more Qs that may come up :+1:

3 Likes

Might be of relevance: Writing a core library in Rust --- possibilities for SAFE Browser (WASM)

2 Likes

This is the list of formats implemented for serde

JSON (inefficient)
Bincode (rust only)
CBOR
YAML (inefficient)
MessagePack
TOML (inefficient)
Pickle
RON (rust only)
BSON
Avro
JSON5 (inefficient)
Postcard (rust only)
URL (inefficient)
Envy (impractical)
Envy Store (impractical)
S-expressions (impractical)
D-Bus’s binary wire format
FlexBuffers (rust only)

I’m surprised not to see protobuf or capnproto in the list.

From the remaining options there are some that are fairly specific to certain use cases even though they are portable

CBOR
MessagePack
Pickle (vendor specific, python)
BSON (vendor specific, mongodb)
Avro (vendor specific, hadoop)
D-Bus’s binary wire format (seems to have a fairly complex interface)

So that leaves CBOR and MessagePack.

3 Likes

(Ignore my earlier post, I misread the source I linked and mentally swapped CBOR and MessagePack, but can’t edit or delete the post because of permissions right now) :upside_down_face:
EDIT: Seems to display properly now

Seems CBOR is claimed to be more “flexible”, but the design and stability seems to come into question. Don’t know all the requirements for datatypes SAFE would need, but MessagePack is extensible at any rate, and seems like less complexity and fewer potential bugs has good security implications in the long term.

1 Like

I can’t delete the reply but have made it a wiki and edited it. You may not be able to edit though - but I can remove the content if you want.

1 Like

I’m not familiar with what it means to make the reply a wiki, but, at any rate, it seems to display as intended in the timeline, so I suppose it’s fine now, thanks!

2 Likes

You need more trust to do things like make a post into a wiki, also to edit your own posts it seems. You can always ask the mods to do stuff.

2 Likes

I tried modifying vault and dependencies replacing bincode with cbor or messagepack to hopefully do some benchmarking of performance, but their interfaces are different enough that this is not a trivial task. Just making a note of it here, might go back to this one day cause I’m fairly curious about it.

Interesting that all the serde options seem to be schemaless, whereas protobuf and capnproto (which seem like good options to me) require schema definitions.

I feel there’s some strength in having schemas, especially for versioning and upgrades and deprecation purposes.

Also interesting that serde does not have xml as an option (not that I would want it just an observation!).

I also note that safe-vault uses pickledb quite a bit (which is not the same as serde-pickle used for serialization despite having similar origins in python pickle).
https://github.com/maidsafe/safe-vault/search?q=pickle

3 Likes

Following on from these two topics

Rust error handling for production and test

Integration tests

Testing public interfaces.

and

Simple web-based tool for BLS keys

the output from safe-api commands did not seem to be compatible with any existing BLS tools, despite being based on a standard BLS pairing BLS12-381

It seems like use of bincode is one of the major bits affecting cross-language compatibility in the chain of “data standard X (eg bls12-381)” > “encoding Y (eg bincode+hex)” > “communicated to Z (eg quic)”

One of the easiest ways to ensure the integrity of this chain is specifying integration tests with explicit language-agnostic test vectors, such as

and

https://github.com/bitcoin/bips/blob/master/bip-0032.mediawiki#Test_Vectors

Now, even though these particular example vectors would currently still depend on bincode, it would mean that a python or golang or nodejs or c# or whatever language could (in theory) still check whether it complies with the protocol.
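A minimal sketch of what a fixed test vector table could look like. The encoder here is a stand-in (MessagePack fixstr for short strings, which I'm confident about from the msgpack spec). The function names and the vectors themselves are hypothetical and not taken from any maidsafe repo.

```rust
/// Lower-case hex so vectors can be compared as plain strings in any language.
pub fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

/// Stand-in encoder under test: MessagePack fixstr (strings under 32 bytes).
pub fn encode_short_str(s: &str) -> Vec<u8> {
    assert!(s.len() < 32, "fixstr holds at most 31 bytes");
    let mut out = vec![0xa0 | s.len() as u8];
    out.extend_from_slice(s.as_bytes());
    out
}

/// (input, expected hex) pairs. The same table could be checked from
/// python, golang, nodejs etc without touching any rust code.
pub const VECTORS: &[(&str, &str)] = &[
    ("", "a0"),
    ("a", "a161"),
    ("safe", "a473616665"),
];

pub fn check_vectors() -> Result<(), String> {
    for (input, expected) in VECTORS {
        let got = to_hex(&encode_short_str(input));
        if got != *expected {
            return Err(format!("{:?}: expected {}, got {}", input, expected, got));
        }
    }
    Ok(())
}
```

The point is that the expected hex lives in a plain table, so a second implementation only has to agree with the table and not with the rust code.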

Looking through the threshold_crypto tests there are no fixed test vectors because the library (probably rightly so) does not concern itself with serialization or inter-language compatibility.

So the question arises, should maidsafe code be specifying some fixed test vectors? Should serialization be moved away from bincode for the benefit of inter-language compatibility? I feel this is a major point for the robustness of the network since a more diverse ecosystem of libraries and development reduces the impact of bugs. I know it’s too early for that ecosystem to begin, but if test vectors are left till the end I feel that would be too late.

Doing this would also force a strong consideration around what is being encoded, eg in threshold_crypto for a multisig wallet is it best to record just the secret key shares or should the poly used to derive them also be retained for the user?

My main point here is SAFE is currently not possible to use without rust (FFI is still depending on the underlying rust), and would for a very long time remain that way since it’s very hard to reverse the public interfaces into, say, Java, mainly because of the use of bincode but also because the content of the messages is implicit in the data structures and not explicit in, say, a schema.

3 Likes

Came across this today explaining current blockchain serialization approaches:

As part of the upgrade to Ethereum 2.0, researchers and developers have been working on significant improvements to the Ethereum protocol in addition to its network architecture. Based on what was learned with RLP, Simple SerialiZe (SSZ) was developed as the culmination of this hard work.

All data structures in Bitcoin use a custom Bitcoin specific serialization format. The standard that is followed is the Bitcoin defined standard, not any other standard.

Different forms of serialization are being used in Secure Scuttlebutt, a global cryptographic social network. Here, they’re using JSON for its messages which configures messages to a specific format to allow signing.

Looks like SAFE will be using msgpack: https://safenetforum.org/t/safe-network-dev-update-august-20-2020/32496

We will switch bincode to MsgPack for serialisation

I’ve been playing with comparing bincode and msgpack by converting vault and safe-api from bincode to msgpack (using rmp-serde 0.14.4). There is a lot of bincode throughout the repos! Mostly it’s trivial, but there are some tough spots where my rust is falling short. Gonna keep trying to get msgpack working cause it’s a great learning experience. Then hopefully put up some comparisons of speed / bandwidth between bincode and msgpack.

eg mostly it’s this sort of replacement:

// serialize
- let mut bytes = bincode::serialize(&payload)?;
+ let mut bytes = rmp_serde::encode::to_vec(&payload)?;

// deserialize
- bincode::deserialize(&bytes).ok()
+ rmp_serde::decode::from_read_ref(&bytes).ok()
3 Likes

We’ll be most likely making a switch after we get the latest work stabilised (there’s been big changes, so we want to get everything going w/ that w/ bincode before making any switch).

Thanks for researching and pushing on this. It’s been very helpful.

I should think so. :+1:

4 Likes

Of course this makes perfect sense. Especially with removal of parsec, it’s not sensible to make this change until that’s completed.

I have been working with a version of routing heavily codependent on parsec and I can easily see it makes no sense to mess with bincode while such an entanglement+removal is happening. But I’m happily using fixed versions of code so I’m pushing ahead anyway with trying to serialize everything using msgpack.


A few things I’ve learned so far in trying to convert from bincode to msgpack (using rmp-serde)

Using non-string Map keys with messagepack needs careful attention. eg

HashMap<Digest256, State>
used in
routing SignatureAccumulator
has a bytes array as the key which needs to be sure it serializes/deserializes as expected.

BTreeMap<(AccumulatingEvent, bls::PublicKey), State>
used for
routing unaccumulated_events
uses a complex enum as the key and this does not serialize as expected.
Serializing enums as keys is not necessarily trivial. In JSON (not sure about msgpack) keys must be strings, which has implications for deserialization. eg if a key is an integer it’s worth double checking that it deserializes as the correct integer 0 and not, say, the first byte of the JSON string "0" (the quote character ", ie 34) or the char 0 (ie 48). I’ve still got to do a lot of investigation of the nuance of Map keys in messagepack.
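For what it's worth, the msgpack spec itself allows any type as a map key, so an integer key and a string key are different bytes on the wire. A small hand-rolled sketch (the function names are made up for illustration, and this writes the bytes directly instead of going through rmp-serde):

```rust
/// One-entry MessagePack map {key: true} with an integer key.
/// 0x81 = fixmap with 1 entry, 0xc3 = true.
pub fn map_with_int_key(key: u8) -> Vec<u8> {
    assert!(key < 128, "positive fixint only covers 0..=127");
    vec![0x81, key, 0xc3]
}

/// Same map, but with the key written as its decimal string.
/// 0xa0 | len = fixstr header.
pub fn map_with_str_key(key: u8) -> Vec<u8> {
    let text = key.to_string();
    let mut out = vec![0x81, 0xa0 | text.len() as u8];
    out.extend_from_slice(text.as_bytes());
    out.push(0xc3);
    out
}
```

With key 0 the integer form carries the byte 0 and the string form carries 161 then 48, so a decoder that only accepts string keys and one that accepts integer keys will not agree on the same bytes.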

I’m working on a stripped down demo which highlights all the caveats I’ve encountered and wanted to clarify. Still learning so much so no point publishing it yet. My purpose with the demo code is to be an interactive document to help explain the conversion difficulties from bincode to rmp-serde that I have encountered in converting safe-vault and safe-api (and all dependencies).

I experimented with changing the default serialization / deserialization of xor-url crate XorName type so it would produce a hex string. This helped clean up XorNames-as-map-keys. Before the change XorNames would be encoded like this:

{
  "[20,209,154,101,60,103,242,197,206,207,99,239,140,201,160,188,57,220,126,153,111,121,107,250,77,42,5,153,102,227,23,121]": [<the values are removed>]
}

and after the change they became like this

{
  "0f4071bee2f2623d799b7705741acdb513ae4a142df657abb15840226788d3f4": [<the values are removed>]
}

Is the original ‘array string key’ an artifact of converting from msgpack to json? Does the original ‘array string key’ deserialize correctly? Which format is less bytes in the serialized form? It’s something I’ve got to test. No point converting to and fro hex unless it’s needed.
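On the size question, the three candidate encodings of a 32 byte name can be compared by writing the msgpack bytes by hand. This is only a sketch based on the msgpack spec (bin8, str8, array16 and uint8 headers). Which of these rmp-serde actually picks depends on how the type implements Serialize, so treat the mapping to XorName as an assumption. The function names are hypothetical.

```rust
/// bin8: 0xc4, length, raw bytes.
pub fn as_bin(name: &[u8; 32]) -> Vec<u8> {
    let mut out = vec![0xc4, 32];
    out.extend_from_slice(name);
    out
}

/// str8: 0xd9, length, then 64 hex characters.
pub fn as_hex_str(name: &[u8; 32]) -> Vec<u8> {
    let mut out = vec![0xd9, 64];
    for b in name {
        out.extend_from_slice(format!("{:02x}", b).as_bytes());
    }
    out
}

/// array16 of integers: 0xdc, 2 byte big-endian length, then one element
/// per byte. Values of 128 and over need a 0xcc (uint8) prefix.
pub fn as_int_array(name: &[u8; 32]) -> Vec<u8> {
    let mut out = vec![0xdc, 0x00, 32];
    for &b in name {
        if b >= 128 {
            out.push(0xcc);
        }
        out.push(b);
    }
    out
}
```

So raw bytes come to 34 bytes, the hex string to 66, and the integer array anywhere from 35 to 67 depending on how many values are 128 or above.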

In general, what is the correct way to visualize and inspect messagepack data? I don’t know yet. What is the impact of viewing messagepack in a ‘convert to json’ tool?

There’s some decision to make about how compact to make serialization, eg having struct fields as msgpack map keys, or just having struct values as an array and interpret by the order of fields in the codebase. I think most compact is best, but it takes a very good doc and spec and tests to make that workable.

When serializing with rmp-serde there’s an option called with_struct_map that will make the encoding larger but also more verbose / explicit, see this example. Without this, the order of fields within a struct is important. That introduces a risk when adding new fields to the struct: if a field is added in the middle of the struct, deserialization is no longer backward compatible. So it’s a tradeoff between robustness vs compactness and carefulness. Good docs and tests and specs are the best antidote I feel in this case. I think the project can get away with using the most compact possible format, just be aware that doesn’t mean a free lunch. eg

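struct Person {
  name: &'static str, // a bare `str` field would not compile with a literal
  age: u8,
  weight: u8,
}
let p = Person {name: "Barry", age: 39, weight: 70};
serialize(&p); // placeholder for whichever serializer call is in use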
// compact serialization, implied order of fields,
// needs good docs and tests so we know what
// each value corresponds to.
// Could be easy to accidentally switch age and weight.
["Barry",39,70]
// robust serialization, order of fields can be changed in code
// without consequence.
{"name":"Barry","age":39,"weight":70}
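To put numbers on the tradeoff, here are the two layouts for Barry written out by hand as msgpack bytes. This is a sketch straight from the msgpack spec (fixarray, fixmap, fixstr, positive fixint) and does not call rmp-serde. The helper names are hypothetical.

```rust
fn put_fixstr(out: &mut Vec<u8>, s: &str) {
    assert!(s.len() < 32, "fixstr holds at most 31 bytes");
    out.push(0xa0 | s.len() as u8);
    out.extend_from_slice(s.as_bytes());
}

fn put_small_uint(out: &mut Vec<u8>, n: u8) {
    assert!(n < 128, "positive fixint only covers 0..=127");
    out.push(n);
}

/// Compact form: a 3 element array, meaning comes from field order alone.
pub fn person_compact(name: &str, age: u8, weight: u8) -> Vec<u8> {
    let mut out = vec![0x93];
    put_fixstr(&mut out, name);
    put_small_uint(&mut out, age);
    put_small_uint(&mut out, weight);
    out
}

/// Robust form: a 3 entry map keyed by field name.
pub fn person_named(name: &str, age: u8, weight: u8) -> Vec<u8> {
    let mut out = vec![0x83];
    put_fixstr(&mut out, "name");
    put_fixstr(&mut out, name);
    put_fixstr(&mut out, "age");
    put_small_uint(&mut out, age);
    put_fixstr(&mut out, "weight");
    put_small_uint(&mut out, weight);
    out
}
```

For this tiny struct the compact form is 9 bytes and the named form is 25, so the field names cost more than the data itself. In the compact form swapping age and weight still produces perfectly valid bytes, which is exactly the risk described above.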

The other thing is being clear about serializing for local use vs wire use. Bincode is fine in, say, chunk_store or config, since it’s all local. But the same data objects going into the chunk store must also hit the wire at some point. So it’d be useful to have clearly separate serialization methods, eg serialize_for_local vs serialize_for_comms or something like that. Ambiguity is really tough to reason about, especially with deeply nested data structures spanning multiple libraries. Best to make this bluntly explicit.
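A rough sketch of how blunt that split could be. Everything here is hypothetical (the `Encode` trait, the `Chunk` struct and both layouts). The local form imitates a bincode-style fixed-width layout and the comms form is hand-written msgpack, just to show two explicit entry points on one type.

```rust
/// Hypothetical trait with one explicit method per destination.
pub trait Encode {
    fn serialize_for_local(&self) -> Vec<u8>;
    fn serialize_for_comms(&self) -> Vec<u8>;
}

/// Hypothetical stand-in for something kept in a chunk store.
pub struct Chunk {
    pub id: u32,
    pub data: Vec<u8>,
}

impl Encode for Chunk {
    // Local form: little-endian id, u64 length prefix, raw data.
    fn serialize_for_local(&self) -> Vec<u8> {
        let mut out = Vec::new();
        out.extend_from_slice(&self.id.to_le_bytes());
        out.extend_from_slice(&(self.data.len() as u64).to_le_bytes());
        out.extend_from_slice(&self.data);
        out
    }

    // Wire form: msgpack array [uint32, bin8]. The id always uses the
    // 0xce (uint32) header, which is valid msgpack but not the smallest form.
    fn serialize_for_comms(&self) -> Vec<u8> {
        assert!(self.data.len() < 256, "sketch only handles bin8");
        let mut out = vec![0x92, 0xce];
        out.extend_from_slice(&self.id.to_be_bytes());
        out.push(0xc4);
        out.push(self.data.len() as u8);
        out.extend_from_slice(&self.data);
        out
    }
}
```

With two named methods there is no way to accidentally send the local bytes over the wire without it being visible at the call site.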

I noticed algorand is using messagepack also: https://github.com/algorand/go-algorand/tree/master/cmd/msgpacktool
The comment is worth reading for nuance about messagepack and json conversions.

3 Likes

I’m pretty sure this info does not apply to bincode -> msgpack conversion in SAFE, but it does apply in a general sense so don’t look for any deep meanings here, just some notes for myself.

Enums in rust are kind of a challenging mix with messagepack.

Entirely within rust the encode-decode cycle of enums is fine. It’s pretty cool actually! Taking the time to manually read the bytes against the messagepack spec is really fascinating to see how enum encoding/decoding works.

But outside rust, it’s not good news.

Looking at routing/src/consensus/network_event.rs AccumulatingEvent enum.

This enum is used as a map key in EventAccumulator.unaccumulated_events.

The enum is encoded by messagepack as an array type. Let’s put aside the exact details for a second and look at what arrays-as-keys means for, say, javascript.

map = {};

key1 = {"a": 1}
value1 = "test value 1";
map[key1] = value1;

key2 = {"b": 2}
value2 = "test value 2";
map[key2] = value2;

console.log(map[key1]);

// key 1 unexpectedly outputs "test value 2"

for (key in map) { console.log(key) }

// output is one key: "[object Object]"

// Use an int as a key, it will be converted to a string

key0 = 0 // an int
map[key0] = "test value 0"

for (key in map) { console.log(typeof(key)) }

// output is two lots of "string" even though our keys were
// 2 object types and 1 integer type.

So an enum as a key in unaccumulated_events will not decode correctly into a javascript object because plain javascript objects are forced to have strings for keys. This can be solved with a custom msgpack parser, but that sorta defeats the purpose of the change from bincode.

Same is true in golang.

// maps can have arrays (ie fixed length) as keys
type key [3]int
m := map[key]string{} // "map" is a reserved word in go so it can't be the variable name

// maps cannot have slices (ie unknown length) as keys
// this gives a compiler error
// invalid map key type badkey
type badkey []int
badmap := map[badkey]string{}

So in golang we can decode msgpack of, say, map<XorName, Signature> since XorName is a fixed length array. But we cannot read msgpack of, say, map<AccumulatingEvent, Signature> since the enum has an unbounded Vec in Genesis.related_info. And to reiterate, neither of these maps would have valid keys in javascript.

This post has described the difficulty / problem. There are solutions to it, but we’ll get into that in a later post. Most likely you are already imagining these solutions anyway cause it’s a juicy puzzle.

And we also need to cover the topic of how these complex enums like AccumulatingEvent are converted to msgpack bytes, regardless of their use as map keys or map values or as a standalone value. I think this post is long enough already, I’ll go into those details next post.

edit: The example I’ve given of AccumulatingEvent is maybe not actually used as a map key serialized on the wire. And the more I tried to find an example, the more I am feeling it’s probably not happening… I have converted what I think is everything from bincode to msgpack but am getting signature verify errors so something is not right. And even so, it should not be the enum issue in this post since this is rust-to-rust encode-decode.

edit2: just for my own future reference… when accumulating signatures for a NodeApproval event (enum index 2 of a Variant), the payload looks a bit like the json below. It’s encoded to msgpack using the ‘robust’ format (ie encode struct field names into the msgpack bytes). The json equivalent is not quite exactly convertible so is only slightly useful as a reference.

“content” is defined here and the json does not exactly match that structure of (src, dst, dst_key, variant).

{
  "content": {
    "dst": {
      "0": "5277f7a8e8f1f2fd76ae188b614eff5576be4a20f74f3726cd7ec648ad1e6b87"
    },
    "dst_key": [131, 203, 140, 202, 167, 181, 81, 11, 185, 153, 150, 172, 70, 95, 78, 116, 245, 116, 194, 163, 161, 249, 75, 35, 255, 249, 232, 204, 212, 96, 164, 127, 76, 173, 199, 158, 237, 157, 101, 129, 223, 215, 23, 188, 211, 24, 147, 224],
    "variant": {
      "2": {
        "elders_info": {
          "value": {
            "elders": {
              "a3883eb7f551d2dbeca74b8572000b782a610378df4d3051f0b5d360d33a6b09": {
                "public_id": [
                  "o4g+t/VR0tvsp0uFcgALeCphA3jfTTBR8LXTYNM6awk=",
                  [148, 11, 171, 175, 59, 134, 177, 91, 121, 15, 243, 206, 204, 1, 55, 59, 209, 76, 105, 116, 60, 213, 236, 112, 204, 73, 125, 253, 24, 138, 246, 129, 220, 147, 54, 90, 5, 25, 37, 223, 210, 162, 149, 84, 224, 51, 224, 149]
                ],
                "peer_addr": "127.0.0.1:12000"
              }
            },
            "prefix": {
              "bit_count": 0,
              "name": "0000000000000000000000000000000000000000000000000000000000000000"
            }
          },
          "proof": {
            "public_key": [131, 203, 140, 202, 167, 181, 81, 11, 185, 153, 150, 172, 70, 95, 78, 116, 245, 116, 194, 163, 161, 249, 75, 35, 255, 249, 232, 204, 212, 96, 164, 127, 76, 173, 199, 158, 237, 157, 101, 129, 223, 215, 23, 188, 211, 24, 147, 224],
            "signature": [132, 247, 108, 5, 229, 73, 48, 50, 27, 136, 68, 60, 178, 159, 214, 98, 251, 45, 217, 215, 106, 212, 39, 113, 82, 240, 149, 237, 5, 52, 14, 104, 174, 72, 150, 29, 55, 46, 211, 247, 213, 181, 101, 63, 253, 88, 242, 184, 1, 136, 136, 53, 146, 155, 60, 63, 254, 132, 56, 79, 108, 51, 26, 14, 173, 216, 74, 80, 29, 78, 95, 196, 109, 249, 192, 222, 150, 231, 134, 100, 76, 91, 193, 210, 32, 215, 230, 125, 172, 47, 248, 31, 53, 11, 114, 217]
          }
        },
        "parsec_version": 0
      }
    }
  }
}

raw msgpack bytes for above:

[129, 167, 99, 111, 110, 116, 101, 110, 116, 131, 163, 100, 115, 116, 129, 0, 217, 64, 53, 50, 55, 55, 102, 55, 97, 56, 101, 56, 102, 49, 102, 50, 102, 100, 55, 54, 97, 101, 49, 56, 56, 98, 54, 49, 52, 101, 102, 102, 53, 53, 55, 54, 98, 101, 52, 97, 50, 48, 102, 55, 52, 102, 51, 55,50, 54, 99, 100, 55, 101, 99, 54, 52, 56, 97, 100, 49, 101, 54, 98, 56, 55, 167, 100, 115, 116, 95, 107, 101, 121, 220, 0, 48, 204, 131, 204, 203, 204, 140, 204, 202, 204, 167, 204, 181, 81, 11, 204, 185, 204, 153, 204, 150, 204, 172, 70, 95, 78, 116, 204, 245, 116, 204, 194, 204, 163, 204, 161, 204, 249, 75, 35, 204, 255, 204, 249, 204, 232, 204, 204, 204, 212, 96, 204, 164, 127, 76, 204, 173, 204, 199, 204, 158, 204, 237, 204, 157, 101, 204, 129, 204, 223, 204, 215, 23, 204, 188, 204, 211, 24, 204, 147, 204, 224, 167, 118, 97, 114, 105, 97, 110, 116, 129, 2, 130, 171, 101, 108, 100, 101, 114, 115, 95, 105, 110, 102, 111, 130, 165, 118, 97, 108, 117, 101, 130, 166, 101, 108, 100, 101, 114, 115, 129, 217, 64, 97, 51, 56, 56, 51, 101, 98, 55, 102, 53, 53, 49, 100, 50, 100, 98, 101, 99, 97, 55, 52, 98,56, 53, 55, 50, 48, 48, 48, 98, 55, 56, 50, 97, 54, 49, 48, 51, 55, 56, 100, 102, 52, 100, 51, 48, 53, 49, 102, 48, 98, 53, 100, 51, 54, 48, 100, 51, 51, 97, 54, 98, 48, 57, 130, 169, 112, 117, 98, 108,105, 99, 95, 105, 100, 146, 196, 32, 163, 136, 62, 183, 245, 81, 210, 219, 236, 167, 75, 133, 114, 0, 11, 120, 42, 97, 3, 120, 223, 77, 48, 81, 240, 181, 211, 96, 211, 58, 107, 9, 220, 0, 48, 204, 148, 11, 204, 171, 204, 175, 59, 204, 134, 204, 177, 91, 121, 15, 204, 243, 204, 206, 204, 204, 1, 55, 59, 204, 209, 76, 105, 116, 60, 204, 213, 204, 236, 112, 204, 204, 73, 125, 204, 253, 24, 204, 138, 204, 246, 204, 129, 204, 220, 204, 147, 54, 90, 5, 25, 37, 204, 223, 204, 210, 204, 162, 204, 149, 84, 204, 224, 51, 204, 224, 204, 149, 169, 112, 101, 101, 114, 95, 97, 100, 100, 114, 175, 49, 50, 55, 46, 48, 46, 48, 46, 49, 58, 49, 50, 48, 48, 48, 166, 112, 114, 101, 102, 105, 120, 130, 169, 98, 105, 
116, 95, 99, 111, 117, 110, 116, 0, 164, 110, 97, 109, 101, 217, 64, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 165, 112, 114, 111, 111, 102, 130, 170, 112, 117, 98, 108, 105, 99, 95, 107, 101, 121, 220, 0, 48, 204, 131, 204, 203, 204, 140, 204, 202, 204, 167, 204, 181, 81, 11, 204, 185, 204, 153, 204, 150, 204, 172, 70, 95, 78, 116, 204, 245, 116, 204, 194, 204, 163, 204, 161, 204, 249, 75, 35, 204, 255, 204, 249, 204, 232, 204, 204, 204, 212, 96, 204, 164, 127, 76, 204, 173, 204, 199, 204, 158, 204, 237, 204, 157, 101, 204, 129, 204, 223, 204, 215, 23, 204, 188, 204, 211, 24, 204, 147, 204, 224, 169, 115, 105, 103, 110, 97, 116, 117, 114, 101, 220, 0, 96, 204, 132, 204, 247, 108, 5, 204, 229, 73, 48, 50, 27, 204, 136, 68, 60, 204, 178, 204, 159, 204, 214, 98, 204, 251, 45, 204, 217, 204, 215, 106, 204, 212, 39, 113, 82, 204, 240, 204, 149, 204, 237, 5, 52, 14, 104, 204, 174, 72, 204, 150, 29, 55, 46, 204, 211, 204, 247, 204, 213, 204, 181, 101, 63, 204, 253, 88, 204, 242, 204, 184, 1, 204, 136, 204, 136, 53, 204, 146, 204, 155, 60, 63, 204, 254, 204, 132, 56, 79, 108, 51, 26, 14, 204, 173, 204, 216, 74, 80, 29, 78, 95, 204, 196, 109, 204, 249, 204, 192, 204, 222, 204, 150, 204, 231, 204, 134, 100, 76, 91, 204, 193, 204, 210, 32, 204, 215, 204, 230, 125, 204, 172, 47, 204, 248, 31, 53, 11, 114, 204, 217, 174, 112, 97, 114, 115, 101, 99, 95, 118, 101, 114, 115, 105, 111, 110, 0]

2 Likes

This may be sludgy brain soup but it’s useful sludgy brain soup for me when I come back to this tomorrow to try to find a fix. What is described below is a very subtle problem with changing from bincode to msgpack.


There’s an issue in routing with verifying signatures. This only happens when using msgpack to serialize data and doesn’t exist when using bincode. It’s pretty complex to explain so hopefully I get it across. The effect of it is that no nodes can be approved for joining because the signature on their joining data seems to be invalid when really it is just serialized in two different and subtly incompatible data types, one data type for signing and another data type for verifying (both ways happen to serialize to the same bytes in bincode but not in msgpack).

In summary the original signed message is a SignableView of the form
{"dst":<Prefix>, "dst_key":<XorName>, "variant":<Variant enum>}
and the message being verified is a Payload of the form
{"content":<data above>}
so the verification fails.

In bincode the field “content” is assumed by the order of the fields so it does not affect the encoding between SignableView and Payload.
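In msgpack (compact form) the “content” key in Payload is not explicitly encoded, but the mere existence of a wrapping field is still encoded, unlike in SignableView where that field does not exist. With with_struct_map the key name itself is encoded too, which is the extra “content” prefix visible in the example bytes below.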
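The difference can be reproduced by hand without touching routing at all. This sketch wraps some already-serialized inner bytes the way a one-field struct would be wrapped. The named form follows the msgpack spec (fixmap of 1 entry, then a fixstr key). The compact form as a one-element array is my assumption about rmp-serde's default struct handling, and the bincode-like form adding nothing is my understanding of bincode. Function names are hypothetical.

```rust
/// Struct-map style wrapper: {"<field>": inner}.
/// 0x81 = fixmap with 1 entry, 0xa0 | len = fixstr header.
pub fn wrap_named(field: &str, inner: &[u8]) -> Vec<u8> {
    assert!(field.len() < 32, "fixstr holds at most 31 bytes");
    let mut out = vec![0x81, 0xa0 | field.len() as u8];
    out.extend_from_slice(field.as_bytes());
    out.extend_from_slice(inner);
    out
}

/// Compact style wrapper (assumed): a 1 element array, 0x91, then inner.
pub fn wrap_compact(inner: &[u8]) -> Vec<u8> {
    let mut out = vec![0x91];
    out.extend_from_slice(inner);
    out
}

/// bincode-like wrapper: a one-field struct adds no bytes of its own.
pub fn wrap_bincode_like(inner: &[u8]) -> Vec<u8> {
    inner.to_vec()
}
```

`wrap_named("content", inner)` starts with 129, 167, 99, 111, 110, 116, 101, 110, 116, which is 9 bytes of prefix before the inner data. Only the bincode-like form leaves the signed bytes and the verified bytes identical.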

So you say just change it from verify(msg) to verify(msg.content) but there is a lot of complexity in how these data structures are formed and sent between functions. So it’s not that simple! I’ll try to explain below (I do not anticipate you will understand it unless you actually wrote or work closely with this code but that is sorta my point to show the complexity).

  1. Starting with creating the serialized bytes to sign. In routing/node/stage/approved.rs L2001 the node approval event is turned into an AccumulatingMessage.

  2. The AccumulatingMessage is created with the content populated on L2041 which is a type PlainMessage.

  3. There is a call to content.prove() in L2048.

  4. The prove function creates a new ProofShare in src/messages/accumulating_message.rs L60 using the serialized bytes of itself (ie serializes a PlainMessage) as a SignableView on L64.

  5. These serialized bytes of the PlainMessage as a SignableView are what are signed. And then the newly created ProofShare using those bytes is added to the AccumulatingMessage.

  6. Now let’s look at how the verification happens. The AccumulatingMessage is then added to the message_accumulator on L2016

  7. The add function uses accumulating_msg.content (ie the PlainMessage from above) on L25 to populate the content field of a Payload.

  8. The Payload is added to the SignatureAccumulator (which the MessageAccumulator inherits from) in L29.

  9. The signature accumulator serializes the Payload in src/consensus/signature_accumulator.rs L89, and those serialized bytes are used for the verification.

  10. These serialized bytes are expected to match the serialized bytes of the original PlainMessage-as-a-SignableView in Step 5 above. They do when using bincode, but do not when using msgpack.

Compare the different data types used through this process: signable view and plain message and payload. All these serve important semantic purposes but their differences, very slight though they may be, add up to a very complex-[for-me]-to-resolve signature verification problem when using msgpack.

Having outlined this, what to do?

The serialization of Payload is probably the main candidate for reform. This is a trivial change that depends on non-trivial rust knowledge remaining beyond me for now, although I will keep tinkering for my own interest, and it really is interesting I reckon so will keep on it.

Another option seems to be don’t use Payload but use Payload.content for verification. This throws up complexity with changing the type in signature_accumulator.add that seems to flow through to many other places.

A third option is to retain the signed bytes and don’t change their data type so much. I like this because it’s a dumb and explicit code-that-explains-itself type option. I don’t like it because it’s probably an inefficient use of memory (need to measure, right?!). I found the discovery of this issue to be very complex because there are so many data type conversions happening in this part of the code. Maybe when parsec is refactored out it may be less complex, although bls-dkg is still going to be there which is what this part of the code is all about. Maybe I’m just not rust savvy enough yet. As you can tell from the detailed description above, tracing this data flow was quite difficult. When reading top to bottom it seems ok, but when you only have step 10 and have to work your way back to step 1 it’s not so simple.
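A toy sketch of that third option, to make the idea concrete. The `toy_sign` function is NOT a signature scheme, it is a simple rolling checksum standing in for the real BLS signing so the example runs with no dependencies. `SignedBytes` and `verify_reserialized` are hypothetical names and nothing here mirrors the actual routing types.

```rust
/// Stand-in for signing. NOT cryptographic, just a rolling checksum.
pub fn toy_sign(bytes: &[u8]) -> u64 {
    bytes
        .iter()
        .fold(0u64, |acc, &b| acc.wrapping_mul(257).wrapping_add(b as u64 + 1))
}

/// Keeps the exact bytes that were signed next to the signature.
pub struct SignedBytes {
    pub bytes: Vec<u8>,
    pub sig: u64,
}

impl SignedBytes {
    pub fn sign(bytes: Vec<u8>) -> Self {
        let sig = toy_sign(&bytes);
        SignedBytes { bytes, sig }
    }

    /// Verifies against the retained bytes, so no re-serialization happens.
    pub fn verify(&self) -> bool {
        toy_sign(&self.bytes) == self.sig
    }
}

/// The fragile path: verify against bytes produced by serializing again,
/// possibly through a different wrapper type.
pub fn verify_reserialized(sig: u64, reserialized: &[u8]) -> bool {
    toy_sign(reserialized) == sig
}
```

Verifying the retained bytes always checks the same bytes that were signed, while re-serializing through a wrapper (even a one byte array header) changes the input and the check fails. The cost is holding on to the extra bytes, which is the memory concern mentioned above.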


Example msgpack bytes when signing (extracted from here):

[131, 163, 100, 115, 116, 129, 0, 217, 64, 102, 49, 50, 49, 56, 52, 53, 51, 54, 98, 55, 98, 55, 49, 48, 52, 56, 102, 57, 98, 52,99, 57, 53, 48, 56, 97, 52, 51, 57, 57, 98, 98, 56, 50, 52, 99, 98, 48, 53, 97, 52, 54, 57, 57, 55, 98, 52, 52, 101, 54, 57,97, 49, 50, 100, 99, 102, 56, 50, 57, 101, 55, 52, 167, 100, 115, 116, 95, 107, 101, 121, 220, 0, 48, 204, 178, 72, 36, 71, 204, 222, 204, 148, 204, 158, 204, 165, 204, 177, 204, 161, 204, 148, 204, 147, 204, 224, 84, 38, 17, 204, 216, 20, 6, 204, 219, 17, 204, 236, 89, 204, 163, 54, 24, 204, 196, 59, 95, 101, 34, 204, 140, 123, 127, 46, 95, 204, 211, 204, 250, 204, 129, 204, 254, 76, 28, 204, 177, 121, 107, 204, 218, 204, 138, 204, 213, 167, 118, 97, 114, 105, 97, 110, 116, 129, 2, 130, 171, 101, 108, 100, 101, 114, 115, 95, 105, 110, 102, 111, 130, 165, 118, 97, 108, 117, 101, 130, 166, 101, 108, 100, 101, 114, 115, 129, 217, 64, 98, 98, 55, 54, 51, 100, 101, 101, 56, 49, 57, 101, 56, 101, 48, 56, 52, 97, 97, 57, 98, 99, 55, 57, 51, 57, 102, 52, 55, 49, 97, 55, 52, 52, 56, 100, 54, 48, 97, 54, 52, 57, 98, 48, 51, 99, 53, 101, 56, 102, 102, 97, 51, 52, 99, 99, 97, 51, 52, 97, 55, 50, 56, 51, 130, 169, 112, 117, 98, 108, 105, 99, 95, 105, 100, 146, 196, 32, 187, 118, 61, 238, 129, 158, 142, 8, 74, 169, 188, 121, 57, 244, 113, 167, 68, 141, 96, 166, 73, 176, 60, 94, 143, 250, 52, 204, 163, 74, 114, 131, 220, 0, 48, 204, 182, 204, 176, 204, 252, 204, 149, 204, 191, 204, 235, 46, 4, 10, 204, 161, 204, 164, 24, 29, 204, 228, 71, 58, 111, 25, 204, 181, 204, 207, 32, 15, 204, 248, 122, 89, 56, 34, 111, 204, 140, 88, 204, 190, 204, 243, 204, 239, 204, 255, 204, 176, 96, 87, 30, 71, 95, 204, 216, 115, 7, 204, 209, 108, 204, 188, 47, 86, 169, 112, 101, 101, 114, 95, 97, 100, 100, 114, 175, 49, 50, 55, 46, 48, 46, 48, 46, 49, 58, 49, 50, 48, 48, 48, 166, 112, 114, 101, 102, 105, 120, 130, 169, 98, 105, 116, 95, 99, 111, 117, 110, 116, 0, 164, 110, 97, 109, 101, 217, 64, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 
48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 165, 112, 114, 111, 111, 102, 130, 170, 112, 117, 98, 108, 105, 99, 95, 107, 101, 121, 220, 0, 48, 204, 178, 72, 36, 71, 204, 222, 204, 148, 204, 158, 204, 165, 204, 177, 204, 161, 204, 148, 204, 147, 204, 224, 84, 38, 17, 204, 216, 20, 6, 204, 219, 17, 204, 236, 89, 204, 163, 54, 24, 204, 196, 59, 95, 101, 34, 204, 140, 123, 127, 46, 95, 204, 211, 204, 250, 204, 129, 204, 254, 76, 28, 204, 177, 121, 107, 204, 218, 204, 138, 204, 213, 169, 115, 105, 103, 110, 97, 116, 117, 114, 101, 220, 0, 96, 204, 152, 100, 204, 203, 204, 201, 204, 133, 204, 248, 92, 204, 156, 12, 204, 197, 24, 204, 206, 33, 14, 4, 18, 122, 98, 204, 155, 90, 204, 219, 204, 185, 204, 243, 204, 190, 204, 147, 204, 157, 114, 106, 95, 204, 146, 204, 202, 204, 169, 204, 139, 104, 204, 203, 38, 107, 204, 255, 50, 204, 239, 97, 93, 51, 92, 37, 108, 204, 208, 72, 5, 204, 168, 61, 115, 204, 183, 204, 217, 204, 169, 204, 142, 204, 131, 204, 255, 204, 241, 204, 232, 113, 204, 173, 77, 204, 237, 204, 220, 26, 66, 14, 123, 204, 193, 126, 204, 188, 204, 150, 204, 184, 107, 204, 222, 204, 129, 7, 110, 17, 9, 41, 204, 245, 204, 243, 120, 204, 174, 204, 183, 204, 191, 39, 23, 7, 39, 7, 204, 144, 204, 209, 83, 174, 112, 97, 114, 115, 101, 99, 95, 118, 101, 114, 115, 105, 111, 110, 0]

Example msgpack bytes when verifying (notice that once the first 9 bytes are removed the layout is the same as above, although the values in this sample differ; extracted from here):

[129, 167, 99, 111, 110, 116, 101, 110, 116, 131, 163, 100, 115, 116, 129, 0, 217, 64, 50, 51, 50, 49, 98, 57, 102, 101, 52, 51, 53, 54, 53, 51, 98, 98, 100, 98, 50, 57, 55, 50, 98, 51, 97, 97, 54, 100, 49, 101, 98, 53, 49, 55, 50, 57, 57, 49, 49, 53, 100, 48, 97, 99, 98, 100, 56, 57, 52, 101, 56, 53, 54, 98, 48, 102, 51, 56, 101, 55, 99, 57, 98, 51, 167, 100, 115, 116, 95, 107, 101, 121, 220, 0, 48, 204, 183, 51, 124, 109, 204, 232, 204, 184, 204, 169, 204, 188, 53, 204, 143, 88, 56, 204, 168, 0, 68, 115, 204, 194, 103, 204, 242, 204, 222, 204, 199, 50, 115, 31, 204, 211, 204, 169, 204, 144, 39, 204, 254, 204, 135, 74, 35, 204, 221, 67, 204, 131, 204, 168, 8, 14, 204, 254, 204, 200, 3, 75, 111, 38, 204, 175, 116, 40, 204, 151, 167, 118, 97, 114, 105, 97, 110, 116, 129, 2, 130, 171, 101, 108, 100, 101, 114, 115, 95, 105, 110, 102, 111, 130, 165, 118, 97, 108, 117, 101, 130, 166, 101, 108, 100, 101, 114, 115, 129, 217, 64, 49, 97, 53, 55, 57, 49, 48, 57, 50, 98, 51, 99, 100, 51, 53, 50, 98, 51, 101, 51, 54, 101, 56, 100, 49, 48, 48, 49, 54, 50, 102, 53, 52, 98, 51, 97, 54, 54, 50, 53, 52, 98, 100, 55, 52, 97, 100, 57, 57, 97, 57, 50, 53, 102, 101, 98, 97, 97, 99, 50, 51, 99, 55, 100, 130, 169, 112, 117, 98, 108, 105, 99, 95, 105, 100, 146, 196, 32, 26, 87, 145, 9, 43, 60, 211, 82, 179, 227, 110, 141, 16, 1, 98, 245, 75, 58, 102, 37, 75, 215, 74, 217, 154, 146, 95, 235, 170, 194, 60, 125, 220, 0, 48, 204, 180, 204, 129, 204, 156, 30, 26, 29, 76, 64, 204, 158, 204, 236, 204, 216, 36, 204, 152, 88, 204, 208, 204, 186, 93, 67, 204, 134, 204, 247, 38, 204, 169, 204, 222, 204, 183, 204, 187, 126, 29, 204, 153, 204, 242, 204, 153, 118, 204, 246, 204, 217, 80, 43, 204, 188, 82, 204, 249, 51, 107, 95, 30, 204, 231, 33, 204, 230, 204, 197, 204, 234, 108, 169, 112, 101, 101, 114, 95, 97, 100, 100, 114, 175, 49, 50, 55, 46, 48, 46, 48, 46, 49, 58, 49, 50, 48, 48, 48, 166, 112, 114, 101, 102, 105, 120, 130, 169, 98, 105, 116, 95, 99, 111, 117, 110, 116, 0, 164, 110, 97, 109, 
101, 217, 64, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 165, 112, 114, 111, 111, 102, 130, 170, 112, 117, 98, 108, 105, 99, 95, 107, 101, 121, 220, 0, 48, 204, 183, 51, 124, 109, 204, 232, 204, 184, 204, 169, 204, 188, 53, 204, 143, 88, 56, 204, 168, 0, 68, 115, 204, 194, 103, 204, 242, 204, 222, 204, 199, 50, 115, 31, 204, 211, 204, 169, 204, 144, 39, 204, 254, 204, 135, 74, 35, 204, 221, 67, 204, 131, 204, 168, 8, 14, 204, 254, 204, 200, 3, 75, 111, 38, 204, 175, 116, 40, 204, 151, 169, 115, 105, 103, 110, 97, 116, 117, 114, 101, 220, 0, 96, 204, 180, 204, 188, 32, 24, 84, 30, 63, 12, 90, 204, 225, 204, 228, 40, 73, 204, 192, 125, 204, 202, 204, 200, 91, 13, 204, 207, 2, 204, 163, 204, 250, 105, 204, 173, 7, 105, 204, 158, 204, 219, 204, 131, 204, 147, 58, 33, 204, 242, 89, 204, 160, 26, 1, 39, 204, 131, 34, 204, 171, 61, 58, 204, 137, 96, 204, 199, 204, 171, 1, 97, 204, 143, 31, 5, 46, 204, 151, 204, 166, 15, 31, 82, 5, 105, 204, 164, 39, 63, 11, 204, 252, 22, 66, 204, 185, 204, 139, 204, 186, 204, 252, 120, 204, 220, 204, 164, 204, 216, 2, 204, 134, 89, 204, 255, 114, 29, 58, 204, 178, 36, 204, 253, 204, 195, 204, 237, 204, 143, 95, 97, 204, 244, 33, 14, 204, 200, 38, 174, 112, 97, 114, 115, 101, 99, 95, 118, 101, 114, 115, 105, 111, 110, 0]

To put a bit more detail for the problem above, here are the bytes for signing vs the bytes for verifying.

Signing

Serializing the PlainMessage-as-a-SignableView for signing happens in routing/src/messages/accumulating_message.rs L64.

Pulling it out into a variable and serializing it looks like this (best_bytes::for_ipc is a wrapper around rmp_serde serialize with_struct_map):

let signable = self.as_signable();
log::info!("SIGN bincode {:?}", bincode::serialize(&signable));
log::info!("SIGN rmp_compact {:?}", rmp_serde::encode::to_vec(&signable));
log::info!("SIGN rmp_robust {:?}", best_bytes::for_ipc(&signable));

and gives us these bytes

bincode for signing

[0, 0, 0, 0, 64, 0, 0, 0, 0, 0, 0, 0, 102, 57, 50, 51, 51, 97, 51, 53, 56, 49, 49, 98, 57, 49, 49, 100, 54, 51, 52, 48, 50, 100, 102, 56, 57, 56, 57, 102, 56, 55, 98, 50, 100, 55, 100, 54, 99, 57, 51, 49, 98, 99, 98, 101, 52, 98, 98, 101, 55, 55, 52, 97, 48, 101, 57, 52, 57, 100, 102, 98, 55, 102, 99, 49, 1, 173, 8, 191, 87, 90, 92, 119, 102, 219, 62, 40, 236, 37, 149, 202, 134, 241, 136, 253, 31, 15, 210, 138, 36, 0, 21, 22, 35, 13, 216, 84, 84, 5, 148, 126, 207, 219, 51, 19, 40, 190, 14, 109, 161, 53, 63, 204, 176, 2, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 64, 0, 0, 0, 0, 0, 0, 0, 99, 50, 98, 97, 48, 98, 49, 53, 49, 56, 56, 98, 49, 49, 53, 49, 99, 48, 48, 57, 48, 98, 97, 48, 51, 49, 49, 50, 55, 51, 54, 57, 51, 52, 99, 48, 48, 101, 57, 55, 50, 48, 99, 97, 54, 101, 48, 50, 51, 54, 49, 52, 50, 53, 100, 57, 99, 102, 50, 57, 49, 50, 49, 97, 32, 0, 0, 0, 0, 0, 0, 0, 194, 186, 11, 21, 24, 139, 17, 81, 192, 9, 11, 160, 49, 18, 115, 105, 52, 192, 14, 151, 32, 202, 110, 2, 54, 20, 37, 217, 207, 41, 18, 26, 148, 67, 214, 182, 247, 49, 34, 165, 125, 243, 102, 234, 50, 76, 69, 67, 219, 178, 46, 105, 220, 81, 210, 193, 50, 238, 7, 245, 29, 94, 118, 232, 36, 161, 124, 221, 77, 78, 244, 223, 184, 223, 194, 134, 54, 15, 165, 250, 0, 0, 0, 0, 127, 0, 0, 1, 224, 46, 0, 0, 64, 0, 0, 0, 0, 0, 0, 0, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 173, 8, 191, 87, 90, 92, 119, 102, 219, 62, 40, 236, 37, 149, 202, 134, 241, 136, 253, 31, 15, 210, 138, 36, 0, 21, 22, 35, 13, 216, 84, 84, 5, 148, 126, 207, 219, 51, 19, 40, 190, 14, 109, 161, 53, 63, 204, 176, 137, 167, 141, 153, 113, 109, 69, 25, 53, 17, 118, 213, 25, 199, 83, 229, 66, 134, 186, 50, 67, 4, 154, 68, 109, 95, 254, 52, 207, 194, 103, 71, 93, 99, 105, 96, 37, 59, 171, 92, 37, 33, 46, 232, 135, 165, 98, 211, 12, 159, 103, 76, 
57, 158, 182, 9, 145, 189, 115, 49, 141, 247, 218, 135, 40, 25, 219, 107, 28, 30, 106, 46, 6, 102, 228, 20, 193, 108, 245, 234, 95, 126, 130, 173, 117, 130, 53, 144, 177, 201, 122, 182, 41, 205, 44, 239, 0, 0, 0, 0, 0, 0, 0, 0]

msgpack compact for signing

[147, 129, 0, 217, 64, 102, 57, 50, 51, 51, 97, 51, 53, 56, 49, 49, 98, 57, 49, 49, 100, 54, 51, 52, 48, 50, 100, 102, 56, 57, 56, 57, 102, 56, 55, 98, 50, 100, 55, 100, 54, 99, 57, 51, 49, 98, 99, 98, 101, 52, 98, 98, 101, 55, 55, 52, 97, 48, 101, 57, 52, 57, 100, 102, 98, 55, 102, 99, 49, 220, 0, 48, 204, 173, 8, 204, 191, 87, 90, 92, 119, 102, 204, 219, 62, 40, 204, 236, 37, 204, 149, 204, 202, 204, 134, 204, 241, 204, 136, 204, 253, 31, 15, 204, 210, 204, 138, 36, 0, 21, 22, 35, 13, 204, 216, 84, 84, 5, 204, 148, 126, 204, 207, 204, 219, 51, 19, 40, 204, 190, 14, 109, 204, 161, 53, 63, 204, 204, 204, 176, 129, 2, 146, 146, 146, 129, 217, 64, 99, 50, 98, 97, 48, 98, 49, 53, 49, 56, 56, 98, 49, 49, 53, 49, 99, 48, 48, 57, 48, 98, 97, 48, 51, 49, 49, 50, 55, 51, 54, 57, 51, 52, 99, 48, 48, 101, 57, 55, 50, 48, 99, 97, 54, 101, 48, 50, 51, 54, 49, 52, 50, 53, 100, 57, 99, 102, 50, 57, 49, 50, 49, 97, 146, 146, 196, 32, 194, 186, 11, 21, 24, 139, 17, 81, 192, 9, 11, 160, 49, 18, 115, 105, 52, 192, 14, 151, 32, 202, 110, 2, 54, 20, 37, 217, 207, 41, 18, 26, 220, 0, 48, 204, 148, 67, 204, 214, 204, 182, 204, 247, 49, 34, 204, 165, 125, 204, 243, 102, 204, 234, 50, 76, 69, 67, 204, 219, 204, 178, 46, 105, 204, 220, 81, 204, 210, 204, 193, 50, 204, 238, 7, 204, 245, 29, 94, 118, 204, 232, 36, 204, 161, 124, 204, 221, 77, 78, 204, 244, 204, 223, 204, 184, 204, 223, 204, 194, 204, 134, 54, 15, 204, 165, 204, 250, 175, 49, 50, 55, 46, 48, 46, 48, 46, 49, 58, 49, 50, 48, 48, 48, 146, 0, 217, 64, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 146, 220, 0, 48, 204, 173, 8, 204, 191, 87, 90, 92, 119, 102, 204, 219, 62, 40, 204, 236, 37, 204, 149, 204, 202, 204, 134, 204, 241, 204, 136, 204, 253, 31, 15, 204, 210, 204, 138, 36, 0, 21, 22, 35, 13, 204, 216, 84, 84, 5, 204, 
148, 126, 204, 207, 204, 219, 51, 19, 40, 204, 190, 14, 109, 204, 161, 53, 63, 204, 204, 204, 176, 220, 0, 96, 204, 137, 204, 167, 204, 141, 204, 153, 113, 109, 69, 25, 53, 17, 118, 204, 213, 25, 204, 199, 83, 204, 229, 66, 204, 134, 204, 186, 50, 67, 4, 204, 154, 68, 109, 95, 204, 254, 52, 204, 207, 204, 194, 103, 71, 93, 99, 105, 96, 37, 59, 204, 171, 92, 37, 33, 46, 204, 232, 204, 135, 204, 165, 98, 204, 211, 12, 204, 159, 103, 76, 57, 204, 158, 204, 182, 9, 204, 145, 204, 189, 115, 49, 204, 141, 204, 247, 204, 218, 204, 135, 40, 25, 204, 219, 107, 28, 30, 106, 46, 6, 102, 204, 228, 20, 204, 193, 108, 204, 245, 204, 234, 95, 126, 204, 130, 204, 173, 117, 204, 130, 53, 204, 144, 204, 177, 204, 201, 122, 204, 182, 41, 204, 205, 44, 204, 239, 0]

msgpack robust for signing

[131, 163, 100, 115, 116, 129, 0, 217, 64, 102, 57, 50, 51, 51, 97, 51, 53, 56, 49, 49, 98, 57, 49, 49, 100, 54, 51, 52, 48, 50, 100, 102, 56, 57, 56, 57, 102, 56, 55, 98, 50, 100, 55, 100, 54, 99, 57, 51, 49, 98, 99, 98, 101, 52, 98, 98, 101, 55, 55, 52, 97, 48, 101, 57, 52, 57, 100, 102, 98, 55, 102, 99, 49, 167, 100, 115, 116, 95, 107, 101, 121, 220, 0, 48, 204, 173, 8, 204, 191, 87, 90, 92, 119, 102, 204, 219, 62, 40, 204, 236, 37, 204, 149, 204, 202, 204, 134, 204, 241, 204, 136, 204, 253, 31, 15, 204, 210, 204, 138, 36, 0, 21, 22, 35, 13, 204, 216, 84, 84, 5, 204, 148, 126, 204, 207, 204, 219, 51, 19, 40, 204, 190, 14, 109, 204, 161, 53, 63, 204, 204, 204, 176, 167, 118, 97, 114, 105, 97, 110, 116, 129, 2, 130, 171, 101, 108, 100, 101, 114, 115, 95, 105, 110, 102, 111, 130, 165, 118, 97, 108, 117, 101, 130, 166, 101, 108, 100, 101, 114, 115, 129, 217, 64, 99, 50, 98, 97, 48, 98, 49, 53, 49, 56, 56, 98, 49, 49, 53, 49, 99, 48, 48, 57, 48, 98, 97, 48, 51, 49, 49, 50, 55, 51, 54, 57, 51, 52, 99, 48, 48, 101, 57, 55, 50, 48, 99, 97, 54, 101, 48, 50, 51, 54, 49, 52, 50, 53, 100, 57, 99, 102, 50, 57, 49, 50, 49, 97, 130, 169, 112, 117, 98, 108, 105, 99, 95, 105, 100, 146, 196, 32, 194, 186, 11, 21, 24, 139, 17, 81, 192, 9, 11, 160, 49, 18, 115, 105, 52, 192, 14, 151, 32, 202, 110, 2, 54, 20, 37, 217, 207, 41, 18, 26, 220, 0, 48, 204, 148, 67, 204, 214, 204, 182, 204, 247, 49, 34, 204, 165, 125, 204, 243, 102, 204, 234, 50, 76, 69, 67, 204, 219, 204, 178, 46, 105, 204, 220, 81, 204, 210, 204, 193, 50, 204, 238, 7, 204, 245, 29, 94, 118, 204, 232, 36, 204, 161, 124, 204, 221, 77, 78, 204, 244, 204, 223, 204, 184, 204, 223, 204, 194, 204, 134, 54, 15, 204, 165, 204, 250, 169, 112, 101, 101, 114, 95, 97, 100, 100, 114, 175, 49, 50, 55, 46, 48, 46, 48, 46, 49, 58, 49, 50, 48, 48, 48, 166, 112, 114, 101, 102, 105, 120, 130, 169, 98, 105, 116, 95, 99, 111, 117, 110, 116, 0, 164, 110, 97, 109, 101, 217, 64, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 
48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 165, 112, 114, 111, 111, 102, 130, 170, 112, 117, 98, 108, 105, 99, 95, 107, 101, 121, 220, 0, 48, 204, 173, 8, 204, 191, 87, 90, 92, 119, 102, 204, 219, 62, 40, 204, 236, 37, 204, 149, 204, 202, 204, 134, 204, 241, 204, 136, 204, 253, 31, 15, 204, 210, 204, 138, 36, 0, 21, 22, 35, 13, 204, 216, 84, 84, 5, 204, 148, 126, 204, 207, 204, 219, 51, 19, 40, 204, 190, 14, 109, 204, 161, 53, 63, 204, 204, 204, 176, 169, 115, 105, 103, 110, 97, 116, 117, 114, 101, 220, 0, 96, 204, 137, 204, 167, 204, 141, 204, 153, 113, 109, 69, 25, 53, 17, 118, 204, 213, 25, 204, 199, 83, 204, 229, 66, 204, 134, 204, 186, 50, 67, 4, 204, 154, 68, 109, 95, 204, 254, 52, 204, 207, 204, 194, 103, 71, 93, 99, 105, 96, 37, 59, 204, 171, 92, 37, 33, 46, 204, 232, 204, 135, 204, 165, 98, 204, 211, 12, 204, 159, 103, 76, 57, 204, 158, 204, 182, 9, 204, 145, 204, 189, 115, 49, 204, 141, 204, 247, 204, 218, 204, 135, 40, 25, 204, 219, 107, 28, 30, 106, 46, 6, 102, 204, 228, 20, 204, 193, 108, 204, 245, 204, 234, 95, 126, 204, 130, 204, 173, 117, 204, 130, 53, 204, 144, 204, 177, 204, 201, 122, 204, 182, 41, 204, 205, 44, 204, 239, 174, 112, 97, 114, 115, 101, 99, 95, 118, 101, 114, 115, 105, 111, 110, 0]

Verifying

Serializing the Payload for verifying happens in routing/src/consensus/signature_accumulator.rs L89.

The extra code to log the various serializations is:

log::info!("VERIFY bincode {:?}", bincode::serialize(&payload));
log::info!("VERIFY rmp_compact {:?}", rmp_serde::encode::to_vec(&payload));
log::info!("VERIFY rmp_robust {:?}", best_bytes::for_ipc(&payload));

and gives us these bytes

bincode for verifying

[0, 0, 0, 0, 64, 0, 0, 0, 0, 0, 0, 0, 102, 57, 50, 51, 51, 97, 51, 53, 56, 49, 49, 98, 57, 49, 49, 100, 54, 51, 52, 48, 50, 100, 102, 56, 57, 56, 57, 102, 56, 55, 98, 50, 100, 55, 100, 54, 99, 57, 51, 49, 98, 99, 98, 101, 52, 98, 98, 101, 55, 55, 52, 97, 48, 101, 57, 52, 57, 100, 102, 98, 55, 102, 99, 49, 1, 173, 8, 191, 87, 90, 92, 119, 102, 219, 62, 40, 236, 37, 149, 202, 134, 241, 136, 253, 31, 15, 210, 138, 36, 0, 21, 22, 35, 13, 216, 84, 84, 5, 148, 126, 207, 219, 51, 19, 40, 190, 14, 109, 161, 53, 63, 204, 176, 2, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 64, 0, 0, 0, 0, 0, 0, 0, 99, 50, 98, 97, 48, 98, 49, 53, 49, 56, 56, 98, 49, 49, 53, 49, 99, 48, 48, 57, 48, 98, 97, 48, 51, 49, 49, 50, 55, 51, 54, 57, 51, 52, 99, 48, 48, 101, 57, 55, 50, 48, 99, 97, 54, 101, 48, 50, 51, 54, 49, 52, 50, 53, 100, 57, 99, 102, 50, 57, 49, 50, 49, 97, 32, 0, 0, 0, 0, 0, 0, 0, 194, 186, 11, 21, 24, 139, 17, 81, 192, 9, 11, 160, 49, 18, 115, 105, 52, 192, 14, 151, 32, 202, 110, 2, 54, 20, 37, 217, 207, 41, 18, 26, 148, 67, 214, 182, 247, 49, 34, 165, 125, 243, 102, 234, 50, 76, 69, 67, 219, 178, 46, 105, 220, 81, 210, 193, 50, 238, 7, 245, 29, 94, 118, 232, 36, 161, 124, 221, 77, 78, 244, 223, 184, 223, 194, 134, 54, 15, 165, 250, 0, 0, 0, 0, 127, 0, 0, 1, 224, 46, 0, 0, 64, 0, 0, 0, 0, 0, 0, 0, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 173, 8, 191, 87, 90, 92, 119, 102, 219, 62, 40, 236, 37, 149, 202, 134, 241, 136, 253, 31, 15, 210, 138, 36, 0, 21, 22, 35, 13, 216, 84, 84, 5, 148, 126, 207, 219, 51, 19, 40, 190, 14, 109, 161, 53, 63, 204, 176, 137, 167, 141, 153, 113, 109, 69, 25, 53, 17, 118, 213, 25, 199, 83, 229, 66, 134, 186, 50, 67, 4, 154, 68, 109, 95, 254, 52, 207, 194, 103, 71, 93, 99, 105, 96, 37, 59, 171, 92, 37, 33, 46, 232, 135, 165, 98, 211, 12, 159, 103, 76, 
57, 158, 182, 9, 145, 189, 115, 49, 141, 247, 218, 135, 40, 25, 219, 107, 28, 30, 106, 46, 6, 102, 228, 20, 193, 108, 245, 234, 95, 126, 130, 173, 117, 130, 53, 144, 177, 201, 122, 182, 41, 205, 44, 239, 0, 0, 0, 0, 0, 0, 0, 0]

msgpack compact for verifying

[145, 147, 129, 0, 217, 64, 102, 57, 50, 51, 51, 97, 51, 53, 56, 49, 49, 98, 57, 49, 49, 100, 54, 51, 52, 48, 50, 100, 102, 56, 57, 56, 57, 102, 56, 55, 98, 50, 100, 55, 100, 54, 99, 57, 51, 49, 98, 99, 98, 101, 52, 98, 98, 101, 55, 55, 52, 97, 48, 101, 57, 52, 57, 100, 102, 98, 55, 102, 99, 49, 220, 0, 48, 204, 173, 8, 204, 191, 87, 90, 92, 119, 102, 204, 219, 62, 40, 204, 236, 37, 204, 149, 204, 202, 204, 134, 204, 241, 204, 136, 204, 253, 31, 15, 204, 210, 204, 138, 36, 0, 21, 22, 35, 13, 204, 216, 84, 84, 5, 204, 148, 126, 204, 207, 204, 219, 51, 19, 40, 204, 190, 14, 109, 204, 161, 53, 63, 204, 204, 204, 176, 129, 2, 146, 146, 146, 129, 217, 64, 99, 50, 98, 97, 48, 98, 49, 53, 49, 56, 56, 98, 49, 49, 53, 49, 99, 48, 48, 57, 48, 98, 97, 48, 51, 49, 49, 50, 55, 51, 54, 57, 51, 52, 99, 48, 48, 101, 57, 55, 50, 48, 99, 97, 54, 101, 48, 50, 51, 54, 49, 52, 50, 53, 100, 57, 99, 102, 50, 57, 49, 50, 49, 97, 146, 146, 196, 32, 194, 186, 11, 21, 24, 139, 17, 81, 192, 9, 11, 160, 49, 18, 115, 105, 52, 192, 14, 151, 32, 202, 110, 2, 54, 20, 37, 217, 207, 41, 18, 26, 220, 0, 48, 204, 148, 67, 204, 214, 204, 182, 204, 247, 49, 34, 204, 165, 125, 204, 243, 102, 204, 234, 50, 76, 69, 67, 204, 219, 204, 178, 46, 105, 204, 220, 81, 204, 210, 204, 193, 50, 204, 238, 7, 204, 245, 29, 94, 118, 204, 232, 36, 204, 161, 124, 204, 221, 77, 78, 204, 244, 204, 223, 204, 184, 204, 223, 204, 194, 204, 134, 54, 15, 204, 165, 204, 250, 175, 49, 50, 55, 46, 48, 46, 48, 46, 49, 58, 49, 50, 48, 48, 48, 146, 0, 217, 64, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 146, 220, 0, 48, 204, 173, 8, 204, 191, 87, 90, 92, 119, 102, 204, 219, 62, 40, 204, 236, 37, 204, 149, 204, 202, 204, 134, 204, 241, 204, 136, 204, 253, 31, 15, 204, 210, 204, 138, 36, 0, 21, 22, 35, 13, 204, 216, 84, 84, 5, 
204, 148, 126, 204, 207, 204, 219, 51, 19, 40, 204, 190, 14, 109, 204, 161, 53, 63, 204, 204, 204, 176, 220, 0, 96, 204, 137, 204, 167, 204, 141, 204, 153, 113, 109, 69, 25, 53, 17, 118, 204, 213, 25, 204, 199, 83, 204, 229, 66, 204, 134, 204, 186, 50, 67, 4, 204, 154, 68, 109, 95, 204, 254, 52, 204, 207, 204, 194, 103, 71, 93, 99, 105, 96, 37, 59, 204, 171, 92, 37, 33, 46, 204, 232, 204, 135, 204, 165, 98, 204, 211, 12, 204, 159, 103, 76, 57, 204, 158, 204, 182, 9, 204, 145, 204, 189, 115, 49, 204, 141, 204, 247, 204, 218, 204, 135, 40, 25, 204, 219, 107, 28, 30, 106, 46, 6, 102, 204, 228, 20, 204, 193, 108, 204, 245, 204, 234, 95, 126, 204, 130, 204, 173, 117, 204, 130, 53, 204, 144, 204, 177, 204, 201, 122, 204, 182, 41, 204, 205, 44, 204, 239, 0]

msgpack robust for verifying

[129, 167, 99, 111, 110, 116, 101, 110, 116, 131, 163, 100, 115, 116, 129, 0, 217, 64, 102, 57, 50, 51, 51, 97, 51, 53, 56, 49, 49, 98, 57, 49, 49, 100, 54, 51, 52, 48, 50, 100, 102, 56, 57, 56, 57, 102, 56, 55, 98, 50, 100, 55, 100, 54, 99, 57, 51, 49, 98, 99, 98, 101, 52, 98, 98, 101, 55, 55, 52, 97, 48, 101, 57, 52, 57, 100, 102, 98, 55, 102, 99, 49, 167, 100, 115, 116, 95, 107, 101, 121, 220, 0, 48, 204, 173, 8, 204, 191, 87, 90, 92, 119, 102, 204, 219, 62, 40, 204, 236, 37, 204, 149, 204, 202, 204, 134, 204, 241, 204, 136, 204, 253, 31, 15, 204, 210, 204, 138, 36, 0, 21, 22, 35, 13, 204, 216, 84, 84, 5, 204, 148, 126, 204, 207, 204, 219, 51, 19, 40, 204, 190, 14, 109, 204, 161, 53, 63, 204, 204, 204, 176, 167, 118, 97, 114, 105, 97, 110, 116, 129, 2, 130, 171, 101, 108, 100, 101, 114, 115, 95, 105, 110, 102, 111, 130, 165, 118, 97, 108, 117, 101, 130, 166, 101, 108, 100, 101, 114, 115, 129, 217, 64, 99, 50, 98, 97, 48, 98, 49, 53, 49, 56, 56, 98, 49, 49, 53, 49, 99, 48, 48, 57, 48, 98, 97, 48, 51, 49, 49, 50, 55, 51, 54, 57, 51, 52, 99, 48, 48, 101, 57, 55, 50, 48, 99, 97, 54, 101, 48, 50, 51, 54, 49, 52, 50, 53, 100, 57, 99, 102, 50, 57, 49, 50, 49, 97, 130, 169, 112, 117, 98, 108, 105, 99, 95, 105, 100, 146, 196, 32, 194, 186, 11, 21, 24, 139, 17, 81, 192, 9, 11, 160, 49, 18, 115, 105, 52, 192, 14, 151, 32, 202, 110, 2, 54, 20, 37, 217, 207, 41, 18, 26, 220, 0, 48, 204, 148, 67, 204, 214, 204, 182, 204, 247, 49, 34, 204, 165, 125, 204, 243, 102, 204, 234, 50, 76, 69, 67, 204, 219, 204, 178, 46, 105, 204, 220, 81, 204, 210, 204, 193, 50, 204, 238, 7, 204, 245, 29, 94, 118, 204, 232, 36, 204, 161, 124, 204, 221, 77, 78, 204, 244, 204, 223, 204, 184, 204, 223, 204, 194, 204, 134, 54, 15, 204, 165, 204, 250, 169, 112, 101, 101, 114, 95, 97, 100, 100, 114, 175, 49, 50, 55, 46, 48, 46, 48, 46, 49, 58, 49, 50, 48, 48, 48, 166, 112, 114, 101, 102, 105, 120, 130, 169, 98, 105, 116, 95, 99, 111, 117, 110, 116, 0, 164, 110, 97, 109, 101, 217, 64, 48, 48, 48, 48, 48, 
48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 165, 112, 114, 111, 111, 102, 130, 170, 112, 117, 98, 108, 105, 99, 95, 107, 101, 121, 220, 0, 48, 204, 173, 8, 204, 191, 87, 90, 92, 119, 102, 204, 219, 62, 40, 204, 236, 37, 204, 149, 204, 202, 204, 134, 204, 241, 204, 136, 204, 253, 31, 15, 204, 210, 204, 138, 36, 0, 21, 22, 35, 13, 204, 216, 84, 84, 5, 204, 148, 126, 204, 207, 204, 219, 51, 19, 40, 204, 190, 14, 109, 204, 161, 53, 63, 204, 204, 204, 176, 169, 115, 105, 103, 110, 97, 116, 117, 114, 101, 220, 0, 96, 204, 137, 204, 167, 204, 141, 204, 153, 113, 109, 69, 25, 53, 17, 118, 204, 213, 25, 204, 199, 83, 204, 229, 66, 204, 134, 204, 186, 50, 67, 4, 204, 154, 68, 109, 95, 204, 254, 52, 204, 207, 204, 194, 103, 71, 93, 99, 105, 96, 37, 59, 204, 171, 92, 37, 33, 46, 204, 232, 204, 135, 204, 165, 98, 204, 211, 12, 204, 159, 103, 76, 57, 204, 158, 204, 182, 9, 204, 145, 204, 189, 115, 49, 204, 141, 204, 247, 204, 218, 204, 135, 40, 25, 204, 219, 107, 28, 30, 106, 46, 6, 102, 204, 228, 20, 204, 193, 108, 204, 245, 204, 234, 95, 126, 204, 130, 204, 173, 117, 204, 130, 53, 204, 144, 204, 177, 204, 201, 122, 204, 182, 41, 204, 205, 44, 204, 239, 174, 112, 97, 114, 115, 101, 99, 95, 118, 101, 114, 115, 105, 111, 110, 0]

Side by side

Here are the leading bytes for each format: the first line is the bytes to sign, the second line is the bytes to verify.

bincode:

[0, 0, 0, 0, 64, 0, 0, 0, 0, 0, 0, 0, 102, 57, 50, 51, 51, 97...
[0, 0, 0, 0, 64, 0, 0, 0, 0, 0, 0, 0, 102, 57, 50, 51, 51, 97...

msgpack compact:

[     147, 129, 0, 217, 64, 102, 57, 50, 51, 51, 97, 51, 53, 56...
[145, 147, 129, 0, 217, 64, 102, 57, 50, 51, 51, 97, 51, 53, 56...

msgpack robust:

[                                            131, 163, 100, 115, 116, 129, 0, 217, 64, 102...
[129, 167, 99, 111, 110, 116, 101, 110, 116, 131, 163, 100, 115, 116, 129, 0, 217, 64, 102...
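
Reading the prefixes against the msgpack spec: in the robust form the extra 9 bytes are a one-entry fixmap (129) whose key is the 7-byte fixstr "content" (167, 99, 111, 110, 116, 101, 110, 116), and in the compact form the extra byte 145 is a one-element fixarray. So the bytes to verify look like the bytes to sign wrapped in a single-field struct. A minimal sketch of peeling that wrapper off, using only std; the function name is made up for illustration and it only handles the two layouts shown above:

```rust
/// Returns the inner bytes of a msgpack-encoded single-field wrapper, or None
/// if the input does not start with one of the two layouts seen in this thread:
///   compact: fixarray of length 1 (0x91) followed by the inner value
///   robust:  fixmap of length 1 (0x81), a fixstr key, then the inner value
fn strip_single_field_wrapper(bytes: &[u8]) -> Option<&[u8]> {
    match *bytes.first()? {
        // One-element fixarray: the inner value starts right after the header.
        0x91 => Some(&bytes[1..]),
        0x81 => {
            let key_header = *bytes.get(1)?;
            // fixstr headers are 0xa0..=0xbf; the low 5 bits hold the length.
            if !(0xa0..=0xbf).contains(&key_header) {
                return None;
            }
            let key_len = (key_header & 0x1f) as usize;
            // Skip map header (1), key header (1) and the key itself.
            bytes.get(2 + key_len..)
        }
        _ => None,
    }
}
```

Applied to the leading bytes above, the robust verifying line minus its wrapper starts with 131, 163, 100, 115, 116, which is exactly how the signing line starts.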

This is all great info again @mav, and we have the guys looking into this as well. @lionel.faber did some digging with msgpack and noticed differences similar to what you are describing here. It seems msgpack uses labels for structs while bincode uses a unique identifier. We do need to look at what we serialize to sign, and perhaps make that a struct that is specifically serializable, with a wrapper struct that adds the signature for sending across the wire.


I resolved the sign/verify serialization issue in routing (a hack, will not elaborate!) and can now start a baby-fleming section using msgpack on all the vault messaging.

Next I’m looking at uploading and downloading a file using msgpack in safe-api (and the dependencies).

edit: bit of a facepalm moment, I’ll hide the unnecessary stuff in a details tag… turns out I wasn’t replacing the safe-authd binary, so authd was still trying to talk in bincode to the vaults.

I'm just setting the first foot into this territory but it looks like maybe some difficulty with deserializing some safe-nd enums.

Errors are being thrown by rmp_serde at handle_new_message L352.

Error parsing message Syntax("unknown variant `Response`, expected `Ok` or `Err`")
Error parsing message Syntax("unknown variant `Notification`, expected `Ok` or `Err`")

These bytes are serializations of safe-nd Message enum. They don’t seem particularly complex, I’m still not sure why they are not deserializing correctly.

The part that seems to be asking for further inspection is the Message::Response enum on L22, which is a long list of data types all wrapped in Result. I think the Result wrapper may be messing with msgpack? Not sure, but I want to check it out a bit more. For example, enum #17 is Transaction on L73, which is used for account creation with test coins.

But the head-scratcher for me is Message::Notification which is just a heavily nested pair of u64s. It seems like it should deserialize easily.

An example of Message::Response msgpack bytes:

[129, 168, 82, 101, 115, 112, 111, 110, 115, 101, 130, 168, 114, 101, 115, 112, 111, 110, 115, 101, 129, 171, 84, 114, 97, 110, 115, 97, 99, 116, 105, 111, 110, 129, 162, 79, 107, 130, 162, 105, 100, 207, 220, 57, 236, 105, 33, 200, 56, 162, 166, 97, 109, 111, 117, 110, 116, 207, 0, 0, 0, 232, 219, 51, 135, 128, 170, 109, 101, 115, 115, 97, 103, 101, 95, 105, 100, 220, 0, 32, 112, 204, 254, 70, 34, 79, 76, 204, 230, 51, 70, 123, 39, 95, 204, 252, 204, 190, 204, 151, 204, 161, 204, 195, 25, 204, 167, 31, 94, 6, 204, 213, 113, 51, 108, 204, 159, 126, 39, 204, 240, 47, 23]

An example of Message::Notification msgpack bytes:

[129, 172, 78, 111, 116, 105, 102, 105, 99, 97, 116, 105, 111, 110, 129, 172, 110, 111, 116, 105, 102, 105, 99, 97, 116, 105, 111, 110, 130, 162, 105, 100, 207, 220, 57, 236, 105, 33, 200, 56, 162, 166, 97, 109, 111, 117, 110, 116, 207, 0, 0, 0, 232, 219, 51, 135, 128]
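
To show what those Notification bytes hold, here is a rough hand-rolled reader using only std. It is not how rmp_serde parses anything; the Reader type and decode_notification are names made up for this sketch, and it only understands the three msgpack encodings that appear in the example (fixmap, fixstr, uint64):

```rust
/// Minimal cursor over msgpack bytes (fixmap, fixstr and uint64 only).
struct Reader<'a> {
    buf: &'a [u8],
    pos: usize,
}

impl<'a> Reader<'a> {
    fn new(buf: &'a [u8]) -> Self {
        Reader { buf, pos: 0 }
    }

    /// Consumes the next `n` bytes, or returns None if the input is too short.
    fn take(&mut self, n: usize) -> Option<&'a [u8]> {
        let buf: &'a [u8] = self.buf;
        let end = self.pos.checked_add(n)?;
        let out = buf.get(self.pos..end)?;
        self.pos = end;
        Some(out)
    }

    /// fixmap: 0x80..=0x8f, low 4 bits are the entry count.
    fn map_len(&mut self) -> Option<usize> {
        let b = self.take(1)?[0];
        if (0x80..=0x8f).contains(&b) { Some((b & 0x0f) as usize) } else { None }
    }

    /// fixstr: 0xa0..=0xbf, low 5 bits are the byte length.
    fn fixstr(&mut self) -> Option<&'a str> {
        let b = self.take(1)?[0];
        if !(0xa0..=0xbf).contains(&b) {
            return None;
        }
        std::str::from_utf8(self.take((b & 0x1f) as usize)?).ok()
    }

    /// uint64: 0xcf followed by 8 big-endian bytes.
    fn uint64(&mut self) -> Option<u64> {
        if self.take(1)?[0] != 0xcf {
            return None;
        }
        let mut raw = [0u8; 8];
        raw.copy_from_slice(self.take(8)?);
        Some(u64::from_be_bytes(raw))
    }
}

/// Walks {"Notification": {"notification": {"id": u64, "amount": u64}}}
/// and returns (id, amount).
fn decode_notification(bytes: &[u8]) -> Option<(u64, u64)> {
    let mut r = Reader::new(bytes);
    for key in ["Notification", "notification"] {
        if r.map_len()? != 1 || r.fixstr()? != key {
            return None;
        }
    }
    if r.map_len()? != 2 || r.fixstr()? != "id" {
        return None;
    }
    let id = r.uint64()?;
    if r.fixstr()? != "amount" {
        return None;
    }
    let amount = r.uint64()?;
    Some((id, amount))
}
```

Run over the example above, this yields id 0xdc39ec6921c838a2 and amount 1,000,110,000,000, so the bytes themselves are a well-formed nested pair of u64s.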

Will look further into this tomorrow. Feels very close to working now. I’ve got benchmarks for bincode and am keen to compare how msgpack performs. It’s one of those things where 95% of the work took 5% of the time. I felt benchmarking bincode vs msgpack would be a quick job, but the last 5% have been a nice puzzle to dig into. It’s so cool when there’s a nontrivial answer to ‘what is hard about doing this seemingly trivial thing’.

One question I have is why is authd a separate binary. I can appreciate some people will want to only run authd, that’s fine we can give them the authd binary, but since all people running the safe binary will want to run authd why not package it with that (ok I digress readonly users won’t need authd). Seems overall a lot more ergonomic to bundle authd logic in with safe rather than keep it separate.

eg

  • power user -> download authd and interface to Safe Network directly from that.
  • normal user -> download safe and interface directly to Safe Network from that. No need for safe auth install | update. The need for safe auth start | stop | restart can be inferred from behaviors (eg need to be logged in to complete an action? start authd automatically). Seems a lot simpler to me.

As it is now, normal user is exposed to extra complexity by the separation of safe and authd, albeit neatly-wrapped complexity. Seems like it should be the other way around and power user is the one that should be exposed to complexity.

I feel I must be missing some historical or practical reason for authd being the way it is. Can anyone fill me in?

The readme says “[safe-authd] is normally shipped as part of the package of an Authenticator GUI, like the SAFE Network Application, and therefore SAFE users and SAFE app developers don’t need it or worry about since the SAFE API already provides functions to interact with the safe-authd , and the SAFE CLI also has commands to do so.”
But this doesn’t really clarify for me why it is a second separate binary? I guess safe-authd being ~20M means less duplication is better…? Not a major point, just a curiosity.


I’ve got the basic test working with messagepack, ie start baby-fleming network, upload 10MB file, download 10MB file.

However I’m still not ready to show benchmarks comparing the two.

There’s another subtle aspect of messagepack to be considered which is massively inflating the message size (1.5x larger messages leading to around 3x the amount of bandwidth used).

Immutable data values are defined in safe-nd/src/immutable_data.rs L30 as Vec<u8> types.

This is encoded in messagepack using the array format, which leads to 1.5x the size of the original Vec. The array format means each object in the array is encoded individually. Looking at the int format spec, a number between 0-127 will be encoded as a ‘fixnum’ into a single byte. A number between 128-255 will be encoded as a ‘uint8’ in 2 bytes. So for random data about half the original bytes are encoded in a single byte and about half are encoded in two bytes. This gives the expected 1.5x increase in size.
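
The arithmetic can be sketched as a small size calculator (std only; the function name is made up, and it computes the size from the msgpack spec rather than calling rmp_serde):

```rust
/// Size in bytes of `data` when msgpack encodes it as an array of integers.
fn msgpack_array_len(data: &[u8]) -> usize {
    // Array header: fixarray (1 byte) up to 15 items, array16 (3 bytes) up to
    // 65535 items, array32 (5 bytes) beyond that.
    let header = match data.len() {
        0..=15 => 1,
        16..=65_535 => 3,
        _ => 5,
    };
    // Positive fixint (0..=127) takes 1 byte; uint8 (128..=255) takes 2.
    let body: usize = data.iter().map(|&b| if b < 128 { 1 } else { 2 }).sum();
    header + body
}
```

For a 1,048,576-byte chunk with values spread evenly over 0-255 this gives 1,572,869 bytes, which is in line with the roughly 1.57MB measured for random chunks.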

Here’s the serialized size (in bytes) for nine 1MB immutable data chunks, comparing messagepack arrays with bincode:

msgpack robust   msgpack compact   bincode
1,572,763        1,572,713         1,048,749
1,573,256        1,573,206         1,048,749
1,573,139        1,573,089         1,048,749
1,572,639        1,572,589         1,048,749
1,572,672        1,572,622         1,048,749
1,573,716        1,573,666         1,048,749
1,572,765        1,572,715         1,048,749
1,573,526        1,573,476         1,048,749
1,573,298        1,573,248         1,048,749

The way to avoid this is to use the bin format which is meant for encoding bytes.
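
For comparison, here is a sketch of what the bin family looks like on the wire, hand-encoded from the msgpack spec with std only (encode_bin is a name made up for this example, not an rmp_serde function). The payload is copied through untouched and the only overhead is a 2 to 5 byte header:

```rust
/// Encodes `data` in the msgpack bin family (bin8 / bin16 / bin32).
fn encode_bin(data: &[u8]) -> Vec<u8> {
    let n = data.len();
    let mut out = Vec::with_capacity(n + 5);
    if n <= u8::MAX as usize {
        // bin8: 0xc4, then a 1-byte length.
        out.push(0xc4);
        out.push(n as u8);
    } else if n <= u16::MAX as usize {
        // bin16: 0xc5, then a 2-byte big-endian length.
        out.push(0xc5);
        out.extend_from_slice(&(n as u16).to_be_bytes());
    } else {
        // bin32: 0xc6, then a 4-byte big-endian length.
        assert!(n <= u32::MAX as usize, "bin32 cannot hold more than u32::MAX bytes");
        out.push(0xc6);
        out.extend_from_slice(&(n as u32).to_be_bytes());
    }
    // The raw bytes follow the header unchanged.
    out.extend_from_slice(data);
    out
}
```

The bin8 form already shows up in the byte dumps earlier in the thread: the 196, 32 just before the 32-byte public id is 0xc4 followed by the length.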

How to use bin format in rmp_serde instead of array format? Looks like using [u8] instead of Vec<u8> will be the answer (from reading this source). Where and when to do this is getting beyond the scope of the benchmarking, and the number of quick hacks to get a working test is beginning to make me feel too much erosion into the usefulness of any results.

I feel changing messagepack from array format to bin format will also make a huge difference in speed, since msgpack won’t need to parse each item in the array individually. Right now the performance difference is extremely bad, at least 3x worse, but hopefully the change makes performance between bincode and messagepack almost indistinguishable.

I’m going to stop here with converting bincode to messagepack. I feel I have learned enough about the difficulties and am confident there are no show-stoppers for the change, and the performance is not going to be impacted significantly. This thread outlines all the difficulties, of course I’m happy to answer any questions about it if anyone wants.

Something I wonder is whether it’s worth keeping bincode and simply expecting that format to be implemented in other languages in the future. It’s maybe possible, right?!

Another thing I wonder is if a schema based serialization would be beneficial. I’m not going to explore it in the code since it’s a really different and more complex thing to undertake, but capnproto and protobuf both seem to have benefits of being explicit in their definition rather than serializing whatever happens to be in memory at the time (although I can see they are not a magic bullet and my personal experience of protobufs via BIP70 was not as pleasant as, say, json). The description of protobufs is quite apt though:

The design goals for Protocol Buffers emphasized simplicity and performance.

Protocol Buffers are widely used at Google for storing and interchanging all kinds of structured information. The method serves as a basis for a custom remote procedure call (RPC) system that is used for nearly all inter-machine communication at Google.

Though the primary purpose of Protocol Buffers is to facilitate network communication, its simplicity and speed make Protocol Buffers an alternative to data-centric C++ classes and structs, especially where interoperability with other languages or systems might be needed in the future.

edit:

The way to get rmp_serde to use the bin format instead of the array format is serde_bytes.

It gives the expected improvement in results. The data is only 50-100 bytes more when serializing an immutable data chunk, probably even slightly less when serialization is diligently done rather than my proof of concept.

msgpack robust   msgpack compact   bincode
1,048,845        1,048,795         1,048,749
1,048,865        1,048,815         1,048,749
1,048,850        1,048,800         1,048,749
1,048,840        1,048,790         1,048,749
1,048,854        1,048,804         1,048,749
1,048,855        1,048,805         1,048,749
1,048,857        1,048,807         1,048,749
1,048,855        1,048,805         1,048,749
1,048,847        1,048,797         1,048,749
1,048,852        1,048,802         1,048,749

The change from Vec<u8> to serde_bytes::ByteBuf for serialization allowed reasonably fair comparisons to be made between msgpack and bincode, so I figured after this fairly long experimental project why not put up some results, even though I feel they are just getting the vibe of the thing rather than a proper production level comparison.

Still, there are a lot of caveats, but we can see that the results are at least within the same ballpark as each other.

Caveats

  • Not all Vec<u8> were serialized with ByteBuf, only the immutable data chunks. There are definitely still some Vec<u8> data causing slowdown due to the technicality in post #19, but they are much smaller than the immutable data.

  • The relationship between safe-cli, safe-authd and safe-vault during create-acc and login was too hard for me to unravel; the create-acc command seemed to work correctly but login would fail with SafeNd::AccessDenied.
    Whether this was because vaults during create-acc were receiving or returning bad data, or whether it was authd mishandling account creation, or if login was sending bad data, or if data was being misread by vaults during login, I couldn’t work it out, but the impact was mitigated by removing the ownership test during login on safe-vault/src/client_handler/login_packets.rs L331. This means the full chain of communication was still carried out, but the validation was not performed. It returned a login credential to the client and that credential allowed upload and download to happen. So this quick fix retained all the logic (except one public key comparison) and all the communication and I expect does not significantly affect the benchmarks.

  • The bytes being signed vs verified during network startup differ due to the data types as described in post #15 and #16. I fixed this by doing a second serialization to get the original bytes for verification. This means some minor extra work by reserializing the data, but no change to bandwidth. I don’t see this impacting the benchmarks in any meaningful way.

  • I put a lot of extra log messages in at the info level, and ran the benchmarks without any RUST_LOG environment variable. This should have no impact on the benchmark. At one point I was logging the full message bytes, which slowed down things a lot when logging every instance of a 1MB chunk data being received and sent, but other than that all the log messages were very light-weight.

  • The rmp_serde serialization uses with_struct_map and with_string_variants (detailed here) which produces slightly more bytes than the alternative compact form of with_struct_tuple and with_integer_variants. This has a small impact on the number of bytes sent and computation when deserializing, but I’m sure as a percent of total bytes it’s insignificant for the benchmarks.
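To get a feel for the size difference in that last caveat, here’s a minimal sketch that hand-encodes one tiny record both ways, using the MessagePack fixmap, fixarray, fixstr and positive fixint formats. It doesn’t use rmp_serde at all, and the record with `id` and `name` fields is made up for illustration; it only shows the shape of what the two layouts put on the wire.

```rust
// Hand-rolled MessagePack for one small, hypothetical record {id, name}.
// This is an illustration of the two layouts, not what rmp_serde runs internally.

/// Append a MessagePack fixstr (strings of at most 31 bytes).
fn put_fixstr(out: &mut Vec<u8>, s: &str) {
    assert!(s.len() <= 31, "fixstr holds at most 31 bytes");
    out.push(0xa0 | s.len() as u8);
    out.extend_from_slice(s.as_bytes());
}

/// Append a MessagePack positive fixint (0..=127).
fn put_fixint(out: &mut Vec<u8>, n: u8) {
    assert!(n <= 0x7f, "positive fixint holds 0..=127");
    out.push(n);
}

/// Map layout: every value travels with its field name.
fn encode_as_map(id: u8, name: &str) -> Vec<u8> {
    let mut out = vec![0x80 | 2]; // fixmap with 2 entries
    put_fixstr(&mut out, "id");
    put_fixint(&mut out, id);
    put_fixstr(&mut out, "name");
    put_fixstr(&mut out, name);
    out
}

/// Tuple layout: values only, the position gives the meaning.
fn encode_as_tuple(id: u8, name: &str) -> Vec<u8> {
    let mut out = vec![0x90 | 2]; // fixarray with 2 elements
    put_fixint(&mut out, id);
    put_fixstr(&mut out, name);
    out
}

fn main() {
    let map = encode_as_map(1, "a");
    let tuple = encode_as_tuple(1, "a");
    println!("map layout:   {} bytes {:02x?}", map.len(), map);
    println!("tuple layout: {} bytes {:02x?}", tuple.len(), tuple);
}
```

For this record the map layout comes to 12 bytes and the tuple layout to 4. The extra bytes are the field names, a fixed cost per struct, so next to a 1MB chunk they barely register.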

Baseline configuration

This is running the latest public version, a couple of months old now but still fine for this test since it’s all based on the same baseline set of libraries from 2020-07-17.

safe_vault v0.25.1 commit:5d5c214deb9b48dbb6ec532ca13e39a338eb0eff
safe_api v0.15.0 commit:0feadec0d0c196f57458b916d1c9b491e396906b
routing v0.37.0 commit:97cc0cdb6acf073fb53b9f679181176005db1f2b
quic-p2p v0.7.0 commit:a1ed4c4284445194caad7728f065ccc851c4e3ba
safe-nd v0.10.1 commit:69dc14125413ac17ae36190e462b8a488be4be9d
safe_core v0.42.1 commit:9ff407420a1ef577c545d60881a9dea44488ed83
threshold_crypto v0.3.2 commit:485333db6e2611d26a2a921a6ee1e1dcc3ce4624
BLS-DKG v0.1.0 commit:be6ac3e3eb0835086f17b694f551118fa8bb0de8
parsec v0.7.1 commit:b71dfb3b8843f647c853fdc6925f670d67024cb7

The aim is to replace bincode 1.2.1 with rmp-serde 0.14.4 in all these libraries and binaries.

Test Setup

The test is to start a baby-fleming network, create an account then upload a 10MB file.

safe vault run-baby-fleming
safe auth start
safe auth create-acc --test-coins
safe auth login --self-auth
dd if=/dev/urandom of=/tmp/10MB.dat bs=1M count=10
time safe files put /tmp/10MB.dat
safe auth logout
safe auth stop
safe vault killall

As a one-liner:

export SAFE_AUTH_PASSPHRASE=insecure_test_passphrase; export SAFE_AUTH_PASSWORD=insecure_test_password; safe auth stop; safe vault killall; rm -r baby-fleming-vaults/; safe vault run-baby-fleming; safe auth start; safe auth create-acc --test-coins; safe auth login --self-auth; dd if=/dev/urandom of=/tmp/10M.dat bs=1M count=10; time safe files put /tmp/10M.dat; safe auth logout; safe auth stop; safe vault killall;

Record how long the 10 MB file takes to upload.

Record the total bytes sent/received in the whole process (including network start) by adding an info! log message at try_write_to_peer for the sent bytes, and at read_peer_stream for the received bytes. The sent and received totals did not match in the results (received is consistently higher), so maybe the two log points sit at different levels of abstraction, or maybe some bytes really do get added somewhere.
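As a toy illustration of the "uneven level of abstraction" idea, here’s a sketch where the send side counts the payload before framing and the receive side counts the raw bytes off the wire. Everything in it is hypothetical: the `ByteCounters` type and the 4-byte length prefix are made up, and this is not how quic-p2p frames or counts anything. It only shows how two log points can disagree without any data going missing.

```rust
/// Hypothetical running totals, standing in for the summed log messages.
#[derive(Default)]
struct ByteCounters {
    sent: u64,
    received: u64,
}

impl ByteCounters {
    /// Counts the payload, then adds a made-up 4-byte length prefix.
    fn send(&mut self, payload: &[u8]) -> Vec<u8> {
        self.sent += payload.len() as u64; // counted before framing
        let mut wire = (payload.len() as u32).to_be_bytes().to_vec();
        wire.extend_from_slice(payload);
        wire
    }

    /// Counts everything read off the wire, then strips the prefix.
    fn receive<'a>(&mut self, wire: &'a [u8]) -> &'a [u8] {
        self.received += wire.len() as u64; // counted after framing
        &wire[4..]
    }
}

fn main() {
    let mut counters = ByteCounters::default();
    let messages: [&[u8]; 2] = [b"hello", b"a longer message"];
    for payload in messages {
        let wire = counters.send(payload);
        // The payload survives intact even though the totals will differ.
        assert_eq!(counters.receive(&wire), payload);
    }
    println!("sent {} received {}", counters.sent, counters.received);
}
```

Here the received total ends up 4 bytes per message higher than the sent total. If something like this is going on, the gap should scale with the number of messages rather than with their size.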

Results

The test was repeated 15 times.

First column is time to upload 10MB file in seconds.
Second column is total bytes sent during the entire process including network start, summed for all 8 vaults (but does not include bytes sent by safe-cli or safe-authd).
Third column is same as second but for received bytes instead of sent.

Bincode

time (s)     bytes sent  bytes received
  20.438  1,626,054,744   1,688,888,357
  19.878  1,615,885,458   1,668,225,507
  19.694  1,599,402,321   1,651,745,506
   9.163    557,369,184     588,839,703
   9.209    540,505,739     571,977,219
  23.138  1,980,864,132   2,043,689,887
  23.858  1,897,629,676   1,949,978,448
  20.458  1,630,496,738   1,693,327,909
  24.13   1,976,827,964   2,050,145,304
   8.841    544,707,726     576,178,599
  14.401    920,677,634     962,593,481
  19.937  1,599,231,640   1,648,424,177
  12.082    822,081,059     860,880,802
  20.686  1,626,114,530   1,688,946,933
  19.904  1,570,649,608   1,633,482,673

MessagePack

time (s)     bytes sent  bytes received
  23.935  1,592,763,338   1,645,168,284
  40.003  2,474,794,093   2,537,684,579
  25.734  1,689,742,737   1,742,145,731
  14.148    830,543,137     862,046,755
  41.292  2,987,210,155   3,038,579,795
  18.863  1,240,281,603   1,282,242,472
  24.938  1,682,991,584   1,735,395,933
  36.298  2,422,796,021   2,485,687,739
  18.357  1,201,547,567   1,243,539,603
 306.491  4,148,463,804   4,211,268,769
  34.879  2,242,732,855   2,305,631,064
  14.097    837,220,247     868,707,015
  25.029  1,675,382,112   1,727,786,371
  41.233  2,448,362,431   2,511,251,434
  41.434  2,998,209,902   3,055,874,989

This is not as simple to interpret as just taking the average or median.

I suspect there are some runs where there is less overall work, e.g. the bincode runs between 9 and 15 seconds, and other runs where there is more overall work, e.g. those between 15 and 25 seconds. So it may be that there are really two situations being shown in the one table. Should we compare them as equals? Or should we split them into two sets? It’s not obvious to me how to treat the results.
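If we did want to split them, a minimal sketch could look like this. The 15 second cut-off is just the boundary eyeballed above, not something derived from the data, and the function names are made up for the example.

```rust
/// Upload times in seconds, copied from the Bincode table.
const BINCODE_SECS: [f64; 15] = [
    20.438, 19.878, 19.694, 9.163, 9.209, 23.138, 23.858, 20.458,
    24.13, 8.841, 14.401, 19.937, 12.082, 20.686, 19.904,
];

/// Split runs at `threshold` seconds into (faster, slower), each sorted ascending.
fn split_at_threshold(times: &[f64], threshold: f64) -> (Vec<f64>, Vec<f64>) {
    let (mut fast, mut slow): (Vec<f64>, Vec<f64>) =
        times.iter().copied().partition(|&t| t < threshold);
    fast.sort_by(|a, b| a.total_cmp(b));
    slow.sort_by(|a, b| a.total_cmp(b));
    (fast, slow)
}

/// Median of an ascending, non-empty slice.
fn median_sorted(sorted: &[f64]) -> f64 {
    let mid = sorted.len() / 2;
    if sorted.len() % 2 == 1 {
        sorted[mid]
    } else {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    }
}

fn main() {
    let (fast, slow) = split_at_threshold(&BINCODE_SECS, 15.0);
    println!("under 15 s:    {} runs, median {:.3} s", fast.len(), median_sorted(&fast));
    println!("15 s and over: {} runs, median {:.3} s", slow.len(), median_sorted(&slow));
}
```

With that cut-off the bincode runs fall into 5 faster runs (median 9.209 s) and 10 slower ones (median 20.448 s). Whether those groups really are different situations is still a guess; the split only makes the guess easier to look at.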

The MessagePack table seems to have overall a) larger values and b) greater variance. So I think it’s fair to say MessagePack isn’t as efficient as bincode here, which is what we’d expect, and the numbers do show that.
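For a rough side-by-side, here’s a sketch that takes the median and the fastest/slowest run from each table. It is only a summary of the 15 runs above under the assumption that the median is a fair yardstick, not a proper statistical comparison, and the helper names are made up for the example.

```rust
/// (upload time in seconds, bytes sent) for each run, copied from the tables.
const BINCODE: [(f64, u64); 15] = [
    (20.438, 1_626_054_744), (19.878, 1_615_885_458), (19.694, 1_599_402_321),
    (9.163, 557_369_184), (9.209, 540_505_739), (23.138, 1_980_864_132),
    (23.858, 1_897_629_676), (20.458, 1_630_496_738), (24.13, 1_976_827_964),
    (8.841, 544_707_726), (14.401, 920_677_634), (19.937, 1_599_231_640),
    (12.082, 822_081_059), (20.686, 1_626_114_530), (19.904, 1_570_649_608),
];
const MESSAGEPACK: [(f64, u64); 15] = [
    (23.935, 1_592_763_338), (40.003, 2_474_794_093), (25.734, 1_689_742_737),
    (14.148, 830_543_137), (41.292, 2_987_210_155), (18.863, 1_240_281_603),
    (24.938, 1_682_991_584), (36.298, 2_422_796_021), (18.357, 1_201_547_567),
    (306.491, 4_148_463_804), (34.879, 2_242_732_855), (14.097, 837_220_247),
    (25.029, 1_675_382_112), (41.233, 2_448_362_431), (41.434, 2_998_209_902),
];

/// Middle element after sorting; the exact median for an odd number of runs.
fn median_secs(runs: &[(f64, u64)]) -> f64 {
    let mut secs: Vec<f64> = runs.iter().map(|r| r.0).collect();
    secs.sort_by(|a, b| a.total_cmp(b));
    secs[secs.len() / 2]
}

/// Same as `median_secs`, but for the bytes-sent column.
fn median_sent(runs: &[(f64, u64)]) -> u64 {
    let mut sent: Vec<u64> = runs.iter().map(|r| r.1).collect();
    sent.sort_unstable();
    sent[sent.len() / 2]
}

/// (fastest, slowest) upload time, as a crude measure of spread.
fn time_range(runs: &[(f64, u64)]) -> (f64, f64) {
    let secs = runs.iter().map(|r| r.0);
    let fastest = secs.clone().fold(f64::INFINITY, f64::min);
    let slowest = secs.fold(f64::NEG_INFINITY, f64::max);
    (fastest, slowest)
}

fn main() {
    for (name, runs) in [("bincode", &BINCODE), ("messagepack", &MESSAGEPACK)] {
        let (fastest, slowest) = time_range(runs);
        println!(
            "{name}: median {:.3} s, median sent {} bytes, range {:.3} to {:.3} s",
            median_secs(runs), median_sent(runs), fastest, slowest
        );
    }
}
```

On these numbers the median upload is 19.904 s for bincode against 25.734 s for MessagePack, and the median bytes sent is 1,599,402,321 against 1,689,742,737. The median is used rather than the mean because the single 306.491 s MessagePack run would otherwise dominate.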

Can we make decisions about future work from this test? Probably not. But I can say from the amount and nature of the work required to get a basic test running that the conversion is not a mechanical task of changing one library function to a different library function. It takes some careful thought and analysis when things don’t work as expected. And discovering when things don’t work requires good test coverage. I was surprised this task became as complex and detailed as it did. It seemed to me that serializations were relatively fungible but no, they are definitely not!
