Secure Random Relocation

rob · January 12, 2018, 1:42am

If you are specifying the section to get key for then it is similar to brute-force a password.

Two sections is like brute force of 1 bit password. 256 sections is like brute-force of a one character password etc.

Not specifying the section would remove the brute-force label though.

And HD keys would be clean and no “lets gen a key and see if it fits” uncouth way.

Thanks for that @mav

dirvine · January 12, 2018, 1:47am

Yea I see that different. With this logic if we fill an array of 32 bytes in a loop then it would be brute force filling an array. To me brute force would be a significant & unknown amount of work and time.

I think here there is possibly more work involved if we force relocation section but less work involved if we used an HDkey to get that section. However the relocating group has a wee bit more work as it then has to get the key from the peer when it is created, agree to it (via consensus) and then send the messages to the new group to expect this peer. If we choose a section and send the expect message to the new section, then the peer does the extra work.

dirvine · January 12, 2018, 1:50am

Nice work chap, It may be simpler though. I mean if the network splits into X sections then to get into a section we should only need to generate a max of X keys to get into a section. Well if the key generation is random (as random as SHA etc.).

mav · January 12, 2018, 1:53am

It’s like bitcoin mining though. You never know what the start of the public key is so you just have to keep trying. Sometimes you’ll be unlucky and it takes more than X guesses. The 50% column is the best one to use since it represents the average number of guesses to find a key.

Fraser · January 12, 2018, 2:08am

I guess that’s where using a random byte array provided by the source section and XORing this with the public key would win over the HDKeypairs approach - there’d be mininal cost to the target section and the relocating peer.

The random relocation will have an extra cost of usually being multi-hop, but all these costs will probably be insignificant compared to the effort required to bring the relocated peer up to date once it connects to the target section.

Qi_Ma · January 12, 2018, 7:29am

I think the issue is different here. we only need a pub-key with a predictable pattern, which allows us to carry out guess in advance.

As mentioned above. so basicaly you just create a look-up list, and put any key-pair you generated into it, filter out those fall in the same prefix length.
Then when you being asked to relocate, you only need to pick any stored key that having same or longer prefix-length.

A 1GB look-up list could gives you 16M entry slots (only need to store priv-key, assumed to be 512 bits long), i.e. 16M sections. That’s a network size over 0.8B (avg 50 nodes per section as in your table)
And this file can be further indexed by a 2MB block, with each bit represents whether the slot contains a candidate or not.

Fraser · January 12, 2018, 9:37am

So, you’re saying if our section’s Prefix is e.g. 000, we cache a total of 8 keys, one for each of 000, 001, … 111 on the reasonable assumption that all sections across the network have the same prefix length as us? Then if our section splits, we create more keys until we have a total of 16 cached covering 0000, 0001, … 1111?

So you mean we generate a key pair and if the public key matches an empty slot in this key cache file we store only the private key in the cache? Then if we need it, we regenerate the public key from the private one?

Assuming I’m understanding this approach correctly, I’m not really that keen on it to be honest. I don’t like the cost of the background keygen work, nor the overhead of managing the cache (we’d presumably want to encrypt the cache file if it contains a pile of private keys?) Also, come time to relocate, if the cache didn’t contain a suitable key we’d be stuck with having to create a keypair on the fly, i.e. more or less back at square one.

Qi_Ma · January 12, 2018, 10:05am

it does’t store 8 keys first, then 16 keys, then going on. It just assume the max prefix length at beginning and then store any keys. i.e. we store 01011101110100111000110001... at the beginning, then use it for 010 or 01011 etc.

This is kind of sacrifice space for time, and does have the risk of miss hit when network grows super large (unless you want to create a super large lookup file)

I agree this may incur other concerns, such as protection of private key. So this shall be totally optional.

Regarding store private key only, seems I am wrong with that. It has to be the key-pair to be stored.
Then in this case it shall be a seed to be stored, and the user has to use a seeded key generator?

Fraser · January 12, 2018, 10:27am

No, I think libsodium provides the functionality to extract the seed and public key from just the private key. I’m pretty sure that’s not exposed through rust_sodium, but it should be trivial to add it.

ustulation · January 12, 2018, 11:31am

Maybe just worth mentioning here about an orthogonal case which could make using HDKey(like) concept more difficult: there’s a rule that each time you get from Adult to Elder (promotion) you need to come up with a new Public-key (which is your ID) in order to make the Live/Gone blocks unique and non-replayable. This new Key ofc has to be chosen in the range given by the Elders who are promoting you and you will need to permute to fit that range. Without this you can have, within the same current prefix, Live(A) -> ... -> Gone(A) -> ... -> Live(A) where Gone(A) can now be replayed (or even Live(A)). So to make it unique, the currently discussed plan is to have Live(A) -> .. -> Gone(A) -> .. -> Live(A-new) -> .. -> Gone(A-new) -> .. -> Live(A-new-new) kind of a strategy. If we stick to this rule then we were contemplating that there will be places where we would require iterative key-generation to fit a range anyway.

Fraser · January 12, 2018, 11:38am

Again, the random bytes XORed with the existing public key would allow for unique Live blocks at minimal cost to all involved. In this case, the leading (most significant) bits of the random data would be tweaked to ensure the promoted peer’s new ID fits the target address range.

ustulation · January 12, 2018, 12:02pm

Yes with a slight side-effect of your ID no longer being you public-key (requiring a separate ID by XORing PK+RandXOR and actual PK to verify us) - which may/may-not be that bad a thing.

ustulation · January 12, 2018, 12:07pm

This is the part that makes me most uncomfortable about the background work - somehow in my mind it’s highly likely to not fit once a range constraint is given.

dirvine · January 12, 2018, 12:45pm

I am unconvinced that the work required to generate the correct key is beyond what a node should be able to do easily.

mav · January 12, 2018, 10:37pm

Good point. I guess my bias to worst case scenario is coming through.

The bitcoin-style lottery only applies to finding the first key from a fresh key cache, but as you say previous guesses can be stored to make future relocations hopefully faster via lookup.

My concern is that if I’m running multiple vaults (which I will) it makes good sense to share the key caches. And then it’s an obvious step to share them with trusted friends. And then share them with other people (maybe for a price). And then message security becomes questionable but not in any obviously noticable way. And then a major key provider suddenly realises they control most of the messaging on the network. And then the major key provider gets hacked. And so on down the slippery slope.

Individual vaults will discard duplicate prefixes, but this is a waste. It’s more economical to keep the duplicates but since the person generating them can’t use the duplicates they must try to sell them so the work isn’t wasted. So this is a pretty clear incentive to sell keys if you possibly can to avoid discarding duplicated work.

Because work is not discarded, people who start generating keys early have an advantage over those who start generating keys later. It creates an incentive for early participants to establish many vaults to make long section prefixes so newcomers require lots of initial work and the work of early participants gives them an even greater headstart. If I had a lot of resources I’d definitely want the network to grow large to push smaller participants out of viability. Sounds a lot like the direction of bitcoin mining to me.

Back to the optimist angle, this only applies to very large networks, since key generation is viable even for ‘quite large’ networks. But I feel by the time this becomes a problem it’ll also be the hardest time to be solving it. Best to address it beforehand.

Agreed except for encrypting the cache, since the process managing the key finding presumably runs at the same security level as the vault itself, which has the keys and must be secure in the first place.

I think storing privkey + pubkeyprefix is enough. Pubkey can be derived from privkey, but there needs to be a way to quickly look up pubkeyprefix.

Doesn’t the block include the prior block hash, making every block unique? How are they chained? Doesn’t the ‘chain link’ make every block unique, thus preventing replay?

I’m almost there, but not quite.

The amount of work, sure, it’s probably doable most of the time by most computers.

But the incentives that arise becauase of it are terrible. It’s a waste of time and energy on a not-productive-to-end-users exercise. It’s inelegant. It’s poor design and engineering. It’s less scalable.

The table of guesses can be used to see how many vaults really struggle to find a specific key.

For example a network size of 10K vaults, using the row for prefix length 8:

              | probability of generating a valid key  |
              |      0.1      0.5    0.9   0.99 0.9999 |              
Prefix Length |                                        | Sections Vaults
            8 |       27      177    588     1K     2K |    256    13K

This means 10% of vaults (ie 1000 vaults) will find a key within 27 guesses, or rather, 90% of vaults will not find a key within 27 guesses.

50% of vaults will not find a key within 177 guesses.

10% of vaults will not find a key within 588 guesses.

1% of vaults (ie 100 vaults) will not find a key within 1K guesses.

0.01% of vaults (ie 1 vault) will not find a key within 2K guesses.

This idea becomes quite significant at higher network sizes, eg 7B vaults means 700K vaults will not find a key within 1B guesses. This growth in failure as the network grows is quite daunting.

But by the same token, to be an optimist, 1B guesses is about 10h on todays hardware, so in less than half a day almost all vaults would be able to find a key and be able to take part in a 7B node network. That’s pretty fine by me.

But a 7T node network? I think problems will start. Do we care about 7T networks? I’d like to think so

Thinking pragmatically, it’s probably worth considering whether the proposed changes are needed from the start or whether it’s possible to add them later. Does it affect consensus? Can two modes of operation be run side-by-side and one phased out gradually? It’s a complex question which I haven’t thought about yet. But it does affect what work gets done now vs later so I think it matters.

Ultimately I feel it comes down to the economic model for safecoin and how the incentives affect network size. If the network never becomes massive then keygen isn’t a problem. But if it becomes massive then keygen will get ugly. The size of the network imo only depends on the as-yet-undesigned safecoin incentive structure.

Exponential growth is real. We can’t ignore it.

rob · January 12, 2018, 10:54pm

What do you mean by today’s hardware. RPi3? or 8th gen i7 10+ core?

That 10hour might be 100 or more hours on a RPi3. I’d say if we can be seeing vaults like commodity items then we may reach the greater than a billion vaults within a reasonable time. But if it takes days or a week (== 10Hr on PC) to join the network then that is a worry.

If this one piece of work prevents SBCs like RPi3 which will be with us for at least 5 more years, then we will lose a major source of vaults and if we limit it to PCs then its a reducing market.

If we want phones then we need to get rid of the “brute-force” method and your HD keys method still seems to be the better way.

We have to think in terms of each >2TB drive produced could eventually have a small SBC attached and be a vault.

Its not a case of thinking of what size it will in reality be, but planning for the best outcome. The best outcome is that every internet connected home will have a SBC+drive vault. Just like every home used to have a rotary dial phone and those who didn’t wanted one. (at least in the 1st world)

And its worth noting that if we are to distinguish ourselves from blockchain then we cannot be wasting energy on a brute-force style method where a logical method does the job easier.

The cry will be “energy wasters”. Its just like blockchain mining. And its a billion nodes wasting energy not a few hundred. (yes people won’t consider its only 10 hours to 100 hours the energy is used)

Fraser · January 12, 2018, 11:43pm

If the keys were held in memory I’d agree, but I believe Qi’s proposing holding the cache as a file on disk. In that case I think it’s safer to just encrypt the cache.

Don’t want to drag us off topic here, but the current design doesn’t include the hash of the previous block. Some brief points which hopefully don’t confuse more than help:

an absolutely strict order isn’t required (say peers A and B both drop at the same time from the same section, then some remaining peers could validly record this as A first and others as B first - both perspectives are valid)
the signers of a block help in identifying where that block can be valid in the chain
the type of the block helps in identifying where it fits in the chain (e.g. if an elder peer leaves the section either via relocation or due to dropping, then the following block should be promoting an adult or infant to replace it)

+1

Yes, this is so often the case! I feel the network will be most vulnerable to attack while still relatively small. Maybe not in terms of the number of individual attack attempts but in terms of difficulty (given the ability to run x malicious vaults, this will be a higher proprtion of a small network) and in terms of the impact of a successful attack. So there’s a strong incentive to get things secured properly from the start.

Optimisation is a different thing entirely. Assuming a given design is sub-optimal but still secure and feasible for a small/medium sized network, then it’s much easier and probably more sensible to postpone optimising.

I feel the issue we’re discussing here though is a bit of both. The brute force keygen is a sub-optimal design in my opinion, but the targeted relocation is a security concern. Even that’s over-simplifying the case - not using targeted relocation could prove to be infeasible if we end up with highly imbalanced sections!

I guess I’d be more comfortable if we even took the middle ground and went for random relocation and stuck with brute force keygen. But I still feel we can do better than that from the outset! I’d need to do a lot more reading about HDKeypairs to be fully convinced, and even then I think I’d still be slightly more in favour of the random data being built into a peer’s ID since it seems to have a couple of benefits over the HDKeypair approach.

Fraser · January 12, 2018, 11:56pm

While I do sort of agree here, I think this is probably overstating the case. Mainly because the brute-force method will only be used relatively infrequently when compared to bitcoin’s proof of work applied to every single block in the chain. Bitcoin mining is mostly wasted energy, whereas our brute-force keygen energy sink should only be a relatively small proprtion of the overall farming effort.

rob · January 13, 2018, 1:44am

I am thinking of the nay sayers overstating the case, and while 90% of the population do not understand the mechanics some/many/most? will end up believing the shock-jocks. I personally realise the energy wastage is not much in the scheme of things, but we will end up lone voices against the few who spread that claim of energy wastage.

My major concern is with the time taken for new/restarted nodes being relocated to do this brute-force when the network is of decent size.

I am thinking that for a few years while RPi3 (or 4) is a typical cheap SBC that we may make them unsuitable for connecting to the network, even though they can operate as a Node under all other conditions. I am also thinking that if we want smart phones to be capable of being nodes (on charge/wifi) that we need to be very mindful of the time required. No good for smartphones if they cannot even connect (relocate) while on charge overnight.

Also I feel and am repeating myself it seems that we might want to be able to have “smart drives” where one puts a RPi3 (or equivalent) on top of a Hard Drive (or SD card) and these become a major method of getting vaults/Nodes into the homes. Imagine paying 20$ + cost of Harddrive or SD card and the home becomes a node in the network and the homeowner gets paid for it.

If it requires a i7 PC to get connected in reasonable time then adoption is going to be real slow. Remember most people still turn off their home computers/laptops every day. Having a SBC Node solves that issue and is a very low energy device which solves the ongoing cost.

mav · January 13, 2018, 4:04am

Maybe we’ll see usb keygen asics, like we saw consumer-grade usb bitcoin miner asics. So rpi + hdd + keygenasic could be a solution…? Not ideal but definitely possible.