Data Chains - Deeper Dive

There is a little more to add. It may make it easier, or worse :smiley: I never know. Anyway here goes.

There are several consensus mechanisms in play; the two main ones are:

  1. Running Network Consensus
  2. Off-line consensus (data chain).

On the network a group can tell another group something and it will be believed. The other group is the next group on the binary tree (the XOR space). So in A->B->C, A cannot usually tell C something directly, as it is not connected to it; instead it tells B, which is connected to A and can also see the membership changes etc. Then B tells C, and so on. That is pretty much running consensus.

Then there is chain consensus, where there is a progression of membership change events in the voting members of a section - these are the Elders, as you point out (thanks). As each membership change happens it must be signed by a majority of the previous membership. This can be tightened to beyond majority, where peers are penalised for not voting (but later - just take that as a future thing).
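As a rough sketch (a hypothetical structure for illustration; the real implementation is in Rust and may differ), the "signed by a majority of the previous membership" rule between consecutive Blocks might look like:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Block:
    members: frozenset    # section membership after this change event
    signed_by: frozenset  # peers that voted for this Block

def valid_successor(prev: Block, nxt: Block) -> bool:
    # Only signatures from the *previous* Block's members count,
    # and a strict majority of them is required.
    signers = nxt.signed_by & prev.members
    return 2 * len(signers) > len(prev.members)
```

So a five-member section needs at least three of those five members to sign the next Block before it is accepted onto the chain.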

There is a lot that can happen now. A client, for instance, on joining the network can connect to its close_group (which will also allow push notifications etc., so that is likely what will happen) and keep the last valid Block from that chain. On rejoin, the chain in that section must progress from that Block forward, cryptographically secured. This, amongst many other things, allows us to know we are on the right network etc.

If the network restarts, clients and peers do a similar thing: go back to the section they were in and things are again OK. Or they can be given data signed by the last known valid Block or a later generation of it (if they have been off for a while). This sets aside the fact that many clients sign data, so they can also be certain it is their data via that route.

Back to running network consensus and chain consensus for a second. As the chain comprises unique blocks with unique membership at the section level, we can do a trick: convert running network consensus into chain consensus. All we do is Vote as the current Elders on the event/data/upgrade etc. and we can create a Block for that event. It can sit on top of the chain, signed by the most recent Elders, and all is well in the world. So we can store basically any descriptor or data or pretty much anything like this, converting it to local (chain) consensus and locking it into the chain.

This trick also has another side effect: for a peer to join a section it can be given all data - in fact, all recent Blocks - and give the section Elders its Votes for all those Blocks. In terms of data it means the peer must get all the data and provide signatures for it to us. This replaces resource proof with a different kind of work: the network checking whether this peer can get all this stuff and sign it. So then we know the peer is good and holds the data - it signed to say so. Peers joining a section have worked hard and have signed that they hold the data, so when we tell them to deliver data to a client and they do not, there is no excuse; they should be punished (probably killed).

It goes on a good bit, but this is where I think we can start to really see the benefits of such a chain. Also as membership blocks are unique, we can prune the chain very efficiently. No need to grow forever.

There is a lot more but it is pretty exciting now to see many more folks getting involved in the thought experiments here.

Great post @JPL thank you.

I recall a post or two by David when he first had the idea which might be a basis for this. I may have saved the links.

I haven’t tracked down the post I’m thinking of yet, but here is where Datachains / Data_chains were first discussed I think.

Thanks David. Turning in now but I’ll read and digest tomorrow.

No worries John, thanks a lot for the précis though, really great.

Good one John! Just gave the OP one read but will certainly need more reads to get my head around it. This really shows how complicated it is to make this whole idea of groups and splits and merges autonomous. Try to imagine how much easier this would get with “MaidSafe Trackers” that act as servers providing service to nodes and controlling all the sections and Vaults, like BitTorrent etc. But that isn’t the idea, is it :wink: ?

It will be quite amazing when this stuff is tested and active. It’s a bit like the Linux kernel: it took quite some time to create and perfect, but once it became stable it started to pop up everywhere because it’s so good.

My guess is: when an elder dies it gets replaced by another “live” elder, thus the order.

Yeah, there is no way to reorder the datachain (that’s why it’s called a “chain”).

I have some questions and observations. The definitions are really dense and full of subtle implication; there must have been a lot of very detailed conversation to refine them into such solid little nuggets. I think many probably don’t see how much work these definitions represent, but it’s a seriously vast amount of work. They’re all intensely deep-meaning terms.

The primary sources I’ve been using are

The OP of this topic https://forum.safedev.org/t/data-chains-deeper-dive/1209

The Disjoint Sections RFC https://github.com/maidsafe/rfcs/blob/c63ff6636d0dc21d6939856cf3a32445c3855b8c/text/0037-disjoint-groups/0037-disjoint-groups.md

Bart’s simulation https://github.com/fizyk20/ageing_sim - a really excellent resource.


“Complete group: GROUP_SIZE peers with age >4 in a section.”

Is this ‘exactly groupsize’ or ‘at least groupsize’?

Confusingly, the simulation defines is_complete as exactly groupsize elders (which is confusing because elders may sometimes be age<=4). Furthermore, the simulation says all elders are adults (ie infant elders are treated as adults) but presumably only for the purposes of is_complete.

Some clarification of the definition would be helpful here.


“Elders: the GROUP_SIZE oldest peers in the section.”

Can the section have more than groupsize elders?

Is the tiebreaker only applied to network events, or is it also to elder membership?

I’m not clear on how to use the tiebreaker. I’ve reworded it in a way that makes sense to me.

Original:

“If there are multiple peers of the same age then XOR their public keys together and find the one XOR closest to it.”

My rewording:

If there are multiple peers of the same age then XOR their public keys together to get the public key xor distance (pkxd). Select the peer with the public key closest to pkxd (measured by xor distance).

Is the final ‘it’ in the original definition the same as ‘pkxd’ in the reworded definition, or is ‘it’ something else?

If ‘it’ is pkxd, the tiebreaker algorithm seems useless to me:

pk1 = 44225
pk2 = 98363
pkxd = pk1^pk2 = 77050
pk1d = pk1^pkxd = 98363
pk2d = pk2^pkxd = 44225

May as well just pick the largest public key value rather than do the xor distance comparison routine.
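The worked numbers above can be run directly; for any two tied keys, the XOR-distance comparison collapses to picking the numerically larger key:

```python
pk1 = 44225
pk2 = 98363
pkxd = pk1 ^ pk2   # 77050

# The XOR distance from each key to pkxd is just the *other* key:
pk1d = pk1 ^ pkxd  # == pk2 == 98363
pk2d = pk2 ^ pkxd  # == pk1 == 44225

# "Closest to pkxd" means the smaller distance, and the smaller
# distance always belongs to whichever key is numerically larger.
winner = pk1 if pk1d < pk2d else pk2
assert winner == max(pk1, pk2)
```

This is the basis of the objection: the whole XOR-and-compare routine is equivalent to `max(pk1, pk2)` in the two-peer case.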


Can a formal definition for Network Event please be given? I suspect it is:

  • elder joins
  • elder departs (leaves network or relocated)
  • section splits
  • sections merge

See fizyk20/ageing_sim/churn.rs for the list of these events.

From the code it seems any joining or departing node (ie not necessarily an elder) also triggers a network event.

See add_random_node and drop_random_node for the triggering of network events for non-elder nodes.


Not a question, just a point of observation

Sibling: A section differing from us in the last bit.

It’s worth explicitly clarifying that a sibling may be multiple children sections of the sibling prefix. This means if the sibling section has already split, there are still sibling vaults (just that they’re formed by multiple sections). I feel the definition doesn’t cover this (but probably doesn’t need to).

A similar idea applies to neighbours.

More generally, any prefix may specify a single section or multiple sections. There may be a section with exactly that prefix on the network, or one with a shorter prefix, or multiple child sections with that prefix. I guess the point is there’s no such thing as an ‘invalid’ or ‘nonexistent’ prefix.


“When a section does not have a complete group, all oldest peers up to GROUP_SIZE are Elders and all section members influence node aging.”

What does ‘influence node aging’ mean? I understand an incomplete group may have infants for elders, but I don’t understand what that means beyond normal elder behaviour.


What is recd in handle_vote_recd?


Should I test for groupsize+buffer ‘elders’ or ‘adults’ in each hypothetical new section when testing for a split?

Likewise does a merge happen when a section has less than groupsize elders or adults?

The simulation says should_split depends on adults only (ie not elders) and same for should_merge.

The Disjoint Sections RFC is the closest I can find to a formal definition of split and merge, but does not explicitly deal with ageing or elders vs adults.


Is the following disallow rule correct? Where is it defined?

fizyk20/ageing_sim/section.rs#L198

“disallow more than one node aged 1 per section if the section is complete (all elders are adults)”


Superb writeup.

One nitpick: age can be more than 4. This matters because during merge the older nodes take precedence when deciding the elders.

Hi! Thanks for your interest in the proposal. I think I can say for us all at MaidSafe that we are impressed with how deeply you have read the document :slight_smile:

Just to note - this document is still actively being discussed, so large parts of it might still change - but we will update the thread every time some changes happen.

Bart’s simulation https://github.com/fizyk20/ageing_sim - a really excellent resource.

I’ll take this opportunity to say: thanks! Glad to see it proved useful to you :slightly_smiling_face:

Is this ‘exactly groupsize’ or ‘at least groupsize’?

Confusingly, the simulation defines is_complete as exactly groupsize elders (which is confusing because elders may sometimes be age<=4). Furthermore, the simulation says all elders are adults (ie infant elders are treated as adults) but presumably only for the purposes of is_complete.

Some clarification of the definition would be helpful here.

It is “at least groupsize”. The point is that we want to have all Elders being older than 4, but when the network has just started its operation, there won’t be enough peers with such an age. Once there are enough of them, we call it “having a complete group” and start operating a bit differently (I’ll explain more in the answers to your other questions below).

What the simulation does is check whether we have GROUP_SIZE Elders in the section (we can have fewer if we are just starting up and have fewer than GROUP_SIZE peers in the whole section - we obviously don’t have a complete group then) and, if yes (the && operator), whether all of them are adults (which is defined elsewhere in the simulation as being older than 4 - in the proposal we actually use the term “Adult” differently, see below). So effectively we check whether there are at least GROUP_SIZE peers in the section and whether the GROUP_SIZE eldest peers are older than 4 - which is how a complete group is defined.

Regarding the meaning of “Adult” - in the simulation it’s just a peer that is older than 4, period. In the proposal, we use it as meaning a peer older than 4 not being an Elder. This is a difference that might be confusing, and we probably should have a separate term for the meaning I used in the simulation to be more clear.

One more potential point of confusion is in the results printed by the simulation - “Adult” is actually used there in the meaning from the proposal… This will definitely have to be fixed :stuck_out_tongue:

Can a formal definition for Network Event please be given? I suspect it is:

  • elder joins
  • elder departs (leaves network or relocated)
  • section splits
  • sections merge

This part has been changing quite a bit recently, so no wonder you are confused :wink: We use the term “Network Event” for anything that happens to the section and will be recorded in the Data Chain - which, as you correctly guessed, is mostly the 4 kinds you wrote.

The code is actually based on an older version of the proposal, where if some events need different treatment (like an elder leaving and an elder relocating), we considered them separate Network Events. The current proposal focuses on the effect an event has on the section, so both elder leaving and elder relocating are now Dead. This might get updated in the simulation, but it’s rather low priority, as it’s probably not really important to the results.

From the code it seems any joining or departing node (ie not necessarily an elder) also triggers a network event.

In a way, yes. We usually call an Adult or an Infant joining an event as well, but it might not be recorded in the Chain and might not influence node ageing. To be precise, Infants are only recorded in the Chain and influence node ageing while we don’t yet have a complete group, and Adults - in the proposal’s meaning, so not being Elders - are never recorded and always influence node ageing. I’ll elaborate on that node ageing part later in the post.

May as well just pick the largest public key value rather than do the xor distance comparison routine.

You are right, provided that the tie is just between two peers, and it can often be between more of them. But nevertheless, what you pointed out is a significant issue, so we will modify that part - thanks for that!

We will actually use a hash of the XORed names, then choose the one closest by XOR distance to that. So in your algorithm, this would mean something like pkxd = hash(pk1 ^ pk2), with the rest unmodified.
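As a sketch of the revised rule (assuming 256-bit key values and SHA3-256; both are illustrative choices here, not confirmed implementation details):

```python
import hashlib
from functools import reduce

def tiebreak(public_keys):
    """Pick a winner among tied peers: XOR all the tied keys
    together, hash the result, then choose the key that is
    XOR-closest to that hash value."""
    xored = reduce(lambda a, b: a ^ b, public_keys)
    digest = hashlib.sha3_256(xored.to_bytes(32, "big")).digest()
    target = int.from_bytes(digest, "big")
    return min(public_keys, key=lambda pk: pk ^ target)
```

Because XOR is commutative, the result is the same whatever order the tied keys arrive in, and the hash step presumably removes the "largest key always wins" degeneracy pointed out above.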

Not a question, just a point of observation

Sibling: A section differing from us in the last bit.

It’s worth explicitly clarifying that a sibling may be multiple children sections of the sibling prefix. This means if the sibling section has already split, there are still sibling vaults (just that they’re formed by multiple sections). I feel the definition doesn’t cover this (but probably doesn’t need to).

I think we actually don’t want to extend the meaning this far. The point of the sibling is to refer to a section that a given section will merge with, if it needs to merge - which will always be a section with a prefix of the same length and the last bit different. If such a section does not exist (which can happen, as you correctly state), we will have to wait until all the children of our sibling merge into the actual sibling section, and then merge with that section.

This might seem unnecessary, as we could just merge all the sections in one go - which we actually tried when initially implementing Disjoint Sections, but it turned out to be too problematic - bringing just two sections to consensus is challenging enough, let alone more of them :wink:

What does ‘influence node aging’ mean? I understand an incomplete group may have infants for elders, but I don’t understand what that means beyond normal elder behaviour.

This touches upon how node ageing actually works.

With node ageing, a node gets older when it is relocated. It is relocated when a network event hashes to a value ending with enough 0 bits. But which network events do we hash? The idea here is that since Infants will probably usually be nodes that only joined the network for a moment, we don’t want to take into account the events they generate - they are “noise” overlaid on the “signal” of Elders and Adults, so to speak :wink:

We can’t always ignore Infants, though - if we do, then when the network starts and everyone is an Infant, no one will age! Every event will be ignored, as it was generated by an Infant, so nobody will get older, so everyone will remain an Infant, so… you get the idea. We need to take the Infants into account at the beginning, but we don’t want to do it later - so we decided to mark the point of transition at the moment of getting a complete group. Once we have a complete group, we decide that we have enough Adults and we don’t need to take Infants into account anymore, so we stop doing that.
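A minimal sketch of the "enough 0 bits" mechanic (assuming, for illustration only, that a peer of age `a` is relocated when the event hash has at least `a` trailing zero bits - the proposal's exact rule may differ):

```python
import hashlib

def trailing_zero_bits(digest: bytes) -> int:
    """Count the trailing zero bits of a hash digest."""
    value = int.from_bytes(digest, "big")
    if value == 0:
        return len(digest) * 8
    # value & -value isolates the lowest set bit
    return (value & -value).bit_length() - 1

def relocates(event: bytes, age: int) -> bool:
    # Each extra trailing-zero bit required halves the probability,
    # so relocations get exponentially rarer with age.
    digest = hashlib.sha3_256(event).digest()
    return trailing_zero_bits(digest) >= age
```

That halving per year of age is what produces the exponentially increasing period between relocations that the RFC calls for.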

Should I test for groupsize+buffer ‘elders’ or ‘adults’ in each hypothetical new section when testing for a split?

We never have more than GROUP_SIZE Elders, so it’s Adults (in the simulation’s understanding of the term; Adults plus Elders, if we exclude Elders from being Adults, as mentioned above).

Actually, we recently decided to allow the number of Elders to grow to GROUP_SIZE+1 briefly - but it’s still being discussed, and not reflected in the proposal. That being said, it’s Adults here, anyway :wink:

Likewise does a merge happen when a section has less than groupsize elders or adults?

Adults again, and it’s even when we have exactly GROUP_SIZE of them, not less.

If we only have GROUP_SIZE Adults, it means that all of them are Elders, which means we have no buffer if we lose another Elder - we would have to promote an Infant, then. So we decide to merge, which will give us the needed shot of fresh Adults from our sibling section :wink:

The simulation says should_split depends on adults only (ie not elders) and same for should_merge.

Yes - but bear in mind that for the purpose of the simulation, everyone with age > 4 is an Adult (so if some of them are Elders, they are still Adults, too).

Is the following disallow rule correct? Where is it defined?

fizyk20/ageing_sim/section.rs#L198

“disallow more than one node aged 1 per section if the section is complete (all elders are adults)”

I think we forgot to mention this in the document, so apologies for that - and thanks for pointing it out!
This rule is needed to avoid sections having huge numbers of Infants, which is actually a result from the simulation.

It wasn’t there initially, but because the age of peers grows roughly logarithmically with the number of churn events experienced by them, a single peer has to wait around 32 events until it becomes an Adult. This means that we will have 32 Infants for every adult in the section, which means over 800 Infants in a section that is about to split - clearly a huge number.

In order to prevent such situations, we decided to stop new age-1 Infants from joining when we already have such an Infant. This means that some of the Infants we already have will be relocated before a new one joins the network, so the number grows much slower. We still get 50-100 infants per section even with this rule, but this is actually manageable thanks to Infants only being connected to their own Elders and nobody else.

Also worth noting here is that this is again an example of a rule we can’t enforce from the beginning - the first node in the network would be an age-1 Infant, and then nobody else would be allowed to join. This is why this rule is only enforced once we have a complete group, just like with calculating relocations.

One nitpick: age can be more than 4. This matters because during merge the older nodes take precedence when deciding the elders.

I’m not sure what you mean by that, but you seem to be mostly correct. Just to make it clearer - the age can indeed be up to 255 (but we don’t expect such ages to appear in the network, as they would require about 2^255 churn events, which is almost as many as there are atoms in the Universe :wink: ), but no one actually “decides” the Elders - they are always the oldest nodes in a section. So if you just mean that the older nodes become Elders first, then you are totally correct; and if you meant something else, well, I hope I clarified this a bit :wink:

I realise that this is a work in progress but what exactly is going to go into the blocks of the data chain?

So we have the network events including

elder joins
elder departs (leaves network or relocated)
section splits
sections merge

But what about the data bit? Is each chunk of data, including mutable data, going to get its own block containing a pointer to that data, with a new block created every time the state is changed (for MD)? If so I should imagine pruning would be needed sooner rather than later to keep it to a manageable size.

Another thing I’m not quite clear on is the Running Network Consensus - so just the most recent block(s) of the chain get shared with the Elders in the sibling sections, is that correct? And presumably that would only be for section membership, rather than a record of the data looked after by that section? So will there be two datachains effectively, one for peers and one for data?

Yes, exactly.

Yes, there is no need to keep version - 1 in the chain at all. If we did, then it would be a ledger type where every mutation is a new put payment (so we would keep all old copies/transactions, like a blockchain).

Our approach to data and Adults & Infants is a pattern called bubble up. Each new Elder has to accept that data and vote for it again (like a kind of resource proof). So the Blocks for these things are always voted on by the most recent members of the chain (in fact, always voted on by all current members of the chain, ie the Elders).

We have not published this part yet as we have not got 100% consensus on the design of the minutiae of the data chains implementation, though we are pretty close. Then we will implement Phase I and publish Phase II (data), which is actually Alpha 4.

The recent research obviously negates some old docs and it takes time to finalize and filter through, but in RFC-0045 Node Ageing (more than one year old) there are some things I’m unclear about.

Mostly it matches extremely closely with the op, which is kinda cool considering how much work has happened since rfc0045 was written.


“Groups will only allow one node with an age of zero.” from rfc0045

The simulation says initial age 1, but rfc0045 says initial age 0. Maybe the simulation is doing a shortcut for this part of the rfc:

“This means the node itself on successful join will be immediately relocated.” and thus have age 1 at the start.

This would seem to affect the disallow rule, making it age==0 rather than the simulated age==1. Is there any difference in reality? Just trying to grasp whether initial age 1 is a shortcut or a reality.


“If there exists more than one node to relocate then the oldest node should be chosen.” from rfc0045

This goes directly against the OP, which favours relocating the youngest node. I assume the OP is correct and the doc needs changing, but it’s interesting because this simple change seems to have a very big impact on the ageing and splitting algorithm: the new algorithm makes far fewer adults in young networks. I think favouring the younger is the correct approach, but the difference in docs is worth noting anyhow.

The relocation tiebreaker has also changed between rfc0045 and op, just worth noting, nothing further to be said.

“If two nodes have the same oldest age then the one that has been present for the most churn events is chosen to relocate.” from rfc0045, now changed as per op.


“When a churn event does force a node to relocate to a destination group (D)… The relocating node then generates a key that will fit in D and attempts to join D.” from rfc0045

Is this key generation process still applicable? It seems unavoidable but is not mentioned in op (probably it’s out of scope for op).

This process gets exponentially harder as the section prefix / network size grows. Probably worth simulating how hard it becomes and when this becomes ‘too hard’. It may be a potential issue but also may never become one in the real world; can’t say off the top of my head.

Some rough calcs:

Generating an ed25519 key pair (kp) takes about 68000 cycles on an i7 2.1 GHz machine, which is about 30K kp/s.

This means it takes about 1s to find a key with prefix of log2(30K) = 15 bits.

A network of 1B adult nodes has about 1B / 12 = 83M sections.

This means a prefix length of log2(83M) = 26 bits.

Assuming finding a kp for 26 bit prefix takes 2^26 attempts that’s about 2^(26-15) seconds = 2048 seconds = 30 minutes.

Reckon this might become an issue? This is all back-of-the-envelope so I am looking for holes in my reasoning.
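Rerunning the rough calcs above with the same assumed figures (note the post's power-of-two rounding lands at ~34 minutes, while exact arithmetic gives closer to 45):

```python
import math

cycles_per_keypair = 68_000                  # quoted i7 figure
clock_hz = 2.1e9
kp_per_sec = clock_hz / cycles_per_keypair   # ~30.9K keypairs/s

matchable_bits_per_sec = math.log2(kp_per_sec)  # ~14.9, ie ~15 bits/s

adults = 1_000_000_000
sections = adults / 12                       # ~83M sections
prefix_bits = math.log2(sections)            # ~26.3 bits

# Expected attempts to hit one specific section prefix = number of sections
expected_minutes = sections / kp_per_sec / 60
print(round(expected_minutes))               # prints 45
```

So the reasoning holds up; the only quibble is that rounding 30.9K up to 2^15 and 83M down to 2^26 understates the exact expectation slightly.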

1 billion adults, or 83M sections - doesn’t that seem an unlikely number? If each section has 50-100 infants, that’s at least 5-10B total vaults. Even counting a good percentage of mobile devices as Vaults, that seems excessive to me. In effect the whole planet would need to become farmers.

For example, with 10M sections (about 600-1200 million Vaults) the time to find a key is reduced to about five minutes.

At the moment it is reality. I suspect (personally) that infants may start older, say at 3 or 4, to still prove worth but not have so many with very short lives in a section (just relocation hassle).

Yes this is also an update, to prefer relocating and aging younger peers.

Yes this is still the case, IIRC we can create something of the order of 80,000 keys per second with Dalek (it may be 15000 or so with rust sodium).

Yes this is a nice thing to see, lots of tweaks really, but that rfc was written with data chains in mind really, so it is good that not much changed.

Back in the days of bitcoin mining, when the total network hashrate was of the order of gigahashes, it was hard to believe that in a few years the network would be doing exahashes as it is now (I was part of mining when it was on the order of terahashes, and the continual and sustained growth I see in mining is mind-boggling).

So I don’t think 1B adults is unlikely. Not at all.

But what I do think is that if that happens then ASICs will be developed for these sorts of operations (assuming a delayed relocation actually somehow causes loss of income) - for generating keypairs and definitely for signature operations. But it may take some time, and maybe the network won’t become that large. It’s all speculation.

Sure, but if it goes the other way up by a factor of ten beyond what was expected then it becomes 5h, then another factor of ten is 2d, and this sort of growth is common in these networks. So I think it warrants some consideration. 2d to relocate a node? I’d be worried.

I suppose the risks of delayed relocation need to be further explored too; maybe delays are relatively benign.

I suppose it should also be considered that the name must match, ie sha3(pubkey), so the hashing slows the search down considerably. I don’t think the /dev/random vs /dev/urandom performance matters in this case so sourcing entropy shouldn’t be a factor in the performance. I’m not saying there is definitely a problem, but I do think it warrants careful thought since I could see it becoming an issue.

We have also tested the distribution of public keys and sha3 hashes; they are pretty much equivalent, so the name will be just the public key and not the hash of it. This means we can ignore a hash cost. Just more info - not disputing that we do need to be careful. If it were to become an issue then I feel there are a lot of ways around it that we could adopt. I am looking at some tricks regarding that very issue for a different reason right now, mind you. Anyway, hope this info helps.

This means a vault can be both an adult and an elder at the same time. So ‘non-elder adults’ is a category distinct from ‘adults’ under these definitions, ie there are 4 distinct categories: infants, non-elder adults, elders, and adults (which is non-elder adults plus elder adults).

‘non-elder adults’ is a bit awkward so I propose using four category names

  • infants: age <4 (same as original)
  • adults: age >=4 (same as original)
  • elders: groupsize oldest peers in section (same as original, can be infants and/or adults)
  • bystanders: non-elder adults (open to suggestions for bystander!)

Vaults can be in multiple categories at the same time, eg adult bystander.
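A sketch of the proposed four-way classification (the exact age boundary is an assumption here; this post uses age >= 4 for adults, where other parts of the thread say 'older than 4'):

```python
def categories(age: int, is_elder: bool) -> set:
    """Classify a vault under the proposed category names.
    Boundary per this post: infant age < 4, adult age >= 4."""
    cats = {"infant"} if age < 4 else {"adult"}
    if is_elder:
        cats.add("elder")       # elders may be infants or adults
    elif age >= 4:
        cats.add("bystander")   # non-elder adult
    return cats
```

For example, `categories(6, False)` gives `{"adult", "bystander"}`, while a young elder in an incomplete group gives `{"infant", "elder"}`.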

I think ‘adults’ is used in a way to mean ‘non-elder adults’, right? ie bystanders in the above categories?

Please can the use of terms be settled and used consistently. Let’s not use ‘adult’ interchangeably for ‘non-elder adults’ and ‘non-elder adults plus elder adults’. This inconsistency is taking a lot of needless effort to interpret the ‘adult’ category when it’s used differently in context-specific ways.

If I’ve missed something and it’s simpler than I make out please let me know because I’m feeling like reading ‘adult’ is harder than it needs to be.

Not sure what you mean here. An Adult is an Elder iff it is one of the GROUP_SIZE oldest Adults, but it cannot be both at the same time. It may be that the definition is worded in a confusing way, though.

Here what we are saying is that the section has come down to GROUP_SIZE Elders and there is no Adult to promote if we were to lose an Elder, so we are in a danger zone and will merge now.

Hope that helps.

[Edit] I think I see the issue and yes, it is possibly unclear wording. The reason the definition says what it does

is that, when the network starts, there may only be Infants to make up the Elder group. At this stage, though, we have an incomplete group and all peers may be Elders. Once we have a complete group, only non-Infants will be Elders.

Ah I see so ‘adult’ is

Adult: in a section with a complete group a peer with age >4 that is not an elder. They are only connected to their section Elders.

Thanks for the clarification.

There’s a difference between ‘adult’ and ‘vault age >4’… this is my misunderstanding. I had thought they were the same but they are not.

Some exploration of the ageing process…

Purpose

The purpose of ageing is to “ensure that nodes are relocated and therefore dilute their impact on the network across the address range” so “that an attacker or large participant in the network has an amount of work that must be carried out that is prohibitively expensive” (rfc-0045 node ageing).

I would paraphrase this as ‘balancing inclusiveness vs restrictions’.

There was an issue with the recently proposed ageing mechanism (ie the first post in this topic) that “made the network accept barely any new peers” (2017-12-14 update).

This post explores one potential cause and solution for the issue.

Cause

The simulation below indicates the issue is “it’s too difficult to age” rather than “too many nodes are disallowed”; the disallowing is a symptom, not a cause.

The ‘problem’ must be looked at from the perspective of ‘a prohibitively expensive amount of work’ required to ‘impact the network’. Extreme difficulty in ageing is counterproductive. Even though it makes it expensive to impact the network it also makes it expensive to participate in the first place. Ageing protections must not come at the cost of inclusiveness.

Making it expensive to join the network goes against the ‘everyone’ part of ‘Secure Access For Everyone’.

The aim of ageing is to make it expensive to impact the network without making it expensive to participate in the network. Maybe this is not the aim? Or not possible to achieve? Anyway, that’s a different discussion perhaps. Let’s look at the ageing mechanism.

Simulation

One of the major changes from the rfc and the new proposal is that originally the oldest possible vault would be aged, but now the youngest possible vault is aged.

The following tables show the difference between a preference toward younger vs older vaults.

A simulation of these two different preferences is shown below. It involves creating a network of 100K vaults (500K joins interleaved with 400K departures).

YOUNGEST PREFERRED

age vaults
  1  29712
  2  67461
  3   2827

100000 total vaults
  6186 total sections

OLDEST PREFERRED

age vaults
  1   4274
  2  16337
  3  38242
  4  21057
  5  10171
  6   4759
  7   2677
  8   1366
  9    691
 10    287
 11    113
 12     22
 13      3
 14      1

100000 total vaults
  1607 total sections

Caveats

Youngest-preferred never hits the disallow rule, since there is never a Complete Section formed.

For oldest-preferred it was necessary to remove the disallow rule because the simulation got stuck at a single Complete Section which disallowed every new vault. I think the disallow rule needs tweaking or removing.

I suspect if the youngest-preferred eventually forms a Complete Section the disallow rule would also need to be tweaked or removed.

Thoughts

The rfc calls for an “exponentially increasing period between such relocations” which youngest-preferred does not achieve (but oldest-preferred does).

Youngest-preferred means an old vault can only age if a) its section has no younger vaults and b) it ‘matches’ the hash of the network event. This is an almost-impossible condition for old vaults (>3) to meet, because young vaults continue to accumulate in the section. It’s much harder than exponentially difficult.

Oldest-preferred seems to better match the intended difficulty of ageing and leads to an ‘expected’ age distribution. Ageing is exponentially more difficult as vaults get older, but is still common for younger vaults to age (ie exponentially easier the younger the vault is).

The key concept affected by the youngest-preferred mechanism is ‘difficulty’. Old (>3) vaults are nearly impossible to age, rather than merely exponentially difficult to age.

I think perhaps most important is to keep the intended purpose in mind. Simulations show the intuitive approach to ‘age youngest first’ does not achieve the intended purpose. I’m not saying oldest-preferred is necessarily the best solution, but it does seem to me that youngest-preferred does not function as intended because it’s far too hard to become an adult.

Other possible things to try (but I think would not replace the older-preferred mechanism) are

  • age all vaults in the section that match the network event rather than just one, but still only relocate one vault to avoid massive churn.

  • tweak the disallow algorithm to have a gradually-stricter-rate-of-disallowing rather than a strict wall. This allows more chances for relocations and thus more chances for ageing.

It’s complex. But it always comes back to balancing inclusiveness vs restrictions. I think the simulations show youngest-preferred fails to achieve this balance, but oldest-preferred does a good job.

Just something to mull over during the holiday season… looking forward to seeing what comes from maidsafe in the new year.

Good work there again @mav

I might nitpick one sentence. The reason for pointing this out is that it can alter one’s view on things.

S.A.F.E. is geared towards access - use of the network and its facilities - whereas being part of the network itself is not access but being a “worker”.

The reason for making the distinction is that it is valid to have some who cannot join as nodes, since they would slow things down; with too many of them, the network becomes slow and not so useful.

Mind you, your post deals with a much greater problem and is certainly good.

Good work @mav
