Hi! Thanks for your interest in the proposal. I think I can say for us all at MaidSafe that we are impressed with how deeply you have read the document.
Just to note - this document is still actively being discussed, so large parts of it might still change - but we will update the thread every time some changes happen.
Bart’s simulation (fizyk20/ageing_sim: Simulation of node ageing in SAFE Network) - a really excellent resource.
I’ll take this opportunity to say: thanks! Glad to see it proved useful to you.
Is this ‘exactly groupsize’ or ‘at least groupsize’?
Confusingly, the simulation defines is_complete as exactly groupsize elders (which is confusing because elders may sometimes be age<=4). Furthermore, the simulation says all elders are adults (ie infant elders are treated as adults) but presumably only for the purposes of is_complete.
Some clarification of the definition would be helpful here.
It is “at least groupsize”. The point is that we want to have all Elders being older than 4, but when the network has just started its operation, there won’t be enough peers with such an age. Once there are enough of them, we call it “having a complete group” and start operating a bit differently (I’ll explain more in the answers to your other questions below).
What the simulation does is check whether we have GROUP_SIZE Elders in the section (we can have fewer if we are just starting up and there are fewer than GROUP_SIZE peers in the whole section, in which case we obviously don’t have a complete group) and, if so (the && operator), whether all of them are adults (defined elsewhere in the simulation as being older than 4; in the proposal we actually use the term “Adult” differently, see below). So effectively we check whether there are at least GROUP_SIZE peers in the section and whether the GROUP_SIZE eldest peers are older than 4 - which is how a complete group is defined.
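To make the check concrete, here is a minimal Python sketch of the logic described above. It is not the simulation's actual Rust code: the value GROUP_SIZE = 8 and the helper names are assumptions for illustration, and a section is reduced to a plain list of peer ages.

```python
GROUP_SIZE = 8  # assumed value, for illustration only
ADULT_AGE = 4   # "older than 4" is an adult in the simulation's sense


def elders(ages):
    """Return the ages of the (at most) GROUP_SIZE oldest peers in a section."""
    return sorted(ages, reverse=True)[:GROUP_SIZE]


def is_complete(ages):
    """Complete group: GROUP_SIZE elders exist and all of them are older than 4."""
    group = elders(ages)
    return len(group) == GROUP_SIZE and all(age > ADULT_AGE for age in group)


print(is_complete([5] * 8 + [1] * 20))  # eight peers older than 4, plus infants
print(is_complete([5] * 7 + [1] * 20))  # one of the eight elders is an infant
print(is_complete([7] * 5))             # fewer than GROUP_SIZE peers in total
```

Because the elders are always the GROUP_SIZE oldest peers, testing for "exactly GROUP_SIZE elders, all older than 4" gives the same answer as "at least GROUP_SIZE peers older than 4 in the section".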
Regarding the meaning of “Adult” - in the simulation it’s just a peer that is older than 4, period. In the proposal, we use it as meaning a peer older than 4 not being an Elder. This is a difference that might be confusing, and we probably should have a separate term for the meaning I used in the simulation to be more clear.
One more potential point of confusion is in the results printed by the simulation - “Adult” is actually used there in the meaning from the proposal… This will definitely have to be fixed.
Can a formal definition for Network Event please be given? I suspect it is:
- elder joins
- elder departs (leaves network or relocated)
- section splits
- sections merge
This part has been changing quite a bit recently, so no wonder you are confused. We use the term “Network Event” for anything that happens to the section and will be recorded in the Data Chain - which, as you correctly guessed, is mostly the 4 kinds you wrote.
The code is actually based on an older version of the proposal, where if some events need different treatment (like an elder leaving and an elder relocating), we considered them separate Network Events. The current proposal focuses on the effect an event has on the section, so both elder leaving and elder relocating are now Dead. This might get updated in the simulation, but it’s rather low priority, as it’s probably not really important to the results.
From the code it seems any joining or departing node (ie not necessarily an elder) also triggers a network event.
In a way, yes. We usually call an Adult or an Infant joining an event as well, but it might not be recorded in the Chain and might not influence node ageing. To be precise, Infants are only recorded in the Chain and influence node ageing while we don’t yet have a complete group, and Adults - in the proposal’s meaning, so not being Elders - are never recorded and always influence node ageing. I’ll elaborate on that node ageing part later in the post.
May as well just pick the largest public key value rather than do the xor distance comparison routine.
You are right, provided that the tie is just between two peers, and it can often be between more of them. But nevertheless, what you pointed out is a significant issue, so we will modify that part - thanks for that!
We will actually use a hash of the XORed names, then choose the one closest by XOR distance to that. So in your algorithm, this would mean something like pkxd = hash(pk1 ^ pk2), with the rest unmodified.
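A rough Python sketch of that tie-breaker, to show the idea rather than the real implementation. SHA-256, the 32-byte names and the function names are assumptions for illustration, as is XORing all tied names together when more than two peers are tied.

```python
import hashlib
from functools import reduce


def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def break_tie(names):
    """Pick the tied name closest by XOR distance to hash(XOR of all tied names)."""
    combined = reduce(xor_bytes, names)
    target = hashlib.sha256(combined).digest()
    # Equal-length byte strings compare like big-endian integers,
    # so the smallest XOR result is the closest name.
    return min(names, key=lambda name: xor_bytes(name, target))


# Hypothetical 32-byte names derived from labels, only for the demo.
tied = [hashlib.sha256(label).digest() for label in (b"peer-a", b"peer-b", b"peer-c")]
winner = break_tie(tied)
print(winner.hex())
```

Since XOR is commutative, every peer computes the same target regardless of the order in which it lists the tied names, and hashing makes it hard to steer the outcome just by having the largest key.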
Not a question, just a point of observation
Sibling: A section differing from us in the last bit.
It’s worth explicitly clarifying that a sibling may be multiple children sections of the sibling prefix. This means if the sibling section has already split, there are still sibling vaults (just that they’re formed by multiple sections). I feel the definition doesn’t cover this (but probably doesn’t need to).
I think we actually don’t want to extend the meaning this far. The point of the sibling is to refer to a section that a given section will merge with, if it needs to merge - which will always be a section with a prefix of the same length and the last bit different. If such a section does not exist (which can happen, as you correctly state), we will have to wait until all the children of our sibling merge into the actual sibling section, and then merge with that section.
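As a small illustration of that definition, here is a Python sketch with prefixes written as bit strings. The function names and the set-of-prefixes representation of the network are hypothetical, not taken from the proposal or the simulation.

```python
def sibling(prefix):
    """Return the prefix of the same length that differs only in the last bit."""
    if not prefix:
        raise ValueError("the empty prefix covers the whole network and has no sibling")
    flipped = "1" if prefix[-1] == "0" else "0"
    return prefix[:-1] + flipped


def can_merge_now(prefix, existing_prefixes):
    """A merge can only start once the sibling exists as a single section."""
    return sibling(prefix) in existing_prefixes


# Section 00 wants to merge, but its sibling 01 has already split into 010 and 011.
print(can_merge_now("00", {"00", "010", "011", "1"}))
# After 010 and 011 have merged back into 01, the merge can go ahead.
print(can_merge_now("00", {"00", "01", "1"}))
```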
This might seem unnecessary, as we could just merge all the sections in one go - which we actually tried when initially implementing Disjoint Sections, but it turned out to be too problematic - bringing just two sections to consensus is challenging enough, let alone more of them.
What does ‘influence node aging’ mean? I understand an incomplete group may have infants for elders, but I don’t understand what that means beyond normal elder behaviour.
This touches upon how node ageing actually works.
With node ageing, a node gets older when it is relocated. It is relocated when a network event hashes to a value ending with enough 0 bits. But which network events do we hash? The idea here is that since Infants will probably usually be nodes that only joined the network for a moment, we don’t want to take into account the events they generate - they are “noise” overlaid on the “signal” of Elders and Adults, so to speak.
We can’t always ignore Infants, though - if we do that, then when the network starts and everyone is an Infant, no one will age! Every event will be ignored, as it was generated by an Infant, and so nobody will get older, so everyone will remain an Infant, so… you get the idea. We need to take the Infants into account at the beginning, but we don’t want to do it later - so we decided to mark the point of transition at the moment of getting a complete group. When we have a complete group, we decide that we have enough Adults and we don’t need to take Infants into account anymore, so we stop doing that.
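Here is a hedged Python sketch of those two pieces: which events are hashed at all, and the "ends with enough 0 bits" test. SHA-256, the byte encoding of an event and the required_zero_bits parameter are assumptions for illustration; how many zero bits count as "enough" for a given node is left to the caller, since it is not spelled out here.

```python
import hashlib

ADULT_AGE = 4  # peers older than this are not Infants


def trailing_zero_bits(digest):
    """Number of 0 bits at the end of a digest, read as a big-endian integer."""
    value = int.from_bytes(digest, "big")
    if value == 0:
        return len(digest) * 8
    return (value & -value).bit_length() - 1  # index of the lowest set bit


def counts_for_ageing(event_peer_age, group_complete):
    """Events caused by Infants only count while there is no complete group."""
    return event_peer_age > ADULT_AGE or not group_complete


def triggers_relocation(event_bytes, required_zero_bits):
    """True if the event's hash ends with at least required_zero_bits zero bits."""
    digest = hashlib.sha256(event_bytes).digest()
    return trailing_zero_bits(digest) >= required_zero_bits


print(counts_for_ageing(event_peer_age=1, group_complete=False))  # start-up phase
print(counts_for_ageing(event_peer_age=1, group_complete=True))   # Infant "noise"
print(triggers_relocation(b"hypothetical event", required_zero_bits=3))
```

With a well-mixed hash, an event ends with n zero bits with probability 1/2^n, which is where the roughly logarithmic growth of age with the number of events comes from.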
Should I test for groupsize+buffer ‘elders’ or ‘adults’ in each hypothetical new section when testing for a split?
We never have more than GROUP_SIZE Elders, so it’s Adults (in the simulation’s understanding of the term; Adults plus Elders, if we exclude Elders from being Adults, as mentioned above).
Actually, we recently decided to allow the number of Elders to grow to GROUP_SIZE+1 briefly - but it’s still being discussed, and not reflected in the proposal. That being said, it’s Adults here, anyway.
Likewise does a merge happen when a section has less than groupsize elders or adults?
Adults again, and the merge already happens when we have exactly GROUP_SIZE of them, not only when we have fewer.
If we only have GROUP_SIZE Adults, it means that all of them are Elders, which means we have no buffer if we lose another Elder - we would have to promote an Infant, then. So we decide to merge, which will give us the needed shot of fresh Adults from our sibling section.
The simulation says should_split depends on adults only (ie not elders) and same for should_merge.
Yes - but bear in mind that for the purpose of the simulation, everyone with age > 4 is an Adult (so if some of them are Elders, they are still Adults, too).
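Putting the split and merge answers together, a minimal Python sketch could look like this. It uses the simulation's meaning of “Adult” (anyone older than 4, Elders included); GROUP_SIZE = 8, SPLIT_BUFFER = 3 and the function names are hypothetical values chosen for illustration, and special cases such as the very first section (which has no sibling to merge with) are ignored.

```python
GROUP_SIZE = 8    # assumed value, for illustration only
SPLIT_BUFFER = 3  # hypothetical buffer size
ADULT_AGE = 4


def count_adults(ages):
    """Count peers older than 4, whether or not they are Elders."""
    return sum(1 for age in ages if age > ADULT_AGE)


def should_merge(ages):
    """Merge as soon as there are no spare Adults beyond the GROUP_SIZE Elders."""
    return count_adults(ages) <= GROUP_SIZE


def should_split(ages_half0, ages_half1):
    """Split only if both hypothetical new sections keep GROUP_SIZE + buffer Adults."""
    needed = GROUP_SIZE + SPLIT_BUFFER
    return count_adults(ages_half0) >= needed and count_adults(ages_half1) >= needed


print(should_merge([5] * 8 + [1] * 30))         # exactly GROUP_SIZE Adults
print(should_merge([5] * 9 + [1] * 30))         # one spare Adult
print(should_split([5] * 11, [6] * 11 + [2] * 4))
print(should_split([5] * 11, [6] * 10))
```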
Is the following disallow rule correct? Where is it defined?
fizyk20/ageing_sim/section.rs#L198
“disallow more than one node aged 1 per section if the section is complete (all elders are adults)”
I think we forgot to mention this in the document, so apologies for that - and thanks for pointing it out!
This rule is needed to avoid sections having huge numbers of Infants, which is actually a result from the simulation.
It wasn’t there initially, but because the age of peers grows roughly logarithmically with the number of churn events experienced by them, a single peer has to wait around 32 events until it becomes an Adult. This means that we will have 32 Infants for every adult in the section, which means over 800 Infants in a section that is about to split - clearly a huge number.
In order to prevent such situations, we decided to stop new age-1 Infants from joining when we already have such an Infant. This means that some of the Infants we already have will be relocated before a new one joins the network, so the number grows much slower. We still get 50-100 infants per section even with this rule, but this is actually manageable thanks to Infants only being connected to their own Elders and nobody else.
Also worth noting here is that this is again an example of a rule we can’t enforce from the beginning - the first node in the network would be an age-1 Infant, and so nobody else would be allowed to join, then. This is why this rule is only enforced when we have a complete group, just like with calculating relocations.
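The rule itself is tiny; a Python sketch of it might read as follows. The function name and the list-of-ages representation are assumptions for illustration, not the code at section.rs#L198.

```python
def may_accept_new_infant(ages, group_complete):
    """New peers join at age 1; refuse one only if the group is complete
    and the section already holds an age-1 Infant."""
    if not group_complete:
        return True  # during start-up the rule is not enforced
    return 1 not in ages


print(may_accept_new_infant([1], group_complete=False))                 # network just started
print(may_accept_new_infant([6] * 8 + [1, 2, 3], group_complete=True))  # age-1 Infant present
print(may_accept_new_infant([6] * 8 + [2, 3], group_complete=True))     # no age-1 Infant
```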
One nitpick: age can be more than 4. This matters because during merge the older nodes take precedence when deciding the elders.
I’m not sure what you mean by that, but you seem to be mostly correct. Just to make it clearer - the age can indeed be up to 255 (but we don’t expect such ages to appear in the network, as they would require about 2^255 churn events, which is almost as many as there are atoms in the Universe), but no one actually “decides” the Elders - they are always the oldest nodes in a section. So if you just mean that the older nodes become Elders first, then you are totally correct; and if you meant something else, well, I hope I clarified this a bit.