We already have such a mechanism in place; however, I think that as the network grows the balance happens naturally. So the choice is between more complex code to force/help balance, or allowing the natural balance, which in some ways is nice as that is also how data distributes in a balanced manner.
The aim of the “control system” implemented in the vaults is to ensure the safety and efficiency of the network. This includes, among other things, the need for a well-balanced prefix tree, meaning few distinct values for the prefix length (ideally 1 or 2).
In this configuration (young_max=6) the result is good because we have 3 values, but few sections have the min or the max prefix length (1.33% of sections with the min prefix length and 4% with the max).
Of course, the target is not specifically 12 and the value depends on the size of the network. For example, after 4000 iterations (instead of 100000), the node distribution is:
| Prefix len | Count | Average | Min | Max | Standard dev |
|------------|-------|---------|-----|-----|--------------|
| 7 | 123 | 21.37 | 18 | 28 | 1.71 |
| 8 | 10 | 16.40 | 12 | 23 | 3.34 |
| All | 133 | 21.00 | 12 | 28 | 2.28 |
And after 20000 iterations:
| Prefix len | Count | Average | Min | Max | Standard dev |
|------------|-------|---------|-----|-----|--------------|
| 9 | 222 | 26.75 | 18 | 31 | 1.98 |
| 10 | 580 | 13.30 | 8 | 23 | 2.13 |
| All | 802 | 17.02 | 8 | 31 | 6.37 |
The results with max_young=10 after 100000 iterations are:
One important purpose of the limit on young nodes per section is to help with the situation when the network is relatively quiet. That is a possible attack vector for large adversaries who would try to flood the network at such a time. The limit effectively stops the flood from being too large, and if there was growth then many other nodes would jump on the join queue (even other attackers, which is OK). This is really the same protection that states the network should only accept nodes when it needs them.
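As a rough sketch of the admission rule described above (the `Section` struct, its field and `accepts_young` are hypothetical names for this illustration, not the actual vault code), the protection is just a cap on young nodes per section:

```rust
/// Hypothetical section state; only the field needed for the check.
struct Section {
    /// Nodes still below the adult age threshold.
    young: usize,
}

/// A joining young node is accepted only while the section is under the cap.
/// In a quiet network sections sit at the cap for longer, which bounds how
/// far an attacker's flood of joins can get in one burst.
fn accepts_young(section: &Section, max_young: usize) -> bool {
    section.young < max_young
}
```

With max_young=1, a second young node is turned away until the first one ages up or relocates, which is what drives the rejection counts in the tables below.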
Both points are certainly debatable, and I think @mav has already made good arguments against that proposal. Personally I am on the fence, but slightly in favour of this protection.
I suspect 1 is the most secure but am not 100% convinced just yet; these types of security limitations are always pretty deep. I am running your code just now with 1000000 iterations. I think it may be wise to run forever but print out only the final results every 10,000 runs. I am just working out the info we want printed on each 10,000 iterations.
[Edit: most output should be logging and set to info; the distributions should be just min, max and average, not a list. I think that would work best. What do you think @tfa?]
With max_young=1 the results after 100000 iterations are:
| Metrics | Values |
|---------|--------|
| Adds | 89795 |
| Drops | 7116 |
| Rejoins | 3089 |
| Relocations | 126415 |
| Rejections | 44357 |
| Churns | 355262 |
| Sections | 2073 |
| Section nodes | 40061 |
| Left nodes | 3083 |

| Prefix len | Count | Average | Min | Max | Standard dev |
|------------|-------|---------|-----|-----|--------------|
| 11 | 2023 | 19.51 | 13 | 28 | 0.88 |
| 12 | 50 | 11.86 | 9 | 17 | 2.17 |
| All | 2073 | 19.33 | 9 | 28 | 1.50 |
They are not as good (the number of sections is halved and the number of rejections has doubled).
I agree with this, but I don’t have time to do it right now. Next weekend will be OK for me.
When the relocation algorithm is correct, the list is limited to 2 or 3 elements (1 element = 1 prefix length). I think it is important to check the distribution of every prefix length and that pollution is minimal.
Unless you were talking about the section list that was previously displayed? In that case I have already removed it in the Display trait that I have implemented for the Network structure. I also generate markdown tables which are compatible with the Discourse forum. When I display the network summary, I use "{}" instead of "{:?}" to call this trait instead of the Debug trait.
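A minimal sketch of that pattern (the `Network` fields here are illustrative, not the actual structure): implementing `Display` separately from the derived `Debug` lets `"{}"` print a markdown table that Discourse renders directly, while `"{:?}"` keeps the raw debug dump:

```rust
use std::fmt;

#[derive(Debug)]
struct Network {
    /// (prefix length, number of sections) pairs; illustrative only.
    prefix_counts: Vec<(usize, usize)>,
}

impl fmt::Display for Network {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Pipe-separated rows render as a table on a Discourse forum;
        // left-padding keeps the columns aligned in a console too.
        writeln!(f, "| Prefix len | Count |")?;
        writeln!(f, "|------------|-------|")?;
        for &(len, count) in &self.prefix_counts {
            writeln!(f, "| {:<10} | {:<5} |", len, count)?;
        }
        Ok(())
    }
}
```

`println!("{}", network)` then emits the markdown summary, while `println!("{:?}", network)` still gives the derived debug output.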
Apparently, the smaller the number of elders, the more heterogeneous the network is. In a configuration with GROUP_SIZE=5, the prefix length goes from 11 to 14. This doesn’t sound good for network stability. In conclusion, it is better to keep 8 elders.
With GROUP_SIZE=4 the result is even worse:
| Metrics | Values |
|---------|--------|
| Adds | 89856 |
| Drops | 7013 |
| Rejoins | 3131 |
| Relocations | 240781 |
| Rejections | 0 |
| Churns | 590476 |
| Sections | 8363 |
| Complete | 0 |
| Section nodes | 85020 |
| Left nodes | 3883 |

| Prefix len | Count | Average | Min | Max | Standard dev |
|------------|-------|---------|-----|-----|--------------|
| 12 | 1 | 17.00 | 17 | 17 | None |
| 13 | 8022 | 10.24 | 7 | 21 | 1.09 |
| 14 | 332 | 8.51 | 5 | 16 | 1.90 |
| 15 | 8 | 7.50 | 6 | 9 | 0.93 |
| All | 8363 | 10.17 | 5 | 21 | 1.19 |
Which means that no sections reach the complete state, hence 0 rejections.
Seed: [854126829, 497889719, 278581660, 1742524827]
Iteration 0...
Number of sections: 1 (complete: 0)
Iteration 10000...
Number of sections: 259 (complete: 259)
Iteration 20000...
Number of sections: 517 (complete: 517)
Iteration 30000...
Number of sections: 778 (complete: 778)
Iteration 40000...
Number of sections: 1041 (complete: 1041)
Iteration 50000...
Number of sections: 1039 (complete: 1039)
Iteration 60000...
Number of sections: 1577 (complete: 1577)
Iteration 70000...
Number of sections: 2073 (complete: 2073)
Iteration 80000...
Number of sections: 2079 (complete: 2079)
Iteration 90000...
Number of sections: 2077 (complete: 2077)
Iteration 100000...
Number of sections: 2071 (complete: 2071)
Iteration 110000...
Number of sections: 2180 (complete: 2180)
Iteration 120000...
Number of sections: 3090 (complete: 3090)
Iteration 130000...
Number of sections: 3896 (complete: 3896)
Iteration 140000...
Number of sections: 4165 (complete: 4165)
Iteration 150000...
Number of sections: 4171 (complete: 4171)
Iteration 160000...
Number of sections: 4165 (complete: 4165)
Iteration 170000...
Number of sections: 4164 (complete: 4164)
Iteration 180000...
Number of sections: 4160 (complete: 4160)
Iteration 190000...
Number of sections: 4153 (complete: 4153)
Iteration 200000...
Number of sections: 4150 (complete: 4150)
Iteration 210000...
Number of sections: 4163 (complete: 4163)
Iteration 220000...
Number of sections: 4367 (complete: 4367)
Iteration 230000...
Number of sections: 5100 (complete: 5100)
Iteration 240000...
Number of sections: 6068 (complete: 6068)
Iteration 250000...
Number of sections: 6893 (complete: 6893)
Iteration 260000...
Number of sections: 7639 (complete: 7639)
Iteration 270000...
Number of sections: 8138 (complete: 8138)
Iteration 280000...
Number of sections: 8328 (complete: 8328)
Iteration 290000...
Number of sections: 8349 (complete: 8349)
Iteration 300000...
Number of sections: 8350 (complete: 8350)
Iteration 310000...
Number of sections: 8346 (complete: 8346)
Iteration 320000...
Number of sections: 8342 (complete: 8342)
Iteration 330000...
Number of sections: 8335 (complete: 8335)
Iteration 340000...
Number of sections: 8330 (complete: 8330)
Iteration 350000...
Number of sections: 8325 (complete: 8325)
Iteration 360000...
Number of sections: 8321 (complete: 8321)
Iteration 370000...
Number of sections: 8318 (complete: 8318)
Iteration 380000...
Number of sections: 8309 (complete: 8309)
Iteration 390000...
Number of sections: 8304 (complete: 8304)
Iteration 400000...
Number of sections: 8299 (complete: 8299)
Iteration 410000...
Number of sections: 8296 (complete: 8296)
Iteration 420000...
Number of sections: 8291 (complete: 8291)
Iteration 430000...
Number of sections: 8360 (complete: 8360)
Iteration 440000...
Number of sections: 8632 (complete: 8632)
Iteration 450000...
Number of sections: 9197 (complete: 9197)
Iteration 460000...
Number of sections: 9989 (complete: 9989)
Iteration 470000...
Number of sections: 11008 (complete: 11008)
Iteration 480000...
Number of sections: 12049 (complete: 12049)
Iteration 490000...
Number of sections: 12972 (complete: 12972)
Iteration 500000...
Number of sections: 13867 (complete: 13867)
Iteration 510000...
Number of sections: 14694 (complete: 14694)
Iteration 520000...
Number of sections: 15415 (complete: 15415)
Iteration 530000...
Number of sections: 15989 (complete: 15989)
Iteration 540000...
Number of sections: 16387 (complete: 16387)
Iteration 550000...
Number of sections: 16574 (complete: 16574)
Iteration 560000...
Number of sections: 16642 (complete: 16642)
Iteration 570000...
Number of sections: 16664 (complete: 16664)
Iteration 580000...
Number of sections: 16665 (complete: 16665)
Iteration 590000...
Number of sections: 16664 (complete: 16664)
Iteration 600000...
Number of sections: 16665 (complete: 16665)
Iteration 610000...
Number of sections: 16664 (complete: 16664)
Iteration 620000...
Number of sections: 16659 (complete: 16659)
Iteration 630000...
Number of sections: 16659 (complete: 16659)
Iteration 640000...
Number of sections: 16654 (complete: 16654)
Iteration 650000...
Number of sections: 16653 (complete: 16653)
Iteration 660000...
Number of sections: 16650 (complete: 16650)
Iteration 670000...
Number of sections: 16646 (complete: 16646)
Iteration 680000...
Number of sections: 16643 (complete: 16643)
Iteration 690000...
Number of sections: 16639 (complete: 16639)
Iteration 700000...
Number of sections: 16635 (complete: 16635)
Iteration 710000...
Number of sections: 16631 (complete: 16631)
Iteration 720000...
Number of sections: 16629 (complete: 16629)
Iteration 730000...
Number of sections: 16629 (complete: 16629)
Iteration 740000...
Number of sections: 16628 (complete: 16628)
Iteration 750000...
Number of sections: 16626 (complete: 16626)
Iteration 760000...
Number of sections: 16622 (complete: 16622)
Iteration 770000...
Number of sections: 16620 (complete: 16620)
Iteration 780000...
Number of sections: 16618 (complete: 16618)
Iteration 790000...
Number of sections: 16610 (complete: 16610)
Iteration 800000...
Number of sections: 16607 (complete: 16607)
Iteration 810000...
Number of sections: 16600 (complete: 16600)
Iteration 820000...
Number of sections: 16591 (complete: 16591)
Iteration 830000...
Number of sections: 16591 (complete: 16591)
Iteration 840000...
Number of sections: 16604 (complete: 16604)
Iteration 850000...
Number of sections: 16649 (complete: 16649)
Iteration 860000...
Number of sections: 16749 (complete: 16749)
Iteration 870000...
Number of sections: 16930 (complete: 16930)
Iteration 880000...
Number of sections: 17289 (complete: 17289)
Iteration 890000...
Number of sections: 17744 (complete: 17744)
Iteration 900000...
Number of sections: 18413 (complete: 18413)
Iteration 910000...
Number of sections: 19189 (complete: 19189)
Iteration 920000...
Number of sections: 20043 (complete: 20043)
Iteration 930000...
Number of sections: 20962 (complete: 20962)
Iteration 940000...
Number of sections: 21872 (complete: 21872)
Iteration 950000...
Number of sections: 22820 (complete: 22820)
Iteration 960000...
Number of sections: 23761 (complete: 23761)
Iteration 970000...
Number of sections: 24719 (complete: 24719)
Iteration 980000...
Number of sections: 25618 (complete: 25618)
Iteration 990000...
Number of sections: 26498 (complete: 26498)
I have since uploaded a new version with better cell formatting, so that tables are indented and justified in the console window as well, not just in a markdown editor.
Now, if you select a console height of about 23 lines, you see a fixed panel as long as you have 2 or 3 distinct prefix lengths (tested on a Cygwin terminal).
This is something I added 3 days ago to improve network growth:
With trailing_zeros + params.init_age - 1, init_age=4 and max_young=1, the result after 100_000 iterations is:
| Metrics | Values |
|---------|--------|
| Adds | 89894 |
| Drops | 7012 |
| Rejoins | 3094 |
| Relocations | 57141 |
| Rejections | 64631 |
| Churns | 215930 |
| Sections | 1038 |
| Section nodes | 18901 |
| Left nodes | 2853 |

| Prefix len | Count | Average | Min | Max | Standard dev |
|------------|-------|---------|-----|-----|--------------|
| 10 | 1010 | 18.37 | 14 | 24 | 0.92 |
| 11 | 28 | 12.29 | 8 | 23 | 3.02 |
| All | 1038 | 18.21 | 8 | 24 | 1.43 |
The number of sections is 1038 instead of 2073 and the number of rejections is 64631 instead of 44357. This is a lot, but I can do it if you confirm it is needed for network security.
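For reference, a one-line sketch of that age formula. Which 64-bit quantity the trailing zeros are counted on is an assumption of this sketch, and the function and parameter names are illustrative, not the sim's:

```rust
/// trailing_zeros + init_age - 1, as described above.
/// `value` stands in for whatever 64-bit quantity the trailing zeros
/// are counted on (an assumption of this sketch).
fn relocation_age(value: u64, init_age: u32) -> u32 {
    value.trailing_zeros() + init_age - 1
}
```

With init_age=4, a value with no trailing zero bits gives age 3, and each additional trailing zero bit adds one, so higher ages are exponentially rarer.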
The way this is designed, older nodes will (relatively) seldom relocate and we will reject a lot, but I feel that is fine actually. I am not sure new nodes will always be waiting at every section. We have another sim in the background which Adam is working on. It will allow us to create many random numbers from each Live event by repeatedly hashing the event (settable). This will simulate, and may replace, using mutation events (like Put) as well as Live events. These nudge the network to grow as there is a need (especially when the trigger is a Put).
So the Hash(Event) % 2^age == 0 test, applied to each peer in decreasing age order and relocating the first one that matches, will possibly slow down relocation (i.e. create more rejections) but is likely more secure. It may delay nodes joining for perhaps even a few hours (or more). I am not sure this is bad, as it is more a test of a node's uptime, and joining is probably still faster than downloading a blockchain (for now anyway).
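A minimal sketch of that selection rule, assuming ages below 64 and a u64 event hash (the function and parameter names are mine, not the sim's):

```rust
/// Scan peers in decreasing age order and return the id of the first peer
/// for which Hash(Event) % 2^age == 0; None means no relocation this event.
/// A peer of age `a` passes with probability 2^-a, so the older the node,
/// the rarer its relocation, as described above. Assumes age < 64 so the
/// shift is valid.
fn relocation_candidate(event_hash: u64, peers_by_age_desc: &[(u64, u32)]) -> Option<u64> {
    peers_by_age_desc
        .iter()
        .find(|&&(_id, age)| event_hash % (1u64 << age) == 0)
        .map(|&(id, _age)| id)
}
```

Note that a single event hash is shared by all peers in the scan, so the oldest peer whose modulus happens to divide the hash is the one relocated; most events relocate nobody, which is where the extra rejections come from.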