Data Chains - Deeper Dive

David has been know to make devs eat Haggis for using “magic numbers” like that. Somebody may be lucky it wasn’t production code!

2 Likes

Or somebody likes Haggis…

1 Like

In the first tests I made the same mistake. Only when the results seems so strange ,I dig deeper into the code.

Basically, changing the Group_size, the only significant difference, in the metrics, is the number of sectors.

For example, move from Group_size 8 to 5, double the number of sectors and the average prefix len goes from 11.02 to 12.04.

Of course all these data are with a strong growth. Will have to test in situations of more stability.

Hi, @tfa , just raised a PR into your repo to sort out the print outs little bit.

And, just wondering maybe that https://github.com/Thierry61/ageing_sim/blob/master/src/network/section.rs#L191
shall be let node_to_age = self.choose_for_relocation(trailing_zeros + params.init_age - 1); ?
OR
https://github.com/Thierry61/ageing_sim/blob/master/src/network/section.rs#L164 to be .filter(|n| n.age() < age) ?
Otherwise a relocation will always happen on a churn?

2 Likes

Thank you for this.

I have then uploaded a new version with better cells formatting so that tables are indented and justified also in the console window and not just in markdown editor.

Now, if you select a console height of about 23 lines, you see a fixed panel as long as you have 2 or 3 distinct prefix lengths (tested on a Cygwin terminal).

This is something I added 3 days ago to improve network growth:

With trailing_zeros + params.init_age – 1, init_age=4 and max_young=1, the result after 100_000 iterations is:

Metrics Values
Adds 89894
Drops 7012
Rejoins 3094
Relocations 57141
Rejections 64631
Churns 215930
Sections 1038
Section nodes 18901
Left nodes 2853
Prefix len Count Average Min Max Standard dev
10 1010 18.37 14 24 0.92
11 28 12.29 8 23 3.02
All 1038 18.21 8 24 1.43

Number of sections is 1038 instead of 2073 and number of rejections is 64631 instead of 44357. This is a lot, but I can do this if you confirm this is needed for the network security.

6 Likes

The way this is designed is that older nodes will seldom relocate, (relatively) we will reject a lot, but I feel that is fine actually. I am not sure new nodes will always be waiting at every section. We have another sim in the background which Adam is working on. This will allow us to create many random numbers from each Live event by repeatedly hashing the event (settable). This will simulate and may replace using mutation events (like Put) as well as Live events. These nudge the network to grow as there is a need (when trigger is put especially).

So the Hash(Event) % 2^age == 0 applied to each peer in decreasing age and relocating the first one that matches the test will possibly slow down relocation (i.e. create more rejections) but is likely more secure but may delay nodes joining for perhaps even a few hours (or more). I am not sure this is bad as it is more a test of the nodes up time and is probably still faster to join than downloading a block chain (for now anyway).

Hope this helps.

2 Likes

David, have you tried tackling this optimisation using a Genetic Algorithm? I can see difficulties: you might need a Connection Machine to do it in a reasonable time, and designing the fitness function is non trivial, but I wonder if it is a suitable problem.

1 Like

With a method like NSGA-II you can use multiple objectives directly rather than trying to cook up a meaningful scalar fitness. Things get a bit tough to visualize after 3 dimensions though.

2 Likes

Thanks. I am sooo out of date on this :slight_smile:

1 Like

Rust impl, worth forking.


Examples of grid compatibility

Comparisons
http://www.ias.ac.in/article/fulltext/sadh/037/06/0675-0694

It is hard to swallow because in the name of security we both set max young to 1 and relocate only on every other churn event. Globally these two elements divide the number of sections by 4 (from 4122 to 1038) and multiply the number of rejections by 3 (from 21000 to 64631).

But I think I understand why: the rejection rate is a security factor for a relatively small network. It can only be reduced if the network grows, which happens naturally because more sections means more available slots for young nodes.

In addition to the rejections (which are accumulated from the start of the network), I will add the instantaneous rejection rate in the displayed metrics. I guess it is the number of sections that cannot accept a young nodes divided by the total number of nodes.

2 Likes

Yes and I agree it feels a bit strict. It is more secure AFAIK and I feel this algo will develop over time where it should become more lenient. Ultimately we do want even very short lived nodes to be of value and earn. Initially though this is possibly safer.

This will be very interesting as well. Thanks again @tfa

3 Likes

The rejection rate has been added in my repo.

I have run a simulation with 890_000 iterations but the result is disappointing: I had hopes that the rejection rate would slowly diminish with the size of the network but that’s not the case and it remains stable with small oscillations around 70%.

Iteration Sections Rejection rate
0 1 0%
10000 133 73%
20000 262 70%
30000 320 66%
40000 530 73%
50000 526 74%
60000 615 68%
70000 924 68%
80000 1052 73%
90000 1045 72%
100000 1038 71%
110000 1033 71%
120000 1189 69%
130000 1552 67%
140000 1768 71%
150000 2036 69%
160000 2102 73%
170000 2098 73%
180000 2090 73%
190000 2086 72%
200000 2079 72%
210000 2074 74%
220000 2068 72%
230000 2084 70%
240000 2246 68%
250000 2612 68%
260000 3018 68%
270000 3321 69%
280000 3551 70%
290000 3707 70%
300000 3903 71%
310000 4111 70%
320000 4191 72%
330000 4199 70%
340000 4199 72%
350000 4194 72%
360000 4188 72%
370000 4177 73%
380000 4171 72%
390000 4166 72%
400000 4158 71%
410000 4150 72%
420000 4140 72%
430000 4138 71%
440000 4135 73%
450000 4132 72%
460000 4134 72%
470000 4182 72%
480000 4300 70%
490000 4531 69%
500000 4851 68%
510000 5229 68%
520000 5608 68%
530000 5979 69%
540000 6330 68%
550000 6586 70%
560000 6870 69%
570000 7081 69%
580000 7266 69%
590000 7451 69%
600000 7689 70%
610000 7892 70%
620000 8111 71%
630000 8272 71%
640000 8355 71%
650000 8395 71%
660000 8393 71%
670000 8391 72%
680000 8393 72%
690000 8388 71%
700000 8379 72%
710000 8368 73%
720000 8360 73%
730000 8347 72%
740000 8341 72%
750000 8333 72%
760000 8327 71%
770000 8318 71%
780000 8311 73%
790000 8302 73%
800000 8299 72%
810000 8294 72%
820000 8293 72%
830000 8286 72%
840000 8274 72%
850000 8272 71%
860000 8272 73%
870000 8266 72%
880000 8262 72%
890000 8258 72%
6 Likes

This is good to know. I really suspect though that we are trying very hard to add nodes, Initially this may be the case, but then I suspect fewer nodes will try and join, However as the network needs more nodes it should get them. This is where the mutation events will help us in ageing nodes faster. So we will need to dive in there a little, the rate of trying to add will diminish over time.

We need to think about that a bit more, as nodes are needed due to the data getting large in a section then we want it to split, so more new nodes / sections will distribute the data better. I think we are missing a tiny bit here. Even with a 70% rejection rate though it is probably not to bad, it is just a few attempts to start (3 or 4) and requesting to join is not all that arduous I suppose.

(still thinking a bit about this one). There are a few of us spending a weekend down at my place this weekend to get space to dive into this a lot deeper, but its a full agenda for the weekend, republish, sgx etc. (https://github.com/baidu/rust-sgx-sdk - interesting to add security, but limited as well there is also arm trust zone etc.), secure gossip, upgrades and some more. We will spend a while on this as well in a focussed way. I am keen to look a bot more at the assumptions of this sim and the updated one we should publish tomorrow as well, these probabilities of add/rejoin/join etc. I think are close but possibly not as real life as it could be.

[edit, the join rate will also be affected by safecoin, when the network needs nodes it will increase payments and when it does not it will decrease them, so perhaps that also should be factored in somehow, well if it can]

6 Likes

I don’t want to send the Data Chains thread off topic… but just in case you haven’t seen it yet, this recent paper may provide some additional food for thought:

Malware Guard Extension: Using SGX to Conceal Cache Attacks
" Our attack is the first malware running on real SGX hardware, abusing SGX protection features to conceal itself. "

2 Likes

According to the whitepaper, one way to prevent an SGX attack is to use side-channel resistant crypto implementations. Libsodium, fortunately, is implemented, from the beginning, to be extremely resistant to these attacks.

But every day it is more evident that modern CPUs have more holes than a Gruyère cheese.

This factor is, in my view, fundamental. The success of bitcoin comes more from an intelligent application of game theory than from technical implementations.

In the same way, the design of both, the farming rate and the initial distribution, must be adjusted as a reinforcement in security, especially at the beginning when the network is most vulnerable.

From the point of view of an engineer may seem a sin, but, I am convinced that a good advertising campaign and a distribution that incites greed is, for the security of the network, much more important than an ultra-optimized algorithm.

5 Likes

I agree with this principle, but you need to lower the rejection rate to implement it. I also think that a greater number of sections is a better security measure than a greater rejection rate.

To facilitate experiments, I have added a parameter to accelerate or slowdown the relocation rate. Base current formula was: Hash(Event) % 2^(age - init_age + 1) == 0. This parameter modifies it to Hash(Event) % 2^(age - init_age + 1 - relocation_rate) == 0 where relocation_rate = 0 for standard rate and 1 for aggressive rate.

  • Standard rate keeps base behavior with a relocation triggered every other event and with the following probabilities of relocation: 50%: no relocation, 25%: init_age, 12.5%: init_age+1, 6.25%; init_age+2, …

  • Aggressive rate triggers a relocation every event and each probability is shifted one step to the right: 50%: init_age, 25%: init_age+1, 12.5%: init_age+2, 6.25%; init_age+3, …

IMO, there are 2 main models: maximizing or not the number of sections. To have a clear view of the big differences between them, I have launched 2 simulations with 890_000 iterations:

  • the first simulation allows only one young and uses standard relocation rate

  • the second one allows 4 young nodes and uses aggressive relocation rate

Here are the results:

Metrics Simulation 1 Simulation 2
Adds 800814 810134
Drops 62523 62976
Rejoins 26663 26890
Relocations 503815 1285360
Rejections 578686 195233
Churns 1912122 3507968
Sections 8258 33405
Section nodes 161447 564058
Left nodes 26287 14370
Rejection rate 72% 26%

4 times more sections will make things harder for an attacker to control a section because he needs much more nodes. This is better than having a greater rejection rate because the attacker has just to retry to connect each node more often but the number of nodes he needs is not impacted.

Another argument is that a greater rejection will deter some casual users from connecting a vault, which will lower the total number of nodes and so will increase the proportion of attacking nodes.

Additional info: section distribution after 890_000 iterations for simulation 1 is:

Prefix len Count Average Min Max Standard dev
13 8126 19.67 14 26 0.84
14 132 11.91 9 20 2.10
All 8258 19.55 9 26 1.31

for simulation 2:

Prefix len Count Average Min Max Standard dev
15 32141 16.96 10 37 2.31
16 1244 14.99 9 26 2.77
17 20 13.55 11 19 2.42
All 33405 16.89 9 37 2.36
9 Likes

I suppose the downside of sim 2 and the greater number of sections is the much higher churn (ie network overhead)?

There are more churns in simulation 2 simply because the network is bigger, and it is bigger because there are less nodes that cannot connect. So, no downside in this area.

The number of churns is even smaller if we look at the figures when simulation 2 had a similar number of nodes:

Metrics Simulation 1 at step 890000 Simulation 2 at step 260000
Adds 800814 234114
Drops 62523 18053
Rejoins 26663 7833
Relocations 503815 376067
Rejections 578686 56978
Churns 1912122 1021403
Sections 8258 8358
Section nodes 161447 163394
Left nodes 26287 3997
Rejection rate 72% 27%

Simulation 2 is much better than simulation 1, except in the number of distinct prefix lengths which varies from 3 to 4: the values are better (I mean prefix lengths are greater) but they are slightly more dispersed than in simulation 1.

So, I improved the program to make sure that when a node is relocated to a section it is placed in the section half having less adults. The simulation is running right now. Results are expected in a few hours.

5 Likes

Result with the new algorithm to balance the sections surpasses all expectations: there is only one prefix length, except during the period of transition from one prefix length to the next one where there are 2.

As a consequence, the number of section is now a stepped curve equaling exactly a power of 2 between each transition (in green):

The rejection rate has large oscillations with minimum values at transitions:

The number of nodes is slightly better:

But the real gain of the new algorithm is the homogeneity of the network.

9 Likes