RFC 45 – Node Ageing

routing

#1

Discussion topic for RFC 45 – Node Ageing


#2

An idea here to allow this RFC to proceed prior to data chains is to use a simple mechanism for when to relocate.

Proposal

Use the hash of the churn event (the address of the node joining or leaving) and check H(churn) % 2^age == 0 against every node in the group.

This allows a quick check on whether a node should relocate or not. It’s not as accurate as using the link block, but any error will be evenly distributed as long as the churn event ID cannot be targeted.
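As a rough illustration of the check, here is a minimal Python sketch. SHA-256 as H, the byte-string event IDs, and the names `churn_hash`, `should_relocate` and `nodes_to_relocate` are assumptions made for the example; the RFC does not specify them.

```python
import hashlib


def churn_hash(event_id: bytes) -> int:
    """H(churn): hash the churn event ID (SHA-256 is an assumption)."""
    return int.from_bytes(hashlib.sha256(event_id).digest(), "big")


def should_relocate(event_id: bytes, age: int) -> bool:
    """True when H(churn) % 2^age == 0, i.e. the low `age` bits are zero."""
    return churn_hash(event_id) % (1 << age) == 0


def nodes_to_relocate(event_id: bytes, group: dict) -> list:
    """Apply the check to every node in a group given as {name: age}."""
    return [name for name, age in group.items() if should_relocate(event_id, age)]


group = {"a": 1, "b": 4, "c": 9}  # hypothetical node names and ages
print(nodes_to_relocate(b"node-xyz-joined", group))
```

Each extra year of age halves the chance of passing, so a node of age n is picked by roughly 1 in 2^n churn events. The sketch returns every node that passes; how to choose among several matches is not covered here.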

Issues

An attacker who knows the IDs and ages of the nodes in a group can calculate an ID whose churn event causes a chosen node to relocate. It’s relatively simple to do this.

Additional proposal possibly not required at this stage

Do not allow any node under age QUORUM_SIZE to have any influence in causing a relocation. This puts the aforementioned attack out of reach.
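A minimal sketch of that rule, assuming each churn event is recorded together with the age of the node that caused it. The value of `QUORUM_SIZE` and the names `churn_counts` and `influential_events` are hypothetical and only serve the example.

```python
QUORUM_SIZE = 5  # hypothetical value, not taken from the RFC


def churn_counts(causing_node_age: int, quorum_size: int = QUORUM_SIZE) -> bool:
    """A churn event may trigger a relocation only if the node that
    caused it has reached age `quorum_size`."""
    return causing_node_age >= quorum_size


def influential_events(events: list) -> list:
    """Keep only the (event_id, causing_node_age) pairs that are
    allowed to feed the relocation check."""
    return [event for event in events if churn_counts(event[1])]


events = [(b"join-1", 1), (b"leave-7", 6), (b"join-2", 4)]  # made-up events
print(influential_events(events))
```

Because an attacker's freshly created IDs belong to young nodes, their joins and leaves would be dropped before the relocation check is ever evaluated.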

This is the start of age ranges. Infants have very little control, but they can, with (beyond) exponentially increasing difficulty, locate themselves in groups as early-stage nodes.

Later this could be extended so that nodes aged between QUORUM_SIZE and GROUP_SIZE only relay, while nodes above GROUP_SIZE are full nodes.

For this proposal we merely ignore nodes under age QUORUM_SIZE and then let them become full nodes. A further RFC can define and calculate this staging more precisely.
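To make the possible later staging concrete, here is a small sketch of the three age ranges. The constant values, the `Role` names, and the treatment of the boundary ages (age equal to GROUP_SIZE counts as relay-only, since only ages above it are described as full) are assumptions for illustration.

```python
from enum import Enum

QUORUM_SIZE = 5  # hypothetical value
GROUP_SIZE = 8   # hypothetical value


class Role(Enum):
    INFANT = "ignored for relocation"
    RELAY = "relay only"
    FULL = "full node"


def role_for_age(age: int, quorum_size: int = QUORUM_SIZE,
                 group_size: int = GROUP_SIZE) -> Role:
    """Map a node's age to the stage described in the post."""
    if age < quorum_size:
        return Role.INFANT
    if age <= group_size:  # boundary choice is an assumption
        return Role.RELAY
    return Role.FULL


for age in (1, 5, 8, 9):
    print(age, role_for_age(age).value)
```

Under the simpler rule proposed for this RFC, the `RELAY` stage would be skipped and every node of age QUORUM_SIZE or more would be treated as a full node.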

@AndreasF @Fraser @Qi_Ma @Viv -> it would be good to get opinions on this shortcut proposal. I am looking to break data chains and ageing into two distinct work units. I imagine we can live with the targeting here until we implement data chains. Separating these work units may make a lot of sense and help implementation and testing to be done more efficiently.


#3

Here is a paper on quorums to jog minds in case it proves helpful. https://blog.acolyer.org/2016/10/03/the-load-capacity-and-availability-of-quorum-systems/


#4

Two possible issues came to mind immediately while I was reading the RFC:

  1. This might be already resolved, but just to make sure: the group does not admit more than one node with age 0, but a joining node is immediately relocated and promoted to age 1, right? If not, it might cause issues with new users joining when the network is small.
  2. “Groups may only allow one node of each age per group. This further distributes age through the network. This requires further modelling.” - this might cause trouble: since gaining age becomes exponentially harder, there will be exponentially more nodes with low age in the network than nodes with high age. This would mean either that there is no place for new nodes in groups, or that there are many small groups with two or three nodes of low age. Requiring some kind of exponential distribution of age in each group might do the trick, though.

#5

Yes, this is the case. New nodes that join by themselves (i.e. not requested by a group) are age zero. These are relocated immediately and given age 1.

Absolutely, this is a larger question about distribution and how to handle an imbalance of nodes. So yes, this needs a lot more modelling.

I like the notion of looking at distributions of age as you say though, that could prove very helpful.


#6

Is there a max_grouptime for which nodes are allowed to stay in the same group while still being allowed to sign messages for quorum? Or do they have to choose between becoming an Archive Node (no longer allowed to sign messages) or an ordinary node in a different group? I think it adds to security if there is a max_grouptime for staying in one group, just to take away the option for very patient attackers to target a certain group.

max_grouptime == 24 hours??


#7

Yes, this is a key component here. A node starts at age 1. It stays in that group for 1 churn event. Then it is age 2 and stays in the next group for 2 churn events. This goes on all the way to an age of 255 (which no node is likely to reach). So exponentially over time (defined by churn events) a node is moved from group to group. After age 10 this is every 1024 churn events, etc. So at age 30 it is there for 1073741824 (2^30) churn events. If we imagine a churn event every 30 minutes then this is over 60000 years. Hope this helps.
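The arithmetic can be checked with a few lines of Python. This is a sketch that assumes a node of a given age stays for about 2^age churn events, matching the figures quoted for ages 10 and 30; the function names and the 365.25-day year are choices made for the example.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60


def churn_events_at_age(age: int) -> int:
    """Approximate number of churn events a node of this age sits through
    (assumed to be 2^age)."""
    return 2 ** age


def residency_years(age: int, minutes_per_churn: float = 30.0) -> float:
    """Convert that number of churn events into years of wall-clock time."""
    return churn_events_at_age(age) * minutes_per_churn / MINUTES_PER_YEAR


for age in (10, 20, 30):
    print(age, churn_events_at_age(age), round(residency_years(age), 2))
```

With one churn event every 30 minutes, age 30 works out to roughly 61,000 years, consistent with the "over 60000 years" figure.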


#8

This does give some extra overhead in the first few churn events, but when I think about it, it really destroys the idea of targeting a group :+1:.


#9

Not secure enough for me :wink:


#10