OK here is my interpretation of the OP for the more technically challenged reader (like me), stepping back a little to add a bit of an overview. I’ve left out the ordering of the blocks as I still don’t understand that and one or two other details have been omitted too.
It seems to me that the purpose of datachains is more than simply tracking the state of Elders - that’s just one aspect of what they do, but the only one mentioned in the OP - and this is a cause of some of the confusion expressed in the comments afterwards (that, and confusion over terms like ‘group’, it being a work in progress). Certainly getting a firm handle on things is not that straightforward, but it’s been fun learning and trying!
There are two things I’d really like to see. One is a succinct description (< 100 words) of what a datachain is and why it’s useful. I’ve rootled around but couldn’t find one anywhere. I remember reading a very short and simple description of a blockchain which made subsequent learning much easier, something like “a blockchain is an immutable ledger to which information can only be added, never changed or taken away. It is cryptographically secure and distributed across multiple machines, meaning that there is no need for a trusted central authority.” Could we do something similar for datachains?
The second is a description of how a datachain propagates across the network. I don’t really understand that at all.
Anyway, let me know if I’ve missed anything vital or got anything wrong.
A high-level look at consensus, node ageing and data chains
An autonomous network of decentralised nodes needs a set of rules to decide what is true and optimum, and what is potentially dangerous or counterproductive - since no human can do that for it. Because there is no central, universal source of ‘the truth’ (including current and elapsed time), such decisions, triggered by events on the network, must be carried out by groups of nodes which reach a consensus by following rules as to what is and is not valid.
So, for example, a group of nodes might decide amongst themselves that a new member fulfils all the criteria required to join their number; conversely, they might decide that a particular node is acting suspiciously and should be thrown off the network. In another example, a group of nodes might decide that their number has become too small to ensure the required levels of security and that they should merge with a neighbouring group; and a large group of nodes might decide to split into two smaller entities.
Here, the word ‘group’ is used in general terms; as we’ll see, in the SAFE network it can have more specific definitions.
For a decision made by a group of nodes to be valid, a minimum number of them (a quorum) must agree. This quorum is set at more than 50% of the nodes. So, if there are eight nodes and five of them agree (by voting) on an event - say, that a new member trying to join the group is acceptable - then their vote will carry (the new member will be allowed to join) and their votes will be stored in a signed and secured block. If only four nodes agree then no action will be taken (in this example the node will not be allowed to join).
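The quorum rule above can be sketched in a few lines. This is only an illustration of the "more than 50%" arithmetic; the function name has_quorum is made up for this example and is not taken from the SAFE codebase.

```python
def has_quorum(votes_in_favour: int, group_size: int) -> bool:
    """Return True when strictly more than half of the group agrees.

    Hypothetical helper: the name and signature are assumptions made
    for illustration, not part of any SAFE Network API.
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    if not 0 <= votes_in_favour <= group_size:
        raise ValueError("votes_in_favour must be between 0 and group_size")
    # Integer comparison, equivalent to votes_in_favour / group_size > 0.5
    return 2 * votes_in_favour > group_size


print(has_quorum(5, 8))  # True: five of eight is more than 50%
print(has_quorum(4, 8))  # False: four of eight is exactly 50%, not more
```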
When it comes to decision making, not all nodes are equal. Some will have proven themselves to be trustworthy and to have sufficient resources (bandwidth, CPU, uptime), while others will be newer and/or less reliable. The concept of node_age gives a number to this reliability factor, ranging from 0 to 4. Nodes with a node_age of 4 are classed as Adults. They will have been moved around the network a few times and proven themselves to be reliable; they are trusted to vote on network events (e.g. allowing a new member to join). Nodes with a node_age < 4 are called Infants - they are not yet trusted to vote.
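As a rough sketch of the classification just described, assuming the threshold of 4 given above (the constant ADULT_AGE and the function classify are names invented for this example):

```python
ADULT_AGE = 4  # from the text: a node_age of 4 marks an Adult


def classify(node_age: int) -> str:
    """Map a node_age to the type used in this post.

    Hypothetical helper for illustration only; the real network may
    represent node types quite differently.
    """
    if node_age < 0:
        raise ValueError("node_age cannot be negative")
    # Anything below the threshold is an Infant and cannot vote yet.
    return "Adult" if node_age >= ADULT_AGE else "Infant"


for age in range(5):
    print(age, classify(age))
```

Only the last line printed (age 4) is labelled Adult; ages 0 to 3 are all Infants.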
While Adults are capable of voting, in reality only the oldest and most trusted among them - the Elders - have that right. The Elders confer among themselves as to what is valid in their section. Only the results of their decisions (blocks) are shared with the Adults and Infants rather than the votes themselves. While they cannot vote themselves, Adults and Infants are able to contribute by storing data chunks.
Sections
At any given time, each node on the SAFE Network is a member of exactly one section. A section is a group of nodes that is responsible for looking after data stored within a certain range of addresses on the network. The number of nodes in a section varies and fluctuates constantly through the process of churn, but at any time most sections will have between 10 and 20 nodes.
Within each section are a certain number of Elders, who are able to vote on events within the section, and also (usually - see below) a number of Adults and Infants. As well as being able to vote among their section peers (a process called local consensus), Elders within a particular section are also connected to Elders in sibling sections, near-neighbours as defined by their XOR address prefix. In this way, messages are passed across the network from Elders in one section to Elders in the next section, and so on.
If a particular section grows much larger than average (about 12 nodes), in general it will split into two smaller sections. (Note: in some circumstances, which we will not go into here, a split will not be possible.) Likewise, if, as a result of nodes leaving during ‘churn events’, the number of nodes in a section drops below a certain level specified by a parameter called group_size, then its Elders will be triggered to seek a merger with a sibling section. At present, group_size is 8.
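The split and merge triggers can be sketched like this. The merge rule (fewer than group_size nodes) comes from the paragraph above, but the post gives no exact figure for "much larger than average", so split_threshold is a hypothetical parameter the caller must supply, and the value 20 used below is purely illustrative.

```python
GROUP_SIZE = 8  # value quoted in the text


def section_action(node_count: int, split_threshold: int,
                   group_size: int = GROUP_SIZE) -> str:
    """Suggest what a section should do given its current size.

    Hypothetical helper: split_threshold is an assumption, since the
    text does not say exactly how large a section must be to split.
    """
    if node_count < group_size:
        return "merge"   # too few nodes: seek a merger with a sibling
    if node_count > split_threshold:
        return "split"   # much larger than average: split in two
    return "none"        # within normal bounds


# 20 is an illustrative threshold only, not a network constant.
print(section_action(7, split_threshold=20))   # merge
print(section_action(12, split_threshold=20))  # none
print(section_action(25, split_threshold=20))  # split
```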
In an average section containing 12 nodes, there will be eight Elders plus a total of four Adults and Infants. In fact, eight (remember group_size = 8) is the minimum number of Elders that is permitted. So what happens if the number of Elders in a section drops below 8 as a result of churn, or an Elder being demoted or removed? Well, since an Adult also has a node_age of 4, the oldest Adult node is simply promoted to the Elder type, i.e. it is given voting rights. If two Adults have the same age a ‘tie-breaker’ algorithm is deployed to decide which gets promoted.
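Here is a minimal sketch of that promotion step. The post does not say how the tie-breaker works, so the alphabetical ordering by name used here is an assumption made purely so the example runs; the Node class and the function adults_to_promote are likewise invented for illustration.

```python
from dataclasses import dataclass

GROUP_SIZE = 8  # minimum number of Elders, as quoted in the text
ADULT_AGE = 4   # node_age needed to be an Adult, as quoted in the text


@dataclass(frozen=True)
class Node:
    name: str
    node_age: int
    is_elder: bool = False


def adults_to_promote(nodes, group_size=GROUP_SIZE):
    """Return the Adults that would be promoted to restore group_size Elders."""
    elder_count = sum(1 for n in nodes if n.is_elder)
    needed = max(0, group_size - elder_count)
    adults = [n for n in nodes if not n.is_elder and n.node_age >= ADULT_AGE]
    # Oldest first; ties broken by name (an ASSUMED tie-breaker, not the real one).
    adults.sort(key=lambda n: (-n.node_age, n.name))
    return adults[:needed]


section = [Node(f"elder{i}", 4, is_elder=True) for i in range(7)]
section += [Node("bravo", 4), Node("alpha", 4), Node("charlie", 2)]
print([n.name for n in adults_to_promote(section)])  # ['alpha']
```

With seven Elders left, one slot is open. Both Adults have the same age, so the stand-in tie-breaker picks "alpha"; the Infant "charlie" is never considered.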
A ‘complete group’ is a section containing a minimum of group_size Elders (we’ve been using 8). The most reliable of these Elders will be the least likely to be lost during a churn event. Infants, Adults and those Elders that participate least in voting or which are least reliable are liable to be moved off to other sections at any time. This is important for security reasons - it’s much harder for an attacker to control a section whose membership is in constant flux. But temporarily, prior to the completion of a churn event, sections may find they do not have group_size Elders, nor do they have sufficient Adults to promote to Elders. In this case, the group is defined as ‘incomplete’ and all section members influence node ageing.
Datachains
From the point of view of a section, Elders can exist in various states, including Live (active), Gone (moved to another section or offline) or Dead (banned). Changes in these states are recorded in a permanent ledger called a datachain, which is held by all members of the section. So, for example, if an Elder changes from the state Gone to Live, this will be recorded in two consecutive blocks in the datachain along with the cryptographic proofs of the voting process validating the event. Indeed, one purpose of the datachain is to give nodes a provable history on the network and to add resilience. The network can survive large-scale churn events or even a complete outage and be able to rebuild itself and reinstate the data according to the records stored in the datachain. Another purpose is to ensure the integrity and durability of data by providing a cryptographically secure record of its identity, validity and location. Datachains should also improve the efficiency of the network by reducing the number of nodes that need to store data.
The states of Adult and Infant nodes, as verified by the Elders, are also stored, but since these nodes cannot vote and play a less important role in the continuity of the network this record is only temporary.