Data Chains - Deeper Dive


#1

Introduction

This document is a work in progress and is likely to change somewhat, with some sections marked as “TBC” (to be confirmed). It is a slightly deeper discussion regarding data chains, but does try and stay high level enough to be able to convey the important messages. Along with these desires it is hoped that this is also as brief as possible, allowing discourse and hopefully agreement on the fundamental aspects of the network. This will include:

  • data chains
  • peer age
  • peer penalties (TBC)
  • secure message transfer (TBC)

Consensus types

  • Remote network consensus: This is the consensus of events relayed securely to our section.
  • Local consensus: This is the consensus of events observed by our current section.
  • Chain (offline) consensus: This is a sub-set of local consensus which provides a cryptographically verifiable order of events.

Definitions

Complete group: at least GROUP_SIZE peers with age >4 in a section.

Elders: the GROUP_SIZE oldest peers in the section. If there is a tie, we use the tie breaker rule for node aging. They are the only ones who have a routing table and will be in routing table.

Tie breaker rule for Node Aging: If there are multiple peers of the same age then XOR their public keys together, hash the result and find the one XOR closest to this hash.

Adult: in a section with a complete group a peer with age >4. They are only connected to their section Elders.

Infant: non elder with age ≤ 4. They are only connected to their section Elders.

Vote: a node’s detection of a network event plus its signature of it.

Proof: a node’s id and its signature against a network event.

Block: network event with a vector of proofs collected.

Quorum: Majority of valid voters && > 50% of total age of valid voters.

Group consensus: a block containing a Quorum of proofs of the elders.

Section Membership: All the members of a single section, comprising the elders, adults and infants. All of whom hold the data chain.

Neighbours: sections we are supposed to be connected to differing in exactly one bit from us.

Sibling: A section differing from us in the last bit.

Prefix: The bits that identify the section. It additionally has version that increments by one each time the prefix appears on the network.

Static group and quorum sizes

When a section does not have a complete group, all oldest peers up to GROUP_SIZE are Elders and all section members influence node aging.

All peers must get all section relevant data, sign it and provide their votes to be accepted into the section. This replaces/enhances resource proof mechanism.

Newly joining peers enter the section as Infants with age 1. In order for the number of Infants in a section not to grow out of control, once our section has a Complete Group we also stop accepting new Infants (with age 1) if our section already has such an Infant.

Adults and Infants only store the data for now, and only connect to their section’s elders. Elders store the data and connect to all peers in their section and all elders in neighbouring sections.

There is no attempt in this paper to find other work for adults and infants, but it is very likely they can further reduce load on the elders.

Detailed design

Types

The DataChain will contain a permanent record of states for all elders in a section (this will be flushed to disk periodically) and a temporary record for adults and infants. The permanent one named blocks and will be Vec<Block<ElderState>>. Meanwhile the transient one named valid_peers and will be Vec<Block<AdultOrInfantState>>.

enum ElderState {
    /// Accepted but not yet lost.  May have joined or been relocated here, or may
    /// have come back via a merge.
    Live(PeerId),
    /// Cannot ever become live again
    Dead(PeerId),
    /// Gone to another section via "split" or become an Adult (Demoted).  Can
    /// again become live here.
    Gone(PeerId),
}

enum AdultOrInfantState {
    Live(PeerId),
}

Ordering constraints on ElderState for Complete Groups

The ElderState type blocks will come in pairs, where Live immediately follows Dead or Gone. DataChain::blocks should be reordered as these blocks become valid so we have the above pairings. (e.g. if two Dead blocks become valid, we insert the corresponding Live block between them once it becomes valid). The ordering of these pairs are defined by the valid voters.

/// Traversing the blocks from reverse (newest to oldest block)
for (elder_state, i) in blocks.iter().enumerate().rev() {
    match elder_state {
        ElderState::Dead |
        ElderState::Gone => assert!(blocks[i-1], ElderState::Live), // if not then reorder
    }
}

A Vote is defined as:

struct Vote<T> {
    payload: T,
    signature: Signature,
}

The peer ID is omitted here as it is implicit in direct messages. The signature is the sender’s signature of the payload.

struct PeerId {
    age: u8,
    pub_key: PublicKey,
}

struct Proof {
    peer_id: PeerId,
    sig: Signature,
}

The peer_id of the Vote's sender is inserted by the receiving peer to turn the Vote into a Proof.

struct Block<T> {
    payload: T,
    proofs: BTreeSet<Proof>,
}

struct Prefix {
    bits: BitVec,
    version: u64,
}

Intra Section Voting

A peer voting can be seen as sending either a Vote or a Block that contains its proof.

Vote's are sent and received only between elders while Blocks are sent to all the section members.

Peers will only send a Vote or Block once.

on local event:

  • if block exists with our proof -> end
  • else: vote to elders and trigger handle_vote_received(true)

handle_vote_received(swarm: bool):

  • if the block doesn’t exist
    • generate new
  • else:
    • add proof to existing block
    • if it becomes valid
      • add your proof to the block if not already there
      • if swarm
        • swarm the block

handle_block_received:

  • if the block is not valid based on current section -> drop the Block. We only use the Proofs inside the given Block to check the validity and not add the extra Proofs that we already know of.
  • Call handle_vote_received(false) for every Proof that’s inside the Block.
  • Swarm the block (the one formed inside handle_vote_received)

Block insertion rule:

Insert the blocks into the chain and action on it (e.g. sever connections or other steps necessary) only if correct pairing is obtained. Until then hold the block but do nothing because of it.

The goal of the insertion rule is to always get to GROUP_SIZE Elders.

Block States

Block states are represented as:

pub enum BlockState {
    NotYetValid,
    Valid,
    Full,
}

NotYetValid: Contains Proofs < Quorum

Valid: Contains Proofs >= Quorum && != Full

Full: Contains Proofs == maximum possible Proofs (will be defined later on)

ElderState (Only for the DataChain::blocks)

Live(PeerId, u8)

A peer in this state is a valid voter.

Trigger for voting

Once we have more than GROUP_SIZE peers in the section, this should follow the voting of Dead OR be followed by Gone if we saw these ourselves OR if we got a valid block containing any of these events.

On consensus

Add the peer to the routing table.


Dead(PeerId)

Never allow the PeerId to become live again. Peer relocated on churn or voted to be killed because of agreed accusations.

Trigger for voting

As we receive/form a valid block of Live for non-infant peers, we take the Hash of the event H. Then if H % 2^age == 0 for any peer (sorted by age ascending) in our section, we relocate this node to the neighbour that has the lowest number of peers.

If two peers have the same age, use the tie breaker rule. If all neighbours have the same number of peers we relocate to the section closest to the H above (that is not us).

OR

When we detect behaviour deemed enough to kill a node. This will form a part of accusations which is intended for later.

OR

Peer has gone offline. Can happen out of order, but forces next state.

On consensus

Remove the peer from the routing table and disconnect if not already disconnected.


Gone(PeerId)

A peer leaves the Elder group in this section but is allowed to come back to the Elder group with the same PeerId. It cannot however join by restarting.

Trigger for voting
  1. Split where the peer ends up as an elder in the sibling section
  2. Merge where the peer ends up becoming an Adult from an Elder
  3. When an Elder is accepted and pushes an existing one out as an Adult
On consensus

Remove peer’s entry from current section in the Routing Table.


Adults and Infants

They are not a part of the chain. Adults and Infants are stored as Blocks (Live) in a separate container and these entries are transient. Once an Adult or Infant is no longer a part of our section (either Dead or Gone) then we remove its block.

The blocks of Adults and Infants are always signed by the current Elders. If a new Elder becomes Live in our section then it must provide its proof for all the existing Blocks of Adults and Infants. If an existing Elder is Dead or Gone then its proof must be removed from the all these Blocks.

By keeping the container separate (from the main Elder chain), pruning it constanty as these peers disappear and providing Proofs from only the most recent Elders we keep the Adults and Infants blocks bubbled up to the top instead of embedded deep in the chain where we need to search more expensively to find out about them.


Merge

Data structures

struct SiblingEldersRpc {
    /// This is our vote for each of the sibling elders' Ids
    sibling_elder_ids: Vec<Vote<PeerId>>,
}

struct MergeInfoRpc {
    /// Our Adults and Infants
    adults_and_infants: Vec<Vote<PeerId>>,
    /// PeerIds of the additional neighbours
    new_neighbours: BTreeMap<Prefix, Vec<Vote<PeerId>>>,
}

The algorithm

Say we are an elder in the section S00, merging with S01 into S0.

  1. We realise that we need to merge, due to one of the two reasons:
    • we have come down to 0 adults
    • we accumulate a valid block from votes sent in MergeInfoRpcs by our sibling
      Note: if we started the merge because we had no adults, we don’t consider a valid block from MergeInfoRpc a trigger anymore
  2. Once we realise that, we send a SiblingEldersRpc to everyone in our section. The RPC contains our votes for every elder in the sibling section.
  3. When receiving SiblingEldersRpcs from the other elders, we accumulate the votes and create blocks in the cache.
    • Once a block becomes valid, we pass it to everyone in our section (to make sure that even if some malicious peers voted only to us, everyone will see it)
  4. Once we accumulate a quorum of valid blocks from SiblingEldersRpcs, we start sending MergeInfoRpcs to every elder in the sibling section.
    • note: every PeerId is accumulated separately - we wait for a quorum of those
    • If there is any change to our adults or infants, we send the RPC again with the updated data.
  5. Once we get at least a quorum of MergeInfoRpc (note: we should get up to GROUP_SIZE messages initially; there may be updates later, but we don’t wait for them), start doing the following in parallel:
    • vote for Live(x) to everyone in the sibling section if x is in our section and is supposed to be an elder of the merged section;
    • vote for Gone(x) to everyone in our section if x is an elder of our pre-merge section, but shouldn’t be an elder in the merged section;
    • vote for Gone(x) to everyone in the sibling section if x is an elder of that section, but shouldn’t be an elder of the merged section;
    • if a SiblingEldersRpc block accumulated during voting, adjust our view of the post-merge elders and send appropriate votes (ie. if we thought the post-merge elders will be 1,2,3,4 and after the update we think they should be 1,2,3,5, vote Gone(4) and Live(5))
    • the above ensures that every block will have a quorum of current elders, regardless of the order peers put them in the chain
    • every vote for Gone(x) above means also voting for AdultsAndInfants::Live, and analogous for Live(x)
  6. If any churn event happened since the time we realised the necessity to merge, send the vote for it to every elder in both sections.
    • If a Dead(x) accumulated for x that was supposed to be an elder in the merged section, calculate the next elder y (based on all the accumulated elders, adults and infants) and vote Live(y)
  7. Specific rules for voting:
    • If you were an elder before the merge started, you send the votes to appropriate sets of peers (defined by the rules above) for the whole duration of the merge, even if you have been demoted in the meantime
    • If you were an elder in a pre-merge section and receive votes from any peer who was an elder in a pre-merge section or is supposed to be an elder of the post-merge section (might be a pre-merge adult if an elder dies during the merge), you consider that vote to be valid (regardless of their elder/adult status) and form a not-yet-valid block. The same applies to received blocks.
    • If you are an adult that was a pre-merge elder, you send your votes to all the peers you perceive as current elders.
    • If you are an elder during the merge and receive a vote from a pre-merge elder who is now an adult, you forward it to all the pre-merge elders.
    • The standard rules for block validity apply - a block is valid if it has a quorum of votes from the members as defined by the previous block. The rules should ensure that we will receive votes from at least a quorum of voters from both pre-merge sections throughout the whole merge, which will allow us to find an order for them that makes them valid. At the same time, the malicious nodes will be too few to be able to form a block that will be considered valid at any point of the chain.
  8. Append all accumulated adults and infants to the adult/infant chain (except the ones that might have become elders in the meantime).
  9. Connect to the new neighbours (the ones that were your sibling’s neighbours but not yours before the merge) - the information about them was in the MergeInfoRpc struct.
  10. The merge is complete once the target elder group has been formed (all the agreed target elders are either promoted or dead; they may die during the merge).

Example

  1. S00 decides to merge with S01.
  2. S00 sends SiblingEldersRpc to its own section - S00…
    Each elder in S00 fills it up with its votes for what it considers are the elders in the sibling. We will accumulate on per PeerId basis. It’s OK if some members disagree on the complete membership of the sibling, at-least the quorum of them will be agreed upon. The rest will fall into place due to eventual consistency after following the algorithm to the end.
  3. S00 sends MergeInfoRpc to S01 to indicate about its adults and infants, after SiblingEldersRpc accumulates for it. The accumulation is done on per PeerId being voted on, not the entire set at once. The accumulation of this in S01 will make it understand there is a merge happening and it needs to react if it hasn’t already (which might happen if both simultaneously realise they want to merge - it is symmetrical so shouldn’t matter) - this is when you have a valid SiblingElderRpc accumulated.
  4. S01 will do the same thing - send SiblingEldersRpc to its own section, S01, and send MergeInfoRpc to the sibling.
  5. Whichever peer gets PeerIds from SiblingEldersRpc accumulated, starts voting for appropriate Live and Gone events.

Split

Data structures

We define an additional RPC:

struct SplitRpc;

The only purpose of this structure is to be voted for by the elders, and when it accumulates, everyone will know that they are, in fact, splitting. Otherwise, the following Live and Gone events could be mistaken for natural churn.

The algorithm

Assume we are an elder in a section that is about to split.

  1. We realise that we need to split when both child halves of our section have at least GROUP_SIZE + SPLIT_BUFFER peers with age ≥ 4 (the age condition being in force only once the section has a complete group).
    • A child half is the set of peers that belong to our section and match the prefix one bit longer than our current prefix. There are always two child halves, for p0 and p1, where p is our current prefix.
  2. When we realise the need to split, we cast a vote on SplitRpc. We use the usual voting algorithm.
  3. When SplitRpc accumulates, we start the split.
    • The valid SplitRpc block is not recorded in the chain.
    • We calculate which child half of the section we will belong to. Let’s call this half “our half”, and the other one “sibling half”.
  4. All pre-split and post-split elders cast votes for all Live and Gone events that need to happen to bring the set of the elders to the post-split configuration in both sections. The votes are sent to all the pre-split and post-split elders that fit the corresponding prefix.
    • Say our section is S0 which is splitting into S00 and S01, and we vote for Gone(1). 1 will belong to S0, which means it only needs to be gone in S01 - so we send the vote only to the peers matching the prefix of S01.
    • This ensures that both branches of the chain will have quora of votes for every block. When the section splits, every elder belongs to one of the children, but still has to vote for a few blocks in the sibling section until they vote for him being Gone.
    • The blocks will generally have more than GROUP_SIZE votes. Once a block is accumulated and inserted into the chain, the rendundant votes are dropped - only the ones signed by peers implied by the previous block to be the elders are retained.
  5. The same as 4 happens with the adults and infants. The pre-split and post-split elders vote for Gone for every adult and infant and send the votes to the relevant peers. In addition, the post-split elders that weren’t pre-split elders need to vote Live for every relevant adult and infant.
    • The Gone blocks for adults and infants aren’t saved in the chain, but cause removal of the corresponding Live blocks from the adults_and_infants chain.
  6. All pre-split and post-split elders vote for any other event relevant to their new prefix.
    • Example: if peer A that would be a part of S01 died, all pre-split elders matching S01 and all S01’s post-split elders vote for Dead(A)
  7. The split is considered complete when the final set of the elders is promoted.

Example

There are sections S0, S10 and S11. S0 currently has 1, 2, C, D as elders, and is about to split into S00 (will have 1 - 4 as elders) and S01 (will have A - D as elders).

The following is the flow:

  1. A Live becoming valid triggers S0 to start split process.
  2. The elders of S0 vote for SplitRpc, sending the accumulated blocks to the adults and infants as well
  3. Once the above accumulates:
    3.1. 1, 2, C, D, 3, 4 cast votes for Live(3), Live(4), Gone(C), Gone(D) to 1, 2, 3 and 4 (they don’t vote for blocks where it doesn’t make sense, though - as in, 3 doesn’t vote for Live(3) etc.)
    3.2. 1, 2, C, D, A, B cast votes for Live(A), Live(B), Gone(1), Gone(2) to A, B, C and D
    3.3. When the blocks accumulate, they are swarmed to the elders of neighbouring sections and adults/infants of own section
  4. Adults/Infants within S0, when they receive new accumulated Live/Gone blocks:
    • Record those blocks regarding elders that will be in the same new section with us.
  5. Elders within S10 or S11, when they receive new accumulated Live/Gone blocks:
    • Disconnect from elders that they are no longer supposed to be connected to, and start connecting to new promoted elders.

#2

Do they still cache data. Or just Adults (& elders) cache data Or just elders?

Does this grow *forever* Or is it pruned from time to time?

How long is this information kept? I could see in a few years this list could be so long that it takes long time to search (relatively speaking) when required and/or the size is very large on disk.

Sorry if I missed the answers to these questions.


#3

We haven’t decided on that yet. It’s a double-edged sword really - the plan for handling client messages (which are the ones which will contain cacheable data) means that only the elders are involved passing these between sections. So to allow adults and/or infants to cache data, we’d have to make the elders do extra work to pass these to their non-elders at intermediate hops.

We have a few ideas we’re mulling over regarding pruning this, but none settled on yet.

Yes, agreed. I think this is related to the pruning question really. At the moment for this initial cut, we’re aiming to keep the ElderState blocks permanently (not just the Dead ones, but all of them). Each block should only be around 1kB, so the size on disk shouldn’t be a problem for a while (I bet that’s what they said at the start of Bitcoin too! :smiley: ).


#4

Thats what the post seemed to indicate. Thanks for the info

Yep from someone who been around computers since my young teens all those decades ago, I know you will have to prune along the way. Blockchains show us that too.

Thanks for letting us know that this is still being worked on.

So has there been any estimations/simulations done yet on the expected number of blocks being generated. I know time is not a SAFE thing, but we work with time, so how many blocks for a typical section would be generated per hour/day shown in the estimates/simulations?


#5

Ah that’ll depend on number of churns/events happening on the network. So whatever the simulation tweaks the parameters to change its probabilistic model. The only proper answer would be from statistical collection from a live network. What we do know however is the block generation (ordering) constraints. So a Gone for instance must be followed by a Live in a complete group etc.

As for a simulation we will be completing one of our simulators soon though, link to it and will run that and publish the results for the curious.


#6

For ease of understanding at a high level, how about adding the following definitions to the intro?

group
group_size
complete group
section


#7

A complex thing to understand, but very clear and concise. I’ve followed some of it, am not quite getting a fair bit, and some not at all yet. But I’ve only tried reading it twice :slight_smile:

Well done to the author(s)!

One area I think is not just me, so might benefit from clarification…

Types

I’m confused from ‘Types’ :

This seems contradictory to me.

Subsequently it says:

So the first quote seems to incorrectly say that Dead “cannot ever become become Live again”

But then again we have this being:

… so it is not clear what’s meant by these seemingly contradictory statements.

Maybe the confusion is between block type order in the chain, and permitted state changes for a node? Anyway, I think this area needs clarification, and that first quote does seem to contradict itself.

So perhaps the chain block type order is:
block of Gone -> block of Live -> block of Dead -> block of Live -> …

And allowable node state changes:

Live -> Gone
Live -> Dead
Gone -> Live
Gone -> Dead
Dead -> Gone

Or something (again, not entirely defined I think).

I get a bit further but ‘Merge’ not much at all yet. :slight_smile:

Great to read this, and to hear that it will soon be coded up. I’m very excited.


#8

Only remarks about the form and not the substance:

I didn’t really receive an answer about why the traditional group_size value has the same value as the number of elders in charge of a section (see this post in the other forum):

Even if Maidsafe estimates now that these 2 values are the same, tests may reveal later that security must be emphasized more on replication factor or on contrary more on the number of elders. Having the possibility to change their value independently would allow testing different hypothesis. Also, even if the values are the same, having a proper name for entities helps for a more readable code.

For example, using 2 parameters like “replication_factor” and “council_of_elders_size” would be more maintainable and clearer than using unique parameter “group_size” (these terms are only propositions to illustrate what I mean).

Again, about terminology clarity: Block is too vague and doesn’t convey any meaning. I propose something like “BallotBox”, “ElectionResults” or “PollResults”? This is exactly this: a set of votes about one event.

In conclusion I would say that using a better terminology would help for a more understandable specification and later for a more maintainable code.


#9

Yea, we had restart capability then pulled it at the last minute. It’s in our other doc which has a lot more in it. so this is a summary of the bigger more detailed doc. Right now the states are
Live Gone Dead Where Live is active, Gone means it is on the network but not in our section OR has been demoted to an Adult (i.e. not in our Elders group, Dead means that ID is no longer able to appear on the network (its banned). The confusion is the Dead used to be Killed or Relocated where a Lost peer could reconnect and be relocated with loss of 50% age. This will still happen, but removed from this summary. I guess the wording was not properly edited there. sorry for the confusion there chap.

Sorry you did not get an answer, the Elders are the GroupSize oldest peers in a section, only if that contains no infants is it complete. Now it is a bit more subtle, each churn event (membership change) means this group of Elders is a running transition set of

  1. state -> incomplete
  2. state -> complete

So each time we lost a peer for whatever reason we are at group size - 1, consensus is then based on majority of group size - 1 and >50% of age of valid voters. However to add a new peer we are also at group size - 1 so still incomplete.

After churn we are complete to do the network bidding, but at churn the section has an incomplete group, but the next event makes it complete. I hope that makes sense.

Each Block is effectively through state transition of the event type and valid voters locked in an order. This order can vary between two peers but will have the same start and end point due to quorum being used. i.e. you may see A being replaced by X and B being replaced by Y, I may see B being replaced by X and A being replaced by Y.

We both started with AB and ended with XY in an order cryptographically proven to be valid, however we took different routes to get there based on out of order messages and peers seeing events in different orders etc. It’s quite powerful.

So it is a chain of Blocks and the blocks contain a payload of an event type for routing sections. They may also have a payload of Data types (DataIdentifiers + Hash and that is where you start to see these blocks become very valuable for offline validation. There are a few nuances and we will discuss these as these deeper dives are published. It is actually rather neat, but many find it initially not possible. It then gets deep but well worth the journey, even if it kills me :wink:


#10

Really excited to see this. Still absorbing it but I have some initial questions and nitpicks.


‘Complete group’ definition would be clearer if it used ‘adults’ instead of ‘age >4’ - that’s what the definitions are for after all.

I feel the order of terms should go from ‘simplest’ to ‘most complex’ but it starts with a fairly complex one. So something like starting with defining infant, adult, elder and then talking about how they combine then talk about exceptional cases etc… just makes it easier to read if it’s sequential by conceptual complexity.


I’m a bit unclear on the definition of neighbours.

Is that one bit by value, or one bit by length? ‘Differing in exactly one bit’ is not clear to me.

Are the following sections neighbours?

1000000001
1001000001

Does this mean sections are connected to prefix.length other sections?

How does neighbours work for imbalance of prefix length? Are these neighbours?

1000000001
100100000
1000000001
10010000000

Why does sorting matter? Is only one node relocated per valid block? Is sorting intended to bias relocation of younger nodes over older nodes?


Really glad to see this topic being discussed in further detail. It’s fascinating.


#11

I wont detail this all here, but instead this link will answer this in much more detail (rfc)

Yes this is exactly correct, we only relocate one peer, if many have same age then we choose the closest to the hash of the churn event. We are still simulating the effects of various orders here (and a few other parts). This should be more complete by next update with the sim code.

I am really glad we have some community members with deep understanding and capability taking part in the conversation now. This is where we get the strength of the crowd on our side. Great so see.


#12

Yea, I was fully aware of this. Ever done OS writing and tried to estimate OS call usage :slight_smile: Fun stuff.

But I was hoping that someone might have done some simulations or estimations based on a ball park figure from the alphas/testnets. And then pulled in figures like 1/10 of that and 10x and 100x and even 1000x that figure to get an idea of what might happen if one section experiences high event rates.

Excellent

Summary

imagine Burns saying that


#13

From the answer, I think there is a misunderstanding because I simply meant the term “Block” is too vague. IMO a better term should be chosen for a more readable specification. As it contains a network event with a vector of proofs, why not something like “EventProofs”?

But personally, I think that proof is too strong because it is only a claim signed by a peer. And the differences between a vote and a proof are only implementation details because both are related to 3 elements:

  • The payload
  • The peer id
  • The signature

It’s just that the peer id has been removed from the vote because it is already part of the message and payload has been removed from proof because it has been put in common in the so called “Block”.

So, votes and proofs are basically the same things and to simplify the specification, I would only keep only one of these notions in it, preferably votes because technically speaking they are not proofs.

For example, I would delete Vote definition, rename Proof to Vote and rename Block to EventVotes:

struct PeerId {
    age: u8,
    pub_key: PublicKey,
}

struct Vote {
    peer_id: PeerId,
    sig: Signature,
}

struct EventVotes<T> {
    payload: T,
    proofs: BTreeSet<Vote>,
}

I feel the same vagueness about “group size” expression because the question “what is a group size?” has been asked many times by the community. Additionally, this expression is becoming ambiguous because previously it only meant the data duplication factor but now it also means the number of elders in a section.

Even if you give the same value to these two quantities, as I explained in my previous post it would be clearer to use the right term depending on the specific context, which is here the number of elders.


#14

I think this is because you are seeing only one use here. Its a generic struct over the payload. So the same pattern is used for message transfer where payload is a type of message, data storage and updates, where the payload is dataidentifier etc.

The reason for a Vote is that it is sent from one peer to another. The Proof is a proof of a vote in the Block. So the block contains proofs of peer votes, but the vote itself is a peer doing something. Kind alike the proof is what the Peer did where the Vote is it doing that thing, if you see what I mean. So the Vote has to say what it is voting for and the proof only needs the signature of the block payload and identifier of the voter.

Group size never only meant this, that is maybe a confusion by us, I am not sure. It comes from the Kademlia spec really, there you have group size (k), refresh, republish etc. we have gone a good bit away from pure kad, but this is where the group size comes from. It is selected in kad etc. as the number of nodes that cannot all go off line between a refresh.

Replication factor of data is not the same (although in kademlia it is :wink: ) and will likely not be on release, right now for testing it is, but if the group is secured using group size then the replication factor only need be as large or small as the number of nodes that cannot vanish before the network can act. That should be a relatively small number.

Here the elders are defined by the group_size, they are only elders because they are the X oldest adults where X is group size.

I hope that helps, I may be missing something fmr your comment though, sorry of I cam.


#15

This is the problem. Group size is too much overloaded:

  1. Original definition in Kademlia spec as you rightly indicate
  2. Data duplication factor
  3. Min section size (name of parameter containing its value in routing config file)

Ok, that’s a move in the right direction that will distinguish 1 and 2.

I also accept that 2 = 3 because to be able to duplicate each data n times a section must have at least n nodes (but formally, this should have been 3 >= 2).

Now this expression has a 4th supplementary role, that is the number of elders in a section. And this is where I argue because it is too confusing and not maintainable to reuse group size expression.

Instead, to make the data chain spec clearer, you should talk about (target) elder count.

And a possible way to implement 2, 3 and 4 values could be:

  • define 2 parameters in config file, for example data_duplication_factor and target_elder_count
  • define min_section_size in code as max(data_duplication_factor, target_elder_count). This is to insure that 3>=2 and 3>=4.

#16

This min section size will not be in the code though. Group size is all there is and it is only the number we decide for the number of Adults we wish to be Elders. I think it will be much clearer as data chains is implemented.


#17

OK here is my interpretation of the OP for the more technically challenged reader (like me), stepping back a little to add a bit of an overview. I’ve left out the ordering of the blocks as I still don’t understand that and one or two other details have been omitted too.

It seems to me that the purpose of datachains is more than simply tracking the state of Elders - that’s just one aspect of what they do, but the only one mentioned in the OP - and this is a cause of some of the confusion expressed in the comments afterwards - that and confusion over terms like ‘group’, it being a work in progress. Certainly getting a firm handle on things is not that straightforward, but it’s been fun learning and trying!

There are two things I’d really like to see. One is a succinct description (< 100 words) of what a datachain is and why it’s useful. I’ve rootled around but couldn’t find one anywhere. I remember reading a very short and simple description of a blockchain which made subsequent learning much easier, something like “a blockchain is an immutable ledger to which information can only be added, never changed or taken away. It is cryptographically secure and distributed across multiple machines, meaning that there is no need for a trusted central authority.” Could we do something similar for datachains?

The second is a description of how a datachain propagates across the network. I don’t really understand that at all.

Anyway, let me know if I’ve missed anything vital or got anything wrong.

A high-level look at consensus, node ageing and data chains

An autonomous network of decentralised nodes needs a set of rules to decide what is true and optimum, and what is potentially dangerous or counterproductive - since no human can do that for it. Because there is no central, universal source of ‘the truth’ (including current and elapsed time), such decisions, triggered by events on the network, must carried out by groups of nodes which reach a consensus by following rules as to what is and is not valid.

So, for example, a group of nodes might decide amongst themselves that a new member fulfils all the criteria required to join their number; conversely, they might decide that a particular node is acting suspiciously and should be thrown off the network. In another example, a group of nodes might decide that their number has become too small to ensure the required levels of security and that they should merge with a neighbouring group; and a large group of nodes might decide to split into two smaller entities.

Here, the word ‘group’ is used in general terms; as we’ll see, in the SAFE network it can have more specific definitions.

For a decision made by a group of nodes to be valid, a minimum number of them (a quorum) must agree. This quorum is set at more than 50% of the nodes. So, it there are eight nodes and five of them agree (by voting) on an event - say a new member trying joining the group is acceptable - then their vote will carry (the new member will be allowed to join) and their votes will be stored in a signed and secured block. If only four nodes agree then no action will be taken (in this example the node will not be allowed to join).

When it comes to decision making, not all nodes are equal. Some will have proven themselves to be trustworthy and to have sufficient resources (bandwidth, CPU, uptime), while others will be newer and/or less reliable. The concept of node_age gives a number to this reliability factor, ranging from 0 to 4. Nodes with a node_age of 4 are classed as Adults. They will have been moved around the network a few times and proven themselves to be reliable; they are trusted to vote on network events (e.g. allowing a new member to join). Nodes with a node_age < 4 are called Infants - they are not yet trusted to vote.

While Adults are capable of voting, in reality only the oldest and most trusted among them - the Elders - have that right. The Elders confer among themselves as to what is valid in their section. Only the results of their decisions (blocks) are shared with the Adults and Infants rather than the votes themselves. While they cannot vote themselves, Adults and Children are able to contribute by storing data chunks.

Sections

At any given time, each node on the SAFE Network is a member of exactly one section. A section is a group of nodes that is responsible for looking after data stored within a certain range of addresses on the network. The number of nodes in a section varies and fluctuates constantly through the process of churn, but at any time most sections will have between 10 and 20 nodes.

Within each section are a certain number of Elders, who are able to vote on events within the section, and also (usually - see below) a number of Adults and Infants. As well as being able to vote among their section peers (a process called local consensus), Elders within a particular section are also connected to Elders in sibling sections, near-neighbours as defined by their XOR address prefix. In this way, messages are passed across the network from Elders in one section to Elders in the next section, and so on.

If a particular section grows much larger than average (about 12 nodes), in general it will split into two smaller sections. (Note: in some circumstances, which we will not go into here, a split will not be possible.) Likewise, if, as a result of nodes leaving during ‘churn events’ the number of nodes drops below a certain level specified by a parameter called group_size, then its Elders will be triggered to seek a merger with a sibling group. At present, group_size is 8.

In an average section containing 12 nodes, there will be eight Elders plus a total of four Adults and Infants. In fact, eight (remember group_size = 8) is the minimum number of Elders that is permitted. So what happens if the number of Elders in a section drops below 8 as a result of churn, or an Elder being demoted or removed? Well, since an Adult also has node_age > 4, the oldest Adult node is simply promoted to the Elder type, i.e. it is given voting rights. If two Adults have the same age a ‘tie-breaker’ algorithm is deployed to decide which gets promoted.

A ‘complete group’ is a section containing a minimum of group_size Elders (we’ve been using 8). The most reliable of these Elders will be the least likely to be lost during a churn event. Infants, Adults and those Elders that participate least in voting or which are least reliable are liable to be moved off to other sections at any time. This is important for security reasons - it’s much harder for an attacker to control a section whose membership is in constant flux. But temporarily, prior to the completion of a churn event, sections may find they do not have group_size Elders, nor do they have sufficient Adults to promote to Elders. In this case, the group is defined as ‘incomplete’ and all section members influence node ageing.

Datachains

From the point of view of a section, Elders can exist in various states, including Live (active), Gone (moved to another section or offline) or Dead (banned). Changes in these states are recorded in a permanent ledger called a datachain, which is held by all members of the section. So, for example, if an Elder changes from the state Gone to Live, this will be recorded in two consecutive blocks in the datachain along with the cryptographic proofs of the voting process validating the event. Indeed, one purpose of the datachain is to give nodes a provable history on the network and to add resilience. The network can survive large-scale churn events or even a complete outage and be able to rebuild itself and reinstate the data according the records stored in the data chain. Another purpose is to ensure the integrity and durability of data by providing a cryptographically secure record of its identity, validity and location. Datachains should also improve the efficiency of the network by reducing the number of nodes that need to store data.

The states of Adult and Infant nodes, as verified by the Elders, are also stored, but since these nodes cannot vote and play a less important role in the continuity of the network this record is only temporary.


#18

There is a little more to add. It may make it easier, or worse :smiley: I never know. Anyway here goes.

There are several consensus mechanisms in play, the two main ones are

  1. Running Network Consensus
  2. Off-line consensus (data chain).

On the network a group can tell another group something and it will be believed. The other group is the next group on the binary tree (the xor space). So from A->B->C A cannot really tell C (usually) something as it’s not connected to it, so it tells B who is connected to it and can also see the membership changes etc, Then B tells C and so on. That is pretty much running consensus.

Then chain consensus where there is a progression of membership change events in the voting members of a section, this is the Elders as you point out (thanks). As each membership change happens it must be signed by the majority of the previous membership. This can be tightened to beyond majority where peers are penalised for not voting (but later, just take that as a future thing).

There is a lot that can happen now, so a client for instance on joining the network can connect to it’s close_group (which will also allow push notifications etc. so likely what will happen) and keep the last valid Block from that chain. On rejoin the chain in that section must progress form that Block forward, cryptographically secured. This amongst many other things allows us to know we are on the right network etc.

If the network restarts clients and peers do a similar thing, go back to the section they were in and things are again OK. Or they can be given data signed by the last known valid Block or a future generation of that (if they have been off for a while). This ignores the issue that many clients sign data so will be certain its their data via that route as well.

Back to running network consensus and chain consensus for a second. As the chain comprises unique blocks with unique membership at the section level we can do a trick. that is convert running network consensus into chain consensus. All we do is Vote as the current Elders on the event/data/upgrade etc. and we can create a Block for that event. It can sit on top of the chain signed by the most recent Elders and all is well in the world. So we can basically store any descriptor or data or pretty much anything like this. Converting to local (chain) consensus and locking it into the chain.

This trick also has another side effect, for a peer to join a section it can be given all data, in fact all recent Blocks and give he section Elders the Votes for all the Blocks. In terms of data it means they must get all the data and provide signatures of it to us. This replaces resource proof with a different work, this work is the network checking, can this peer get all this stuff and sign it. So then we know the peer is good and holds the data, it signed to say so. Now peers joining a section have worked hard, but have signed they have the data, so when we tell them to deliver data to a client and they do not, then there is no excuse, they should be punished (probably killed).

It goes on a good bit, but this is where I think we can start to really see the benefits of such a chain. Also as membership blocks are unique, we can prune the chain very efficiently. No need to grow forever.

There is a lot more but it is pretty exciting now to see many more folks getting involved in the thought experiments here.


#19

Great post @JPL thank you.

I recall a post or two by David when he first had the idea which might be a basis for this. I may have saved them the links.

I haven’t tracked down the post I’m thinking of yet, but here is where Datachains / Data_chains were first discussed I think.


#20

Thanks David. Turning in now but I’ll read and digest tomorrow.