RFC 29 – Data Chains

Viv · August 30, 2016, 1:54pm

Discussion topic for RFC 29 – Data Chains

dirvine · August 30, 2016, 2:19pm

The way I see this coming into play is in phases. Here are some example notions, although I believe there is a great deal more that can be achieved. These points are merely to help the reader see a larger picture and perhaps see other immediately beneficial uses.

Data republish: allow a node to prove that a DataBlock is valid on the network. The block holds the type and hash of the actual data, so the block can exist without the data, but the data can appear at any time and be known to be valid.
Network entropy measure: we can measure a block's lifetime on the network in terms of where it was created (first seen) and which bucket it was in (as the network grows or shrinks, buckets or common leading bits change). This can feed many measurements and ranking mechanisms, such as how many churn events a node has been through in a particular group, or noting a mass increase in a data type (such as safecoin) in particular parts of the network (there is a lot here that can be pattern analysed).
Support for ledger-based systems (with a small change to SD as currently coded in the data_chains repo), where an SD item can be Put with a ledger bit set (so not at version 0). This can be used on a per-type or per-transaction basis, so features such as "I want to keep a receipt of a safecoin transaction" become possible. The same can be used where we want appendable items to persist (force the ledger bit to be set on immutable comments etc.).
Ability to track a node's history in the network even with changing IDs, if it were deemed useful to track a node's rank over sessions or through time.
The ability for differing size nodes, where not all nodes in a group hold all data. This is a simpler mechanism but likely vital.
Ability, via data republish, for a network to collapse fully and then reconnect without data loss. This is why archive nodes initially hold more data than they can be asked for. It also helps with catastrophic software failure (such as a bad update, although this shouldn't happen if we implement update validation).
Ability, if required, to force single-use crypto keys? This makes joining more difficult as the network grows over time.
With merkle locks in place (Qi is currently adding the detail to the RFC), the genesis groups can be secured and chains validated from that point.
The possibility of two networks (re-)combining from differing genesis blocks. This is interesting and does require rules as to which data will be accepted and how it's paid for, but it could be a considerable addition to decentralised networks, perhaps even in read-only mode to start with. This is for sure a separate phase/RFC group, but very interesting.
Securing group claims across the network. With secured genesis blocks then link chains can be transferred across the network to prove a valid network node in a particular group. This could be granularised based on distance between nodes (xor) and secured checkpoints (the location in the chain where groups split).
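The data republish point above can be illustrated with a minimal sketch. It assumes the identifier held in a DataBlock is simply a (type, hash) pair and uses SHA-256 as a stand-in hash; the names `DataIdentifier`, `identifier_for` and `is_valid_republish` are hypothetical and not taken from the RFC or the data_chains repo.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class DataIdentifier:
    """Hypothetical stand-in for what a DataBlock holds: type plus hash."""
    data_type: str  # e.g. "immutable" (illustrative value)
    digest: bytes   # hash of the actual data (SHA-256 assumed here)


def identifier_for(data_type: str, data: bytes) -> DataIdentifier:
    # Build the identifier that would be stored in the chain.
    return DataIdentifier(data_type, hashlib.sha256(data).digest())


def is_valid_republish(identifier: DataIdentifier, data_type: str, data: bytes) -> bool:
    # The chain kept only the identifier; data that reappears later is
    # accepted only if it reproduces exactly the same type and hash.
    return identifier == identifier_for(data_type, data)
```

The chain entry can outlive the data itself: `is_valid_republish(ident, "immutable", payload)` only returns `True` when the republished payload matches what the block recorded.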

davidpbrown · August 31, 2016, 3:23pm

Is it ignorant to wonder whether datachains are very much like blockchains in cryptocurrency? If so, is it worth making the similarity more obvious and highlighting the differences in what can and cannot be done? For example, most cryptocurrency blockchains cannot do colour, that is, treat each data point as unique rather than fungible; yet colour could be powerful in bridging object ownership in the real world. Are datachains likely to be more flexible than current blockchain technology, or are they deliberately adopting the best of it?

dirvine · August 31, 2016, 11:01pm

I am now also proposing a simple but important change. As nodes could keep keys or harvest them, recreating groups is potentially possible. Although this seems far-fetched, it becomes a definite issue when MaidSafe has the genesis keys etc.

The fix, though, also resolves a niggle from the design, namely the linkDescriptor and its purpose.

So I propose that the linkDescriptor is now:

Hash(Hash of current group + hash of all data blocks + hash of previous valid link block)
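A minimal sketch of that formula, under stated assumptions: SHA-256 stands in for whatever hash the implementation uses, group members and data block identifiers are plain byte strings, and both sets are sorted before hashing so that nodes holding the same set compute the same value. The function names `h` and `link_descriptor` are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    # Stand-in hash (SHA-256 is an assumption, not the RFC's choice).
    return hashlib.sha256(data).digest()


def hash_of_set(items) -> bytes:
    # Hash each item first so the concatenation is unambiguous, and sort
    # so the result does not depend on the order items were received in.
    return h(b"".join(h(item) for item in sorted(items)))


def link_descriptor(group_members, data_block_ids, previous_link_hash: bytes) -> bytes:
    # Hash(hash of current group + hash of all data blocks
    #      + hash of previous valid link block)
    group_hash = hash_of_set(group_members)
    blocks_hash = hash_of_set(data_block_ids)
    return h(group_hash + blocks_hash + previous_link_hash)
```

Because the previous link's hash is an input, changing the group, the set of data blocks, or any earlier link changes the descriptor.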

As blocks will be out of sequence, the blocks (identifiers only) should also be passed in the LinkBlock (but not stored in the chain). This allows all nodes to agree on the blocks in this part of the chain and makes it immutable as well as signed. If a block is seen by any node, all nodes locate that block in this chain component (it will still either validate or not on a majority, regardless of out-of-order receipt of nodeblocks). A possible downside is that a chain may then contain blocks that never validate or get cleaned up. This may not be a showstopper though.
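The majority behaviour described here can be sketched as a simple vote tally. This assumes "majority" means more than half of the current group and that a vote is just a (block identifier, node name) pair with signatures already checked; the class `PendingBlocks` and its methods are hypothetical names for illustration.

```python
class PendingBlocks:
    """Tallies nodeblock votes per data block, in whatever order they arrive."""

    def __init__(self, group):
        self.group = frozenset(group)
        self.votes = {}  # block identifier -> set of nodes that voted for it

    def quorum(self) -> int:
        # Assumption: a strict majority of the current group.
        return len(self.group) // 2 + 1

    def add_vote(self, block_id, node) -> bool:
        # Votes from outside the group are ignored; repeat votes from the
        # same node count once because votes are kept in a set.
        if node in self.group:
            self.votes.setdefault(block_id, set()).add(node)
        return self.is_valid(block_id)

    def is_valid(self, block_id) -> bool:
        return len(self.votes.get(block_id, ())) >= self.quorum()

    def unvalidated(self):
        # Blocks that have been seen but have not (yet) reached a majority.
        return sorted(b for b in self.votes if not self.is_valid(b))
```

Interleaving votes for different blocks does not change the outcome, and `unvalidated()` exposes the downside noted above: entries that never reach a majority simply stay pending unless something cleans them up.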

This effectively locks each block of data between churn events, including splits and merges.

The added bonus is that the group can re-form through churn and this is covered as well, a potential issue @AndreasF mentioned earlier today. In addition, it makes checkpoints easier to guarantee via a series of locked links and prevents data blocks being maliciously removed from a chain.

This small change, I believe, secures against malicious removal and key harvesting, and removes the need to care about who created the genesis block.
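To illustrate why a series of locked links resists malicious removal, here is a minimal sketch that rebuilds each descriptor from the previous one and rejects a chain where anything has been dropped. It is a simplification: signatures are omitted, SHA-256 is assumed, and the previous link's descriptor is used as "the hash of the previous valid link block". All names (`descriptor`, `build_chain`, `verify_chain`, the `GENESIS` placeholder) are hypothetical.

```python
import hashlib

GENESIS = b"\x00" * 32  # placeholder for "no previous link" (assumption)


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def descriptor(group_hash: bytes, block_ids, prev: bytes) -> bytes:
    # Order-independent hash over the data block identifiers in this segment.
    blocks_hash = h(b"".join(h(b) for b in sorted(block_ids)))
    return h(group_hash + blocks_hash + prev)


def build_chain(segments):
    """segments: list of (group_hash, block_ids) between churn events."""
    chain, prev = [], GENESIS
    for group_hash, block_ids in segments:
        d = descriptor(group_hash, block_ids, prev)
        chain.append({"group_hash": group_hash,
                      "block_ids": list(block_ids),
                      "descriptor": d})
        prev = d
    return chain


def verify_chain(chain) -> bool:
    # Recompute every descriptor; a removed block or a removed link
    # changes a hash somewhere and the check fails.
    prev = GENESIS
    for link in chain:
        expected = descriptor(link["group_hash"], link["block_ids"], prev)
        if expected != link["descriptor"]:
            return False
        prev = link["descriptor"]
    return True
```

Removing a data block identifier from any segment, or dropping an earlier link entirely, makes `verify_chain` return `False`, which is the locking effect described above.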