I’m looking into CRDT data types for implementing a FileTree data type, and would like to understand how the existing CRDT type(s) work and what their features are.
For now I’m browsing the code and will post my notes, but any insights or summary that can be provided would be helpful (@oetyng do you know of any design notes which could be shared?).
Here’s how it works, AFAIK:
- each vault copy of a Sequence CRDT is stored as a single file, and loaded into memory in its entirety when in use for gets (e.g. range requests) or mutation (e.g. insert/delete)
- when the client accesses or mutates a Sequence CRDT, the entire object is fetched to the client, and each mutation is both applied to the local Sequence CRDT and sent to the vaults as an individual CRDT op. Note: this differs from local-first implementations, where local mutations are typically accumulated within the local CRDT and batched to other peers on demand. In SAFE, mutations are propagated as soon as possible so that they are not lost if the client device shuts down (or crashes). <- assumption
- the Sequence CRDT object will grow in size with every mutation, even if the size of the key-value entries stays the same, because the CRDT metadata grows as it records the entire history of changes.
- at some point this growth will become a problem for vaults, but even sooner for client devices and applications, because of the device memory needed to hold the Sequence and the ‘costs’ (e.g. latency and mobile data costs) of retrieving a copy from vaults.
- it is up to the client to encrypt the content (key-value entries) of a Sequence CRDT
- the CRDT metadata is, I think, not encrypted. <- assumption
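
To make the points above a bit more concrete, here is a toy sketch in Python. It is not the actual SAFE code: `Op`, `ToySequence` and `ToyClient` are made-up names, the "vaults" are just in-memory objects, and the ordering rule is a simplification. It only illustrates the behaviour as I've assumed it above: every mutation is applied locally and sent straight to each vault as an individual op, and the recorded history keeps growing even when the visible entries shrink.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

OpId = Tuple[str, int]  # (actor, counter): hypothetical unique op identifier


@dataclass(frozen=True)
class Op:
    actor: str
    counter: int
    kind: str                      # "append" or "delete"
    value: bytes = b""             # payload (would be client-encrypted)
    target: Optional[OpId] = None  # op being deleted, for "delete" ops


@dataclass
class ToySequence:
    # Full op history; nothing is ever removed, so this only grows.
    ops: Dict[OpId, Op] = field(default_factory=dict)

    def apply(self, op: Op) -> None:
        # Idempotent: applying the same op twice has no extra effect.
        self.ops.setdefault((op.actor, op.counter), op)

    def entries(self) -> List[bytes]:
        # Visible entries = appended values whose op has not been deleted.
        deleted = {op.target for op in self.ops.values() if op.kind == "delete"}
        live = [
            op for op in self.ops.values()
            if op.kind == "append" and (op.actor, op.counter) not in deleted
        ]
        # Simplified deterministic order; a real sequence CRDT is more involved.
        return [op.value for op in sorted(live, key=lambda o: (o.counter, o.actor))]

    def history_len(self) -> int:
        return len(self.ops)


class ToyClient:
    def __init__(self, actor: str, local: ToySequence, vaults: List[ToySequence]):
        self.actor, self.local, self.vaults = actor, local, vaults
        self.counter = 0

    def _mutate(self, kind: str, value: bytes = b"", target: Optional[OpId] = None) -> OpId:
        self.counter += 1
        op = Op(self.actor, self.counter, kind, value, target)
        self.local.apply(op)       # apply to the local copy
        for vault in self.vaults:  # and send immediately, one op at a time
            vault.apply(op)
        return (op.actor, op.counter)

    def append(self, value: bytes) -> OpId:
        return self._mutate("append", value=value)

    def delete(self, target: OpId) -> OpId:
        return self._mutate("delete", target=target)


vaults = [ToySequence(), ToySequence()]
client = ToyClient("alice", ToySequence(), vaults)

first = client.append(b"a")
client.append(b"b")
client.delete(first)

print(client.local.entries())      # one visible entry remains
print(client.local.history_len())  # but every op is still recorded
```

After two appends and one delete, only one entry is visible, yet each copy still holds all three ops. That is the growth problem in miniature: the object a client has to fetch scales with the number of mutations, not with the current content.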
Let me know if I’ve got any of this wrong, or make the corrections yourself and post a reply noting the change.
@maidsafe all hints appreciated, thanks.
This post is a Wiki, so feel free to correct or add to it, with refs where useful.