Thanks for your contribution. We decided to implement some of the changes you suggested. However, it is worth noting that even with those changes it is still possible to exceed the limits of a particular mutable data chunk. To understand why, let me briefly explain how mutations work:
When a client sends a mutation request against a data chunk, the request is received by the group of nodes responsible for that chunk. Those nodes then vote on the mutation (this is where the validation happens), and if a majority of them accept it, they send a message among themselves to apply the mutation to the data in their chunk stores.
Now imagine two clients sending mutations against the same data chunk concurrently. Say the data already has 99 entries (one below the limit) and each client attempts to insert one entry. The nodes receive one of the mutations and validate it against the data in their chunk store. This validation passes, so they vote for it. Then the other mutation arrives and they again validate it against the data in the chunk store. This still passes, because the vote on the previous mutation hasn’t been finalized yet, so the chunk store still holds 99 entries. So they vote for it too. After both votes are finalized, both mutations are applied and the data in the chunk store is updated. The data now exceeds the entry count limit by one.
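The race described above can be reproduced with a small simulation. This is only an illustrative sketch, not our actual code: the `Node` class, the group size of three, and the limit of 100 entries (inferred from "99 entries, one below the limit") are assumptions made for the example.

```python
ENTRY_LIMIT = 100  # assumed: the example says 99 entries is one below the limit


class Node:
    """Hypothetical stand-in for a node holding one data chunk."""

    def __init__(self, entries):
        self.entries = entries  # entry count currently in the chunk store

    def validate(self, inserts):
        # Validation only looks at applied data, not at mutations
        # that are still being voted on.
        return self.entries + inserts <= ENTRY_LIMIT

    def apply(self, inserts):
        self.entries += inserts


def simulate_race(group_size=3):
    nodes = [Node(99) for _ in range(group_size)]

    # Both mutations are validated before either one has been applied.
    votes_a = [node.validate(1) for node in nodes]
    votes_b = [node.validate(1) for node in nodes]

    # Each mutation that gathered a majority is then applied on every node.
    for votes in (votes_a, votes_b):
        if sum(votes) > group_size // 2:
            for node in nodes:
                node.apply(1)

    return [node.entries for node in nodes]


print(simulate_race())
```

Every node ends up with 101 entries, even though each individual validation was correct at the time it ran.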
That said, we are currently discussing a modification to the above approach which would make exceeding limits much less likely. Here is a brief outline:
Before voting on a mutation, a node would consider all pending mutations (those it has already voted on but that have not yet received a majority of the votes from the group). If the number of newly inserted entries across all pending mutations is less than half of the remaining allowed number of entries, the mutation is accepted; otherwise it is rejected. The size limit would be checked similarly.
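A minimal sketch of that check, under my reading of the outline: the pending count covers only mutations the node has already voted on, and the ordinary validation of the new mutation against the applied data still runs first. The function name, its parameters, and the limit of 100 are hypothetical; the design is still under discussion and may end up different.

```python
ENTRY_LIMIT = 100  # assumed limit, matching the 99-entry example


def should_accept(current_entries, pending_inserts, new_inserts, limit=ENTRY_LIMIT):
    """Decide whether a node votes for a mutation.

    current_entries: entries already applied in the chunk store
    pending_inserts: insert counts of mutations voted on but not yet finalized
    new_inserts:     entries the incoming mutation wants to insert
    """
    remaining = limit - current_entries

    # Ordinary validation against the applied data (assumed to still apply).
    if new_inserts > remaining:
        return False

    # Proposed rule: pending inserts must be less than half of what remains.
    # Written as 2 * pending < remaining to stay in integer arithmetic.
    return 2 * sum(pending_inserts) < remaining


# The scenario from the example: 99 entries, two concurrent single inserts.
print(should_accept(99, [], 1))   # first mutation, nothing pending
print(should_accept(99, [1], 1))  # second mutation, first one still pending
```

With this check, the first mutation is accepted and the second is rejected while the first is still pending, so the 99-entry scenario no longer overshoots the limit.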