Transparency or opacity of SD modifications

You will have to agree to this part in conjunction to the rest of what he mentions though:

With those modifications, yes - else all that i state e.g here in 3 points etc shows it wouldn’t make much sense in doing so. People will complain later and say we should have made it explicit or not expose it as a user-facing feature at all.

Also @tfa and @rob it might be worth mentioning the way you guys see Delete operation tying into this as i remember @tfa saying it (the usage of version field) would be useful only in conjunction to a particular way things are deleted. There might be better posts explaining those, but the most recent mentions are here and here. Worth mentioning those too in case they get missed getting discussed.

Well if using hash then just the hash is enough and sha256(something) == 32 bytes but yes compared to u16 you mention (== 2 bytes) it is certainly more. Also i think you misinterpreted u16/u64 etc (correct me if i am wrong though) - u64 is 64 bits == 8 bytes. u16 is 16 bits == 2 bytes etc.

(Edit: FWIW md5 would be 16 bytes i think (don’t know if there is anything shorter))

@Viv @dirvine @AndreasF Pls do look into this suggestion to see how much it complicates (or does not) churn/refreshes other operations - the point about version there - if one field is only suppose to change if data changes, and another for other changes etc. Also it would be great if you guys could state what you think about the discussion so far - expose version field - keeping in mind the use-cases, caveats etc. mentioned. Do you want to expose it with a disclaimer, do you not want to expose it, do you want changes in the structure as fraser suggests or somewhere in the operation of vaults and then expose it, what concrete use case would it solve etc ? Also remember if you are going for merging SD and AD would you want more fields for finer information in the struct Fraser presented - how would those things scale in future etc - vs of-course clients hashing the field they care about and want to track changes for.

Sorry if i seem to be a skeptical - just don’t want to rush in make a wrong decision which ends up supporting a total of 1 use case while causing confusion for rest.

1 Like

No problems at all, if it’s exposed read only I cannot see a problem. I do not think though that having several version are going to work right now. Baby steps are Ok, but large changes like that needs an RFC itself, I would be amazed if there were no side effects there. So the version as is now is my vote really.

So I would say if we did expose the inner version then Ok, but not any more right now.

1 Like

And would you want to put a disclaimer in the API that a version increment could mean any of these 3 points ?

1 Like

I would say it’s an internal incrementing counter, required by the network to exist and only increment by 1 on each change of any data element within it. That may be enough?

  1. If i did a post of same data none of the data elements would change though - just the version increments - is that OK?

If yes then we will have to say something like: it might increment for no change too.

  1. And ofcourse for AppendableData - it might not change even if data was changed - depending on if owner did it as a POST OR owner/users did it as APPEND ?

Aren’t those going to cause confusions ^^^ ?

Yes possibly, perhaps “any POST will increment the version by one only or fail”?

Ok - so leaving it exposed in safe_core with this disclaimer. @Krishna Could you also put this disclaimer in the Launcher-App api’s - Something along the lines: version will increment for any successful POST operation by 1, but beyond that there is no guarantee - all internals could have changed, only a particular internal could have changed OR there might have been no change at all. Also if tracking for AppendableData, a changes in internals might not result in version increment.

( :frowning: I want to see how a concrete use-case/production-grade-app based on above disclaimers/uncertainities even looks like).

1 Like

Personally I’m not for keeping individual versions for the same reason as you mentioned that the types get bloated based on the level of detail expected based on use case and this might better be served elsewhere at the app scope itself unless its generic enough to fit all types and be scalable with future changes.

If what we’re after is basic fingerprinting of data(if this is the use-case), then we can always add a hash identifier to the data type that can be retrieved from the vault explicitly via a different RPC to validate data before GET(and save some BW) or using it to match expected data when pulling OR provide the same RPC but do the hash in vaults on the fly and return the hash.

I still don’t see a specific use-case that cannot be achieved with whats proposed so in that case if we CAN achieve the requirements, would certainly prefer in not bloating the fundamental types as it might not be a feature for all.

This ^^^ i would vote and prefer infinitely more to an application that bases itself on uncertainties.

Sugar-coated POST operation: I like this expression, it is exactly both what I want and what was implemented before. The consequences at the safe_level level are the following:

  • Nullify signatures (I suppose you rather mean nullify owner key): The user deleting the SD has many possibilities to define who can recreate the SD with the owner key field:
    • anybody by nullifying this field
    • only him/herself by entering his/her public key
    • nobody (even him/herself) by entering the hash of a known string (like “David Irvine - Creator of the Safe Network”)
    • a set of users by entering their public keys
  • Nullify data: The user has to provide an empty data that is part of the SD that can be signed
  • Bump version: the version must be incremented because the deleted version is normally different from previous version, but not necessarily like in any POST operation.

At the low level API these fields can be computed automatically, though the choice could be left to the user for the owner key, possibly a simple enumeration like: not rewritable, rewritable by user, rewritable by anybody. Just to be sure everybody follows: if the SD is recreated afterward, then its version field is bumped again.

We can also simplify things by checking in handle_delete function at the vault level, that the owner key field contains the hash of a specific string. That way no deleted SD can be recreated, whatever its origin (low lew API or safe_core).

But I don’t have a strong opinion about it (either the latter or the user choice).

A comment in former vault code indicates the reason:

// Reducing content to empty to avoid later on put bearing the same name

And it is exactly the reason why I want this code restored.

Yes, I completed agree with this. To be precise version is set by core, vaults continue to check it is sequentially incremented and strictly nothing is changed in the code managing this field.

I think that current version field is good enough.

I have mixed feelings about major_index and minor_index fields, because in the majority of cases only data is modified. They add complexity because currently vaults just check correct increment of version without actually comparing data or keys, which is fine for me because the rule is: if SD is modified then its version is incremented. The converse is not necessarily true, it is up to the application: it can repost a SD with same data and keys and version will be incremented. A use case can be the periodic generation of a SD and the app doesn’t bother about whether or not something has changed, or the version is an index into something else.

Currently this is possible because low level API increments version automatically. Maintaining automatic increment of the right version(s) depending on the kind of operations done by the user is an added complexity to low level API.

I don’t think these fields are worth the added complexity in both safe_vault and low level API. I am an adept of KISS principle: there is currently a field that indicates how many times the SD has been modified. We don’t know the kind of modifications and it is an upper bound but it is enough for transparency and certainly better than nothing.

Lastly, I don’t agree at all about data field (in struct Version) because vaults cannot check its correct increment. Users have to trust the app about this field, so this version could be simply added in the data part of the SD (by the app itself).

Thats what I understood was the intention to be previously.

Read access only supplied through the API

I thought @Fraser’s idea of major/minor version was good. It would allow an APP to see when Data changes and When control/management info changes. Could be good to see that owners or other info has been updated for some APPs without the need to keep hashes. Yes I know thats low level but I used to write RTOSs for fun too.

I agree too, otherwise the your code would be more complex for no real benefit. And in the API doco it says that version increments on a number of factors data, owner, etc changes and can change for network operations.

To my way of thinking the version should be updated on an append. You changed the SD afterall. Seems odd you could increment the version on a network operation yet not increment on an append.

1 Like

Yeah - fair enough. I’d like to put together an RFC on this, but suffice to say that I’m not a fan of centralisation nor of trusted managed owned servers (I’d argue that we’re already proposing to make use of these hardware devices on each computer already, since we do make use of timers) so those aspects would not be what I was pushing for in an RFC.

But I agree that it’s a large topic and there will doubtless be difficulties and edge cases, so yes, an RFC would be good.

As per my comment above - it’s not something I can justify quickly - it needs an RFC and I don’t want to drag this off topic. The main reasons for preferring this though are the ability to completely remove SD from the network, which is beneficial for the network (since it doesn’t then have to manage unwanted data ad infinitum) and for the clients (who get a refund or only pay proportionately to the amount of time a piece of data is managed by the network).

We make use of durations, not time. The difference is agreeing what the time is, hardware based atomic clocks etc. may help but still have edge cases…

yes - because (again), it wasn’t designed for that. You might continue to see more edge cases as things evolve.

Thread Summary (Version field):

We decided we will have no changes to the current impl. Client libs will expose this but the functionality of the same needs detailed in docs as disclaimers to not mislead users.

Version field is used by Vault to handle POST operations and sync data during churn and not risk replacing old data with new. This however from the PoV of Vaults could mean any change(data/owner-keys/…) or no change if the user chose to explicitly POST with just the version incremented. So this CANNOT be considered as a guarantee of mutation of data.

Now if client-A is expecting version “5” of some data(D), when Client-A makes a Get(D) request and gets version “5” from the network, this merely means “There has not been another POST operation to this data item in the network which is accepted”

With Version only influenced in POST operations, it can also have a different meaning based on the specific data type in terms of data mutation.

StructuredData - if we expect version “5” and get same version from the network, we know for sure nothing has changed(data/owners/…). If version has changed to “6” for example, it could mean anything. data has changed or owners have changed or no change has happened.

(Pub/Priv)AppendableData - since non owners do not make POST requests to “append data” but use “APPEND” rpc, data mutation can occur even in the same version (this is what allows multiple people to append data and vault takes the union. This is outside the signed scope of the type). For this type, version if expected at “5” and what is received by Client-A making the Get(D) request is also “5”, it could mean data has changed. Owner-only editable fields such as owners-list or filter-options cannot have changed. If version has changed to “6” for example, it could mean anything: data has changed or owners have changed or no change has happened.

If there is a requirement to track the actual data mutation, this can be kept client side right now with an expected hash instead of version and get it to just selectively include the parts for which mutation needs to be tracked. Eventually we can also look at the network providing hash replies for data to avoid needing to fetch the entire data before confirming expectations, As a performance optimisation, this needs taken to a separate rfc and handled once vault/routing team are sorted with their current backlog (which is long already)

5 Likes

Thread Summary (Deletion handling):

Considering data space being paid for is just when submitting a PUT request, Delete flow is being changed to accommodate the use cases highlighted in the thread by @tfa

So in essence all three options expected as use cases should now be achievable (well once it’s implemented)

to retain ownership but empty content

  • Owner submits a POST request with empty content.

to let go of ownership and NOT allow anyone else to claim this data

  • Owner submits a POST request with empty content and sets the new owner to a known constant value used to represent “no owner”. All further requests targeting that data will fail, except for GETs which will return the data as normal.

to let go of ownership and allow anyone else to claim this data

  • Owner submits a DELETE request (which has the version incremented by 1 and then resigned) to the vault and the vault nulls all fields excluding version.
  • For someone else to now claim this data, they would need to first do a GET request to retrieve the chunk and learn the current version(v) of data.
    • POST and further DELETE requests will return an error regardless of version used by sender.
    • PUT request similar to normal PUT request but use the version as “v + 1” instead of “0”. any other version will return an error.
5 Likes