SAFE Containers and NFS emulation - summary


#1

I’ve tried to summarise how I think some aspects of network storage work, to help me work out how best to use the API and how to implement storage functionality.

I’d like confirmation and corrections as needed, and by all means fill in anything relevant that I missed. If this is documented in detail, a reference is fine; I find it hard and slow to work out this level of detail from the code, but I’ve done my best! :slight_smile:

Changes to this OP:

  • addendum: noting that WHM _public doesn’t seem to be consistent with RFC wrt ‘flat’ directory structure
  • corrected and annotated in accordance with Gabriel’s first reply below.

Account Structures

  • Logging in retrieves an account MD (container) with keys for certain defaults:
    • _metadata - a reserved key which holds metadata about the container (including name and tag type - anything else?)
    • permissions - access control information (public keys). Is this a reserved key or something in its own right?
    • root containers (keys whose value points to a container), accessible via the following container names:
      • _public for public containers and files (but apps can put anything in an entry)
      • _private, _documents, _photos etc (but not all names implemented yet)
      • _publicNames - for containers corresponding to each public ID
    • anything else?

I notice that the values for the container/MD entries are much shorter than the values I’m storing which reference immutable data. So what is the value that’s stored in a key used to reference a container/MD? Is it a network address, or a container name that will be combined with a type tag to retrieve the container itself, or something else? (For example, what is the value stored in the _public container for a key of /_public/<publicname>/root-www?)

_publicNames Root Container

  • the _publicNames container has an entry for each public name owned by the account, which maps to a services container for the name
    • the services container has one entry for each service created for the public name, which maps to another public container referencing content (i.e. immutable data/files)
  • so for example, resolving a SAFE Browser URL means:
    • using the domain portion of the URL to access the services container for that public name (the domain). The container is the MD found at the address, or xorname, given by SHA3('<public-name>'), with type_tag 15001 (SAFE services).
    • then reading the value of the ‘www’ service entry of the services container, to retrieve the root container (an MD) which holds the files accessible through the service
    • reading the value of a named file to get the address of the immutable data for the file (or MD for a directory)
    • getting the content at the address and serving in a response
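
The lookup steps above can be sketched with an in-memory stand-in for the network. This is a toy model, not the SAFE API: `network`, `mdAddress` and `resolve` are hypothetical names, SHA3-256 is assumed to be the SHA3 variant, the `xorname:tag` address format exists only in this model, and defaulting to the 'www' service and to 'index.html' is also an assumption.

```javascript
const crypto = require('crypto');

const SERVICES_TAG = 15001; // type_tag for SAFE services containers (from the post)

// Hypothetical address: SHA3 of the public name combined with the type tag.
// Assumption: SHA3-256; the real network address format may differ.
function mdAddress(publicName, typeTag) {
  const xorname = crypto.createHash('sha3-256').update(publicName).digest('hex');
  return `${xorname}:${typeTag}`;
}

// In-memory stand-in for the network: address -> entries (or file content).
const network = new Map();
network.set('immutable-data-addr-1', '<h1>hello</h1>');
network.set('files-container-1', new Map([['index.html', 'immutable-data-addr-1']]));
network.set(mdAddress('mywebsite', SERVICES_TAG), new Map([['www', 'files-container-1']]));

// Resolve safe://<service>.<publicName>/<path> by following the steps in order.
function resolve(url) {
  const [host, ...pathParts] = url.replace(/^safe:\/\//, '').split('/');
  const labels = host.split('.');
  const publicName = labels.pop();        // last label is the public name
  const service = labels.pop() || 'www';  // assumed default service
  const services = network.get(mdAddress(publicName, SERVICES_TAG));
  if (!services) throw new Error(`no services container for ${publicName}`);
  const files = network.get(services.get(service));
  if (!files) throw new Error(`no '${service}' service for ${publicName}`);
  const dataAddr = files.get(pathParts.join('/') || 'index.html');
  if (!dataAddr) throw new Error('file not found');
  return network.get(dataAddr);
}

console.log(resolve('safe://www.mywebsite/index.html'));
```

Note that the _publicNames container plays no part in this lookup; only the hash of the public name and the type tag are needed to find the services container.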

Web Hosting Manager
Question:

  • where WHM uploads a folder which has files and a subfolder,
  • and where the folder is represented by a public container, say /_public/mywebsite/root-www,
  • and the subfolder is represented by a public container, say /_public/mywebsite/root-www/images,
  • does the folder container (/_public/mywebsite/root-www) contain an entry for the subfolder (or only for each file)?

Root Container Entries

The WHM UI assumes that every entry in the _public container refers to a container, even when it does not (I know because, when I modify a server/container mapping, it offers file entries that I created there). Is this a problem? E.g. is it a WHM or API bug, or is it bad practice for me to put anything other than a container as an entry in a root public container? Am I supposed to insert only entries corresponding to containers in a root container?

Metadata

  • is there a recommended way to set/get metadata for a container (other than name and type tag)?

    @bochaco: You can use the MutableData setMetadata function: http://docs.maidsafe.net/beaker-plugin-safe-app/#windowsafemutabledatasetmetadata to set it (you can even set it with quickSetup). We are missing a function to retrieve it though.
    An example of use of this metadata is when you request access to share a random MutableData, the authenticator will display it to help the user understand what the MD contains and to decide if it should be allowed or not.

  • is there a recommended way to set/get additional metadata for files (e.g. content type)?

Mutable Data Operations and Limits

  • there are a maximum of 1,000 entries per MD, and 1MB in size per MD

  • when either limit is reached, no more entries can be inserted, but changes that don’t violate the limit may succeed (e.g. setting a smaller value that decreases the size of an MD that has reached 1MB)

  • EntryMutationTransaction::remove does not remove the key. It clears the value and causes the key’s version to be incremented.

  • what is the Buffer parameter passed to the Entries::forEach() handler?

    @bochaco: That’s the entry’s key, look at the example snippet in this section of the documentation: http://docs.maidsafe.net/beaker-plugin-safe-app/#windowsafemutabledataentriesforeach

  • assuming (as I stated) an account login returns an MD, there’s a limit on the number of root containers it could hold (1,000)

  • the act of creating new public containers and inserting them into the _public root container will use up entries up to a maximum of 1,000

  • there’s effectively a limit on the total number of public containers (folders uploaded) per account, because each uses up one of the 1,000 entries in the _public root container. This may be further reduced if you remove and insert entries for other reasons too, such as renaming a folder, which causes one key to be removed and a new entry inserted

  • there’s effectively a limit on the number of files and subfolders that can be uploaded per folder, because each file and subfolder of a folder uses one of the 1,000 entries in the MD container which represents the folder
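
The way entries get used up can be illustrated with a toy model. This is a sketch under the assumptions stated in the list above (1,000 entries per MD, and remove clearing the value while leaving the key in place); `ContainerModel` and its methods are hypothetical and are not part of the SAFE API.

```javascript
const MAX_ENTRIES = 1000; // per-MD entry limit stated above

// Toy model of a container: key -> { value, version }.
class ContainerModel {
  constructor() {
    this.entries = new Map();
  }

  insert(key, value) {
    if (this.entries.has(key)) throw new Error(`key exists: ${key}`);
    if (this.entries.size >= MAX_ENTRIES) throw new Error('entry limit reached');
    this.entries.set(key, { value, version: 0 });
  }

  // Remove clears the value and bumps the version; the key still occupies a slot.
  remove(key) {
    const entry = this.entries.get(key);
    if (!entry) throw new Error(`no such key: ${key}`);
    entry.value = '';
    entry.version += 1;
  }

  // A rename is an insert under the new key plus a remove of the old key.
  rename(oldKey, newKey) {
    const entry = this.entries.get(oldKey);
    if (!entry) throw new Error(`no such key: ${oldKey}`);
    this.insert(newKey, entry.value);
    this.remove(oldKey);
  }

  get slotsLeft() {
    return MAX_ENTRIES - this.entries.size;
  }
}

const pub = new ContainerModel();
pub.insert('_public/mywebsite/root-www', 'container-address');
pub.rename('_public/mywebsite/root-www', '_public/mywebsite/site');
console.log(pub.slotsLeft); // 998: one live entry plus one cleared key
```

In this model a single uploaded folder that has been renamed once already costs two of the 1,000 slots, which is the reduction described in the list.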

Thanks for reading, and for any clarifications. I hope this will also help with ideas for what information to add to documentation and tutorials for developers.
__
UPDATE: looking back at my notes on RFC-0046 New Auth Flow - containers.md I see that it is Active but not accepted and does not appear to be reflected in the code based on examining what’s in _public container and looking at the Web Host Manager code. For example it says NFS doesn’t use a hierarchy of containers but a flat key/value structure. This is what I based my RS.js implementation on, but looking at what the Web Host Manager creates I see it does create entries in _public for subfolders which appear to be containers.

For example, after uploading the folders for safe://mywebsite, I see entries in _public container for both '/_public/mywebsite/root-www' and '/_public/mywebsite/root-www/images'. This looks like a hierarchy of containers to me, but if not, what is it?

What is the code based on and is this an oversight? Or am I doing (reading) it wrong :slight_smile:?

NOTE: I don’t know yet, but it seems likely that it will be easier to support a SOLID compatible API if containers are a hierarchy, i.e. NOT as in RFC-0046, but as they appear to be in the code. This is because containers are first class objects in LDP (you can create an empty container) and I think they have metadata but haven’t looked in detail yet.


#2

Hi @happybeing, some responses here.

Correct

This is correct, except for the first two steps, which I’m not sure are clear.
The domain/public name is used to find the services container associated to it, i.e. if your public ID is mydomain then the services container for that public ID can be found at the address/xorname SHA3('mydomain'), with type_tag 15001.
You may wonder why you then need the _publicNames container, since you don’t really need it to find the services container for a public ID. It is private to each account and just keeps track of the public IDs owned by it, like a private index.

I think you answered these questions yourself already: the hierarchy is flattened out and the full path of each file is what is stored as the entry’s key in the container.

You can use the MutableData setMetadata function: http://docs.maidsafe.net/beaker-plugin-safe-app/#windowsafemutabledatasetmetadata to set it (you can even set it with quickSetup). We are missing a function to retrieve it though.
An example of use of this metadata is when you request access to share a random MutableData, the authenticator will display it to help the user understand what the MD contains and to decide if it should be allowed or not.

Correct

Remove will at the moment just clear out the entry’s value, i.e. if you iterate through the entries you will still find the entry with the same key but an empty value.

You cannot insert an entry with the same key as the one you previously removed, but you can update it, making sure you provide the successor version.
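
A minimal sketch of these semantics, using a plain Map rather than the SAFE API. The functions `insert`, `update` and `remove` are hypothetical, and "successor version" is assumed to mean the current version plus one.

```javascript
// Toy entry store: key -> { value, version }.
const entries = new Map([['index.html', { value: 'addr-1', version: 0 }]]);

function insert(key, value) {
  // A removed key is still present, so it blocks a fresh insert.
  if (entries.has(key)) throw new Error(`entry already exists: ${key}`);
  entries.set(key, { value, version: 0 });
}

function update(key, value, version) {
  const entry = entries.get(key);
  if (!entry) throw new Error(`no such entry: ${key}`);
  // Assumption: the successor version is exactly current + 1.
  if (version !== entry.version + 1) {
    throw new Error(`expected version ${entry.version + 1}, got ${version}`);
  }
  entry.value = value;
  entry.version = version;
}

// Remove is modelled as an update that clears the value.
function remove(key, version) {
  update(key, '', version);
}

remove('index.html', 1);
try {
  insert('index.html', 'addr-2');
} catch (err) {
  console.log(err.message); // entry already exists: index.html
}
update('index.html', 'addr-2', 2);
console.log(entries.get('index.html')); // { value: 'addr-2', version: 2 }
```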

Correct.

That’s the entry’s key, look at the example snippet in this section of the documentation: http://docs.maidsafe.net/beaker-plugin-safe-app/#windowsafemutabledataentriesforeach
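
As an illustration of what the handler receives, here is a sketch that uses a stand-in iterator so it can run outside the browser. `forEachEntry` is hypothetical, and the value shape `{ buf, version }` is an assumption based on the documentation snippet linked above, so check it against the real API before relying on it.

```javascript
// Stand-in for an entries iterator: calls handler(keyBuffer, value) per entry.
function forEachEntry(entries, handler) {
  for (const [key, { value, version }] of entries) {
    handler(Buffer.from(key), { buf: Buffer.from(value), version });
  }
}

const entries = new Map([
  ['index.html', { value: 'addr-1', version: 0 }],
  ['images/logo.png', { value: 'addr-2', version: 0 }],
]);

const seen = [];
forEachEntry(entries, (k, v) => {
  // k is a Buffer, so decode it before printing or comparing it to a string.
  seen.push({ key: k.toString(), value: v.buf.toString(), version: v.version });
});
console.log(seen);
```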

That should be the case as of now, since containers are simply MutableData.


#3

Thanks Gabriel, help much appreciated.

Unfortunately not, because my update was highlighting what the RFC says happens, while the body of my description is based on what I observe.

See what I say about what’s actually stored in _public if you upload a folder and a subfolder. You get an entry for each, which to me looks like a hierarchy of containers (not what the RFC says). If not, why is there an entry for a subfolder when there’s already an entry for its parent?

We can set MD metadata but not read it?! :blush:

Answers to other questions still needed - for anyone reading this!


#4

I think it is indeed a flat structure. By flat they mean that all the entries in the NFS MD refer to the files directly, instead of referring to other MDs (which would be called a hierarchy in this context). The NFS MD is flat in the sense that all content is listed directly in the entry keys. So flat is talking about the MD structure, not about the file system structure, which is indeed hierarchical like we’re used to.
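
A small sketch of that distinction: a nested folder is flattened so that every file becomes one entry whose key is its full path. The `site` object and the `flatten` function are hypothetical, and the string values stand in for whatever the real entries hold (for example the address of the file’s immutable data).

```javascript
// Hypothetical local folder: nested objects are folders, strings are files.
const site = {
  'index.html': 'addr-1',
  images: { 'logo.png': 'addr-2' },
  js: { 'app.js': 'addr-3' },
};

// Flatten the tree into one level of entries keyed by full path.
function flatten(folder, prefix = '') {
  const out = {};
  for (const [name, node] of Object.entries(folder)) {
    const path = prefix ? `${prefix}/${name}` : name;
    if (typeof node === 'string') {
      out[path] = node; // a file: one entry, keyed by its path
    } else {
      Object.assign(out, flatten(node, path)); // a folder: only its files appear
    }
  }
  return out;
}

console.log(Object.keys(flatten(site)));
// [ 'index.html', 'images/logo.png', 'js/app.js' ]
```

In this model a folder has no entry of its own, so an empty folder leaves no trace at all, which matters for the point about first-class containers made in the first post.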

It seems the point of the metadata is to simply provide a human description to make sure users are able to understand the value of the containers and MDs when asked for permissions on them. So, apparently it’s not yet that important for app developers to read and display this.

I’m very much interested in such information too. I think it would be great if MaidSafe put out formal definitions of how to manage these containers. App developers need to know what to expect, and what data to store. Right now I’m looking at the Web Hosting Manager to conform, but the Web Hosting Manager is also just another app.

It’s not an MD entry (through the API anyway). It’s separate, just like the tag, name, version, data and owners:


#5

As I’ve explained that’s not what I’m seeing, hence my asking and re-asking. I hear that it is supposed to be a flat structure, but if it is I shouldn’t be seeing entries for subfolders in _public, only for each root folder uploaded. So I’m not sure what the Web Hosting Manager is actually doing there. I may be misreading what I’m seeing, or it may be a bug, but I have the evidence! :slight_smile: Below is an extract from console logs which lists all the keys in one of my accounts’ _public container.

Notice that for both the taskrs2 and tt2 public names (public ids) WHM has created an entry for both the root folder it uploaded (‘root-www’) and a subfolder of that folder (‘js’ and ‘vendors’ are subfolders). So that suggests a tree structure, or at least that every folder uploaded, including any subfolders, gets its own container entry in _public. I don’t think that is what the RFC is saying.

scripts.bundle.js:3543 Key:  _public/taskrs2/root-www
scripts.bundle.js:3543 Value:  FURgWhHECnH4CRDfwG1JuI5YtV2bPnAR/9SRUsqOWsg=
scripts.bundle.js:3543 entryVersion:  0
scripts.bundle.js:3543 Key:  _public/taskrs2/root-www/js
scripts.bundle.js:3543 Value:  n5yxoo+JxhQnRvxg4pjtBkWQ+ydI3bqt+40z2ugo3ps=
scripts.bundle.js:3543 entryVersion:  0
scripts.bundle.js:3543 Key:  _public/tt2/root-www
scripts.bundle.js:3543 Value:  telW/dLHiebxwscMl1ggcSUTtdVNwxbzJve4X60T570=
scripts.bundle.js:3543 entryVersion:  0
scripts.bundle.js:3543 Key:  _public/tt2/root-www/vendors
scripts.bundle.js:3543 Value:  rnJfiQk1izJHksyAyuJ1HHRB7+ZAjMjOCUWP7IAQbtk=
scripts.bundle.js:3543 entryVersion:  0
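
The pattern in this log can be checked mechanically. The sketch below takes the four keys from the log and reports every pair where one key is a path prefix of another; `nestedPairs` is a hypothetical helper and says nothing about what the values actually point to.

```javascript
// Keys copied from the console log above.
const keys = [
  '_public/taskrs2/root-www',
  '_public/taskrs2/root-www/js',
  '_public/tt2/root-www',
  '_public/tt2/root-www/vendors',
];

// Return [parentKey, childKey] pairs where the child extends the parent's path.
function nestedPairs(allKeys) {
  const pairs = [];
  for (const parent of allKeys) {
    for (const child of allKeys) {
      if (child.startsWith(parent + '/')) pairs.push([parent, child]);
    }
  }
  return pairs;
}

console.log(nestedPairs(keys));
// [ [ '_public/taskrs2/root-www', '_public/taskrs2/root-www/js' ],
//   [ '_public/tt2/root-www', '_public/tt2/root-www/vendors' ] ]
```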

#6

Ah, I see now. Interesting, did you upload the sub-folders separately after uploading the root folder?

By the way, the public container is not described in the RFC, so I can’t say whether it’s correct behaviour.

I’ve already wondered about the public container’s rationale. The RFC is only talking about the NFS convention.

I’m now wondering about what your service folder contains (www.taskrs2) – does it contain the js folder? It should, otherwise the SAFE browser won’t be able to find it, right? And, if this is the case, then where does the _public/taskrs2/root-www/js in your public container point to?