The ability to share files/data stored on the network through links that don’t use the DNS system has been discussed a few times in the past on our forums (see further below for some references to previous discussions).
Currently the safe-app-nodejs API and our SAFE browsers support fetching safesites and files published with the DNS system, from URLs like safe://<service name>.<public name>/<path>, but it’s not possible to fetch data using its address on the network, i.e. the XOR address of any data stored on the network, without publishing it under a public name.
Is this a CAS?
There is a concept of Content Addressable Storage (CAS), but from our understanding there is a subtle difference with our proposal here. A CAS seems to assume the address is derived from the content itself. For SAFE Network data this is true for ImmutableData XOR addresses, but not for MutableData XOR addresses, which is why we hesitated to simply call this proposal a CAS. It would be good to hear other opinions in this regard. We called them XOR URLs in an attempt to highlight that they are not public-name URLs (DNS), and that they are based on the XOR addresses of the native SAFE Network data types.
Related discussions on the SAFE Network Forum:
- Encrypted file sharing
- Canonical urls for files
- POC : Introducing SafeShare, a file sharing and pasting webapp
In order to be able to share a file/data with a URL without the need to publish it with the DNS system under a public name, it is herein proposed:
- Extend the webFetch API to also accept XOR URLs, which are constructed from the XOR address of the file/data.
- Be able to retrieve the content of native data objects, like the MutableData, as raw data. E.g. get the list of entries stored in a referenced MutableData.
- Have our browser render the content in different ways if the content retrieved from an XOR URL is the raw data of a native object, effectively becoming an explorer for SAFE Network native data structures.
Specification and general considerations
XOR address encoding
The XOR address shall be encoded in the XOR URL. base16 encoding seems to be a good choice as it is case insensitive, as opposed to case-sensitive encodings.
ImmutableData XOR URL
XOR URLs for ImmutableData are the simplest ones, since an ImmutableData doesn’t need any additional information to be uniquely identified on the network, as opposed to a MutableData, which also has a type tag. Therefore an ImmutableData XOR URL can be simply defined as safe://<encoded ImmutableData XOR addr>.
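As a rough illustration of the base16 encoding described above, the following sketch encodes a raw XOR address and builds the simple form of an ImmutableData XOR URL. The helper names (encodeXorAddress, decodeXorAddress, immutableDataXorUrl) are hypothetical and not part of safe-app-nodejs, and the 32-byte address length is an assumption made for the example.

```javascript
// Sketch only: hypothetical helpers, not part of safe-app-nodejs.
// Assumes a XOR address is a 32-byte value.
function encodeXorAddress(xorAddress) {
  if (xorAddress.length !== 32) {
    throw new Error('expected a 32-byte XOR address');
  }
  return Buffer.from(xorAddress).toString('hex');
}

function decodeXorAddress(encoded) {
  if (!/^[0-9a-fA-F]{64}$/.test(encoded)) {
    throw new Error('not a base16-encoded 32-byte address');
  }
  // The hex decoder accepts upper- and lower-case digits alike,
  // which is the case-insensitivity property mentioned above.
  return Buffer.from(encoded, 'hex');
}

function immutableDataXorUrl(xorAddress) {
  return `safe://${encodeXorAddress(xorAddress)}`;
}

const addr = Buffer.alloc(32, 0xab); // placeholder address, not real data
const url = immutableDataXorUrl(addr);
console.log(url);
```

Because base16 is case insensitive, a URL that gets upper-cased somewhere along the way still decodes to the same address.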
MutableData XOR URL
As already mentioned above, a string based on the XOR address along with a type tag is needed to uniquely identify a MutableData on the network, therefore the XOR URL for a MD needs to include the type tag in it.
When a MutableData is fetched, if it’s not an NFS container with an index.html file, the webFetch function can return the MutableData’s raw data so the browser (or any client app) can render it in a different/specific way (see below for more details of what’s being proposed here in this regard).
MutableData are versioned, and therefore this shall also be accounted for in the XOR URL format, to allow any MD URL to (optionally) reference a specific version.
The version value can be used to enforce a specific version to be retrieved, and otherwise fail if that specific version is not found. On the other hand, the latest version will be retrieved if the version value is omitted from the XOR URL.
Given that a referenced MutableData can effectively be the root NFS container of a safesite, any MD XOR URL can also specify a path which needs to be resolved, and the content retrieved, as when using the DNS system with public-name URLs.
Currently, when a public-name URL is resolved to an NFS container which doesn’t have an index.html file, the browser simply shows an error stating that the safesite content was not found.
For an MD XOR URL that doesn’t have a path, the webFetch function shall return the raw content of the MD (i.e. its key-value entries), and the browser can automatically generate an HTML page which makes the content browsable, generating links to other data when an entry’s key/value is a safe:// string, in a similar way to how web servers on the current internet allow browsing of folders.
A specific MD entry key could be supported (e.g. __non_browsable), which the owner of the MD can insert in the MD if the “browsable” feature should not be enabled. This cannot really be enforced; it would only be offered as a feature to be optionally supported by some clients, like our browser.
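The browsable listing and the opt-out key described above could look roughly like the following sketch. It assumes the raw MD content has already been fetched and converted to an array of { key, value } string pairs; that shape, and the function names, are assumptions for illustration only and not the format returned by webFetch.

```javascript
// Sketch only: renders MutableData entries as a simple HTML listing.
// `entries` is assumed to be an array of { key, value } string pairs.
const NON_BROWSABLE_KEY = '__non_browsable';

function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Turns a key or value into a link when it is a safe:// string.
function renderCell(text) {
  const escaped = escapeHtml(text);
  return text.startsWith('safe://')
    ? `<a href="${escaped}">${escaped}</a>`
    : escaped;
}

function renderMutableData(entries) {
  // Honour the owner's opt-out; a client is free to ignore it.
  if (entries.some((e) => e.key === NON_BROWSABLE_KEY)) {
    return null;
  }
  const rows = entries
    .map((e) => `<tr><td>${renderCell(e.key)}</td><td>${renderCell(e.value)}</td></tr>`)
    .join('\n');
  return `<table>\n${rows}\n</table>`;
}
```

Returning null for a non-browsable MD leaves it to the caller to decide what to show instead, for example the same "content not found" error used today.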
XOR URLs specification
The following are the main requirements we would like to have for the encoding we use to generate the XOR URLs:
- Be able to support new and different types of base encodings and hash functions for the XOR addresses in the future.
- Include the content type within the XOR URL which would allow the client app to correctly render the data to the user especially when referencing an ImmutableData.
We are considering the use of multiformats and CID, which allow us to cover the above requirements. We can use a CID identifier in our URL for specifying the XOR address part, and have additional parts to support MutableData’s type tag, version, as well as the path and query parts. We can then define our SAFE URLs in the following way (BNF-like):
<safe-url> = 'safe://' ( <xor-uri> | <public-name-uri> )
<public-name-uri> = [<service> '.'] <public-name> <path-query-fragment>
<xor-uri> = <immutable-data-uri> | <mutable-data-uri>
<immutable-data-uri> = <cid> <query-fragment>
<mutable-data-uri> = <cid> ':' <type-tag> ['+' <content-version>] <path-query-fragment>
<path-query-fragment> = ['/' <path>] <query-fragment>
<query-fragment> = ['?' <query>] ['#' <fragment>]
<cid>: follows the CID format which self-describes and encodes:
- the base encoding for the string, we propose to use base16 for the reasons explained above,
- the version of CID for upgradability purposes,
- the content type or codec of the data being referenced,
- the XOR address of the content being referenced
<content-version>: for future implementation to reference versionable content, using a single address with different versions
<type-tag>: the type tag value if the CID is referencing a MutableData. In the absence of this value, the CID will be assumed to be for an ImmutableData
<path>: the path of the file if the CID is referencing a MutableData which can be accessed through the NFS emulation convention (or other emulations/conventions in the future)
<query>: query arguments, to be used by the client app and not for retrieving the content
<fragment>: fragment of the content, to be used by the client app and not for retrieving the content
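To make the grammar above more concrete, here is a minimal sketch of a purely syntactic parser for the <xor-uri> form. The function name parseXorUrl is hypothetical, and treating the type tag and content version as decimal integers is an assumption, since the grammar does not fix their encoding. The <cid> part is only captured here, not decoded.

```javascript
// Sketch only: splits a candidate XOR URL into the parts named in the grammar.
// Assumes <type-tag> and <content-version> are written as decimal integers.
const XOR_URL_RE =
  /^safe:\/\/([^:+\/?#]+)(?::(\d+)(?:\+(\d+))?)?(\/[^?#]*)?(?:\?([^#]*))?(?:#(.*))?$/;

function parseXorUrl(url) {
  const m = XOR_URL_RE.exec(url);
  if (!m) return null;
  const [, cid, typeTag, version, path, query, fragment] = m;
  // Per the grammar, only a MutableData URI may carry a path.
  if (typeTag === undefined && path !== undefined) return null;
  return {
    // No type tag means the CID is assumed to reference an ImmutableData.
    kind: typeTag === undefined ? 'ImmutableData' : 'MutableData',
    cid,
    typeTag: typeTag === undefined ? undefined : Number(typeTag),
    version: version === undefined ? undefined : Number(version),
    path, // includes the leading '/'
    query,
    fragment,
  };
}

// 'abc123' and the numbers below are placeholders, not real values.
const parsed = parseXorUrl('safe://abc123:15001+3/img/logo.png?x=1#top');
console.log(parsed);
```

A URL for which this returns null is not necessarily invalid; it may simply be a public-name URL, which this sketch does not attempt to parse.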
The webFetch function will simply attempt to decode the <cid> part and, if that fails, fall back to assuming it’s a public-name URL.
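The decode-then-fall-back behaviour could be sketched as below. This is not how webFetch is implemented; it is a simplified illustration that hand-decodes a base16 CIDv1 (multibase prefix 'f', then version, codec and multihash) instead of using a CID library, and it assumes every varint in the header fits in a single byte. The function names are hypothetical.

```javascript
// Sketch only: hand-rolled decoding of a base16 CIDv1, for illustration.
// Assumes the version, codec, hash code and digest length each fit in one byte.
function tryDecodeCid(text) {
  // Multibase prefix 'f' denotes base16; compare case-insensitively.
  const lower = text.toLowerCase();
  if (!/^f([0-9a-f]{2})+$/.test(lower)) return null;
  const bytes = Buffer.from(lower.slice(1), 'hex');
  if (bytes.length < 4) return null;
  // Multi-byte varints (high bit set) are outside the scope of this sketch.
  if (bytes.subarray(0, 4).some((b) => b >= 0x80)) return null;
  const [version, codec, hashCode, digestLength] = bytes;
  const digest = bytes.subarray(4);
  if (version !== 1 || digest.length !== digestLength) return null;
  return { version, codec, hashCode, digest };
}

function classifyUrl(url) {
  const m = /^safe:\/\/([^:+\/?#]+)/.exec(url);
  if (!m) throw new Error('not a safe:// URL');
  const cid = tryDecodeCid(m[1]);
  // Fall back to a public-name URL when the CID cannot be decoded.
  return cid ? { kind: 'xor', cid } : { kind: 'public-name', name: m[1] };
}

// Placeholder CID: version 1, codec 0x55, hash code 0x16, 32-byte digest.
const cidText = 'f' + '01' + '55' + '16' + '20' + 'ab'.repeat(32);
console.log(classifyUrl(`safe://${cidText}`).kind);
console.log(classifyUrl('safe://www.example/index.html').kind);
```

The codec and hash codes used in the placeholder CID are only example values; which codes the SAFE Network data types would map to is not decided here.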
The following are examples of what would become valid XOR URLs:
- ImmutableData XOR URL:
- MutableData XOR URL:
- NFS MutableData XOR URL: