File sharing at the level of safe: URLs

happybeing · November 14, 2016, 2:22pm

The remotestorage.js application backend allows for files stored in a certain folder to be publicly accessible, and accessed directly as a URL. For example, files in /public are all accessible without authorisation, and given the name of a user’s remotestorage server can be accessed with something like https://rsserver.mydomain.com/public/foo.txt

I think this kind of feature might be generally useful, although I’m not sure how desirable it is. Convenient certainly: users could share files just by passing around safe: URLs and no need to use a safe app to access them - just click the URL and SAFE Beaker would download the file.

Anyway, I’m puzzling over how to provide this kind of support for RS.js apps running on SAFEnetwork.

@joshuef do you have any thoughts on how this might best be achieved?

I’ve not dipped into the low level API yet, but I’m already struggling to think of a way this could work well without resorting to SAFE DNS, which is hard to build into the generic backend because it would need to piggyback on a public ID, copy shared files to a special folder, etc.

An ideal solution would be a way to specify a safe: URL with something like a file hash which, when processed by Beaker, would access a public file (so no need for user authorisation).

Any thoughts?

WhiteOutMashups · November 14, 2016, 3:26pm

Wasn’t this supposed to be a core feature of SAFE? I remember Mr. Irvine saying this in an interview early on.

All public safe files were supposed to be accessible by their simple link

ben · November 14, 2016, 3:45pm

Technically everything in the network is just an address in the XOR-Namespace. So, when you know that address you can retrieve the data. Currently the NFS-API doesn’t directly expose that information to you, but everything has such an address, and any address can be fetched by anyone at any time. So all you have to do is share that address with them. Whether you write that in hex (most common hash) or base64 notation doesn’t really matter: internally it is just a really, really big number.
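To illustrate the point about notation, here is a minimal sketch showing one address written in hex and in base64. SHA-256 and the sample content are assumptions made for the example, not necessarily what the network uses.

```python
import base64
import hashlib

# Assumption: SHA-256 stands in for whatever hash the network really uses.
content = b"hello SAFE"
address = hashlib.sha256(content).digest()   # raw bytes of the address

as_hex = address.hex()
as_b64 = base64.urlsafe_b64encode(address).decode("ascii")
as_int = int.from_bytes(address, "big")      # the "really, really big number"

# Both notations decode back to the same bytes, so they name the same address.
same_from_hex = bytes.fromhex(as_hex) == address
same_from_b64 = base64.urlsafe_b64decode(as_b64) == address
```

Either string could be shared; a reader only needs to know which notation was used.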

That should very easily be possible. We could have the convention that safe:// is considered a browsing URL while safe:HASH would be expected to be a network address that the browser could immediately download. Well, we should probably define that the browser is expecting to find a FILE entry, so it can show some metadata to the user to confirm the download…

Then the only thing missing is a way to get the network address through the NFS API (which shouldn’t be hard either) and voila.

WhiteOutMashups · November 14, 2016, 3:49pm

But I felt like what was being said was more like an automatic safe://will.doc1 type of thing. I think this could be hugely powerful and intuitive, and I sure hope it’s going to be built in and offered

ben · November 14, 2016, 3:59pm

What “type of thing” is that?

safe://will.doc1 is a URL. Do you mean that a file posted to the network will be addressable by its hash (this is not a hash!)? Well, sort of. Self-auth splits them up into smaller chunks of 100kb each and then stores them as immutable-data, which ensures that the address matches the hash of its content. So for every chunk of data stored through NFS that is currently a given (and does de-duplication, too), but you still need to know which chunks to fetch and how to put them back together - we call that a DataMap.

However, because the metadata can be very different for files even having the same chunks in it (like, you called it picture-ben.jpg and I called it my-profile.jpg), the surrounding FILE itself doesn’t hold onto those properties of the hash referencing it but is stored at a random address in the network, which only your NFS knows. But as soon as you have that address (and the data isn’t encrypted) all I said holds true again.
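A toy sketch of the idea described above: chunks are stored under the hash of their content, a data map lists the chunk addresses in order, and the file entry carrying the name is a separate thing. The in-memory `store`, the tiny chunk size, SHA-256 and the dictionary layout of the file entries are all assumptions for illustration; encryption of chunks is left out entirely.

```python
import hashlib
from typing import Dict, List

CHUNK_SIZE = 4  # tiny on purpose; the post above mentions about 100kb

store: Dict[str, bytes] = {}  # hypothetical network: address -> immutable chunk


def put_file(data: bytes) -> List[str]:
    """Store data as content-addressed chunks and return its data map."""
    data_map = []
    for start in range(0, len(data), CHUNK_SIZE):
        chunk = data[start:start + CHUNK_SIZE]
        address = hashlib.sha256(chunk).hexdigest()
        store[address] = chunk  # same content, same address: de-duplication
        data_map.append(address)
    return data_map


def get_file(data_map: List[str]) -> bytes:
    """Fetch the chunks named in the data map and join them in order."""
    return b"".join(store[address] for address in data_map)


picture = b"AAAABBBBCCCC"
ben_file = {"name": "picture-ben.jpg", "data_map": put_file(picture)}
my_file = {"name": "my-profile.jpg", "data_map": put_file(picture)}
```

Both entries end up with the same data map and the chunks are stored only once, while the names differ, which is why the file entry itself cannot simply live at the hash of the content.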

happybeing · November 14, 2016, 4:04pm

What @ben describes would be ideal.

What you describe @whiteoutmashups is desirable in addition, but not as easy to deliver without using an app I think, because something needs to look up your friendly URL and convert it to a network address.

SAFE DNS provides a way for the network to handle that look-up for you but as I’ve mentioned it doesn’t work well in this context. So to provide friendly URLs which don’t require DNS and a public ID is not simple, and not knowing if the low level API would allow this easily I was struggling to come up with a solution.

@ben I’d like to open an issue on this as a feature request - should this be on safe_ffi? And beaker?

Thanks for the speedy response! You’d think we’d talked about this before

ben · November 14, 2016, 4:11pm

We are working on a new way to allow apps to more easily provide their own service entries through DNS, too. As well as having a shared _public-area that all apps could put stuff into and that would by default be easily accessible (not yet entirely clear how, though). All part of the authenticator RFCs we are currently finishing up.

Open it on safe_core for getting the network address from NFS, and likewise on safe-js and beaker for providing that new URL scheme.

That said, with the authenticator changes coming up, we’ll restructure many of those things altogether, and it is very likely that in that version you’d actually have direct access to the addresses of NFS already because of structural changes we are making. The implementation details for that aren’t clear just yet. But you probably want to look at the RFCs once proposed!

happybeing · November 14, 2016, 4:37pm

Thanks @ben I’ll hold off then until the RFCs are published and advocate for it there if missing!

bochaco · November 14, 2016, 8:11pm

I also like what is being proposed here: being able to reference a file on the network using its XOR-Namespace address.

Does this imply that by modifying a file (e.g. editing a text file) the network actually deletes some/all chunks and it creates new immutable-data to store the modified content?

ben · November 14, 2016, 11:08pm

That’s not implied, it is a fundamental feature that immutable data is bound to its hash. Therefore when editing a file it must be stored as a new entity and what NFS does is update said DataMap with those new locations.

But then how does it know that something still references that part? Well, the network actually doesn’t. If you look closely at the ImmutableData-Type, you’ll notice it doesn’t have an owner. Anything you’ve put into the network like this can’t ever be revoked from it. That is intentional, to prevent dead links or losing data: a chunk that was once online will always be online - maybe not available via DNS but always accessible through its hash.
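As a rough sketch of what that means for an edit: the changed part is stored as a new chunk at a new address, the data map is updated to point at it, and the old chunk simply stays where it was. The `store` dictionary, the `put` helper and SHA-256 are assumptions for the example.

```python
import hashlib

store = {}  # hypothetical network: address -> immutable chunk, no owner


def put(chunk: bytes) -> str:
    """Store a chunk under the hash of its content and return the address."""
    address = hashlib.sha256(chunk).hexdigest()
    store.setdefault(address, chunk)  # never overwritten, never deleted
    return address


old_map = [put(b"Hello, "), put(b"world")]
new_map = [old_map[0], put(b"SAFE!")]  # edited second chunk gets a new address

old_text = b"".join(store[address] for address in old_map)
new_text = b"".join(store[address] for address in new_map)
```

Anyone still holding `old_map` can read the previous version, because nothing was removed.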

WhiteOutMashups · November 14, 2016, 11:09pm

Thanks for the fast replies Mr. @ben and I think I understand, so apps / websites could be made on SAFE that convert these long XOR addresses of public files to simple DNS names like safe://will.file1 right? That sounds like it would be everything I’m asking for. Kind of like tiny.cc or tinyURL.com.

ben · November 15, 2016, 9:44am

You could easily provide a web app that does that (similar to tiny.cc) indeed. It could be a tiny JS app that reads the location.hash property, calculates an XOR-address from it and expects that there is a FileStruct stored there. It could then attempt to download its data map. Then the only thing any other app needs to do is store a FileStruct at that very same location (only possible with LOW_LEVEL_API right now). It wouldn’t even have to go through that specific app itself.
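The lookup described above might be sketched like this. It is written in Python rather than JS to keep it self-contained, and the `network` dictionary, the use of SHA-256 to derive the address and the shape of the FileStruct are all hypothetical.

```python
import hashlib

network = {}  # hypothetical: XOR address (hex) -> FileStruct


def address_for(name: str) -> str:
    """Derive a deterministic address from a short name (assumed scheme)."""
    return hashlib.sha256(name.encode("utf-8")).hexdigest()


def publish(name: str, file_struct: dict) -> None:
    """Any app can store a FileStruct at the address derived from the name."""
    address = address_for(name)
    if address in network:
        raise KeyError(f"{name!r} is already taken")
    network[address] = file_struct


def resolve(fragment: str):
    """Resolve a location.hash style fragment such as '#will.file1'."""
    name = fragment.lstrip("#")
    return network.get(address_for(name))


publish("will.file1", {"name": "will.doc", "data_map": ["<chunk address>"]})
found = resolve("#will.file1")
```

Because the address is derived from the name alone, the publishing app and the resolving app never need to talk to each other.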

bochaco · November 15, 2016, 9:10pm

@ben, thanks for the previous responses!
I see that the Markdown editor is already using a URL to download a file. I can also see the file by pasting the URL into Beaker, e.g.:

data:text/markdown;charset=utf8;base64,IyBTQUZFIE1hcmtkb3duIEVkaXRvcgoKKiBBIGxpc3QKKiB3aXRoCiogc29tZSBpdGVtcwoKU29tZSAqKmJvbGQqKiBhbmQgX2l0YWxpY18gdGV4dAoKPiBBIHF1b3RlLi4u

ben · November 16, 2016, 10:02am

Yeah, but that isn’t a network url. That is a base64-encoding of the actual content, a data-uri. Sure, you can use that to share information, but then you don’t need a URL at all, as you are already passing the data
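To make the difference concrete, a data URI can be decoded without touching any network at all. The helper below is a minimal sketch (the name `decode_data_uri` is made up here) and handles only the simple cases.

```python
import base64
from urllib.parse import unquote_to_bytes


def decode_data_uri(uri: str):
    """Return (media_type, content_bytes) for a simple data: URI."""
    if not uri.startswith("data:"):
        raise ValueError("not a data URI")
    header, _, payload = uri[len("data:"):].partition(",")
    parts = header.split(";")
    media_type = parts[0] or "text/plain"
    if "base64" in parts[1:]:
        return media_type, base64.b64decode(payload)
    return media_type, unquote_to_bytes(payload)


uri = (
    "data:text/markdown;charset=utf8;base64,"
    "IyBTQUZFIE1hcmtkb3duIEVkaXRvcgoKKiBBIGxpc3QKKiB3aXRoCiogc29tZSBpdGVtcwoK"
    "U29tZSAqKmJvbGQqKiBhbmQgX2l0YWxpY18gdGV4dAoKPiBBIHF1b3RlLi4u"
)
media_type, body = decode_data_uri(uri)
print(body.decode("utf-8"))
```

The whole document travels inside the link itself, which is why it gets unwieldy for anything but small files.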

bochaco · November 16, 2016, 4:09pm

I see, well I wouldn’t use it if the file is too big.

happybeing · November 18, 2016, 6:58pm

I’d like to feed this in as a feature request, which, if it is not hard to add, would be great to see in a minor extension to the client API (as an extension to SAFE NFS and safe-js).

If we can have this, I think it would find a lot of use within apps needing to store and load HTML, and it would make it very easy for any SAFE app to implement one of the killer features of the internet: sharing users’ SAFE files via a public URL.

Specification

The aim is to be able to create a “safe:” URL for any file stored using SAFE NFS. A URL that beaker can directly interpret and load as part of a normal HTML page or as a file download link.

This would allow, for example, an image captured by a SAFE web app and stored in user’s SAFE NFS storage, to be loaded into a dynamic web page (the app) and displayed within the HTML UI of the app. Or a file generated by a web app could be saved to SAFE NFS, and a URL provided to the user which (if public data) could be passed to anyone using SAFE beaker in order to download the file.

Doing things this way makes it easy to port an existing app, or for someone who already knows how to create and load files on a traditional web server.

At the SAFE NFS API level, a web app would want to be able to provide an NFS file-path, including extension, so /images/happybeing.jpg for example, and have returned a URL which beaker recognises as a hash based URL.

Given a reference to that URL in any HTML, or if pasted into the browser location bar, Beaker would recognise this URL as a SAFE hash URL and use it to access the file via the SAFE low level API, but return the content as if it was a normal file - i.e. with any necessary metadata (e.g. via headers or derived from a file extension that might be part of the URL - out of my depth here!).
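One way the "metadata derived from a file extension" part could look, purely as a sketch: guess a content type from the path and fall back to a generic binary type. The `response_headers` helper is hypothetical and says nothing about how Beaker actually builds responses.

```python
import mimetypes


def response_headers(url_path: str) -> dict:
    """Guess a Content-Type header from the file extension in the path."""
    content_type, _encoding = mimetypes.guess_type(url_path)
    return {"Content-Type": content_type or "application/octet-stream"}


image_headers = response_headers("/images/happybeing.jpg")
unknown_headers = response_headers("/files/blob")
```

With the path from the example above this gives `image/jpeg`, so an `<img>` tag pointing at such a URL would have what it needs to render.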

Hope that makes sense!

There may be uses for accessing lower level SAFE data (i.e. not NFS files) through URLs but I’m not even going to try and think about them for now.

Implementation

For reference, some notes about implementation from Ben’s post above:

WhiteOutMashups · November 18, 2016, 7:04pm

Yes this is exactly what I meant.

I want this ^ !!

ben · November 18, 2016, 10:30pm

How about prefixing with the “convention” to use? With that I mean something like this:

safe:file:4e1243bd22c66e76c2ba9eddc1f91394e57f9f83

safe:folder:4e1243bd22c66e76c2ba9eddc1f91394e57f9f83

safe:dns:4e1243bd22c66e76c2ba9eddc1f91394e57f9f83

Each respectively letting the “viewer” know in which way they are supposed to interpret the content of the address and how to deal with the actual content behind it. And we could easily extend that to other conventions later (should we add more).
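A small sketch of how a viewer might tell these forms apart, together with the plain safe:// browsing form mentioned earlier in the thread. The function name, the "browse" label and the strict lowercase-hex check are assumptions for the example, not part of any agreed spec.

```python
import re

KNOWN_CONVENTIONS = {"file", "folder", "dns"}
_TYPED_ADDRESS = re.compile(r"^safe:([a-z]+):([0-9a-f]+)$")


def parse_safe_url(url: str):
    """Return (convention, value) for a safe: URL, or raise ValueError."""
    if url.startswith("safe://"):
        # Ordinary browsing URL, resolved through DNS as today.
        return "browse", url[len("safe://"):]
    match = _TYPED_ADDRESS.match(url)
    if match is None:
        raise ValueError(f"unrecognised safe: URL: {url!r}")
    convention, address = match.groups()
    if convention not in KNOWN_CONVENTIONS:
        raise ValueError(f"unknown convention: {convention!r}")
    return convention, address


kind, address = parse_safe_url("safe:file:4e1243bd22c66e76c2ba9eddc1f91394e57f9f83")
```

Adding a new convention later would only mean adding a name to `KNOWN_CONVENTIONS` and teaching the viewer what to do with it.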

All this said, it looks like a really good proposal to me and fairly well thought out. Anyone up for making it into an RFC? You could start by posting your draft as a new topic here and I’ll make it a wiki-topic that we can all discuss, edit and fix, and once we are happy we can raise it to the official repo. Any volunteers?

If that happens soon, we might be able to develop that right into this cycle of the authenticator changes (where we have to touch all APIs and Conventions again either way).

tfa · November 19, 2016, 12:16pm

Currently an NFS file is a datamap stored in the SD implementing the directory containing the file. As the directory can be modified or deleted after sending the link, the only sure way to share the file is to transmit the datamap, so the URL can only be the datamap (possibly base64 encoded).

Address of what kind of object? Both branches of the alternative are problematic:

If the object is stored in an immutable data, then this is costly because a PUT must be issued just to be able to create a URL (in addition to those needed for the file itself in the regular NFS).
If it is stored in a SD then it can be removed later by the owner and so the URL becomes a dead link, which is contradictory to the forever file principle

happybeing · November 19, 2016, 3:01pm

@tfa:

If it is stored in a SD then it can be removed later by the owner and so the URL becomes a dead link, which is contradictory to the forever file principle

I think this is actually what would work best in this instance, because:

if I share a file for someone to download I may well want to unshare it later (by invalidating the link)
I think this is expected behaviour, whereas once shared available forever is probably not, and might not be the best default (those dick pics OMG ;-))
it is consistent with other published data - SAFE services work like this already. So I’m not clear what qualifies for the available forever principle and what does not. I suggest we have a chance to clarify that here.
even so, apps that require the data to persist can go get the data map and not rely on the link (provided the file itself is immutable, though I admit I don’t know if that applies here! )
where an app generates a file, and includes it in its own HTML as a link, it is OK to expect the app to manage the case where it later invalidates the link. Again, I think it is desirable default behaviour.
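The trade-off discussed in the points above can be sketched with two toy stores: an immutable one for chunks and a mutable one standing in for an SD that holds the shared link. All names here (`immutable`, `mutable`, `share`, `unshare`, `fetch_by_link`, the link id) are hypothetical, and SHA-256 is an assumption.

```python
import hashlib

immutable = {}  # content-addressed chunks: nothing is ever removed
mutable = {}    # stand-in for an SD: the owner can delete entries


def put_chunk(chunk: bytes) -> str:
    """Store a chunk under the hash of its content and return the address."""
    address = hashlib.sha256(chunk).hexdigest()
    immutable[address] = chunk
    return address


def share(link_id: str, data_map: list) -> None:
    """Publish a link by storing the data map in the mutable store."""
    mutable[link_id] = list(data_map)


def unshare(link_id: str) -> None:
    """Invalidate the link; the chunks themselves are untouched."""
    mutable.pop(link_id, None)


def fetch_by_link(link_id: str):
    """Return the file content, or None if the link is dead."""
    data_map = mutable.get(link_id)
    if data_map is None:
        return None
    return b"".join(immutable[address] for address in data_map)


share("link-1", [put_chunk(b"holiday "), put_chunk(b"photo")])
saved_map = list(mutable["link-1"])  # an app keeps its own copy of the data map
unshare("link-1")
```

After `unshare`, the link is dead for newcomers, while an app that saved the data map beforehand can still reassemble the file from the immutable chunks.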

@ben I don’t understand how to implement this well enough to turn it into an RFC. One day!