How would you represent Mutable Data in the filesystem

joshuef · August 17, 2018, 10:14am

Something that’s cropped up a couple of times in various chats with @happybeing is the many potential benefits of having files (RDF, in our talks…) in the filesystem.

This is a more trivial thing for immutable data, which maps nicely to the filesystem.

But how would you represent mutable data on the filesystem? How would that map from the network? Do we need another form of NFS emulation layer to make this work? What would a user see?

Any ideas/suggestions/thoughts on this?

dirvine · August 17, 2018, 10:49am

Perhaps an interesting “filesystem” would be one that has a standard vocab. By that I mean that when you store a file you would not be able to store it just anywhere. You would select a root and then follow the vocab. So to store a doc you must select the root, then the next level and so on, until you are at the correct directory, say root/medical/cancer/head/<add doc here>

However, that on its own would not work and would bore people to death, so this is where I say make it different. The root is always the root, but the next level could be chosen from many options, such as the usual Documents/Music/Video etc. that we are used to (and ignore; take video and music, which nowadays are not always two different formats). I am thinking of a mechanism where this second level is different, or can change depending on the user’s desire, so it may be Video/Audio/Text. The third level can then start to differentiate: for Text, say, the next level could be Fiction/Non-fiction and so on.

Regardless, though, these docs are not actually held in a filesystem at all, but spread across the network and semantically tagged. So no matter how you save a doc, all the correct tags are in place (this needs to be enforced at save time).

This is the rough idea. It’s a very subtle suggestion, as it breaks away from a filesystem as such for information. We would still require a small filesystem for OSes and possibly programme config files, but SAFE can also handle these quite effectively in the user’s session packet. So a FUSE-like device can easily be a bootstrap for an OS, but perhaps it should be limited and not allow the mess we all have.

RDF here is obvious, but the representation of such RDF for users will be the real trick, and could potentially turn a filesystem-based mess into a document retrieval system with millions of entry points depending on the user’s desire and current requirement. So not DMOZ or any such thing, but a mechanism where the routes to docs and info are plentiful, and many routes can still get to the same doc. This last part is the critical part and the hardest, but also the one that makes it interesting.
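To make the "many routes, same doc" idea a little more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the `TagIndex` class, its method names and the example tags are all hypothetical, it uses plain string tags rather than real RDF, and it keeps everything in memory rather than on SAFE.

```python
from collections import defaultdict


class TagIndex:
    """Toy in-memory index (hypothetical): a route is just an unordered set of tags."""

    def __init__(self):
        # tag -> set of document ids carrying that tag
        self._docs_by_tag = defaultdict(set)

    def save(self, doc_id, tags):
        # Tagging is enforced at save time: an untagged document is rejected.
        if not tags:
            raise ValueError("a document must be saved with at least one tag")
        for tag in tags:
            self._docs_by_tag[tag.lower()].add(doc_id)

    def resolve(self, route):
        """Return the ids of all documents that carry every tag in the route."""
        tags = [part.lower() for part in route.strip("/").split("/") if part]
        if not tags:
            return set()
        result = set(self._docs_by_tag.get(tags[0], set()))
        for tag in tags[1:]:
            result &= self._docs_by_tag.get(tag, set())
        return result


index = TagIndex()
index.save("doc-1", ["medical", "cancer", "head", "text"])
index.save("doc-2", ["medical", "text", "fiction"])
```

With this sketch, `index.resolve("medical/cancer/head")` and `index.resolve("head/medical")` both lead to `doc-1`, because the order of the path segments does not matter. A real system would presumably need a proper vocabulary and a way to present sensible next-level choices to the user, which this toy ignores.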

This plays well with RDF and likely with SAFE, as SAFE is data spread across the globe in a way that means users should not have to think they need it all locally or in their container for “safety”, as the safety is in not having data held in that manner. The safety of SAFE going forward will surely be enhanced by the massive increases in data storage media as well. So as information grows, so will the storage devices, which makes this scheme extremely scalable.

Anyway, that is a nugget of an idea.

happybeing · August 17, 2018, 11:28am

David’s response is a very powerful idea, and I see how it might explain the direction you are taking @joshuef but I’m not clear if that’s the case.

Is this what you mean by the question of how to represent mutable data on a file system? I took it to be a lower-level thing, such as taking any old MD and exposing it as folders and values (and, where a value resolves to something understood, exposing that as folders and/or files). That’s what I’m doing with the FUSE mount (assuming to file where @loureirorg showed the way), and so that’s also the way I’ve been thinking about this question (a tiny bit).

David is looking at a much bigger vision that builds on top of MD to create something that looks and works like the FS model but is much, much more. I like it.

I also see that both are different views of the same thingymagig!

joshuef · August 17, 2018, 12:23pm

I actually think @dirvine’s idea isn’t necessarily related to MD (it could be ID as well, I think), just RDF data in general (and how that could make for quite a flex filesystem, which sounds pretty rad).

I was meaning much more on the level you’re talking about @happybeing. On current filesystems etc:

When you have an MD on your filesystem (as opposed to an ID, and however that’s retrieved from SAFE), what does a user see? (Perhaps taking RDF out of the equation for now.)

Would it just be a specific filetype? joshsWebId.mut or something? And then it opens up to… present the user with key:value pairs? That would require some specific program to handle it… Or we just present it as a text file?
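As a rough sketch of the "just present it as a text file" option, the snippet below renders a set of key:value entries as editable text and reads it back. This is an assumption-laden illustration, not how SAFE stores or exposes MD: the function names are hypothetical, and entries are assumed to be string keys with JSON-serialisable values, whereas real MD entries may well be arbitrary bytes.

```python
import json


def md_to_text(entries: dict) -> str:
    """Render MD-like entries as stable, human-editable text (sorted keys)."""
    return json.dumps(entries, indent=2, sort_keys=True, ensure_ascii=False) + "\n"


def text_to_md(text: str) -> dict:
    """Parse the text form back into entries, rejecting anything that is not a mapping."""
    entries = json.loads(text)
    if not isinstance(entries, dict):
        raise ValueError("expected a JSON object of key:value entries")
    return entries


# Hypothetical contents of something like joshsWebId.mut
example_entries = {"nick": "joshuef", "inbox": "safe://example-inbox"}
example_text = md_to_text(example_entries)
```

The appeal of a plain text form is that any editor can open it; the cost is that some program still has to notice the edit and write the changed entries back to the network.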

Would either of those things make sense at all?

(And then what would it take on SAFE to make that happen in a sane manner…?)

drehb · August 17, 2018, 12:29pm

Maybe just store the file as ID and use the MD to track revision history?
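A minimal sketch of that idea, with loud caveats: the `VersionedFile` class is hypothetical, a dict stands in for immutable data addressed by content hash, and a plain list stands in for the mutable data that records the history. None of this uses a real SAFE API.

```python
import hashlib


class VersionedFile:
    """Toy model (hypothetical): immutable blobs plus a mutable list of their addresses."""

    def __init__(self):
        self._blobs = {}    # stand-in for immutable data: address -> content
        self._history = []  # stand-in for mutable data: ordered list of addresses

    def write(self, content: bytes) -> str:
        # Each version is stored once under its content hash, then appended to the history.
        address = hashlib.sha256(content).hexdigest()
        self._blobs.setdefault(address, content)
        self._history.append(address)
        return address

    def read(self, version: int = -1) -> bytes:
        # Default is the latest version; 0 is the first version written.
        return self._blobs[self._history[version]]

    def version_count(self) -> int:
        return len(self._history)
```

Writing the same content twice adds a second history entry but stores the blob only once, which is one reason this split can be attractive.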

joshuef · August 17, 2018, 12:33pm

I’m meaning more a situation where you want to have some of your data in your local filesystem. What would a mutable data look like in that context? (Or what options are there?)

There are many reasons you might want it on your computer (e.g. to work with other programs).

happybeing · August 18, 2018, 9:24am

You could have the MD as a .mut file as you suggest, or drill down to show the keys as filenames, or as a directory structure using ‘/’ separators as we do now for _public etc.

You could also decide based on the values. So if they appear to be immutable, treat them like _public. If we were to have suitable metadata for MD entries, that choice could be made more reliably.
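The "directory structure using ‘/’ separators" option can be sketched in a few lines of Python. This is only an illustration: `build_tree` and `list_dir` are hypothetical helper names, entries are assumed to be string keys mapped to opaque values (never dicts), and nothing here talks to SAFE.

```python
def build_tree(entries):
    """Turn flat keys like 'root-www/lib/sort.js' into nested dicts (folders) and leaves (values)."""
    root = {}
    for key, value in entries.items():
        parts = [p for p in key.split("/") if p]
        if not parts:
            continue  # skip empty keys
        node = root
        for part in parts[:-1]:
            child = node.setdefault(part, {})
            if not isinstance(child, dict):
                raise ValueError(f"{part!r} is used as both a file and a folder")
            node = child
        if isinstance(node.get(parts[-1]), dict):
            raise ValueError(f"{parts[-1]!r} is used as both a file and a folder")
        node[parts[-1]] = value
    return root


def list_dir(tree, path=""):
    """List one folder of the tree, marking sub-folders with a trailing '/'."""
    node = tree
    for part in [p for p in path.split("/") if p]:
        node = node[part]
    return sorted(
        name + "/" if isinstance(child, dict) else name
        for name, child in node.items()
    )
```

For example, keys `root-www/index.html`, `root-www/posts.html` and `root-www/lib/sort.js` would list as `index.html`, `lib/` and `posts.html` under `root-www`. Keys that are both a file and a prefix of another key have no clean filesystem mapping, so the sketch simply rejects them.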

The SAFE FUSE design can handle different MD use cases like that. It has a tree of mount handler classes, each of which will handle a known arrangement of MD (e.g. the root with any root file container, or the public names container MD/services MD/NFS container MD group etc). Correspondingly in SafenetworkJs there is a class for each kind of MD we know about.

So this makes it easy to cascade MD types and FS views in a file system tree, such as:

_publicNames/
  happybeing/
    messages@email/
       mailbox/
           inbox/
              hi from Josh.eml
              Re: meet up.eml
    blog@www/
       root-www/
           index.html
           posts.html
           lib/
             sort.js

Obviously this relies on understanding the MD format but would be extensible in the same way the SafenetworkJs library can allow for adding support for different SAFE RESTful services (and other things).
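For readers who want a feel for how cascading handlers might work, here is a small Python sketch. It is not the SAFE FUSE or SafenetworkJs code: `MountHandler`, `StaticHandler` and `MountTable` are hypothetical names, and the handlers return canned listings instead of reading any MD. It only shows the dispatch idea, where the deepest matching mount point answers a request.

```python
class MountHandler:
    """Hypothetical base class: one subclass per known arrangement of MD."""

    def list_dir(self, rel_path):
        raise NotImplementedError


class StaticHandler(MountHandler):
    """Stand-in handler that serves canned listings instead of reading an MD."""

    def __init__(self, listing):
        self._listing = listing  # relative path -> list of names

    def list_dir(self, rel_path):
        if rel_path not in self._listing:
            raise FileNotFoundError(rel_path)
        return sorted(self._listing[rel_path])


class MountTable:
    """Routes a path to the handler mounted at its deepest matching ancestor."""

    def __init__(self):
        self._mounts = {}

    def mount(self, mount_point, handler):
        self._mounts[mount_point.strip("/")] = handler

    def list_dir(self, path):
        path = path.strip("/")
        best = None
        for mount_point in self._mounts:
            matches = (
                mount_point == ""
                or path == mount_point
                or path.startswith(mount_point + "/")
            )
            if matches and (best is None or len(mount_point) > len(best)):
                best = mount_point
        if best is None:
            raise FileNotFoundError(path)
        rel_path = path[len(best):].strip("/")
        return self._mounts[best].list_dir(rel_path)
```

Mounting one handler at `_publicNames` and another at `_publicNames/happybeing/blog@www` would let each answer for its own part of the tree, and supporting a new kind of MD would mean adding one more handler class.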

I’m not sure what the best way of handling an MD with unknown content would be, so I think anything would do to start, but it would also be good to try and get decent metadata for both the MD itself and individual values, and to be thoughtful about defaults where we don’t have metadata or don’t understand the type.

If you want to play with this at some point it would be easy to do in SAFE FUSE, which would be great. I’m working through the basic file system operations, about to start on the class which handles an NFS emulation MD so I can actually see files! This is a really fun bit, but going slowly with just me (so anyone reading is welcome to join in and help, see here).

As of yesterday it shows the folder structure in _public, but as the handler framework is working, it would be easy to add a handler for mounting a raw MD and then play with that to produce listings of any MD.

lukas · September 22, 2018, 3:15pm

I might be totally misunderstanding because I don’t know exactly what @happybeing and @dirvine are talking about.

But this is a way that makes sense to me: The file is available as usual, and there’s also a hidden directory next to it that contains all previous versions of the file.

So for instance you have a picture.png, and in the same directory you have a /.picture.png/ directory containing all previous versions.

This seems like a good basic behaviour, since I guess we want it to behave like a normal directory as much as possible when used.
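The naming scheme can be sketched like this. The helper names and the zero-padded version file names are hypothetical choices made for illustration; the post only specifies the hidden sibling directory.

```python
from pathlib import PurePosixPath


def versions_dir(file_path):
    """Hidden sibling directory for a file: 'photos/picture.png' -> 'photos/.picture.png'."""
    path = PurePosixPath(file_path)
    return path.parent / ("." + path.name)


def version_path(file_path, version):
    """Path of one stored version; the '0003.png' naming is an assumption, not from the post."""
    path = PurePosixPath(file_path)
    return versions_dir(file_path) / f"{version:04d}{path.suffix}"
```

Keeping the original extension on each version means other programs can open old versions directly, which fits the goal of behaving like a normal directory.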

Edit: I haven’t read up on the specifics of how mutable data works, but I’m assuming here that previous versions of mutable data can be reached.