Resource Description Framework implementations and 'Portable Data...': Some Links and learnings


#21

Yes, but I don’t mean to store it in a non-compliant form, but in a “raw” RDF form.

If we take that RDF is the data model we can store the data as subject–predicate–object entries/items which can then be serialised using any of the standard/compliant serialisation formats like Turtle or JSON-LD. Perhaps what I’m trying to say is to implement it as a triplestore.
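A minimal sketch of that idea, assuming an in-memory set stands in for the triplestore and that literals are plain strings only (no datatypes or language tags). The names `Triple` and `to_ntriples` are hypothetical, not part of any SAFE or RDF library. It emits N-Triples, which is a subset of Turtle, so the output is also valid Turtle:

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One subject-predicate-object entry, stored independently of any format."""
    subject: str             # IRI
    predicate: str           # IRI
    obj: str                 # IRI, or the text of a plain literal
    is_literal: bool = False


def _escape(text):
    # Escape the characters that N-Triples does not allow raw inside a literal.
    return (text.replace("\\", "\\\\").replace('"', '\\"')
                .replace("\n", "\\n").replace("\r", "\\r"))


def to_ntriples(triples):
    """Serialise stored triples as N-Triples, one statement per line."""
    lines = []
    for t in sorted(triples):  # sorted only to make the output deterministic
        obj = f'"{_escape(t.obj)}"' if t.is_literal else f"<{t.obj}>"
        lines.append(f"<{t.subject}> <{t.predicate}> {obj} .")
    return "\n".join(lines)
```

The point of the sketch is that the stored entries carry no serialisation at all; another function over the same set could emit JSON-LD or RDF/XML without touching the stored data.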

This also seems natural, since on the clearnet servers can support multiple formats and the client can request a specific one using the Accept header, giving each preference a weight, e.g. Accept: text/turtle,application/rdf+xml,application/xhtml+xml;q=0.8,text/html;q=0.7
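To illustrate how a server (or an abstraction layer) might act on such a header, here is a simplified sketch. The function names are hypothetical, and it deliberately ignores parts of HTTP content negotiation such as `type/*` wildcards and specificity rules; it only orders media types by their `q` weight and picks the first one the server supports:

```python
def parse_accept(header):
    """Return [(media_type, q)] ordered by descending q (ties keep header order)."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # a media type without an explicit q has weight 1
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        prefs.append((media_type, q))
    return sorted(prefs, key=lambda p: -p[1])  # sorted() is stable


def choose_format(header, supported):
    """Pick the client's most preferred media type that appears in `supported`."""
    for media_type, q in parse_accept(header):
        if q <= 0:
            continue
        if media_type in supported:
            return media_type
        if media_type == "*/*" and supported:
            return supported[0]
    return None
```

With the header quoted above, a server supporting only Turtle and JSON-LD would answer in Turtle, and one supporting only HTML would fall back to `text/html`.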

Also, I guess we cannot say JSON-LD is not compliant, just that it is perhaps broader, as per my understanding…?


#22

I think here we should regard JSON-LD as non compliant but compatible or find some term that helps people realise that if they create some arbitrary JSON-LD it is quite likely not compatible with RDF unless they take pains to ensure that. The technical term is ‘generalised’ but I don’t think that conveys this point. On the other hand, if you generate Turtle, N3 etc, you definitely have RDF as I understand it.

I see your approach as cleaner than using a generalised, potentially non compliant format, but I assume you would lose compatibility with SAFE NFS because if we use SAFE NFS, then I think Turtle would be the sensible choice.

Why? Because then any Solid app can access RDF resources from a SAFE based file system (eg FUSE), or by a browser fetch that supports SAFE NFS, without recourse to a Solid / RDF API layer. (BTW I’m going to try making a SAFE FUSE mount as a side project of my side project :wink: Fingers crossed that my skills have finally revived sufficiently to tackle that almost four years since David first suggested I have a go at it. Thank god I couldn’t attempt it back then lol).

Maybe we later go native in a technically cleaner more efficient way as you are thinking, after initially focusing on building bridges to as many existing apps as we can, and with other teams to begin with? I think a good way to do the latter is to provide access to SAFE storage via existing popular APIs and protocols. The leverage that provides is something we need to factor in when developing and prioritising these options.

I would though like to also see a full and detailed exploration of the ideas you and @joshuef are developing because I do see the potential merit of them, and I also know that you both bring insight and understanding to this which I lack. So bring it on and let’s wrestle these solutions into something awesome. This is great! :slight_smile:

EDIT: another option would of course be to ensure that SAFE NFS API presents your native RDF as part of a filesystem. That would kill both birds with one stone if it makes sense to do that.


#23

I agree, but I didn’t mean that we should use one instead of the other; in fact I see them as complementary.

The NFS emulation/convention provides you with a way to organise the resources, but nothing is implied/inferred from it about the content/data stored in that hierarchy.

At the moment we can store only Files with our NFS emulation layer, but what if you could also store RDF resources using the NFS emulation?
In that way you still organise the RDF resources using a hierarchy which you may then want to use to create the links using a public ID. But you could also provide links using the XOR name of the RDF resource itself (if you weren’t interested in a public ID type of URL/link), so the RDF resource is a different MD.

Now, you could of course already store RDF resources simply as text or XML files (ImmutableData) using the serialisation you like. This is where I think we could do better in helping users follow the standards/conventions of the semantic web: by giving them an emulation/abstraction layer to create RDF resources which can always be retrieved in different serialisation formats (again via the abstraction layer), while making sure the RDF is stored in such a way that it can easily be serialised in any other format we don’t yet provide but we/anyone might support in the future.

Thus, you could read these RDF resources with the RDF emulation, either by providing the path in the NFS emulation or by directly using the XOR name of the RDF resource; in either case the content is returned to you in Turtle, JSON-LD, etc.

Edit: I’m just thinking that my thoughts are being driven by my belief (hopefully an understanding :slight_smile: ) that serialisation is only needed for a transition of existing apps that currently use HTTP, but in the future we won’t/shouldn’t really need a serialisation format; apps would just read the entries from an RDF MD.


#24

I’m only beginning to think about this, but I think it will take quite a bit of thought to figure out which approach is more efficient in practice, and it will depend quite a lot on the use cases.

Seems quite hard to figure out whether native (RDF inside MData etc), or Turtle+NFS IData will be more efficient given uncertainties about use cases. Large files versus small files, numbers of triples per file/resource, inter-resource graph characteristics, frequencies of different kinds of access, and sensitivity to latency are going to be hard to model, so I’m thinking we’ll need to suck it and see (that’s the technical term for build and test different options :wink:).


#25

Just to be clear: I’m with you Mark that we should probably seek to stick to RDF for interoperability :+1:

I was imagining something along the lines of @bochaco’s RDF-MD.


For me, storage of RDF data as JSON-LD (in the non-generalised, RDF-compatible sense) makes most sense in an MD for the reasons outlined by @bzee above (performance; one less layer of abstraction/emulation, permissions etc), as well as an MD being a key-value store, which aligns well with JSON-LD as a representation of the data.


Something to consider for data formats is usability: I’ve been looking at JSON for years, and to me JSON-LD makes it much clearer to see what’s going on quickly than other RDF formats.

It is also just another way of writing JSON, for which there is a huuuuge tooling ecosystem across many, many platforms/languages. As a web developer, whatever data I get from a server I’ll probably be converting to JSON to work with within my app… (Again, that can vary between languages, but it is another step for a dev to take.)

I think one of the biggest hurdles we’ll have on the network is getting devs to use these data structures.


#26

I think a good direction would be to fork something like the rdflib.js library, as @happybeing did, and attempt to make it work with a specific solution (like JSON-LD – MD).


#27

I’m not convinced of these routes yet fellas :slight_smile:

JSON-LD:

I haven’t looked at how it handles triples, so I need to see how it does that because they don’t map well to key/value IMO (being three elements rather than two), though I’m ready to be corrected. I just don’t know yet.

Also, it isn’t clear to me that other RDF representations directly in the MD will be less efficient so that’s something to investigate.

This then leaves the question of using NFS and therefore IData to hold the representation. I like NFS compatibility because it exposes RDF to non Solid aware apps and devs. I expect that to start with there will be a lot of apps using SAFE NFS (including as a virtual drive) so I like the idea of exposing them to files containing RDF, especially in the early days.

WRT non generalised JSON-LD there’s a fundamental problem we create by directly supporting JSON, which is that people will create non compliant content in JSON-LD and store it without necessarily being aware that they’ve done this, or of the consequences. Whereas if we encourage use of a non generalised representation this won’t happen unless people decide to go their own way and generate something non compliant. I think those folk are more likely to recognise the consequences and deal sensibly with them (eg when converting RDF stored by another app to JSON-LD and then saving it back, I think they are likely to see and understand the problems of non compliant content).

The next issue here is that RDF is the way it is for reasons (which I don’t understand but think are probably important), and the tools and libraries designed for it will have features designed to exploit it and so generally be more suitable for the purposes intended for RDF. Whereas when using JSON-LD there will be some tools and libraries which are suited and some which are not - because they were designed for other purposes (eg APIs for which JSON-LD was designed in the first place). So I expect that with JSON-LD there’s increased likelihood of RDF compatibility issues. Some, perhaps most JSON-LD tools and libraries will be built by people who don’t realise this is an issue, or just don’t care because their use case is different. So I see potential for confusion and wasted time here.

Again I haven’t surveyed this, but it worries me, and if you are going to work with RDF I think there’s a benefit to learning to view it in an RDF representation designed for the purpose, rather than one which you may be familiar with but was designed for a different purpose.

Turtle is very readable IMO, but I need to look at some equivalent JSON-LD to know if there’s much at stake in this respect. So the wrestling continues :slight_smile:

Forking RDF Libraries

I have forked rdflib.js and have proposed we do something similar with Solid-auth-client so that adding support for SAFE to an existing Solid app is as close to just dropping in the SAFE js modules as possible.

Because my changes are tiny, I’m hopeful that at some point they will be merged by the Solid team and that compatibility with SAFE will then become standard in any rdflib.js app. The changes are literally a few lines, which is feasible because all I’m doing is enabling us to intercept calls to fetch() so we handle any requests for a SAFE URI. This route also means Solid apps which use fetch() directly will work with SAFE, and even those using XMLHttpRequest are trivial to convert. Solid apps can and do mix all these methods together, so if we don’t support them at the level of fetch(), porting a Solid app to SAFE is typically going to be much harder. It is also likely we would have to support many more libraries than rdflib.js. So it makes sense to start with a RESTful interface to SAFE based on intercepted fetch().
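The interception pattern can be sketched in a few lines. The actual change lives in JavaScript around fetch(); the Python below only illustrates the general idea of routing by URI scheme, and `make_fetch`, `handlers` and the handler signature are all hypothetical names rather than anything from rdflib.js or the SAFE API:

```python
from urllib.parse import urlsplit


def make_fetch(default_fetch, handlers):
    """Wrap a fetch-like callable so that some URI schemes are handled elsewhere.

    default_fetch: callable used for any scheme without a registered handler.
    handlers: dict mapping a lower-case scheme (e.g. "safe") to a callable
              with the same signature as default_fetch.
    """
    def fetch(uri, **options):
        scheme = urlsplit(uri).scheme.lower()
        handler = handlers.get(scheme, default_fetch)
        return handler(uri, **options)
    return fetch
```

An app that keeps calling `fetch(...)` as before would then reach SAFE for `safe:` URIs and the ordinary web for everything else, without knowing which handler served the request.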

If instead we fork rdflib.js and other Solid libraries in order to bypass the RESTful API and go directly to SAFE API (with or without JSON-LD + MD) we lose a lot of this.

Simplicity v Efficiency

Unless I can be sure of the benefits, I tend to go first for simplicity, compatibility, and in our case ease of adoption by as many people in the Solid space as possible, and handle performance and efficiency later. Better get people hooked on SAFE and demanding faster, better stuff, than make them jump through hoops in order to try it out.

So we can always provide libraries that talk directly to the SAFE API as well, but I don’t think we should skip the step of providing maximum exposure and compatibility first, which is why I’m keen on supporting a Solid RESTful API via fetch() and perhaps also SAFE NFS in the first instance.


#28

@happybeing I think we should separate out the data representation vs solid integration.

WRT data representation formats:

While JSON-LD can be generalised, it can also conform to RDF. And I’m right there with you that conforming is a good move.

To say it was not designed for RDF is, I think, wrong. It was designed to be another format to serialise this data (with a couple of extra benefits that can make it more usable, but therefore non compliant), and to be usable for the web as it currently is.

But for the purposes of its use on the network, if we agree RDF compliance is paramount, it’s possible that we add tools to validate this (if RDF data were a specific tagtype or some such…).
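As a rough idea of what such a validation tool could check: in standard RDF 1.1 a subject must be an IRI or blank node, a predicate must be an IRI, and an object may be an IRI, blank node or literal, whereas generalised RDF allows any of the three in any position (JSON-LD permits blank nodes as predicates, for instance). A minimal sketch, assuming each term has already been parsed into a hypothetical `(kind, value)` pair:

```python
IRI, BNODE, LITERAL = "iri", "bnode", "literal"

# Which kinds of term standard (non-generalised) RDF 1.1 allows in each position.
ALLOWED = {
    "subject": {IRI, BNODE},
    "predicate": {IRI},
    "object": {IRI, BNODE, LITERAL},
}
POSITIONS = ("subject", "predicate", "object")


def violations(triple):
    """List the positions where a triple is only valid as generalised RDF."""
    problems = []
    for position, (kind, _value) in zip(POSITIONS, triple):
        if kind not in ALLOWED[position]:
            problems.append(f"{position} may not be a {kind}")
    return problems


def is_standard_rdf(triples):
    """True if every triple satisfies the standard RDF position rules."""
    return all(not violations(t) for t in triples)
```

This only checks term positions; a real validator would also need to check that IRIs are well formed and absolute, which is not attempted here.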


I think this could equally happen with any of the RDF formats… there’s nothing stopping anyone writing a Turtle file themselves and getting it all wrong.


I really think you should have a look at https://json-ld.org/playground/. You can easily view the same data in many of the different RDF formats, so you can get a good comparison of what’s going on.


#29

You’re missing my points Josh so when I get time I will clarify. But while I believe I understand your response above it doesn’t IMO address the points I just made. Thanks for the link, I’ll look at that when I get time.


#30

Had a quick look, but these aren’t RDF are they? I don’t see triples, and I’m not sure quads are conformant either. I confess I don’t know enough to say what is / isn’t compliant, which is part of the problem. If we make it easy for people to generate and store JSON-LD with arbitrary JSON-LD generating libraries and tools, we are bound to end up with data that can’t be translated to meaningful conformant RDF representations. It would be like allowing text and trying to convert it to Turtle (extreme, but I hope you see my point).

This is what I mean by suggesting you are missing my points - for example re libraries and tools. My suggestion is that by adopting a non-generalised format, people will choose tools and libraries which generate non-generalised RDF formats (whether Turtle, N3 or whatever).

One thing I’m not clear on still is how you envisage storing triples (ie conformant, not generalised RDF) in a key/value fashion. Can you take some non-generalised RDF and show how it would be represented in JSON-LD and mapped to the key/value style of an MData? I think that might help bridge the gap in understanding between us. I realise you may not have time, but if you do I think it would help the discussion. Cheers! :slight_smile:
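One possible mapping, offered only as a sketch and not necessarily what anyone in the thread has in mind: group the triples by subject into expanded-form JSON-LD node objects, then store one key/value entry per subject, with the subject IRI as key and the node object's JSON text as value. A plain dict stands in for the key/value entries of an MD; no SAFE API is modelled, and the function names are hypothetical:

```python
import json


def to_node_objects(triples):
    """Group (subject, predicate, object, is_literal) tuples into JSON-LD node objects."""
    nodes = {}
    for s, p, o, is_literal in triples:
        node = nodes.setdefault(s, {"@id": s})
        value = {"@value": o} if is_literal else {"@id": o}
        node.setdefault(p, []).append(value)
    return nodes


def to_entries(triples):
    """One key/value entry per subject: key = subject IRI, value = JSON text."""
    return {s: json.dumps(node, sort_keys=True)
            for s, node in to_node_objects(triples).items()}


def from_entries(entries):
    """Recover the original set of triples from the key/value entries."""
    triples = set()
    for text in entries.values():
        node = json.loads(text)
        for predicate, values in node.items():
            if predicate == "@id":
                continue
            for v in values:
                if "@value" in v:
                    triples.add((node["@id"], predicate, v["@value"], True))
                else:
                    triples.add((node["@id"], predicate, v["@id"], False))
    return triples
```

The three elements of each triple do not disappear: the subject becomes the entry key, and predicate and object live inside the value. The round trip through `from_entries` shows that no triple is lost, at least for IRIs and plain string literals, which is all this sketch covers.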


#31

I found the RDF 1.1 Primer a nice way to understand how the same RDF content is described using different serialisation formats: https://www.w3.org/TR/rdf11-primer. For JSON-LD you can see it here.