[RFC Discussion]: XOR-URLs

#45

oh you mean owned by the app and not owned by my account - aren’t you? and the browser doesn’t own my png …?

ps: oh nice - and when i don’t accidentally damage the link before re-encoding it to base32z i get exactly your link! @bochaco

1 Like
#46

No, I just mean that if the ImmD you are trying to fetch is encrypted by another app, the call to app.immutableData.fetch(iDataAddress) fails because it cannot decrypt the ImmutableData, therefore you don’t get the Reader object where you could call getXorUrl(mimeType) on. So I’m thinking we should have an API that you can do app.immutableData.genXorUrl(iDataAddress, mimeType) which doesn’t try to fetch it but just generate the XOR-URL for you.

That’s really good news!! so to clarify, are you saying that there is no issue then between the Python implementation and JS one for CID as we suspected before?

1 Like
#47

so you mean yes :stuck_out_tongue_winking_eye: i wasn’t aware that encrypted means only possible to decrypt by the app that created it Oo … how can that be …? how do i get the decryption key? do i always need to re-upload an immutable as unencrypted if i want to share it? Oo

No absolutely not - there is an issue and it’s all down in the ‘standard libs for cid’ - I use the character table I found in the JS implementation of base32z to re-encode base32 to JS-base32z ‘by hand’ - the ‘regular python functions’ to encode to base32z leads to different strings…

(my opinion regarding using base32z or not in your api didn’t change … - I think you should definitely go with the standard base32 because that comes with less trouble in all languages … the skipping of certain ambiguous characters might be something one could consider … but this weird re-ordering to match ‘more frequent characters’ to easier to identify characters doesn’t make sense at all … since we are talking about immutable data identifiers => some hash of encrypted data => (at least pseudo-)random data by design and for mutables only not randomly generated mutables will not be randomly distributed (standard containers and stuff using that as starting point will again be randomly distributed because of the account being a random one…) … while i would prefer even more the use of base16…)

…and that i (again) made a mistake with the base32z-encoding thing so that it took me longer to come up with the correct way of doing it again shows why it would be better to just use the standardized base32 … i lost hours in finding the correct way to re-encode data, i lost hours for reverse-engineering base32z to base32 encoded safe-links and then see where the difference is/how the correct code for the image-link you used is, you lost hours in reading my posts and answering my questions … and in the end you and me both are slower in working for the SAFE Network …

1 Like
#48

And this :hugs: absolutely! We’re getting there :slightly_smiling_face: can’t wait to see some really powerful apps on safe :sunglasses:

2 Likes
#49

Nice, good finding, I think you should send a PR to the their repo to fix it :slight_smile:

Not trying to defend any base encoding, in fact I think it’s not that important for two rasons:

  • any base encoding will be decodable (that’s the point of the CID spec), so if some people use base32, base64, etc. they will all would work fine and can fetch the same content. I think the one that most people use would become the standard.
  • remember that both JS and your python implementation won’t be needed (and would go away) if this becomes natively supported by SAFE libs, note the following from our previous discussions:
1 Like
#50

And now, in an attempt to bring back the main discussion, yesterday we had a talk with @joshuef about considering the type tag to be encoded in the CID itself rather than being the port number of the URL. This would not only solve issues we already face with some browsers or libs not supporting port numbers larger than 65525 (even that the spec doesn’t mention such a limit IIRC), but also in the future if we need additional parameters for our data types we can embedd them in the CID string, i.e. the <multihash-content-address> of the CID spec, after all the type tag is part of our addressing system and would still make sense to be part of the content-address part. Just bringing it up for discussion and feedback about this alternative.
The only problem is that strictly speaking that wouldn’t be just a hash but a hash+number, which wouldn’t be a valid CID anymore…?..

1 Like
#51

Could the mutlticodec <key> be used for the tag_type in order to leave the hash as is (see here).

#52

This is the CID string:
<cidv1> ::= <multibase-prefix><cid-version><multicodec-content-type><multihash-content-address>

And the <multihash-content-address> is:
<varint hash function code><varint digest size in bytes><hash function output>

So do you mean to use a multicodec in the <hash function output> ? or where exactly?

#53

valid CID as in ‘a cid is a multiformat-encoded hash’ and we would be be using it as ‘multiformat encoded hash+additional data’ - or did i misunderstand you?

+1

maybe not as intended by the creators but as we don’t only use the hash for data discovery/recovery and it works well with the cid format that makes great sense imho

#54

Like the following definition, which includes clear definitions of what each of these parts exactly are and encode:

If we change any of that we cannot say our URl contains a CID

1 Like
#55

if we’d hash all the bytes of xor-name and typetag it would be a valid cid - wouldn’t it? what would speak against this? - as it indeed would be simpler than splitting up a link into cid + type-tag … +it would benefit from the properties of a cid …? (what would be the motivation to exclude something from the encoding algorithm?)

#56

Yes, that’s an option, but just trying to be very critical and strict (not saying it’s correct), hashing xorname+typetag can strictly be considered a SAFE address? probably yes, but in SAFE currently the xorname is the address, that’s what routing only knows, the type tag is something else, still to locate the data…so that’s all I mean

2 Likes
#57

Sorry, ignore me (often best :stuck_out_tongue_winking_eye:) I misread the page I linked. I see it is not part of CID but showing the generic case of which the CID <mc><hash> is an implementation. Dang it!

1 Like
#58

Okay - since we’re talking now - what’s speaking against adding a small checksum and this systematic to share private data that comes with keys?

Or which other alternatives can you think of (what are their upsides?)

#59

I’m not sure why I don’t see the need for a checksum, if you cannot fetch the content it’s probably invalid, even if the XOR-URL was checked-sum but couldn’t fetch the content what is it that you can get/conclude out of it?

Having decryption keys in the URL to a private content…hold on…isn’t that contradictory? if you do that then decryption keys become just like “an additional encoding” to your URL, as eveyone with such URL can see the content and therefore not private. In fact, some toosl out there already handle sharing data by just providing a difficult-to-guess URL, but if you have the URL you have access, so it’s pseudo private and shared.

I think encryption keys or any key needed to decrypt/fetch a piece of data needs to be out of band, with some other type of sharing mechanism at the application layer (I wouldn’t disagree at all we should provide those utilities though)

Edit: an example of application layer solution is safe://<XOR-URL>?key=<keys to decrypt>

1 Like
#60

True - but you need Internet connection for validating this (and need to wait for the response ‘not available’) so only online possible and slower

Well - simple sharing of encrypted data with a group? (for example my holiday pictures with my grandma (who can click a link in a messenger but for sure can’t operate many programs) and family)

safe://<XOR-URL>?key=<keys to decrypt>

how is this different except for the missing upsides that come with cids?

#61

That is something to be handled by the application and not by the routing/resolver, otherwise it’s like putting encryption keys encoded in a domain name in existing internet

#62

And…? What’s the problem with that? The same mutable would have multiple links then that grant different access rights…?

The application can just take the link as it is and hand it to the api => has the rights that it got granted without the programmer needing to examine the link for maybe or not added keys and creating in those cases other mutables with additional key information added to them…? (and since you need to provide a function for this functionality anyway… Where is the difference for you to process 1 longer link vs. Link +key…?)

edit/ps:

do you realize that you “requested for comments” here but as i comment on it and give technical/usability arguments + examples i only get an answer after asking repeatedly +only get opinions back but no arguments…?

pps: and the difference between ‘putting encryption keys encoded in a domain name in existing internet’ and my suggestion is that the current DNS system is a public list and therefore it’s 100% different to my suggestion … of course you wouldn’t want to have your keys in plain text in a public list … that’s why you can’t do it currently … but we don’t have this limit on safe and names need to be resolved locally anyway so why would we opt for the clumsy work-around that needs to be used by the clearnet…?

Ppps: maybe there is a hidden argument behind the opinion… ‘out of band’ of what do you mean…? Since sharing a link with additional key info or sharing link including the key info is always the same context/band… Out of band of the name resolution? If someone can read the lib calls for name resolution he can read the lib call for retrieving the piece of data with key as argument too…? So I don’t see additional security by splitting it up…? (while if you split up the cid containing both key and xor name and transfer both parts through 2 different channels none of the 2 by themselves can make sense at all if caught by a 3rd party and only put together again will reveal the name+rights at the same time)

#63

I’m not sure that’s completely accurate (and, forgive me, a little unfair). I think everyone here is happy to head new views on things but that doesn’t mean that we all have to agree on everything.

Equally, as we’ve noted in here a couple of times, we’re not ignoring this… it’s just that we’re not always able to reply straight away (or necessarily fast, especially when you’re providing food for thought). I’m sorry if that’s frustrating, I get it, but sometimes we’re going to have to ask for a little patience.


For example, there’s some solid reasoning for maintaining a z-base32 : readability and avoiding confusion.

You argument re: ‘its not standard’, is not a strong one given a) you can use any hashing function you want with CID, as @bochaco has pointed out, that’s part of the purpose (it seems issues you have are issues with a python implementation… which can be PRd to to help everyone out there). and b) it’s proposed that this hashing function should be available in the Safe Client Libs API. So it’s going to be language independent, which is what we’re considering here. (Though props to you for diving in to a language where we don’t have an impl yet! That is massively appreciated :+1: :bowing_man:

#64

Regarding links w/encryption keys:

A link + key is not secure. If it’s passed all together, it may as well not be encrypted. (Which is about as efficient as not encypting, but just no publishing the link anywhere… the link should be effectively un-guessable ).

That’s where ‘out of band’ comes in. Which means: passed but not in the same medium. ie: share your encrypted photos XOR link with friends and family, but tell them the password to decrypt it on the phone.

But they could only do this on the decrypting computer (where a decrypt key is entered). Whereas encoding into the link, anyone could do this without needing your decryption keys… as it’s in the link.