MD and encryption.

oetyng · February 4, 2019, 1:04pm

Questions:

1) Why is Encrypt/Decrypt asynchronous? Is it doing network requests?
(If not, encryption is a CPU-bound operation and the async/await pattern should not be used.)
EDIT: I figured that encryption keys are fetched, but that shouldn’t be the case either, since we’re passing in the MDataInfo, which already has the keys.
2) Why are there separate methods for encrypting the key and the value of an MD entry? Decryption uses a single method, and to the user of the API it is not clear why there should be two methods for encrypting. In what situation is it valuable for the user to have this knowledge?

(I restored the question, @bzee. I think I didn’t actually figure out the correct answer.)

bzee · February 4, 2019, 2:26pm

Could you share your answer to the other question?

If I recall correctly, it has something to do with a nonce. I think @ustulation knows the details.

bzee · February 4, 2019, 6:40pm

No, it does not do network requests. This is a question I’ve asked myself numerous times, so I decided to trace what these routines do. Let me think out loud: the technical answer is that it uses a callback, and is therefore exposed as an asynchronous function in the high-level APIs. But that is a non-answer: why does it use a callback to communicate the result back? Perhaps it has to do with RFC 43, utilising threads (and thus cores) for performance gain.

Hope someone from @maidsafe can shed some light on these good questions!

ravinderjangra · February 5, 2019, 5:26am

It’s not doing any network requests.

CPU-bound code is where you do a heavy in-app calculation, such as calculating and displaying the remaining distance to the finish line in a car racing game (in our case, data encryption). Imagine what would happen if that were done synchronously: the UI would be blocked. Therefore, in the CPU-bound case, you await an async method whose work runs on a background thread. You can read about this in detail in the official docs.

We have two specific methods for encrypting key and value because, for a value, the nonce is generated randomly, while for a key the nonce is generated in a predictable way, so that we don’t have to fetch the entire list of keys when a single key is requested.

For example, you know that there’s a key “Safe”, but it’s encrypted (because an MD is private). If the nonce is random, how do you know which key to fetch? It could be ‘ABC’, it could be ‘DEF’. Because of this reason we use predictable nonces so that you know for certain that the key “Safe” will be encrypted as ‘GHI’, and you can fetch this particular key from the MD.
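To make the difference concrete, here is a minimal sketch of the two kinds of nonce. It is not the actual library code: AES-GCM from .NET stands in for whatever cipher the library uses, and `NonceSketch`, `DeterministicNonce`, `RandomNonce`, `Encrypt` and the `seed` parameter are hypothetical names made up for illustration. The assumption is that the predictable nonce is derived by hashing a per-MD seed together with the plaintext key.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

static class NonceSketch
{
    const int NonceSize = 12; // AES-GCM nonce length in bytes
    const int TagSize = 16;   // AES-GCM authentication tag length in bytes

    // Hypothetical: a predictable nonce, derived from a per-MD seed and the
    // plaintext entry key. The same inputs always give the same nonce.
    public static byte[] DeterministicNonce(byte[] seed, byte[] plaintext)
    {
        using var sha = SHA256.Create();
        var hash = sha.ComputeHash(seed.Concat(plaintext).ToArray());
        return hash.Take(NonceSize).ToArray();
    }

    // A fresh random nonce, as described above for entry values.
    public static byte[] RandomNonce()
    {
        var nonce = new byte[NonceSize];
        RandomNumberGenerator.Fill(nonce);
        return nonce;
    }

    // Encrypts and returns nonce || ciphertext || tag.
    public static byte[] Encrypt(byte[] secretKey, byte[] nonce, byte[] plaintext)
    {
        var cipher = new byte[plaintext.Length];
        var tag = new byte[TagSize];
        using var aes = new AesGcm(secretKey);
        aes.Encrypt(nonce, plaintext, cipher, tag);
        return nonce.Concat(cipher).Concat(tag).ToArray();
    }
}
```

With this sketch, encrypting the key “Safe” twice using `DeterministicNonce` gives byte-for-byte identical output, so the encrypted key can be recomputed and looked up directly. Encrypting it twice using `RandomNonce` gives two different outputs, which is fine for values because they are never looked up by their ciphertext.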

oetyng · February 5, 2019, 1:04pm

2) That makes sense. Even though the two methods take the same parameters, the underlying logic needs to know which kind of data it is encrypting. We’re passing in the MDataInfo, and I guess that the key encryption method uses the Nonce property of that instance (which is the same as when the MData was first created), while the value encryption method makes a new nonce?

1) I am aware of the reasons why we would want parallel processing; that is actually why I asked this question in the first place. So now that we’ve established that there is no network request (in which case I would have had nothing further to comment on), I will continue with my observation that the API might be assuming too much about how it will be used.
When fetching individual entries from various MDs, the async / await pattern would probably be perfectly fine, but when fetching and decrypting many entries in a tight loop, async / await is not a good choice (this is also pointed out in the documentation you linked to).
In that case you would instead want to dedicate threads to the work and avoid the overhead of async / await, e.g. by using Parallel.ForEach.

This leads me to think that it should be up to the user of the API to either do something like

// DoTheEncryptionOrDecryption is a placeholder for the synchronous call
var encTask = Task.Run(() => DoTheEncryptionOrDecryption());
var result = await encTask;

or

Parallel.ForEach(entries, entry =>
{
    var key = decrypt(entry.Key);
    var value = decrypt(entry.Value);
    // add to thread safe collection
});

For example, what I am working on, a data store which handles a large number of entries at a time, would benefit from that. Currently I would need to do this:

Parallel.ForEach(entries, entry =>
{
    var key = keyDecryptionTask(entry.Key).GetAwaiter().GetResult();
    var value = valueDecryptionTask(entry.Value).GetAwaiter().GetResult();
    ...
});

So while the above surely works, I’m not sure it is actually preferable for the API to assume as much about its usage as it currently does.
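As a rough sketch of the alternative, the API could expose a synchronous primitive and leave the threading decision to the caller. Everything here is hypothetical and not part of the actual API: `DecryptSketch`, `Decrypt`, `DecryptAsync` and `DecryptAll` are made-up names, and a trivial XOR stands in for the real decryption so that the example is self-contained.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

static class DecryptSketch
{
    // Hypothetical synchronous, CPU-bound primitive.
    // XOR is only a stand-in for real decryption.
    public static byte[] Decrypt(byte[] cipher) =>
        cipher.Select(b => (byte)(b ^ 0x5A)).ToArray();

    // Optional wrapper for callers that must not block, e.g. a UI thread.
    public static Task<byte[]> DecryptAsync(byte[] cipher) =>
        Task.Run(() => Decrypt(cipher));

    // Bulk path: the caller dedicates threads via Parallel.ForEach
    // and collects results in a thread-safe dictionary.
    public static ConcurrentDictionary<string, byte[]> DecryptAll(
        IEnumerable<KeyValuePair<byte[], byte[]>> entries)
    {
        var result = new ConcurrentDictionary<string, byte[]>();
        Parallel.ForEach(entries, entry =>
        {
            var key = Convert.ToBase64String(Decrypt(entry.Key));
            result[key] = Decrypt(entry.Value);
        });
        return result;
    }
}
```

A caller fetching a single entry could `await DecryptSketch.DecryptAsync(...)`, while a data store processing many entries could call `DecryptAll` without having to block on a Task for each entry.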