Why only 500 MiB limit on upload?

mav · November 4, 2019, 4:01am

See this post for context.

Upload of 499 MiB takes 15s

Upload of 500 MiB takes 6m15s and shows a TimeOut error.

Initially this was thought to be a limit in self_encryption.

However the safe-cli (via safe-api?) does not call self_encryption.

Even if it did, the self_encryption limit is set to 1 GiB so why is it failing at 500 MiB? There’s nothing hardcoded to indicate this should be a limit?

Further to this, there are two definitions of safe data types, one lot in safe_core (immutable data, mutable data etc) and another lot in safe-nd… which one should be used and can this be simplified? (It’s a bit rhetorical, I think safe-nd is the truth and safe_core should be tidied up and have data types completely removed but of course that takes time so I understand it’s been left there for now, maybe I’m missing some purpose to having them split into two places?).

I hope to understand why the upload fails at 500 MiB in this topic, and ultimately to say ‘change this line of code to do bigger uploads’… so far I don’t know how to progress, any thoughts are welcome.

lionel.faber · November 4, 2019, 7:18am

Hey @mav

The reason why the 500 MiB upload fails is because the CLI doesn’t use self encryption yet. It’s on our plate though, and I think this contribution will be super useful.

So right now for Immutable Data, the safe CLI sends a single PutIData request without any encryption/chunking. The self encryption API is already available in safe_core. I can put together an issue with some more details so you can pick it up if you’d like

But I guess there’s more to it than just changing the API. @bochaco and @joshuef were talking about how the CLI can provide an additional parameter to decide if the self encryptor should be invoked or not.

mav · November 4, 2019, 7:24am

Cool, good to know.

Is there a reason for a single 500 MiB chunk not working that you know of? I’m happy to dig around for a while but if it’s something you know off the top of your head all the better

lionel.faber · November 4, 2019, 7:26am

Apologies if this seems confusing to you, but, I don’t think this is the case
Safe-Nd is the only place where we have the data type definitions. Could you please share which part of safe_core looks like data type definitions to you?

I’m guessing the potential confusion is that both safe-nd and safe_core have APIs which look like data type APIs. For example,

This API in safe_core and this API in safe-nd might seem similar since they do the same thing, i.e. get the value for a particular key in mutable data. The difference is that the safe-nd API requires a mutable data object to be available in memory, whereas the safe_core API fetches the value from the network.

Is this the confusion you are having?

lionel.faber · November 4, 2019, 7:27am

That’s an easy one
There’s a 180 second timeout defined in SAFE Client Libs.

mav · November 4, 2019, 7:38am

Ah this explains a lot… thanks for clarifying.

I’m confused by, say

safe_core/src/immutable_data.rs L28 fn create

vs

safe-nd/src/immutable_data.rs L119 PubImmutableData new

To me these seem like they’d do the same thing, ie create a new ImmutableData. But the detail of what they do is, like you say, different. Part of this confusion is simply being new to the codebase (or at least new after six months away from it).

Why does this 180s timeout correspond to exactly 500 MiB as the failure size? To be more specific, why do my laptop and desktop both fail at exactly 500 MiB even though their disk speeds are significantly (at least 10%) different? Why is it not failing at, say, 550 MiB on my desktop? 499 MiB always succeeds and 500 MiB always fails, so to me this is almost certainly not a timing issue, it’s a size issue.

I guess it’s also important to be specific about the type of failure… 10s between ‘failing’ and then 3m between ‘timing out’…

[2019-11-03T22:19:42Z INFO  safe_api::api::files] Processing /tmp/500meg.bin...

... 10s pass

[2019-11-03T22:19:52Z DEBUG quic_p2p::utils] ERROR in communication with peer 127.0.0.1:56281: Connection(ApplicationClosed { reason: ApplicationClose { error_code: 0, reason: b"" } }) - Connection Error: closed by peer: 0. Details: Incoming streams failed

...2m57s pass

[2019-11-03T22:22:49Z INFO  safe_api::api::files] Skipping file "/tmp/500meg.bin". [Error] NetDataError - Failed to PUT Published ImmutableData: CoreError(Request has timed out - CoreError::RequestTimeout)

lionel.faber · November 4, 2019, 7:48am

This is actually an interesting part and it relates to this thread as well

The safe-nd API just creates a single IData object with whatever data you put into it. The safe_core::immutable_data::create API takes the data and self-encrypts it into 1MB chunks that are stored on the network. This is done recursively until the data map is less than 1MB in size.

The IData that the latter returns is just the data map, which you can store on the network, locally, or wherever you like.
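To make the recursion described above concrete, here is a toy sketch. It is not the self_encryption or safe_core API: `Store`, `chunk_name` and `store_chunked` are hypothetical names, the "network" is an in-memory map, chunk names come from a non-cryptographic hash, and nothing is actually encrypted. It only shows the shape of "chunk the data, build a data map, repeat until the data map fits in one chunk".

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Toy content-addressed store standing in for the network (hypothetical).
type Store = HashMap<u64, Vec<u8>>;

/// Name a chunk by hashing its content. The real library uses a
/// cryptographic hash and also encrypts each chunk; this does neither.
fn chunk_name(chunk: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    chunk.hash(&mut hasher);
    hasher.finish()
}

/// Split `data` into chunks of at most `chunk_size` bytes, store them, and
/// build a data map listing the 8-byte chunk names. Repeat on the data map
/// itself until it is smaller than one chunk, then return it.
/// Data that is already smaller than one chunk is returned unchanged.
fn store_chunked(data: &[u8], chunk_size: usize, store: &mut Store) -> Vec<u8> {
    // With 8-byte names, chunks of at least 16 bytes guarantee that each
    // level's data map is shorter than its input, so the loop terminates.
    assert!(chunk_size >= 16, "chunk_size must be at least 16 bytes");
    let mut current = data.to_vec();
    while current.len() >= chunk_size {
        let mut data_map = Vec::new();
        for chunk in current.chunks(chunk_size) {
            let name = chunk_name(chunk);
            store.insert(name, chunk.to_vec());
            data_map.extend_from_slice(&name.to_be_bytes());
        }
        current = data_map;
    }
    current
}
```

With 64-byte chunks, 1000 bytes of input become 16 chunks and a 128-byte data map, which is chunked once more into a 16-byte root data map. Only that small root is handed back to the caller, which mirrors the point that the returned IData is just the data map.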

lionel.faber · November 4, 2019, 7:53am

I don’t think the write speeds of your computer determine the speed, since it’s just a client that puts the entire 500 MB in an IData object and sends the request to the shared vault. Depending on the connectivity / write speeds on the shared vault this size may differ. I guess one way to check this would be to start a vault on both these devices and see if the size differs. To my knowledge there is no size restriction anywhere.

And I think it’s also worth noting that the 180 seconds for the request timeout starts when the request leaves the client, so any processing before that doesn’t count. The response is expected to arrive within the timeout. Any post-processing after the response arrives doesn’t count either.

mav · November 4, 2019, 10:56pm

The ‘error’ is coming from quic-p2p, DEFAULT_MAX_ALLOWED_MSG_SIZE is set to 500 MiB. Changing it allowed upload to proceed without error. Thanks for the assistance @lionel.faber especially during a time that I know is very busy for you and everyone at maidsafe.

https://github.com/maidsafe/quic-p2p/blob/db9f47c3dce2798a58e6c01a77ef55bc765fb6d4/src/lib.rs#L104-L106


/// Default maximum allowed message size. We'll error out on any bigger messages and probably
/// shutdown the connection. This value can be overridden via the `Config` option.
pub const DEFAULT_MAX_ALLOWED_MSG_SIZE: usize = 500 * 1024 * 1024; // 500MiB

Obviously once self_encryption is back in use it will be very uncommon to hit this limit. Interestingly this change affected vault, authenticator and cli since they all talk with each other using quic-p2p. So changing it locally won’t help upload single large unchunked files to the live network since many (most?) vaults won’t accept large files due to their quic-p2p settings.
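A rough sketch of why 499 MiB passes and 500 MiB fails regardless of machine speed. The constant has the same value as the one quoted above, but `fits` and its `overhead` parameter are hypothetical: this is not quic-p2p's actual check, and the real per-message overhead is not known here. The assumption is only that a PUT request carries at least a few bytes on top of the file content.

```rust
/// Same value as quic-p2p's DEFAULT_MAX_ALLOWED_MSG_SIZE (500 MiB).
const MAX_MSG_SIZE: usize = 500 * 1024 * 1024;

/// Hypothetical check: does a payload plus its message overhead fit?
/// `overhead` stands for whatever the request wrapper adds (assumed > 0).
fn fits(payload_len: usize, overhead: usize) -> bool {
    payload_len + overhead <= MAX_MSG_SIZE
}
```

Under that assumption, `fits(499 * 1024 * 1024, 1024)` is true while `fits(500 * 1024 * 1024, 1024)` is false, which matches a hard size boundary rather than a timing effect.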

Perhaps something in the future to consider is how this error can be more meaningful / less difficult to trace through the code.

Also to consider is perhaps the exit conditions. I think (maybe incorrectly) the process could exit after the stream fails rather than wait 3m for the timeout to happen. This would definitely help end-user perceptions. Waiting 3m for nothing (which they don’t know is nothing) is a shame when they could potentially get on with their business in much less time.
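One way the early exit suggested above could look, sketched with a standard-library channel standing in for the connection. This is not SAFE Client Libs code: `Outcome` and `wait_for_reply` are hypothetical names, and the assumption is that the client can observe the connection closing separately from the timer running out.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// How a pending request can end (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
enum Outcome {
    Response(String),
    ConnectionClosed,
    TimedOut,
}

/// Wait for a reply for at most `timeout`, but return as soon as the
/// sending side goes away instead of sitting out the full timeout.
fn wait_for_reply(rx: &Receiver<String>, timeout: Duration) -> Outcome {
    match rx.recv_timeout(timeout) {
        Ok(reply) => Outcome::Response(reply),
        // The peer dropped its end: no reply can ever arrive, so fail fast.
        Err(RecvTimeoutError::Disconnected) => Outcome::ConnectionClosed,
        // The peer is still there but has not answered in time.
        Err(RecvTimeoutError::Timeout) => Outcome::TimedOut,
    }
}
```

If the sender is dropped shortly after the wait begins, the call returns `ConnectionClosed` almost immediately even with a long timeout, so the user would see the failure in seconds rather than after three minutes.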

lionel.faber · November 5, 2019, 4:52am

Ah nice catch @mav
I wasn’t aware of that quic-p2p setting.

This feedback is super useful. I agree that waiting for 3 mins isn’t the right way to do this here.
Feel free to raise an issue on the SCL repo. I can chime in with more details so we can get this resolved by the team / community

joshuef · November 5, 2019, 8:17am

Maybe we should rename one of these funcs to avoid confusion? chunked_immutable_data / self_encrypted_idata or some such? (or should all instances of IData be self-encrypted? I think you answered this before @lionel.faber, but I may have forgotten )

mav · November 5, 2019, 10:27pm

Personally I feel all data should be self-encrypted. I’m open to arguments for the alternative, but I think this because:

- it allows vaults to optimise around a known max chunk size, which is lost if some files are not self encrypted. This optimisation affects network, storage and compute, the entire stack, so I feel it is quite an understated but important aspect of the network.
- forcing self encryption makes single continuous blocks of data more difficult to generate, eg uploading 100 GB at a very specific prefix eg 9f86d081884c7... is trivial without chunks, but (maybe) takes quite a lot of work when using chunks (self-chosen names for appendable data notwithstanding). This affects churn and min vault size etc.
- keeping a known granularity of data seems to have clear benefits, but allowing arbitrarily large chunk sizes seems not to have any specific clear benefits.
- self encryption improves privacy and durability, since it increases the chance of the file being broadly distributed. Not sure if this is really true…
- the community has engaged with the project on the expectation of self encryption being a key feature. I feel it will take some convincing that it’s ok for it to be optional or non-default.
- making it compulsory avoids the user asking the question ‘should I’ or ‘why would I do one or the other’… eg you want to put your movie collection online; to a naive user it seems that of course you wouldn’t chunk it, of course you’d keep the files ‘intact’ and ‘original’. Fewer choices is usually better ux. Even having it as an ‘advanced’ option still makes it a choice.

bochaco · November 6, 2019, 3:41pm

FYI, just so everyone becomes aware of it: https://github.com/maidsafe/safe-api/pull/307

joshuef · November 7, 2019, 2:55pm

Given the above, maybe we should remove the ability in SCL to not self encrypt IData? Thoughts @lionel.faber?