Trying to build a global network


#1

I am trying to build a global network that I intend to be the seed of a future community network.

My vault is a fork of safe_vault and uses a fork of maidsafe_utilities, but my modifications are minimal and concern only the logging of vault data (I want to be able to collect some data, but I also want my vault to be interoperable with the original safe_vault so that people are not forced to use my fork).

But I came across several problems:

  • Setting disable_external_reachability_requirement to false doesn’t work.

  • Release mode doesn’t compile.

  • I cannot create an account when the number of vaults is not exactly the min section size.

The first 2 problems are not blocking because I just leave disable_external_reachability_requirement set to true and compile in debug mode, but the last one is blocking. I have set min_section_size to 5: if 5 vaults are running then account creation works, but when 6 vaults are running it doesn’t.

I can reproduce this problem both when I try to create a manager account and when I try to create a regular account:

When I try to create the manager account for invitations (./gen_invites --create) with 6 vaults I get this error:

Trying to create an account using given seed from file...
thread 'mainWARN 13:38:38.895597300 Core Event Loop [<unknown> <unknown>:188] Failed to receive response: Timeout
' panicked at 'WARN 13:38:38.896607000 Core Event Loop [<unknown> <unknown>:191] Could not put account to the Network: CoreError(Operation aborted - CoreError::OperationAborted)

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!   unwrap! called on Result::Err                                              !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
safe_app/examples/gen_invites.rs:186,16 in gen_invites

Err(CoreError(Operation aborted - CoreError::OperationAborted))

', /home/tfa/.cargo/registry/src/github.com-1ecc6299db9ec823/unwrap-1.2.1/src/lib.rs:67:25
note: Run with `RUST_BACKTRACE=1` for a backtrace.

After having created the manager successfully (with 5 vaults), I put back the sixth vault and I try to create a regular account with an invite. I get this error in Peruse browser:

Core error: Blocking operation was cancelled

If I delete the last vault to have 5 vaults again, then I can create an account with an invite (another one because the network considers that the previous invitation is already claimed).

The global Peruse log file (with the failed creation, followed by the successful one) is:

T 18-12-05 15:15:08.293993 [<unknown> <unknown>:49] Creating unregistered client.
T 18-12-05 15:15:08.296993 [crust::main::service service.rs:556] Network name: Some("tfa")
T 18-12-05 15:15:08.327002 [crust::main::service service.rs:82] Event loop started
T 18-12-05 15:15:08.327993 [crust::main::config_refresher config_refresher.rs:44] Entered state ConfigRefresher
T 18-12-05 15:15:08.327993 [<unknown> <unknown>:537] Waiting to get connected to the Network...
T 18-12-05 15:15:09.359988 [crust::main::active_connection active_connection.rs:63] Entered state ActiveConnection: PublicId(name: ce06c7..) -> PublicId(name: 75744a..)
T 18-12-05 15:15:09.359988 [crust::main::active_connection active_connection.rs:110] Connection Map inserted: PublicId(name: 75744a..) -> Some(ConnectionId { active_connection: Some(Token(11)), currently_handshaking: 0 })
D 18-12-05 15:15:09.360978 [routing::states::bootstrapping bootstrapping.rs:266] Bootstrapping(ce06c7..) Received BootstrapConnect from 75744a...
D 18-12-05 15:15:09.360978 [routing::states::bootstrapping bootstrapping.rs:332] Bootstrapping(ce06c7..) Sending BootstrapRequest to 75744a...
D 18-12-05 15:15:09.366978 [routing::states::client client.rs:91] Client(ce06c7..) State changed to client.
T 18-12-05 15:15:09.366978 [<unknown> <unknown>:555] Connected to the Network.
T 18-12-05 15:34:37.964360 [<unknown> <unknown>:124] Attempting to log into an acc using client keys.
T 18-12-05 15:34:37.965361 [crust::main::service service.rs:556] Network name: Some("tfa")
T 18-12-05 15:34:38.011359 [crust::main::service service.rs:82] Event loop started
T 18-12-05 15:34:38.011359 [crust::main::config_refresher config_refresher.rs:44] Entered state ConfigRefresher
T 18-12-05 15:34:38.011359 [<unknown> <unknown>:537] Waiting to get connected to the Network...
T 18-12-05 15:34:39.066345 [crust::main::active_connection active_connection.rs:63] Entered state ActiveConnection: PublicId(name: 9cd623..) -> PublicId(name: 75744a..)
T 18-12-05 15:34:39.066345 [crust::main::active_connection active_connection.rs:110] Connection Map inserted: PublicId(name: 75744a..) -> Some(ConnectionId { active_connection: Some(Token(11)), currently_handshaking: 0 })
D 18-12-05 15:34:39.067346 [routing::states::bootstrapping bootstrapping.rs:266] Bootstrapping(9cd623..) Received BootstrapConnect from 75744a...
D 18-12-05 15:34:39.067346 [routing::states::bootstrapping bootstrapping.rs:332] Bootstrapping(9cd623..) Sending BootstrapRequest to 75744a...
D 18-12-05 15:34:39.073345 [crust::main::active_connection active_connection.rs:140] PublicId(name: 9cd623..) - Failed to read from socket: ZeroByteRead
I 18-12-05 15:34:39.073345 [routing::states::bootstrapping bootstrapping.rs:316] Bootstrapping(9cd623..) Connection failed: The chosen proxy node already has connections to the maximum number of clients allowed per proxy.
T 18-12-05 15:34:39.073345 [crust::main::active_connection active_connection.rs:227] Connection Map removed: PublicId(name: 75744a..) -> None
D 18-12-05 15:34:39.073345 [routing::states::bootstrapping bootstrapping.rs:365] Bootstrapping(9cd623..) Dropping bootstrap node PublicId(name: 75744a..) and retrying.
I 18-12-05 15:34:39.073345 [routing::states::bootstrapping bootstrapping.rs:141] Bootstrapping(9cd623..) Lost connection to proxy PublicId(name: 75744a..).
T 18-12-05 15:34:40.170331 [crust::main::active_connection active_connection.rs:63] Entered state ActiveConnection: PublicId(name: 9cd623..) -> PublicId(name: adaad7..)
T 18-12-05 15:34:40.170331 [crust::main::active_connection active_connection.rs:110] Connection Map inserted: PublicId(name: adaad7..) -> Some(ConnectionId { active_connection: Some(Token(20)), currently_handshaking: 0 })
D 18-12-05 15:34:40.170331 [routing::states::bootstrapping bootstrapping.rs:266] Bootstrapping(9cd623..) Received BootstrapConnect from adaad7...
D 18-12-05 15:34:40.170331 [routing::states::bootstrapping bootstrapping.rs:332] Bootstrapping(9cd623..) Sending BootstrapRequest to adaad7...
D 18-12-05 15:34:40.189330 [routing::states::client client.rs:91] Client(9cd623..) State changed to client.
T 18-12-05 15:34:40.189330 [<unknown> <unknown>:555] Connected to the Network.

Of course running 5 vaults permanently isn’t a workaround because I want the network to grow and when it becomes public I won’t control the number of vaults. So, what should I do to ensure that this kind of error doesn’t happen when the network is live?

I suppose my firewall configuration is correct because the setup with 5 vaults is working, but maybe not, so here are the ports allowed for inbound connections:

  • TCP/22 (for ssh)
  • TCP/2376, TCP/2377, UDP/4789, UDP/7946, TCP/7946 (for docker, I use it with an overlay network to log data and the internet network for safe exchanges on port 5483)
  • TCP/5483, UDP/5484 (for safe vault)

#2

Just for more info:

What is it you are expecting it to do, and what do you observe instead?

I just did a fresh compilation in release - safe_vault master (at dd7ef5b6...) builds fine. Try cargo update just in case you have some stale files around. What compilation errors do you get?

Was that with safe_vault master? Also, which version/commit of safe_client_libs did you use?

Lastly, did you try completely disabling the firewall just to see if that’s fine?


#3

That did it. Thanks.

I am not expecting anything in particular; I don’t know what the disable_external_reachability_requirement parameter does. I thought it was a security parameter meant to be disabled in a local network (like many others), so I just inverted it for a global network.

Edit: The error I get (with the docker network) is:

E 18-12-08 18:33:01.480232 Bootstrapper has no active children left - bootstrap has failed
I 18-12-08 18:33:01.480494 Bootstrapping(b9bafc..) Failed to bootstrap. Terminating.

Edit 2: Now I get:

E 18-12-08 20:27:26.010797 Failed to Bootstrap: (FailedExternalReachability) Bootstrappee node could not establish connection to us.
I 18-12-08 20:27:26.012825 Bootstrapping(427e3c..) Failed to bootstrap. Terminating.

I reproduced the problem in a local network. This time I used these elements for the vaults:

  • no invites, no resource proof, … (standard setup for a local network, see config file at the end of the post)
  • current Maidsafe safe_vault crate in master branch (no forks of mine)
  • firewall completely disabled on the host
  • a docker bridge network (which exposes all ports to the containers connected to it)

Min section size is still 5 and results are (depending on the total number of vaults running in the local network):

  • first test: 5 nodes OK, 6 nodes NOK, 7 nodes OK, 8 nodes NOK
  • second test: 5 nodes OK, 6 nodes NOK, 7 nodes NOK, 8 nodes OK

5 nodes are always OK and 6 nodes are always NOK; above 6 the results vary.

I created a program that creates an account with a random seed. OK means the account was successfully created after a few seconds; NOK means the program seemingly hung and I stopped it after about 1 minute.

Here is the source code of the program:

extern crate maidsafe_utilities;
extern crate rand;
extern crate safe_authenticator;
#[macro_use]
extern crate unwrap;

use rand::{thread_rng, Rng};
use safe_authenticator::Authenticator;

fn main() {
    // Initialise logging before touching the network.
    unwrap!(maidsafe_utilities::log::init(true));
    println!("\nTrying to create an account using a random seed...");
    // A fresh random seed gives a new account on every run.
    let seed = generate_random_printable(32);
    // Blocks until the network answers; this is the call that hangs in the NOK runs.
    let _ = unwrap!(Authenticator::create_acc_with_seed(seed.as_str(), || ()));
    println!("Success !");
}

// Returns `len` random ASCII alphanumeric characters.
fn generate_random_printable(len: usize) -> String {
    thread_rng().gen_ascii_chars().take(len).collect()
}
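Since a NOK run has to be stopped by hand, the test could be made self-terminating with a timeout wrapper. The following is only a sketch using the standard library: `run_with_timeout` is a hypothetical helper, not part of safe_client_libs, and the closure in `main` is a stand-in for the blocking account creation call.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Runs `op` on a worker thread and waits at most `limit` for its result.
/// Returns `None` on timeout or if the worker panicked; on timeout the
/// worker thread is simply left running in the background.
fn run_with_timeout<T, F>(op: F, limit: Duration) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The send fails only if the receiver already gave up; ignore that.
        let _ = tx.send(op());
    });
    rx.recv_timeout(limit).ok()
}

fn main() {
    // Stand-in for the blocking call (assumption: the real test would
    // invoke the account creation here instead of sleeping).
    let outcome = run_with_timeout(
        || {
            thread::sleep(Duration::from_millis(50));
            "account created"
        },
        Duration::from_secs(60),
    );
    match outcome {
        Some(msg) => println!("OK: {}", msg),
        None => println!("NOK: no response within the time limit"),
    }
}
```

With a 60 second limit this matches the manual procedure described above: a result within the limit counts as OK, anything else as NOK.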

Note that I didn’t succeed in putting it in an independent crate referencing safe_client_libs; I had to add it as an example directly in safe_client_libs. I am not sure how a sub-crate of a multi-crate project should be referenced, and I tried this for Cargo.toml:

[package]
name = "create_account"
version = "0.1.0"

[dependencies]
rand = "~0.3.18"
maidsafe_utilities = "~0.16.0"
safe_authenticator = { git = "https://github.com/maidsafe/safe_client_libs" }

But I get two errors like this one:

error: cannot find macro `wait_for_response!` in this scope
   --> /root/.cargo/git/checkouts/safe_client_libs-e3345e9360262bab/ea8e3f7/safe_authenticator/src/client.rs:188:27
    |
188 |             .and_then(|_| wait_for_response!(routing_rx, Response::PutMData, msg_id))
    |                           ^^^^^^^^^^^^^^^^^

Is there a place where the way to use the authenticator in a rust program is explained?

Appendix:

  • safe_vault.crust.config
{
  "hard_coded_contacts": [
    "172.19.0.2:5483",
    "172.19.0.3:5483",
    "172.19.0.4:5483",
    "172.19.0.5:5483",
    "172.19.0.6:5483"
  ],
  "whitelisted_node_ips": null,
  "whitelisted_client_ips": null,
  "tcp_acceptor_port": 5483,
  "force_acceptor_port_in_ext_ep": false,
  "service_discovery_port": null,
  "bootstrap_cache_name": null,
  "network_name": "local",
  "dev": {
    "disable_external_reachability_requirement": true
  }
}
  • safe_vault.routing.config
{
  "dev": {
    "allow_multiple_lan_nodes": true,
    "disable_client_rate_limiter": true,
    "disable_resource_proof": true,
    "min_section_size": 5
  }
}
  • safe_vault.vault.config
{
  "dev": {
    "disable_mutation_limit": true
  }
}

#4

I also reproduced the problems in a global network without using docker at all (but using my safe_vault fork).

I solved the problem of disable_external_reachability_requirement == false not working, by setting force_acceptor_port_in_ext_ep to true with this safe_vault.crust.config file (with masked IP addresses):

{
  "hard_coded_contacts": [ "...:5483", "...:5483", "...:5483", "...:5483",
    "...:5483", "...:5483", "...:5483", "...:5483" ],
  "whitelisted_node_ips": null,
  "whitelisted_client_ips": null,
  "tcp_acceptor_port": 5483,
  "force_acceptor_port_in_ext_ep": true,
  "service_discovery_port": null,
  "bootstrap_cache_name": null,
  "network_name": "tfa",
  "dev": {
    "disable_external_reachability_requirement": false
  }
}

But I still get the problem of account creation working with 5 nodes but not working with 6 nodes.


#5

External reachability, if enabled (disable_external_reachability_requirement == false), means that the proxy will try to connect back to the bootstrapping node, and only if that succeeds will it allow the bootstrap to be successful. It’s a way to ensure there are more nodes on the network which can be reached directly, without hole punching etc. (i.e., nodes that are public, port-forwarded, or …).
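The connect-back idea can be illustrated with a plain TCP probe. This is a minimal sketch of the concept only, not Crust's actual implementation: `is_externally_reachable` is a hypothetical helper, and the address in `main` is a placeholder from a documentation range.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Returns true if a direct TCP connection to `advertised` succeeds
/// within `limit`. A proxy applying this kind of check would accept the
/// bootstrapping node only when the probe succeeds.
fn is_externally_reachable(advertised: &SocketAddr, limit: Duration) -> bool {
    TcpStream::connect_timeout(advertised, limit).is_ok()
}

fn main() {
    // Placeholder endpoint (assumption): the address and port the
    // bootstrapping node claims to be listening on.
    let advertised: SocketAddr = "203.0.113.10:5483"
        .parse()
        .expect("valid socket address");

    if is_externally_reachable(&advertised, Duration::from_secs(3)) {
        println!("bootstrap accepted: node is directly reachable");
    } else {
        println!("bootstrap rejected: could not connect back to the node");
    }
}
```

A node behind NAT without port forwarding would fail such a probe, which is consistent with the FailedExternalReachability error quoted earlier in the thread.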


#6

Hey @nbaksalyar, can you see if you can reproduce this? Thanks!


#7

I have recorded a session with asciinema that demonstrates the problem on a local network, without docker and with crates from Maidsafe exclusively.

asciicast


#8

Hi @tfa, we were able to reproduce this behaviour and we’re looking into it.

Thanks for the report!


#9

Hi @tfa,
Could you please try one thing: change the routing config file for the client apps/browser to be the same as on the vaults side (i.e. change the dev section in <app name>.routing.config). This should do the trick, because clients use routing too, and min_section_size should be the same on both sides.
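A quick way to catch this kind of mismatch is to compare the value in both files before starting the client. The sketch below is an assumption-laden illustration: `min_section_size` is a hypothetical helper that does a naive text scan rather than real JSON parsing, and the two config strings are inlined instead of being read from the vault's and the app's routing config files.

```rust
/// Naively extracts the integer value of "min_section_size" from the text
/// of a routing config. Returns None if the key is absent (for example
/// when the file contains `"dev": null`) or the value is not a number.
fn min_section_size(config_text: &str) -> Option<usize> {
    let key = "\"min_section_size\"";
    let after_key = &config_text[config_text.find(key)? + key.len()..];
    let after_colon = after_key.trim_start().strip_prefix(':')?.trim_start();
    let digits: String = after_colon
        .chars()
        .take_while(|c| c.is_ascii_digit())
        .collect();
    digits.parse().ok()
}

fn main() {
    // Inlined samples (assumption): a vault config with a dev section and
    // a client config without one.
    let vault = r#"{ "dev": { "disable_resource_proof": true, "min_section_size": 5 } }"#;
    let client = r#"{ "dev": null }"#;

    let (v, c) = (min_section_size(vault), min_section_size(client));
    if v == c {
        println!("min_section_size agrees on both sides: {:?}", v);
    } else {
        println!("mismatch: vault has {:?}, client has {:?}", v, c);
    }
}
```

For anything beyond a sanity check, a proper JSON library would be the safer choice, since this scan ignores nesting and would also match the key inside a string value.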


#10

Probably worth updating the README.md in SCL to say that, @nbaksalyar.


#11

I was using "dev": null in the client routing config file and copying the file from the vault corrects the problem. Thank you very much and sorry for the disturbance.