Cannot start node 0.28.0 due to error: Routing(Network(IgdNotSupported))

$ ~/.safe/node/sn_node --local-ip 2001:983:8610:1:854:efb1:52e6:85a3 -l 12000 --first
Cannot start node due to error: Routing(Network(IgdNotSupported))
[sn_node] ERROR 2021-02-27T18:05:06.088588130+01:00 [src/bin/sn_node.rs:118] Cannot start node due to error: Routing(Network(IgdNotSupported))

I don’t have it up in front of me, but two quick ideas. The TL;DR though is that, because this is an IPv6 address not running on localhost, you might want to try supplying --external-ip, because IPv6 port forwarding isn’t supported afaik.

  • IPv6 issue

Try supplying 127.0.0.1 or some other IPv4-formatted address just to see what happens?

Looking at the error code docs (in qp2p), the only way you can get this error is if you try to use IGD with IPv6.

Is this the only log line by the way? Every time qp2p returns this error, it’s accompanied by a corresponding info! level log (it only happens in a handful of spots so it was pretty easy to track them all down).

  • --forward-port issue

If this is not running on LAN or local and you are genesis, you need either --forward-port (which doesn’t seem to work for IPv6 as noted) or to supply --external-ip, I think. I pasted the docs here in another response b/c something similar was happening in the other thread. Not sure though as I haven’t tested it out yet.
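To make the first idea concrete, here is a minimal sketch of the rule as I read it from the docs: automatic (IGD) port forwarding combined with an IPv6 local address yields IgdNotSupported. The function and enum below are hypothetical stand-ins for illustration, not qp2p’s actual code.

```rust
use std::net::IpAddr;

/// Hypothetical error type that only mirrors the variant name seen in the log.
#[derive(Debug, PartialEq)]
pub enum NetworkError {
    IgdNotSupported,
}

/// Hypothetical model of the rule described above: asking for automatic
/// (IGD) port forwarding with an IPv6 local address is rejected, while
/// every other combination is accepted.
pub fn check_port_forwarding(local_ip: IpAddr, forward_port: bool) -> Result<(), NetworkError> {
    if forward_port && local_ip.is_ipv6() {
        return Err(NetworkError::IgdNotSupported);
    }
    Ok(())
}
```

Under this reading, the IPv6 address from the original command fails only when forwarding is requested, which is why supplying --external-ip (and so not needing forwarding) seemed worth a try.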


Quick Update

See this post I made in the other thread.

TLDR is that I might’ve been able to repro the issue, and it might be that this isn’t an issue with command line args. To confirm though, try manually overriding RUST_LOG="debug" before running (seems like some of the info logs are suppressed by default) and see if you’re getting the same message I got in the other post.

If you, I, and tfa all get the same thing, it seems the error might be coming from the call to search_gateway() in the igd library. If that’s the case, it might be a good indication of something going on in qp2p or igd, or of some router settings we all have in common.

(That said, manually setting up port forwarding and specifying --local-ip and --external-ip together will skip port forwarding and avoid this issue either way).

But we didn’t have to do that with --ip option.

And what should we do when there is no router? For example, if we want to create a Safe network:

  • inside a docker network as in my example,

  • or inside a local LAN with physical machines,

  • or with Digital Ocean droplets connected to the Internet

I think that’s because the new option to manually forward skips automatic port forwarding altogether by coincidence, which happens to get around the issue (you can see exactly this happening if you take a look at the function Endpoint::new() and its call to forward_port()). I believe the issue is with the port forwarding though, perhaps caused by our upgrade to the new version of igd or by some of the other changes in qp2p (also likely given the amount of refactoring that library is seeing lately), and we just didn’t notice until now. That might also make sense given that the command line args to test this were in flux until just recently for that part of the code. I suspect that even if the flags hadn’t been changed, this issue may still have arisen.

As for your examples, the short answer is that it’s probably not possible right now except in the LAN case. I don’t know that for sure though. Manual port forwarding is just a workaround for the issue right now, not a solution, so I wouldn’t expect it to work in all situations.

If everything were working as it’s supposed to, ideally, you should be able to just use --local-ip in the same way as the old --ip option. The new flags didn’t take anything away by design, but perhaps something happened along the way and we’re only noticing it now… that’s my working hypothesis at least.

In any case, when you get a moment, can you (or anybody else reading this) confirm or deny that we’re all getting the same issue by posting your full log output after exporting RUST_LOG="debug"? This is all assuming that my error is the same as your error (which it may not be if it’s the same error code but with a different reason given in the full log output).

Update

Seems this still hasn’t been patched, so I opened an issue on GitHub. Let’s see if the devs have any insight.


At first glance sn_node seems to be the culprit with the following code in sn_node.rs:

    if config.is_local() {
        config.listen_on_loopback();
    } else {
        config.network_config.forward_port = true;
    }

This snippet means that we can only use localhost (127.0.0.1) or port forwarding. This is an incredible regression: there is no longer any possibility to use a basic network (like a LAN, a Docker network, some VPS directly connected to the Internet, …).
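To illustrate the consequence, here is a minimal self-contained sketch with the same shape as the branch above. NodeConfig, its fields, and apply_network_mode are hypothetical stand-ins (the real sn_node config is richer, and I am assuming is_local() simply reflects a --local style flag).

```rust
use std::net::{IpAddr, Ipv4Addr};

/// Hypothetical, stripped-down stand-in for the node configuration.
pub struct NodeConfig {
    /// Assumption: set when the node is started in "local" mode.
    pub local: bool,
    pub local_ip: Option<IpAddr>,
    pub forward_port: bool,
}

impl NodeConfig {
    pub fn is_local(&self) -> bool {
        self.local
    }

    /// Forces the listening address to the IPv4 loopback interface.
    pub fn listen_on_loopback(&mut self) {
        self.local_ip = Some(IpAddr::V4(Ipv4Addr::LOCALHOST));
    }
}

/// Same two-way branch as the quoted code: a node that is not local
/// always gets port forwarding switched on. There is no third path for
/// "plain LAN address, no forwarding".
pub fn apply_network_mode(config: &mut NodeConfig) {
    if config.is_local() {
        config.listen_on_loopback();
    } else {
        config.forward_port = true;
    }
}
```

With this shape, a node configured with a LAN or Docker bridge address such as 172.20.0.3 ends up with forward_port set to true whether or not a router is present.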

I tried to work around the problem by implementing a new flag to set forward port to false. But I still get the same error (IgdNotSupported).

Then I saw that qp2p has a feature named “no-igd” which looked promising to allow these use cases. I activated this feature but then the error is NoEchoServiceResponse.

Here I said to myself WTF. I also don’t want an echo service, I just want a simple network where I know my IP address. Possibilities to use IGD or an echo service are useful in some circumstances, but this is not a reason to discard the simple use cases I mentioned above.

My fear is that this limitation isn’t just superficial in sn_node but is deeply rooted in qp2p.

I wanted to debug under VS Code to see why we get this error in qp2p, but I got this one instead:
“Argument short must be unique -l is already in use”.

It happens because there are two conflicting options flagged as short in qp2p Config structure (“local-ip” and “local-port”). Note: I suspect this is the reason why the smoke test has been commented out (see the same error in Unit testing - my first try. Is it me or does one of the sn_node unit tests fail?).
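For what it’s worth, this kind of clash is easy to detect mechanically. Below is a small sketch (plain std, not the argument parser’s own check; duplicate_shorts is a hypothetical helper) that lists every short flag claimed by more than one long option.

```rust
use std::collections::HashMap;

/// Returns every short flag that is claimed by more than one long option,
/// together with the long options that claim it, sorted by short flag.
pub fn duplicate_shorts(options: &[(char, &str)]) -> Vec<(char, Vec<String>)> {
    // Group long option names by the short flag they ask for.
    let mut by_short: HashMap<char, Vec<String>> = HashMap::new();
    for (short, long) in options {
        by_short.entry(*short).or_default().push(long.to_string());
    }
    // Keep only the short flags that have more than one claimant.
    let mut dups: Vec<(char, Vec<String>)> = by_short
        .into_iter()
        .filter(|(_, longs)| longs.len() > 1)
        .collect();
    dups.sort_by_key(|(short, _)| *short);
    dups
}
```

Feeding it ('l', "local-ip") and ('l', "local-port") reports -l as claimed twice, which is the situation behind the error message above.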

So I tried to remove these 2 short options in qp2p and recompile sn_node using local references for qp2p and sn_routing. But the problem is that current sn_node is not compatible with current qp2p and sn_routing.

I stopped here because I am not able to solve the incompatibilities.

To recap the whole mess:

  • standard network not supported anymore,
  • duplicate short option -l,
  • unit test commented out instead of correcting the problem,
  • incompatible master repos.

I hope all these problems will be corrected shortly.


FYI an issue has been raised for this and a proposed solution

Just wanted to add another voice to this one in particular.


I like your proposed short options:

  • -l, --local
  • -i, --local-ip <local-ip>
  • -p, --local-port <local-port>
  • -I, --external-ip <external-ip>
  • -P, --external-port <external-port>

The problem was hard with -e present twice and -l present 3 times and you solved all the conflicts elegantly.


I don’t think this is feasible and/or even desirable to do. What we should really care about is the released versions (and semantic versioning), just like with any other dependency.
There will always be the possibility of different crates evolving at a different pace, with or without the same devs involved, especially as we grow. New versions of each crate shall be published following the semantic versioning rules, and other crates shall then update their deps as soon as they have made the changes needed to use the new version. Master repos will often be compatible, but that’s not important. I’d agree that as the crates mature there will be longer periods of time when this is true and everything is compatible, but I don’t think it should be a goal at all.

In addition to that, we shall, and will, have versioning in the messaging (it’s there and ready, just not used yet for checking compatibility, proper error handling and/or backward compatibility modes). E.g. if a newer version of sn_node is running on the network side, the client may be able to send messages to it, but perhaps with an old version of the messaging protocol, in which case the nodes shall be capable of either:

  1. return an error requiring the client to upgrade to a new version of the messaging protocol
  2. or sn_node in some cases can support backward compatibility for 1, 2, …N, older versions, to allow clients to keep pace with the network side messaging protocol upgrades.
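As a rough illustration of those two options, here is a sketch of what such a check could look like. Everything in it is hypothetical (the names, the integer version numbers, and the extra case for a client that is newer than the node, which the list above does not cover); it is not taken from the messaging crate.

```rust
/// Hypothetical reasons a node might refuse a client's protocol version.
#[derive(Debug, PartialEq)]
pub enum VersionError {
    /// Option 1: the client is too old and must upgrade.
    UpgradeRequired { minimum_supported: u32 },
    /// Assumption, not discussed above: the client is newer than the node.
    ClientTooNew { node_version: u32 },
}

/// Option 2: accept clients that are at most `max_versions_back` versions
/// behind the node. On success, returns how many versions behind the
/// client is (0 means both sides speak the same version).
pub fn check_client_version(
    node_version: u32,
    client_version: u32,
    max_versions_back: u32,
) -> Result<u32, VersionError> {
    if client_version > node_version {
        return Err(VersionError::ClientTooNew { node_version });
    }
    let versions_behind = node_version - client_version;
    if versions_behind > max_versions_back {
        let minimum_supported = node_version.saturating_sub(max_versions_back);
        return Err(VersionError::UpgradeRequired { minimum_supported });
    }
    Ok(versions_behind)
}
```

A node at version 5 that tolerates two older versions would accept clients at versions 3 to 5 and ask a version 2 client to upgrade to at least version 3.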

I think this is all very much related to network upgrade capability, which we are not focusing on, and I don’t think we will until the upcoming testnet release, or even the Fleming release, is out.


I found a big problem: --external-ip and --external-port options are not taken into account by sn_node. You can specify them but they are silently ignored.

I am a bit disappointed. Someone from @maidsafe could have told us this. There have been so many discussions about these options that were useless without this information.

But this is not enough: I corrected this problem, together with the duplicate short options in qp2p but a standard network still doesn’t work (though there is a little progress because I can create the genesis node now).

I’ll investigate further next weekend.


This is corrected now, and a basic network (without port forwarding) works when both --local-ip/port and --external-ip/port are specified.

The client didn’t work immediately: the authd daemon could connect to the network, but I was not able to create an account (with safe auth create --test-coins).

Then I observed that sn_client was updated a few hours ago. So I updated a fork of sn_api to reference this new version (+ new sn_data_types) and this setup worked.

Whole client session:

~ # sn_authd start
Starting SAFE Authenticator daemon (sn_authd)...
sn_authd started (PID: 363)
~ # safe auth create --test-coins
Passphrase:
Password:
Sending request to authd to create a Safe...
Safe was created successfully!
~ # safe auth unlock --self-auth
Passphrase:
Password:
Sending action request to authd to unlock the Safe...
Safe unlocked successfully
Authorising CLI application...
Waiting for authorising response from authd...
Safe CLI app was successfully authorised
Credentials were stored in /root/.safe/cli/credentials
~ # safe files put --recursive ~/xtest/
FilesContainer created at: "safe://hyryyry3uic9zo397ghkghas7sd5bcstpwye5h8d164njoj9a4r4deg8bxhnra"
+  /root/xtest/empty_dir
+  /root/xtest/img
+  /root/xtest/img/safe_logo_blue.svg  safe://hygoykyeqx3yp6upna9wxan8tuwg8q59ma88f9mnwy84m3mbhh755jmothy
+  /root/xtest/index.html              safe://hy8oyryeurq5hcoqzyp5quj19azf3s5xtx5mfm4beu3at3gj6b5c3iwu4pc
~ # safe nrs create test --link safe://hyryyry3uic9zo397ghkghas7sd5bcstpwye5h8d164njoj9a4r4deg8bxhnra?v=0
New NRS Map for "safe://test" created at: "safe://hyryygy3s6ywfon7ofurnqkw4ye8wry8de5t8pmugj3n67ydwki4qf7pmoyn7a"
+  test  safe://hyryyry3uic9zo397ghkghas7sd5bcstpwye5h8d164njoj9a4r4deg8bxhnra?v=0
~ # safe cat test
Files of FilesContainer (version 0) at "test":
+-------------------------+-----------------+------+----------------------+----------------------+--------------------------------------------------------------------+
| Name                    | Type            | Size | Created              | Modified             | Link                                                               |
+-------------------------+-----------------+------+----------------------+----------------------+--------------------------------------------------------------------+
| /empty_dir              | inode/directory | 0    | 2021-04-02T20:45:29Z | 2021-04-02T20:45:29Z |                                                                    |
+-------------------------+-----------------+------+----------------------+----------------------+--------------------------------------------------------------------+
| /img                    | inode/directory | 0    | 2021-04-02T20:45:29Z | 2021-04-02T20:45:29Z |                                                                    |
+-------------------------+-----------------+------+----------------------+----------------------+--------------------------------------------------------------------+
| /img/safe_logo_blue.svg | image/svg+xml   | 5852 | 2021-04-02T20:45:29Z | 2021-04-02T20:45:29Z | safe://hygoykyeqx3yp6upna9wxan8tuwg8q59ma88f9mnwy84m3mbhh755jmothy |
+-------------------------+-----------------+------+----------------------+----------------------+--------------------------------------------------------------------+
| /index.html             | text/html       | 639  | 2021-04-02T20:45:29Z | 2021-04-02T20:45:29Z | safe://hy8oyryeurq5hcoqzyp5quj19azf3s5xtx5mfm4beu3at3gj6b5c3iwu4pc |
+-------------------------+-----------------+------+----------------------+----------------------+--------------------------------------------------------------------+
~ # safe cat test/index.html
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <meta http-equiv="X-UA-Compatible" content="ie=edge">
    <title>My simple safe site</title>
</head>
<body>
    <h1>My simple safe site</h1>
    <h2>Relative link:</h2>
    <img src="img/safe_logo_blue.svg">
    <h2>Absolute link:</h2>
    <img src="/img/safe_logo_blue.svg">
    <h2>Versioned safe link:</h2>
    <img src="safe://test/img/safe_logo_blue.svg?v=0">
    <h2>Unversioned safe link (shouldn't be displayed):</h2>
    <img src="safe://test/img/safe_logo_blue.svg">
</body>
</html>
~ # sn_authd stop
Stopping SAFE Authenticator daemon (sn_authd)...
Success, sn_authd (PID: 363) stopped!

The hard-coded contact was put in ~/.safe/node/node_connection_info.config (in my case ["172.20.0.3:5483"], which is the address of my genesis node).

I am not sure that such a high number is needed but I used 15 nodes (all Docker containers in a Docker bridge network). Client was also a Docker container in the same network.


Test just above was using IPv4. Running IPv6 also works but is a little harder to configure:

  • node configuration is the same with --local-ip, --external-ip and -h options specifying IPv6 addresses instead of IPv4

  • but sn_api configuration needs 2 files simultaneously:

~ # cat ~/.safe/node/node_connection_info.config
[
    "[fd2f:9ab3:8b80:69fa::3]:5483"
]
~ # cat ~/.safe/client/sn_client.config
{
    "hard_coded_contacts": [
        "[fd2f:9ab3:8b80:69fa::3]:5483"
    ],
    "external_ip": "fd2f:9ab3:8b80:69fa::2",
    "local_ip": "fd2f:9ab3:8b80:69fa::2",
    "forward_port": false
}

(where “fd2f:9ab3:8b80:69fa::2” is client address and “fd2f:9ab3:8b80:69fa::3” is contact node address)


I’m actually still not that familiar with docker containers.
Did you create one manually?
Can a Docker container be created with a working node and api version for an OS, like Manjaro and then be shared, so another person can just download the same container?

I’ve not used them a lot but I think you summarised it correctly. A Docker container lets you install a clean OS and configure it, and then anyone can just load that container and run it as you set it up.


To use an exact terminology: containers cannot be shared. A running image is called a container. But yes, an image can be shared.

But I didn’t create images, I just run a standard image named Alpine, which is a lightweight Linux distribution. I run it to create containers which I use like VMs. Maybe this is a degenerate use of Docker, but I find it handy.

I am on Windows 10 with WSL 2, Ubuntu distro and Docker Desktop. I enabled integration with additional distros to be able to launch docker commands from Ubuntu shell.

To build, from Ubuntu, an executable that runs on Alpine, the target to use is x86_64-unknown-linux-musl. I use cross to simplify cross compilation. Note that cross also uses Docker!


For information a new flag named --skip-igd has been added to create a standard network (without IGD).

For the genesis node the command is: sn_node --first 172.20.0.3:5483 --skip-igd --clear-data.

For the following nodes the command is simply sn_node --skip-igd --clear-data -h '["172.20.0.3:5483"]' (-h argument is socket address of contact node). In this case a random port is selected by sn_node which is a problem if we want to open only one port dedicated to sn_node in the firewall.

It is possible to control the port number, but for that we need to pass the local socket address with the chosen port number twice, and the command becomes: sn_node --local-addr 172.20.0.4:5483 --public-addr 172.20.0.4:5483 --skip-igd --clear-data -h '["172.20.0.3:5483"]'.

Client side configuration is the same as before ( ~/.safe/node/node_connection_info.config with content ["172.20.0.3:5483"]) and port number cannot be chosen.

The IPv6 story is also the same as before:

  • for nodes just replace IPv4 socket addresses (in --first, --local-addr, --public-addr and -h arguments) by IPv6 addresses (with brackets to delimit the IP part, for example "[fd2f:9ab3:8b80:69fa::3]:5483").

  • for sn_api two files are still needed (the same as those mentioned before) and port number cannot be chosen.
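The bracket rule is the standard socket address syntax, so it can be checked with Rust’s std::net alone. The sketch below parses a list of contacts and renders them in the ["ip:port", ...] form used above for the -h argument and for node_connection_info.config. contacts_arg is a hypothetical helper written for this post, not part of sn_node or sn_api.

```rust
use std::net::{AddrParseError, SocketAddr};

/// Parses each contact as a socket address and renders the list in the
/// `["ip:port",...]` form shown above. IPv6 addresses must carry brackets
/// around the IP part, otherwise parsing fails.
pub fn contacts_arg(contacts: &[&str]) -> Result<String, AddrParseError> {
    // Stop at the first address that does not parse.
    let parsed: Result<Vec<SocketAddr>, AddrParseError> =
        contacts.iter().map(|c| c.parse::<SocketAddr>()).collect();
    // Quote each address; SocketAddr's Display adds the brackets for IPv6.
    let quoted: Vec<String> = parsed?.iter().map(|a| format!("\"{}\"", a)).collect();
    Ok(format!("[{}]", quoted.join(",")))
}
```

An IPv6 contact written without brackets is rejected by the parser, so a typo of that kind can be caught before the node is started.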


To be precise --public-addr can be omitted, but then the port number specified by --local-addr is not taken into account (a random one is used instead), which is the same as omitting both options:

~ # sn_node -m 2000000000 --local-addr 172.20.0.4:5483 --skip-igd --clear-data -h '["172.20.0.3:5483"]' -vvvv > v.log 2>&1 &
~ #
~ # netstat -lunp | grep sn_node
udp        0      0 172.20.0.4:35743        0.0.0.0:*                           60/sn_node

--local-addr cannot be omitted, otherwise we get Failed to create Config: Configuration("--public-addr passed without specifing local address using --first or --local-addr").

The need to pass the same socket address twice to control the port number is a regression compared to the earlier --ip/--port parameters. Of course you can now use IGD, but there is no reason for usage to become more complex when we don’t use it.

I have issued a PR to correct this by automatically duplicating --local-addr to --public-addr when the 3 following conditions are met:

  • --local-addr is specified
  • --skip-igd is specified
  • --public-addr is not specified

I have proposed this in sn_node to not destabilize qp2p, and it’s only a small test to add in sn_node.
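The rule itself is tiny. Here is a minimal sketch of the three conditions as a pure function; effective_public_addr is a hypothetical name chosen for illustration, and the actual PR may structure this differently.

```rust
use std::net::SocketAddr;

/// Returns the public address to use. When --local-addr is given,
/// --skip-igd is set and --public-addr is absent, the local address is
/// reused as the public one. In every other case the public address is
/// left exactly as the user supplied it.
pub fn effective_public_addr(
    local_addr: Option<SocketAddr>,
    public_addr: Option<SocketAddr>,
    skip_igd: bool,
) -> Option<SocketAddr> {
    match (local_addr, public_addr, skip_igd) {
        (Some(local), None, true) => Some(local),
        (_, public, _) => public,
    }
}
```

With this in place, sn_node --local-addr 172.20.0.4:5483 --skip-igd would behave as if --public-addr 172.20.0.4:5483 had also been passed, so the chosen port would be honoured.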
