Remote Node Control Experiment in Rust Using MaidSafe's qjsonrpc Crate

happybeing · January 27, 2021, 1:13pm

I agree it’s a welcome and interesting innovation and at the same time I’m concerned!

What might be the security implications here? One might be DDoS but I think that isn’t specific to this case as nodes are exposed in other ways.

So I guess a risk might be in using whatever command processing is available to change node behaviour. That could be a threat to individual nodes or their owners’ devices, or en masse might open up a network attack. Again the same might be true in general, so I guess the issue is whether this capability is as secure or if there are ways it can open up vulnerabilities which aren’t otherwise present.

I honestly don’t know if these are real concerns but seems worth raising the possibility early on. I don’t want this to derail or discourage exploring and building though, because this can be a useful feature that makes it easier to manage nodes and enhance usability which is definitely desirable. I can’t help myself though!

Great work @Scorch!

Scorch · January 27, 2021, 6:28pm

As far as security is concerned, it’s using the same interface as authd does and connections are encrypted using TLS. The connection itself then, we can trust probably as far as we trust authd. Which I’d say is pretty far at this point, given that we trust authd already to manage app permissions, tokens, and safes, among other things. It’s also worth pointing out that it’s possible to include (and authd already does this) added information like passphrases etc. that could be used to seal the interface off.

More or less my thoughts exactly. The rest would depend on exactly what commands are available eventually. This would probably be a debate to be had on a per-feature basis, as people propose new functionality. Perhaps in the case of really sensitive information, like personal identifiers, I imagine something like conditional compilation would need to be used to ensure only certain binaries (e.g. maybe used in a test net for debug or something, where the stakes are low) would include more “dangerous” commands.

That last bit is a bit of speculation, but I guess I’m getting at the idea that having an interface is rather benign in and of itself (as pointed out by @mav earlier, bitcoin already has something similar). It’s the sort of API that’s exposed through it that would need to be scrutinized.

If this goes anywhere, it would be worth it to start a thread(s) to publicly ask for what types of functionality people can think of that would be useful/safe to expose, and debate it from there.

southside · January 27, 2021, 8:11pm

Resource usage and logging.
It’s supposed to be an autonomous network so unless there is a problem with resource hogging, we should just let it run - or stop it, gracefully for preference, especially if that would simplify code elsewhere.

bochaco · January 27, 2021, 8:30pm

I think we can secure the interface by requiring signatures to the commands, so perhaps when launching the sn_node you either set a PK or have the sn_node to give you a keypair that you use to sign each command you send to it?? the connection is already secured with TLS as mentioned, so it’s just about using any sort of authentication to accept the commands and invalid ones can be rejected quickly by the sn_node??.

I always imagined being able to run my sn_node on a remote location, and be able to not only monitor it but also change settings to it remotely, like reward wallet, amount of resources to share. If when setting it up I get a key pair that instance will then verify commands are signed with, I can simply carry the corresponding SK with me to get access to it remotely. Moreover, perhaps one day even tunneling these commands through Safe so I can do it from truly anywhere, as I could use its Safe URL, even create an NRS URL for it, then I could open Safe browser with safe://<my-sn-node-xorurl>/get-stats…?!..and you see something like: Validator 56628 - Open Source Ethereum Blockchain Explorer - beaconcha.in - 2024

southside · January 27, 2021, 9:10pm

I had imagined that these are parameters that Jim Collinson would work his magic on as part of the setup

Scorch · January 29, 2021, 5:55pm

Just a random little update. So the PR isn’t merged yet, but some of the comments have already led to some more ideas for qjsonrpc.

More Example Code

Since the original PR was also intended to be a proof of concept for the node rpc interface, it was a bit more complicated. As per some of the comments, I wrote an even simpler example and opened a new PR. Between these two examples, I think it’s a pretty comprehensive overview of qjsonrpc usage

Type Safety

While writing this example, I was trying my best to abstract out some of the connection details from the consumers of the RPC interface. This way we can do all of our type-checking at compile time, instead of at runtime (e.g. Errors like when mistype a hard-coded method name and don’t realize until the server sends back an error).

The server process is pretty much entirely abstracted out, so it only needs to worry about structured data in the form of enum Query instead of working with the raw JsonRpcRequest and JsonRpcResponse types. Unfortunately, the client still has to do risky things like send("method-name", json!(value)).

In response, I’m in the process of playing with a new trait idea that would give access to a type-safe qjsonrpc API. The traits would be something like a StructuredRequest and StructuredResponse which provide reliable conversion to and from JsonRpc types, and the api might be something like send_structured() or send_checked() in addition to the existing send()/get_next() functions. It might add some more flexibility. Once the examples are merged, and I get a chance to do some testing/debugging, I’ll open a PR for this as well I think

bochaco · February 1, 2021, 2:34pm

I just reviewed, tested and merged this one, it’s very nice @Scorch , it shows very clearly and simple how to use the API, it’s so simple we will be able to even put that code in the README moving forward I believe.

Scorch · February 1, 2021, 3:58pm

Thanks, I’m glad it could be useful!

I’m almost thinking it’s not worth it to merge the other PR as an example (for the more complicated version). Instead, it could be rolled in with the Structured traits, and included as a convenient, library-provided abstraction. Something like RpcDaemon?

That would effectively move the RpcDaemon module out of the example and into the lib, and then leave the async_server example with just a main.rs, query.rs, and response.rs. Does this seem plausible you think?

bochaco · February 1, 2021, 4:08pm

I was also thinking about that, I think we should draw the line of what this crate is and provides, which I guess it’s the foundations for JSON-RPC over QUIC. Then other layers like the RPC daemon could be part of the lib but as you say maybe more clearly as a second upper layer, perhaps within its own mod namespace and feature-gated? or even a separate crate and keep this crate to be just focused on the protocols and its messages rather than how the user creates a client or a server? …?..just thoughts as you can see

Scorch · February 1, 2021, 4:28pm

That’s a good idea, agreed.

I’m wondering if think this might even be two feature flags (Actually, one flag and one crate).

The idea of a StructuredRequest and a StructuredResponse seems like a convenience for constructing & parsing JsonRpcRequest and JsonRpcResponse reliably and with some better compile-time guarantees. This doesn’t add any functionality or anything new, but it might stop you from shooting yourself in the foot. That sort of “feels” like a feature flag to me.

Especially because, even if there were no StructuredRequest trait, it could be easily recreated (and probably would be) by any consumer of qjsonrpc by implementing send_my_structured_type(client: &ClientEndpoint, req: MyStructuredType). Providing it would be pure convenience. Down the line, if we wanted to, It would also set the foundation for us to write a macro to automatically derive the trait, without changing code of existing implementers.

On the other hand, RpcDaemon seems to be built strictly on top of the existing structure. Using RpcDaemon is a way to explicitly dodge using the underlying library entirely. That seems more like a new crate with qjsonrpc as a dependency.

Update

Was a great idea, but, as much as I would like to, trying to make JSON objects look even vaguely like a strongly-typed object is an uphill battle. It’d be easier at that point to just implement binary-encoded messages instead. Probably going to drop the new crate idea for a bit and just work directly with qjsonrpc::Endpoint inside a node for future experiments. Sunk cost sucks, but better to figure it out now I guess

Scorch · February 3, 2021, 10:04pm

Here’s a fun little hack I got going. I was messing around with my local node binary and was able to embed a qjsonrpc endpoint into the node (actually, it’s wrapped inside an Option, so it only runs if --rpc-port is supplied), and my little client was able to communicate with a node in a locally-run baby fleming net.

To get it to be compatible with run-baby-fleming, I also had to hack the sn_launch_tool a bit and recompile the latest sn_cli release to supply the proper --rpc-port args.

There’s nothing fancy yet like identity verification, no special methods (other than ping), and it’s by no means “production-quality”… But it works, which is always cool to see after tinkering around with it for a bit!

Note on qjsonrpc `SocketAddr` binding

(This is mostly for my own reference, so I don’t forget later.)

It also turns out that you can bind a qjsonrpc::Endpoint to an IPv4 address, but the client can’t actually talk to an IPv4 address it seems, which took a hot second to realize. At least that’s what I think is going on there, but I haven’t verified yet. I’m just saying that because, when I initially tried to bind the node-side Endpoint to an IPv4 localhost address, they weren’t able to communicate. If it turns out that is the case, it might be a good idea to try and patch that in the qjsonrpc lib (probably in ClientEndpoint::connect()).

Anyway, it was cool to see and wanted to share. Happy hacking

Scorch · February 8, 2021, 2:10pm

Having fun with sandboxing and I’ve got updates abound

Updates

Nodes now generate RPC public/private keys on startup in a node subfolder called rpc
I’ve added a NodeRpcClient to sn_api similar to the AuthdClient, which issues remote procedure calls
On sending procedure calls, a one-time passphrase is generated and signed by the client to verify the sender’s identity. Message integrity itself is still managed by TLS, so I think that covers the bases
Nodes can take the RPC port on the command line, and runs the service on localhost on the specified port. If not provided, disables the interface entirely.
sn_launch_tool was modified to assign a random port number between [34005, 36133) and default launches nodes with the rpc interface enabled. This is more of a temporary sorta hack to test things out.
Added a node subcommand in sn_cli to test this out (more on that in a second).

Baby’s First CLI Command

Inspired by a thought I had a few months ago, here’s something basic that is not particularly easy to do yet. You can’t ask a running node for its node_id (e.g. it’s public key on the Safe Net) or its reward_key. You can check the logs, but that’s not possible to do programmatically, so it’s annoying to do for multiple nodes at once… So the first command I decided to try was get-id, which just queries the node via the RPC interface for its node_id and its public reward_key.

This leads to the following flow:

> safe node run-baby-fleming
...
...
Launching genesis node (#1)...
running node with rpc on port 33496
...
...
Done!
> safe node get-id --rpc-port 33496 --cert-base-path <genesis_node_base>/rpc/
node_id: PublicKey::Ed25519(bb7139..)
reward_key: PublicKey::Bls(81d50f..)

Interested in Playing Around?

I created a few new branches so if you want to try this locally, you can now!

To clone the sn_node, run:

> git clone --branch node-rpc https://github.com/Scorch-Dev/sn_node
> cd sn_node
> cargo build --release

And similarly for sn_api, run:

> git clone --branch node-rpc-client https://github.com/Scorch-Dev/sn_api
> cd sn_api
> cargo build --release

At this point maybe rename your stable versions of the sn_node and safe executables to something like sn_node_stable and safe_stable. Then copy the built binaries to those locations

> mv ~/.safe/safe ~/.safe/safe_stable
> mv ~/.safe/node/sn_node ~/.safe/node/sn_node_stable

> cp sn_api/target/release/safe ~/.safe/
> cp sn_node/target/release/sn_node ~/.safe/node/

Things Left to Do

This is still definitely a work in progress, so there’s a few things left to iron out.

I mentioned this in my last post, but it wasn’t a bug in qjsonrpc that I had found, but just something I failed to notice. My machine resolves localhost to IPV6, so that’s why I had trouble when binding the node to IPV4. Currently, that hasn’t been fixed yet. If your machine resolves localhost to IPV4, this won’t work for you in all probability.
Code style needs work. Some floating constants/magic numbers are still present here and there.
The get-id doesn’t return structured data yet. It’s just a formatted String for now.
Currently the NodeRpcClient isn’t cached, so we just build a new one each time we send a request, which is wasteful.
Command line parameters to sn node get-id are a little ugly. The problem is that each node potentially runs from a different directory and on a different port, so I’m not sure what the best way to streamline this is yet…
This only works on localhost, similar to authd.

As of now, I’m not sure how far I will take this or not, but for now I’m just having a good time making my node do fun things. Perhaps I can patch it into my local node for the next test-net even. Might be cool to try.

Maybe @bochaco or @joshuef would have some info about this, but do you think MaidSafe might be interested in eventually pulling a feature like this into the upstream repo? It’s still too rough now for that, but I’m wondering about looking forward. Like mentioned earlier, there are security implications. I also don’t know if MaidSafe is in a spot where they want to accept bigger community patches/features at any point in the semi-near future. Both for the sake of stability and resource bandwidth (e.g. reviewing and working with PRs). Beyond that, it was also mentioned that such a feature could be tunneled through safe in the future, so not sure how that would gel with using qjsonrpc on the backend.

Anyway, that’s what I’ve got for now. Ideas, improvements, etc. are welcomed, and stay “Safe” all

joshuef · February 8, 2021, 3:54pm

super nice stuff @Scorch . I havent played with it yet i’m afraid, but conceptually great to see. I can totally imagine us getting this sort of thing in. We absolutely need an api for interacting with the node

bochaco · February 8, 2021, 7:07pm

Very nice @Scorch , it sounds very good feature set to me. Personally I believe we should try to incorporate these features in our CLI, api and node, sooner or later. Some ideas below.

I think we should split concerns/challenges here, on one side and probably the first step we should aim at, it’s talking and monitoring just a single node, and locally. Which if you think about it, it will probably be the most common use case for a user joining the network, with a node farming in his/her PC and the CLI to manage it. So we could go for this feature first, have everything needed to send queries and commands to a single node which runs locally.

Once that’s working, then we can have an interface in the node which can return the infrastructure information, let’s say the list of all nodes that are in the same section as the node you are requesting the info to. We already have something similar for communications between clients and nodes when they need to join/connect to a section, so it’s just a matter of having a similar service on this sn_node interface. When the CLI received the list of nodes with the infrastructure query, it can then send the queries/commands to all the nodes. In summary, you query one node to give you the list of nodes, and then send the actual query/command to all of them.

Note this was just an idea, and not necessarily Maidsafe will end up implementing in the core functionality, we will all need to evaluate how thing evolve and if it’s a good thing to have. Either way, I think it’s fair to say that even if it’s tunneled through Safe, and perhaps qjsonrpc won’t be needed and the messages would need to be sent as an opaque payload to the node using sn_client communication mechanism and protocol. I think we are far from that yet, and perhaps for local monitoring it’s still preferable to use local qjsonrpc messages, I have no idea. Thus the only thing I would do regarding this is keep it in mind just to make sure the design can fit in easily if the time comes for such a feature. I hope I’m making sense and you get the idea.

bochaco · February 8, 2021, 7:14pm

I think this can/should be exposed in node’s API, along with a many other things which I presume will be needed to send queries and commands to, like changing the reward key.

Scorch · February 9, 2021, 2:06pm

I think that’s a good way to go about it from the CLI perspective. Running only on localhost, we can arbitrate a default port like 34001 (following in the tradition that authd takes ports in the 33000 range, perhaps node rpc runs in the 34000 range), and a default base path like ~/.safe/node/rpc. CLI will fill that in by-default when sending queries and simplify the whole process. I think it still makes sense to allow for optional parameters to query on other ports so that it’s flexible enough that multiple local nodes could be talked to (which I think is valuable in the case of a localized test-net). In any case, it simplifies the majority of use cases and lets us focus on only worrying about localhost for now.

A Tentative CLI API

So trying to put something together based on comments so far, the CLI API might look something like the following. This takes some inspiration from git, in that a single subcommand can be an accessor or mutator, depending on arguments supplied:

keys : prints the node id/public key & reward key.
keys --set-reward-key <reward_key>: sets rewards PK from argument prints out the new output of keys.
keys --set-reward-key [ -f, --from_file ] <reward_key_file>: Sets the rewards key from a file and prints out the new output of keys
storage [--detailed]: prints current used space space and max space. With --detailed, if the node is an elder or adult, also prints how the storage is spread among the chunk stores.
storage --set-max <maximum used space>: Sets the maximum capacity space if it’s greater than the current used space (As of the last time I looked at those files, reducing beyond that is another beast).
netcfg: prints out some networking related information, like the address of the node and of the rpc interface. Could be expanded with more stats later presumably.
logs [--from-top] [--offset=0] <num_lines=10>: Get at most num_lines log entries. If --from-top, then fetches the oldest num_lines starting at index offset. Without --from-top, the last log fetched is offset by the offset amount from the bottom.
status: a convenience that wraps some of the above to quickly grok what’s going on with the node. Prints a birds-eye of the age, node root path, node id, reward key, the current/max storage, the network address, and the rpc socket address.

bochaco · February 10, 2021, 6:32pm

Perhaps these can be children of a rewards subcommand, e.g. $ safe node rewards set-key <reward-key>, and as you point out the rewards subcommand alone shows info about pk and current balance maybe…?..

Not fully sure but this is probably already covered by the networks subcommand, perhaps we need some additional fields that can be saved/mapped to each of the networks in CLI settings, not sure if you are familiar with it but specially with upcoming release of CLI, check https://github.com/maidsafe/sn_api/blob/7e00228a94a5e31b805e2f9d86f17bff4a4ac1b9/sn_cli/README.md#set-network-bootstrap-address

Scorch · February 10, 2021, 8:04pm

Yea, this does sound more intuitive to me. In hindsight, keys is a bit of an overloaded term anyway.

Thanks for pointing that out, it was an oversight on my part. I’ll have to go back and re-read some of the CLI user guide. In the current context, it is redundant, so it’s probably safe to nix it for now.

That said (I might be misremembering this, as I can’t find the post now…), I thought I recalled a suggestion about getting routing-related information. I think it was about querying the node_id of known peers… Something like safe node routing for debug purposes? I don’t think that falls under safe networks because it has more to do with the state of the running node and not necessarily the CLI application state. I don’t know if that would be useful (or if I just imagined the whole thing?!) but it’s an idea.

For now then, I think it’d be safe to focus on setting up the rewards, storage, logs, and status commands. Those seem to be the most immediately useful from what I can tell.

bochaco · February 10, 2021, 8:18pm

Ah right, I think I misunderstood what the netcfg command would be for then, the one I was pointing out is for configuring the network bootstrapping address, but not for the other info you are mentioning like known peers that’s something you’d need to query the node. So basically with safe networks set... cmd you’d set the endpoint where to reach your node, so then you can send queries to it like the list of its peers. Similar to how node join <network name> command works, it boottraps and joins the network that was mapped in CLI settings to that network name.

Scorch · February 18, 2021, 2:24pm

Another week, and deeper down the rabbit hole we go. I’ve got more updates below

Datatypes Repo

Because I got constantly annoyed of parsing serde_json::Value and catching every little parse error or missing field, I made a new repo sn_node_rpc_data_types that Im’ using for the time being to sync up the serialization of parameter and result. It’s been a lot more convenient and less verbose. As mentioned earlier, if we ever do move to tunneling through safe in some future timeline, these kinds of structured types would make that transition easier.

More Working Commands (CLI and Node Support implemented):

(I’m just going to copy/paste these mostly from my CLI output to save time)

Logs

Prints out a specified number of log lines starting at some specified offset.

USAGE:
    safe.exe node logs [FLAGS] [OPTIONS]

OPTIONS:
        --node-path <node-path>       Path of node root dir (default is ~./safe/node). Th SN_NODE_PATH env var can also
                                      be used [env: SN_NODE_PATH=]
        --rpc-port <node-rpc-port>     [default: 34000]
        --log-id <log-id>             Which log to fetch by id (options: 0 = plaintext logs) [default: 0]
        --num-lines <num-lines>       How many log lines to fetch [default: 10]
        --start-idx <start-idx>       Start line index of log fetch. start_idx > 0 implies offset from the start
                                      start_idx < 0 implies start from index abs(start_idx) from the end (e.g. -1 means
                                      start at the last line) [default: -10]

Note the log_id field. While right now we only have plaintext logs that live on disk, there’s an interesting possibility of adding more log varieties later for structured data collection. Examples include maybe a structured log for QoS and similar things. These don’t even need to live on disk, it’s structured in such a way that, even if these lived in a circular buffer in RAM it would work out fine. It’s pretty flexible.

Rewards

Right now this fetches the reward key (and lets you set it via --set-key, ~~although I’m still debugging set-key right now. Seems my node fails to launch with a serialization error from sn_routing. That might be an issue in upstream though, I haven’t investigated yet~~). More info like farm attempt success rate and such could be useful to add in later down the line, so it’s also pretty flexible in that sense.

USAGE:
    safe.exe node rewards [FLAGS] [OPTIONS]

OPTIONS:
        --node-path <node-path>       Path of node root dir (default is ~./safe/node). Th SN_NODE_PATH env var can also
                                      be used [env: SN_NODE_PATH=]
        --rpc-port <node-rpc-port>    What port to issue node remote procedure calls on [default: 34000]
        --set-key <set-key>           If provided, sets a new reward key from a hex string before fetching rewards info

Storage

Right now this just reports basic information on used space (e.g. total offered vs used), but I’m working on getting a --detailed flag going.

USAGE:
    safe.exe node storage [FLAGS] [OPTIONS]

OPTIONS:
        --node-path <node-path>       Path of node root dir (default is ~./safe/node). Th SN_NODE_PATH env var can also
                                      be used [env: SN_NODE_PATH=]
        --rpc-port <node-rpc-port>     [default: 34000]

Conclusion

I got some more features working in the past week and this is rather functional right now, allowing for some rough edges. Once I get the last of the functionality included for these commands, as well as a status command going, I’ll take a look at sn_launch_tool and sn_cli to make it easier to launch a node with the RPC enabled and to issue RPCs with fewer flags.