Decentralised computation app - how?


#1

I’ve been thinking about how I would build a web app demo, so everyone running that app becomes part of a decentralised computation network. Perhaps implementing my old friend the genetic algorithm, but with the work being done by many nodes in parallel.

For now this is a thought experiment, but I’d quite like to code it or get together with others to work on it if it seems feasible.

Terminology

  • c-node - an instance of the app running on a client computer. You could run multiple c-nodes by running more than one instance of the app on your computer, so this is not the same as a SAFE node.

It is relatively [ahem] simple if all c-nodes are assumed to be honest, which is fine for a simple demo, but I have enjoyed trying to figure out how to make this immune to dishonest c-nodes and this is what I want to explore in this topic.

Defending From Attacks

DDoS

A typical attack might be to DDoS c-nodes or their communications, but I think SAFE network makes this all but impossible with no work needed in the app other than to make sensible use of shared data structures and so on. There might be holes in that reasoning but they won’t be apparent until we get into more detail and so aren’t relevant here.

Collaboration

Another attack would be for a group of c-nodes to collaborate to corner the rewards of such a system - a bit like centralised bitcoin mining.

I have thought of a reasonable solution to this, which would work well providing the number of c-nodes is large. This is unlikely for a demo app, so only worth bothering with if it was a serious app that could attract a lot of participants (not my aim for now - but that could grow out of this).

Could SAFE Provide The Defence?

Having noticed similarities between my solution and how SAFE network makes it difficult for SAFE-nodes to collaborate, I’m wondering if this aspect could be handled by using the low level SAFE API. For example, by ensuring a c-node submitting a problem to be worked on can’t somehow favour workers controlled by the same person.

To clarify the question, here’s an outline of how a Computation dApp might work.

Computation dApp Example

Every c-dApp instance can submit computation to be worked on. This role is called ‘boss’.

Every c-dApp instance can compute the result of a computation and submit it to the ‘boss’ for a reward. This role is called ‘worker’.

One way to keep c-nodes honest is to have them first gain reputation by submitting work, or doing work, and vouching for each other after working together correctly.

For worker reputation this means doing computations which a boss signs as correctly completed. Once a worker has sufficient boss points, it could then pass a threshold where it earns a reward when doing work for a boss, awarded/paid by the boss.

And vice versa, a boss can gain reputation for its role when satisfied workers sign the job to indicate that the boss correctly acknowledged their work (by signing the results and giving any reward due).

Cheating To Gain Reputation

A dishonest participant could gain reputation by doing no work at all by having a bunch of c-nodes he’s running all signing off on each other, and then have an unfair advantage over honest workers who will take longer to gain reputation, and may not gain it at all if other honest bosses prioritise work to workers who have a good reputation.

Once the cheat has reputation enough for his workers to get real rewards, he shifts his workers to doing real work for honest bosses, and has made it harder for any honest workers to compete for those rewards.

There’s a lot of details left out of this, so no doubt there are other potential attacks, but they are not of interest here. The above example is just so I can propose a solution to this problem, and ask how or if that could be handed off to SAFE network using the low level API, instead of having to code it into the c-dApp (which is pointless for a demo).

Preventing Collaboration

One way to do this is to make it hard for a worker to end up working for a particular boss. SAFE network prevents collaboration by a combination of means. One is by ensuring that SAFE nodes (s-nodes) which collaborate are chosen at random. On a large network the chance of two s-nodes owned by the same person being selected to work together becomes very low, which makes it very costly to try and cheat.

This is achieved by ensuring that each s-node is given a random address, and that nodes with similar addresses are chosen to collaborate.
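To make the idea concrete, here is a minimal sketch of address-based selection. It is not SAFE's actual code: hashing a key to get the address, the names `address` and `closest_nodes`, and the example keys are all assumptions for illustration. It only shows how "random address plus XOR closeness" can decide which nodes work together.

```python
import hashlib


def address(public_key: bytes) -> int:
    """Derive a pseudo-random 256-bit address by hashing a node's key.

    Assumption: a hash stands in for whatever address allocation the
    network really performs.
    """
    return int.from_bytes(hashlib.sha256(public_key).digest(), "big")


def closest_nodes(target: int, nodes: dict[str, int], k: int) -> list[str]:
    """Return the names of the k nodes whose addresses are XOR-closest to target."""
    return sorted(nodes, key=lambda name: nodes[name] ^ target)[:k]


# Hypothetical c-nodes, identified by made-up key bytes.
workers = {f"worker-{i}": address(f"key-{i}".encode()) for i in range(20)}

# The job's address depends on data the boss cannot freely choose in a real
# design; here it is simply a hash of invented bytes.
job_address = address(b"boss-key" + b"job-1")

chosen = closest_nodes(job_address, workers, k=3)
print(chosen)
```

Because neither side picks its own address, a boss cannot steer a job toward workers it controls except by running a large share of all nodes.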

So, I’m wondering if there’s a way to tag onto SAFE, perhaps using the low level API to create custom messages that would be handled by the c-dApp and which exploit the s-node address to make it unlikely that any two c-nodes would end up working together as boss and worker.

Can this be done with the existing API, and if so can you give me some hints?! :slight_smile:

Also, would this be possible with the safe-nodejs API and just a browser based web app? Or are we talking about something that would rely on future additions to the API?

Tricky to answer sorry! I hope I’ve explained well enough. Gosh this is a long post - thank you for reading :slight_smile:


#2

The technologies I think of when considering distributed computation systems are

It sounds like the proposal is somewhere between BOINC and Ethereum, but closer to BOINC.

I think this can be done, perhaps not using the safe node-naming system directly, but it could be achieved with a specific implementation of that same algorithm in your own app.

Messaging is going to be key to this idea, since the client’s goal is to replace ‘computation cost’ with ‘network cost’. Messaging directly on the safe network may be cost prohibitive, so one alternative may be WebRTC combined with some sort of ranking mechanism (which could be stored on the safe network). I’m not terribly familiar with the API (I’m waiting for it to settle down before committing time to learning it) so this is a pretty generic response for now!

In the short term, leaning toward the BOINC model rather than the Ethereum model does seem sensible to me; I think you’re on the right track.

Some other things to consider are:

  • developing a market for compute, where the client can provide a sample for potential compute nodes to get an idea of how well they can process it, and can then bid a specific price / duration for doing the rest of the work
  • some sort of redundancy so incorrect output can be detected based on a majority rule (much like quorum in close group consensus)
  • a reputation system for compute nodes much like node ranking and aging in the safe network to discourage incorrect output
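As a rough illustration of the redundancy point, the sketch below sends the same job to several workers and accepts a result only if a strict majority agree. The function names, the worker ids and the `quorum` parameter are hypothetical and not part of any SAFE API.

```python
from collections import Counter
from typing import Hashable, Optional


def majority_result(results: dict[str, Hashable], quorum: float = 0.5) -> Optional[Hashable]:
    """Return the value reported by more than `quorum` of the workers, else None."""
    if not results:
        return None
    value, count = Counter(results.values()).most_common(1)[0]
    return value if count > quorum * len(results) else None


def dissenters(results: dict[str, Hashable], accepted: Hashable) -> list[str]:
    """List the workers whose answer differs from the accepted one."""
    return sorted(w for w, r in results.items() if r != accepted)


# Four hypothetical workers report on the same job; one disagrees.
reports = {"w1": 42, "w2": 42, "w3": 41, "w4": 42}
accepted = majority_result(reports)
print(accepted, dissenters(reports, accepted))
```

The list of dissenters is the natural input to a reputation system: workers that repeatedly disagree with the majority can be ranked down.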

I think homomorphic encryption is the end game for distributed compute, but will be a long time until it’s ready.

Great post @happybeing :slight_smile:


#3

Thanks @mav I’m very much in agreement with you (without knowing what BOINC is yet). Your analysis and suggestions mirror my own thoughts having chewed on this a bit more.

My thought is that it would be possible to create a powerful distributed computation API if the low level SAFE API could support at least:

  • ensure random selection between c-nodes in different roles (so boss to workers for example, similar to a safe-node and its close group)

I haven’t thought about how that might look as an API, or how SAFE could provide it though. I got as far as thinking:

  • declare a type representing the address space of c-nodes
  • c-nodes register themselves to appear in this address space for each role (such as boss or worker). At this point something equivalent to random address allocation and xor closeness would determine the relationship between nodes, perhaps piggybacking on SAFE defences such as proof of work, to prevent Sybil attacks etc
  • messages are routed, or connections established, between c-nodes according to a scheme I’m not sure about as I write :wink:
  • c-nodes receive messages, handle according to their role, and then respond (via the SAFE API, or possibly directly as you suggest if that can be robust)

I was thinking that a reputation system could be built by c-nodes creating a signed public record (chains of signed entries) where they declare themselves by c-node address and public key (boss + problem in Python, worker in Rust, JS, Python), and can vouch for each other (worker: ‘my result for problem y is z’, boss: ‘worker x correctly solved problem y’, worker: ‘boss b correctly rewarded me for problem p’), in such a way that reputation can be accumulated/recorded, and validated by examining the chain, with c-nodes losing reputation if their entries don’t stack up. Again, not thought through. Very woolly! :slight_smile:
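A very rough sketch of such a record, assuming a simple hash-linked list of entries. Real entries would also carry a public-key signature from the author; that part is left out here to keep the example to the standard library, and every name in it is made up for illustration.

```python
import copy
import hashlib
import json


def entry_hash(entry: dict) -> str:
    """Hash an entry deterministically (keys sorted before serialising)."""
    payload = json.dumps(entry, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def append_entry(chain: list, author: str, statement: str) -> None:
    """Add an entry that commits to the hash of the previous entry."""
    prev = entry_hash(chain[-1]) if chain else ""
    chain.append({"author": author, "statement": statement, "prev": prev})


def chain_is_valid(chain: list) -> bool:
    """Check that every entry points at the hash of the one before it."""
    for i, entry in enumerate(chain):
        expected = entry_hash(chain[i - 1]) if i else ""
        if entry["prev"] != expected:
            return False
    return True


chain: list = []
append_entry(chain, "worker-x", "my result for problem y is z")
append_entry(chain, "boss-b", "worker x correctly solved problem y")
append_entry(chain, "worker-x", "boss b correctly rewarded me for problem y")

# Rewriting an earlier claim breaks the link stored in the next entry.
tampered = copy.deepcopy(chain)
tampered[0]["statement"] = "my result for problem y is q"

print(chain_is_valid(chain), chain_is_valid(tampered))
```

Hash links only show that history was not rewritten. Proving who wrote each entry, and protecting the newest entry, would need the signatures omitted above.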

Your feedback encourages me to think more, but I probably have to study more first. My post was in part to try and shortcut that :slight_smile: and tap into expertise from others.

Thanks for taking the time to read, analyse and respond. It’s really helpful to me. :slight_smile:


#4

Hi happybeing, can you help us review some stuff for our app development plan? Thanks


#5

@happybeing
IPython parallel might give you some good ideas. It is fairly general, with no need for nodes to sign up for specific projects as in BOINC, although I do like the BOINC model too. IPython parallel follows a similar boss/worker model. The only issue is that it requires tunneling through ssh in order to transfer data around securely.
https://ipython.org/ipython-doc/3/parallel/parallel_intro.html

Another thought that is a wishlist app item for me: build a safenet version of the Amazon EC2 system. Just use the safe network as a management layer only. It would advertise node availability and manage the topology of the dynamic compute cluster, set up the tunnelling and pass keys around, and handle safecoin payment. The cluster itself would then be managed by ssh in the traditional compute cloud way.

Let users write the programs in what they are accustomed to in cluster computing (e.g. Fortran/C/C++/Python/Rust). The communications for the actual parallel processing and data would be handled by these using something like MPI or OpenCL or other native libraries, through the ssh based network that was only set up via safenet. You could just have the compute node consist of the safenet layer and a virtual machine with something like CentOS on it, with the required MPI libraries installed along with SLURM for job control, to make many people in HPC very happy.

Most of the time when you write MPI programs the programmer expects that some nodes could go offline and tries to safeguard against it anyhow. As long as the “compute churn” was fairly low it might work ok. Traditional HPC gurus would complain about latency, but if it was cheap enough and scaled well they probably wouldn’t care. Especially if the safenet layer found ways to optimize for latency and allowed different payment structures for latency vs. available RAM, TFlops and churn rate.

I’m new here, so I’m not sure how this would jibe with safenet anonymity requirements. All the nodes in the p2p compute cluster would need to know one another’s IP addresses in order to set up the ssh tunnels and build the dynamic cluster under this scenario. There may be clever ways to get around this constraint that I am unaware of. This may not be the ideal decentralized computation app, but it would get things started and would be a useful tool in the short term.


#6

The interactive proof system computing model deals with these problems:

The prover is all-powerful and possesses unlimited computational resources, but cannot be trusted, while the verifier has bounded computation power.

We could split a procedure into several steps, deliver them to randomly chosen nodes, and verify each step using a parallel programming model.

This eliminates the need for reputation systems. If a node commits too many malicious operations, the others could ban it immediately for a long time. Since verifiers require only limited resources, many verifiers could be randomly chosen without affecting scalability.
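A toy sketch of step-by-step checking, under the assumption that the computation is a known sequence of deterministic steps and that the worker publishes every intermediate value. This is plain spot checking, not a formal interactive proof, and the names `run_with_trace` and `spot_check` are invented for the example.

```python
import random


def run_with_trace(steps, x):
    """Worker: apply each step in turn and record every intermediate value."""
    trace = [x]
    for step in steps:
        trace.append(step(trace[-1]))
    return trace


def spot_check(steps, trace, samples, rng):
    """Verifier: recompute `samples` randomly chosen steps of the trace."""
    if len(trace) != len(steps) + 1:
        return False
    for i in rng.sample(range(len(steps)), samples):
        if steps[i](trace[i]) != trace[i + 1]:
            return False
    return True


# A made-up four-step computation.
steps = [lambda v: v + 3, lambda v: v * 2, lambda v: v - 5, lambda v: v * v]

honest = run_with_trace(steps, 4)
forged = list(honest)
forged[2] = 15  # a dishonest worker alters one intermediate value

rng = random.Random(0)
print(spot_check(steps, honest, 2, rng))
print(spot_check(steps, forged, len(steps), rng))
```

Each verifier only redoes a few cheap steps, so many verifiers can be drawn at random. A single verifier sampling a subset can miss a forged step, which is why several independent verifiers would be needed.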