Decentralised computation app - how?


#1

I’ve been thinking about how I would build a web app demo, so everyone running that app becomes part of a decentralised computation network. Perhaps implementing my old friend the genetic algorithm, but with the work being done by many nodes in parallel.

For now this is a thought experiment, but I’d quite like to code it or get together with others to work on it if it seems feasible.

Terminology

  • c-node - an instance of the app running on a client computer. You could run multiple c-nodes by running more than one instance of the app on your computer, so this is not the same as a SAFE node.

It is relatively [ahem] simple if all c-nodes are assumed to be honest, which is fine for a simple demo, but I have enjoyed trying to figure out how to make this immune to dishonest c-nodes and this is what I want to explore in this topic.

Defending From Attacks

DDoS

A typical attack might be to DDoS c-nodes or their communications, but I think SAFE network makes this all but impossible with no work needed in the app other than to make sensible use of shared data structures and so on. There might be holes in that reasoning but they won’t be apparent until we get into more detail and so aren’t relevant here.

Collaboration

Another attack would be for a group of c-nodes to collaborate to corner the rewards of such a system - a bit like centralised bitcoin mining.

I have thought of a reasonable solution to this, which would work well providing the number of c-nodes is large. This is unlikely for a demo app, so only worth bothering with if it was a serious app that could attract a lot of participants (not my aim for now - but that could grow out of this).

Could SAFE Provide The Defence?

Having noticed similarities between my solution and how SAFE network makes it difficult for SAFE-nodes to collaborate, I’m wondering if this aspect could be handled by using the low level SAFE API. For example, by ensuring a c-node submitting a problem to be worked on can’t somehow favour workers controlled by the same person.

To clarify the question, here’s an outline of how a Computation dApp might work.

Computation dApp Example

Every c-dApp instance can submit computation to be worked on. This role is called ‘boss’.

Every c-dApp instance can compute the result of a computation and submit it to the ‘boss’ for a reward. This role is called ‘worker’.

One way to keep c-nodes honest is to have them first gain reputation by submitting work, or doing work, and vouching for each other after working together correctly.

For worker reputation this means doing computations which a boss signs as correctly completed. Once a worker has sufficient boss points, it could then pass a threshold where it earns a reward when doing work for a boss, awarded/paid by the boss.

And vice versa, a boss can gain reputation for its role by satisfied workers signing the job to indicate that the boss correctly acknowledged their work (by signing the results and giving any reward due).

Cheating To Gain Reputation

A dishonest participant could gain reputation without doing any work at all, by having a bunch of c-nodes he’s running all sign off on each other. He would then have an unfair advantage over honest workers, who will take longer to gain reputation, and may not gain it at all if honest bosses prioritise giving work to workers who already have a good reputation.

Once the cheat has reputation enough for his workers to get real rewards, he shifts his workers to doing real work for honest bosses, and has made it harder for any honest workers to compete for those rewards.

There are a lot of details left out of this, so no doubt there are other potential attacks, but they are not of interest here. The above example is just so I can propose a solution to this problem, and ask how or if that could be handed off to SAFE network using the low level API, instead of having to code it into the c-dApp (which is pointless for a demo).

Preventing Collaboration

One way to do this is to make it hard for a worker to end up working for a particular boss. SAFE network prevents collaboration by a combination of means. One is by ensuring that the SAFE nodes (s-nodes) which collaborate are chosen at random. On a large network the chances of two s-nodes owned by the same person being selected to work together become very low, which makes it very costly to try and cheat.

This is achieved by ensuring that each s-node is given a random address, and that nodes with similar addresses are chosen to collaborate.
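To make the idea concrete, here is a minimal sketch of selecting collaborators by XOR closeness. It is not SAFE code: deriving an address by hashing a public key, and the names `node_address` and `closest_nodes`, are assumptions made purely for illustration.

```python
import hashlib
import secrets

def node_address(public_key: bytes) -> int:
    """Derive an address by hashing, so the owner cannot pick where it lands.
    (Assumption: a stand-in for whatever random allocation the network uses.)"""
    return int.from_bytes(hashlib.sha256(public_key).digest(), "big")

def closest_nodes(target: int, addresses: list[int], k: int) -> list[int]:
    """Return the k addresses with the smallest XOR distance to the target."""
    return sorted(addresses, key=lambda a: a ^ target)[:k]

# Hypothetical network of 1000 nodes, each with a random key.
keys = [secrets.token_bytes(32) for _ in range(1000)]
addresses = [node_address(key) for key in keys]

# The job's own hash decides which nodes work on it, not the submitter.
job_id = node_address(b"example job description")
group = closest_nodes(job_id, addresses, 8)
```

Because both the node addresses and the job address come out of a hash, neither side can cheaply steer a job towards nodes it controls.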

So, I’m wondering if there’s a way to tag onto SAFE, perhaps using the low level API to create custom messages that would be handled by the c-dApp and which exploit the s-node address to make it unlikely that any two c-nodes would end up working together as boss and worker.

Can this be done with the existing API, and if so can you give me some hints?! :slight_smile:

Also, would this be possible with the safe-nodejs API and just a browser based web app? Or are we talking about something that would rely on future additions to the API?

Tricky to answer sorry! I hope I’ve explained well enough. Gosh this is a long post - thank you for reading :slight_smile:


#2

The technologies I think of when considering distributed computation systems are

It sounds like the proposal is somewhere between BOINC and Ethereum, but closer to BOINC.

I think this can be done, perhaps not using the safe node-naming system directly, but it could be achieved with a specific implementation of that same algorithm in your own app.

Messaging is going to be key to this idea, since the client’s goal is to replace ‘computation cost’ with ‘network cost’. Messaging directly on the safe network may be cost prohibitive, so one alternative may be WebRTC combined with some sort of ranking mechanism (which could be stored on the safe network). I’m not terribly familiar with the API (I’m waiting for it to settle down before committing time to learning it) so this is a pretty generic response for now!

In the short term, leaning toward the BOINC model rather than the Ethereum model does seem sensible to me; I think you’re on the right track.

Some other things to consider are:

  • developing a market for compute, where the client can provide a sample for potential compute nodes to get an idea of how well they can process it, and can then bid a specific price / duration for doing the rest of the work
  • some sort of redundancy so incorrect output can be detected based on a majority rule (much like quorum in close group consensus)
  • a reputation system for compute nodes much like node ranking and aging in the safe network to discourage incorrect output
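The redundancy point in the list above could look something like this. It is only a sketch: `majority_result` and the strict "more than half" quorum are assumptions, not anything taken from the SAFE code.

```python
from collections import Counter

def majority_result(results: dict[str, str], quorum: float = 0.5):
    """Given {worker_id: result}, return (winning_result, dissenting_workers).

    Returns (None, []) when no single result is reported by more than
    `quorum` of the workers, in which case the job should be re-run.
    """
    if not results:
        return None, []
    counts = Counter(results.values())
    winner, votes = counts.most_common(1)[0]
    if votes / len(results) <= quorum:
        return None, []
    dissenters = sorted(w for w, r in results.items() if r != winner)
    return winner, dissenters

# Three workers are given the same job; one returns a different answer.
winner, dissenters = majority_result({"w1": "42", "w2": "42", "w3": "41"})
```

The list of dissenters is what a reputation system could then consume, so the two ideas fit together.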

I think homomorphic encryption is the end game for distributed compute, but will be a long time until it’s ready.

Great post @happybeing :slight_smile:


#3

Thanks @mav I’m very much in agreement with you (without knowing what BOINC is yet). Your analysis and suggestions mirror my own thoughts having chewed on this a bit more.

My thought is that it would be possible to create a powerful distributed computation API if the low level SAFE API could support at least:

  • ensure random selection between c-nodes in different roles (so boss to workers for example, similar to a safe-node and its close group)

I haven’t thought about how that might look as an API, or how SAFE could provide it though. I got as far as thinking:

  • declare a type representing the address space of c-nodes
  • c-nodes register themselves to appear in this address space for each role (such as boss or worker). At this point something equivalent to random address allocation and xor closeness would determine the relationship between nodes, perhaps piggybacking on SAFE defences such as proof of work, to prevent Sybil attacks etc
  • messages are routed, or connections established, between c-nodes according to a scheme I’m not sure about as I write :wink:
  • c-nodes receive messages, handle according to their role, and then respond (via the SAFE API, or possibly directly as you suggest if that can be robust)

I was thinking that a reputation system could be built by c-nodes creating a signed public record (chains of signed entries) where they declare themselves by c-node address and public key (boss + problem in Python, worker in Rust, JS, Python), and can vouch for each other (worker: ‘my result for problem y is z’, boss: ‘worker x correctly solved problem y’, worker: ‘boss b correctly rewarded me for problem p’), in such a way that reputation can be accumulated/recorded, and validated by examining the chain, with c-nodes losing reputation if their entries don’t stack up. Again, not thought through. Very woolly! :slight_smile:
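A rough sketch of what such a record might look like, using a hash chain so that earlier entries can’t be quietly rewritten. Real signatures are left out because the Python standard library has no public key signing; the entry fields (`author`, `claim`, `prev`, `kind`) are all made up for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64

def entry_hash(entry: dict) -> str:
    """Hash an entry deterministically (keys sorted before serialising)."""
    return hashlib.sha256(json.dumps(entry, sort_keys=True).encode()).hexdigest()

def append_entry(chain: list[dict], author: str, claim: dict) -> None:
    """Append a claim, linking it to the hash of the previous entry."""
    prev = entry_hash(chain[-1]) if chain else GENESIS
    chain.append({"author": author, "claim": claim, "prev": prev})

def chain_is_valid(chain: list[dict]) -> bool:
    """Check every entry points at the hash of the entry before it."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev:
            return False
        prev = entry_hash(entry)
    return True

def worker_reputation(chain: list[dict], worker: str) -> int:
    """Count boss vouches that match a result this worker actually recorded."""
    claimed = {e["claim"]["problem"] for e in chain
               if e["author"] == worker and e["claim"]["kind"] == "result"}
    return sum(1 for e in chain
               if e["claim"]["kind"] == "vouch"
               and e["claim"]["worker"] == worker
               and e["claim"]["problem"] in claimed)

chain: list[dict] = []
append_entry(chain, "worker-x", {"kind": "result", "problem": "y", "result": "z"})
append_entry(chain, "boss-b", {"kind": "vouch", "worker": "worker-x", "problem": "y"})
```

Note that this does nothing by itself to stop a group of colluding c-nodes vouching for each other; it only makes the record tamper-evident.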

Your feedback encourages me to think more, but I probably have to study more first. My post was in part to try and shortcut that :slight_smile: and tap into expertise from others.

Thanks for taking the time to read, analyse and respond. It’s really helpful to me. :slight_smile:


#4

Hi happybeing, can you help us review some stuff for our app development plan? Thanks


#5

@happybeing
IPython parallel might give you some good ideas. It’s fairly general: there’s no need for nodes to sign up for specific projects as with BOINC, although I do like the BOINC model too. IPython parallel follows a similar boss/worker model. The only issue is that it requires tunneling through ssh in order to transfer data around securely.
https://ipython.org/ipython-doc/3/parallel/parallel_intro.html

Another thought, which is a wishlist app item for me: build a safenet version of the Amazon EC2 system. Just use the safe network as a management layer only. It would advertise node availability and manage the topology of the dynamic compute cluster, set up the tunnelling and pass keys around, and handle safecoin payment. The cluster itself would then be managed by ssh in the traditional compute cloud way.

Let users write the programs in what they are accustomed to in cluster computing (e.g. Fortran/C/C++/Python/Rust). The communications for the actual parallel processing and data would be handled by these, using something like MPI or OpenCL or other native libraries, through the ssh based network that was only set up via safenet. You could just have the compute node consist of the safenet layer and a virtual machine with something like CentOS on it, with the required MPI libraries installed along with SLURM for job control, to make many people in HPC very happy.

Most of the time when you write MPI programs, the programmer expects that some nodes could go offline and tries to safeguard against it anyhow. As long as the “compute churn” was fairly low it might work ok. Traditional HPC gurus would complain about latency, but if it was cheap enough and scaled well they probably wouldn’t care, especially if the safenet layer found ways to optimize for latency and allowed different payment structures for latency vs. available RAM and TFlops and churn rate.

I’m new here, so I’m not sure how this would jibe with safenet anonymity requirements. All the nodes in the p2p compute cluster would need to know one another’s IP addresses in order to set up the ssh tunnels and build the dynamic cluster under this scenario. There may be clever ways to get around this constraint that I am unaware of. This may not be the ideal decentralized computation app, but it would get things started and would be a useful tool in the short term.


#6

The interactive proof system computing model deals with these problems:

The prover is all-powerful and possesses unlimited computational resources, but cannot be trusted, while the verifier has bounded computation power.

We could split a procedure into several steps, deliver them to randomly chosen nodes, and verify each step using a parallel programming model.

It eliminates the need for reputation systems. If a node commits too many malicious operations, others could ban it immediately for a long time. Since verifiers require only limited resources, a lot of verifiers could be randomly chosen without affecting scalability.
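As a very rough illustration of the "cheap verifier" idea, here is a spot-check sketch. It is not a real interactive proof protocol, just random re-execution of a few steps from a worker’s trace; `step`, `run_with_trace` and `spot_check` are hypothetical names.

```python
import random

def step(x: int) -> int:
    """One unit of work; a stand-in for a real computation step."""
    return (x * x + 1) % 1_000_003

def run_with_trace(x0: int, n_steps: int) -> list[int]:
    """The worker (prover) runs every step and publishes all intermediate values."""
    trace = [x0]
    for _ in range(n_steps):
        trace.append(step(trace[-1]))
    return trace

def spot_check(trace: list[int], samples: int, rng: random.Random) -> bool:
    """A verifier re-executes only `samples` randomly chosen steps.

    Its cost depends on `samples`, not on the length of the trace.
    """
    n_steps = len(trace) - 1
    indices = rng.sample(range(n_steps), min(samples, n_steps))
    return all(step(trace[i]) == trace[i + 1] for i in indices)

trace = run_with_trace(3, 100)
ok = spot_check(trace, samples=10, rng=random.Random())
```

A single verifier sampling a few steps can miss a single bad step, which is why having many independent, randomly chosen verifiers matters here.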


#7

Perhaps Crust could be leveraged to allow point to point communication instead of SSH? It seems very fit for this sort of purpose.


#8

uuuuh - using a Crust API and sending the code that needs to be executed together with the optional payload

as a POC one could just send e.g. a JS file that will be executed only if it uses whitelisted commands (so no execute statements or other possibly dangerous stuff for the executor), but callbacks over Crust should be possible
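a sketch of the whitelisting idea, in Python rather than JS just to keep it short: parse the submitted code and refuse anything that isn’t on an explicit allow list. The allowed set here (numbers, named inputs, + - *) is an arbitrary assumption, and a real executor would also need time and memory limits.

```python
import ast
import operator

# Only these binary operators may appear in submitted code.
ALLOWED_BINOPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}

def safe_eval(source: str, variables: dict[str, float]) -> float:
    """Evaluate an arithmetic expression, rejecting any non-whitelisted syntax."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.Name) and node.id in variables:
            return variables[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in ALLOWED_BINOPS:
            return ALLOWED_BINOPS[type(node.op)](walk(node.left), walk(node.right))
        # Calls, attribute access, imports etc. all end up here.
        raise ValueError(f"disallowed syntax: {type(node).__name__}")
    return walk(ast.parse(source, mode="eval"))

result = safe_eval("x * 2 + 1", {"x": 20})
```

the code is never handed to `eval` or `exec`; it is only walked node by node, so something like a function call is rejected before anything runs.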

the POC for computing on SAFE could be e.g. a “Centralized BTC/MAID Exchange” (centralized computing nodes to ensure no one gets tricked here)
The Exchange could maintain Crust connections to all traders -> sends updates on bids/asks + corresponding Crust IDs (== XOR address?) to the traders
Crust connection Between Nodes that are willing to trade; A <> B
1.) A tells B it is willing to buy the coins at the offered price
2.) B inserts his sending Address BTC, receiving Address for MAID + Crust ID and sends it to A
3.) A takes this contract, inserts his sending address MAID and receiving Address BTC and signs it
4.) B signs the finished contract as well and sends it to the “server” for execution
5.) the computing nodes contact A and B and request signed txs; if both transactions are valid they get broadcast - if not, then neither is broadcast and the tricked user gets notified

hmhmmm … but just programming an exchange utilizing crust and not trying to make it a POC for decentralized computing would probably save you a lot of work… maybe not worth to waste too much thought on this …

ps: okay and if a bad actor sends his MAID with a high tx-fee to another address after the BTC-tx has been broadcast, he might still trick the system …


#9

Perhaps, yes. SSH isn’t necessarily required, it’s just the way people are used to interacting with the cluster environments. I see four separate computation scenarios:

  1. SafeCluster: This is simply a distributed compute cluster using SAFE as a communications layer, and global node management at the safe app level. Use case: scientific computing, large parallel workloads requiring a high degree of synchronization. A node in this cluster would run some sort of SafeOS and the user has full control to run scientific computation type jobs using standard MPI/OpenMP or OpenCL libraries. Assuming that the SafeOS is running as a guest VM, there would need to be some way to have its entire operation within the container unreadable from the host machine. Ideally, one would not need to rely on Intel’s trusted zone tech or similar techniques advertised by other manufacturers. I mentioned that each node in the cluster would need to know the IP addresses of the other nodes because that is typically how MPI clusters are set up now. For simplicity, usually one just sets up iptables rules so that nodes in the cluster can only communicate with each other. Since safe nodes also need to know the IP addresses for the first hop, maybe there is a way to have routing and crust transparently handle the communication. (i.e. Node A sends an MPI command to known IP B, which routes automatically to node C whose IP is known by node D where the actual computation takes place. When D sends back to what it knows as IP C, the computation actually takes place on A, etc.)

  2. SafeCompute: This is more like a BOINC project model, where individual safe apps are installed and are periodically sent an input dataset and send back the output dataset. I would say this falls under embarrassingly parallel processing. Might be good for genetic algorithm optimization too. GridCoin is currently exploring ways to allocate blockchain tokens related to proof of computation. All this needs is a safe app that knows the location of a large shared network dataset.

  3. Homomorphic Computation: Already mentioned in the forum. IIRC this is basic arithmetic that operates directly on an encrypted input to produce a correct encrypted output. More of a long-term research topic.

  4. SafeProcess, SafeSSI, Smart Contracts built into the datachains, others: For a more network based computation at a deeper level, I’m thinking of something analogous to SAFE-NFS, but instead for managing network “processes”. That way a “process” could be executed from a SAFE CLI similar to the way it is done on a local machine. Projects like OpenSSI seek to present a “single system image” (SSI) to the user that aggregates all of the network resources and migrates processes as required. This too would likely require a SafeOS and is a higher level approach than SafeCluster. ScaleMP is a commercial entity that has made progress in providing good SSI solutions in recent years. Others, you tell me.
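For the scenario above that computes directly on encrypted input (usually called homomorphic encryption), here is a toy illustration of the property. It uses textbook RSA with tiny primes, which is insecure and only supports multiplication, so treat it as a demonstration of the concept rather than a usable scheme.

```python
# Toy textbook RSA with tiny primes: insecure, for illustration only.
p, q = 61, 53
n = p * q                  # public modulus
phi = (p - 1) * (q - 1)
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent (modular inverse, Python 3.8+)

def encrypt(m: int) -> int:
    return pow(m, e, n)

def decrypt(c: int) -> int:
    return pow(c, d, n)

a, b = 7, 6

# The worker only ever sees ciphertexts and multiplies them together.
c_product = (encrypt(a) * encrypt(b)) % n

# The owner of the private key decrypts and gets the product of the plaintexts.
product = decrypt(c_product)
```

This works because (a^e)(b^e) = (ab)^e mod n, as long as a*b stays below n. Schemes that support arbitrary computation on ciphertexts are far more involved, which is why this remains a long-term topic.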


#10

I downloaded the source code for NEO last month, and was digging around to wrap my head around how we could do this on SAFENetwork.
One part is a consensus network, another is the code execution.

And so, I came to this as I was thinking about how we can execute unknown code.
In the .NET Framework there is something called AppDomain, which actually lets us run code in a sandbox.
But, this is not cross platform - and that’s a showstopper for SAFENet related things. However, with .NET Core, (which is cross platform) they have discontinued this sandboxing capability.

And so I was looking at other ways, and well, implementing a Virtual Machine is the standard way to run smart contracts.

I’ve never implemented a virtual machine so I dug around a bit. It was great to see that NEO is coding in C#, since that’s my preferred language.
Looking at the NEO code for this, it is actually not as big of a thing as I first thought. Not saying it is trivial, but I had imagined it would be a larger code base for it.

There are limitations with SmartContracts though. I really liked the possibility to run code in a sandbox, but if there is a good smartcontract framework available, naturally it will be used.

So, I’m thinking, the bytecode instructions need to be defined. What things are supposed to be done in SAFENetwork? We have the standard arithmetic operations, but then there is the currency transaction part.
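To get a feel for how small the core of such a VM can be, here is a minimal stack machine sketch in Python. The opcode names, the `TRANSFER` instruction and the gas limit are all invented for illustration; this is not NEO’s instruction set or anything from SAFENetwork.

```python
import operator

# Arithmetic opcodes: each pops two values and pushes one.
BINOPS = {"ADD": operator.add, "SUB": operator.sub, "MUL": operator.mul}

def run(program, gas_limit=1000):
    """Execute a list of (opcode, *args) tuples.

    Returns (stack, transfers). Transfers are only recorded as intents;
    it would be up to the network to validate and apply them.
    """
    stack, transfers, gas = [], [], 0
    for op, *args in program:
        gas += 1  # every instruction costs one unit of gas
        if gas > gas_limit:
            raise RuntimeError("out of gas")
        if op == "PUSH":
            stack.append(args[0])
        elif op in BINOPS:
            right, left = stack.pop(), stack.pop()
            stack.append(BINOPS[op](left, right))
        elif op == "TRANSFER":
            # Pop the amount from the stack; the recipient is an operand.
            transfers.append((args[0], stack.pop()))
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack, transfers

# (2 + 3) * 4, then send the result to a hypothetical account "alice".
program = [("PUSH", 2), ("PUSH", 3), ("ADD",), ("PUSH", 4), ("MUL",),
           ("TRANSFER", "alice")]
stack, transfers = run(program)
```

Keeping the currency part as a recorded intent rather than a side effect means the interpreter itself stays a pure function of its input, which makes it easier for several nodes to run the same code and compare results.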

So actually, the absolute best thing would be if MaidSafe implemented this in yet another Persona.
There is a big chance they will, and that fact could act as a deterrent for others to launch a project based on it. I mean, I know for sure that I wouldn’t like to quit my job, take all my savings, just to see the excellent engineers at MaidSafe do it 10 times better soon after :slight_smile:
On the other hand, they are kind of busy with a tonne of other things too, so it would be really nice if this effort could be taken on by others, soon.


#11

Oh no I’ve been found out and now my source code is out there.

@neo on community forum - I know bad joke in a serious topic - blame the coffee again


#12