ServiceFabric and SAFENetwork

oetyng · March 17, 2018, 2:27pm

As I’ve mentioned elsewhere, I’m planning to implement the interfaces IReliableDictionary and IReliableQueue used by the microservice orchestration framework ServiceFabric, which runs on Linux and Windows.

Now, ServiceFabric, that is a big topic in itself. The entire Service Fabric runtime is just now being open sourced on GitHub, which is very good for my project. Repo is here.

So what are these things?

Service Fabric is a distributed systems platform that makes it easy to package, deploy, and manage scalable and reliable microservices and containers. Service Fabric also addresses the significant challenges in developing and managing cloud native applications.

And a context:

A lot of Microsoft runs on Service Fabric, including Azure infrastructure services and large-scale solutions like Azure SQL DB, Azure Cosmos DB, and Cortana. It’s our secret sauce for large scale distributed applications that run our business.

It can run on any set of machines, and is not tied to Azure, although there is an Azure version.

The data storage basis for this framework is exposed as the above-mentioned IReliableDictionary and IReliableQueue.

They work much like any dictionary or queue, except that they are replicated across the replicas of the service, which run on the nodes in the ServiceFabric cluster.

Another major part of this framework is something called the Actor Model, where we keep data and calculations together. This and the concept of microservices are software architecture concepts that pretty much all developers know, by name at least.

The project

For the past year and a half, I have designed and implemented a peer-to-peer lending platform running in ServiceFabric, and this is what spurred my interest in combining these things.

So, what I want is to make it possible to just reference a library, set a simple config, and switch out the underlying implementation of data storage, so that instead of being stored on the node disks in the ServiceFabric cluster, everything is stored in SAFENetwork.
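A minimal sketch of what such a config switch could look like. It is written in Python for brevity rather than the C# the real library would use, and every name in it (`LocalDiskStore`, `SafeNetworkStore`, `create_store`, the `"storage"` config key) is hypothetical. Neither class talks to ServiceFabric or SAFENetwork; they are plain in-memory stand-ins that only show the shape of the idea.

```python
from typing import Callable, Dict, MutableMapping


class LocalDiskStore(dict):
    """Hypothetical stand-in for state kept on the cluster nodes' disks."""


class SafeNetworkStore(dict):
    """Hypothetical stand-in for state kept in SAFENetwork (no network calls)."""


# Registry of available backends, keyed by the name used in the config.
BACKENDS: Dict[str, Callable[[], MutableMapping]] = {
    "local": LocalDiskStore,
    "safenetwork": SafeNetworkStore,
}


def create_store(config: dict) -> MutableMapping:
    """Return the storage implementation named in the config."""
    name = config.get("storage", "local")
    if name not in BACKENDS:
        raise ValueError(f"unknown storage backend: {name!r}")
    return BACKENDS[name]()


# The calling code stays the same; only the config value changes.
store = create_store({"storage": "safenetwork"})
store["loan-1"] = {"amount": 1000}
```

The point of the sketch is only that the calling code never names a concrete backend, so swapping storage is a one-line config change.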

In theory, this means that, for example, the system at the company I currently work for, as well as the services mentioned above, could be based on SAFENetwork with a few very simple steps.

In reality, we will probably see a different performance profile, which will dictate which kinds of services would actually be able to use it the same way they use the storage today.

So, it would be awesome to collaborate with any of you intelligent and gifted people, and to get comments and ideas on the project.
I’m currently under an insane workload at my job, but when I get to it, there will be more details about this.
So stay tuned!

oetyng · March 20, 2018, 1:08am

So, I’ll elaborate some more. I’m on my phone, so I will not be too detailed.

I’m referring to a C# implementation, probably an adaptor which implements the ServiceFabric interfaces. An existing, tightly coupled solution would need a few more changes than just config, but nothing advanced.

Creators of a framework rarely think much about whether their examples couple the code tightly to the framework. This is true for ServiceFabric too, if you look around on the net.
But I have separated the domain logic from the framework (almost done with the entire system), which means I can choose any host to run the system, be it a console app with in-memory storage, a ServiceFabric cluster, or any other kind of container or host.

So our ServiceFabric code references the domain logic, passing in the SF state managers wrapped in adaptors. The domain logic does not know whether it is using an IReliableDictionary from SF, or something using SAFENetwork…
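A minimal sketch of that separation, assuming a very small key-value interface. It is in Python for brevity, not the C# of the actual system, and `KeyValueState`, `InMemoryState`, and `LoanAccount` are hypothetical names invented for illustration. A ServiceFabric or SAFENetwork adaptor would implement the same interface; only the in-memory one is shown here.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class KeyValueState(ABC):
    """Hypothetical interface the domain logic depends on."""

    @abstractmethod
    def get(self, key: str) -> Optional[int]: ...

    @abstractmethod
    def set(self, key: str, value: int) -> None: ...


class InMemoryState(KeyValueState):
    """Adaptor for a console host with in-memory storage."""

    def __init__(self) -> None:
        self._items: Dict[str, int] = {}

    def get(self, key: str) -> Optional[int]:
        return self._items.get(key)

    def set(self, key: str, value: int) -> None:
        self._items[key] = value


class LoanAccount:
    """Domain logic: it only sees KeyValueState, never the host or store."""

    def __init__(self, state: KeyValueState) -> None:
        self._state = state

    def deposit(self, account: str, amount: int) -> int:
        balance = (self._state.get(account) or 0) + amount
        self._state.set(account, balance)
        return balance


# The host decides which adaptor to pass in.
accounts = LoanAccount(InMemoryState())
accounts.deposit("alice", 100)
```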

So this refactoring actually opens up for passing in a SAFENetwork implementation.
And that’s the part I’ll figure out.
Now, it isn’t trivial to get the full-fledged functionality of SF reliable state. But I’ll make a trivial first version, and then we’ll see how far I can take it.

What makes it non-trivial is the transactions that ServiceFabric uses. It lets you mutate state in one or even multiple reliable collections within a single transaction. And to replicate this in SAFENetwork… Well, let’s say I’ve had to think a bit about the various access patterns that open up with a global storage solution. Since storing state means a handful of commits, some special patterns of compensatory actions and processes need to go into it to get something that looks like a transaction. It’s very exciting, but I do not know if or how it will be solved in a way that is sufficiently reliable.
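A minimal sketch of the compensatory-action idea, in Python for brevity and with hypothetical names throughout (`run_with_compensation`, `debit`, `credit_back`, `enqueue_payout`). Each step is paired with an action that undoes it; if a later step fails, the completed steps are undone in reverse order. This only looks like a transaction from the outside: it is not atomic or isolated, other readers could observe the intermediate state, and a compensation could itself fail, which is exactly the reliability question raised above.

```python
from typing import Callable, List, Tuple

# A step is (action, compensation): the compensation undoes the action.
Step = Tuple[Callable[[], None], Callable[[], None]]


def run_with_compensation(steps: List[Step]) -> bool:
    """Run steps in order; on failure, undo completed steps in reverse."""
    completed: List[Callable[[], None]] = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True


# Two "collections" that should change together.
balances = {"alice": 100}
payouts: List[str] = []


def debit() -> None:
    balances["alice"] -= 30


def credit_back() -> None:
    balances["alice"] += 30


def enqueue_payout() -> None:
    # Simulate the second commit failing.
    raise RuntimeError("commit failed")


ok = run_with_compensation([(debit, credit_back), (enqueue_payout, lambda: None)])
```

Here the debit is applied and then reversed when the second commit fails, so `balances` ends up where it started and `ok` is `False`.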

So, one teaser is that ServiceFabric uses logs for building up these transactions, and well, logs, streams… that’s where I saw the EventStore coming into use, along with the aggregate abstractions, as they are meant to solve collaborative work over the same state, which I would say distributed state management is. But I am far from doing advanced low-level performance coding, so that’s why I hinted that, in reality, we’ll have to see just how performant it can be.
This could probably work fine for some businesses, but it cannot be the backbone of a cloud infrastructure. Just to put it in perspective.
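To illustrate the log idea above, here is a minimal sketch in Python with hypothetical names (`TransactionLog`, `commit`, `replay`). It is not how ServiceFabric or any EventStore is implemented; it only assumes that a transaction touching several collections is appended to the log as one record, and that current state is derived by replaying the records in order.

```python
from typing import Any, Dict, List, Tuple

# One operation: (collection name, key, new value).
Op = Tuple[str, str, Any]


class TransactionLog:
    """Append-only log where each record holds all ops of one transaction."""

    def __init__(self) -> None:
        self._records: List[List[Op]] = []

    def commit(self, ops: List[Op]) -> int:
        """Append the ops as a single record and return the new log length."""
        self._records.append(list(ops))
        return len(self._records)

    def replay(self) -> Dict[str, Dict[str, Any]]:
        """Rebuild the state of every collection from the log."""
        state: Dict[str, Dict[str, Any]] = {}
        for record in self._records:
            for collection, key, value in record:
                state.setdefault(collection, {})[key] = value
        return state


log = TransactionLog()
# One transaction that mutates two collections.
log.commit([("loans", "loan-1", 1000), ("balances", "alice", -1000)])
log.commit([("balances", "alice", -500)])
state = log.replay()
```

Because a multi-collection change lands as a single append, a reader replaying the log sees either all of a transaction or none of it, which is the property the handful of separate commits does not give by itself.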