I have a few questions being the secure network geek I am…
How is remote monitoring, control and logging of this SAFE NPM module's health state meant to work, so that support staff can manage and debug it interactively when the need arises?
FYI: Apstra claims they have autonomous networking all figured out with L2 VXLANs over secure IP in the DC; however, I can tell you there will always be a need for the above.
Has anyone looked at a Log4j-like capability, à la Winston, so that the health/state info this module generates is emitted in log form? It could be paired with a false-positive filter on a smart RESTful API that sends the accumulated health/state data to two or more running Management App nodes living in a support-network CUG, using mcast/groupcasts organized at Layer 5/4. From there the info could be accessed, analysed, re-routed and stored in other Vaults for later reference, analysis and reporting by support staff via their browsers.
Any thoughts on using C2-like management of this module's health/state logging to ensure a secure chain of custody through the network, so the log info can't be faked by a rogue module?
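To make the chain-of-custody idea concrete, here is a minimal sketch of a tamper-evident log, assuming each module shares a secret key with the management nodes. The function names, the record layout and the `GENESIS` placeholder are all hypothetical and not part of any SAFE API. Each record's MAC also covers the previous record's MAC, so editing or dropping an earlier entry breaks verification of everything after it.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # hypothetical placeholder "previous MAC" for the first record


def append_record(chain, key, payload):
    """Append a health record whose MAC also covers the previous record's MAC."""
    prev = chain[-1]["mac"] if chain else GENESIS
    body = json.dumps(payload, sort_keys=True)
    mac = hmac.new(key, (prev + body).encode("utf-8"), hashlib.sha256).hexdigest()
    chain.append({"prev": prev, "body": body, "mac": mac})
    return chain


def verify_chain(chain, key):
    """Return True only if every record links to its predecessor and its MAC checks out."""
    prev = GENESIS
    for record in chain:
        expected = hmac.new(
            key, (prev + record["body"]).encode("utf-8"), hashlib.sha256
        ).hexdigest()
        if record["prev"] != prev or not hmac.compare_digest(expected, record["mac"]):
            return False
        prev = record["mac"]
    return True
```

A shared-key MAC only protects against parties that do not hold the key. A rogue module that has the key can still write false entries, so a real design would probably need per-module signing keys on top of something like this.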
Other info which might be useful:
Also, if anyone is interested, there is an LGPL V2 licensed project still accessible on sourceforge.net (sf.net) called “Sherlock” that we (the ex-Platespin core founding engineering group) created back in 2004. It used Log4J to collect syslog relay data from network devices, was C2 compliant (milspec stuff), and stored the data in a C2-compliant, organized flat-file hierarchy. A back-end post-processor written in Perl used regex to dump it to an RDBMS so we could get at it with a report server. It might be useful as a reference…
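As a rough illustration of that kind of regex post-processing step, here is a minimal sketch in Python rather than Perl. It is not taken from the Sherlock code. It assumes BSD-style syslog lines of the form `<PRI>Mmm dd hh:mm:ss host message`, and the `parse_line` name and output keys are hypothetical. Each parsed line becomes a flat dict that could be inserted as one RDBMS row.

```python
import re

# Assumed input format: "<PRI>Mmm dd hh:mm:ss host message" (BSD-style syslog).
SYSLOG_LINE = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})\s"
    r"(?P<host>\S+)\s"
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Turn one syslog line into a flat dict, or None if it does not match."""
    match = SYSLOG_LINE.match(line)
    if match is None:
        return None
    pri = int(match.group("pri"))
    return {
        "facility": pri // 8,  # syslog PRI = facility * 8 + severity
        "severity": pri % 8,
        "timestamp": match.group("timestamp"),
        "host": match.group("host"),
        "message": match.group("message"),
    }
```

Lines that do not match return `None`, so they can be counted or set aside instead of silently dropped.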
I am really thinking about how SAFE Network keeps track of itself in terms of state/health and also keeps track of the network environment on which it rides… If you can point me to some links which talk about this I would be grateful and might be able to contribute thereafter : ) R2
thanks for moving this into its own thread/topic: )
The state/health of the network I think should be considered when relocating vaults and that means knowing
I am just putting a few ideas out here below, and hopefully any replies will help me better understand how this is treated in Alpha 2 and beyond…
Link state up/down: check the router path
Not all networks are meshed and load balanced
Lots of cascading star networks out there
Bandwidth OK (clear or congested): ping does the work here
Time to respond (4096 datagram)
So does the client side determine health?
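If the client side did make that call, the checks above could be boiled down to something like the following. This is a minimal sketch, assuming the client already has round-trip times from a handful of probes, with `None` standing for a probe that got no reply. The `classify_link` name and both thresholds are hypothetical and would need tuning per network.

```python
from statistics import median


def classify_link(rtts_ms, congested_rtt_ms=500.0, max_loss=0.2):
    """Classify a path as 'down', 'congested' or 'clear' from probe results.

    rtts_ms holds one entry per probe: a round-trip time in milliseconds,
    or None when no reply came back. Both thresholds are illustrative only.
    """
    if not rtts_ms:
        raise ValueError("need at least one probe result")
    replies = [rtt for rtt in rtts_ms if rtt is not None]
    if not replies:
        return "down"  # nothing answered: treat the link as down
    loss = 1 - len(replies) / len(rtts_ms)
    if loss > max_loss or median(replies) > congested_rtt_ms:
        return "congested"
    return "clear"
```

The median is used instead of the mean so that a single slow reply does not flip a link to "congested".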
FYI, primary use case: I am working on a large distributed-systems SaaS for remote control/monitoring and logging of clean power stations, so I am really trying to understand whether SAFE Network can get the job done for us reliably on the wireless (line-of-sight WiMAX/Wi-Fi) and wireline/fibre networks out there (Canada and elsewhere). In the exception use case, if there is a problem with the underlying carrier, can SAFE Network let the client-side monitoring/control and logging app know about it, so we can then tell the Clean Power Station to try the SAT links? These stations don’t have a lot of storage, are in many cases quite remote, and there are SLAs involved regarding proving station uptime and maintaining station visibility…
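The fall-back decision itself could live entirely on the client side. Here is a minimal sketch, assuming some health check already reports whether the primary carrier is usable. The `LinkFailover` class, the threshold of three failed checks and the link labels are all hypothetical.

```python
class LinkFailover:
    """Switch to the satellite link after several consecutive failed checks."""

    def __init__(self, fail_threshold=3):
        self.fail_threshold = fail_threshold  # illustrative value, not a recommendation
        self.consecutive_failures = 0
        self.active = "primary"

    def record(self, primary_ok):
        """Feed in one health-check result and return the link to use next."""
        if primary_ok:
            # Any successful check resets the counter and moves back to the primary carrier.
            self.consecutive_failures = 0
            self.active = "primary"
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.fail_threshold:
                self.active = "satellite"
        return self.active
```

Requiring several failures in a row avoids flapping onto the SAT link on a single dropped probe. Whether to move back to the primary after one good check, as this does, is a judgment call.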
Also, as a regulatory requirement, we need to log all the data from the Station for long periods.
This service is not due for prime time operation until 2019…
thanks again for the new topic setup…
You can see what the clean power station looks like at mobismart.ca; it’s basically a diesel replacement solution for those places/countries/companies hit with carbon credit taxes for running diesel gear.
Great thread title/question OP. I just finished reading the thread updates to the Secure Random Relocation RFC and was about to ask, “What metrics does SAFE use to determine section health?” I decided to search the forum first and found this thread and some familiar faces.
I don’t mean to hijack the thread, but what kind of metrics are used to trigger a relocation/split/merge? My intuition tells me it has to do with the number of members in a group/section: if there are too many, the infants are kicked out; if there are too few, it gets merged with another section that has too few nodes. I’m sure there is a lot more to it, so if anyone has a reference or two, please let me know. @mav mentioned sections ‘most in need’ are relocated/targeted, so maybe he has read something recently he could send our way? What have I missed?
Edit: I think I’ve found some of what I was looking for… what else do I need?
The Safecoin algorithm includes some health-related mechanisms; see RFC 0005 Balance Network Resources.
Datachains have indirect implications for health insofar as they store prior state of the network, which may be interpreted in the context of ‘health’; see RFC 0029 Data Chains.
Disjoint Groups is a mechanism for splitting and / or joining sections on the network, which has implications for the definition of health. See RFC 0037 Disjoint Groups. It isn’t explicitly stated, but sections that are close to needing a merge or close to splitting are considered ‘unhealthy’ and would ideally be helped by the network to prevent merging or encourage splitting. So the definition of ‘health’ in this case is not being close to merging or splitting. This depends on the age distribution of vaults in the section as well as the number of vaults in the section, so it’s a little complex and not explicitly documented anywhere. Try having a look at the thread Analysing the google attack, especially this comment by Viv.
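To illustrate that working definition (healthy means not close to merging or splitting), here is a minimal sketch. None of the numbers come from RFC 0037: the `section_health` name, the size thresholds, the margin and the adult-age cutoff are all hypothetical. Counting only vaults above some age is just an assumed stand-in for the age-distribution part.

```python
def section_health(vault_ages, min_size=8, split_size=16, margin=2, adult_age=5):
    """Label a section 'near_merge', 'near_split' or 'healthy'.

    vault_ages is a list with the age of every vault in the section.
    All thresholds are illustrative assumptions, not values from any RFC.
    """
    # Assumption: only vaults at or above adult_age count towards section size.
    adults = sum(1 for age in vault_ages if age >= adult_age)
    if adults <= min_size + margin:
        return "near_merge"  # losing a few more adults would force a merge
    if adults >= split_size - margin:
        return "near_split"  # a few more adults would trigger a split
    return "healthy"
```

A relocation policy could then prefer sending vaults towards `near_merge` sections, in line with the 'most in need' idea mentioned earlier in the thread.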
There’s a proof-of-resource mechanism that shows nodes have at least a certain minimum capacity to perform computations (i.e. show they can be healthy). Not sure where to find more info on this, but have a look at routing resource_prover.rs.
Overall the concept of network health is still being refined and there is no single definitive source for it. Mainly it comes down to measuring how well resources are balanced across the network.