Looking for Info on Profiling Tools?

Scorch · July 23, 2020, 2:16pm

Hi all, I was wondering if somebody could point me to any information regarding profiling tools for the SAFE net.

Not necessarily code profiling in the sense of stepping through code or memory utilization, or even the logs. More like SAFE-specific tools. E.g. I recall a tool presented in the SAFE YouTube series for PARSEC gossip graph visualization (though I suppose that’s sort of moot with AT2 on the horizon). It seems it would be handy if we could generate traces in XOR space for messages routed on test nets, or visualize network events such as sections splitting or merging, like in the gossip graph example.

I haven’t yet found a centralized document on this (or figured out whether these kinds of tools exist in the current state of the network). Trawling the various git repos is slow going, so thanks in advance if anybody can provide some info on this.

drehb · July 24, 2020, 2:55am

As far as I know they don’t exist.

mav · July 26, 2020, 6:16am

Nothing exists yet to my knowledge, but these sorts of ‘internal tools’ will be useful to have.

I started on a vault log viewer that aggregates several vault logs into a single chronological view. I was going to extend it with some SAFE-specific overlays but haven’t had the motivation to do it yet; maybe one day. The repo is at github.com/iancoleman/log_viewer_for_safe_vaults and the conversation is here.
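To make the aggregation idea concrete, here is a minimal Python sketch of merging per-vault logs into one chronological view. It is not taken from the linked repo. The log line format (an ISO-8601 timestamp, a space, then the message) and the names `parse_line` and `merge_logs` are assumptions for illustration; real vault logs may look different.

```python
import heapq
from datetime import datetime


def parse_line(vault, line):
    # Assumed format: "<ISO-8601 timestamp> <message>" (hypothetical).
    stamp, _, message = line.partition(" ")
    return datetime.fromisoformat(stamp), vault, message


def merge_logs(logs):
    """Merge per-vault logs into one list ordered by timestamp.

    `logs` maps a vault name to its log lines, each list already in
    time order, which is what heapq.merge requires of its inputs.
    """
    streams = [
        [parse_line(vault, line) for line in lines]
        for vault, lines in logs.items()
    ]
    return list(heapq.merge(*streams))


logs = {
    "vault_a": ["2020-07-26T06:16:00 started", "2020-07-26T06:16:05 joined section"],
    "vault_b": ["2020-07-26T06:16:02 started"],
}
merged = merge_logs(logs)
for stamp, vault, message in merged:
    print(stamp.isoformat(), vault, message)
```

Any SAFE-specific overlay could then be built on top of the merged list, since every entry keeps the name of the vault that logged it.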

I’ve thought that some visualisation of routes through the xor namespace could be handy, e.g. a message starts at 0x33bb8..., fetches a chunk at 0x1e87f... and is returned to the origin. The outbound and return routes are different when using nearest neighbours (see disjoint section message routing and reliable message delivery), so it would be neat to have some analysis of it, especially for cache purposes. But maybe it's too early for that, since nearest neighbour routing may not be what is used in the final release (“all section will know all elders of all sections. We can still have it massivly scalable by only knowing neighbors, but that means more work right now.” source).
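As a rough illustration of what a route trace in xor space could look like, here is a Python sketch that greedily forwards a message to whichever known peer is closest to the target by XOR distance. This is a toy model, not the routing used by the network: the 4-bit addresses, the `peers` topology and the function names are all made up for the example.

```python
def xor_distance(a, b):
    # Distance between two addresses in xor space.
    return a ^ b


def trace_route(start, target, peers):
    """Return the list of addresses visited from start towards target.

    `peers` maps an address to the addresses that node knows about
    (a hypothetical topology). Each hop moves to the known peer closest
    to the target, and the trace stops if no known peer is closer.
    """
    route = [start]
    current = start
    while current != target:
        candidates = peers.get(current, [])
        best = min(candidates, key=lambda p: xor_distance(p, target), default=current)
        if xor_distance(best, target) >= xor_distance(current, target):
            break  # stuck: nobody known here is closer to the target
        current = best
        route.append(current)
    return route


# Toy 4-bit topology (assumption, for illustration only).
peers = {
    0b0011: [0b0111, 0b1011],
    0b1011: [0b1111, 0b1001],
    0b1001: [0b1000],
    0b1000: [0b0000, 0b1100],
    0b0000: [0b0010, 0b0011],
}
outbound = trace_route(0b0011, 0b1000, peers)
inbound = trace_route(0b1000, 0b0011, peers)
print([format(a, "04b") for a in outbound])
print([format(a, "04b") for a in inbound])
```

In this toy topology the return trip passes through different nodes than the outbound trip, which is the kind of asymmetry a visualisation would make easy to spot.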

github.com/maidsafe/routing-sims is worth a look.

Would be interested to hear of any tools you think of.

Scorch · July 29, 2020, 12:47pm

I’m going to spit-ball some hypothetical stuff in an “ideal testing suite” world for a second, so feel free to let me know if this sounds off-base.

Something like the git repo you linked (log_viewer_for_safe_vaults) I think overlaps a bit with what I had in mind, though I think it’s only one facet of an overarching theme. In general, I see the overarching task to be aggregating several individual views of the network (e.g. what individual vaults see), into a single, more cohesive one (perhaps a section or multiple sections).

I see several potential tools that could help with this. I say “tools” but I think it would be better to think of it as “views of the network.”

I think in one view, there is a timeline of various network events (perhaps anything requiring a quorum of votes, such as section formation or vaults leaving and joining; I suppose that would also apply to a section signing a message hop, if I’m reading the above-linked RFCs correctly). Every network event could be viewed as an aggregation of one or several nodes in the network witnessing the same thing.

Expanding any one network event would allow us to see an expansion of each vault’s view of the consensus process (e.g. when and from whom a specific vault ‘saw’ the event) and examine things like the time to reach consensus, average number of messages passed during the voting process, and perceived order of events at any time step for any given vault or set of vaults.
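A minimal Python sketch of that per-event expansion, assuming each vault's log can be reduced to records of the form (event id, vault, time seen in seconds, who it was heard from). That record shape, the `quorum` parameter and the name `summarise_events` are hypothetical, chosen only to show the kind of statistics that could be derived.

```python
from collections import defaultdict


def summarise_events(observations, quorum):
    """Group sightings by event and derive simple consensus statistics.

    `observations` is an iterable of
    (event_id, vault, seen_at_seconds, heard_from) tuples (assumed shape).
    """
    by_event = defaultdict(list)
    for event_id, vault, seen_at, heard_from in observations:
        by_event[event_id].append((seen_at, vault, heard_from))

    summary = {}
    for event_id, sightings in by_event.items():
        sightings.sort(key=lambda s: s[0])
        first_seen = {}
        for seen_at, vault, _ in sightings:
            # Keep only the earliest time each vault saw the event.
            first_seen.setdefault(vault, seen_at)
        times = sorted(first_seen.values())
        # Time from the first sighting until `quorum` vaults have seen it.
        reached = times[quorum - 1] - times[0] if len(times) >= quorum else None
        summary[event_id] = {
            "witnesses": len(first_seen),
            "messages": len(sightings),
            "time_to_quorum": reached,
        }
    return summary


observations = [
    ("split", "a", 0.0, None),
    ("split", "b", 1.5, "a"),
    ("split", "c", 4.0, "b"),
    ("split", "b", 2.0, "c"),
]
print(summarise_events(observations, quorum=2))
```

The `heard_from` field is carried along but not used here; it is what a per-vault "when and from whom" view would be drawn from.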

In another sense, being able to trace a message’s route like you mentioned would be interesting to see. Seeing a message’s start, intermediate, and end points might be a nice metric for measuring the efficiency of caching algorithms and network speed given different hyperparameters for things like section sizes, LRU cache depth, etc. This sort of view lends itself to the handy “circular” DHT representation as well for easy viewing.

As I’m writing this, it seems to create a neat hierarchy: network events are comprised of vault events, which are generated via messages that travel along a given route. Each hop along that route is perhaps comprised of a group signing like in the reliable message delivery RFC, which would create a convenient link back to the higher layers. Not unlike a nested tree scheme.
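For the “circular” view, a small Python sketch of placing addresses on a ring is below. The function name and the default of 256-bit addresses are assumptions for the example. Note that this lays addresses out by numeric value, so two neighbours on the ring are not necessarily close in xor distance.

```python
import math


def ring_position(address, bits=256):
    """Map an address to (x, y) on the unit circle, with address 0 at angle 0.

    `bits` is the assumed address width; the whole address space maps
    onto one full turn of the circle.
    """
    angle = 2 * math.pi * address / (1 << bits)
    return math.cos(angle), math.sin(angle)


# A 4-bit example: addresses 0, 4, 8 and 12 sit a quarter turn apart.
points = [ring_position(a, bits=4) for a in (0, 4, 8, 12)]
```

A route trace could then be drawn as line segments between the ring positions of successive hops.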

Beyond aggregating plain-text logs, conditionally compiling in extra debug output could allow statistics collection to be extensible and pollable via a telemetry-style interface (I recall seeing some mentions of telemetry in the forum, but a quick search on the maidsafe GitHub seems to indicate it doesn’t exist in code). Graphical representation would then be a matter of aggregating and presenting polled data. In a scaled network, knowledge of neighbors might yield only a partial network view, but further queries to other sections of the network could certainly be made to construct the full view of the interesting information.
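To sketch what I mean by a telemetry-style interface, here is a purely hypothetical Python example. Nothing like this exists in the maidsafe code as far as I can tell; the class name, the counter names and the `enabled` flag (standing in for a conditional-compilation switch) are all invented for illustration.

```python
import json
from collections import Counter


class Telemetry:
    """Collects named counters that an external tool can poll."""

    def __init__(self, enabled=True):
        # `enabled` stands in for a compile-time debug switch (assumption).
        self.enabled = enabled
        self.counters = Counter()

    def record(self, name, amount=1):
        # Recording is a no-op when telemetry is switched off.
        if self.enabled:
            self.counters[name] += amount

    def poll(self):
        # Return a JSON snapshot that a viewer could aggregate across nodes.
        return json.dumps(dict(self.counters), sort_keys=True)


node = Telemetry()
node.record("messages_routed")
node.record("messages_routed", 2)
node.record("chunks_cached")
print(node.poll())
```

A viewer would poll each reachable node for its snapshot and merge the results, which is where the partial versus full network view question comes in.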

Scorch · August 2, 2020, 1:30pm

Recently stumbled on this issue here about querying nodes for performance information and making network decisions based on that (evicting nodes is an example given in the issue). Does anybody know what work, if any, has been done on this?

It seems similar to the telemetry-style interface for querying nodes I alluded to above. It would be interesting to play around with that code to see how extensible it is (if it exists). Or maybe it’s worth proposing such an interface if it’s not yet in existence? Judging from the git issue, it seems the latter would be the case, but I don’t know for sure.

mav · August 3, 2020, 3:19am

I don’t know of any work done on this.

It sounds similar to SAFE Network Health Metrics but that’s only an exploratory doc, no code or intention to code.

Also, rfc0057 section health does talk about this a little, although this RFC is not exactly what is being implemented, so take it as a guideline rather than a rigid design doc.

Also worth noting that the now-defunct parsec has a useful list of malicious observations that would impact the observed health of a peer.