I’m going to spit-ball some hypothetical stuff in an “ideal testing suite” world for a second, so feel free to let me know if this sounds off-base.
Something like the git repo you linked (log_viewer_for_safe_vaults) overlaps a bit with what I had in mind, though I think it’s only one facet of a broader theme. In general, I see the overarching task as aggregating several individual views of the network (e.g. what individual vaults see) into a single, more cohesive one (perhaps covering a section or multiple sections).
I see several potential tools that could help with this. I say “tools” but I think it would be better to think of it as “views of the network.”
In one view, there is a timeline of various network events (perhaps anything requiring a quorum of votes, maybe section formation or vaults leaving and joining; I suppose that would also apply to a section signing a message hop, if I’m reading the above-linked RFCs correctly). Every network event could be viewed as an aggregation of one or several nodes in the network witnessing the same thing.
Expanding any one network event would show each vault’s view of the consensus process (e.g. when and from whom a specific vault ‘saw’ the event) and let us examine things like the time to reach consensus, the average number of messages passed during the voting process, and the perceived order of events at any time step for any given vault or set of vaults.
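To make that a bit more concrete, here is a rough sketch of the aggregation step. Everything in it is hypothetical: the `Observation` record, its field names, and the idea that every vault logs a shared `event_id` are my assumptions, not anything taken from the vault code or the linked repo. It also assumes vault timestamps are comparable, which real logs won't guarantee.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One vault's log entry for a network event (hypothetical format)."""
    vault: str      # vault that logged the entry
    event_id: str   # identifier shared by all sightings of the same event
    seen_at: float  # timestamp in seconds, assumed comparable across vaults
    sender: str     # peer the vault heard about the event from


def aggregate(observations):
    """Group per-vault observations into network events keyed by event_id."""
    events = defaultdict(list)
    for obs in observations:
        events[obs.event_id].append(obs)
    return dict(events)


def time_to_quorum(observations, quorum):
    """Seconds from the first sighting until `quorum` distinct vaults have
    seen the event, or None if that many vaults never saw it."""
    first_seen = {}
    for obs in sorted(observations, key=lambda o: o.seen_at):
        # Keep only the earliest sighting per vault.
        first_seen.setdefault(obs.vault, obs.seen_at)
    times = sorted(first_seen.values())
    if len(times) < quorum:
        return None
    return times[quorum - 1] - times[0]
```

Other per-event numbers (messages passed per vote, perceived ordering per vault) could hang off the same grouped structure.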
In another sense, being able to trace a message’s route, like you mentioned, would be interesting to see. Seeing a message’s start, intermediate, and end points might be a nice metric for measuring the efficiency of caching algorithms and network speed given different hyperparameters for things like section sizes, LRU cache depth, etc. This sort of view lends itself to the handy “circular” DHT representation as well for easy viewing.
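As a sketch of what route tracing might look like, assuming each hop gets logged as a hypothetical `(from_node, to_node)` pair and node IDs are plain integers in a fixed-width address space (both are my simplifications, not the real routing types):

```python
def reconstruct_route(hops, source):
    """Order unordered (from_node, to_node) hop records into a route
    starting at `source`. Stops at the endpoint or on a repeated node."""
    next_hop = {src: dst for src, dst in hops}
    route = [source]
    visited = {source}
    while route[-1] in next_hop:
        nxt = next_hop[route[-1]]
        if nxt in visited:
            break  # guard against routing loops in the logs
        route.append(nxt)
        visited.add(nxt)
    return route


def ring_angle(node_id, id_bits):
    """Angle in degrees of a node on the circular DHT picture,
    for an address space of 2**id_bits IDs."""
    return 360.0 * node_id / (2 ** id_bits)
```

The hop count is then just `len(route) - 1`, which could be compared across runs with different section sizes or cache depths, and `ring_angle` gives each node a position for the circular drawing.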
As I’m writing this, it seems to create a neat hierarchy: network events composed of vault events, generated via messages that travel along a given route. Each hop along that route is perhaps composed of a group signing, like in the reliable message delivery RFC, which would create a convenient link back to the higher layers. Not unlike a nested tree scheme.
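That hierarchy could be sketched as nested records, something like the following. All of the type and field names here are made up for illustration; the only thing taken from the discussion is the layering itself (network event, vault event, message, hop, signers).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Hop:
    section: str                                       # section relaying the message
    signers: List[str] = field(default_factory=list)   # vaults in the group signing


@dataclass
class Message:
    message_id: str
    hops: List[Hop] = field(default_factory=list)


@dataclass
class VaultEvent:
    vault: str
    messages: List[Message] = field(default_factory=list)


@dataclass
class NetworkEvent:
    name: str
    vault_events: List[VaultEvent] = field(default_factory=list)

    def total_hops(self):
        """Count every hop of every message under this network event."""
        return sum(len(m.hops) for ve in self.vault_events for m in ve.messages)
```

A viewer could then expand a network event level by level, down to which vaults signed a particular hop.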
Beyond aggregating plain-text logs, conditionally compiling in extra debug output could allow statistics collection to be extensible and pollable via a telemetry-style interface (I recall seeing some mentions of telemetry in the forum, but a quick search on the maidsafe GitHub seems to indicate it doesn’t exist in code). Graphical representation would then be a matter of aggregating and presenting polled data. In a scaled network, knowledge of neighbors might yield only a partial network view, but further queries to other sections of the network could certainly be made to construct the full view of the interesting information.
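Stitching partial views together might look roughly like this. Since no telemetry interface exists as far as I can tell, the shape of a polled view (a dict from section name to a stats dict carrying a `polled_at` timestamp) is entirely an assumption for the sake of the sketch.

```python
def merge_views(partial_views):
    """Combine partial network views into one, keeping the most recently
    polled stats whenever two views report on the same section."""
    full = {}
    for view in partial_views:
        for section, stats in view.items():
            current = full.get(section)
            if current is None or stats["polled_at"] > current["polled_at"]:
                full[section] = stats
    return full


# Hypothetical views: one polled from our own neighbours, one from a remote section.
local_view = {
    "sec-00": {"polled_at": 10, "vaults": 8},
    "sec-01": {"polled_at": 10, "vaults": 9},
}
remote_view = {
    "sec-01": {"polled_at": 12, "vaults": 10},
    "sec-10": {"polled_at": 11, "vaults": 7},
}
network_view = merge_views([local_view, remote_view])
```

Here `network_view` covers all three sections, with the newer `sec-01` reading taken from the remote poll.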