Thoughts about naming and structure in SAFENetwork libraries.

oetyng · April 16, 2019, 11:33am

I was looking at routing lib the other day, and just happened to notice a few things that I wanted to say something about.
It’s about naming. Naming in programming is something I consider very important, because it is a powerful influencer in how the code quality evolves - it is fundamental for the developers’ deeper understanding of the domain; it guides them in their programming, or misleads them, or adds to the cognitive load. It puts foggy layers over the essential parts, or distinguishes them.

I think my fascination with naming has many levels to it. It’s not only within programming that I am striving to be very accurate, to use the most suitable words to describe things, to try avoid ambiguity (unless it’s desired, which in socialization that certainly is the case some times, but less so in programming).

To me, when a name is brought up for something in the code, I expect it to have a rationale. There is something that makes this name be appropriate, and not some other - and same for the structure and pattern in the set of names used in a context; I also expect it to make sense in relation to other names already used. It adds most value when there is a form, an idea, which determines why this particular set of things has this name, and this other set has another name. The names should be chosen as to pinpoint what it is we consider important in their difference and characteristics. When seeing the names in the code base, if the naming is consistent with what it actually does and is, you would understand so much more about what you are seeing.

The power of naming is also very strong on the destructive side. The smallest grain of ambiguity can make you very confused and require large amount of cognitive load to identify (then also confirm) just what the dissonance is about, what you have misunderstood and what the reality really is.

Good naming allows the subconscious pattern matching powers of your brain to take over a lot of that cognitive work, and you just intuitively get it.

In the end, it’s all about being able to create with as little friction as possible, and to get results that are as solid as possible.

OK, so let’s get down to business.
I have looked at a few enums that I saw when looking at the code in lib.rs in fleming_design_model/routing_model/src.

They caught my attention because they are about messaging, and as I have worked with messaging, I have come to form a specific view about naming conventions in that area. This specific view might not at all be relevant all the times for the code here, but it might be, so I will share my thoughts, in hope that maybe it can be useful.

I am most certainly unaware and ignorant on nuances and details in the specific cases, so my observations would correspondingly be of various degree of relevance. But hopefully there is at least something to continue working on generally with naming and structuring - if only broadening the discussion (which I’m sure are going on internally at various levels) in general about principles and practices, also to interested parties among community members (not sure how big that group is though, might be a very niche interest ).

Event

Events are generally something that happened, it is a fact, and are for that reason expressed in past tense.

pub enum Event {
    /// Received a request message.
    Request {
        /// The request message.
        request: Request,
        /// The source authority that sent the request.
        src: Authority<XorName>,
        /// The destination authority that receives the request.
        dst: Authority<XorName>,
    },
    /// Received a response message.
    Response {
        /// The response message.
        response: Response,
        /// The source authority that sent the response.
        src: Authority<XorName>,
        /// The destination authority that receives the response.
        dst: Authority<XorName>,
    },
    /// A node has connected to us.
    NodeAdded(XorName, RoutingTable<XorName>),
    /// A node has disconnected from us.
    NodeLost(XorName, RoutingTable<XorName>),
    /// Our own section has been split, resulting in the included `Prefix` for our new section.
    SectionSplit(Prefix<XorName>),
    /// Our own section requires merged with others, resulting in the included `Prefix` for our new
    /// section.
    SectionMerge(Prefix<XorName>),
    /// The client has successfully connected to a proxy node on the network.
    Connected,
    /// Disconnected or failed to connect - restart required.
    RestartRequired,
    /// Startup failed - terminate.
    Terminate,
    // TODO: Find a better solution for periodic tasks.
    /// This event is sent periodically every time Routing sends the `Heartbeat` messages.
    Tick,
}

Most of these adhere to the principle, but there are a couple of exceptions.
Suggested fixes:

pub enum Event {
    RequestReceived {
        ...
    },
    ResponseReceived {
        ...
    },
    SectionMerged(...),
    Terminated,
    Ticked,
}

We would by that consistently be using past tense.

Action

This maps to commands IMO, which are always imperative, meaning they tell what to do.

pub enum Action {
    NodeSendMessage {
        src: Authority<XorName>,
        dst: Authority<XorName>,
        content: UserMessage,
        priority: u8,
        result_tx: Sender<Result<(), InterfaceError>>,
    },
    ClientSendRequest {
        content: Request,
        dst: Authority<XorName>,
        priority: u8,
        result_tx: Sender<Result<(), InterfaceError>>,
    },
    Id {
        result_tx: Sender<PublicId>,
    },
    Timeout(u64),
    ResourceProofResult(PublicId, Vec<DirectMessage>),
    Terminate,
}

First two are easier to interpret. It seems to me that
Third is harder. You do say “id someone”, but is that really what this action is about? It looks like a result of some sort.
Timeout could be the imperative form, telling something to timeout.
ResourceProofResult does not look like a command at all. I don’t know what it is.
Terminate is a proper command.

Id and ResourceProofResult I cannot understand if they are actually commands, if they are not I guess they are misplaced, otherwise the naming is a bit strange.

Request

Request is a mix of command and query by the look of the enum.

pub enum Request {
    /// Represents a refresh message sent between vaults. Vec<u8> is the message content.
    Refresh(Vec<u8>, MsgId),
    /// Gets MAID account information.
    GetAccountInfo(MsgId),

    // --- ImmutableData ---
    // ==========================
    /// Puts ImmutableData to the network.
    PutIData {
        /// ImmutableData to be stored
        data: ImmutableData,
        /// Unique message identifier
        msg_id: MsgId,
    },
    /// Fetches ImmutableData from the network by the given name.
    GetIData {
        /// Network identifier of ImmutableData
        name: XorName,
        /// Unique message identifier
        msg_id: MsgId,
    },

    // --- MutableData ---
    /// Fetches whole MutableData from the network.
    /// Note: responses to this request are unlikely to accumulate during churn.
    GetMData {
        /// Network identifier of MutableData
        name: XorName,
        /// Type tag
        tag: u64,
        /// Unique message identifier
        msg_id: MsgId,
    },
    // ==========================
    /// Creates a new MutableData in the network.
    PutMData {
        /// MutableData to be stored
        data: MutableData,
        /// Unique message identifier
        msg_id: MsgId,
        /// Requester public key
        requester: PublicSignKey,
    },
    /// Fetches a latest version number.
    GetMDataVersion {
        /// Network identifier of MutableData
        name: XorName,
        /// Type tag
        tag: u64,
        /// Unique message identifier
        msg_id: MsgId,
    },
    /// Fetches the shell (everthing except the entries).
    GetMDataShell {
        /// Network identifier of MutableData
        name: XorName,
        /// Type tag
        tag: u64,
        /// Unique message identifier
        msg_id: MsgId,
    },

    // Data Actions
    /// Fetches a list of entries (keys + values).
    /// Note: responses to this request are unlikely to accumulate during churn.
    ListMDataEntries {
        /// Network identifier of MutableData
        name: XorName,
        /// Type tag
        tag: u64,
        /// Unique message identifier
        msg_id: MsgId,
    },
    /// Fetches a list of keys in MutableData.
    /// Note: responses to this request are unlikely to accumulate during churn.
    ListMDataKeys {
        /// Network identifier of MutableData
        name: XorName,
        /// Type tag
        tag: u64,
        /// Unique message identifier
        msg_id: MsgId,
    },
    /// Fetches a list of values in MutableData.
    /// Note: responses to this request are unlikely to accumulate during churn.
    ListMDataValues {
        /// Network identifier of MutableData
        name: XorName,
        /// Type tag
        tag: u64,
        /// Unique message identifier
        msg_id: MsgId,
    },
    /// Fetches a single value from MutableData
    GetMDataValue {
        /// Network identifier of MutableData
        name: XorName,
        /// Type tag
        tag: u64,
        /// Key of an entry to be fetched
        key: Vec<u8>,
        /// Unique message identifier
        msg_id: MsgId,
    },
    /// Updates MutableData entries in bulk.
    MutateMDataEntries {
        /// Network identifier of MutableData
        name: XorName,
        /// Type tag
        tag: u64,
        /// A list of mutations (inserts, updates, or deletes) to be performed
        /// on MutableData in bulk.
        actions: BTreeMap<Vec<u8>, EntryAction>,
        /// Unique message identifier
        msg_id: MsgId,
        /// Requester public key
        requester: PublicSignKey,
    },

    // Permission Actions
    /// Fetches a complete list of permissions.
    ListMDataPermissions {
        /// Network identifier of MutableData
        name: XorName,
        /// Type tag
        tag: u64,
        /// Unique message identifier
        msg_id: MsgId,
    },
    /// Fetches a list of permissions for a particular User.
    ListMDataUserPermissions {
        /// Network identifier of MutableData
        name: XorName,
        /// Type tag
        tag: u64,
        /// A user identifier used to fetch permissions
        user: User,
        /// Unique message identifier
        msg_id: MsgId,
    },
    /// Updates or inserts a list of permissions for a particular User in the given MutableData.
    SetMDataUserPermissions {
        /// Network identifier of MutableData
        name: XorName,
        /// Type tag
        tag: u64,
        /// A user identifier used to set permissions
        user: User,
        /// Permissions to be set for a user
        permissions: PermissionSet,
        /// Incremented version of MutableData
        version: u64,
        /// Unique message identifier
        msg_id: MsgId,
        /// Requester public key
        requester: PublicSignKey,
    },
    /// Deletes a list of permissions for a particular User in the given MutableData.
    DelMDataUserPermissions {
        /// Network identifier of MutableData
        name: XorName,
        /// Type tag
        tag: u64,
        /// A user identifier used to delete permissions
        user: User,
        /// Incremented version of MutableData
        version: u64,
        /// Unique message identifier
        msg_id: MsgId,
        /// Requester public key
        requester: PublicSignKey,
    },

    // Ownership Actions
    /// Changes an owner of the given MutableData. Only the current owner can perform this action.
    ChangeMDataOwner {
        /// Network identifier of MutableData
        name: XorName,
        /// Type tag
        tag: u64,
        /// A list of new owners
        new_owners: BTreeSet<PublicSignKey>,
        /// Incremented version of MutableData
        version: u64,
        /// Unique message identifier
        msg_id: MsgId,
    },

    // --- Client (Owner) to MM ---
    // ==========================
    /// Lists authorised keys and version stored in MaidManager.
    ListAuthKeysAndVersion(MsgId),
    /// Inserts an autorised key (for an app, user, etc.) to MaidManager.
    InsAuthKey {
        /// Authorised key to be inserted
        key: PublicSignKey,
        /// Incremented version
        version: u64,
        /// Unique message identifier
        msg_id: MsgId,
    },
    /// Deletes an authorised key from MaidManager.
    DelAuthKey {
        /// Authorised key to be deleted
        key: PublicSignKey,
        /// Incremented version
        version: u64,
        /// Unique message identifier
        msg_id: MsgId,
    },
}

The description “Action” is used in one place, and it makes me wonder what the distinction between a Request and an Action is. Both Request and Action seem very general in naming, but it seems these are partitioned by some more distinct logic, and I guess I would have named it in that way. [Something]Action and [SomethingElse]Request.
Also the question comes, why there is the distinct Action (which I would map to Command) when there is also the mixed Request (which maps to Commands and Queries).

InsAuthKey and DelAuthKey I would definitely call InsertAuthKey and DeleteAuthKey.
When I first saw InsAuthKey I was wondering what it might mean. I would on the other hand not blink when reading InsertAuthKey.

Members of pub enum Response are the same as Request, so omitted here.

MessageContent

This is an interesting enum.

pub enum MessageContent {
    // ---------- Internal ------------
    /// Ask the network to relocate you.
    ///
    /// This is sent by a joining node to its `NaeManager`s with the intent to become a full routing
    /// node with a new ID in an address range chosen by the `NaeManager`s.
    Relocate {
        /// The message's unique identifier.
        message_id: MessageId,
    },
    /// Notify a joining node's `NaeManager` so that it sends a `RelocateResponse`.
    ExpectCandidate {
        /// The joining node's current public ID.
        old_public_id: PublicId,
        /// The joining node's current authority.
        old_client_auth: Authority<XorName>,
        /// The message's unique identifier.
        message_id: MessageId,
    },
    /// Send our Crust connection info encrypted to a node we wish to connect to and for which we
    /// have the keys.
    ConnectionInfoRequest {
        /// Encrypted Crust connection info.
        encrypted_conn_info: Vec<u8>,
        /// The sender's public ID.
        pub_id: PublicId,
        /// The message's unique identifier.
        msg_id: MessageId,
    },
    /// Respond to a `ConnectionInfoRequest` with our Crust connection info encrypted to the
    /// requester.
    ConnectionInfoResponse {
        /// Encrypted Crust connection info.
        encrypted_conn_info: Vec<u8>,
        /// The sender's public ID.
        pub_id: PublicId,
        /// The message's unique identifier.
        msg_id: MessageId,
    },
    /// Reply with the address range into which the joining node should move.
    RelocateResponse {
        /// The interval into which the joining node should join.
        target_interval: (XorName, XorName),
        /// The section that the joining node shall connect to.
        section: (Prefix<XorName>, BTreeSet<PublicId>),
        /// The message's unique identifier.
        message_id: MessageId,
    },
    /// Inform neighbours about our new section. The payload is just a unique hash, as the actual
    /// information is included in the `SignedMessage`'s proving sections anyway.
    NeighbourInfo(Digest256),
    /// Sent from a section that signed a neighbour's section info to that neighbour.
    NeighbourConfirm(Digest256, ProofSet, Vec<(SectionInfo, ProofSet)>),
    /// Inform neighbours that we need to merge, and that the successor of the section info with
    /// the given hash will be the merged section.
    Merge(Digest256),
    /// Sent to all connected peers when our own section splits
    Ack(Ack, u8),
    /// Part of a user-facing message
    UserMessagePart {
        /// The hash of this user message.
        hash: Digest256,
        /// The unique message ID of this user message.
        msg_id: MessageId,
        /// The number of parts.
        part_count: u32,
        /// The index of this part.
        part_index: u32,
        /// The message priority.
        priority: u8,
        /// Is the message cacheable?
        cacheable: bool,
        /// The `part_index`-th part of the serialised user message.
        payload: Vec<u8>,
    },
    /// Confirm with section that the candidate is about to resource prove.
    ///
    /// Sent from the `NaeManager` to the `NaeManager`.
    AcceptAsCandidate {
        /// The joining node's current public ID.
        old_public_id: PublicId,
        /// The joining node's current authority.
        old_client_auth: Authority<XorName>,
        /// The interval into which the joining node should join.
        target_interval: (XorName, XorName),
        /// The message's unique identifier.
        message_id: MessageId,
    },
    /// Approves the joining node as a routing node.
    ///
    /// Sent from Group Y to the joining node.
    NodeApproval(GenesisPfxInfo),
}

LocalEvent

enum LocalEvent {
    TimeoutAccept,
    RelocationTrigger,
    TimeoutCheckElder,
    JoiningTimeoutResendCandidateInfo,
    JoiningTimeoutRefused,
    ComputeResourceProofForElder(Name, ProofSource),
}

We have a mix of conventions here.
Only the next to last is in past tense, JoiningTimoutRefused. The last one seems to be a result.
JoiningTimeoutResendCandidateInfo I cannot really understand by reading it, it seems to have a command baked into it? Or was it a timeout when resending candidate info and joining?
My suggestions would be:

enum LocalEvent {
AcceptTimedout,
RelocationTriggered,
CheckElderTimedout,
???
JoiningTimeoutRefused,
ElderCRPCompleted(Name, ProofSource), // CRP => ComputeResourceProof
}

ParsecVote

This here I am still not sure if a ParsecVote is for something we think should happen, or something we think we saw happened.
It looks to me like both of these exist in this enum. Since the naming is inconsistent elsewhere, I am not sure if this is because of inconsistency, or because that is how it is meant to work.
If we were used to consistency in naming, we would be able to read from the naming how it works.

enum ParsecVote {
    ExpectCandidate(Candidate),
    Online(Candidate),
    PurgeCandidate(Candidate),
    AddElderNode(Node),
    RemoveElderNode(Node),
    NewSectionInfo(SectionInfo),
    RelocationTrigger,
    RefuseCandidate(Candidate),
    RelocateResponse(Candidate, SectionInfo),
    CheckElder,
}

The majority of these are imperative though, so that seems to be the idea.
Online and NewSectionInfo are not, so I would suggest that they conform.
Not sure what the intention of the Online cmd is so I have no good suggestion.
NewSectionInfo seems to be to be a SetNewSectionInfo, RenewSectionInfo or ReplaceSectionInfo.

Rpc

This here is also a mix of various types of requests.
But to me there are two kinds of Rpcs: commands, and queries. You either request for something to happen, or you ask for some information. And that would give a consistent naming in the form of imperative for commands, and prefixing with Get, for queries.
These Rpc names seem to describe objects at time, and it is unclear what they do.

enum Rpc {
    RefuseCandidate(Candidate),
    RelocateResponse(Candidate, SectionInfo),
    RelocatedInfo(Candidate, SectionInfo),

    ExpectCandidate(Candidate),
    ResendExpectCandidate(Section, Candidate),

    ResourceProof {
        candidate: Candidate,
        source: Name,
        proof: ProofRequest,
    },
    ResourceProofReceipt {
        candidate: Candidate,
        source: Name,
    },
    NodeApproval(Candidate, GenesisPfxInfo),

    ResourceProofResponse {
        candidate: Candidate,
        destination: Name,
        proof: Proof,
    },
    CandidateInfo {
        candidate: Candidate,
        destination: Name,
        valid: bool,
    },

    ConnectionInfoRequest {
        source: Name,
        destination: Name,
        connection_info: i32,
    },
    ConnectionInfoResponse {
        source: Name,
        destination: Name,
        connection_info: i32,
    },
}

I will leave it here, because I am just scratching the surface, and I want to inspire and raise awareness for this aspect, so that it perhaps, if my observations are found useful, can be taken into account both in discussions, as well as in new code and refactors.

pierrechevalier83 · April 16, 2019, 4:36pm

Thanks a lot @oetyng! I just went over your suggestions and I’m impressed!

First of all, I agree strongly of course on all the views you expressed in the first paragraph about the fact naming matters etc. In code, it is of course crucial to express intent as best we can.

Some of our naming, as you’ve pointed out has grown a bit organically and got slightly inconsistent.

I agree with almost all the individual suggestions you’ve made. Also for some, I would maybe go a bit further. For instance, with the Rpc, it could be nested so you have Rpc::Request::FooBar and Rpc::Response::FooBaz. I see you prefer command and queries, but using Request and Response here would keep consistent with our code across multiple crates.

If you were willing to take the time and raise PRs with subsets of your points, I would be absolutely happy to discuss it with the team and merge anything we agree is sensible (Note that these names reflect the naming in the wider routing crate, so would probably want to update there too).

If you don’t have the time, we’ll simply discuss it internally and see which points we think are worth picking up and doing our own PRs for.

Thanks again for the excellent constructive feedback

Edit:
If you want to raise pull requests, please note that we’re moving the routing_model code from a branch in maidsafe/routing to its own maidsafe/routing_model repo to avoid unnecessary coupling.

oetyng · April 23, 2019, 9:43am

Thanks @pierrechevalier83 for the encouragement and I’m glad to hear that you like the suggestions!
(Regarding Rpc, that seems like a good approach I agree.)

I think unfortunately I will not be able to do the PRs right now, so probably better you go ahead. (I feel too unfamiliar with the crate to comfortably do PRs, so I would want to read up more. I certainly would with a bit more confidence in my knowledge of the code.)

pierrechevalier83 · April 23, 2019, 11:58am

FYI, @Fraser already raised a pull request to routing with many of your suggestions We’ll start there before porting the changes to the routing_model repo.