When designing the architecture of a distributed or microservice system, a commonly used pattern is to make components communicate over an asynchronous message queue, bus or streaming platform. This approach has many benefits, such as reduced coupling and improved resiliency and fault tolerance. It also creates some challenges, one of which is answering the question: "Who defines and owns the protocol that services use to exchange messages?"

The question is usually answered in one of three ways:

Protocol is a shared artifact that both consumer and producer conform to
Protocol is owned by a producer and consumer conforms to it
Protocol is owned by a consumer and producer conforms to it

I believe there is a place for all three of these patterns and, depending on the context, any of them can be a good choice. In this blog post I want to share some of the heuristics that help me make that decision.

Protocol ownership heuristics

Heuristic 1: Can I add more producers?

Let's consider an example of a company implementing a loyalty program for its customers. The company decided to implement two components that communicate over a message queue:

Transaction Ingestor - service responsible for receiving messages from a payment provider, filtering out transactions that were not completed by a member of the loyalty program, and then sending Accrue Points Requests to the Accrual Service
Accrual Service - service responsible for receiving messages from the Transaction Ingestor and accruing points for the members of the loyalty program

The answer to the question "Can I add more producers?" is likely to be yes in this case. It's plausible that the company might want to integrate with another payment provider (or replace the current one) and still be able to accrue points for the members of the program. The clue we get from this answer is that the protocol should be defined and owned by the consumer. The reason is that we can add more producers, and none of them seems like a good candidate for owning the protocol for the Accrue Points Request. On the other hand, it seems unlikely that we will have two consumers interested in handling accrual logic.
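A minimal sketch of a consumer-owned protocol, assuming two payment providers with different payload shapes. The field names and both provider payload formats are hypothetical and only illustrate that every producer adapts to the message defined by the Accrual Service:

```python
from dataclasses import dataclass
from decimal import Decimal


# Hypothetical message type, defined and owned by the Accrual Service (the consumer).
@dataclass(frozen=True)
class AccruePointsRequest:
    member_id: str
    transaction_id: str
    amount_minor_units: int  # amount in the smallest currency unit, e.g. cents
    currency: str


def from_provider_a(payload: dict) -> AccruePointsRequest:
    """Adapter for an assumed provider that already reports amounts in cents."""
    return AccruePointsRequest(
        member_id=payload["loyaltyId"],
        transaction_id=payload["txId"],
        amount_minor_units=payload["amountCents"],
        currency=payload["currency"],
    )


def from_provider_b(payload: dict) -> AccruePointsRequest:
    """Adapter for an assumed provider that reports amounts as decimal strings."""
    return AccruePointsRequest(
        member_id=payload["member"],
        transaction_id=payload["reference"],
        amount_minor_units=int(Decimal(payload["total"]) * 100),
        currency=payload["currency"],
    )
```

Adding a third provider in this sketch means writing one more adapter on the producer side, while the message definition and the Accrual Service stay untouched.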

Heuristic 2: Can I add more consumers?

Now it's time to have a look at another example of a pair of services that can be integrated using an asynchronous message queue:

Accrual Service - service publishing Points Accrued Events, which are an outcome of processing Accrue Points Requests
Promotion Engine - service listening to Points Accrued Events responsible for awarding extra benefits when certain rules are fulfilled

Given our example, the answer to the question "Can I add more consumers?" could be yes. It's quite possible that more services will be interested in Points Accrued Events; one example could be a Fraud Detection Service, responsible for detecting patterns in points accruals and preventing fraud attempts. With these clues at hand we could say that the protocol should be owned by the producer, as no other service is likely to produce messages like this, but many might be interested in consuming them.
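A minimal sketch of a producer-owned event with several consumers, using an in-memory stand-in for the message queue. The InMemoryTopic class, the event fields and both thresholds are hypothetical and only serve the illustration:

```python
from dataclasses import dataclass
from typing import Callable, List


# Hypothetical event type, defined and owned by the Accrual Service (the producer).
@dataclass(frozen=True)
class PointsAccrued:
    member_id: str
    points: int


class InMemoryTopic:
    """Stand-in for a real message queue: delivers each event to every subscriber."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[PointsAccrued], None]] = []

    def subscribe(self, handler: Callable[[PointsAccrued], None]) -> None:
        self._subscribers.append(handler)

    def publish(self, event: PointsAccrued) -> None:
        for handler in self._subscribers:
            handler(event)


bonus_awarded: List[str] = []
suspicious: List[str] = []


def promotion_engine(event: PointsAccrued) -> None:
    # Assumed rule: award a bonus for accruals of at least 100 points.
    if event.points >= 100:
        bonus_awarded.append(event.member_id)


def fraud_detection(event: PointsAccrued) -> None:
    # Assumed rule: flag unusually large accruals.
    if event.points > 10_000:
        suspicious.append(event.member_id)


topic = InMemoryTopic()
topic.subscribe(promotion_engine)
topic.subscribe(fraud_detection)
topic.publish(PointsAccrued(member_id="m-1", points=150))
topic.publish(PointsAccrued(member_id="m-2", points=50_000))
```

The Fraud Detection consumer is added here with a single subscribe call, and the producer and its event definition do not change.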

Heuristic 3: Is the protocol a shared standard?

If the heuristics discussed in the previous sections didn't help you reach a conclusion, the third option is to ask whether we are dealing with something that can be considered a shared standard. It is fairly easy to spot when an actual industry standard is used (e.g. HL7), but it might not be so obvious when creating a new design for a software system.

A shared protocol is something that's used across many services, where it's hard or impossible to pinpoint which service should be the owner. In the case of the loyalty system, it could be that any amounts (of money or points) should be transported using a fixed-point number representation. This protocol is not something that any single service should own, but at the same time it's important that all services agree on the protocol that transports such amounts.

Given the context and use cases it makes sense that for the fixed-point representation of data a shared protocol should be created and maintained independently from services depending on it. Other examples that can fall into this category: date and time formats, identifiers, health checks or metadata.
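A minimal sketch of what such a shared fixed-point amount could look like. The FixedPointAmount name and its units/scale fields are assumptions made for this example, not an established format:

```python
from dataclasses import dataclass
from decimal import Decimal


# Hypothetical shared type: an integer number of units plus a decimal scale.
@dataclass(frozen=True)
class FixedPointAmount:
    units: int  # the value multiplied by 10**scale
    scale: int  # number of decimal places carried by `units`

    def to_decimal(self) -> Decimal:
        return Decimal(self.units) / (10 ** self.scale)

    @staticmethod
    def from_decimal(value: Decimal, scale: int) -> "FixedPointAmount":
        scaled = value * (10 ** scale)
        # Refuse values that cannot be represented exactly at this scale.
        if scaled != scaled.to_integral_value():
            raise ValueError(f"{value} has more than {scale} decimal places")
        return FixedPointAmount(units=int(scaled), scale=scale)
```

For example, FixedPointAmount.from_decimal(Decimal("12.50"), scale=2) carries 1250 units at scale 2, so no service has to exchange floating-point values.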

A good shared protocol can be hard to identify at the very beginning. A good approach is to grow it organically and extract an independent artifact only when the duplication becomes painful or obvious.

A case against shared protocols

During the past couple of years, when working with various teams I've noticed a tendency to create a single shared protocol for all services. In some of the cases it was caused by the lack of experience or agreement between teams, so a shared protocol was considered a "safe middle ground". Unfortunately this approach has a couple of downsides:

A shared protocol adds a dependency during development - any change to it must be reviewed, accepted and built before the service that needs it can implement the change. If the Accrual Service owns the protocol of Points Accrued and needs to add a new field to the payload, we can easily do it as part of one changeset. If the protocol were a shared library, two separate changesets would usually be required: the first to change and publish the new protocol, the second to change the implementation.
Shared protocols tend to bloat over time - because the protocol is shared, it's easy for developers to keep adding message definitions that might not be strongly related to the work they are doing. It's often the path of least resistance, as it's easier to add to an existing definition than to extract a separate, well-scoped artifact. On the other hand, if the protocol were owned by a service itself, it would be easier to notice that a new change might not belong to it.

Implementation recommendations

Create small, well scoped protocols

Creating a small protocol owned by a single service tends to lead to a better design, as it's easier to segregate responsibilities and ensure cohesion. Smaller protocols also tend to be simpler to update, as the changesets are smaller and client services can easily notice whether the changes are relevant to them. In contrast, a single "master" protocol for a set of services makes this information harder to discover - it's not easy to say whether a change to the protocol affects a given service or not.

Use Semantic Versioning

Semantic Versioning is a standard that can be adopted to give both sides (protocol owners and users) a common understanding of what a change in a version means. Such a protocol can then be published to an artifact repository, which makes it easy to install and track.
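A minimal sketch of the kind of check that Semantic Versioning makes possible, assuming plain MAJOR.MINOR.PATCH version strings. The function names are hypothetical, and pre-release tags, build metadata and the special rules for 0.x versions are deliberately left out:

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse a plain MAJOR.MINOR.PATCH string into a tuple of integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_compatible(provided: str, required: str) -> bool:
    """A client built against `required` can use `provided` when the major
    version matches and `provided` is not older than `required`."""
    p, r = parse_version(provided), parse_version(required)
    return p[0] == r[0] and p >= r
```

With this sketch, is_compatible("1.4.0", "1.2.3") is True, while is_compatible("2.0.0", "1.2.3") is False because a major version bump signals a breaking change.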

Couple protocol version with service versions

In cases where it's possible to design a service that owns its protocol, it might be beneficial to couple the version of the service with the version of the protocol. If that's done, checking that the currently deployed version of the service supports a given protocol version becomes trivial (assuming SemVer is used). As a result, it can reduce the overhead of synchronising changes and upgrades. One way of automating this approach is to publish the new protocol every time the service itself is built and its version has been bumped.
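A minimal sketch of the decision a build pipeline could make under this approach. The function name is hypothetical, and the set of already published versions stands in for a lookup against a real artifact repository:

```python
from typing import Optional, Set


def protocol_version_to_publish(service_version: str, published: Set[str]) -> Optional[str]:
    """Return the protocol version to publish for this build, or None when the
    service version was not bumped since the last published protocol."""
    if service_version in published:
        return None
    # The protocol artifact reuses the service version, which keeps the two coupled.
    return service_version
```

A build of service version 1.1.0 against a repository that only holds 1.0.0 would publish protocol 1.1.0, while rebuilding 1.0.0 would publish nothing.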

Summary

Deciding which service (if any) should be an owner of a protocol can be a challenging task, and the set of heuristics discussed in this article can help you make this decision. As with any heuristics these are not hard-and-fast rules and should be followed only if they make sense in your context.

Irrespective of the chosen strategy, make sure that the protocol is well scoped, properly versioned, automatically built and published to an artifact repository. This will help you keep track of the versions in use and make sure that clients can be easily upgraded.