Modelling aggregates with "Aggregate Design Canvas"

Designing a good aggregate with the right boundaries and clear responsibilities is not a trivial task. Often, when I discuss various design options with people, I learn that they rely on gut feeling or implicit heuristics to guide modelling decisions. To make this design process more explicit I've decided to create an Aggregate Design Canvas. It was inspired by the Bounded Context Canvas (BCC) that I use during strategic workshops. The BCC yields great results and it felt natural to create its tactical counterpart.

Aggregate Design Canvas v1

The Aggregate Design Canvas is an attempt to capture and formalise the process I use to design and evaluate an aggregate boundary. It is intended to be used during Design-Level Event Storming (or similar) sessions and should be treated as a checklist of things to keep in mind during the modelling process. It will also allow you to compare two alternative designs and decide which one makes more sense in a given context.

An aggregate that is too big is likely to be hard to maintain, cause concurrency conflicts, and deliver poor performance. One that is too small might force you to implement many corrective policies to sync instances, which might increase the complexity of the solution. So how do you decide if an aggregate is too big or too small? If you want to learn how to create useful models then read on!

1. Name

Give your aggregate a good name. In some domains it makes sense to include the length of a cycle in the name, or some other indication of the life span of the aggregate.

2. Description

Summarise the main responsibilities and purpose of the aggregate. It’s a good idea to include the reasons why such boundaries were chosen and the tradeoffs that were made compared to other designs.

3. State Transitions

Usually the aggregate goes through explicit state transitions that impact the way it can be interacted with. Too many transitions might indicate that process boundaries weren't modelled properly and that the aggregate can be split. Very naive or simple transitions might indicate that the aggregate is anaemic and that logic was pushed out to services.

In this section of the canvas, list the possible states or draw a small transition diagram. To learn more about process-oriented aggregates check out this post by Thomas Ploch.
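If a diagram is not at hand, the same information can be written down as a transition table. The snippet below is only a minimal sketch: the `OrderState` values and the allowed transitions are hypothetical and stand in for whatever states your own aggregate has.

```python
from enum import Enum


class OrderState(Enum):
    # Hypothetical states, used only to illustrate this canvas section.
    DRAFT = "draft"
    PLACED = "placed"
    PAID = "paid"
    CANCELLED = "cancelled"


# Allowed transitions: current state -> states it may move to.
ALLOWED_TRANSITIONS = {
    OrderState.DRAFT: {OrderState.PLACED, OrderState.CANCELLED},
    OrderState.PLACED: {OrderState.PAID, OrderState.CANCELLED},
    OrderState.PAID: set(),
    OrderState.CANCELLED: set(),
}


def can_transition(current: OrderState, target: OrderState) -> bool:
    """Return True if the aggregate may move from current to target."""
    return target in ALLOWED_TRANSITIONS[current]


def count_transitions() -> int:
    """Total number of edges in the table, a rough size indicator."""
    return sum(len(targets) for targets in ALLOWED_TRANSITIONS.values())
```

Counting the edges gives a quick feeling for whether the lifecycle is getting crowded or is suspiciously trivial, although the number alone does not decide anything.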

4. Enforced Invariants & 5. Corrective Policies

One of the main jobs of the aggregate is to enforce business invariants. These invariants protect business logic, and listing the main ones in this section will make sure that you agree on the responsibilities that the aggregate has. A large number of enforced invariants can indicate high local complexity of the aggregate implementation.

Corrective policies are the ones that the aggregate is involved with because you’ve made an explicit tradeoff to give up some of the invariants. A lot of corrective policies could indicate increased complexity of the overall solution. Listing both Invariants and Corrective Policies on the canvas will make these design tradeoffs explicit. If you need a bit more information on the topic, please review the modelling business rules post from this blog.
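To make the difference concrete, here is a small sketch that contrasts the two approaches for the same rule ("never sell more seats than we have"). The names `SeatAllocation`, `InvariantViolated` and `overbooking_policy` are made up for this illustration, and cancelling the latest bookings first is just one assumed compensation strategy.

```python
from dataclasses import dataclass


class InvariantViolated(Exception):
    """Raised when a command would break an enforced invariant."""


@dataclass
class SeatAllocation:
    # Hypothetical aggregate: it owns the capacity, so it can enforce the rule.
    capacity: int
    booked: int = 0

    def book(self, seats: int) -> None:
        # Enforced invariant: never book more seats than the capacity.
        if self.booked + seats > self.capacity:
            raise InvariantViolated("not enough seats left")
        self.booked += seats


def overbooking_policy(capacity: int, bookings: list[int]) -> list[int]:
    """Corrective policy: bookings were accepted without a shared check,
    so afterwards we pick bookings to cancel (latest first) until the
    total fits the capacity. Returns the indexes of bookings to cancel."""
    to_cancel = []
    total = sum(bookings)
    for index in range(len(bookings) - 1, -1, -1):
        if total <= capacity:
            break
        total -= bookings[index]
        to_cancel.append(index)
    return to_cancel
```

In the first variant the rule lives inside one bigger aggregate and can never be broken. In the second the bookings are handled by smaller aggregates and the rule is restored after the fact, which is exactly the kind of tradeoff these two sections are meant to expose.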

6. Handled Commands & 7. Created Events

In this section you list all the commands that the aggregate is capable of handling and also all events that will be created as a result. It might be a good idea to create connectors between them in order to validate that you are not missing any of the building blocks.
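The connector check can also be done mechanically once the lists exist. The following is a rough sketch with a hypothetical basket aggregate; the command and event names are assumptions made for the example only.

```python
# Hypothetical command -> events mapping for a basket aggregate.
COMMAND_TO_EVENTS = {
    "AddItem": ["ItemAdded"],
    "RemoveItem": ["ItemRemoved"],
    "Checkout": ["BasketCheckedOut"],
}

# Every event listed in the "Created Events" section of the canvas.
CREATED_EVENTS = {"ItemAdded", "ItemRemoved", "BasketCheckedOut", "BasketExpired"}


def unconnected(command_to_events: dict, created_events: set) -> tuple:
    """Return (commands without events, events without a command)."""
    commands_without_events = sorted(
        command for command, events in command_to_events.items() if not events
    )
    # Collect every event that at least one command is connected to.
    produced = {event for events in command_to_events.values() for event in events}
    events_without_command = sorted(created_events - produced)
    return commands_without_events, events_without_command
```

Here `BasketExpired` is reported as having no connector. That does not have to be a mistake, but it is a prompt to ask what triggers it and whether a command is missing from the canvas.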

8. Throughput

The goal of this section is to estimate how likely a single aggregate instance is to be involved in concurrency conflicts (when two or more competing callers try to make changes at the same time). For each metric, estimate the average and the maximum. This will help you to reason about the outliers, as they often drive the boundary reevaluation.

The Command handling rate metric describes the rate at which the aggregate is processing new commands. The Total number of clients metric, on the other hand, says how many different clients are likely to issue these commands.

To give you an example: if an aggregate models a basket on a website, then it’s likely there will be only one client issuing commands to this basket. Compare that to an aggregate that models a conference booking system, where it’s likely we are going to have tens or hundreds of clients trying to book tickets.

Aggregate concurrency conflict chance evaluation chart

Putting these metrics on a graph will give you a rough estimate of the Concurrency conflict chance, which is what we are ultimately looking for. Plotting both Avg and Max for multiple alternatives will allow you to explicitly talk about the throughput tradeoffs. Generally speaking, aiming for a small chance of conflict will deliver a better customer experience, but will also increase the complexity of the implementation. To put it differently: bigger aggregates will have a higher chance of concurrency conflicts, but fewer policies to correct data.
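If you prefer a number over eyeballing the chart, a back-of-the-envelope model can help with the comparison. The function below is not the formula behind the chart. It is a toy estimate built on my own assumptions: commands arrive independently (a Poisson process), a single client sends its commands one after another, and `handling_seconds` is a guessed time window during which a competing write would cause a conflict.

```python
import math


def conflict_chance(commands_per_second: float, clients: int,
                    handling_seconds: float = 0.05) -> float:
    """Toy estimate of the chance that a command overlaps with another one.

    Assumptions (not taken from the canvas): arrivals are independent
    (Poisson) and a single client never conflicts with itself.
    """
    if clients <= 1:
        return 0.0
    # Share of the traffic that comes from the other clients.
    competing_rate = commands_per_second * (clients - 1) / clients
    # Probability of at least one competing arrival in the handling window.
    return 1.0 - math.exp(-competing_rate * handling_seconds)
```

Calling it with the Avg and Max estimates of each alternative, for example `conflict_chance(1, 100)` against `conflict_chance(50, 100)`, gives values you can rank. Treat them as relative indicators only, since the inputs are estimates and the model is deliberately crude.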

9. Size

The last section of the canvas will help you estimate the hypothetical size of the aggregate. In this case the size is measured in the number of events per aggregate instance. In some domains events are smaller, and in others much larger, so please keep that in mind during the evaluation.

The Event growth rate metric should estimate how many events are appended to a single aggregate instance over a given period of time. The Lifetime of an instance will tell us how long the instance is going to live and, as a consequence, how many events will be accumulated and fetched when we need to process a new command.
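The two metrics combine into a simple multiplication, which is easy to run for a few alternative boundaries. The numbers and the account example below are invented purely for illustration.

```python
def estimated_events(events_per_day: float, lifetime_days: float) -> int:
    """Events accumulated by one instance over its whole lifetime."""
    return round(events_per_day * lifetime_days)


# Hypothetical alternatives for the same domain concept.
alternatives = {
    "Account kept open for 10 years": estimated_events(20, 3650),
    "Account scoped to a 30-day billing period": estimated_events(20, 30),
}
```

With these made-up inputs the first design ends up with 73000 events per instance and the second with 600, which shows how strongly the chosen lifetime drives the size.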

Aggregate size evaluation chart

A medium or large number of events might impact the customer experience and make command handling slow. Fortunately, in most cases this can be dealt with using snapshots. Another thing to look out for is long-lived instances (potentially infinite). These might cause problems when it comes to data archiving and ever-growing streams. For that reason it’s usually a good heuristic to scope the aggregate to a specific time period (e.g. a billing period).
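For readers who have not used snapshots before, this is roughly the idea: store the state as of some version and replay only the events recorded after it. The sketch below assumes a trivial aggregate whose state is a single balance and whose events are signed amounts. `Snapshot`, `apply_event` and `rehydrate` are illustrative names, not the API of any event store.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Snapshot:
    # State of the aggregate after applying the first `version` events.
    version: int
    balance: int


def apply_event(balance: int, event: int) -> int:
    # Hypothetical event: a signed amount added to the balance.
    return balance + event


def rehydrate(events: list[int],
              snapshot: Optional[Snapshot] = None) -> tuple[int, int]:
    """Rebuild the state; returns (balance, number of events replayed)."""
    start = snapshot.version if snapshot else 0
    balance = snapshot.balance if snapshot else 0
    # Only the events recorded after the snapshot need to be applied.
    for event in events[start:]:
        balance = apply_event(balance, event)
    return balance, len(events) - start
```

Both paths produce the same state, but with a snapshot the number of replayed events stays small no matter how long the stream grows. Snapshots do not shrink the stream itself, so they do not remove the need to think about the lifetime of an instance.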

Summary

The Aggregate Design Canvas is a tool that will help you design and reason about aggregates and their boundaries. It can be used when designing new systems, redesigning existing ones, or documenting solutions that you are currently working on. I’m very much looking forward to hearing about your experience and feedback. Please give me a shout if you decide to use it and have some comments or find it helpful!