05 The Confusion Between Aggregate, Repository, and Factory Idioms #

In the previous lesson, we learned that a domain model is turned into a program design through three kinds of objects: services, entities, and value objects. We also discussed the design principles of anemic models versus rich models. However, this is not enough. We also need to consider the design of aggregates, repositories, and factories.

Design Principles of Aggregates #

Aggregates are a very important concept and design principle in domain-driven design. They express the relationships between a whole and its parts in the real world. For example, an order and its order details, a form and its form details, or an invoice and its invoice details. Taking an order as an example, in the real world, an order and its order details are actually the same thing, where the order details are an attribute of the order. However, since a relational database cannot express a one-to-many relationship within a single field, the order details must be designed as a separate table.

Nevertheless, in the design of the domain model, we restore the relationship to the real world and design it in the form of an “aggregate”. In the domain model, we design the order details as an attribute of the order. The specific code is as follows:

import java.util.Set;

public class Order {

    // The order details are held as an attribute of the order.
    private Set<Item> items;

    public void setItems(Set<Item> items) {
        this.items = items;
    }

    public Set<Item> getItems() {
        return this.items;
    }

    …

}

With this design, when creating an order, we no longer create order details separately, but create them within the order. When saving an order, we should save both the order table and the order details table together in the same transaction. When querying an order, we should retrieve both the order table and the order details table, and assemble them into a single order object. At this point, the order is treated as a whole and can be operated on without the need to separately operate on the order details.

In other words, the operations on order details are encapsulated within the order object’s design and implementation. For client programs, they just need to use the order object, including accessing the order details as attributes, without needing to worry about how it is internally implemented.
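One way to sketch this encapsulation in Java is shown below. It is an illustrative variant rather than the lesson's own implementation: the `Item` fields, the `addItem` method, and `totalCents` are assumptions introduced for the example. The order creates its own details and hands out only a read-only view of them.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical order detail; the fields are assumptions for illustration.
class Item {
    final String sku;
    final int quantity;
    final long unitPriceCents;

    Item(String sku, int quantity, long unitPriceCents) {
        this.sku = sku;
        this.quantity = quantity;
        this.unitPriceCents = unitPriceCents;
    }
}

class Order {
    private final Set<Item> items = new LinkedHashSet<>();

    // Details are created inside the order, never separately by the client.
    void addItem(String sku, int quantity, long unitPriceCents) {
        items.add(new Item(sku, quantity, unitPriceCents));
    }

    // Clients read the details as an attribute but cannot modify the set.
    Set<Item> getItems() {
        return Collections.unmodifiableSet(items);
    }

    // Logic that spans the details lives on the whole.
    long totalCents() {
        long total = 0;
        for (Item item : items) {
            total += item.quantity * item.unitPriceCents;
        }
        return total;
    }
}
```

A client only ever calls `order.addItem(...)` and `order.getItems()`; an attempt to change the returned set directly fails with `UnsupportedOperationException`.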

The following design principles are followed for aggregates:

  * When creating or updating an order, simply fill in or update the order details within the order object.
  * When saving an order, only the order object needs to be saved as a whole, without worrying about how the order data is saved, which tables it is saved to, or whether there is a transaction. All the details of saving to the database are encapsulated behind the order object.
  * When deleting an order, just delete the order object. The deletion of the order details is handled internally, and external programs do not need to be concerned with it.
  * When querying or loading an order, client programs only need to query the order object by a query statement or by ID, and the querying program automatically fills in the corresponding order details.

Aggregates reflect a relationship between a whole and its parts. It is precisely because of this relationship that operations on the parts can be encapsulated within operations on the whole. However, not every relationship between objects is a whole-part relationship, and those that are not cannot be designed as aggregates. Therefore, correctly identifying aggregate relationships becomes particularly important.

A whole-part relationship means that when the whole does not exist, the parts become meaningless. The parts belong to the whole and share its lifecycle. For example, order details can only be created once an order is created. If the order no longer exists, its order details become meaningless and should be deleted. A relationship of this kind is an aggregate relationship.

By contrast, the relationship between an order and a user is not an aggregate. The user is not created along with the order; the user already exists when the order is created. Likewise, deleting an order does not delete the user, who remains in the system afterwards.

So, is the relationship between a restaurant and a menu an aggregate relationship? The key is how the system is designed. If the system is designed such that each restaurant has its own unique menu and each menu belongs to a specific restaurant, then the restaurant and menu have an aggregate relationship. For example, each restaurant has “Kung Pao Chicken” on its menu, but each restaurant’s “Kung Pao Chicken” can be different, such as in its description, image, or price, or even in the database records. In this case, in order to query the menu, you need to first query the restaurant, and the menu becomes meaningless without the restaurant.

However, the relationship between a restaurant and a menu can also be designed in another way. That is, all menus are shared, and each restaurant only decides whether to include specific items on their menu. In this case, there is a single menu object in the system, and “Kung Pao Chicken” is just one record within this object. For each restaurant, if their menu includes “Kung Pao Chicken”, they reference this object, otherwise they don’t. In this case, the menu is no longer a part of the restaurant, and without the restaurant, this item still exists, so it is not an aggregate relationship.

Therefore, the most effective way to identify an aggregate relationship is to ask whether the parts can still exist once the whole no longer exists. If they cannot, it is an aggregate; otherwise, it is not.
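The two restaurant and menu designs discussed above can be contrasted in a small Java sketch. All class and method names here (`OwningRestaurant`, `DishCatalog`, `ReferencingRestaurant`, and so on) are hypothetical and chosen only for this illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Design A: the menu is part of the restaurant (an aggregate).
class MenuItem {
    final String name;
    final long priceCents;

    MenuItem(String name, long priceCents) {
        this.name = name;
        this.priceCents = priceCents;
    }
}

class OwningRestaurant {
    // Menu items are created and discarded together with the restaurant.
    private final List<MenuItem> menu = new ArrayList<>();

    void addMenuItem(String name, long priceCents) {
        menu.add(new MenuItem(name, priceCents));
    }

    long priceOf(String name) {
        for (MenuItem item : menu) {
            if (item.name.equals(name)) {
                return item.priceCents;
            }
        }
        throw new IllegalArgumentException("Not on this menu: " + name);
    }
}

// Design B: dishes live in a shared catalog (not an aggregate).
class DishCatalog {
    private final Map<String, Long> priceCentsByDish = new HashMap<>();

    void register(String name, long priceCents) {
        priceCentsByDish.put(name, priceCents);
    }

    boolean contains(String name) {
        return priceCentsByDish.containsKey(name);
    }
}

class ReferencingRestaurant {
    // Only references to shared dishes are stored, not the dishes themselves.
    private final List<String> offeredDishNames = new ArrayList<>();

    void offer(String dishName) {
        offeredDishNames.add(dishName);
    }

    boolean offers(String dishName) {
        return offeredDishNames.contains(dishName);
    }
}
```

In design A, two restaurants can each hold their own "Kung Pao Chicken" with different prices, and the entries have no meaning without their restaurant. In design B, the dish stays in the catalog even if every restaurant referencing it is removed.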

Aggregate Root - The Unique Entry Point for External Access #

With aggregate relationships, the parts are encapsulated within the whole, which creates a constraint that external programs cannot bypass the whole to manipulate the parts. All the operations on the parts must be done through the whole. Thus, the whole becomes the unique entry point for external access referred to as the “aggregate root”.
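As a minimal sketch of this constraint, the order below is the only way to change an order item, which lets it enforce a rule that spans all of its parts. The `MAX_TOTAL_UNITS` limit and the method names are assumptions made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class OrderItem {
    private final String sku;
    private int quantity;

    OrderItem(String sku, int quantity) {
        this.sku = sku;
        this.quantity = quantity;
    }

    String getSku() {
        return sku;
    }

    int getQuantity() {
        return quantity;
    }

    // Intended to be called only by the aggregate root.
    void setQuantity(int quantity) {
        this.quantity = quantity;
    }
}

// The aggregate root: every change to an item goes through the order.
class Order {
    private static final int MAX_TOTAL_UNITS = 100; // hypothetical business rule
    private final Map<String, OrderItem> itemsBySku = new LinkedHashMap<>();

    void addItem(String sku, int quantity) {
        if (itemsBySku.containsKey(sku)) {
            throw new IllegalArgumentException("Duplicate item: " + sku);
        }
        ensureWithinLimit(totalUnits() + quantity);
        itemsBySku.put(sku, new OrderItem(sku, quantity));
    }

    void changeQuantity(String sku, int newQuantity) {
        OrderItem item = itemsBySku.get(sku);
        if (item == null) {
            throw new IllegalArgumentException("No such item: " + sku);
        }
        // The rule is checked against the whole before the part is changed.
        ensureWithinLimit(totalUnits() - item.getQuantity() + newQuantity);
        item.setQuantity(newQuantity);
    }

    int quantityOf(String sku) {
        OrderItem item = itemsBySku.get(sku);
        return item == null ? 0 : item.getQuantity();
    }

    int totalUnits() {
        int total = 0;
        for (OrderItem item : itemsBySku.values()) {
            total += item.getQuantity();
        }
        return total;
    }

    private static void ensureWithinLimit(int units) {
        if (units > MAX_TOTAL_UNITS) {
            throw new IllegalStateException("Order exceeds " + MAX_TOTAL_UNITS + " units");
        }
    }
}
```

Because clients never hold a modifiable `OrderItem`, a rule like the unit limit only needs to be maintained in one place.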

Once the relationship between objects is designed as an aggregate, external programs can only access the aggregate root and not the other objects within the aggregate. The benefit is that when the business logic within the aggregate changes, the change stays inside the aggregate and does not affect external programs. This effectively reduces the maintenance cost of changes and improves the design quality of the system.

However, such a design is not effective in every scenario. When managing orders with operations such as add, delete, and modify, aggregation works well. But calculating sales, analyzing sales trends, and calculating sales proportions requires summarizing and analyzing a large number of order details. If every such analysis has to query the orders first in order to reach their details, the query and analysis process becomes extremely inefficient, to the point of being unusable.

Therefore, domain-driven design is usually suitable for business operations involving add, delete, and modify, but not for analysis and statistics. Within one system, the add, delete, and modify operations can adopt domain-driven design, while analysis and summary scenarios can use direct SQL queries without following the constraints of aggregates.

Implementation of Aggregation Design #

Aggregation was introduced above as a very important concept in domain-driven design: it reflects the real world, improves the quality of software design, and effectively reduces the cost of future changes. However, to truly implement aggregation in software design, two very important components are needed: the repository and the factory.

For example, now that an order is created, the order contains multiple order details, and they are aggregated. When the order creation is completed, how should it be saved to the database? Both the order table and order detail table need to be saved, and it should be done in a transaction. Who is responsible for saving and adding the transaction?

In the past, we used the anemic model: an order DAO and an order detail DAO saved the data to the database, and the order service added the transaction. This design lacks aggregation and encapsulation, making it difficult to maintain in the future. So, what should the design look like when using aggregation?

After adopting aggregation, the saving of orders and order details will be encapsulated in the order repository. In other words, after adopting domain-driven design, it is usually necessary to implement a repository to access the database. So, what is the difference between a repository and a data access layer (DAO)?

Generally, the data access layer accesses a specific table in the database, such as an order DAO for the order table, an order detail DAO for the order detail table, and a user DAO for the user table.

  * When data needs to be saved to the database, the DAO is responsible for saving, but it saves a specific table, such as the order DAO saving the order table, the order detail DAO saving the order detail table, and the user DAO saving the user table.
  * When data needs to be queried, the DAO is still used for querying, but it queries a specific table, such as the order DAO querying the order table, and the order detail DAO querying the order detail table.

So, what should we do if we want to display the user’s name when querying orders? We can create another order object and add the “user name” attribute to it. In this way, when querying the order table using the order DAO, the user table can be joined in the SQL statement to complete the data query. At this point, it will be found that two or more order objects have been awkwardly designed in the system, and the newly added order object is significantly different from the order object in the domain model, which is not intuitive enough. When the system is simple, this is not a big problem, but when the business logic of the system becomes more and more complex, it becomes difficult to read the program, and changes become more and more troublesome.

Therefore, when dealing with complex business systems, we hope that the program design can correspond well to the domain model: the program should be designed as the domain model looks. We can design the order object like this, and the code for the association design of the order object is as follows:

import java.util.List;

public class Order {

    ......

    private Long customerId;

    private Customer customer;

    private List<OrderItem> orderItems;

    /**
     * @return the customerId
     */
    public Long getCustomerId() {
        return customerId;
    }

    /**
     * @param customerId the customerId to set
     */
    public void setCustomerId(Long customerId) {
        this.customerId = customerId;
    }

    /**
     * @return the customer
     */
    public Customer getCustomer() {
        return customer;
    }

    /**
     * @param customer the customer to set
     */
    public void setCustomer(Customer customer) {
        this.customer = customer;
    }

    /**
     * @return the orderItems
     */
    public List<OrderItem> getOrderItems() {
        return orderItems;
    }

    /**
     * @param orderItems the orderItems to set
     */
    public void setOrderItems(List<OrderItem> orderItems) {
        this.orderItems = orderItems;
    }
}

![DDD 05– Golden Sentence.png](../images/Ciqc1F_AzieAT7v1AAEAFf4jjt0477.png)

As can be seen, references to the customer object and order item objects are added to the order object:

- The order object has a many-to-one relationship with the customer object, so it is implemented as an object reference.
- The order object has a one-to-many relationship with the order item object, so it is implemented as a collection of order items.

In this way, when creating an order object, the object can be filled with the customerId and the corresponding order item collection, and then handed over to the order repository for saving. During the saving process, encapsulation is performed by saving the order table and the order item table together, and a transaction is added on top of it.

Whether the relationship between objects is an aggregation matters, because it changes how they are saved. In this case, the order and order item have an aggregation relationship, so when saving the order, the order items also need to be saved in the same transaction. The order and customer do not have an aggregation relationship, so there is no need to touch the customer table when saving the order. The customer is only involved when querying: when an order is queried, the corresponding customer needs to be queried as well.

This is a complex saving process. However, by encapsulating it in the order repository, the client program does not need to know how it is saved. It only needs to set the relationships between the domain objects during modeling, for example marking a relationship as an "aggregation". This maintains consistency with the domain model and simplifies development, making future changes and maintenance easier. The design and implementation of the repository will be explained in later lessons.
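The saving behavior described above might look roughly like the following sketch. It is not the repository implementation the course presents later: the `InMemoryDatabase` class is a hypothetical stand-in for real tables, and the "transaction" is imitated by staging copies and swapping them in only when every step succeeds.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OrderItem {
    final String sku;
    final int quantity;

    OrderItem(String sku, int quantity) {
        this.sku = sku;
        this.quantity = quantity;
    }
}

class Order {
    final long id;
    final long customerId;
    final List<OrderItem> orderItems;

    Order(long id, long customerId, List<OrderItem> orderItems) {
        this.id = id;
        this.customerId = customerId;
        this.orderItems = orderItems;
    }
}

// Hypothetical stand-in for a database with an order table and an order item table.
class InMemoryDatabase {
    Map<Long, Long> orderTable = new HashMap<>();                // orderId -> customerId
    Map<Long, List<OrderItem>> orderItemTable = new HashMap<>(); // orderId -> item rows
}

class OrderRepository {
    private final InMemoryDatabase db;

    OrderRepository(InMemoryDatabase db) {
        this.db = db;
    }

    // Saves the order and its items together; the customer table is never touched.
    void save(Order order) {
        // Imitated transaction: work on staged copies of both tables.
        Map<Long, Long> stagedOrders = new HashMap<>(db.orderTable);
        Map<Long, List<OrderItem>> stagedItems = new HashMap<>(db.orderItemTable);

        stagedOrders.put(order.id, order.customerId);
        for (OrderItem item : order.orderItems) {
            if (item.quantity <= 0) {
                // Leaving here discards the staged copies, i.e. a rollback.
                throw new IllegalArgumentException("Invalid quantity for " + item.sku);
            }
        }
        stagedItems.put(order.id, new ArrayList<>(order.orderItems));

        // Commit: both tables change together or not at all.
        db.orderTable = stagedOrders;
        db.orderItemTable = stagedItems;
    }
}
```

The client only calls `repository.save(order)`. With a real database, the staging and swap would be replaced by the transaction mechanism of the data access technology in use.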

With this design, how should loading and querying be done? "Loading" means retrieving a record by its primary key ID. For example, an order is loaded by querying it with the order ID. But how does the order repository implement the loading of an order object?

First of all, the most obvious approach is to use SQL statements to query this order from the database. Unlike with a DAO, however:

  * When the order repository queries for orders, it simply queries the order table and does not involve joining other tables such as the user table;
  * After querying the order, it is encapsulated in an order object, and then the user object and order detail object are queried and filled in;
  * After filling, you will have one user object and multiple order detail objects, which need to be assembled into the order object.

At this point, the creation and assembly work is delegated to another component - the factory.

#### Factory in DDD

The factory in DDD is not the same concept as the factory in design patterns. In design patterns, to avoid a dependency between the caller and the callee, the callee is designed as multiple implementations under an interface, and these implementations are placed in a factory. The caller obtains a specific implementation from the factory by passing a key value. The factory is responsible for finding the corresponding implementation class based on the key value, creating it, and returning it to the caller, thus reducing the coupling between the caller and the callee.
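For contrast with the DDD factory, the design-pattern factory described here could be sketched as follows. The `Notifier` interface, its two implementations, and the keys are invented purely for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// The caller depends only on this interface.
interface Notifier {
    String send(String message);
}

class EmailNotifier implements Notifier {
    public String send(String message) {
        return "email:" + message;
    }
}

class SmsNotifier implements Notifier {
    public String send(String message) {
        return "sms:" + message;
    }
}

// Looks up the implementation by key, creates it, and returns it to the caller.
class NotifierFactory {
    private final Map<String, Supplier<Notifier>> registry = new HashMap<>();

    NotifierFactory() {
        registry.put("email", EmailNotifier::new);
        registry.put("sms", SmsNotifier::new);
    }

    Notifier create(String key) {
        Supplier<Notifier> supplier = registry.get(key);
        if (supplier == null) {
            throw new IllegalArgumentException("Unknown key: " + key);
        }
        return supplier.get();
    }
}
```

The caller writes `factory.create("sms").send("hi")` and never names a concrete class, which is the decoupling this kind of factory is after.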

In DDD, the factory is responsible for **creating domain objects through assembly** and is the starting point of the domain object's lifecycle. For example, if the system needs to load an order by ID:

  * The order repository will delegate this task to the order factory, which will call the order DAO, order detail DAO, and user DAO to perform the query;
  * Then, the order object, order detail object, and user object obtained are assembled. The order detail object and user object are set into the "order detail" and "user" properties of the order object respectively;
  * Finally, the order factory returns the assembled order object to the order repository.

These are the tasks that the factory needs to perform in DDD.
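The assembly steps above can be sketched like this. The DAOs are hypothetical map-backed stand-ins, one per table, and the example uses the `Customer` and `OrderItem` names from the earlier `Order` class for what the text calls the user and the order details.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class Customer {
    final long id;
    final String name;

    Customer(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

class OrderItem {
    final long orderId;
    final String sku;
    final int quantity;

    OrderItem(long orderId, String sku, int quantity) {
        this.orderId = orderId;
        this.sku = sku;
        this.quantity = quantity;
    }
}

class Order {
    final long id;
    final long customerId;
    Customer customer;
    List<OrderItem> orderItems = new ArrayList<>();

    Order(long id, long customerId) {
        this.id = id;
        this.customerId = customerId;
    }
}

// Hypothetical DAOs: each one reads exactly one "table".
class OrderDao {
    final Map<Long, Long> customerIdByOrderId = new HashMap<>();

    Order findById(long id) {
        Long customerId = customerIdByOrderId.get(id);
        return customerId == null ? null : new Order(id, customerId);
    }
}

class OrderItemDao {
    final List<OrderItem> rows = new ArrayList<>();

    List<OrderItem> findByOrderId(long orderId) {
        List<OrderItem> result = new ArrayList<>();
        for (OrderItem row : rows) {
            if (row.orderId == orderId) {
                result.add(row);
            }
        }
        return result;
    }
}

class CustomerDao {
    final Map<Long, Customer> rows = new HashMap<>();

    Customer findById(long id) {
        return rows.get(id);
    }
}

// The factory queries each table separately and assembles one order object.
class OrderFactory {
    private final OrderDao orderDao;
    private final OrderItemDao orderItemDao;
    private final CustomerDao customerDao;

    OrderFactory(OrderDao orderDao, OrderItemDao orderItemDao, CustomerDao customerDao) {
        this.orderDao = orderDao;
        this.orderItemDao = orderItemDao;
        this.customerDao = customerDao;
    }

    Order load(long orderId) {
        Order order = orderDao.findById(orderId);
        if (order == null) {
            return null;
        }
        order.customer = customerDao.findById(order.customerId);
        order.orderItems = orderItemDao.findByOrderId(orderId);
        return order;
    }
}
```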

#### Repository in DDD

However, when the order factory returns the order object to the order repository, the repository does not simply pass it on to the client program. It also acts as a cache. **In DDD, the idea behind a "repository" is that, given a sufficiently powerful server, we would not need a database at all**. **All domain objects created by the system would be stored in the repository and retrieved from it by ID whenever they are needed**.

But in reality, we do not have such a powerful server. Therefore, the repository persists domain objects to the database internally; the database is just one internal implementation of the repository, used for persistence. Another internal implementation is a cache that holds frequently used domain objects. In this way, when the client program retrieves a domain object by ID, the repository first searches in the cache:

  * If found, it is directly returned without querying the database;
  * If not found, the repository notifies the factory, the factory calls the DAO to query the database, and then assembles the result into a domain object to return to the repository.

After receiving this domain object, the repository returns it to the client program while storing it in the cache.
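A minimal sketch of this cache-first loading is shown below. The order factory is represented by a plain `LongFunction<Order>` so the example stays small; that simplification, like the class shapes, is an assumption of the sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

class Order {
    final long id;

    Order(long id) {
        this.id = id;
    }
}

class OrderRepository {
    private final Map<Long, Order> cache = new HashMap<>();
    // Stands in for the order factory, which would query the DAOs and assemble the order.
    private final LongFunction<Order> factory;

    OrderRepository(LongFunction<Order> factory) {
        this.factory = factory;
    }

    Order load(long id) {
        Order cached = cache.get(id);
        if (cached != null) {
            return cached; // found in the cache: no database access
        }
        Order loaded = factory.apply(id); // not found: ask the factory
        if (loaded != null) {
            cache.put(id, loaded); // keep it for the next request
        }
        return loaded;
    }
}
```

Loading the same ID twice reaches the factory only once. A production cache would also need eviction and invalidation when orders change, which this sketch leaves out.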

The above is the process of loading an order by ID. So how does it perform the process of **querying orders based on certain conditions**? The operation of querying orders is also delegated to the order repository.

  * The order repository first queries the order table through the order DAO, but here it only queries the order table and does not involve any joins;
  * After the order DAO queries the order table, it performs pagination and returns a page of data to the order repository;
  * At this point, the order repository hands the query result to the order factory to fill in the corresponding user and order detail objects, complete the assembly, and finally return the assembled collection of order objects to the repository.
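The conditional query flow can be sketched in the same style. The paging parameters, the `long[]` row format, and the class names are assumptions for illustration; only the customer is filled in here, and order details would be filled in the same way.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class Customer {
    final long id;
    final String name;

    Customer(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

class Order {
    final long id;
    final long customerId;
    Customer customer;

    Order(long id, long customerId) {
        this.id = id;
        this.customerId = customerId;
    }
}

class OrderDao {
    // Each row is {orderId, customerId}; a stand-in for the order table.
    final List<long[]> rows = new ArrayList<>();

    // Single-table query with pagination and no joins.
    List<Order> findPage(int pageIndex, int pageSize) {
        int from = Math.min(pageIndex * pageSize, rows.size());
        int to = Math.min(from + pageSize, rows.size());
        List<Order> page = new ArrayList<>();
        for (long[] row : rows.subList(from, to)) {
            page.add(new Order(row[0], row[1]));
        }
        return page;
    }
}

class CustomerDao {
    final Map<Long, Customer> rows = new HashMap<>();

    Customer findById(long id) {
        return rows.get(id);
    }
}

class OrderFactory {
    private final CustomerDao customerDao;

    OrderFactory(CustomerDao customerDao) {
        this.customerDao = customerDao;
    }

    // Fills in the related objects for one page of orders.
    void fill(List<Order> page) {
        for (Order order : page) {
            order.customer = customerDao.findById(order.customerId);
        }
    }
}

class OrderRepository {
    private final OrderDao orderDao;
    private final OrderFactory orderFactory;

    OrderRepository(OrderDao orderDao, OrderFactory orderFactory) {
        this.orderDao = orderDao;
        this.orderFactory = orderFactory;
    }

    List<Order> query(int pageIndex, int pageSize) {
        List<Order> page = orderDao.findPage(pageIndex, pageSize);
        orderFactory.fill(page);
        return page;
    }
}
```

Because only one page of orders is filled at a time, the cost of the extra lookups stays bounded by the page size.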

In summary, when adopting domain-driven design, database access is no longer done through bare DAOs alone, which is not a good design for complex systems. The repository and the factory add a layer of encapsulation on top of the original DAOs, covering saving, loading, and querying as well as aggregation and assembly. These operations are hidden from the client program, which can then carry out its business logic against the domain model without handling them itself. The technical threshold is reduced, and change and maintenance become simpler.

### Summary

This lesson explained a very important design concept in DDD, the aggregate, and the two components that implement it, the factory and the repository. Together they are important pillars of rich domain model design in DDD. These designs differ in many ways from the traditional DAO-based anemic model design.

  * Through aggregation, the relationship between the whole and its parts is implemented, and the client program can only operate on the whole, while operations on the parts are encapsulated in the repository and the factory;
  * The client program does not need to be concerned with the operations on the database, only operations on the repository. Operations on the cache and the database are encapsulated in the repository and the factory, thereby reducing the technical threshold and development workload of business development;
  * Data queries no longer rely on SQL joins; instead, the factory fills in and assembles the related objects. This design is more conducive to microservice design and big data optimization.

These designs help improve the design quality of software systems, reduce maintenance costs, and deal with high concurrency.

In addition, one question worth considering is that in traditional domain-driven design, each module implements its own repository and factory, which greatly increases the development workload. However, these repositories and factories are generally similar in design, leading to a large amount of duplicated code. Can we extract the commonality through abstraction and form a generic repository and factory, implemented in the underlying technical platform, thereby further reducing the development cost and technical threshold of domain-driven design? In other words, the implementation of domain-driven design also requires corresponding platform architecture support. This aspect will be further explored in the architecture design section of DDD.