13 How Does ShardingSphere Implement System Scalability with Microkernel Architecture #

We have mentioned multiple times in the course that ShardingSphere uses a microkernel architecture to achieve framework scalability. As the course progresses, we will discover that various components, such as the ConfigCenter interface for the configuration center, ShardingEncryptor for data desensitization, and the RegistryCenter interface for database governance, are also implemented with the microkernel architecture. So, what exactly is a microkernel architecture? Today we will discuss the basic principles of this architectural pattern and its application in ShardingSphere.

What is Microkernel Architecture? #

Microkernel is a typical architectural pattern, which is different from an ordinary design pattern. Architectural patterns are high-level patterns that describe a system’s structural composition, relationships, and related constraints. Microkernel architecture is widely used in open-source frameworks: not only ShardingSphere but also the popular RPC framework Dubbo implements its own microkernel architecture. So, before introducing what microkernel architecture is, it is necessary to explain why these open-source frameworks use it.

Why Use Microkernel Architecture? #

Essentially, microkernel architecture is used to improve system scalability. Scalability refers to the flexibility of a system when inevitable changes occur and the ability to balance the cost of providing such flexibility. In other words, when adding new business to the system, there is no need to change the existing components. The new business can be encapsulated in a new component to complete the overall business upgrade. We consider such a system to have good scalability.

In terms of architectural design, scalability is an eternal topic in software design. To achieve system scalability, one approach is to provide a plug-and-play mechanism to respond to changes. When an existing component in the system does not meet the requirements, we can implement a new component to replace it, and the entire process should be transparent to the system operation. We can also replace these new and old components as needed.

For example, in the next lesson, we will introduce the distributed primary key functionality provided by ShardingSphere. There may be many implementations of distributed primary keys, and the scalability at this point is that we can replace the original implementation with any new distributed primary key implementation without changing the business code that depends on distributed primary keys.

The microkernel architecture pattern provides architectural support for this kind of extensibility, and ShardingSphere achieves high scalability based on it. Before discussing how to implement the microkernel architecture, let’s briefly describe its structural composition and basic principles.

What is Microkernel Architecture? #

In terms of structural composition, microkernel architecture consists of two components: the kernel system and plugins. The kernel system typically provides the minimum functional set required for system operation, while plugins are independent components that include custom business code to enhance or extend additional business capabilities to the kernel system. In ShardingSphere, the distributed primary key mentioned earlier is a plugin, and ShardingSphere’s runtime environment constitutes the kernel system.

What exactly is a plugin here? To answer this, we need to clarify two concepts. One is the familiar API, which is the interface the system exposes to the outside world. The other is the SPI (Service Provider Interface), which is the extension point that plugins implement. As for the relationship between the two, the API is oriented towards business developers, while the SPI is oriented towards framework developers. Together, they form ShardingSphere itself.

The implementation of plug-and-play mechanisms may sound simple, but it is not easy to do. We need to consider two aspects. On one hand, we need to organize the system changes and abstract them into multiple SPI extension points. On the other hand, after implementing these SPI extension points, we need to build a concrete implementation that supports this plug-and-play mechanism to provide an SPI runtime environment.
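The split between kernel, API, and SPI can be sketched in a few lines of Java. This is only an illustrative sketch, not ShardingSphere code: MiniKernel, nextKey, UuidPlugin, and SnowflakePlugin are hypothetical names, and plugins are registered by hand here, which is the step the JDK SPI mechanism later automates.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class MiniKernelDemo {

    /** SPI: the extension point that plugin authors implement. */
    public interface KeyGenerator {
        String getType();
        String getKey();
    }

    /** Hypothetical plugin that returns a simulated key. */
    public static class UuidPlugin implements KeyGenerator {
        @Override public String getType() { return "UUID"; }
        @Override public String getKey() { return "UUIDKey"; }
    }

    /** A second hypothetical plugin for the same extension point. */
    public static class SnowflakePlugin implements KeyGenerator {
        @Override public String getType() { return "SNOWFLAKE"; }
        @Override public String getKey() { return "SnowflakeKey"; }
    }

    /** Kernel: holds the plugins and exposes the API that business code calls. */
    public static class MiniKernel {
        private final Map<String, KeyGenerator> plugins = new LinkedHashMap<>();

        // Manual registration; a real microkernel would discover plugins instead.
        public void register(KeyGenerator plugin) {
            plugins.put(plugin.getType(), plugin);
        }

        // API: the caller names a type and never touches a concrete plugin class.
        public String nextKey(String type) {
            KeyGenerator plugin = plugins.get(type);
            if (plugin == null) {
                throw new IllegalArgumentException("No plugin registered for type: " + type);
            }
            return plugin.getKey();
        }
    }

    public static void main(String[] args) {
        MiniKernel kernel = new MiniKernel();
        kernel.register(new UuidPlugin());
        kernel.register(new SnowflakePlugin());
        System.out.println(kernel.nextKey("UUID"));      // UUIDKey
        System.out.println(kernel.nextKey("SNOWFLAKE")); // SnowflakeKey
    }
}
```

Adding a third generator in this sketch means writing one new class and registering it; MiniKernel and its callers stay unchanged.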

So, how does ShardingSphere implement the microkernel architecture? Let’s take a look together.

How to Implement Microkernel Architecture? #

In fact, JDK has provided us with a way to implement microkernel architecture. This implementation focuses on how to design and implement SPI and provides some development and configuration conventions. ShardingSphere uses this specification. First, we need to design a service interface and provide different implementations as needed. Next, we will simulate the application scenario of distributed primary keys.

Following the SPI convention, create a separate project to hold the service interface and provide the interface definition. Note that the fully qualified name of this service interface is com.tianyilan.KeyGenerator, and the interface contains only one simple example method for obtaining the target primary key.

package com.tianyilan;

public interface KeyGenerator {
    String getKey();
}

Two simple implementation classes are provided for this interface: UUIDKeyGenerator based on UUID and SnowflakeKeyGenerator based on the Snowflake algorithm. To simplify the demonstration, we will directly return a simulated result here. The actual implementation process will be explained in detail in the next lesson.

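// UUIDKeyGenerator.java
package com.tianyilan;

public class UUIDKeyGenerator implements KeyGenerator {

    @Override
    public String getKey() {
        // Simulated result; no real UUID is generated here
        return "UUIDKey";
    }
}

// SnowflakeKeyGenerator.java
package com.tianyilan;

public class SnowflakeKeyGenerator implements KeyGenerator {

    @Override
    public String getKey() {
        // Simulated result; no real Snowflake ID is generated here
        return "SnowflakeKey";
    }
}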

The next step is crucial. In the META-INF/services/ directory of this code project, create a file named com.tianyilan.KeyGenerator, that is, the fully qualified name of the service interface. The file lists the fully qualified names of the two implementation classes of this interface, namely com.tianyilan.UUIDKeyGenerator and com.tianyilan.SnowflakeKeyGenerator.
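For the two implementation classes above, the provider-configuration file would look roughly like this, with one fully qualified class name per line. The comment line is optional; the JDK treats everything after a # character on a line as a comment.

```text
# META-INF/services/com.tianyilan.KeyGenerator
com.tianyilan.UUIDKeyGenerator
com.tianyilan.SnowflakeKeyGenerator
```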

After packaging this code project into a jar file, create another code project that requires this jar file and complete the following Main function as shown below.

import java.util.ServiceLoader;
import com.tianyilan.KeyGenerator;

public class Main {
    public static void main(String[] args) {
        // Discover every KeyGenerator implementation registered under META-INF/services
        ServiceLoader<KeyGenerator> generators = ServiceLoader.load(KeyGenerator.class);

        for (KeyGenerator generator : generators) {
            System.out.println(generator.getClass());
            String key = generator.getKey();
            System.out.println(key);
        }
    }
}

Now, the role of this project is a user of the SPI service. Here, the ServiceLoader utility class provided by the JDK is used to obtain all implementations of the KeyGenerator. Currently, there are two KeyGenerator implementations defined in the META-INF/services/com.tianyilan.KeyGenerator file in the jar package. When we execute this Main function, we will get the following output:

     class com.tianyilan.UUIDKeyGenerator 
     UUIDKey 
     class com.tianyilan.SnowflakeKeyGenerator 
     SnowflakeKey

If we modify the content of the META-INF/services/com.tianyilan.KeyGenerator file to remove the entry for com.tianyilan.UUIDKeyGenerator, repackage the jar, and reference it again from the user of the SPI service, then executing the Main function will only produce the output of SnowflakeKeyGenerator.
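The effect of editing the configuration file can also be reproduced without building a jar. The sketch below is an assumption-laden stand-in for the two-project setup: it writes the provider-configuration file into a temporary directory and puts that directory on a URLClassLoader. SpiConfigDemo and loadKeys are hypothetical names, and because the interface and implementations are nested classes here, their names differ from the com.tianyilan names used in the text.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

final class SpiConfigDemo {

    public interface KeyGenerator {
        String getKey();
    }

    public static class UUIDKeyGenerator implements KeyGenerator {
        @Override public String getKey() { return "UUIDKey"; }
    }

    public static class SnowflakeKeyGenerator implements KeyGenerator {
        @Override public String getKey() { return "SnowflakeKey"; }
    }

    /** Writes a provider-configuration file listing the given classes, then loads them. */
    static List<String> loadKeys(Class<?>... providers) {
        try {
            Path root = Files.createTempDirectory("spi-demo");
            Path services = Files.createDirectories(root.resolve("META-INF").resolve("services"));
            List<String> lines = new ArrayList<>();
            for (Class<?> provider : providers) {
                lines.add(provider.getName());
            }
            // The file is named after the service interface, one implementation per line.
            Files.write(services.resolve(KeyGenerator.class.getName()), lines, StandardCharsets.UTF_8);

            List<String> keys = new ArrayList<>();
            URL[] urls = {root.toUri().toURL()};
            try (URLClassLoader loader = new URLClassLoader(urls, SpiConfigDemo.class.getClassLoader())) {
                for (KeyGenerator generator : ServiceLoader.load(KeyGenerator.class, loader)) {
                    keys.add(generator.getKey());
                }
            }
            return keys;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Both implementations listed in the file
        System.out.println(loadKeys(UUIDKeyGenerator.class, SnowflakeKeyGenerator.class));
        // UUIDKeyGenerator removed from the file
        System.out.println(loadKeys(SnowflakeKeyGenerator.class));
    }
}
```

The first call should print [UUIDKey, SnowflakeKey] and the second [SnowflakeKey]: the Java code is identical in both cases, and only the content of the configuration file decides which implementations are loaded.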

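This completes the demonstration of both the SPI provider and the SPI user. To summarize, developing a microkernel architecture on top of the JDK’s SPI mechanism takes four steps: design the service interface, provide one or more implementation classes, list those classes in a file under META-INF/services named after the interface, and load the implementations with ServiceLoader.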

This example is very simple, but it is the basis for implementing the microkernel architecture in ShardingSphere. Next, let’s turn our attention to ShardingSphere and see how it applies the SPI mechanism.

How does ShardingSphere achieve extensibility based on microkernel architecture? #

The implementation of the microkernel architecture in ShardingSphere is not complicated. In essence, it is an encapsulation of the SPI mechanism in the JDK. Let’s take a look together.

Basic implementation mechanism of microkernel architecture in ShardingSphere #

We found that there is an independent project shardingsphere-spi in the root directory of the ShardingSphere source code. Obviously, from the naming, this project should include the code related to implementing SPI in ShardingSphere. We quickly browse through this project and find that there is only one interface definition and two utility classes. Let’s first take a look at this interface definition TypeBasedSPI:

public interface TypeBasedSPI { 

    // Get the type of the SPI
    String getType(); 

    // Get the properties
    Properties getProperties(); 

    // Set the properties
    void setProperties(Properties properties); 
}

From its positioning, this interface should be a top-level interface in ShardingSphere. In the previous lesson, we provided the class hierarchy of the implementation class of this interface. Next, let’s take a look at the NewInstanceServiceLoader class. From the naming, it is not difficult to imagine that the purpose of this class is similar to a ServiceLoader, used to load new target object instances:

public final class NewInstanceServiceLoader { 

    private static final Map<Class, Collection<Class<?>>> SERVICE_MAP = new HashMap<>(); 

    // Use ServiceLoader to get new instances of SPI services and register them in SERVICE_MAP
    public static <T> void register(final Class<T> service) { 
        for (T each : ServiceLoader.load(service)) { 
            registerServiceClass(service, each); 
        } 
    } 

    @SuppressWarnings("unchecked") 
    private static <T> void registerServiceClass(final Class<T> service, final T instance) { 
        Collection<Class<?>> serviceClasses = SERVICE_MAP.get(service); 
        if (null == serviceClasses) { 
            serviceClasses = new LinkedHashSet<>(); 
        } 
        serviceClasses.add(instance.getClass()); 
        SERVICE_MAP.put(service, serviceClasses); 
    } 

    @SneakyThrows 
    @SuppressWarnings("unchecked") 
    public static <T> Collection<T> newServiceInstances(final Class<T> service) { 
        Collection<T> result = new LinkedList<>(); 
        if (null == SERVICE_MAP.get(service)) { 
            return result; 
        } 
        for (Class<?> each : SERVICE_MAP.get(service)) { 
            result.add((T) each.newInstance()); 
        } 
        return result; 
    } 
}

In the code snippet above, we first see the familiar ServiceLoader.load(service) method, which is the standard way to use the JDK’s ServiceLoader utility class. We also notice that ShardingSphere uses a HashMap to store the one-to-many relationship between a service interface and its implementation classes. This acts as a cache: ServiceLoader is only used at registration time, and newServiceInstances then creates new instances from the cached classes through reflection.
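One consequence of caching classes rather than instances is that every call to newServiceInstances returns freshly constructed objects. The following is a minimal sketch of that behaviour, not the ShardingSphere class itself: ClassRegistryDemo is a hypothetical name, and implementations are registered by hand instead of being discovered through ServiceLoader.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

final class ClassRegistryDemo {

    public interface KeyGenerator {
        String getKey();
    }

    public static class UuidKeyGenerator implements KeyGenerator {
        @Override public String getKey() { return "UUIDKey"; }
    }

    // Service interface -> implementation classes (classes are cached, not instances)
    private static final Map<Class<?>, Collection<Class<?>>> SERVICE_MAP = new HashMap<>();

    static <T> void register(Class<T> service, Class<? extends T> implementation) {
        SERVICE_MAP.computeIfAbsent(service, key -> new LinkedHashSet<>()).add(implementation);
    }

    static <T> List<T> newServiceInstances(Class<T> service) {
        List<T> result = new ArrayList<>();
        Collection<Class<?>> classes = SERVICE_MAP.getOrDefault(service, Collections.emptyList());
        for (Class<?> each : classes) {
            try {
                // A new object is created from the cached class on every call
                result.add(service.cast(each.getDeclaredConstructor().newInstance()));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot instantiate " + each.getName(), e);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        register(KeyGenerator.class, UuidKeyGenerator.class);
        KeyGenerator first = newServiceInstances(KeyGenerator.class).get(0);
        KeyGenerator second = newServiceInstances(KeyGenerator.class).get(0);
        System.out.println(first == second); // false: two separate instances
    }
}
```

Because each caller gets its own instance, setting properties on one service object does not affect another, which matters for the setProperties call shown later in this lesson.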

Finally, let’s take a look at the implementation of TypeBasedSPIServiceLoader, which depends on the NewInstanceServiceLoader class introduced earlier. The following code demonstrates how to obtain a collection of service instances from NewInstanceServiceLoader and filter them by the specified type:

// Use NewInstanceServiceLoader to obtain a list of instance classes and filter them based on the type
private Collection<T> loadTypeBasedServices(final String type) {
    return Collections2.filter(NewInstanceServiceLoader.newServiceInstances(classType), new Predicate<T>() {
        @Override
        public boolean apply(final T input) {
            return type.equalsIgnoreCase(input.getType());
        }
    });
}

TypeBasedSPIServiceLoader exposes the service interface externally and sets the corresponding properties for the service instances obtained through the loadTypeBasedServices method:

// Create a service instance based on the type through SPI
public final T newService(final String type, final Properties props) {
    Collection<T> typeBasedServices = loadTypeBasedServices(type);
    if (typeBasedServices.isEmpty()) {
        throw new RuntimeException(String.format("Invalid `%s` SPI type `%s`.", classType.getName(), type));
    }
    T result = typeBasedServices.iterator().next();
    result.setProperties(props);
    return result;
}

In addition, TypeBasedSPIServiceLoader also exposes a newService method that does not require the type to be passed in. This method uses the loadFirstTypeBasedService utility method to obtain the first service instance:

// Create a service instance based on the default type through SPI
public final T newService() {
    T result = loadFirstTypeBasedService();
    result.setProperties(new Properties());
    return result;
}

private T loadFirstTypeBasedService() {
    Collection<T> instances = NewInstanceServiceLoader.newServiceInstances(classType);
    if (instances.isEmpty()) {
        throw new RuntimeException(String.format("Invalid `%s` SPI, no implementation class load from SPI.", classType.getName()));
    }
    return instances.iterator().next();
}
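The behaviour of the two newService methods can be tried out in isolation with a simplified sketch. Everything here is hypothetical rather than ShardingSphere code: TypedService only mirrors the shape of TypeBasedSPI, the FACTORIES list stands in for the classes discovered through SPI, the worker.id property is a made-up key, and Java 8 streams replace the Guava filter used in the original.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.function.Supplier;

final class TypeBasedLookupDemo {

    /** Mirrors the shape of the TypeBasedSPI interface shown earlier. */
    public interface TypedService {
        String getType();
        Properties getProperties();
        void setProperties(Properties properties);
    }

    /** Shared property handling for the two hypothetical services below. */
    public abstract static class AbstractTypedService implements TypedService {
        private Properties properties = new Properties();
        @Override public Properties getProperties() { return properties; }
        @Override public void setProperties(Properties properties) { this.properties = properties; }
    }

    public static class UuidService extends AbstractTypedService {
        @Override public String getType() { return "UUID"; }
    }

    public static class SnowflakeService extends AbstractTypedService {
        @Override public String getType() { return "SNOWFLAKE"; }
    }

    // Stand-in for SPI discovery: each supplier creates a new instance on demand.
    private static final List<Supplier<TypedService>> FACTORIES = new ArrayList<>();

    static {
        FACTORIES.add(UuidService::new);
        FACTORIES.add(SnowflakeService::new);
    }

    /** Picks the first service whose type matches (ignoring case) and applies the properties. */
    static TypedService newService(String type, Properties props) {
        TypedService result = FACTORIES.stream()
                .map(Supplier::get)
                .filter(each -> type.equalsIgnoreCase(each.getType()))
                .findFirst()
                .orElseThrow(() -> new IllegalArgumentException("Invalid SPI type `" + type + "`."));
        result.setProperties(props);
        return result;
    }

    /** Falls back to the first registered service with empty properties. */
    static TypedService newService() {
        TypedService result = FACTORIES.get(0).get();
        result.setProperties(new Properties());
        return result;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("worker.id", "1"); // hypothetical property name
        System.out.println(newService("snowflake", props).getType()); // SNOWFLAKE
        System.out.println(newService().getType());                   // UUID
    }
}
```

Note that in this sketch, as in the original code, the default service is simply whichever implementation comes first, so the result of the no-argument method depends on registration order.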

This concludes the introduction to the contents of the shardingsphere-spi code project. This part is equivalent to the plug-in runtime environment provided by ShardingSphere. Next, we will discuss the specific usage of this runtime environment based on several typical application scenarios provided by ShardingSphere.

Application of Microkernel Architecture in ShardingSphere #

SQL Parser SQLParser

We will introduce the SQLParser class in Lesson 15, which is responsible for the entire process of parsing a specific SQL statement into an abstract syntax tree. The generation of this SQLParser is handled by SQLParserFactory:

public final class SQLParserFactory {
    public static SQLParser newInstance(final String databaseTypeName, final String sql) {
        // Load all extensions by SPI
        for (SQLParserEntry each : NewInstanceServiceLoader.newServiceInstances(SQLParserEntry.class)) {
            ...
        }
    }
}

As we can see, here the TypeBasedSPIServiceLoader is not used to load instances, but a lower-level NewInstanceServiceLoader is used directly.

The SQLParserEntry interface introduced here is located in the org.apache.shardingsphere.sql.parser.spi package of the shardingsphere-sql-parser-spi project. Obviously, judging from the package name, this interface is an SPI interface. The class hierarchy of the SQLParserEntry interface contains a number of implementation classes, each corresponding to a specific database:

[Figure: Implementation class diagram for SQLParserEntry]

Let’s take a look at the shardingsphere-sql-parser-mysql code project for MySQL. We can find an org.apache.shardingsphere.sql.parser.spi.SQLParserEntry file in the META-INF/services directory:

[Figure: SPI configuration in the MySQL codebase]

Here, we can see that it points to the org.apache.shardingsphere.sql.parser.MySQLParserEntry class. We can also find an org.apache.shardingsphere.sql.parser.spi.SQLParserEntry file in the META-INF/services directory of the shardingsphere-sql-parser-oracle codebase:

[Figure: SPI configuration in the Oracle codebase]

Obviously, this should point to the org.apache.shardingsphere.sql.parser.OracleParserEntry class. By doing so, the system dynamically loads the SPI at runtime based on the classpath.

Note that in the class hierarchy of the SQLParserEntry interface, the TypeBasedSPI interface is not actually used, but the native SPI mechanism provided by the JDK is fully adopted.

Configuration Center

Next, let’s find an example that uses TypeBasedSPI, such as the ConfigCenter that represents the configuration center:

public interface ConfigCenter extends TypeBasedSPI

Obviously, the ConfigCenter interface inherits the TypeBasedSPI interface. In ShardingSphere, there are two implementation classes of the ConfigCenter interface: ApolloConfigCenter and CuratorZookeeperConfigCenter.

In the org.apache.shardingsphere.orchestration.internal.configcenter package of the sharding-orchestration-core codebase, we find the ConfigCenterServiceLoader class, which extends the previously mentioned TypeBasedSPIServiceLoader class:

public final class ConfigCenterServiceLoader extends TypeBasedSPIServiceLoader<ConfigCenter> {

    static {
        NewInstanceServiceLoader.register(ConfigCenter.class);
    }

    public ConfigCenterServiceLoader() {
        super(ConfigCenter.class);
    }

    // Load ConfigCenter based on SPI
    public ConfigCenter load(final ConfigCenterConfiguration configCenterConfig) {
        Preconditions.checkNotNull(configCenterConfig, "Config center configuration cannot be null.");
        ConfigCenter result = newService(configCenterConfig.getType(), configCenterConfig.getProperties());
        result.init(configCenterConfig);
        return result;
    }
}

So, how does it work? First, the static block of the ConfigCenterServiceLoader class calls NewInstanceServiceLoader.register(ConfigCenter.class), which uses the JDK ServiceLoader utility class to find all ConfigCenter implementations on the classpath and registers their classes.

As we can see in the load method above, the SPI instance is created based on the type using the newService method of the parent TypeBasedSPIServiceLoader class.

Taking ApolloConfigCenter as an example, let’s look at how it is used. In the META-INF/services directory of the sharding-orchestration-config-apollo codebase, there should be a configuration file named org.apache.shardingsphere.orchestration.config.api.ConfigCenter that points to the ApolloConfigCenter class:

[Figure: SPI configuration in the Apollo codebase]

The other ConfigCenter implementations are also the same. You can check the SPI configuration files in projects such as sharding-orchestration-config-zookeeper-curator.

So far, we have gained a comprehensive understanding of the microkernel architecture in ShardingSphere, which provides a high level of flexibility and pluggability. The microkernel architecture is also a common architectural pattern. In this lesson, we have introduced the characteristics and composition of this architectural pattern and provided a specific solution for implementing this architectural pattern using the SPI mechanism provided by the JDK.

ShardingSphere extensively uses the microkernel architecture to decouple the system core from the various components. We have provided specific implementation examples based on the parsing engine and configuration center. When learning these examples, it is important to understand the encapsulation mechanism of the JDK SPI in ShardingSphere.

Here’s a question for you to think about: What encapsulations does ShardingSphere make to the JDK SPI mechanism when using the microkernel architecture?

That’s all for this lesson. In the next lesson, we will continue to explore the infrastructure of ShardingSphere and provide the design principles and multiple implementation options for distributed primary keys. Make sure to tune in on time.