04 Methods for Integrating Sharding Sphere in Business Systems

04 Methods for Integrating ShardingSphere in Business Systems #

In the previous lesson, I detailed the compatibility between ShardingSphere and the JDBC specification. We know that ShardingSphere has rewritten the JDBC specification and embedded sharding mechanism. Based on this compatibility, developers can use ShardingSphere just like using the various interfaces exposed by the JDBC specification. In this lesson, we will discuss the specific ways to use ShardingSphere in a business system.

How to abstract the application methods of an open source framework? #

When we design and implement an open source framework ourselves, how do we plan its application methods? As an open source framework related to database access, ShardingSphere provides multiple dimensions of application methods. We can abstract these application methods to extract a template. This template consists of four dimensions: lower-level tools, basic specifications, development frameworks, and domain frameworks, as shown in the following figure:

2.png

Lower-level tools #

Lower-level tools refers to the target tools that the open source framework is facing or the third-party tools it depends on. These lower-level tools are often not controllable or manageable by the framework itself. The role of the framework is to add an application layer on top of these lower-level tools to encapsulate the usage of these tools.

For ShardingSphere, the lower-level tools mentioned here actually refer to relational databases. Currently, ShardingSphere supports mainstream relational databases including MySQL, Oracle, SQLServer, PostgreSQL, and any other databases that follow the SQL92 standard.

Basic specifications #

As an open source framework, it often needs to be compatible with the basic specifications that have been established in the industry. In other words, if you want the framework to be recognized by other developers, you need to consider the basic specifications that developers are currently using. For example, if you design an open source framework related to distributed tracing, it generally needs to be compatible with the open distributed tracing specification like OpenTracing.

For ShardingSphere, the involved basic specification is clear, which is the JDBC specification we detailed in the previous lesson.

Development frameworks #

Although the open source framework itself is a development framework, we usually prefer to integrate it with existing mainstream development frameworks rather than designing and implementing a completely new development framework. Currently, the most popular development framework in the Java world is the Spring family of frameworks.

ShardingSphere integrates with both Spring and Spring Boot, the two mainstream development frameworks of the Spring family. Developers familiar with these two frameworks do not need any learning costs when using ShardingSphere for development.

Domain frameworks #

For some open source frameworks, it is also necessary to consider integrating with domain frameworks in order to provide better user experience and user friendliness, which is different from the previously mentioned development frameworks applicable to any scenario. The so-called domain frameworks refer to the development frameworks that belong to the same professional domain as the designed open source framework. Business developers are accustomed to using these domain-specific development frameworks in their daily development processes, so when designing their own open source frameworks, they also need to fully consider the integration and integration with these frameworks.

For ShardingSphere, the domain frameworks refer to common ORM frameworks such as MyBatis and Hibernate. ShardingSphere provides seamless integration solutions with these domain frameworks, so developers familiar with ORM frameworks do not need any learning costs when using ShardingSphere for development.

Next, we will analyze the development support provided by the ShardingSphere framework to developers based on the previously abstracted application methods for open source frameworks.

Integration with Databases and JDBC #

Since ShardingSphere ultimately operates on relational databases and has rewritten based on the JDBC specification, the specific application is relatively simple. We only need to understand the usage of JDBC drivers and database connection pools.

JDBC Driver #

ShardingSphere supports mainstream relational databases that implement the JDBC specification such as MySQL and Oracle. When using these databases, the common practice is to specify the JDBC driver class, URL, username, and password corresponding to the specific database. Here is an example of how to specify the JDBC driver through a .yaml configuration file in a Spring Boot application using MySQL:

driverClassName: com.mysql.jdbc.Driver
url: jdbc:mysql://localhost:3306/test_database
username: root
password: root

Database Connection Pool #

The purpose of configuring the JDBC driver is to obtain a Connection for accessing the database. To improve performance, it is common practice to use a database connection pool. The connection pool stores the created Connection objects and provides them when needed.

ShardingSphere supports a variety of mainstream third-party database connection pools, including DBCP, C3P0, BoneCP, Druid, and HikariCP. When using ShardingSphere, we can create a DataSource to use a database connection pool. For example, in Spring Boot, we can use Alibaba’s DruidDataSource class to initialize a DataSource based on the Druid database connection pool in a .properties configuration file:

spring.shardingsphere.datasource.names= test_datasource
spring.shardingsphere.datasource.test_datasource.type=com.alibaba.druid.pool.DruidDataSource
spring.shardingsphere.datasource.test_datasource.driver-class-name=com.mysql.jdbc.Driver
spring.shardingsphere.datasource.test_datasource.jdbc-url=jdbc:mysql://localhost:3306/test_database
spring.shardingsphere.datasource.test_datasource.username=root
spring.shardingsphere.datasource.test_datasource.password=root

For developers using the Spring framework, they can directly inject a DruidDataSource JavaBean into the Spring container:

<bean id="test_datasource" class="com.alibaba.druid.pool.DruidDataSource" destroy-method="close">
    <property name="driverClassName" value="com.mysql.jdbc.Driver"/>
    <property name="url" value="jdbc:mysql://localhost:3306/ test_database"/>
    <property name="username" value="root"/>
    <property name="password" value="root"/>
</bean>

Integration with Development Frameworks #

From the above configuration information, you can see that ShardingSphere integrates with two mainstream development frameworks, namely Spring and Spring Boot, both of which encapsulate the JDBC specification. Of course, for scenarios where Spring framework is not used or cannot be used, ShardingSphere can also be used directly in native Java applications.

Before discussing the specific integration methods for development frameworks, let’s design a simple application scenario. Suppose there is a user table “User” in the system, which contains a large amount of data. Therefore, we plan to split it into two databases, ds0 and ds1, and then further split each database into two tables, user0 and user1:

1.png

Next, let’s see how to achieve database sharding using Java native, Spring, and Spring Boot development frameworks for this scenario.

Java Native #

If you use the Java native development approach, it means that we need to create and manage all the classes related to sharding in ShardingSphere using Java code. Unless otherwise specified, this course will default to using Maven to manage package dependencies. So first, you need to import the Maven dependency for the sharding-jdbc-core component:

<dependency>
    <groupId>org.apache.shardingsphere</groupId>
    <artifactId>sharding-jdbc-core</artifactId>
</dependency>

Next, following the usage of JDBC, you need to create implementations of interfaces such as DataSource, Connection, and Statement, and use these classes to perform specific database access operations.

Let’s first look at the Java code for creating a DataSource. Here, a utility class called DataSourceHelper is built to obtain a DruidDataSource based on Druid:

public final class DataSourceHelper{

    private static final String HOST = "localhost";
    private static final int PORT = 3306;
    private static final String USER_NAME = "root";
    private static final String PASSWORD = "root";

    public static DataSource createDataSource(final String dataSourceName) {
        DruidDataSource result = new DruidDataSource();
        result.setDriverClassName(com.mysql.jdbc.Driver.class.getName());
        result.setUrl(String.format("jdbc:mysql://%s:%s/%s", HOST, PORT, dataSourceName));
        result.setUsername(USER_NAME);
        result.setPassword(PASSWORD);
        return result;
    }
}

Since we need to create two user databases in the example, a Map is used to store the two DataSource objects:

private static Map<String, DataSource> createDataSourceMap() {
    Map<String, DataSource> result = new HashMap<>();
    result.put("ds0", DataSourceHelper.createDataSource("ds0"));
    result.put("ds1", DataSourceHelper.createDataSource("ds1"));
    return result;
}

Once we have the collection of DataSource objects with the initialized DataSource, we can then obtain the target DataSource by designing the sharding rules:

public DataSource dataSource() throws SQLException {
    // Create the sharding rule configuration
    ShardingRuleConfiguration shardingRuleConfig = new ShardingRuleConfiguration();

    // Create the table rule configuration
    TableRuleConfiguration tableRuleConfig = new TableRuleConfiguration("user", "ds${0..1}.user${0..1}");

    // Create the distributed primary key generation configuration
    Properties properties = new Properties();
    properties.setProperty("worker.id", "33");
    KeyGeneratorConfiguration keyGeneratorConfig = new KeyGeneratorConfiguration("SNOWFLAKE", "id", properties);
    tableRuleConfig.setKeyGeneratorConfig(keyGeneratorConfig);
    shardingRuleConfig.getTableRuleConfigs().add(tableRuleConfig);

    // Shard the databases based on the gender, a total of 2 databases
    shardingRuleConfig.setDefaultDatabaseShardingStrategyConfig(new InlineShardingStrategyConfiguration("sex", "ds${sex % 2}"));

    // Shard the tables based on the user ID, a total of 2 tables
    shardingRuleConfig.setDefaultTableShardingStrategyConfig(new StandardShardingStrategyConfiguration("id", "user${id % 2}"));

    // Create the specific DataSource using the factory class
    return ShardingDataSourceFactory.createDataSource(createDataSourceMap(), shardingRuleConfig, new Properties());
}

Here, various rule configuration classes in ShardingSphere are used, including sharding rule configuration, table rule configuration, and distributed primary key generation configuration. At the same time, we use the row expression in the sharding rule configuration to set specific sharding rules. The specific usage of the row expression will be explained in the next lesson. Here, we simply shard the databases based on the user’s age and shard the tables based on the user’s ID. Lastly, at the end of the method, we pass in the previously initialized collection of DataSource objects and use the factory class to create a specific target DataSource.

Once we have obtained the target DataSource, we can use the core interfaces in JDBC to execute the SQL statements passed in:

List<User> getUsers(final String sql) throws SQLException {
    List<User> result = new LinkedList<>();
    try (Connection connection = dataSource.getConnection();
         PreparedStatement preparedStatement = connection.prepareStatement(sql);
         ResultSet resultSet = preparedStatement.executeQuery()) {
        while (resultSet.next()) {
            User user= new User();
            // Code for assigning values to the User object is omitted
            result.add(user);
            }
        }
        return result;
    }

As you can see, familiar JDBC interfaces such as Connection, PreparedStatement, and ResultSet are used here to execute queries and retrieve results. The whole process is just like using regular JDBC. However, at this point, the implementation classes behind these JDBC interfaces have already embedded sharding functionality.

Spring #

If we use Spring as our development framework, the creation process of each core object in JDBC will be delegated to the Spring container. ShardingSphere seamlessly integrates with the Spring framework based on the namespace mechanism. To use this mechanism, you need to import the corresponding Maven dependency:

<dependency>
    <groupId>org.apache.shardingsphere</groupId>
    <artifactId>sharding-jdbc-spring-namespace</artifactId>
</dependency>

The namespace mechanism in Spring is essentially the addition of customized configuration items based on the XML Schema of the Spring configuration file and their parsing. Therefore, we will see a series of custom configuration items related to sharding in the XML configuration file. For example, the initialization process of DataSource is equivalent to creating a Java Bean:

<bean id="ds0" class="com.alibaba.druid.pool.DruidDataSource">
    <property name="driverClassName" value="com.mysql.jdbc.Driver"/>
    <property name="url" value="jdbc:mysql://localhost:3306/ds0"/>
    <property name="username" value="root"/>
    <property name="password" value="root"/>
</bean>

Next, we can initialize the corresponding sharding rules through a series of configuration items and finally complete the creation of the target DataSource:

<!-- Create sharding configuration -->
<sharding:inline-strategy id="databaseStrategy" sharding-column="sex" algorithm-expression="ds${sex % 2}" />

<!-- Create table sharding configuration -->
<sharding:inline-strategy id="tableStrategy" sharding-column="id" algorithm-expression="user${id % 2}" />

<!-- Create distributed primary key generation configuration -->
<bean:properties id="properties">
    <prop key="worker.id">33</prop>
</bean:properties>
<sharding:key-generator id="keyGenerator" type="SNOWFLAKE" column="id" props-ref="properties" />

<!-- Create sharding rule configuration -->
<sharding:data-source id="shardingDataSource">
    <sharding:sharding-rule data-source-names="ds0, ds1">
        <sharding:table-rules>
            <sharding:table-rule logic-table="user" actual-data-nodes="ds${0..1}.user${0..1}" database-strategy-ref="databaseStrategy" table-strategy-ref="tableStrategy" key-generator-ref="keyGenerator" />
        </sharding:table-rules>
    </sharding:sharding-rule>
</sharding:data-source>

We will discuss the content of these configuration items in detail in the next lesson.

Spring Boot #

If you are using the Spring Boot framework, you will also need to write some configuration items. In Spring Boot, there are two forms of configuration organization: .yaml files and .properties files. Here is an example of configuring DataSource using .properties files:

spring.shardingsphere.datasource.names=ds0,ds1 
spring.shardingsphere.datasource.ds0.type=com.alibaba.druid.pool.DruidDataSource
spring.shardingsphere.datasource.ds0.driver-class-name=com.mysql.jdbc.Driver
spring.shardingsphere.datasource.ds0.jdbc-url=jdbc:mysql://localhost:3306/ds0
spring.shardingsphere.datasource.ds0.username=root
spring.shardingsphere.datasource.ds0.password=root 
spring.shardingsphere.datasource.ds1.type=com.alibaba.druid.pool.DruidDataSource
spring.shardingsphere.datasource.ds1.driver-class-name=com.mysql.jdbc.Driver
spring.shardingsphere.datasource.ds1.jdbc-url=jdbc:mysql://localhost:3306/ds1
spring.shardingsphere.datasource.ds1.username=root
spring.shardingsphere.datasource.ds1.password=root

With the DataSource, we can also set the corresponding database sharding strategy, table sharding strategy, and distributed primary key generation strategy:

spring.shardingsphere.sharding.default-database-strategy.inline.sharding-column=sex
spring.shardingsphere.sharding.default-database-strategy.inline.algorithm-expression=ds$->{sex % 2} 
spring.shardingsphere.sharding.tables.user.actual-data-nodes=ds$->{0..1}.user$->{0..1}
spring.shardingsphere.sharding.tables.user.table-strategy.inline.sharding-column=id
spring.shardingsphere.sharding.tables.user.table-strategy.inline.algorithm-expression=user$->{id % 2}
spring.shardingsphere.sharding.tables.user.key-generator.column=id
spring.shardingsphere.sharding.tables.user.key-generator.type=SNOWFLAKE
spring.shardingsphere.sharding.tables.user.key-generator.props.worker.id=33

It can be seen that compared to the namespace mechanism provided by Spring, the configuration style based on Spring Boot is relatively concise and clear, and easy to understand.

Once we provide these configuration items, we can directly inject a DataSource in the application to obtain JDBC objects such as Connection. However, in the daily development process, if we use the Spring and Spring Boot development frameworks, we generally do not directly use the native JDBC interface to operate the database, but rather achieve this through integration with common ORM frameworks. Let’s take a look.

Integration with ORM Frameworks #

In the Java field, mainstream ORM frameworks can be divided into two categories. One category follows the JPA (Java Persistence API) specification, with representative frameworks like Hibernate and TopLink. The other category completely adopts a custom approach to map objects to relations, with MyBatis as a representative.

Here, taking the Spring Boot development framework as an example, I will briefly introduce the integration methods of these two ORM frameworks. With the powerful automatic configuration mechanism provided by Spring Boot, we find that it is very simple to integrate these ORM frameworks.

JPA #

To use JPA in Spring Boot, we need to add the Maven dependency for spring-boot-starter-data-jpa in the pom.xml file:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-jpa</artifactId>
</dependency>

Once the Maven dependency is added, Spring Boot will automatically import a series of utility packages such as spring-orm, hibernate-entity-manager, and spring-data-jpa. Then, we can add the JPA-related configurations in the application.properties configuration file:

spring.jpa.properties.hibernate.hbm2ddl.auto=create-drop
spring.jpa.properties.hibernate.dialect=org.hibernate.dialect.MySQL5Dialect
spring.jpa.properties.hibernate.show_sql=false

Of course, we need to define the JPA entity classes and repository interfaces in the business code, and complete the scanning for the packages in the Spring Boot startup class:

@ComponentScan("com.user.jpa")
@EntityScan(basePackages = "com.user.jpa.entity")
public class UserApplication

MyBatis #

For MyBatis, the integration steps are also similar. First, we need to add the Maven dependency:

<dependency>
    <groupId>org.mybatis.spring.boot</groupId>
    <artifactId>mybatis-spring-boot-starter</artifactId>
</dependency>

Then, since the startup of MyBatis depends on the dedicated configuration provided by the framework, usually we will organize these configurations in a separate configuration file and refer to it in the application.properties file:

mybatis.config-location=classpath:META-INF/mybatis-config.xml

In the mybatis-config.xml configuration file, it will include at least the definitions for various MyBatis mapper files:

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE configuration
        PUBLIC "-//mybatis.org//DTD Config 3.0//EN"
        "http://mybatis.org/dtd/mybatis-3-config.dtd">
<configuration>
    <mappers>
        <mapper resource="mappers/UserMapper.xml"/>
    </mappers>
</configuration>

In the Mapper file, it contains the mapping between the entities and database schema required for running MyBatis, as well as the definitions of various SQL statements for database operations.

Finally, we also need to add the scanning mechanism for packages that include entity and repository definitions in the Spring Boot startup class:

@ComponentScan("com.user.mybatis")
@MapperScan(basePackages = "com.user.mybatis.repository")
public class UserApplication

In this way, we have finished introducing the various ways to use ShardingSphere in the business system.

Summary #

As an excellent open-source framework, ShardingSphere provides various integration methods for developers to use it in business systems for database sharding and table splitting. In this lesson, we first organized the application methods that an open-source framework should have, and analyzed the specific implementation mechanisms of these application methods in ShardingSphere. It can be seen that from the JDBC specification to the Spring and Spring Boot development frameworks, and then to mainstream ORM frameworks such as JPA and MyBatis, ShardingSphere provides comprehensive integration solutions.

Here’s a question for you to think about: What tools and standards does ShardingSphere provide for integration to achieve ease of use for developers?

On the other hand, in today’s lesson, we also saw that the main way to use ShardingSphere is actually to use its configuration system to create and set various configuration items. It can be said that the configuration work is the main work of using ShardingSphere for development.