19 Routing Engine How to Integrate Various Routing Strategies and Algorithms in the Routing Process

19 Routing Engine How to Integrate Various Routing Strategies and Algorithms in the Routing Process #

In the previous lesson “18 | Routing Engine: How to implement data access sharding routing and broadcast routing?”, when introducing the ShardingRule object, we introduced the sharding strategy ShardingStrategy in the ShardingSphere routing engine, which is a core concept in the routing engine and directly affects the final routing result. Today, we will discuss this core concept.

Overall Structure of Sharding Strategy #

Let’s first take a look at the definition of the sharding strategy ShardingStrategy. ShardingStrategy is located in the org.apache.shardingsphere.core.strategy.route package of the sharding-core-common project, and its definition is as follows:

public interface ShardingStrategy {

    // Get sharding column
    Collection<String> getShardingColumns();

    // Perform sharding
    Collection<String> doSharding(Collection<String> availableTargetNames, Collection<RouteValue> shardingValues);
}

We can see that ShardingStrategy includes two core methods: one for specifying the sharding column, and the other for performing sharding and returning the target DataSource and Table. ShardingSphere provides a series of sharding strategy implementations for us, and the class hierarchy is as follows:

Drawing 0.png

ShardingStrategy implementation class diagram

If we look at the code of these specific ShardingStrategy implementation classes, we will find that each ShardingStrategy contains another core concept related to routing, which is the sharding algorithm ShardingAlgorithm. We can see that ShardingAlgorithm is an empty interface, but it includes four inherited interfaces, namely:

  • PreciseShardingAlgorithm
  • RangeShardingAlgorithm
  • ComplexKeysShardingAlgorithm
  • HintShardingAlgorithm

And these four interfaces each have a group of implementation classes. The class hierarchy of ShardingAlgorithm is as follows:

Drawing 1.png

ShardingAlgorithm sub-interfaces and implementation class diagram

Please note that there is not a one-to-one relationship between ShardingStrategy and ShardingAlgorithm. In a ShardingStrategy, multiple ShardingAlgorithms can be used to perform specific routing execution strategies. Therefore, we have the following class hierarchy relationship diagram:

Drawing 2.png

Due to the independence of sharding algorithms, ShardingSphere separates them. From a relationship perspective, the sharding strategy includes the sharding algorithm and sharding key. We can simply abstract the composition structure of the sharding strategy as follows:

Sharding Strategy = Sharding Algorithm + Sharding Key

Detailed Explanation of ShardingSphere Sharding Strategies #

In ShardingSphere, there are a total of five ShardingStrategy implementations:

  • None sharding strategy (NoneShardingStrategy)
  • Hint sharding strategy (HintShardingStrategy)
  • Standard sharding strategy (StandardShardingStrategy)
  • Complex sharding strategy (ComplexShardingStrategy)
  • Inline sharding strategy (InlineShardingStrategy)

Next, let’s discuss these ShardingStrategies one by one.

1. None Sharding Strategy (NoneShardingStrategy) #

This time we start with something simple. Let’s take a look at NoneShardingStrategy, which is a strategy that does not perform sharding. Its implementation is as follows:

public final class NoneShardingStrategy implements ShardingStrategy {

    private final Collection<String> shardingColumns = Collections.emptyList();

    @Override
    public Collection<String> doSharding(final Collection<String> availableTargetNames, final Collection<RouteValue> shardingValues) {
        return availableTargetNames;
    }
}

In NoneShardingStrategy, we can see that it directly returns the input availableTargetNames without performing any specific routing operations.

2. Hint Sharding Strategy (HintShardingStrategy) #

Next, let’s take a look at HintShardingStrategy. Recall that in the previous lesson, we used this ShardingStrategy to determine whether to perform routing based on a Hint. We know that in some scenarios, the sharding field is not determined by the SQL itself, but depends on other external conditions. In this case, we can use the SQL Hint to flexibly inject the sharding field.

For information about the concept of Hint and the application method of pre-routing, please refer to the content in the lesson “[07 | Data Sharding: How to implement database sharding, table sharding, database sharding + table sharding, and forced routing (Part 2)?]”.

Based on HintShardingStrategy, we can perform sharding strategies by relying on Hint instead of SQL parsing. The implementation of HintShardingStrategy depends on the HintShardingAlgorithm, which extends the ShardingAlgorithm interface.

Its definition is as follows, and we can see that this interface also has a doSharding method:

public interface HintShardingAlgorithm<T extends Comparable<?>> extends ShardingAlgorithm {

    // Perform sharding based on Hint information
    Collection<String> doSharding(Collection<String> availableTargetNames, HintShardingValue<T> shardingValue);
}

For Hint, because it actually directly intervenes in the execution process of SQL, it often performs direct routing based on the incoming availableTargetNames. Therefore, let’s take a look at the only implementation class DefaultHintShardingAlgorithm of the HintShardingAlgorithm interface in ShardingSphere:

public final class DefaultHintShardingAlgorithm implements HintShardingAlgorithm<Integer> {
@Override 
public Collection<String> doSharding(final Collection<String> availableTargetNames, final HintShardingValue<Integer> shardingValue) { 
    return availableTargetNames; 
} 
}

We can see that the execution of this sharding algorithm is indeed based on availableTargetNames, but it just returns directly. So for HintShardingStrategy, it actually does not perform any routing effect by default. The complete implementation of HintShardingStrategy is as follows:

public final class HintShardingStrategy implements ShardingStrategy { 

    @Getter 
    private final Collection<String> shardingColumns; 

    private final HintShardingAlgorithm shardingAlgorithm; 

    public HintShardingStrategy(final HintShardingStrategyConfiguration hintShardingStrategyConfig) { 
        Preconditions.checkNotNull(hintShardingStrategyConfig.getShardingAlgorithm(), "Sharding algorithm cannot be null."); 
        shardingColumns = new TreeSet<>(String.CASE_INSENSITIVE_ORDER); 
        // Get HintShardingAlgorithm from the configuration
        shardingAlgorithm = hintShardingStrategyConfig.getShardingAlgorithm(); 
    } 

    @SuppressWarnings("unchecked") 
    @Override 
    public Collection<String> doSharding(final Collection<String> availableTargetNames, final Collection<RouteValue> shardingValues) { 
        ListRouteValue shardingValue = (ListRouteValue) shardingValues.iterator().next(); 
        Collection<String> shardingResult = shardingAlgorithm.doSharding(availableTargetNames,
                new HintShardingValue(shardingValue.getTableName(), shardingValue.getColumnName(), shardingValue.getValues())); 
        Collection<String> result = new TreeSet<>(String.CASE_INSENSITIVE_ORDER); 
        result.addAll(shardingResult); 
        return result; 
    } 
}

We notice that in HintShardingStrategy, the construction of the shardingAlgorithm variable is done through the HintShardingStrategyConfiguration configuration class. Obviously, we can set the specific HintShardingAlgorithm through the configuration. In the daily development process, we generally need to implement a custom HintShardingAlgorithm and configure it.

《07 | 数据分片:如何实现分库、分表、分库+分表以及强制路由(下)?》 demonstrates this practice, and you can review it.

3. StandardShardingStrategy #

StandardShardingStrategy is a standard sharding strategy that provides support for =, >, <, >=, <=, IN, and BETWEEN AND operations in SQL statements.

We know that the sharding strategy is a combination of the sharding algorithm and the sharding key. For the StandardShardingStrategy, it supports only a single sharding key and provides the PreciseShardingAlgorithm and RangeShardingAlgorithm sharding algorithms.

  • PreciseShardingAlgorithm is required and is used to handle = and IN sharding operations.
  • RangeShardingAlgorithm is optional and is used to handle BETWEEN AND, >, <, >=, <= sharding operations.

Before introducing the StandardShardingStrategy, let’s discuss these two sharding algorithms involved.

(1) PreciseShardingAlgorithm

For PreciseShardingAlgorithm, this interface is used to handle = and IN sharding operations with a single key.

There are two implementation classes, PreciseModuloDatabaseShardingAlgorithm and PreciseModuloTableShardingAlgorithm, respectively. Obviously, the former is used for database-level sharding, and the latter is used for table operations. Their sharding methods are the same, that is, using the modulo operation. Take PreciseModuloDatabaseShardingAlgorithm as an example, its implementation is as follows:

public final class PreciseModuloDatabaseShardingAlgorithm implements PreciseShardingAlgorithm<Integer> { 

    @Override 
    public String doSharding(final Collection<String> availableTargetNames, final PreciseShardingValue<Integer> shardingValue) { 
        for (String each : availableTargetNames) { 
            // Perform modulus operation on the sharding value with 2 
            if (each.endsWith(shardingValue.getValue() % 2 + "")) { 
                return each; 
            } 
        } 
        throw new UnsupportedOperationException(); 
    } 
}

As we can see, here we perform a modulus calculation on PreciseShardingValue and compare it with the input availableTargetNames to determine the target database.

(2) RangeShardingAlgorithm

For RangeShardingAlgorithm, it is relatively more complicated. The RangeShardingAlgorithm also has two implementation classes: RangeModuloDatabaseShardingAlgorithm and RangeModuloTableShardingAlgorithm. Their naming and code style are very similar to the implementation classes of PreciseShardingAlgorithm.

Here, take RangeModuloDatabaseShardingAlgorithm as an example, and its implementation is as follows:

public final class RangeModuloDatabaseShardingAlgorithm implements RangeShardingAlgorithm<Integer> { 

    @Override 
    public Collection<String> doSharding(final Collection<String> availableTargetNames, final RangeShardingValue<Integer> shardingValue) { 
        Collection<String> result = new LinkedHashSet<>(availableTargetNames.size()); 
        // Determine the range of sharding based on the sharding value 
        for (Integer i = shardingValue.getValueRange().lowerEndpoint(); i <= shardingValue.getValueRange().upperEndpoint(); i++) { 
            for (String each : availableTargetNames) { 
                // Perform modulus operation on the sharding value with 2 and compare it with the target database 
                if (each.endsWith(i % 2 + "")) { 
                    result.add(each); 
                } 
            } 
        } 
        return result; 
    } 
}
Compared with PreciseModuloDatabaseShardingAlgorithm, there is an extra layer of for loop here, in which the calculation and comparison of each value from the lowerEndpoint() to the upperEndpoint() of the ValueRange are added.

(3) StandardShardingStrategy Class

After introducing the sharding algorithm, let's go back to the StandardShardingStrategy class and take a look at its doSharding method, as shown below:

```java
@Override 
public Collection<String> doSharding(final Collection<String> availableTargetNames, final Collection<RouteValue> shardingValues) { 
    RouteValue shardingValue = shardingValues.iterator().next(); 
    Collection<String> shardingResult = shardingValue instanceof ListRouteValue 
         //If the sharding value is a list, use the PreciseShardingAlgorithm
            ? doSharding(availableTargetNames, (ListRouteValue) shardingValue)
            //If the sharding value is a range, use the RangeShardingAlgorithm
            : doSharding(availableTargetNames, (RangeRouteValue) shardingValue); 
    Collection<String> result = new TreeSet<>(String.CASE_INSENSITIVE_ORDER); 
    result.addAll(shardingResult); 
    return result; 
}

It can be seen that different doSharding methods are executed based on the type of the input shardingValues. If the input is a ListRouteValue, the PreciseShardingAlgorithm is used as shown below:

private Collection<String> doSharding(final Collection<String> availableTargetNames, final ListRouteValue<?> shardingValue) { 
    Collection<String> result = new LinkedList<>(); 
    for (Comparable<?> each : shardingValue.getValues()) { 
       //Perform sharding using the PreciseShardingAlgorithm
        String target = preciseShardingAlgorithm.doSharding(availableTargetNames, new PreciseShardingValue(shardingValue.getTableName(), shardingValue.getColumnName(), each)); 
        if (null != target) { 
            result.add(target); 
        } 
    } 
    return result; 
}

If it is a RangeRouteValue, the RangeShardingAlgorithm is used as shown below:

private Collection<String> doSharding(final Collection<String> availableTargetNames, final RangeRouteValue<?> shardingValue) { 
        if (null == rangeShardingAlgorithm) { 
            throw new UnsupportedOperationException("Cannot find range sharding strategy in sharding rule."); 
        } 
        //Perform sharding using the RangeShardingAlgorithm
        return rangeShardingAlgorithm.doSharding(availableTargetNames,
                new RangeShardingValue(shardingValue.getTableName(), shardingValue.getColumnName(), shardingValue.getValueRange())); 
}

4. ComplexShardingStrategy #

In contrast to StandardShardingStrategy, which only supports single sharding keys, ShardingSphere’s website states that ComplexShardingStrategy supports multiple sharding keys.

The doSharding method of ComplexShardingStrategy is as follows:

public Collection<String> doSharding(final Collection<String> availableTargetNames, final Collection<RouteValue> shardingValues) { 
    Map<String, Collection<Comparable<?>>> columnShardingValues = new HashMap<>(shardingValues.size(), 1); 
    Map<String, Range<Comparable<?>>> columnRangeValues = new HashMap<>(shardingValues.size(), 1); 
    String logicTableName = ""; 
    for (RouteValue each : shardingValues) { 
        if (each instanceof ListRouteValue) { 
            //Construct ListRouteValue
            columnShardingValues.put(each.getColumnName(), ((ListRouteValue) each).getValues()); 
        } else if (each instanceof RangeRouteValue) { 
            //Construct RangeRouteValue
            columnRangeValues.put(each.getColumnName(), ((RangeRouteValue) each).getValueRange()); 
        } 
        logicTableName = each.getTableName(); 
    } 
    Collection<String> shardingResult = shardingAlgorithm.doSharding(availableTargetNames, new ComplexKeysShardingValue(logicTableName, columnShardingValues, columnRangeValues)); 
    Collection<String> result = new TreeSet<>(String.CASE_INSENSITIVE_ORDER); 
    result.addAll(shardingResult); 
    return result; 
}

Here, ListRouteValue and RangeRouteValue are constructed based on the input RouteValue, and then passed to the ComplexKeysShardingAlgorithm for calculation. Due to the complexity of the relationship between multiple sharding keys, ComplexShardingStrategy does not provide much encapsulation. Instead, it directly combines sharding key values and sharding operators and passes them to the sharding algorithm, which is fully implemented by the application developer, providing maximum flexibility.

Considering this, the only implementation class of ComplexKeysShardingAlgorithm in ShardingSphere, DefaultComplexKeysShardingAlgorithm, is very simple. Its code is as follows:

public final class DefaultComplexKeysShardingAlgorithm implements ComplexKeysShardingAlgorithm<Integer> { 

    @Override 
    public Collection<String> doSharding(final Collection<String> availableTargetNames, final ComplexKeysShardingValue<Integer> shardingValue) { 
        return availableTargetNames; 
    } 
}

It can be seen that DefaultComplexKeysShardingAlgorithm is the same as the implementation of NoneShardingStrategy. It means that it does nothing, and all the work needs to be designed and implemented by the developer.

5. InlineShardingStrategy #

Compared to the various sharding strategies introduced earlier, InlineShardingStrategy uses a special mechanism to implement routing.

We have used row expressions extensively in the introduction of sharding case studies, and we know that when using row expressions, we need to specify a sharding column shardingColumn and an expression like ds$->{user_id % 2}.

You may be curious about how ShardingSphere parses such expressions. Based on the variables defined by InlineShardingStrategy, we can find the answer to the question:

```markdown
// Sharding Column
private final String shardingColumn;
// Closure instance in Groovy
private final Closure<?> closure;

Apparently, ShardingSphere uses the Closure object in Groovy here. Groovy is a dynamically typed language that can be run on the JVM. It can be used for both object-oriented programming and as a pure scripting language. Using this language, we don't need to write too much code while still having access to closures and other features of dynamic languages. In terms of usage, it is basically the same as using Java code.

Based on the dynamic language features of Groovy, InlineShardingStrategy provides support for = and IN sharding operations in SQL statements, currently only supporting single sharding keys. For commonly used sharding algorithms such as ds$->{user_id % 2}, they can be used through simple configuration, avoiding tedious Java code development.

Let's go directly to the doSharding method of InlineShardingStrategy. The implementation process of this method is the same as that of the standard sharding strategy, StandardShardingStrategy. The only difference is that it needs to parse input parameters through Groovy to obtain the final routing result:

```java
private Collection<String> doSharding(final ListRouteValue shardingValue) {
    Collection<String> result = new LinkedList<>();
    for (PreciseShardingValue<?> each : transferToPreciseShardingValues(shardingValue)) {
        // Use the execute method to parse the final result
        result.add(execute(each));
    }
    return result;
}

The execute method here builds a Closure object in Groovy, sets the corresponding parsing strategy and the property to be parsed, and finally returns the parsed result:

private String execute(final PreciseShardingValue shardingValue) {
    // Build a Closure object in Groovy
    Closure<?> result = closure.rehydrate(new Expando(), null, null);
    result.setResolveStrategy(Closure.DELEGATE_ONLY);
    result.setProperty(shardingColumn, shardingValue.getValue());
    // Get the parsed result
    return result.call().toString();
}

Finally, as a summary, we need to note that all ShardingStrategy related classes are located in the org.apache.shardingsphere.core.strategy package of the sharding-core-common project:

Drawing 4.png

Package structure of ShardingStrategy related classes

All ShardingAlgorithm related classes are located in the org.apache.shardingsphere.api.sharding package of the sharding-core-api project:

Drawing 5.png

Package structure of ShardingAlgorithm related classes

As mentioned earlier, the creation of ShardingStrategy depends on ShardingStrategyConfiguration. ShardingSphere also provides a ShardingStrategyFactory class to create various concrete ShardingStrategies:

public final class ShardingStrategyFactory {

    public static ShardingStrategy newInstance(final ShardingStrategyConfiguration shardingStrategyConfig) {
        if (shardingStrategyConfig instanceof StandardShardingStrategyConfiguration) {
            return new StandardShardingStrategy((StandardShardingStrategyConfiguration) shardingStrategyConfig);
        }
        if (shardingStrategyConfig instanceof InlineShardingStrategyConfiguration) {
            return new InlineShardingStrategy((InlineShardingStrategyConfiguration) shardingStrategyConfig);
        }
        if (shardingStrategyConfig instanceof ComplexShardingStrategyConfiguration) {
            return new ComplexShardingStrategy((ComplexShardingStrategyConfiguration) shardingStrategyConfig);
        }
        if (shardingStrategyConfig instanceof HintShardingStrategyConfiguration) {
            return new HintShardingStrategy((HintShardingStrategyConfiguration) shardingStrategyConfig);
        }
        return new NoneShardingStrategy();
    }
}

The various ShardingStrategyConfigurations used here are also located in the org.apache.shardingsphere.api.sharding.strategy package of the sharding-core-api project:

Drawing 6.png

Package structure of ShardingStrategyConfiguration related classes

In this way, by explaining the route engine, we have touched upon a large number of source code in ShardingSphere.

So far, the content about the ShardingSphere route engine is basically introduced. As a summary, we have added the content of ShardingStrategy and ShardingAlgorithm to the timing diagram given in “Lesson 17 | Route Engine: How to Understand the Operating Mechanism of ShardingRouter, the Core Class of Sharding Engine?”:

Drawing 7.png

From Source Code Analysis to Daily Development #

In the process of designing software systems, when facing complex business scenarios, separation of concerns is always a design point to consider. ShardingSphere’s design and implementation of sharding strategy nicely illustrates this point.

The sharding strategy in ShardingSphere is actually a complex concept, but by separating the specific sharding algorithms and abstracting the ShardingAlgorithm interface, and establishing a flexible one-to-many relationship between ShardingStrategy and ShardingAlgorithm, we can better grasp the class hierarchy of the entire sharding strategy system. This separation of concerns mechanism can also be applied to daily development processes.

Summary and Preview #

Following the previous lesson, today we have comprehensively introduced the five sharding strategies and four sharding algorithms in ShardingSphere, as well as their combinations.

Drawing 9.png

The routing process in the ShardingSphere route engine relies on the functionality of these sharding strategies and sharding algorithms. Of course, as a highly extensible open-source framework, based on our own business needs, we can implement specific sharding algorithms and embed them into specific sharding strategies.

Here is a question for you to think about: How do sharding strategies and sharding algorithms collaborate in ShardingSphere? Feel free to discuss it with everyone in the comments, and I will provide feedback and answers.

Based on the route engine, the next lesson will enter another core stage of the ShardingSphere sharding engine, which is the rewrite engine.