32 Plus Lesson 2 Take You Through Some Important Concepts of Java 8 Course Part Two

In the previous lesson, several examples already used the Stream API in basic ways. Today, I will give you a detailed introduction to this complex and powerful API.

The Stream API is used for projection, transformation, filtering, sorting, and other operations on collections. Furthermore, these operations can be chained together in a manner similar to SQL statements, greatly simplifying the code. It can be said that Stream operations are the most important part of Java 8 and are used in most of the code in this course.

A word of warning first: some of the examples may not be easy to understand, so I suggest that you refer to the source code and check the method definitions of the Stream operations, as well as the code comments in the JDK.

Detailed Explanation of Stream Operations #

In order to facilitate your understanding of various operations of Stream and the following examples, I have summarized the Stream operations involved in this lesson in a diagram. You can familiarize yourself with it first.

[Figure: overview of the Stream operations covered in this lesson]

In the following discussion, I will use an order scenario and show how various Stream APIs implement functions such as order statistics, search, and query. We will go through the Stream operations together. You can follow the examples by reading the comments in the code, or run the source code yourself to observe the output.

First, we define an Order class, an OrderItem class, and a Customer class as the data structure for the subsequent demo code:

// Order class

@Data
public class Order {

    private Long id;

    private Long customerId; // Customer ID

    private String customerName; // Customer name

    private List<OrderItem> orderItemList; // Order item details

    private Double totalPrice; // Total price

    private LocalDateTime placedAt; // Order placed time

}

// OrderItem class

@Data
@AllArgsConstructor
@NoArgsConstructor
public class OrderItem {

    private Long productId; // Product ID

    private String productName; // Product name

    private Double productPrice; // Product price

    private Integer productQuantity; // Product quantity

}

// Customer class

@Data
@AllArgsConstructor
public class Customer {

    private Long id;

    private String name; // Customer name

}

Here, we have an orders field of type List<Order> that stores some mock data. I won’t include the code for generating the mock data here, but that won’t affect your understanding of the code that follows. You can also download the source code to read it yourself.

Creating Streams #

To use streams, we need to create them first. There are generally five ways to create a stream:

  1. Convert a List or an array into a stream using the stream method.
  2. Create a stream by directly passing multiple elements using the Stream.of method.
  3. Use the Stream.iterate method to construct an infinite stream using iteration, and then limit the number of stream elements using limit.
  4. Use the Stream.generate method to construct an infinite stream by providing an element supplier from the outside, and then limit the number of stream elements using limit.
  5. Construct streams of primitive types using IntStream or DoubleStream.
// Convert a List or an array into a stream using the stream method

@Test
public void stream() {
    Arrays.asList("a1", "a2", "a3").stream().forEach(System.out::println);
    Arrays.stream(new int[]{1, 2, 3}).forEach(System.out::println);
}

// Create a stream by directly passing multiple elements using the Stream.of method

@Test
public void of() {
    String[] arr = {"a", "b", "c"};
    Stream.of(arr).forEach(System.out::println);
    Stream.of("a", "b", "c").forEach(System.out::println);
    Stream.of(1, 2, "a").map(item -> item.getClass().getName()).forEach(System.out::println);
}

// Use the Stream.iterate method to construct an infinite stream using iteration, and then limit the number of stream elements using limit

@Test
public void iterate() {
    Stream.iterate(2, item -> item * 2).limit(10).forEach(System.out::println);
    Stream.iterate(BigInteger.ZERO, n -> n.add(BigInteger.TEN)).limit(10).forEach(System.out::println);
}

// Use the Stream.generate method to construct an infinite stream by providing an element supplier from the outside, and then limit the number of stream elements using limit

@Test
public void generate() {
    Stream.generate(() -> "test").limit(3).forEach(System.out::println);
    Stream.generate(Math::random).limit(10).forEach(System.out::println);
}
// Creating streams of primitive types using IntStream or DoubleStream

@Test
public void primitive() {
    // Demonstrating IntStream and DoubleStream

    IntStream.range(1, 3).forEach(System.out::println);

    IntStream.range(0, 3).mapToObj(i -> "x").forEach(System.out::println);

    IntStream.rangeClosed(1, 3).forEach(System.out::println);

    DoubleStream.of(1.1, 2.2, 3.3).forEach(System.out::println);

    // Various type conversions, with output results as comments

    System.out.println(IntStream.of(1, 2).toArray().getClass()); // class [I

    System.out.println(Stream.of(1, 2).mapToInt(Integer::intValue).toArray().getClass()); // class [I

    System.out.println(IntStream.of(1, 2).boxed().toArray().getClass()); // class [Ljava.lang.Object;

    System.out.println(IntStream.of(1, 2).asDoubleStream().toArray().getClass()); // class [D

    System.out.println(IntStream.of(1, 2).asLongStream().toArray().getClass()); // class [J

    // Note the difference between streams of primitive types and boxed streams

    Arrays.asList("a", "b", "c").stream()   // Stream<String>
            .mapToInt(String::length)       // IntStream
            .asLongStream()                 // LongStream
            .mapToDouble(x -> x / 10.0)     // DoubleStream
            .boxed()                        // Stream<Double>
            .mapToLong(x -> 1L)             // LongStream
            .mapToObj(x -> "")              // Stream<String>
            .collect(Collectors.toList());
}

filter #

The filter method performs filtering, similar to the SQL WHERE clause. By chaining several filter calls, a single statement can query all orders placed in the past six months with a total amount greater than 40:

// Orders with a total amount greater than 40 in the past six months

orders.stream()
        .filter(Objects::nonNull) // Filter out null values
        .filter(order -> order.getPlacedAt().isAfter(LocalDateTime.now().minusMonths(6))) // Orders in the past six months
        .filter(order -> order.getTotalPrice() > 40) // Orders with a total amount greater than 40
        .forEach(System.out::println);  

If we didn’t use streams, we would need an intermediate collection to hold the filtered results, and all the filtering conditions would be piled up together, resulting in longer and less readable code.
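To make that comparison concrete, here is a minimal sketch that runs the same query both ways. SimpleOrder is a hypothetical, trimmed-down stand-in for the Order class above (only the two fields the filter needs), and the sample data is made up for illustration.

```java
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

class FilterComparisonSketch {

    // Hypothetical stand-in for Order: only the fields the filter needs
    static class SimpleOrder {
        final double totalPrice;
        final LocalDateTime placedAt;

        SimpleOrder(double totalPrice, LocalDateTime placedAt) {
            this.totalPrice = totalPrice;
            this.placedAt = placedAt;
        }
    }

    // Loop version: an intermediate list plus one combined condition
    static List<SimpleOrder> withLoop(List<SimpleOrder> orders, LocalDateTime since) {
        List<SimpleOrder> result = new ArrayList<>();
        for (SimpleOrder order : orders) {
            if (order != null && order.placedAt.isAfter(since) && order.totalPrice > 40) {
                result.add(order);
            }
        }
        return result;
    }

    // Stream version: one filter call per condition
    static List<SimpleOrder> withStream(List<SimpleOrder> orders, LocalDateTime since) {
        return orders.stream()
                .filter(Objects::nonNull)
                .filter(order -> order.placedAt.isAfter(since))
                .filter(order -> order.totalPrice > 40)
                .collect(Collectors.toList());
    }

    // Made-up sample data; only the first entry satisfies every condition
    static List<SimpleOrder> sampleOrders(LocalDateTime now) {
        return Arrays.asList(
                new SimpleOrder(100.0, now.minusMonths(1)),   // recent and large
                new SimpleOrder(10.0, now.minusMonths(1)),    // recent but too small
                new SimpleOrder(100.0, now.minusMonths(12)),  // large but too old
                null);                                        // null entry
    }

    public static void main(String[] args) {
        LocalDateTime now = LocalDateTime.now();
        LocalDateTime since = now.minusMonths(6);
        System.out.println(withLoop(sampleOrders(now), since).size());
        System.out.println(withStream(sampleOrders(now), since).size());
    }
}
```

Both methods select the same orders; the stream version simply keeps each condition on its own line.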

map #

The map operation can be used for transformation or projection, similar to the SQL SELECT clause. To illustrate, I will count the quantity of all products in the orders using two different approaches. The first approach uses two nested iterations, while the second approach nests two mapToLong + sum calls:

// Count the quantity of all products in the orders

// Approach with two iterations

LongAdder longAdder = new LongAdder();

orders.stream().forEach(order ->
        order.getOrderItemList().forEach(orderItem -> longAdder.add(orderItem.getProductQuantity())));

// Approach with two mapToLong+sum methods

assertThat(longAdder.longValue(), is(orders.stream().mapToLong(order ->
        order.getOrderItemList().stream()
                .mapToLong(OrderItem::getProductQuantity).sum()).sum()));

Clearly, the second approach doesn’t need the intermediate variable longAdder and is more intuitive.

I would like to add that generating data with for loops is a common task, and it is used extensively in this course. With IntStream and mapToObj, a single statement can replace the for loop and generate a List of 10 Product elements:

// Transforming IntStream into Stream<Product>

System.out.println(IntStream.rangeClosed(1,10)
        .mapToObj(i->new Product((long)i, "product"+i, i*100.0))
        .collect(toList()));
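The Product class is not shown in this lesson, so the following sketch assumes a hypothetical version with id, name, and price fields. It places the for loop and the IntStream version side by side to show that they build the same list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ProductGenerationSketch {

    // Hypothetical Product class; the field names are assumptions
    static class Product {
        final Long id;
        final String name;
        final Double price;

        Product(Long id, String name, Double price) {
            this.id = id;
            this.name = name;
            this.price = price;
        }
    }

    // Classic for loop with a mutable list
    static List<Product> withLoop(int count) {
        List<Product> products = new ArrayList<>();
        for (int i = 1; i <= count; i++) {
            products.add(new Product((long) i, "product" + i, i * 100.0));
        }
        return products;
    }

    // IntStream.rangeClosed supplies the index; mapToObj turns each int into a Product
    static List<Product> withStream(int count) {
        return IntStream.rangeClosed(1, count)
                .mapToObj(i -> new Product((long) i, "product" + i, i * 100.0))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(withLoop(10).size());
        System.out.println(withStream(10).get(9).name);
    }
}
```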

flatMap #

Next, let’s look at flatMap, which combines a map step with a flatten step: it replaces each element with a stream and then flattens those streams into a single stream.

For example, if we want to calculate the total price of all orders, there are two ways to do it:

  1. Flatten the orders into a single stream of order items with flatMap (each Order is replaced by a Stream of its OrderItem objects), use mapToDouble to multiply quantity by price for each item, and then sum the results.
  2. Use flatMapToDouble to flatten each order directly into a DoubleStream of item subtotals, and then sum them.

// Directly flatten order items and calculate the total price

System.out.println(orders.stream()
        .flatMap(order -> order.getOrderItemList().stream())
        .mapToDouble(item -> item.getProductQuantity() * item.getProductPrice()).sum());

// Another approach using flatMap+mapToDouble=flatMapToDouble

System.out.println(orders.stream()
        .flatMapToDouble(order ->
                order.getOrderItemList()
                        .stream().mapToDouble(item -> item.getProductQuantity() * item.getProductPrice()))
        .sum());

Both approaches produce the same result; there is no essential difference between them.
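To see the difference between map and flatMap in isolation, here is a small sketch on nested lists of integers. The data and the class name are made up for illustration; flatMapToInt plays the same role here that flatMapToDouble plays in the order example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

class FlatMapSketch {

    // Made-up nested data: three inner lists, one of them empty
    static List<List<Integer>> nested() {
        return Arrays.asList(
                Arrays.asList(1, 2),
                Arrays.asList(3),
                Collections.<Integer>emptyList());
    }

    // map produces exactly one output element per input element
    static List<Integer> sizesWithMap(List<List<Integer>> lists) {
        return lists.stream()
                .map(List::size)
                .collect(Collectors.toList());
    }

    // flatMap replaces each inner list with its stream and concatenates the streams
    static List<Integer> flattened(List<List<Integer>> lists) {
        return lists.stream()
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    // flatMapToInt flattens straight into a primitive IntStream
    static int sum(List<List<Integer>> lists) {
        return lists.stream()
                .flatMapToInt(inner -> inner.stream().mapToInt(Integer::intValue))
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(sizesWithMap(nested()));
        System.out.println(flattened(nested()));
        System.out.println(sum(nested()));
    }
}
```

Note that the empty inner list contributes nothing to the flattened stream, while map still emits one element for it.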

sorted #

The sorted operation can be used for in-line sorting scenarios, similar to the order by clause in SQL. For example, to retrieve the top 5 orders with a total price greater than 50, you can specify the sorting criteria by referring to the Order::getTotalPrice method and reverse the order using reversed():

// Retrieve the top 5 orders with a total price greater than 50
orders.stream()
    .filter(order -> order.getTotalPrice() > 50)
    .sorted(comparing(Order::getTotalPrice).reversed())
    .limit(5)
    .forEach(System.out::println);
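The same filter, sort, and limit pipeline can be tried without the mock data. The sketch below assumes a hypothetical PricedOrder class with only an id and a total price, and returns the IDs of the top orders instead of printing them.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class TopOrdersSketch {

    // Hypothetical stand-in for Order with just an id and a total price
    static class PricedOrder {
        private final Long id;
        private final Double totalPrice;

        PricedOrder(Long id, Double totalPrice) {
            this.id = id;
            this.totalPrice = totalPrice;
        }

        Long getId() {
            return id;
        }

        Double getTotalPrice() {
            return totalPrice;
        }
    }

    // IDs of the n most expensive orders whose total price exceeds minPrice
    static List<Long> topIds(List<PricedOrder> orders, double minPrice, int n) {
        return orders.stream()
                .filter(order -> order.getTotalPrice() > minPrice)
                .sorted(Comparator.comparing(PricedOrder::getTotalPrice).reversed())
                .limit(n)
                .map(PricedOrder::getId)
                .collect(Collectors.toList());
    }

    // Made-up sample data
    static List<PricedOrder> sample() {
        return Arrays.asList(
                new PricedOrder(1L, 30.0),
                new PricedOrder(2L, 80.0),
                new PricedOrder(3L, 60.0),
                new PricedOrder(4L, 120.0));
    }

    public static void main(String[] args) {
        System.out.println(topIds(sample(), 50, 2));
    }
}
```

Placing limit after sorted matters: limiting first would keep the first n orders in encounter order rather than the n most expensive ones.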

distinct #

The distinct operation is used for deduplication, similar to the distinct keyword in SQL. For example, the following code can be used to:

  • Retrieve the distinct customers who have placed orders. It uses the map function to extract the customer names from the orders and then applies distinct:
// Distinct customers who have placed orders
System.out.println(orders.stream()
    .map(order -> order.getCustomerName())
    .distinct()
    .collect(joining(",")));
  • Retrieve the distinct product names that have been purchased. It uses flatMap+map to extract all product names from the orders and then applies distinct:
// All distinct products that have been purchased
System.out.println(orders.stream()
    .flatMap(order -> order.getOrderItemList().stream())
    .map(OrderItem::getProductName)
    .distinct()
    .collect(joining(",")));
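As a self-contained illustration of both patterns, the sketch below applies distinct to plain strings. The names and product lists are made up; on an ordered stream such as one created from a List, distinct keeps the first occurrence of each element, as determined by equals.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class DistinctSketch {

    // map-style case: deduplicate a flat list of names
    static String distinctNames(List<String> names) {
        return names.stream()
                .distinct()
                .collect(Collectors.joining(","));
    }

    // flatMap-style case: flatten per-order product lists, then deduplicate
    static String distinctProducts(List<List<String>> productsPerOrder) {
        return productsPerOrder.stream()
                .flatMap(List::stream)
                .distinct()
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        System.out.println(distinctNames(Arrays.asList("Alice", "Bob", "Alice", "Carol", "Bob")));
        System.out.println(distinctProducts(Arrays.asList(
                Arrays.asList("pen", "ink"),
                Arrays.asList("ink", "paper"))));
    }
}
```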

skip & limit #

The skip and limit operations are used for pagination, similar to the limit clause in SQL. skip is used to skip a certain number of items, while limit is used to limit the total number of items. For example, the following code snippets:

  • Retrieve the customer names and order times of the first 2 orders, sorted by the order time.
// Retrieve the customer names and order times of the first 2 orders, sorted by the order time
orders.stream()
    .sorted(comparing(Order::getPlacedAt))
    .map(order -> order.getCustomerName() + "@" + order.getPlacedAt())
    .limit(2)
    .forEach(System.out::println);
  • Retrieve the customer names and order times of the 3rd and 4th orders, sorted by the order time.
// Retrieve the customer names and order times of the 3rd and 4th orders, sorted by the order time
orders.stream()
    .sorted(comparing(Order::getPlacedAt))
    .map(order -> order.getCustomerName() + "@" + order.getPlacedAt())
    .skip(2)
    .limit(2)
    .forEach(System.out::println);
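The two snippets above are page 1 and page 2 of the same query with a page size of 2. A minimal sketch of a generic helper, assuming 1-based page numbers (the page method and its parameters are hypothetical, not part of the course code):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class PaginationSketch {

    // Returns one page of an already sorted list; pageNumber starts at 1.
    // A pageNumber below 1 would make skip throw IllegalArgumentException.
    static <T> List<T> page(List<T> sorted, int pageNumber, int pageSize) {
        return sorted.stream()
                .skip((long) (pageNumber - 1) * pageSize) // drop the earlier pages
                .limit(pageSize)                          // keep at most one page
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "b", "c", "d", "e");
        System.out.println(page(items, 1, 2));
        System.out.println(page(items, 2, 2));
        System.out.println(page(items, 3, 2));
    }
}
```

The last page may be shorter than pageSize, and a page past the end is simply empty, because skip and limit both tolerate running out of elements.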

You can see that these 6 operations can be achieved in one line of code using the Stream approach, but if implemented using non-Stream approach, it would require several lines or even a dozen lines of code.

I have summarized some commonly used static methods of the Collectors class in a figure to help you organize your thoughts:

![img](../images/5af5ba60d7af2c8780b69bc6c71cf3de.png)

Among them, groupingBy and partitioningBy are more complex, so let me give you some examples.

groupBy #

groupingBy performs grouping and aggregation, similar to the GROUP BY clause in SQL. Like partitioningBy, introduced later, it is a special collector that is passed to the terminal collect operation. Grouping operations are more complex, so I have prepared 8 examples to help you understand them better:

The first example groups by customer name, uses the Collectors.counting method to count the number of orders for each person, and then outputs them in descending order by the number of orders.

The second example groups by customer name, uses the Collectors.summingDouble method to calculate the total order amount, and then outputs them in descending order by the total amount.

The third example groups by customer name, uses the Collectors.summingInt method twice to calculate the total quantity of products purchased, and then outputs them in descending order by the total quantity.

The fourth example finds the most purchased product. It first uses flatMap to turn orders into order items, then groups by product name (the Key) and uses Collectors.summingInt to total the purchased quantity (the Value). Finally, it sorts the Map entries in descending order by Value, takes the first Entry, and reads its Key to get the best-selling product.

The fifth example calculates the most purchased product in the same way as the fourth example. Instead of sorting the Map, this time we directly use the Collectors.maxBy collector to obtain the largest Entry.

The sixth example groups by customer name and finds the order with the highest amount for each customer. The Key is the customer name, and the Value is the Order. It uses Collectors.maxBy to obtain the order with the highest amount, wraps it in collectingAndThen to unwrap the resulting Optional with Optional.get, and finally iterates over the Key/Value pairs to print the result.

The seventh example groups order IDs into lists by order year and month. The Key is the order date formatted as year and month, and the Value is a List of order IDs built with the Collectors.mapping method.

The eighth example groups order IDs into lists by order year and month, and then by customer name. Compared to the previous example, it adds a second level of grouping, which is based on customer name.

```java
// Group by customer name and count the number of orders
System.out.println(orders.stream().collect(groupingBy(Order::getCustomerName, counting()))
        .entrySet().stream().sorted(Map.Entry.<String, Long>comparingByValue().reversed()).collect(toList()));

// Group by customer name and calculate the total order amount
System.out.println(orders.stream().collect(groupingBy(Order::getCustomerName, summingDouble(Order::getTotalPrice)))
        .entrySet().stream().sorted(Map.Entry.<String, Double>comparingByValue().reversed()).collect(toList()));

// Group by customer name and calculate the total quantity of products purchased
System.out.println(orders.stream().collect(groupingBy(Order::getCustomerName,
        summingInt(order -> order.getOrderItemList().stream()
                .collect(summingInt(OrderItem::getProductQuantity)))))
        .entrySet().stream().sorted(Map.Entry.<String, Integer>comparingByValue().reversed()).collect(toList()));

// Calculate the most popular product, and get the first one after sorting in reverse order
orders.stream()
        .flatMap(order -> order.getOrderItemList().stream())
        .collect(groupingBy(OrderItem::getProductName, summingInt(OrderItem::getProductQuantity)))
        .entrySet().stream()
        .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
        .map(Map.Entry::getKey)
        .findFirst()
        .ifPresent(System.out::println);

// Calculate the most popular product in another way, directly use maxBy
orders.stream()
        .flatMap(order -> order.getOrderItemList().stream())
        .collect(groupingBy(OrderItem::getProductName, summingInt(OrderItem::getProductQuantity)))
        .entrySet().stream()
        .collect(maxBy(Map.Entry.comparingByValue()))
        .map(Map.Entry::getKey)
        .ifPresent(System.out::println);

// Group by customer name and select the order with the highest total amount for each user
orders.stream()
        .collect(groupingBy(Order::getCustomerName, collectingAndThen(maxBy(comparingDouble(Order::getTotalPrice)), Optional::get)))
        .forEach((k, v) -> System.out.println(k + "#" + v.getTotalPrice() + "@" + v.getPlacedAt()));

// Group and count the order ID lists according to the order year and month
System.out.println(orders.stream().collect
        (groupingBy(order -> order.getPlacedAt().format(DateTimeFormatter.ofPattern("yyyyMM")),
                mapping(order -> order.getId(), toList()))));

// Group and count the order ID lists according to the order year and month + customer name
System.out.println(orders.stream().collect
        (groupingBy(order -> order.getPlacedAt().format(DateTimeFormatter.ofPattern("yyyyMM")),
                groupingBy(order -> order.getCustomerName(),
                        mapping(order -> order.getId(), toList())))));
```

If you wrote these operations as regular Java code without streams, implementing them could take dozens of lines.
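For readers who want to experiment without the mock data, here is a compact sketch of two of the patterns above: counting per group with descending output, and two-level grouping with mapping. MiniOrder and its sample data are hypothetical, and the year and month are stored as a preformatted string to keep the example short.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupingSketch {

    // Hypothetical stand-in for Order; yearMonth is already formatted as yyyyMM
    static class MiniOrder {
        private final Long id;
        private final String customer;
        private final String yearMonth;

        MiniOrder(Long id, String customer, String yearMonth) {
            this.id = id;
            this.customer = customer;
            this.yearMonth = yearMonth;
        }

        Long getId() { return id; }
        String getCustomer() { return customer; }
        String getYearMonth() { return yearMonth; }
    }

    // Customer names ordered by their number of orders, highest first
    static List<String> customersByOrderCount(List<MiniOrder> orders) {
        return orders.stream()
                .collect(Collectors.groupingBy(MiniOrder::getCustomer, Collectors.counting()))
                .entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    // Order IDs grouped first by year and month, then by customer name
    static Map<String, Map<String, List<Long>>> idsByMonthAndCustomer(List<MiniOrder> orders) {
        return orders.stream().collect(
                Collectors.groupingBy(MiniOrder::getYearMonth,
                        Collectors.groupingBy(MiniOrder::getCustomer,
                                Collectors.mapping(MiniOrder::getId, Collectors.toList()))));
    }

    // Made-up data with distinct counts per customer, so the sort has no ties
    static List<MiniOrder> sample() {
        return Arrays.asList(
                new MiniOrder(1L, "Ann", "202401"),
                new MiniOrder(2L, "Bob", "202401"),
                new MiniOrder(3L, "Ann", "202401"),
                new MiniOrder(4L, "Cy", "202402"),
                new MiniOrder(5L, "Ann", "202402"),
                new MiniOrder(6L, "Bob", "202402"));
    }

    public static void main(String[] args) {
        System.out.println(customersByOrderCount(sample()));
        System.out.println(idsByMonthAndCustomer(sample()));
    }
}
```

The second argument of groupingBy is itself a collector, which is why grouping levels and aggregations can be nested freely.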

partitionBy #

partitioningBy is used for partitioning, a special form of grouping with only two groups: true and false. For example, we can partition customers by whether they have placed orders. The partitioningBy method takes a Predicate as the partitioning criterion and produces a Map<Boolean, List<T>>. Its single-argument overload is defined in the JDK as follows:

public static <T> Collector<T, ?, Map<Boolean, List<T>>> partitioningBy(Predicate<? super T> predicate) {
    return partitioningBy(predicate, toList());
}

Let’s test it out. By combining partitioningBy with anyMatch, we can partition customers into two groups: those who have placed orders and those who haven’t.

// Partition customers based on whether they have placed orders
System.out.println(Customer.getData().stream().collect(
        partitioningBy(customer -> orders.stream().mapToLong(Order::getCustomerId)
                .anyMatch(id -> id == customer.getId()))));
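Customer.getData() belongs to the course source code and is not shown here, so the following sketch reproduces the idea with plain strings. The method name and the sample names are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class PartitionSketch {

    // Splits customer names by whether they appear in the set of customers with orders
    static Map<Boolean, List<String>> byHasOrders(List<String> customers, Set<String> customersWithOrders) {
        return customers.stream()
                .collect(Collectors.partitioningBy(customersWithOrders::contains));
    }

    public static void main(String[] args) {
        List<String> customers = Arrays.asList("Ann", "Bob", "Cy");
        Set<String> withOrders = new HashSet<>(Arrays.asList("Ann", "Cy"));
        Map<Boolean, List<String>> result = byHasOrders(customers, withOrders);
        System.out.println(result.get(true));
        System.out.println(result.get(false));
    }
}
```

Unlike a groupingBy result, the map here can be read with get(true) and get(false) directly; an empty partition shows up as an empty list rather than a missing key.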

Key Review #

Today, I have used a considerable amount of space and many examples to introduce specific Stream operations. Some of the examples may be difficult to understand, so I suggest you refer to the source code to see the method definitions of these operations and the code comments in the JDK.

Finally, I suggest you think about what information you need to calculate using SQL in your daily work, and whether these SQL queries can also be rewritten using Stream. The Stream API is extensive and profound, but there are patterns to follow. The key is to understand the functional interface definitions of these API parameters, so that we can understand whether we need to provide, consume, or transform data. To master the Stream methods, it is necessary to test and practice more to reinforce memory and deepen understanding.
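As one way to read the point about functional interfaces, the sketch below names each parameter type explicitly: a Supplier provides data, a Predicate tests it, a Function transforms it, and a Consumer consumes it. The pipeline itself is made up purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.Supplier;
import java.util.stream.Stream;

class FunctionalShapesSketch {

    static List<String> pipeline() {
        AtomicInteger counter = new AtomicInteger();
        List<String> sink = new ArrayList<>();

        Supplier<Integer> provide = counter::incrementAndGet;   // provides: no input, one output
        Predicate<Integer> keepEven = n -> n % 2 == 0;           // tests: one input, boolean output
        Function<Integer, String> transform = n -> "item" + n;  // transforms: one input, one output
        Consumer<String> consume = sink::add;                    // consumes: one input, no output

        Stream.generate(provide)   // generate takes a Supplier
                .limit(6)
                .filter(keepEven)  // filter takes a Predicate
                .map(transform)    // map takes a Function
                .forEach(consume); // forEach takes a Consumer
        return sink;
    }

    public static void main(String[] args) {
        System.out.println(pipeline());
    }
}
```

Once the parameter type of an operation is known, it is usually clear what kind of lambda or method reference to pass.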

I have put the code used today on GitHub, and you can click on this link to view it.

Reflection and Discussion #

Using Stream makes it very convenient to perform various operations on a List. Is there a way to observe how the data changes throughout the pipeline? For example, in a filter+map pipeline, how can we observe the data that map receives after filter?

The Collectors class provides many ready-to-use collectors. Is there a way to implement a custom collector? For example, implement a MostPopularCollector to get the most frequently occurring element in a List, satisfying the following two test cases:

assertThat(Stream.of(1, 1, 2, 2, 2, 3, 4, 5, 5).collect(new MostPopularCollector<>()).get(), is(2));

assertThat(Stream.of('a', 'b', 'c', 'c', 'c', 'd').collect(new MostPopularCollector<>()).get(), is('c'));

Regarding Java 8, do you have any insights or experiences to share? I am Zhu Ye. Feel free to leave a comment in the comment section to share your thoughts. You are also welcome to share this article with your friends or colleagues for further discussion.