The Java Collections Framework is well designed even without the Stream API. However, writing even a simple function, such as finding an object in a collection based on some conditions, requires a for loop. Writing such a loop is easy, but writing similar for loops many times is tedious. Before the enhanced for ("for-each") syntactic sugar was available, the index bookkeeping in a for loop was annoying. In addition, if the indices are poorly named, debugging nested for loops is a terrible job. Since I started using Apache Commons Collections, it has become a required library in every one of my projects. Here is a simple example: to find a
Person object in a
List whose first name or last name matches the given value, the traditional way is shown in Code List 1: write a for loop that checks every person's first name and last name.
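The traditional approach described above can be sketched as follows. This is only a minimal illustration, not the original Code List 1: the Person class, its firstName and lastName fields, and the findByName method are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;

class ForLoopFind {
    // Hypothetical Person class; the field names are assumptions for this sketch.
    static class Person {
        final String firstName;
        final String lastName;

        Person(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }
    }

    // Returns the first person whose first or last name equals the given value,
    // or null if nobody matches.
    static Person findByName(List<Person> persons, String name) {
        for (Person p : persons) {
            if (name.equals(p.firstName) || name.equals(p.lastName)) {
                return p;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
                new Person("Spirit", "Tu"),
                new Person("Ada", "Lovelace"));
        Person found = findByName(persons, "Tu");
        System.out.println(found == null ? "not found" : found.firstName + " " + found.lastName);
    }
}
```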
How about using Apache Commons Collections? The program is listed in Code List 2.a, and basically no for loop is needed. Just call the
find method with an object that implements the
Predicate interface. Code List 2.b shows the implementation of that object. Only the
evaluate method needs to be implemented: it returns
true if the given object matches the condition. That's all. What!? There are now more lines of code than in Code List 1. Yes, there are, but the
PersonNamePredicate is reusable and easy to test. In the
CollectionUtils class, there are 12 methods that use a
Predicate object to filter, select, or count objects in a collection. Therefore, I think the class is worth writing.
Well, the topic of this article is the new Stream API in Java 8. So how do we find an object with the Stream API? The answer is shown in Code List 3. The number of lines is not reduced much. However, compared with Code List 1, Code List 3 can be read as "keep the objects that match a condition and return the first one if it exists; otherwise return
null", and the details of the loop are hidden. So, in terms of semantics or readability, doesn't this approach raise the level of abstraction?
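A Stream-based lookup of this kind might look like the sketch below. It is not the original Code List 3; the Person class, its fields, and the findByName method are assumptions made for illustration. Only stream(), filter, findFirst, and orElse come from the Java 8 API.

```java
import java.util.Arrays;
import java.util.List;

class StreamFind {
    // Hypothetical Person class; the field names are assumptions for this sketch.
    static class Person {
        final String firstName;
        final String lastName;

        Person(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }
    }

    // Keeps the persons matching the condition and returns the first one,
    // or null if there is no match. The loop itself is hidden by the stream.
    static Person findByName(List<Person> persons, String name) {
        return persons.stream()
                .filter(p -> name.equals(p.firstName) || name.equals(p.lastName))
                .findFirst()
                .orElse(null);
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
                new Person("Spirit", "Tu"),
                new Person("Ada", "Lovelace"));
        Person found = findByName(persons, "Ada");
        System.out.println(found == null ? "not found" : found.firstName + " " + found.lastName);
    }
}
```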
Is it possible to find an object as in Code List 2.a, but with the Stream API? Yes, it is. First, write a helper class
StreamUtils like Code List 4.a, which provides a method
find(Collection, Predicate). Second, revise the
PersonNamePredicate as in Code List 4.b, and then a single line of code finds an object, as in Code List 4.c. Of course, if you do not need
PersonNamePredicate to support both Apache Commons Collections and the Java Stream API, the
test method of
java.util.function.Predicate is the only method you need to implement. What is the advantage of writing so much code? Apart from the possible benefit of parallel processing from using
parallelStream() as in Code List 4.a, this approach does not bring many advantages. The reason is that the task (finding an object) is very simple, and using the Stream API for it is overkill.
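The helper-plus-predicate idea can be sketched as below. This is an assumption-laden illustration rather than the original Code Lists 4.a to 4.c: the Person class is hypothetical, and PersonNamePredicate here implements only java.util.function.Predicate, not the Apache Commons Collections interface. The sketch uses findFirst so that the first match in list order is returned even on a parallel stream; findAny would be the looser alternative.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.function.Predicate;

class StreamUtilsDemo {
    // Hypothetical Person class; the field names are assumptions for this sketch.
    static class Person {
        final String firstName;
        final String lastName;

        Person(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }
    }

    // Reusable, separately testable predicate (Java 8 interface only).
    static class PersonNamePredicate implements Predicate<Person> {
        private final String name;

        PersonNamePredicate(String name) {
            this.name = name;
        }

        @Override
        public boolean test(Person p) {
            return name.equals(p.firstName) || name.equals(p.lastName);
        }
    }

    // Helper in the spirit of find(Collection, Predicate) from the text.
    static class StreamUtils {
        static <T> T find(Collection<T> collection, Predicate<? super T> predicate) {
            return collection.parallelStream()
                    .filter(predicate)
                    .findFirst()   // first match in encounter order
                    .orElse(null);
        }
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
                new Person("Spirit", "Tu"),
                new Person("Ada", "Lovelace"));
        // The one-line lookup:
        Person found = StreamUtils.find(persons, new PersonNamePredicate("Tu"));
        System.out.println(found == null ? "not found" : found.firstName + " " + found.lastName);
    }
}
```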
The concept of the Java Stream API is similar to the Unix pipeline or the pipes-and-filters design pattern: several simple operations are chained to complete a meaningful job. Since each operation is very simple, a Lambda expression is usually concise and can improve readability. As shown in Figure 1, a Java Stream pipeline chains several intermediate operations and ends with exactly one terminal operation. An intermediate operation transforms the content of the stream, e.g., filtering (
filter(Predicate)), mapping (
map(Function)), sorting (
sorted(Comparator)), etc. The terminal operation produces the final result from, or performs a side effect on, the content of the stream, e.g., collecting (
collect(Collector)), applying something for each (
forEach(Consumer)), or reducing (reduce(BinaryOperator)).
Figure 1 - Stream Pipeline
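To make the shape of such a pipeline concrete, here is a minimal sketch with three intermediate operations followed by one terminal operation. The class name, method name, and sample data are made up for illustration; only the Stream and Collectors calls come from the Java 8 API.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class PipelineDemo {
    // Hypothetical helper: keeps short names, upper-cases them, and sorts them.
    static List<String> shortNamesUpperSorted(List<String> names) {
        return names.stream()
                .filter(n -> n.length() <= 4)        // intermediate: filtering
                .map(String::toUpperCase)            // intermediate: mapping
                .sorted(Comparator.naturalOrder())   // intermediate: sorting
                .collect(Collectors.toList());       // terminal: collecting
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("tom", "alexander", "bob", "eve");
        System.out.println(shortNamesUpperSorted(names));
    }
}
```

Nothing is processed until the terminal operation runs; the intermediate operations only describe the pipeline.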
For example, a pipeline like Figure 2 can be used to sum up the assets of the rich persons who have assets of over 1 billion dollars. First,
filter(Predicate) keeps only the persons who have assets of over 1 billion dollars. Then,
map(Function) extracts the value of the assets. Finally,
reduce(BinaryOperator) aggregates the values into the result. In fact, such combinations of operations are used frequently. Therefore, in the Java Stream API, the Collectors class provides ready-made collectors for frequently used reductions, e.g.,
summarizingDouble(ToDoubleFunction), which combines a
map(Function) intermediate operation and a
reduce(BinaryOperator) terminal operation with a default implementation to simplify the composition of a pipeline.
Figure 2 - Stream Pipeline Example
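The pipeline in this example might be written roughly as follows. The Person class with an assets field, the ONE_BILLION constant, and both method names are assumptions for illustration. The first method spells out filter, map, and reduce; the second lets Collectors.summarizingDouble do the mapping and reduction in one step.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.stream.Collectors;

class AssetsDemo {
    // Hypothetical Person class; the "assets" field (in dollars) is an assumption.
    static class Person {
        final String name;
        final double assets;

        Person(String name, double assets) {
            this.name = name;
            this.assets = assets;
        }
    }

    static final double ONE_BILLION = 1_000_000_000.0;

    // Explicit pipeline: filter -> map -> reduce.
    static double totalAssetsOfRich(List<Person> persons) {
        return persons.stream()
                .filter(p -> p.assets > ONE_BILLION)   // keep the rich persons
                .map(p -> p.assets)                    // extract the asset value
                .reduce(0.0, Double::sum);             // aggregate into one sum
    }

    // Same filter, but the mapping and reduction are delegated to a collector.
    static DoubleSummaryStatistics statsOfRich(List<Person> persons) {
        return persons.stream()
                .filter(p -> p.assets > ONE_BILLION)
                .collect(Collectors.summarizingDouble(p -> p.assets));
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
                new Person("A", 2_000_000_000.0),
                new Person("B", 500_000_000.0),
                new Person("C", 3_000_000_000.0));
        System.out.println(totalAssetsOfRich(persons));
        System.out.println(statsOfRich(persons).getSum());
    }
}
```

The summary statistics object also carries the count, minimum, maximum, and average, which is why it can replace several hand-written reductions.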
Is the example not concrete enough? Here is a more concrete one. Assume that
Exam represents a kind of examination, and a person can take an examination many times. Thus, in Person, a
List is used to keep all examinations taken by the examinee. How do we get the ranking of the examinees who scored more than 700 in any examination they took? To eliminate duplicated code, the
getHighestScore() method shown in Code List 5 is added to
Person to get the highest score among the examinations taken (using the Stream API, too).
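A getHighestScore() method of this kind could be sketched as below. It is not the original Code List 5: the shape of Exam, the addExam helper, and the choice to return 0.0 when no examination has been taken are all assumptions for the example.

```java
import java.util.ArrayList;
import java.util.List;

class HighestScoreDemo {
    // Hypothetical Exam class holding a single score.
    static class Exam {
        private final double score;

        Exam(double score) {
            this.score = score;
        }

        double getScore() {
            return score;
        }
    }

    // Hypothetical Person class keeping every examination taken.
    static class Person {
        private final List<Exam> exams = new ArrayList<>();

        // Convenience helper for the sketch; returns this to allow chaining.
        Person addExam(double score) {
            exams.add(new Exam(score));
            return this;
        }

        // Highest score among the taken examinations;
        // 0.0 if none were taken (an assumption of this sketch).
        double getHighestScore() {
            return exams.stream()
                    .mapToDouble(Exam::getScore)
                    .max()
                    .orElse(0.0);
        }
    }

    public static void main(String[] args) {
        Person p = new Person().addExam(650).addExam(840).addExam(720);
        System.out.println(p.getHighestScore());
    }
}
```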
Then, a method
showRank(List<Person>, double) can be written as in Code List 6. The first parameter is the list of all examinees, and the second parameter is the score threshold required to appear in the ranking. The program first calls the
stream() method to obtain the
Stream object, and then calls the
filter(Predicate) method of the
Stream object to filter out the examinees whose score is not above the threshold. Here, writing the predicate as a Lambda Expression is intuitive and improves readability. Next, it calls the
sorted(Comparator) method to sort the examinees by score, and then the
map(Function) method to combine each examinee's full name and score, e.g., "Spirit Tu: 840.0", as the result. Note that an element in the stream returned by the
map(Function) method is no longer a
Person object; it is now a string. Therefore, in the final Lambda Expression,
e represents a string and can be printed to the console directly. Finally, call
showRank(persons, 700) to show the ranking of the examinees who have ever scored more than 700 in an examination. The entire process of Code List 6 can be illustrated as the pipeline in Figure 3.
Figure 3 - The pipeline of Code List 6
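The pipeline described above might be sketched as follows. It is not the original Code List 6. The Person class (a full name plus an array of scores), the descending sort order, and the split into a rank method that returns the lines and a showRank method that prints them are assumptions made so the example is self-contained and easy to check.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class RankDemo {
    // Hypothetical Person class: a full name and the scores of all taken exams.
    static class Person {
        final String fullName;
        final double[] scores;

        Person(String fullName, double... scores) {
            this.fullName = fullName;
            this.scores = scores;
        }

        // Highest score, or 0.0 if no exam was taken (an assumption).
        double getHighestScore() {
            return Arrays.stream(scores).max().orElse(0.0);
        }
    }

    // Builds the ranking lines: filter by threshold, sort by score
    // (highest first, an assumption), then map each person to "name: score".
    static List<String> rank(List<Person> persons, double threshold) {
        return persons.stream()
                .filter(p -> p.getHighestScore() > threshold)
                .sorted(Comparator.comparingDouble(Person::getHighestScore).reversed())
                .map(p -> p.fullName + ": " + p.getHighestScore())
                .collect(Collectors.toList());
    }

    // After map, each element e is a String, so it can be printed directly.
    static void showRank(List<Person> persons, double threshold) {
        rank(persons, threshold).forEach(e -> System.out.println(e));
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
                new Person("Ada Lovelace", 720.0),
                new Person("Spirit Tu", 650.0, 840.0),
                new Person("Bob Lee", 600.0));
        showRank(persons, 700);
    }
}
```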
Honestly, I feel a kinship with the Java Stream API because I studied visual dataflow languages for many years in graduate school. In VisualTPL (my research), the concept of a loop is implicit. What to do is more important than how to do it. In the same way, the Java Stream API internalizes the loop, so what matters about an operation is what it does. Both greatly improve the level of abstraction and readability. However, the Java Stream API provides more features than those described in this article. The next article will describe other features.