[Java] Digesting Streams

💡 Let's make proper use of Stream, the highlight of modern Java!

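Goals

Be able to explain what a Stream is.
Be able to explain how Streams differ from the traditional approach, and the pros and cons of each.
Be able to use Streams actively wherever they are needed.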

Stream

A sequence of elements supporting sequential and parallel aggregate operations.

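An interface that supports sequential and parallel aggregate operations, such as map and reduce transformations, over the elements of a collection.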

int sum = widgets.stream()
    .filter(w -> w.getColor() == RED)
    .mapToInt(w -> w.getWeight())
    .sum();

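Stream pipeline

Consists of zero or more intermediate operations and exactly one terminal operation.
The stream's data source is processed only when the terminal operation is executed; in other words,
intermediate operations are lazily evaluated until the terminal operation is reached.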
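
A minimal sketch of this laziness, assuming a hypothetical class LazyEvaluationSketch and helper longWords (neither is part of the Stream API). The filter predicate records every element it inspects, so we can observe that nothing runs until collect is called.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyEvaluationSketch {
    // Builds a pipeline that contains an intermediate operation only.
    // Every element the predicate actually inspects is recorded in 'visited'.
    static Stream<String> longWords(List<String> words, List<String> visited) {
        return words.stream()
                .filter(w -> {
                    visited.add(w);
                    return w.length() > 3;
                });
    }

    public static void main(String[] args) {
        List<String> visited = new ArrayList<>();
        Stream<String> pipeline = longWords(Arrays.asList("a", "apple", "banana"), visited);
        System.out.println(visited); // [] : no terminal operation yet, so nothing has run

        List<String> result = pipeline.collect(Collectors.toList()); // terminal operation
        System.out.println(visited); // [a, apple, banana]
        System.out.println(result);  // [apple, banana]
    }
}
```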

Intermediate operation

Returns a Stream object.
Can be further divided into stateless and stateful operations.
(Most are stateless; operations such as distinct and sorted, which have to refer to previously seen elements, are stateful.)
Examples: filter, map, sorted, limit, ...
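
The stateless/stateful difference can be made visible with peek on a sequential stream. This is only a sketch; the class StatefulVsStatelessSketch and its methods are hypothetical names. With map each element flows through on its own, while sorted has to take in every element before it can emit the first one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class StatefulVsStatelessSketch {
    // Stateless: map handles each element independently of the others.
    static List<String> traceMap() {
        List<String> log = new ArrayList<>();
        Stream.of("b", "a", "c")
                .peek(s -> log.add("in:" + s))
                .map(String::toUpperCase)
                .forEach(s -> log.add("out:" + s));
        return log;
    }

    // Stateful: sorted must see every element before emitting the first one.
    static List<String> traceSorted() {
        List<String> log = new ArrayList<>();
        Stream.of("b", "a", "c")
                .peek(s -> log.add("in:" + s))
                .sorted()
                .forEach(s -> log.add("out:" + s));
        return log;
    }

    public static void main(String[] args) {
        System.out.println(traceMap());    // [in:b, out:B, in:a, out:A, in:c, out:C]
        System.out.println(traceSorted()); // [in:b, in:a, in:c, out:a, out:b, out:c]
    }
}
```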

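Terminal operation

Returns a non-Stream result: a value of the terminal operation's return type, or nothing at all for side-effect operations such as forEach.
It comes last in the pipeline, and there is always exactly one.
Examples: collect, forEach, allMatch, count, min, max, ...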
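
A practical consequence of "exactly one" is that a stream cannot be reused after its terminal operation. The sketch below (the class and method names are made up for illustration) calls a second terminal operation on the same stream and checks that it is rejected with an IllegalStateException.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class SingleTerminalOperationSketch {
    // Returns true if a second terminal operation on the same stream is rejected.
    static boolean secondTerminalOperationFails(List<String> words) {
        Stream<String> stream = words.stream().filter(w -> w.length() > 3);
        stream.count(); // the first terminal operation consumes the stream
        try {
            stream.forEach(w -> { }); // a second terminal operation on the same stream
            return false;
        } catch (IllegalStateException e) {
            return true; // the stream has already been operated upon
        }
    }

    public static void main(String[] args) {
        System.out.println(secondTerminalOperationFails(Arrays.asList("a", "apple", "banana"))); // true
    }
}
```

If the same data has to be processed twice, create a new stream from the source each time.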

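Summary

Without a terminal operation, the intermediate operations are effectively meaningless. To quote p. 306 of 『가장 빨리 만나는 코어 자바9』: 'when using streams, you specify what you want to do, not how to do it.'

In other words, intermediate operations merely define what you want to do. Only with the terminal operation is everything the developer defined actually executed and a result produced.

The example below gives a feel for what "specifying what you want to do" means.

for - while iterating, increment the counter whenever a word is longer than 3.
stream - filter the words longer than 3, then count them.

With a Stream, just reading the helper methods is enough to understand what the code means.

At least in this example, everything is handled within a single scope, which makes it easier to read and clearer.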

String[] words = new String[] { "a", "ab", "abc", "apple", "banana", "cherry" };
List<String> wordList = Arrays.asList(words);

// Using for
long matchedWordCountWithFor = 0;
for (String word : words) {
    if (word.length() > 3) {
        matchedWordCountWithFor++;
    }
}
System.out.println(matchedWordCountWithFor);

// Using stream
long matchedWordCountWithStream = wordList.stream()
    .filter(word -> word.length() > 3)
    .count();
System.out.println(matchedWordCountWithStream);

/** 
 * Result
 * > 3
 * > 3
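 */

"A stream pipeline with no terminal operation is the same as a no-op, a command that does nothing", 기계인간

Never leave out the terminal operation!!!

Examples

Combine Multiple Collections

When fetching data from external APIs and merging it into one collection, examples like the following come in handy.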

Collection<String> collectionA = Arrays.asList("F", "U", "L", "L");
Collection<String> collectionB = Arrays.asList("C", "O", "D", "E", "R");

① Stream.concat(A, B)

This approach uses the concat method provided by the Stream class.

Collection<String> combinedCollection = Stream.concat(
        collectionA.stream(),
        collectionB.stream())
        .collect(Collectors.toList());
combinedCollection.stream().forEach(x -> System.out.print(x + " "));
// F U L L C O D E R

② Stream.of(A, B).flatMap(Function<? super T, ? extends Stream<...>> mapper)

This approach uses the of and flatMap methods provided by the Stream class.

Collection<String> combinedCollection = Stream.of(
        collectionA,
        collectionB)
        .flatMap(Collection::stream)
        .collect(Collectors.toList());
combinedCollection.stream().forEach(x -> System.out.print(x + " "));
// F U L L C O D E R

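Approaches ① and ② are the most straightforward ways to merge two or more Collections with Stream.

The Stream.concat(A, B) method merges exactly two streams per call,

whereas Stream.of(T... values) takes its arguments as varargs (an array), so it is the one to use when merging more than two collections.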
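
To make the difference concrete, here is a sketch that merges three collections both ways. The class MergeCollectionsSketch and both helper methods are hypothetical; only Stream.concat, Stream.of and flatMap come from the Stream API.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MergeCollectionsSketch {
    // Stream.concat takes exactly two streams, so a third one needs a nested call.
    static List<String> mergeWithConcat(Collection<String> a, Collection<String> b, Collection<String> c) {
        return Stream.concat(Stream.concat(a.stream(), b.stream()), c.stream())
                .collect(Collectors.toList());
    }

    // Stream.of accepts varargs, so any number of collections is flattened the same way.
    @SafeVarargs
    static List<String> mergeWithFlatMap(Collection<String>... collections) {
        return Stream.of(collections)
                .flatMap(Collection::stream)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Collection<String> a = Arrays.asList("F", "U", "L", "L");
        Collection<String> b = Arrays.asList("C", "O", "D", "E", "R");
        Collection<String> c = Arrays.asList("!");
        System.out.println(mergeWithConcat(a, b, c));  // [F, U, L, L, C, O, D, E, R, !]
        System.out.println(mergeWithFlatMap(a, b, c)); // [F, U, L, L, C, O, D, E, R, !]
    }
}
```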

Collectors examples

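Among the terminal operations, the Stream API's Stream.collect() method is probably the most frequently used.

Let's look at what kinds of collection structures it can produce through the following examples.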

import static java.util.stream.Collectors.toList;
import static java.util.stream.Collectors.toMap;
import static java.util.stream.Collectors.toSet;

List<String> list = Arrays.asList("abc", "cba", "cb", "abc", "ab");

Collectors.toList() & Collectors.toSet()

toList() collects all Stream elements into a List instance, and toSet() collects them into a Set instance.

List<String> listResult = list.stream().collect(toList());
Set<String> setResult = list.stream().collect(toSet());

Collectors.toMap()

toMap() collects Stream elements into a Map instance. For this, the following two functions must be supplied.

keyMapper
valueMapper

keyMapper extracts the Map key from a Stream element, and valueMapper extracts the value associated with that key.

The following example stores each string as the key and its length as the value.

Map<String, Integer> result = list.stream()
    .collect(toMap(Function.identity(), String::length));

So what exactly is Function.identity()?

/**
 * Returns a function that always returns its input argument.
 *
 * @param <T> the type of the input and output objects to the function
 * @return a function that always returns its input argument
 */
static <T> Function<T, T> identity() {
    return t -> t;
}

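So even this is supported out of the box...

What happens if the collection contains duplicate elements?

Unlike toSet, toMap does not filter out duplicates automatically. If a duplicate key exists, an IllegalStateException is thrown.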

assertThatThrownBy(() -> {
    list.stream().collect(Collectors.toMap(Function.identity(), String::length));
}).isInstanceOf(IllegalStateException.class);

When duplicate keys are expected or unavoidable, use it as follows.

Map<String, Integer> result = list.stream()
    .collect(
        Collectors.toMap(
        	Function.identity(),
        	String::length,
        	(item, identicalItem) -> item));

The third argument of Collectors.toMap() is a BinaryOperator that specifies how to resolve a key collision.
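
The merge function does not have to discard one of the two values. As a sketch (the class ToMapMergeSketch and the method countOccurrences are hypothetical names), mapping every element to 1 and merging with Integer::sum turns toMap into a simple occurrence counter.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class ToMapMergeSketch {
    static Map<String, Integer> countOccurrences(List<String> words) {
        return words.stream()
                .collect(Collectors.toMap(
                        Function.identity(), // key: the word itself
                        w -> 1,              // value: one occurrence
                        Integer::sum));      // on a duplicate key, add the two values
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("abc", "cba", "cb", "abc", "ab");
        Map<String, Integer> counts = countOccurrences(list);
        System.out.println(counts.get("abc")); // 2
        System.out.println(counts.get("cb"));  // 1
        System.out.println(counts.size());     // 4
    }
}
```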

Collectors.collectingAndThen()

What if I want to collect into the Collection of my choice and then perform one more operation on the result?

Remarkably, this is provided as well.

List<String> defaultList = list.stream()
	.collect(Collectors.collectingAndThen(toList(), ImmutableList::copyOf));

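The example above collects the Stream elements into a List and then converts it into an immutable list instance (assuming ImmutableList here is Guava's). (Handy for storing default values 👍)

Of course, an unmodifiable list can also be produced as follows. Note that Collectors.toUnmodifiableList() requires Java 10 or later.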

List<String> defaultChars = list.stream()
    	.collect(Collectors.toUnmodifiableList());
        
List<String> defaultWords = list.stream().collect(
        Collectors.collectingAndThen(
                Collectors.toList(),
                Collections::unmodifiableList));

Collectors.joining()

Joins all Stream elements and returns the result as a String.

String joinSimple = list.stream().collect(Collectors.joining());

String joinDelimiter = list.stream().collect(Collectors.joining(" "));

String joinDelimiterPrefixPostfix = list.stream().collect(Collectors.joining(" ", "PRE-", "-POST"));

The resulting values are "abccbacbabcab", "abc cba cb abc ab", and "PRE-abc cba cb abc ab-POST", respectively.

A null value for any of these parameters causes a NullPointerException, so pass "" for any parameter you do not use to stay null-safe.
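
One way to apply that advice is a small wrapper that replaces null arguments with "" before handing them to Collectors.joining. This is only a sketch; NullSafeJoiningSketch, join and orEmpty are hypothetical helpers, not part of the Collectors API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class NullSafeJoiningSketch {
    // Treats a missing (null) argument as "not used".
    private static String orEmpty(String s) {
        return s == null ? "" : s;
    }

    static String join(List<String> items, String delimiter, String prefix, String suffix) {
        return items.stream()
                .collect(Collectors.joining(orEmpty(delimiter), orEmpty(prefix), orEmpty(suffix)));
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("abc", "cba", "cb", "abc", "ab");
        System.out.println(join(list, " ", null, null));    // abc cba cb abc ab
        System.out.println(join(list, null, "PRE-", null)); // PRE-abccbacbabcab
    }
}
```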

Collectors.counting()

Counts all stream elements.

Long count = list.stream().collect(Collectors.counting()); // 5

Collectors.maxBy() / minBy()

maxBy / minBy return the largest or smallest element of the stream, according to the given Comparator instance, as an Optional.

Optional<String> maxString = list.stream()
        .collect(Collectors.maxBy(Comparator.naturalOrder()));
// Optional[cba]
Optional<String> minString = list.stream()
        .collect(Collectors.minBy(Comparator.naturalOrder()));
// Optional[ab]
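
Any Comparator works here, and the Optional matters when the stream is empty. The following sketch (class, method names and sample data are made up for illustration) picks the longest and shortest word by length and shows the empty case.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class MaxByMinBySketch {
    static Optional<String> longest(List<String> words) {
        return words.stream()
                .collect(Collectors.maxBy(Comparator.comparingInt(String::length)));
    }

    static Optional<String> shortest(List<String> words) {
        return words.stream()
                .collect(Collectors.minBy(Comparator.comparingInt(String::length)));
    }

    public static void main(String[] args) {
        List<String> fruits = Arrays.asList("kiwi", "fig", "banana");
        System.out.println(longest(fruits));  // Optional[banana]
        System.out.println(shortest(fruits)); // Optional[fig]

        // With no elements there is nothing to return, hence the Optional.
        List<String> none = Collections.emptyList();
        System.out.println(longest(none).isPresent()); // false
    }
}
```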

Collectors.groupingBy()

The groupingBy collector groups Stream elements by some property and stores the result in a Map instance.

The following example groups strings by length and stores the result.

Map<Integer, Set<String>> lengthStatistics = list.stream()
	.collect(Collectors.groupingBy(String::length, Collectors.toSet()));
// {2=[ab, cb], 3=[abc, cba]}

assertThat(lengthStatistics)
	.containsEntry(2, newHashSet("ab", "cb"))
	.containsEntry(3, newHashSet("abc", "cba"));

Pretty neat, isn't it?
Used well, this makes it easy to shape data fetched from an external API for use in charts 👍
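
For chart-style data, a count per group is often all that is needed. As a sketch (GroupingByCountingSketch and countByLength are hypothetical names), the downstream collector can be swapped for Collectors.counting().

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupingByCountingSketch {
    // Groups words by length and counts how many words fall into each group.
    static Map<Integer, Long> countByLength(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(String::length, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("abc", "cba", "cb", "abc", "ab");
        Map<Integer, Long> counts = countByLength(list);
        System.out.println(counts.get(2)); // 2
        System.out.println(counts.get(3)); // 3
    }
}
```

Unlike the toSet() version, duplicates are counted here, so "abc" contributes twice to the length-3 group.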

Collectors.partitioningBy()

partitioningBy accepts a Predicate instance and stores the Stream elements in a Map whose keys are the Boolean result of that predicate and whose values are collections. Note that the difference from groupingBy lies in the key.

Map<Boolean, List<String>> partedList = list.stream()
	.collect(Collectors.partitioningBy(s -> s.length() > 2));
// {false=[cb, ab], true=[abc, cba, abc]}
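
Because the key is a Boolean, the resulting Map always contains both the true and the false entry, even when one side is empty. The sketch below (hypothetical class and method names) combines partitioningBy with a counting() downstream collector to show this.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PartitioningBySketch {
    // Counts how many words are longer than 'length' and how many are not.
    static Map<Boolean, Long> countByLongerThan(List<String> words, int length) {
        return words.stream()
                .collect(Collectors.partitioningBy(w -> w.length() > length, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("abc", "cba", "cb", "abc", "ab");
        System.out.println(countByLongerThan(list, 2));  // {false=2, true=3}
        System.out.println(countByLongerThan(list, 10)); // {false=5, true=0}
    }
}
```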

Collectors.teeing()

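Using two collectors and combining their results into something new is a common need.

Below are a simple example and a second one that, although I don't have a firm feel for it yet, looks like the kind of situation where teeing would be used.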

List<Integer> numbers = Arrays.asList(42, 19, 2, 35);

int sumOfMinMax = numbers.stream().collect(
        Collectors.teeing(
                Collectors.minBy(Integer::compareTo), // First collector
                Collectors.maxBy(Integer::compareTo), // Second collector
                (minBy, maxBy) -> minBy.get() + maxBy.get()
        ));
// 44

List<Employee> employeeList = Arrays.asList(
        new Employee(1, "Employee A", 80),
        new Employee(2, "Employee B", 90),
        new Employee(3, "Employee C", 75),
        new Employee(4, "Employee D", 100));
                
HashMap<String, Employee> employeePerformanceMap = employeeList.stream().collect(
    	Collectors.teeing(
                Collectors.maxBy(Comparator.comparing(Employee::getScore)),
                Collectors.minBy(Comparator.comparing(Employee::getScore)),
                (e1, e2) -> {
                    HashMap<String, Employee> map = new HashMap<>();
                    map.put("MAX", e1.get());
                    map.put("MIN", e2.get());
                    return map;
                }));
// {
// 	MIN=Employee[id=3, name=Employee C, score=75],
// 	MAX=Employee[id=4, name=Employee D, score=100]
// }

I haven't put it to real use yet, but it looks quite useful : )

Note, however, that this method was only added in Java 12!!!
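
One more small use that comes to mind is computing an average in a single pass by teeing a sum and a count. This is a sketch that needs Java 12 or later; TeeingAverageSketch and average are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class TeeingAverageSketch {
    // Requires Java 12 or later because of Collectors.teeing.
    static Double average(List<Integer> numbers) {
        return numbers.stream().collect(
                Collectors.teeing(
                        Collectors.summingDouble(Integer::doubleValue), // first collector: the sum
                        Collectors.counting(),                          // second collector: the count
                        (sum, count) -> sum / count));                  // merger: combine both results
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(42, 19, 2, 35);
        System.out.println(average(numbers)); // 24.5
    }
}
```

For a plain average Collectors.averagingDouble is simpler; the point here is only how the two intermediate results reach the merger function.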

Number Data Collectors examples

Collectors also provides several helper methods for numeric data.

Their names alone are mostly enough to tell how to use them.

List<Integer> numberList = Arrays.asList(42, 19, 2, 35);

Collectors.summarizingDouble/Long/Int()

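These collectors return a special class containing statistics about the numeric data extracted from the Stream elements.

For summarizingDouble that class is DoubleSummaryStatistics, and it exposes the following information.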

DoubleSummaryStatistics summary = numberList.stream()
    .collect(Collectors.summarizingDouble(Integer::intValue));
System.out.println(summary.getAverage()); // 24.5
System.out.println(summary.getCount()); // 4
System.out.println(summary.getMax()); // 42.0
System.out.println(summary.getMin()); // 2.0
System.out.println(summary.getSum()); // 98.0

Collectors.averagingDouble/Long/Int()

Double average = numberList.stream()
    .collect(Collectors.averagingDouble(Integer::intValue));
// 24.5

Collectors.summingDouble/Long/Int()

Double sum = numberList.stream()
	.collect(Collectors.summingDouble(Integer::intValue));
// 98.0
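
The same figures are also available without a collector by converting to a primitive stream first, as the mapToInt call in the opening example did. A sketch, with PrimitiveStreamSketch and summarize as hypothetical names:

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;

class PrimitiveStreamSketch {
    static IntSummaryStatistics summarize(List<Integer> numbers) {
        return numbers.stream()
                .mapToInt(Integer::intValue) // Stream<Integer> -> IntStream
                .summaryStatistics();        // terminal operation on the IntStream
    }

    public static void main(String[] args) {
        IntSummaryStatistics stats = summarize(Arrays.asList(42, 19, 2, 35));
        System.out.println(stats.getSum());     // 98
        System.out.println(stats.getAverage()); // 24.5
        System.out.println(stats.getMin());     // 2
        System.out.println(stats.getMax());     // 42
        System.out.println(stats.getCount());   // 4
    }
}
```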

References

"더 자바, Java 8", 백기선
"Stream (Java Platform SE 8)", Oracle Docs (oracle.com)
"Java Stream의 사용", 기계인간 (John Grib), 2019-09-24
"Guide to Java 8 Collectors", Grzegorz Piwowarek, 2021-12-31

Since. 2022. 05. 25.

Last Updated. 2022. 06. 08.

About this blog

배부른코딩로그
