Java Stream distinct(): Removing Duplicates

Programmer Article Site (程序员文章站) 2022-07-10 18:30:28


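Preface

distinct() is a method of the Stream interface that returns a stream consisting of the distinct elements of this stream. It relies on hashCode() and equals() to decide which elements are the same. Our own classes must therefore override both methods for distinct() to deduplicate them by value.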
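For types that already override equals() and hashCode(), such as String and Integer, distinct() works out of the box. The following is a minimal sketch of that behaviour; the class name `DistinctBasics` and the helper `distinct` are illustrative names, not part of the original article.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class DistinctBasics {
    // Returns the distinct elements of the input list.
    // For an ordered source such as a List, the first occurrence of each element is kept.
    public static <T> List<T> distinct(List<T> input) {
        return input.stream().distinct().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // String and Integer already override equals() and hashCode()
        System.out.println(distinct(Arrays.asList("a", "b", "a", "c", "b"))); // [a, b, c]
        System.out.println(distinct(Arrays.asList(1, 2, 2, 3, 1)));           // [1, 2, 3]
    }
}
```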

Deduplicating objects

Code example: hashCode() and equals() not overridden

import java.util.Arrays;
import java.util.List;

public class Student {
    private String name;
    private int age;
    private int type;

    public Student() {
    }

    public Student(String name, int age, int type) {
        this.name = name;
        this.age = age;
        this.type = type;
    }

    @Override
    public String toString() {
        return "Student{" +
                "name='" + name + '\'' +
                ", age=" + age +
                ", type=" + type +
                '}';
    }

    public static void main(String[] args) {
        List<Student> students = Arrays.asList(
                new Student("张三", 18, 1),
                new Student("李四", 20, 1),
                new Student("小明", 22, 3),
                new Student("Jack", 22, 3),
                new Student("Jack", 22, 3)
        );

        // hashCode() and equals() are not overridden
        students.stream().distinct().forEach(System.out::println);

        /** Output: the two identical Jack entries both appear
         * Student{name='张三', age=18, type=1}
         * Student{name='李四', age=20, type=1}
         * Student{name='小明', age=22, type=3}
         * Student{name='Jack', age=22, type=3}
         * Student{name='Jack', age=22, type=3}
         */
    }
}

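Code example: hashCode() and equals() overridden

import java.util.Arrays;
import java.util.List;
import java.util.Objects;

public class Student {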
    private String name;
    private int age;
    private int type;

    public Student() {
    }

    public Student(String name, int age, int type) {
        this.name = name;
        this.age = age;
        this.type = type;
    }

    @Override
    public String toString() {
        return "Student{" +
                "name='" + name + '\'' +
                ", age=" + age +
                ", type=" + type +
                '}';
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Student)) return false;
        Student student = (Student) o;
        return age == student.age &&
                type == student.type &&
                Objects.equals(name, student.name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, age, type);
    }

    public static void main(String[] args) {
        List<Student> students = Arrays.asList(
                new Student("张三", 18, 1),
                new Student("李四", 20, 1),
                new Student("小明", 22, 3),
                new Student("Jack", 22, 3),
                new Student("Jack", 22, 3)
        );

        // hashCode() and equals() are overridden
        students.stream().distinct().forEach(System.out::println);

        /** Output: only one Jack
         * Student{name='张三', age=18, type=1}
         * Student{name='李四', age=20, type=1}
         * Student{name='小明', age=22, type=3}
         * Student{name='Jack', age=22, type=3}
         */
    }
}

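Conclusion: for distinct() to deduplicate objects by value, the class must override hashCode() and equals(). Without them, two objects are equal only if they are the same instance.

Deduplicating by object property

1. Using `filter`, with no need to override `hashCode()` and `equals()`

distinct() offers no direct way to deduplicate a list of objects by one of their properties; it works purely on hashCode() and equals(). To deduplicate by property we need another approach, such as the following helper: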

    public static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor){
        Map<Object,Boolean> seen = new ConcurrentHashMap<>();
        return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
    }
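One detail worth noting about this helper: the predicate it returns holds the `seen` map, so it is stateful. The sketch below, which assumes a hypothetical key (the first character of a word) and an illustrative class name `DistinctByKeyState`, shows that a fresh predicate should be created for each stream, because a reused one remembers keys from the previous pass. Also note that ConcurrentHashMap does not accept null keys, so the key extractor must not return null.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class DistinctByKeyState {
    // Same helper as in the article: true only the first time a key is seen.
    public static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
        Map<Object, Boolean> seen = new ConcurrentHashMap<>();
        return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("apple", "avocado", "banana", "blueberry", "cherry");

        // Hypothetical key for illustration: the first character of each word
        Function<String, Character> firstChar = w -> w.charAt(0);

        // A fresh predicate per stream keeps one word per first letter
        List<String> fresh = words.stream()
                .filter(distinctByKey(firstChar))
                .collect(Collectors.toList());
        System.out.println(fresh); // [apple, banana, cherry]

        // A reused predicate has already seen every key on the second pass
        Predicate<String> shared = distinctByKey(firstChar);
        List<String> pass1 = words.stream().filter(shared).collect(Collectors.toList());
        List<String> pass2 = words.stream().filter(shared).collect(Collectors.toList());
        System.out.println(pass1); // [apple, banana, cherry]
        System.out.println(pass2); // []
    }
}
```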

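Code example

import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;

public class Student {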
    private String name;
    private int age;
    private int type;

    public Student() {
    }

    public Student(String name, int age, int type) {
        this.name = name;
        this.age = age;
        this.type = type;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }

    public int getType() {
        return type;
    }

    public void setType(int type) {
        this.type = type;
    }

    @Override
    public String toString() {
        return "Student{" +
                "name='" + name + '\'' +
                ", age=" + age +
                ", type=" + type +
                '}';
    }

    // putIfAbsent(): if the key is present, leaves the map unchanged and returns the existing value; otherwise adds the entry and returns null
    public static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor){
        Map<Object,Boolean> seen = new ConcurrentHashMap<>();
        return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
    }

    public static void main(String[] args) {
        List<Student> students = Arrays.asList(
                new Student("张三", 18, 1),
                new Student("李四", 20, 1),
                new Student("小明", 22, 3),
                new Student("Jack", 22, 3),
                new Student("Jack", 22, 3)
        );
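        // Single-property deduplication: by age
        students.stream().filter(distinctByKey(Student::getAge)).forEach(System.out::println);
        /** Output: three students, one per distinct age
         * Student{name='张三', age=18, type=1}
         * Student{name='李四', age=20, type=1}
         * Student{name='小明', age=22, type=3}
         */

        System.out.println("---------cut line------------");

        // Multi-property deduplication: by name + age + type, combined into one composite key.
        // Chaining one filter per property would instead drop any student that shares
        // a single property value with an earlier one, leaving only 张三 and 小明.
        students.stream()
                .filter(distinctByKey(s -> s.getName() + "-" + s.getAge() + "-" + s.getType()))
                .forEach(System.out::println);
        /** Output: four students with distinct name + age + type
         * Student{name='张三', age=18, type=1}
         * Student{name='李四', age=20, type=1}
         * Student{name='小明', age=22, type=3}
         * Student{name='Jack', age=22, type=3}
         */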
    }
}
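A composite key built by string concatenation can collide when a property value itself contains the separator. As a minimal sketch of an alternative, assuming the same `distinctByKey` helper, a `List` of the property values can serve as the key, since `List.equals()` and `List.hashCode()` compare element by element. The class name `CompositeKeyDedup` and the two-column `String[]` rows are hypothetical, chosen only to make the collision visible.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class CompositeKeyDedup {
    public static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
        Map<Object, Boolean> seen = new ConcurrentHashMap<>();
        return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
    }

    // Composite key as a concatenated string: "a-b" + "c" and "a" + "b-c" both become "a-b-c"
    public static List<String[]> dedupByConcat(List<String[]> rows) {
        return rows.stream()
                .filter(distinctByKey(r -> r[0] + "-" + r[1]))
                .collect(Collectors.toList());
    }

    // Composite key as a List: the two columns are compared separately
    public static List<String[]> dedupByList(List<String[]> rows) {
        return rows.stream()
                .filter(distinctByKey(r -> Arrays.asList(r[0], r[1])))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[]{"a-b", "c"},
                new String[]{"a", "b-c"}
        );
        System.out.println(dedupByConcat(rows).size()); // 1 (the keys collide)
        System.out.println(dedupByList(rows).size());   // 2 (the keys differ)
    }
}
```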

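2. Using `TreeSet`

Code example
A TreeSet holds no duplicate elements, as judged by its comparator, so hashCode() and equals() need not be overridden.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;
import java.util.stream.Collectors;

public class Student {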
    private String name;
    private int age;
    private int type;

    public Student() {
    }

    public Student(String name, int age, int type) {
        this.name = name;
        this.age = age;
        this.type = type;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }

    public int getType() {
        return type;
    }

    public void setType(int type) {
        this.type = type;
    }

    @Override
    public String toString() {
        return "Student{" +
                "name='" + name + '\'' +
                ", age=" + age +
                ", type=" + type +
                '}';
    }

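    // Based on TreeSet and collectingAndThen() (collect first, then transform the result); no need to override hashCode() and equals()
    // Note: the result is sorted by the comparator, so the original insertion order is lost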
    public static void main(String[] args) {
        List<Student> students = Arrays.asList(
                new Student("张三", 18, 1),
                new Student("李四", 20, 1),
                new Student("小明", 22, 3),
                new Student("Jack", 22, 3),
                new Student("Jack", 22, 3)
        );

        // -------- Single-property deduplication: by age --------
        TreeSet<Student> treeSet = students.stream().collect(Collectors.toCollection(() -> new TreeSet<>(Comparator.comparing(Student::getAge))));
        treeSet.forEach(System.out::println);
        /** Output: three students, one per distinct age
         * Student{name='张三', age=18, type=1}
         * Student{name='李四', age=20, type=1}
         * Student{name='小明', age=22, type=3}
         */
        System.out.println("-----------cut line--------");

        // To turn the result back into a List, use collectingAndThen(), which collects first and then applies a finishing function
        List<Student> unique = students.stream().collect(
                Collectors.collectingAndThen(
                        Collectors.toCollection(() -> new TreeSet<>(Comparator.comparing(Student::getAge))), ArrayList::new)
        );
        unique.forEach(System.out::println);
        /** Output: three students, one per distinct age
         * Student{name='张三', age=18, type=1}
         * Student{name='李四', age=20, type=1}
         * Student{name='小明', age=22, type=3}
         */

        System.out.println("-----------cut line--------");

        // -------- Multi-property deduplication: by name + age + type --------
        List<Student> unique1 = students.stream().collect(
                Collectors.collectingAndThen(
                        Collectors.toCollection(() -> new TreeSet<>(Comparator.comparing(o -> o.getName() + "-" + o.getAge() + "-" + o.getType()))), ArrayList::new)
        );
        unique1.forEach(System.out::println);
        /** Output: four students with distinct name + age + type, sorted by the composite key
         * Student{name='Jack', age=22, type=3}
         * Student{name='小明', age=22, type=3}
         * Student{name='张三', age=18, type=1}
         * Student{name='李四', age=20, type=1}
         */
    }
}
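Because a TreeSet sorts its elements, the approach above loses the original order. If encounter order matters, one alternative (a different technique from the article's, shown here only as a hedged sketch) is `Collectors.toMap` with a `LinkedHashMap`, keeping the first element seen for each key. The class name `OrderPreservingDedup` and the helper `distinctBy` are illustrative names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

public class OrderPreservingDedup {
    // Keeps the first element seen for each key, in encounter order.
    public static <T, K> List<T> distinctBy(List<T> items, Function<? super T, ? extends K> keyFn) {
        return new ArrayList<>(items.stream()
                .collect(Collectors.toMap(
                        keyFn,                    // map key: the property to deduplicate on
                        Function.identity(),      // map value: the element itself
                        (first, second) -> first, // on a duplicate key, keep the earlier element
                        LinkedHashMap::new))      // LinkedHashMap iterates in insertion order
                .values());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "fig", "plum", "kiwi", "apple");
        // Deduplicate by length: "plum" and "kiwi" have the same length as "pear"
        System.out.println(distinctBy(words, String::length)); // [pear, fig, apple]
    }
}
```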

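Supplementary note: `Comparator`

TreeSet has a constructor that takes a comparator: TreeSet(Comparator<? super E> comparator).
The Comparator interface has a static method comparing() whose return type is Comparator, so a comparator built with it inside a stream pipeline can be used to deduplicate: the TreeSet treats elements that compare as equal as duplicates.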

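// Assumes a Book class with a (String name, double price) constructor and a getPrice() getter
@Test
public void test(){
    // First form: a lambda that compares two books directly.
    // Double.compare is used because subtracting two doubles does not yield an int.
    Comparator<Book> comparator = (o1, o2) -> Double.compare(o1.getPrice(), o2.getPrice());
    int compare = comparator.compare(new Book("Java", 20.0), new Book("C++", 20.0));
    
    // Second form: Comparator.comparing() with a key extractor
    Comparator<Book> comparing = Comparator.comparing(book -> book.getPrice());
    int compare2 = comparing.compare(new Book("Java", 20.0), new Book("C++", 20.0));

    // Both print the same result (0, because the prices are equal)
    System.out.println("compare = " + compare);
    System.out.println("compare2 = " + compare2);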
}

The second form is more concise inside a stream pipeline, so it is the one used here.

@Test
public void test(){
    List<Book> books = Arrays.asList(
        new Book("Java", 20.0),
        new Book("Java", 30.0),
        new Book("C++", 20.0)
    );

    Comparator<Book> comparing = Comparator.comparing(book -> book.getPrice());
    TreeSet<Book> treeSet = books.stream().collect(
        Collectors.toCollection(
            () -> new TreeSet<>(comparing)
    ));
        
    // The same thing with the comparator inlined
    TreeSet<Book> inlined = books.stream().collect(
        Collectors.toCollection(
            () -> new TreeSet<>(Comparator.comparing(book -> book.getPrice()))
    ));
        
    inlined.forEach(System.out::println);
    // Output (deduplicated by price; C++ is dropped because its price equals the first Java book's 20.0)
    // Book{name='Java', price=20.0}
    // Book{name='Java', price=30.0}
}
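To make the role of the comparator concrete: a TreeSet skips any element for which compare() returns 0 against an element already in the set, regardless of equals(). The sketch below uses plain strings so that it is self-contained; the class name `ComparatorDedup` and the helper `dedup` are illustrative names, and `thenComparing` is shown as one way to build a multi-key comparator.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;
import java.util.stream.Collectors;

public class ComparatorDedup {
    // Collects into a TreeSet ordered by the given comparator; an element that
    // compares as equal (compare == 0) to one already in the set is not added.
    public static <T> TreeSet<T> dedup(List<T> items, Comparator<? super T> comparator) {
        return items.stream()
                .collect(Collectors.toCollection(() -> new TreeSet<>(comparator)));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("bb", "a", "cc", "ddd", "a");

        // Single key: compare by length only, so "cc" counts as a duplicate of "bb"
        Comparator<String> byLength = Comparator.comparingInt(String::length);
        System.out.println(dedup(words, byLength)); // [a, bb, ddd]

        // Two keys: length first, then natural order, so only the second "a" is dropped
        Comparator<String> byLengthThenText = byLength.thenComparing(Comparator.naturalOrder());
        System.out.println(dedup(words, byLengthThenText)); // [a, bb, cc, ddd]
    }
}
```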

Original article: https://blog.csdn.net/qiuwen_521/article/details/109629204
