
[Counting and Deduplication] Scanning Event-Tracking Data in HBase, Deduplicating It, and Measuring Memory Usage

程序员文章站 2024-02-24 10:33:16
...
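    import com.google.common.hash.BloomFilter;
    import com.google.common.hash.Funnels;
    import org.openjdk.jol.info.GraphLayout;

    import java.nio.charset.StandardCharsets;
    import java.util.concurrent.ThreadLocalRandom;

    public class BloomFilterDedup {
        public static void main(String[] args) {
            // Bloom filter sized for 100,000 expected insertions at a 0.00001 false-positive probability
            BloomFilter<CharSequence> bloomFilterCard =
                    BloomFilter.create(Funnels.stringFunnel(StandardCharsets.UTF_8),
                            100000,
                            0.00001);
            int count = 0; // approximate number of distinct values seen
            for (int i = 0; i < 1000000; i++) {
                // Random ints in [0, 100) stand in for the scanned values;
                // ThreadLocalRandom replaces RandomUtils so only the JDK is needed here
                String value = String.valueOf(ThreadLocalRandom.current().nextInt(100));
                if (!bloomFilterCard.mightContain(value)) {
                    count++;
                    bloomFilterCard.put(value);
                }
            }
            System.out.println(count);
            // Total size in bytes of the filter's object graph, as measured by JOL
            System.out.println(GraphLayout.parseInstance(bloomFilterCard).totalSize());
        }
    }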
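
As a rough cross-check on the size that JOL reports, the standard Bloom filter sizing formulas m = -n ln p / (ln 2)^2 and k = (m / n) ln 2 give the theoretical bit count and number of hash functions. The sketch below uses only the JDK; the class and method names (`BloomSizing`, `optimalBits`, `optimalHashFunctions`) are hypothetical and are not taken from Guava. The size JOL measures will likely be somewhat larger, since it also counts object headers and the padding of the backing array.

```java
final class BloomSizing {
    // Theoretical number of bits m for n expected insertions
    // at false-positive probability p: m = -n ln p / (ln 2)^2
    static long optimalBits(long n, double p) {
        return (long) Math.ceil(-n * Math.log(p) / (Math.log(2) * Math.log(2)));
    }

    // Theoretical number of hash functions k for n insertions
    // and m bits: k = (m / n) ln 2, at least 1
    static int optimalHashFunctions(long n, long m) {
        return Math.max(1, (int) Math.round((double) m / n * Math.log(2)));
    }

    public static void main(String[] args) {
        long n = 100000;     // expected insertions, as in the example above
        double p = 0.00001;  // false-positive probability, as in the example above
        long bits = optimalBits(n, p);
        System.out.println("bits: " + bits);
        System.out.println("bytes: " + (bits + 7) / 8);
        System.out.println("hash functions: " + optimalHashFunctions(n, bits));
    }
}
```

For the parameters used above (100,000 expected insertions, false-positive probability 0.00001), this works out to about 2.4 million bits, or roughly 293 KiB, with 17 hash functions.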

 

<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>18.0</version>
</dependency>
<dependency>
    <groupId>org.openjdk.jol</groupId>
    <artifactId>jol-core</artifactId>
    <version>0.10</version>
</dependency>