欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页

Storm Trident示例Aggregator

程序员文章站 2022-06-18 23:18:01
...

Aggregator首先在输入流上运行全局重新分区操作(global)将同一批次的所有分区合并到一个分区中,然后在每个批次上运行的聚合功能,针对Batch操作。与ReduceAggregator很相似。

省略部分代码,省略部分可参考:https://blog.csdn.net/nickta/article/details/79666918

static class State {
    	int count = 0;
    }
FixedBatchSpout spout = new FixedBatchSpout(new Fields("user", "score"), 3,  
                new Values("nickt1", 4), 
                new Values("nickt2", 7),  
                new Values("nickt3", 8), 
                new Values("nickt4", 9),  
                new Values("nickt5", 7), 
                new Values("nickt6", 11), 
                new Values("nickt7", 5) 
                ); 
        spout.setCycle(false); 
        TridentTopology topology = new TridentTopology(); 
        topology.newStream("spout1", spout) 
                .shuffle() 
                .each(new Fields("user", "score"),new Debug("shuffle print:"))
                .parallelismHint(5)
                .aggregate(new Fields("score"), new BaseAggregator<State>() {
                	//在处理每一个batch的数据之前,调用1次
                	//空batch也会调用
					@Override
					public State init(Object batchId, TridentCollector collector) {
						return new State();
					}
					//batch中的每个tuple各调用1次
					@Override
					public void aggregate(State state, TridentTuple tuple, TridentCollector collector) {
						state.count = tuple.getInteger(0) + state.count;
					}
					//batch中的所有tuples处理完成后调用 
					@Override
					public void complete(State state, TridentCollector collector) {
						collector.emit(new Values(state.count));
					}
                	
                }, new Fields("sum"))
                .each(new Fields("sum"),new Debug("sum print:"))
                .parallelismHint(5);

输出:

[partition4-Thread-136-b-0-executor[37 37]]> DEBUG(shuffle print:): [nickt1, 4]
[partition4-Thread-136-b-0-executor[37 37]]> DEBUG(shuffle print:): [nickt3, 8]
[partition3-Thread-118-b-0-executor[36 36]]> DEBUG(shuffle print:): [nickt2, 7]
[partition4-Thread-136-b-0-executor[37 37]]> DEBUG(shuffle print:): [nickt5, 7]
[partition3-Thread-118-b-0-executor[36 36]]> DEBUG(shuffle print:): [nickt4, 9]
[partition3-Thread-118-b-0-executor[36 36]]> DEBUG(shuffle print:): [nickt6, 11]
[partition1-Thread-82-b-1-executor[39 39]]> DEBUG(sum print:): [19]
[partition2-Thread-66-b-1-executor[40 40]]> DEBUG(sum print:): [27]

[partition4-Thread-136-b-0-executor[37 37]]> DEBUG(shuffle print:): [nickt7, 5]
[partition3-Thread-54-b-1-executor[41 41]]> DEBUG(sum print:): [5]