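I have known about Apache Flink for a long time but had never tried it myself. After reading the documentation over the past couple of days, I found that for real-time streaming Flink offers a richer set of practical, high-level functions.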
Implementing WordCount with Apache Flink
- Download Apache Flink 0.10.1
- Start Flink in local mode
bin/start-local.sh
- Start the Scala shell
bin/start-scala-shell.sh remote localhost 6123
The JobManager in Flink listens on port 6123 by default, which is why the shell connects to localhost 6123.
- WordCount
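// `env` is the batch ExecutionEnvironment that the Scala shell pre-binds
val text = env.fromElements("Whether The slings and arrows of outrageous fortune")
// Split into lowercase words, pair each word with 1, group by the word (field 0), sum the counts (field 1)
val counts = text.flatMap{ _.toLowerCase.split("\\W+")}.map{ (_,1)}.groupBy(0).sum(1)
// print() triggers execution and writes the (word, count) pairs to the shell
counts.print()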
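
To see what the flatMap / map / groupBy(0) / sum(1) chain computes without a running cluster, the same logic can be sketched with plain Scala collections. This is only an illustration, not Flink code: the object name `WordCountSketch` and the helper `wordCount` are hypothetical, and the extra `filter(_.nonEmpty)` step is my own addition (it is not in the shell snippet above) to drop the empty token that `split` produces when a line starts with a non-word character.

```scala
object WordCountSketch {
  // Hypothetical helper: a plain-collections stand-in for the Flink pipeline.
  def wordCount(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.toLowerCase.split("\\W+"))   // tokenize, like the Flink flatMap
      .filter(_.nonEmpty)                     // assumption: skip empty tokens
      .map(word => (word, 1))                 // pair each word with 1
      .groupBy(_._1)                          // plays the role of groupBy(0)
      .map { case (word, pairs) =>
        (word, pairs.map(_._2).sum)           // plays the role of sum(1)
      }

  def main(args: Array[String]): Unit = {
    val counts = wordCount(Seq("Whether The slings and arrows of outrageous fortune"))
    // Sort by word only to get a stable printing order
    counts.toSeq.sortBy(_._1).foreach(println)
  }
}
```

Every word in the sample sentence occurs once, so each pair carries a count of 1. Feeding in text with repeated words, for example `wordCount(Seq("to be or not to be"))`, makes the grouping and summing steps visible.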