
Running a WordCount Program with Spark on Mac

程序员文章站 2022-06-14 14:59:24

1. Install Spark on the Mac

2. Install sbt

brew install sbt
 

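3. Write the WordCount program in Scala

import org.apache.spark.{SparkConf, SparkContext}

object SparkWordCount {

  // Prefix of the output directory; a timestamp is appended on every run.
  val FILE_NAME: String = "word_count_results_"

  def main(args: Array[String]): Unit = {
    if (args.length < 1) {
      println("Usage: SparkWordCount FileName")
      System.exit(1)
    }
    val conf = new SparkConf().setAppName("Spark Exercise: Spark Version Word Count Program")
    val sc = new SparkContext(conf)
    // Read the input file as an RDD with one element per line.
    val textFile = sc.textFile(args(0))
    // Split each line on spaces, pair every word with 1, then sum the 1s per word.
    val wordCounts = textFile.flatMap(line => line.split(" ")).map(
      word => (word, 1)
    ).reduceByKey((a, b) => a + b)

    // saveAsTextFile creates a directory of part files, not a single file.
    wordCounts.saveAsTextFile(FILE_NAME + System.currentTimeMillis())
    println("Word Count program running results are successfully saved.")
    // Release the SparkContext once the job has finished.
    sc.stop()
  }
}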
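
To see what the flatMap / map / reduceByKey pipeline computes without starting Spark, the same steps can be sketched on ordinary Scala collections. This is an illustration only: `LocalWordCount`, `countWords` and the sample lines are hypothetical, and `groupBy` followed by a sum stands in for Spark's `reduceByKey`.

```scala
// Illustration only: plain Scala collections, no Spark required.
object LocalWordCount {

  // Mirrors the RDD pipeline: split lines into words, pair each word with 1,
  // then add up the 1s per word (groupBy + sum stands in for reduceByKey).
  def countWords(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(line => line.split(" "))
      .map(word => (word, 1))
      .groupBy { case (word, _) => word }
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }

  def main(args: Array[String]): Unit = {
    // Hypothetical sample input standing in for the lines of a text file.
    val sample = Seq("spark makes word count easy", "word count with spark")
    countWords(sample).toSeq.sortBy(_._1).foreach(println)
  }
}
```

In the Spark version the per-word sums are computed in parallel across partitions, but the result for each word is the same as in this single-machine sketch.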

3. The sbt build file

name := "SparkWordCount"

version := "1.0"

scalaVersion := "2.11.8"

libraryDependencies += "org.apache.spark" %% "spark-core" % "2.3.0"

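Change the Scala and Spark versions here to match your local installation; running spark-submit --version prints both. The jar path in step 6 contains the Scala binary version (2.11), so it changes if scalaVersion does.

4. Directory structure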

 

[Screenshot: project directory structure]

Only the entries marked with arrows matter; everything else is generated by the build and can be ignored.

 

5. Compile and package

sbt package

6. Submit to Spark and run

 

spark-submit --class SparkWordCount ./target/scala-2.11/sparkwordcount_2.11-1.0.jar /usr/local/Cellar/spark-2.3.0/README.md

The file passed as the argument must exist on your local machine.
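
The job writes its results to a directory named word_count_results_<timestamp> rather than to a single file. Each part-* file inside holds lines of the form (word,count), which is how saveAsTextFile renders tuples. Below is a minimal sketch for listing the most frequent words from such a directory, assuming the output landed on the local filesystem in the directory where spark-submit was run. `ShowTopWords`, `parseLine` and `topWords` are hypothetical helper names, not part of Spark.

```scala
import java.io.File
import scala.io.Source
import scala.util.Try

// Hypothetical helper, not part of Spark: reads the part-* files that
// saveAsTextFile wrote and prints the most frequent words.
object ShowTopWords {

  // Parses one output line such as "(spark,3)". The last comma is used as the
  // separator because the word itself may contain commas.
  def parseLine(line: String): Option[(String, Int)] = {
    val body = line.trim.stripPrefix("(").stripSuffix(")")
    val idx = body.lastIndexOf(',')
    if (idx < 0) None
    else Try((body.substring(0, idx), body.substring(idx + 1).toInt)).toOption
  }

  // Returns the n entries with the highest counts, largest first.
  def topWords(lines: Iterator[String], n: Int): Seq[(String, Int)] =
    lines.flatMap(parseLine).toSeq.sortBy { case (_, count) => -count }.take(n)

  def main(args: Array[String]): Unit = {
    // args(0): the word_count_results_<timestamp> directory (assumed local).
    val allFiles = Option(new File(args(0)).listFiles()).getOrElse(Array.empty[File])
    // Skip _SUCCESS and the hidden .crc checksum files.
    val partFiles = allFiles.filter(_.getName.startsWith("part-"))
    val lines = partFiles.iterator.flatMap { file =>
      val src = Source.fromFile(file, "UTF-8")
      try src.getLines().toList finally src.close()
    }
    topWords(lines, 10).foreach { case (word, count) => println(s"$word\t$count") }
  }
}
```

The output directory name is passed as the only argument. Lines that do not match the (word,count) shape are skipped rather than treated as errors.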

 

http://www.codeblogbt.com/archives/144037