
Fixing the Flume startup error when integrating Spark Streaming with Flume in pull mode

程序员文章站 2023-11-23 08:09:34

Flume configuration file:

simple-agent.sources = netcat-source
simple-agent.sinks = spark-sink
simple-agent.channels = memory-channel

# describe/configure the source
simple-agent.sources.netcat-source.type = netcat
simple-agent.sources.netcat-source.bind = centos
simple-agent.sources.netcat-source.port = 44444

# describe the sink (the class name is case-sensitive)
simple-agent.sinks.spark-sink.type = org.apache.spark.streaming.flume.sink.SparkSink
simple-agent.sinks.spark-sink.hostname = centos
simple-agent.sinks.spark-sink.port = 41414

simple-agent.channels.memory-channel.type = memory
simple-agent.channels.memory-channel.capacity = 1000
simple-agent.channels.memory-channel.transactionCapacity = 100

simple-agent.sources.netcat-source.channels = memory-channel
simple-agent.sinks.spark-sink.channel = memory-channel

However, starting Flume fails with the following error:

2019-10-16 11:35:14,559 (conf-file-poller-0) [ERROR - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:142)] Failed to load configuration data. Exception follows.
org.apache.flume.FlumeException: Unable to load sink type: org.apache.spark.streaming.flume.sink.SparkSink, class: org.apache.spark.streaming.flume.sink.SparkSink
    at org.apache.flume.sink.DefaultSinkFactory.getClass(DefaultSinkFactory.java:71)
    at org.apache.flume.sink.DefaultSinkFactory.create(DefaultSinkFactory.java:43)
    at org.apache.flume.node.AbstractConfigurationProvider.loadSinks(AbstractConfigurationProvider.java:410)
    at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:98)
    at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:140)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.streaming.flume.sink.SparkSink
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:264)
    at org.apache.flume.sink.DefaultSinkFactory.getClass(DefaultSinkFactory.java:69)
    ... 11 more

Solution:

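Because the agent's sink is of type org.apache.spark.streaming.flume.sink.SparkSink, spark-streaming-flume-sink_2.11-2.4.3.jar must be copied into Flume's lib directory. Otherwise Flume cannot load the org.apache.spark.streaming.flume.sink.SparkSink class and fails at startup with the ClassNotFoundException shown above.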
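
To confirm that the jar actually landed on Flume's classpath, it can help to check which jars in the lib directory contain the sink class. The following is only a minimal sketch, not part of Flume or Spark: the helper name jars_containing is hypothetical, and it assumes every jar in the directory is an ordinary zip archive.

```python
import zipfile
from pathlib import Path

# Fully qualified class name taken from the Flume sink configuration.
SINK_CLASS = "org.apache.spark.streaming.flume.sink.SparkSink"


def jars_containing(lib_dir, class_name=SINK_CLASS):
    """Return the file names of jars in lib_dir that contain class_name.

    Hypothetical helper for illustration; a jar is a zip archive, so the
    class is looked up as a path entry such as a/b/C.class.
    """
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that are not valid zip archives.
            continue
    return hits
```

Calling jars_containing on the Flume lib directory (wherever Flume is installed on your machine) should return a list that includes spark-streaming-flume-sink_2.11-2.4.3.jar once the copy has been done. An empty list means the class is still missing and the same ClassNotFoundException is to be expected.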
