Problems Encountered with Spark on YARN (Part 1)
Testing Spark on YARN.
Spark version: spark-0.9.0-incubating-bin-hadoop2
WordCount.scala code:
import org.apache.spark._
import SparkContext._

object WordCount {
  def main(args: Array[String]) {
    if (args.length != 3) {
      println("usage is org.test.WordCount <master> <input> <output>")
      return
    }
    val sc = new SparkContext(args(0), "WordCount",
      System.getenv("SPARK_HOME"), Seq(System.getenv("SPARK_TEST_JAR")))
    val textFile = sc.textFile(args(1))
    val result = textFile.flatMap(line => line.split("\\s+"))
                         .map(word => (word, 1))
                         .reduceByKey(_ + _)
    result.saveAsTextFile(args(2))
  }
}
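Independent of the cluster problem below, the transformation chain itself can be sanity-checked locally. The sketch below is a plain-Scala stand-in (not Spark's API) that mirrors the flatMap / map / reduceByKey pipeline on ordinary collections:

```scala
// Local sketch of the same pipeline: split each line on whitespace,
// then count occurrences per word, as reduceByKey(_ + _) would.
object LocalWordCount {
  def count(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split("\\s+"))
      .filter(_.nonEmpty)                // drop empty tokens from blank lines
      .groupBy(identity)                 // word -> all its occurrences
      .map { case (word, occurrences) => (word, occurrences.size) }
}
```

Running `LocalWordCount.count(Seq("a b a", "b"))` yields `Map("a" -> 2, "b" -> 2)`, matching what the Spark job would write out as (word, count) pairs.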
Running this program on Spark on YARN threw the following exception:
14/03/05 15:57:48 DEBUG ipc.Client: The ping interval is 60000 ms.
14/03/05 15:57:48 DEBUG ipc.Client: Connecting to 0.0.0.0/0.0.0.0:8030
14/03/05 15:57:49 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/03/05 15:57:50 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/03/05 15:57:51 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/03/05 15:57:52 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/03/05 15:57:53 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/03/05 15:57:54 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/03/05 15:57:55 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/03/05 15:57:56 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/03/05 15:57:57 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/03/05 15:57:58 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/03/05 15:57:58 DEBUG ipc.Client: closing ipc connection to 0.0.0.0/0.0.0.0:8030: Connection refused
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:735)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:547)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:642)
        at org.apache.hadoop.ipc.Client$Connection.access$2600(Client.java:314)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1399)
        at org.apache.hadoop.ipc.Client.call(Client.java:1318)
        at org.apache.hadoop.ipc.Client.call(Client.java:1300)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
        at com.sun.proxy.$Proxy11.registerApplicationMaster(Unknown Source)
        at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:106)
        at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy12.registerApplicationMaster(Unknown Source)
        at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.registerApplicationMaster(AMRMClientImpl.java:197)
        at org.apache.spark.deploy.yarn.ApplicationMaster.registerApplicationMaster(ApplicationMaster.scala:138)
        at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:102)
        at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:429)
        at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
14/03/05 15:57:58 DEBUG ipc.Client: IPC Client (1553281132) connection to 0.0.0.0/0.0.0.0:8030 from hadoop: closed
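Note that every retry targets 0.0.0.0:8030, not an actual RM host, which is the telltale detail. To rule out an ordinary network or firewall problem before touching configuration, the scheduler port can be probed directly. A minimal helper sketch (the same check could be done with `telnet master1 8030`):

```scala
import java.net.{InetSocketAddress, ServerSocket, Socket}

// Returns true if a TCP connection to host:port succeeds within timeoutMs.
object PortProbe {
  def isOpen(host: String, port: Int, timeoutMs: Int = 2000): Boolean = {
    val socket = new Socket()
    try {
      socket.connect(new InetSocketAddress(host, port), timeoutMs)
      true
    } catch {
      case _: java.io.IOException => false  // refused, unreachable, or timed out
    } finally {
      socket.close()
    }
  }
}
```

If `PortProbe.isOpen("master1", 8030)` succeeds while the application master still cannot register, the problem is address resolution on the client side rather than the RM itself.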
The exception shows that the client cannot connect to port 8030, the ResourceManager's scheduler port. That port was already configured in yarn-site.xml:
<property>
  <name>yarn.resourcemanager.scheduler.address.rm2</name>
  <value>master2:8030</value>
</property>
Yet the connection kept failing no matter what. The fix is to add the following to yarn-site.xml, explicitly specifying the RM hostname:
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>master1</value>
</property>
With this setting the connection succeeds; pinning down the exact cause would require reading the Spark source code.
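One plausible explanation, stated as an assumption rather than a reading of the Spark 0.9 source: keys with an `.rm2` suffix are only consulted when ResourceManager HA is enabled and the id `rm2` is declared. Otherwise clients fall back to `yarn.resourcemanager.scheduler.address`, which defaults to `${yarn.resourcemanager.hostname}:8030`, and `yarn.resourcemanager.hostname` itself defaults to `0.0.0.0`, exactly the address seen in the log. A minimal sketch of a yarn-site.xml in which the suffixed key would actually take effect (hostnames are placeholders):

```xml
<property>
  <name>yarn.resourcemanager.ha.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.rm-ids</name>
  <value>rm1,rm2</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm1</name>
  <value>master1</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm2</name>
  <value>master2</value>
</property>
```

Without the HA properties, setting the unsuffixed `yarn.resourcemanager.hostname` (as the fix above does) is what gives the client a real address to resolve.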