
Two Solutions to the Spark SerializedLambda Error


The Spark SerializedLambda error

When developing a Spark program in IDEA, you can run into a lambda serialization exception. The example below reproduces the problem, followed by two solutions.

Example

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;

public class SimpleApp {
    public static void main(String[] args) {
        String logFile = "/soft/dounine/github/spark-learn/README.md"; // should be some file on your system
        SparkConf sparkConf = new SparkConf()
                .setMaster("spark://localhost:7077")
                .setAppName("demo");
        JavaSparkContext sc = new JavaSparkContext(sparkConf);
        JavaRDD<String> logData = sc.textFile(logFile).cache();
        long numAs = logData.filter(s -> s.contains("a")).count();
        long numBs = logData.map(new Function<String, Integer>() {
            @Override
            public Integer call(String v1) throws Exception {
                return 1;
            }
        }).reduce((a, b) -> a + b);
        System.out.println("Lines with a: " + numAs + ", lines with b: " + numBs);
        sc.stop();
    }
}

Because the code uses JDK 1.8 lambda expressions, running it against the cluster produces the following exception:

18/08/06 15:18:41 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, 192.168.0.107, executor 0): java.lang.ClassCastException: cannot assign instance of java.lang.invoke.SerializedLambda to field org.apache.spark.api.java.JavaRDD$$anonfun$filter$1.f$1 of type org.apache.spark.api.java.function.Function in instance of org.apache.spark.api.java.JavaRDD$$anonfun$filter$1
    at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2233)
    at java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1405)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2290)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2208)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2066)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1570)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2284)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2208)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2066)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1570)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2284)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2208)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2066)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1570)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2284)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2208)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2066)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1570)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:430)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:80)
    at org.apache.spark.scheduler.Task.run(Task.scala:109)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
18/08/06 15:18:41 INFO TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1) on 192.168.0.107, executor 0: java.lang.ClassCastException (cannot assign instance of java.lang.invoke.SerializedLambda to field org.apache.spark.api.java.JavaRDD$$anonfun$filter$1.f$1 of type org.apache.spark.api.java.function.Function in instance of org.apache.spark.api.java.JavaRDD$$anonfun$filter$1) [duplicate 1]
... (tasks 0.1-0.3 and 1.1-1.3 are retried and fail with the same ClassCastException, duplicates 2-7)
18/08/06 15:18:41 ERROR TaskSetManager: Task 0 in stage 0.0 failed 4 times; aborting job
18/08/06 15:18:41 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
18/08/06 15:18:41 INFO TaskSchedulerImpl: Cancelling stage 0
18/08/06 15:18:41 INFO DAGScheduler: ResultStage 0 (count at SimpleApp.java:19) failed in 1.113 s due to Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 6, 192.168.0.107, executor 0): java.lang.ClassCastException: cannot assign instance of java.lang.invoke.SerializedLambda to field org.apache.spark.api.java.JavaRDD$$anonfun$filter$1.f$1 of type org.apache.spark.api.java.function.Function in instance of org.apache.spark.api.java.JavaRDD$$anonfun$filter$1
    ... (same executor-side stack trace as above)
Driver stacktrace:
18/08/06 15:18:41 INFO DAGScheduler: Job 0 failed: count at SimpleApp.java:19, took 1.138497 s
exception in thread "main" org.apache.spark.sparkexception: job aborted due to stage failure: task 0 in stage 0.0 failed 4 times, most recent failure: lost task 0.3 in stage 0.0 (tid 6, 192.168.0.107, executor 0): java.lang.classcastexception: cannot assign instance of java.lang.invoke.serializedlambda to field org.apache.spark.api.java.javardd$$anonfun$filter$1.f$1 of type org.apache.spark.api.java.function.function in instance of org.apache.spark.api.java.javardd$$anonfun$filter$1
    at java.io.objectstreamclass$fieldreflector.setobjfieldvalues(objectstreamclass.java:2233)
    at java.io.objectstreamclass.setobjfieldvalues(objectstreamclass.java:1405)
    at java.io.objectinputstream.defaultreadfields(objectinputstream.java:2290)
    at java.io.objectinputstream.readserialdata(objectinputstream.java:2208)
    at java.io.objectinputstream.readordinaryobject(objectinputstream.java:2066)
    at java.io.objectinputstream.readobject0(objectinputstream.java:1570)
    at java.io.objectinputstream.defaultreadfields(objectinputstream.java:2284)
    at java.io.objectinputstream.readserialdata(objectinputstream.java:2208)
    at java.io.objectinputstream.readordinaryobject(objectinputstream.java:2066)
    at java.io.objectinputstream.readobject0(objectinputstream.java:1570)
    at java.io.objectinputstream.defaultreadfields(objectinputstream.java:2284)
    at java.io.objectinputstream.readserialdata(objectinputstream.java:2208)
    at java.io.objectinputstream.readordinaryobject(objectinputstream.java:2066)
    at java.io.objectinputstream.readobject0(objectinputstream.java:1570)
    at java.io.objectinputstream.defaultreadfields(objectinputstream.java:2284)
    at java.io.objectinputstream.readserialdata(objectinputstream.java:2208)
    at java.io.objectinputstream.readordinaryobject(objectinputstream.java:2066)
    at java.io.objectinputstream.readobject0(objectinputstream.java:1570)
    at java.io.objectinputstream.readobject(objectinputstream.java:430)
    at org.apache.spark.serializer.javadeserializationstream.readobject(javaserializer.scala:75)
    at org.apache.spark.serializer.javaserializerinstance.deserialize(javaserializer.scala:114)
    at org.apache.spark.scheduler.resulttask.runtask(resulttask.scala:80)
    at org.apache.spark.scheduler.task.run(task.scala:109)
    at org.apache.spark.executor.executor$taskrunner.run(executor.scala:345)
    at java.util.concurrent.threadpoolexecutor.runworker(threadpoolexecutor.java:1149)
    at java.util.concurrent.threadpoolexecutor$worker.run(threadpoolexecutor.java:624)
    at java.lang.thread.run(thread.java:748)
driver stacktrace:
    at org.apache.spark.scheduler.dagscheduler.org$apache$spark$scheduler$dagscheduler$$failjobandindependentstages(dagscheduler.scala:1602)
	at org.apache.spark.scheduler.dagscheduler$$anonfun$abortstage$1.apply(dagscheduler.scala:1590)
    at org.apache.spark.scheduler.dagscheduler$$anonfun$abortstage$1.apply(dagscheduler.scala:1589)
	at scala.collection.mutable.resizablearray$class.foreach(resizablearray.scala:59)
	at scala.collection.mutable.arraybuffer.foreach(arraybuffer.scala:48)
	at org.apache.spark.scheduler.dagscheduler.abortstage(dagscheduler.scala:1589)
	at org.apache.spark.scheduler.dagscheduler$$anonfun$handletasksetfailed$1.apply(dagscheduler.scala:831)
    at org.apache.spark.scheduler.dagscheduler$$anonfun$handletasksetfailed$1.apply(dagscheduler.scala:831)
	at scala.option.foreach(option.scala:257)
	at org.apache.spark.scheduler.dagscheduler.handletasksetfailed(dagscheduler.scala:831)
	at org.apache.spark.scheduler.dagschedulereventprocessloop.doonreceive(dagscheduler.scala:1823)
	at org.apache.spark.scheduler.dagschedulereventprocessloop.onreceive(dagscheduler.scala:1772)
	at org.apache.spark.scheduler.dagschedulereventprocessloop.onreceive(dagscheduler.scala:1761)
	at org.apache.spark.util.eventloop$$anon$1.run(eventloop.scala:48)
    at org.apache.spark.scheduler.dagscheduler.runjob(dagscheduler.scala:642)
    at org.apache.spark.sparkcontext.runjob(sparkcontext.scala:2034)
    at org.apache.spark.sparkcontext.runjob(sparkcontext.scala:2055)
    at org.apache.spark.sparkcontext.runjob(sparkcontext.scala:2074)
    at org.apache.spark.sparkcontext.runjob(sparkcontext.scala:2099)
    at org.apache.spark.rdd.rdd.count(rdd.scala:1162)
    at org.apache.spark.api.java.javarddlike$class.count(javarddlike.scala:455)
    at org.apache.spark.api.java.abstractjavarddlike.count(javarddlike.scala:45)
    at com.dounine.spark.learn.simpleapp.main(simpleapp.java:19)
Caused by: java.lang.ClassCastException: cannot assign instance of java.lang.invoke.SerializedLambda to field org.apache.spark.api.java.JavaRDD$$anonfun$filter$1.f$1 of type org.apache.spark.api.java.function.Function in instance of org.apache.spark.api.java.JavaRDD$$anonfun$filter$1
    ... (same executor-side stack trace as above)
18/08/06 15:18:41 INFO SparkContext: Invoking stop() from shutdown hook
18/08/06 15:18:41 INFO SparkUI: Stopped Spark web UI at http://lake.dounine.com:4040
18/08/06 15:18:41 INFO StandaloneSchedulerBackend: Shutting down all executors
18/08/06 15:18:41 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asking each executor to shut down
18/08/06 15:18:41 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
18/08/06 15:18:41 INFO MemoryStore: MemoryStore cleared
18/08/06 15:18:41 INFO BlockManager: BlockManager stopped
18/08/06 15:18:41 INFO BlockManagerMaster: BlockManagerMaster stopped
18/08/06 15:18:41 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
18/08/06 15:18:41 INFO SparkContext: Successfully stopped SparkContext
18/08/06 15:18:41 INFO ShutdownHookManager: Shutdown hook called
18/08/06 15:18:41 INFO ShutdownHookManager: Deleting directory /tmp/spark-cf16df6e-fd04-4d17-8b6a-a6252793d0d5

This happens because the application jar is not distributed to the workers, so when an executor deserializes the task it cannot resolve the class that defines the lambda.

Solution 1

Add the jar's location to the configuration so Spark ships it to every worker:

SparkConf sparkConf = new SparkConf()
        .setMaster("spark://lake.dounine.com:7077")
        .setJars(new String[]{"/soft/dounine/github/spark-learn/build/libs/spark-learn-1.0-SNAPSHOT.jar"})
        .setAppName("demo");

Solution 2

Run in local development mode instead:

SparkConf sparkConf = new SparkConf()
        .setMaster("local")
        .setAppName("demo");

EOFException (Kryo) and SerializedLambda errors when running Spark

How to resolve the Kryo EOFException and the SerializedLambda error when running a Spark job.

Fixing the Kryo EOFException

Delete the lower-version Kryo jar from the project dependencies deployed to the Spark worker machines, as follows:

In the execution environment, delete kryo-2.21.jar and keep kryo-shaded-3.0.3.jar; the job then runs fine.

Inspection shows that kryo-shaded-3.0.3.jar and geowave-tools-0.9.8-apache.jar both contain the class com.esotericsoftware.kryo.io.UnsafeOutput.class (7,066 bytes), whereas kryo-2.21.jar does not, so when the old jar wins on the classpath that class goes missing at runtime.
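
To verify which jar is actually being picked up, a minimal diagnostic like the following can help (a hypothetical helper written for illustration, not part of the original setup); run it with the worker's classpath:

import java.security.CodeSource;

public class KryoJarCheck {
    public static void main(String[] args) {
        String[] names = {
                "com.esotericsoftware.kryo.Kryo",
                "com.esotericsoftware.kryo.io.UnsafeOutput"
        };
        for (String name : names) {
            try {
                Class<?> cls = Class.forName(name);
                // Print the jar file the class was actually loaded from.
                CodeSource src = cls.getProtectionDomain().getCodeSource();
                System.out.println(name + " -> " + (src != null ? src.getLocation() : "bootstrap"));
            } catch (ClassNotFoundException e) {
                // UnsafeOutput is absent from kryo-2.21.jar, so this branch suggests the old jar won.
                System.out.println(name + " -> not found on classpath");
            }
        }
    }
}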

The error appears in particular when executing JavaRDD.count() and JavaRDD.mapToPair(); the full message is:

java.io.EOFException
    at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:283)
    at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$8.apply(TorrentBroadcast.scala:308)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1380)
    at org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:309)
    at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1$$anonfun$apply$2.apply(TorrentBroadcast.scala:235)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:211)
    at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1346)
    at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:207)
    at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:66)
    at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:66)
    at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:96)
    at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:81)
    at org.apache.spark.scheduler.Task.run(Task.scala:109)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Fixing the "cannot assign instance of SerializedLambda" error

cannot assign instance of java.lang.invoke.SerializedLambda to field

Add a single line to the code:

conf.setJars(JavaSparkContext.jarOfClass(this.getClass()));

With that, the job runs without errors.
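
For context, here is a minimal sketch of where that line fits (class name and master URL are illustrative; from a static main you pass the class literal instead of this.getClass()):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class SimpleApp {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setMaster("spark://localhost:7077") // illustrative master URL
                .setAppName("demo");
        // jarOfClass locates the jar containing the given class so Spark can ship it to the executors.
        conf.setJars(JavaSparkContext.jarOfClass(SimpleApp.class));
        JavaSparkContext sc = new JavaSparkContext(conf);
        // ... transformations and actions ...
        sc.stop();
    }
}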

The detailed error message is as follows:

java.lang.ClassCastException: cannot assign instance of java.lang.invoke.SerializedLambda to field org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.x$334 of type org.apache.spark.api.java.function.PairFunction in instance of org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1
    at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2233)
    at java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1405)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2291)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2209)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2067)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1571)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2285)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2209)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2067)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1571)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2285)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2209)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2067)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1571)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2285)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2209)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2067)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1571)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:431)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:85)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
    at org.apache.spark.scheduler.Task.run(Task.scala:109)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    ... 1 more

All of the above comes from personal experience; I hope it serves as a useful reference.