Scala: Method Examples for File Reading, Writing, and Console Operations
程序员文章站
2024-02-14 15:05:34
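Reading a file in Scala
The file scalaio.txt in the root of drive E: has the following content:
[Figure: contents of scalaio.txt]
Sample code for reading the file: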
// Read a file line by line
import scala.io.Source
val file = Source.fromFile("e:\\scalaio.txt")
for (line <- file.getLines()) {
  println(line)
}
// Release the underlying file handle
file.close()
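Note 1: In val file = Source.fromFile("e:\\scalaio.txt"), the fromFile() method is defined on the Source object, which is brought into scope with import scala.io.Source.
[Figure: source code of Source.fromFile]
file.getLines() returns an iterator (an Iterator[String]), so the lines are produced lazily and can be traversed only once.
[Figure: source code of getLines in scala.io]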
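Because getLines() returns a lazy iterator, the lines have to be consumed before the Source is closed. The following is a minimal sketch of one way to do that safely, not the author's code. It assumes Scala 2.13 or later (for scala.util.Using) and UTF-8 input; the object name ReadLinesSketch, the helper readLines, and the temporary sample file are hypothetical.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}
import scala.io.{Codec, Source}
import scala.util.Using

object ReadLinesSketch {
  // Reads all lines of a file and closes the Source even if reading fails.
  // Assumes the file is UTF-8 encoded.
  def readLines(path: Path): List[String] =
    Using.resource(Source.fromFile(path.toFile)(Codec.UTF8)) { src =>
      // getLines() is a lazy, single-pass Iterator[String];
      // toList forces it while the Source is still open.
      src.getLines().toList
    }

  def main(args: Array[String]): Unit = {
    // Hypothetical sample data written to a temporary file for the demo.
    val tmp = Files.createTempFile("scalaio", ".txt")
    Files.write(tmp, "first line\nsecond line\n".getBytes(StandardCharsets.UTF_8))
    try readLines(tmp).foreach(println)
    finally Files.deleteIfExists(tmp)
  }
}
```

Materializing the iterator with toList is convenient for small files; for large files it may be preferable to process each line inside the Using block instead.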
Reading a network resource in Scala
// Read a network resource
val webFile = Source.fromURL("http://spark.apache.org")
webFile.foreach(print)
webFile.close()
The source of the fromURL() method is as follows:
/** same as fromURL(new URL(s)) */
def fromURL(s: String)(implicit codec: Codec): BufferedSource =
  fromURL(new URL(s))(codec)
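The second parameter list of fromURL takes an implicit Codec, so the character encoding can either be passed explicitly or picked up from an implicit value in scope. The sketch below illustrates both styles under stated assumptions: Scala 2.13 or later, UTF-8 content, and a local file: URL standing in for a web address so that no network access is needed. The names FromUrlCodecSketch, readExplicit, and readImplicit are hypothetical.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files
import scala.io.{Codec, Source}
import scala.util.Using

object FromUrlCodecSketch {
  // Passes the codec explicitly through fromURL's second parameter list.
  def readExplicit(url: String): String =
    Using.resource(Source.fromURL(url)(Codec.UTF8))(_.mkString)

  // Lets the compiler supply the codec from an implicit value in local scope.
  def readImplicit(url: String): String = {
    implicit val codec: Codec = Codec.UTF8
    Using.resource(Source.fromURL(url))(_.mkString)
  }

  def main(args: Array[String]): Unit = {
    // A temporary file exposed as a file: URL replaces a real web address here.
    val tmp = Files.createTempFile("fromurl", ".txt")
    Files.write(tmp, "hello from a URL".getBytes(StandardCharsets.UTF_8))
    val url = tmp.toUri.toString
    try {
      println(readExplicit(url))
      println(readImplicit(url))
    } finally Files.deleteIfExists(tmp)
  }
}
```

When no codec is supplied at all, Scala falls back to a default derived from the platform settings, which is why the same program can behave differently from one machine to another.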
The content read from the network resource is as follows:
<!doctype html> <html lang="en"> <head> <meta charset="utf-8"> <meta http-equiv="x-ua-compatible" content="ie=edge"> <meta name="viewport" content="width=device-width, initial-scale=1.0"> <title> apache spark™ - lightning-fast cluster computing </title> <meta name="description" content="apache spark is a fast and general engine for big data processing, with built-in modules for streaming, sql, machine learning and graph processing."> <!-- bootstrap core css --> <link href="/css/cerulean.min.css" rel="external nofollow" rel="stylesheet"> <link href="/css/custom.css" rel="external nofollow" rel="stylesheet"> <script type="text/javascript"> <!-- google analytics initialization --> var _gaq = _gaq || []; _gaq.push(['_setaccount', 'ua-32518208-2']); _gaq.push(['_trackpageview']); (function() { var ga = document.createelement('script'); ga.type = 'text/javascript'; ga.async = true; ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js'; var s = document.getelementsbytagname('script')[0]; s.parentnode.insertbefore(ga, s); })(); <!-- adds slight delay to links to allow async reporting --> function trackoutboundlink(link, category, action) { try { _gaq.push(['_trackevent', category , action]); } catch(err){} settimeout(function() { document.location.href = link.href; }, 100); } </script> <!-- html5 shim and respond.js ie8 support of html5 elements and media queries --> <!--[if lt ie 9]> <script src="https://oss.maxcdn.com/libs/html5shiv/3.7.0/html5shiv.js"></script> <script src="https://oss.maxcdn.com/libs/respond.js/1.3.0/respond.min.js"></script> <![endif]--> </head> <body> <script src="https://code.jquery.com/jquery.js"></script> <script src="//netdna.bootstrapcdn.com/bootstrap/3.0.3/js/bootstrap.min.js"></script> <script src="/js/lang-tabs.js"></script> <script src="/js/downloads.js"></script> <div class="container" style="max-width: 1200px;"> <div class="masthead"> <p class="lead"> <a href="/" rel="external 
nofollow" > <img src="/images/spark-logo.png" style="height:100px; width:auto; vertical-align: bottom; margin-top: 20px;"></a><span class="tagline"> lightning-fast cluster computing </span> </p> </div> <nav class="navbar navbar-default" role="navigation"> <!-- brand and toggle get grouped for better mobile display --> <div class="navbar-header"> <button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#navbar-collapse-1"> <span class="sr-only">toggle navigation</span> <span class="icon-bar"></span> <span class="icon-bar"></span> <span class="icon-bar"></span> </button> </div> <!-- collect the nav links, forms, and other content for toggling --> <div class="collapse navbar-collapse" id="navbar-collapse-1"> <ul class="nav navbar-nav"> <li><a href="/downloads.html" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" >download</a></li> <li class="dropdown"> <a href="#" rel="external nofollow" rel="external nofollow" class="dropdown-toggle" data-toggle="dropdown"> libraries <b class="caret"></b> </a> <ul class="dropdown-menu"> <li><a href="/sql/" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" >sql and dataframes</a></li> <li><a href="/streaming/" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" >spark streaming</a></li> <li><a href="/mllib/" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" >mllib (machine learning)</a></li> <li><a href="/graphx/" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" >graphx (graph)</a></li> <li class="divider"></li> <li><a href="http://spark-packages.org" rel="external nofollow" rel="external nofollow" >third-party packages</a></li> </ul> </li> <li class="dropdown"> <a href="#" rel="external nofollow" rel="external nofollow" class="dropdown-toggle" data-toggle="dropdown"> documentation <b 
class="caret"></b> </a> <ul class="dropdown-menu"> <li><a href="/docs/latest/" rel="external nofollow" >latest release (spark 1.5.1)</a></li> <li><a href="/documentation.html" rel="external nofollow" >other resources</a></li> </ul> </li> <li><a href="/examples.html" rel="external nofollow" >examples</a></li> <li class="dropdown"> <a href="/community.html" rel="external nofollow" rel="external nofollow" class="dropdown-toggle" data-toggle="dropdown"> community <b class="caret"></b> </a> <ul class="dropdown-menu"> <li><a href="/community.html" rel="external nofollow" rel="external nofollow" >mailing lists</a></li> <li><a href="/community.html#events" rel="external nofollow" >events and meetups</a></li> <li><a href="/community.html#history" rel="external nofollow" >project history</a></li> <li><a href="https://cwiki.apache.org/confluence/display/spark/powered+by+spark" rel="external nofollow" rel="external nofollow" >powered by</a></li> <li><a href="https://cwiki.apache.org/confluence/display/spark/committers" rel="external nofollow" rel="external nofollow" >project committers</a></li> <li><a href="https://issues.apache.org/jira/browse/spark" rel="external nofollow" rel="external nofollow" >issue tracker</a></li> </ul> </li> <li><a href="/faq.html" rel="external nofollow" >faq</a></li> </ul> </div> <!-- /.navbar-collapse --> </nav> <div class="row"> <div class="col-md-3 col-md-push-9"> <div class="news" style="margin-bottom: 20px;"> <h5>latest news</h5> <ul class="list-unstyled"> <li><a href="/news/submit-talks-to-spark-summit-east-2016.html" rel="external nofollow" >submission is open for spark summit east 2016</a> <span class="small">(oct 14, 2015)</span></li> <li><a href="/news/spark-1-5-1-released.html" rel="external nofollow" >spark 1.5.1 released</a> <span class="small">(oct 02, 2015)</span></li> <li><a href="/news/spark-1-5-0-released.html" rel="external nofollow" >spark 1.5.0 released</a> <span class="small">(sep 09, 2015)</span></li> <li><a 
href="/news/spark-summit-europe-agenda-posted.html" rel="external nofollow" >spark summit europe agenda posted</a> <span class="small">(sep 07, 2015)</span></li> </ul> <p class="small" style="text-align: right;"><a href="/news/index.html" rel="external nofollow" >archive</a></p> </div> <div class="hidden-xs hidden-sm"> <a href="/downloads.html" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" class="btn btn-success btn-lg btn-block" style="margin-bottom: 30px;"> download spark </a> <p style="font-size: 16px; font-weight: 500; color: #555;"> built-in libraries: </p> <ul class="list-none"> <li><a href="/sql/" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" >sql and dataframes</a></li> <li><a href="/streaming/" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" >spark streaming</a></li> <li><a href="/mllib/" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" >mllib (machine learning)</a></li> <li><a href="/graphx/" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" >graphx (graph)</a></li> </ul> <a href="http://spark-packages.org" rel="external nofollow" rel="external nofollow" >third-party packages</a> </div> </div> <div class="col-md-9 col-md-pull-3"> <div class="jumbotron"> <b>apache spark™</b> is a fast and general engine for large-scale data processing. </div> <div class="row row-padded"> <div class="col-md-7 col-sm-7"> <h2>speed</h2> <p class="lead"> run programs up to 100x faster than hadoop mapreduce in memory, or 10x faster on disk. </p> <p> spark has an advanced dag execution engine that supports cyclic data flow and in-memory computing. 
</p> </div> <div class="col-md-5 col-sm-5 col-padded-top col-center"> <div style="width: 100%; max-width: 272px; display: inline-block; text-align: center;"> <img src="/images/logistic-regression.png" style="width: 100%; max-width: 250px;" /> <div class="caption" style="min-width: 272px;">logistic regression in hadoop and spark</div> </div> </div> </div> <div class="row row-padded"> <div class="col-md-7 col-sm-7"> <h2>ease of use</h2> <p class="lead"> write applications quickly in java, scala, python, r. </p> <p> spark offers over 80 high-level operators that make it easy to build parallel apps. and you can use it <em>interactively</em> from the scala, python and r shells. </p> </div> <div class="col-md-5 col-sm-5 col-padded-top col-center"> <div style="text-align: left; display: inline-block;"> <div class="code"> text_file = spark.textfile(<span class="string">"hdfs://..."</span>)<br /> <br /> text_file.<span class="sparkop">flatmap</span>(<span class="closure">lambda line: line.split()</span>)<br /> .<span class="sparkop">map</span>(<span class="closure">lambda word: (word, 1)</span>)<br /> .<span class="sparkop">reducebykey</span>(<span class="closure">lambda a, b: a+b</span>) </div> <div class="caption">word count in spark's python api</div> </div> <!-- <div class="code" style="margin-top: 20px; text-align: left; display: inline-block;"> text_file = spark.textfile(<span class="string">"hdfs://..."</span>)<br/> <br/> text_file.<span class="sparkop">filter</span>(<span class="closure">lambda line: "error" in line</span>)<br/> .<span class="sparkop">count</span>() </div> --> <!--<div class="caption">word count in spark</div>--> </div> </div> <div class="row row-padded"> <div class="col-md-7 col-sm-7"> <h2>generality</h2> <p class="lead"> combine sql, streaming, and complex analytics. 
</p> <p> spark powers a stack of libraries including <a href="/sql/" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" >sql and dataframes</a>, <a href="/mllib/" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" >mllib</a> for machine learning, <a href="/graphx/" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" >graphx</a>, and <a href="/streaming/" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" >spark streaming</a>. you can combine these libraries seamlessly in the same application. </p> </div> <div class="col-md-5 col-sm-5 col-padded-top col-center"> <img src="/images/spark-stack.png" style="margin-top: 15px; width: 100%; max-width: 296px;" usemap="#stack-map" /> <map name="stack-map"> <area shape="rect" coords="0,0,74,95" href="/sql/" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" alt="spark sql" title="spark sql" /> <area shape="rect" coords="74,0,150,95" href="/streaming/" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" alt="spark streaming" title="spark streaming" /> <area shape="rect" coords="150,0,224,95" href="/mllib/" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" alt="mllib (machine learning)" title="mllib" /> <area shape="rect" coords="225,0,300,95" href="/graphx/" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" alt="graphx" title="graphx" /> </map> </div> </div> <div class="row row-padded" style="margin-bottom: 15px;"> <div class="col-md-7 col-sm-7"> <h2>runs everywhere</h2> <p class="lead"> spark runs on hadoop, mesos, standalone, or in the cloud. it can access diverse data sources including hdfs, cassandra, hbase, and s3. 
</p> <p> you can run spark using its <a href="/docs/latest/spark-standalone.html" rel="external nofollow" >standalone cluster mode</a>, on <a href="/docs/latest/ec2-scripts.html" rel="external nofollow" >ec2</a>, on hadoop yarn, or on <a href="http://mesos.apache.org" rel="external nofollow" >apache mesos</a>. access data in <a href="http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/hdfsuserguide.html" rel="external nofollow" >hdfs</a>, <a href="http://cassandra.apache.org" rel="external nofollow" >cassandra</a>, <a href="http://hbase.apache.org" rel="external nofollow" >hbase</a>, <a href="http://hive.apache.org" rel="external nofollow" >hive</a>, <a href="http://tachyon-project.org" rel="external nofollow" >tachyon</a>, and any hadoop data source. </p> </div> <div class="col-md-5 col-sm-5 col-padded-top col-center"> <img src="/images/spark-runs-everywhere.png" style="width: 100%; max-width: 280px;" /> </div> </div> </div> </div> <div class="row"> <div class="col-md-4 col-padded"> <h3>community</h3> <p> spark is used at a wide range of organizations to process large datasets. you can find example use cases at the <a href="http://spark-summit.org/summit-2013/" rel="external nofollow" >spark summit</a> conference, or on the <a href="https://cwiki.apache.org/confluence/display/spark/powered+by+spark" rel="external nofollow" rel="external nofollow" >powered by</a> page. 
</p> <p> there are many ways to reach the community: </p> <ul class="list-narrow"> <li>use the <a href="/community.html#mailing-lists" rel="external nofollow" >mailing lists</a> to ask questions.</li> <li>in-person events include the <a href="http://www.meetup.com/spark-users/" rel="external nofollow" >bay area spark meetup</a> and <a href="http://spark-summit.org/" rel="external nofollow" >spark summit</a>.</li> <li>we use <a href="https://issues.apache.org/jira/browse/spark" rel="external nofollow" rel="external nofollow" >jira</a> for issue tracking.</li> </ul> </div> <div class="col-md-4 col-padded"> <h3>contributors</h3> <p> apache spark is built by a wide set of developers from over 200 companies. since 2009, more than 800 developers have contributed to spark! </p> <p> the project's <a href="https://cwiki.apache.org/confluence/display/spark/committers" rel="external nofollow" rel="external nofollow" >committers</a> come from 16 organizations. </p> <p> if you'd like to participate in spark, or contribute to the libraries on top of it, learn <a href="https://cwiki.apache.org/confluence/display/spark/contributing+to+spark" rel="external nofollow" >how to contribute</a>. </p> </div> <div class="col-md-4 col-padded"> <h3>getting started</h3> <p>learning spark is easy whether you come from a java or python background:</p> <ul class="list-narrow"> <li><a href="/downloads.html" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" >download</a> the latest release — you can run spark locally on your laptop.</li> <li>read the <a href="/docs/latest/quick-start.html" rel="external nofollow" >quick start guide</a>.</li> <li> spark summit 2014 contained free <a href="http://spark-summit.org/2014/training" rel="external nofollow" >training videos and exercises</a>. 
</li> <li>learn how to <a href="/docs/latest/#launching-on-a-cluster" rel="external nofollow" >deploy</a> spark on a cluster.</li> </ul> </div> </div> <div class="row"> <div class="col-sm-12 col-center"> <a href="/downloads.html" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" class="btn btn-success btn-lg" style="width: 262px;">download spark</a> </div> </div> <footer class="small"> <hr> apache spark, spark, apache, and the spark logo are trademarks of <a href="http://www.apache.org" rel="external nofollow" >the apache software foundation</a>. </footer> </div> </body> </html> process finished with exit code 0
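// Read a network resource
val webFile = Source.fromURL("http://www.baidu.com/")
webFile.foreach(print)
webFile.close()
Reading a Chinese-language site produces the character encoding error shown below. (Resolving it is left to the reader, since it is not the focus of this article.)
Exception in thread "main" java.nio.charset.MalformedInputException: Input length = 1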
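A MalformedInputException of this kind typically means the bytes were decoded with a codec that does not match their actual encoding. The sketch below reproduces that situation locally and shows two possible workarounds. It is one approach under stated assumptions, not the author's solution: it uses an in-memory byte array instead of a real web page, deliberately picks US-ASCII as the mismatched codec, and the names CodecMismatchSketch, utf8Bytes, and decode are hypothetical.

```scala
import java.io.ByteArrayInputStream
import java.nio.charset.{CodingErrorAction, StandardCharsets}
import scala.io.{Codec, Source}
import scala.util.Try

object CodecMismatchSketch {
  // "\u4e2d\u6587" is a two-character Chinese string, encoded here as UTF-8 bytes.
  val utf8Bytes: Array[Byte] = "\u4e2d\u6587".getBytes(StandardCharsets.UTF_8)

  // Decodes the bytes with the given codec; a Failure carries the decoding exception.
  def decode(bytes: Array[Byte], codec: Codec): Try[String] =
    Try {
      val src = Source.fromInputStream(new ByteArrayInputStream(bytes))(codec)
      try src.mkString finally src.close()
    }

  def main(args: Array[String]): Unit = {
    // Mismatched codec: US-ASCII cannot decode bytes >= 0x80, so decoding fails.
    println(decode(utf8Bytes, Codec(StandardCharsets.US_ASCII)))
    // Workaround 1: use the codec that matches the data.
    println(decode(utf8Bytes, Codec.UTF8))
    // Workaround 2: a lenient codec that replaces malformed bytes instead of failing.
    // A fresh Codec is created because onMalformedInput mutates the instance.
    val lenient = Codec(StandardCharsets.US_ASCII)
      .onMalformedInput(CodingErrorAction.REPLACE)
    println(decode(utf8Bytes, lenient))
  }
}
```

For a web page, the same idea would apply by passing the page's actual encoding as the codec to Source.fromURL. The lenient variant avoids the exception but loses the undecodable characters, so it is better suited to diagnostics than to real processing.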
That is all for this article. I hope it helps with your studies.