Running a Java program (jar) on Hadoop with arguments specified dynamically at run time
1) First start the two groups of Hadoop daemons. Go to the hadoop/sbin directory and run the following commands in order:
[root@node02 sbin]# pwd
/usr/server/hadoop/hadoop-2.7.0/sbin
sh start-dfs.sh
sh start-yarn.sh
jps
2) Check with jps that everything came up correctly; make sure the following six processes are present:
[root@node02 sbin]# jps
10096 DataNode
6952 NodeManager
9962 NameNode
10269 SecondaryNameNode
12526 Jps
6670 ResourceManager
3) If the job needs an input file, upload it to /input on HDFS. If the upload prints a warning like the following,
[root@node02 hadoop-2.7.0]# hadoop fs -put sample.txt /input
21/01/02 01:13:15 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
add the following lines to the environment variables:
[root@node02 ~]# vim /etc/profile
export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_PREFIX}/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_PREFIX/lib"
(then run source /etc/profile, or log in again, so the new variables take effect)
4) Go to the Hadoop root directory (the exact path depends on where you installed it):
[root@node02 hadoop-2.7.0]# pwd
/usr/server/hadoop/hadoop-2.7.0
5) Create the /input directory on the HDFS file system (it will hold the input files):
hadoop fs -mkdir /input
6) Upload the test file. You first need to copy it to the Hadoop root directory yourself (this is only for testing; in production keep it in a dedicated directory):
[root@node02 hadoop-2.7.0]# hadoop fs -put sample.txt /input
7) Check that the uploaded file exists:
[root@node02 hadoop-2.7.0]# hadoop fs -ls /input
-rw-r--r--   1 root supergroup        529 2021-01-02 01:13 /input/sample.txt
8) Upload the jar to the Hadoop root directory (in production, put it in a dedicated directory). The test jar here is study_demo.jar:
[root@node02 hadoop-2.7.0]# ll
total 1968
drwxr-xr-x. 2 10021 10021    4096 Apr 11 2015 bin
drwxr-xr-x. 3 10021 10021    4096 Apr 11 2015 etc
drwxr-xr-x. 2 10021 10021    4096 Apr 11 2015 include
drwxr-xr-x. 3 10021 10021    4096 Apr 11 2015 lib
drwxr-xr-x. 2 10021 10021    4096 Apr 11 2015 libexec
-rw-r--r--. 1 10021 10021   15429 Apr 11 2015 LICENSE.txt
drwxr-xr-x. 3 root  root     4096 Jan  2 01:36 logs
-rw-r--r--. 1 10021 10021     101 Apr 11 2015 NOTICE.txt
-rw-r--r--. 1 10021 10021    1366 Apr 11 2015 README.txt
drwxr-xr-x. 2 10021 10021    4096 Apr 11 2015 sbin
drwxr-xr-x. 4 10021 10021    4096 Apr 11 2015 share
-rw-r--r--. 1 root  root  1956989 Jun 14 2021 study_demo.jar
9) Run the jar with hadoop; the main class must be given by its fully qualified name, and the input and output paths are passed as command-line arguments (a sketch of such a driver follows the command below):
hadoop jar study_demo.jar com.ncst.hadoop.MaxTemperature /input/sample.txt /output
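The paths /input/sample.txt and /output are not hard-coded in the jar; they are the arguments that follow the class name and are read in the driver's main method, which is what makes the parameters dynamic at run time. The source of study_demo.jar is not shown in this article, so the following is only a minimal sketch of what such a driver could look like (the package, class, mapper, and reducer names are assumptions mirroring the command above, and it uses the older org.apache.hadoop.mapred API, consistent with the part-00000 file name that appears later):

package com.ncst.hadoop;

import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

// Hypothetical driver: the input and output paths are taken from the
// command-line arguments, so they can be changed on every "hadoop jar"
// invocation without rebuilding the jar.
public class MaxTemperature {

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("Usage: MaxTemperature <input path> <output path>");
            System.exit(-1);
        }

        JobConf conf = new JobConf(MaxTemperature.class);
        conf.setJobName("Max temperature");

        // args[0] and args[1] come straight from the command line,
        // here /input/sample.txt and /output.
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        // assumed map and reduce classes; a sketch of them follows step 13
        conf.setMapperClass(MaxTemperatureMapper.class);
        conf.setReducerClass(MaxTemperatureReducer.class);

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        JobClient.runJob(conf);
    }
}

Because the paths arrive through args, the same jar can be pointed at a different input file or output directory on each run, without recompiling.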
10) Abbreviated output of the run:
21/01/02 01:37:54 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=61
        FILE: Number of bytes written=342877
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=974
        HDFS: Number of bytes written=17
        HDFS: Number of read operations=9
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=2
        Launched reduce tasks=1
        Data-local map tasks=2
        Total time spent by all maps in occupied slots (ms)=14668
        Total time spent by all reduces in occupied slots (ms)=4352
        Total time spent by all map tasks (ms)=14668
        Total time spent by all reduce tasks (ms)=4352
        Total vcore-seconds taken by all map tasks=14668
        Total vcore-seconds taken by all reduce tasks=4352
        Total megabyte-seconds taken by all map tasks=15020032
        Total megabyte-seconds taken by all reduce tasks=4456448
    Map-Reduce Framework
        Map input records=5
        Map output records=5
        Map output bytes=45
        Map output materialized bytes=67
        Input split bytes=180
        Combine input records=0
        Combine output records=0
        Reduce input groups=2
        Reduce shuffle bytes=67
        Reduce input records=5
        Reduce output records=2
        Spilled Records=10
        Shuffled Maps =2
        Failed Shuffles=0
        Merged Map outputs=2
        GC time elapsed (ms)=525
        CPU time spent (ms)=2510
        Physical memory (bytes) snapshot=641490944
        Virtual memory (bytes) snapshot=6241415168
        Total committed heap usage (bytes)=476053504
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=794
    File Output Format Counters
        Bytes Written=17
11) After the job finishes successfully, list the HDFS root again; a new /output directory has appeared:
[root@node02 hadoop-2.7.0]# hadoop fs -ls /
drwxr-xr-x   - root supergroup          0 2021-01-02 01:13 /input
drwxr-xr-x   - root supergroup          0 2021-01-02 01:37 /output
drwx------   - root supergroup          0 2021-01-02 01:37 /tmp
12) List the files in the /output directory:
[root@node02 hadoop-2.7.0]# hadoop fs -ls /output
-rw-r--r--   1 root supergroup          0 2021-01-02 01:37 /output/_SUCCESS
-rw-r--r--   1 root supergroup         17 2021-01-02 01:37 /output/part-00000
13) View the contents of the part-00000 file. This test case extracts the highest temperature (in Fahrenheit) for 1949 and 1950 (a sketch of the map and reduce classes that would produce this result follows the output below):
[root@node02 hadoop-2.7.0]# hadoop fs -cat /output/part-00000
1949    111
1950    22
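The article shows neither the contents of sample.txt nor the source of the map and reduce classes, so the two classes below are only a guessed reconstruction: the reducer's logic (keep the largest value per year) follows directly from the output above, while the mapper assumes, purely for illustration, that each input line holds a year and a temperature separated by whitespace. They use the same old mapred API as the driver sketch in step 9 and would live in two source files inside the jar:

// MaxTemperatureMapper.java: hypothetical record format "<year> <temperature>"
package com.ncst.hadoop;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class MaxTemperatureMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {

    @Override
    public void map(LongWritable offset, Text line,
                    OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
        String[] fields = line.toString().trim().split("\\s+");
        if (fields.length >= 2) {
            // emit (year, temperature) pairs, e.g. ("1949", 78)
            output.collect(new Text(fields[0]),
                    new IntWritable(Integer.parseInt(fields[1])));
        }
    }
}

// MaxTemperatureReducer.java: keeps the largest temperature seen per year,
// which is exactly what the part-00000 output above contains.
package com.ncst.hadoop;

import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class MaxTemperatureReducer extends MapReduceBase
        implements Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    public void reduce(Text year, Iterator<IntWritable> temperatures,
                       OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
        int max = Integer.MIN_VALUE;
        while (temperatures.hasNext()) {
            max = Math.max(max, temperatures.next().get());
        }
        output.collect(year, new IntWritable(max));
    }
}

This also lines up with the counters in step 10: 5 map output records collapse into 2 reduce input groups (1949 and 1950) and 2 reduce output records.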
14) You can also open the web UI in a browser, using the Hadoop server's address and the port you configured, and check that the sample.txt file just uploaded is visible under the input directory:
http://192.168.194.xxx:50070/
15) The test jar and the test file have been uploaded to GitHub; that repository also contains interview experience write-ups and interview questions I have compiled.
If you are interested, you can also take a look at my flash-sale (seckill) system.
That is the full walkthrough of running a Java program (jar) on Hadoop and specifying its arguments dynamically at run time. For more material on running Java programs on Hadoop, please see the other related articles!