Running a Java program (jar) on Hadoop with arguments specified dynamically at run time
1) First start the two groups of Hadoop daemons. Go to the hadoop/sbin directory and run the following commands in order:
[root@node02 sbin]# pwd
/usr/server/hadoop/hadoop-2.7.0/sbin
sh start-dfs.sh
sh start-yarn.sh
jps
2) Check with jps that everything came up correctly; make sure the following six processes are present:
[root@node02 sbin]# jps
10096 DataNode
6952 NodeManager
9962 NameNode
10269 SecondaryNameNode
12526 Jps
6670 ResourceManager
3) If the job needs an input file, upload it to /input on HDFS. If the upload prints a warning like the following,
[root@node02 hadoop-2.7.0]# hadoop fs -put sample.txt /input
21/01/02 01:13:15 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
add the following lines to the environment variables:
[root@node02 ~]# vim /etc/profile
export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_PREFIX}/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_PREFIX/lib"
(then run source /etc/profile, or log in again, so the new variables take effect)
4) Go to the Hadoop root directory (the exact path depends on where you installed it):
[root@node02 hadoop-2.7.0]# pwd
/usr/server/hadoop/hadoop-2.7.0
5) Create the /input directory on the HDFS file system (it will hold the input files):
hadoop fs -mkdir /input
6) Upload the test file. You first need to copy it to the Hadoop root directory yourself (this is only for testing; in production keep it in a dedicated directory):
[root@node02 hadoop-2.7.0]# hadoop fs -put sample.txt /input
7) Check that the uploaded file exists:
[root@node02 hadoop-2.7.0]# hadoop fs -ls /input
-rw-r--r--   1 root supergroup        529 2021-01-02 01:13 /input/sample.txt
8) Upload the jar to the Hadoop root directory (in production, put it in a dedicated directory). The test jar here is study_demo.jar:
[root@node02 hadoop-2.7.0]# ll
total 1968
drwxr-xr-x. 2 10021 10021    4096 Apr 11 2015 bin
drwxr-xr-x. 3 10021 10021    4096 Apr 11 2015 etc
drwxr-xr-x. 2 10021 10021    4096 Apr 11 2015 include
drwxr-xr-x. 3 10021 10021    4096 Apr 11 2015 lib
drwxr-xr-x. 2 10021 10021    4096 Apr 11 2015 libexec
-rw-r--r--. 1 10021 10021   15429 Apr 11 2015 LICENSE.txt
drwxr-xr-x. 3 root  root     4096 Jan  2 01:36 logs
-rw-r--r--. 1 10021 10021     101 Apr 11 2015 NOTICE.txt
-rw-r--r--. 1 10021 10021    1366 Apr 11 2015 README.txt
drwxr-xr-x. 2 10021 10021    4096 Apr 11 2015 sbin
drwxr-xr-x. 4 10021 10021    4096 Apr 11 2015 share
-rw-r--r--. 1 root  root  1956989 Jun 14 2021 study_demo.jar
9) Run the jar with hadoop; the main class must be given by its fully qualified name, and the input and output paths are passed as command-line arguments (a sketch of such a driver follows the command below):
hadoop jar study_demo.jar com.ncst.hadoop.MaxTemperature /input/sample.txt /output
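The paths /input/sample.txt and /output are not hard-coded in the jar; they are the arguments that follow the class name and are read in the driver's main method, which is what makes the parameters dynamic at run time. The source of study_demo.jar is not shown in this article, so the following is only a minimal sketch of what such a driver could look like (the package, class, mapper, and reducer names are assumptions mirroring the command above, and it uses the older org.apache.hadoop.mapred API, consistent with the part-00000 file name that appears later):

package com.ncst.hadoop;

import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

// Hypothetical driver: the input and output paths are taken from the
// command-line arguments, so they can be changed on every "hadoop jar"
// invocation without rebuilding the jar.
public class MaxTemperature {

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("Usage: MaxTemperature <input path> <output path>");
            System.exit(-1);
        }

        JobConf conf = new JobConf(MaxTemperature.class);
        conf.setJobName("Max temperature");

        // args[0] and args[1] come straight from the command line,
        // here /input/sample.txt and /output.
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        // assumed map and reduce classes; a sketch of them follows step 13
        conf.setMapperClass(MaxTemperatureMapper.class);
        conf.setReducerClass(MaxTemperatureReducer.class);

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        JobClient.runJob(conf);
    }
}

Because the paths arrive through args, the same jar can be pointed at a different input file or output directory on each run, without recompiling.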
10) Abbreviated output of the run:
21/01/02 01:37:54 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=61
        FILE: Number of bytes written=342877
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=974
        HDFS: Number of bytes written=17
        HDFS: Number of read operations=9
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=2
        Launched reduce tasks=1
        Data-local map tasks=2
        Total time spent by all maps in occupied slots (ms)=14668
        Total time spent by all reduces in occupied slots (ms)=4352
        Total time spent by all map tasks (ms)=14668
        Total time spent by all reduce tasks (ms)=4352
        Total vcore-seconds taken by all map tasks=14668
        Total vcore-seconds taken by all reduce tasks=4352
        Total megabyte-seconds taken by all map tasks=15020032
        Total megabyte-seconds taken by all reduce tasks=4456448
    Map-Reduce Framework
        Map input records=5
        Map output records=5
        Map output bytes=45
        Map output materialized bytes=67
        Input split bytes=180
        Combine input records=0
        Combine output records=0
        Reduce input groups=2
        Reduce shuffle bytes=67
        Reduce input records=5
        Reduce output records=2
        Spilled Records=10
        Shuffled Maps =2
        Failed Shuffles=0
        Merged Map outputs=2
        GC time elapsed (ms)=525
        CPU time spent (ms)=2510
        Physical memory (bytes) snapshot=641490944
        Virtual memory (bytes) snapshot=6241415168
        Total committed heap usage (bytes)=476053504
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=794
    File Output Format Counters
        Bytes Written=17
11) After the job finishes successfully, list the HDFS root again; a new /output directory has appeared:
[root@node02 hadoop-2.7.0]# hadoop fs -ls /
drwxr-xr-x   - root supergroup          0 2021-01-02 01:13 /input
drwxr-xr-x   - root supergroup          0 2021-01-02 01:37 /output
drwx------   - root supergroup          0 2021-01-02 01:37 /tmp
12) List the files in the /output directory:
[root@node02 hadoop-2.7.0]# hadoop fs -ls /output
-rw-r--r--   1 root supergroup          0 2021-01-02 01:37 /output/_SUCCESS
-rw-r--r--   1 root supergroup         17 2021-01-02 01:37 /output/part-00000
13) View the contents of the part-00000 file. This test case extracts the highest temperature (in Fahrenheit) for 1949 and 1950 (a sketch of the map and reduce classes that would produce this result follows the output below):
[root@node02 hadoop-2.7.0]# hadoop fs -cat /output/part-00000
1949    111
1950    22
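The article shows neither the contents of sample.txt nor the source of the map and reduce classes, so the two classes below are only a guessed reconstruction: the reducer's logic (keep the largest value per year) follows directly from the output above, while the mapper assumes, purely for illustration, that each input line holds a year and a temperature separated by whitespace. They use the same old mapred API as the driver sketch in step 9 and would live in two source files inside the jar:

// MaxTemperatureMapper.java: hypothetical record format "<year> <temperature>"
package com.ncst.hadoop;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class MaxTemperatureMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {

    @Override
    public void map(LongWritable offset, Text line,
                    OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
        String[] fields = line.toString().trim().split("\\s+");
        if (fields.length >= 2) {
            // emit (year, temperature) pairs, e.g. ("1949", 78)
            output.collect(new Text(fields[0]),
                    new IntWritable(Integer.parseInt(fields[1])));
        }
    }
}

// MaxTemperatureReducer.java: keeps the largest temperature seen per year,
// which is exactly what the part-00000 output above contains.
package com.ncst.hadoop;

import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class MaxTemperatureReducer extends MapReduceBase
        implements Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    public void reduce(Text year, Iterator<IntWritable> temperatures,
                       OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
        int max = Integer.MIN_VALUE;
        while (temperatures.hasNext()) {
            max = Math.max(max, temperatures.next().get());
        }
        output.collect(year, new IntWritable(max));
    }
}

This also lines up with the counters in step 10: 5 map output records collapse into 2 reduce input groups (1949 and 1950) and 2 reduce output records.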
14) You can also open the web UI in a browser, using the Hadoop server's address and the port you configured, and check that the sample.txt file just uploaded is visible under the input directory:
http://192.168.194.xxx:50070/
15) The test jar and the test file have been uploaded to GitHub; that repository also contains interview experience write-ups and interview questions I have compiled.
If you are interested, you can also take a look at my flash-sale (seckill) system.
That is the full walkthrough of running a Java program (jar) on Hadoop and specifying its arguments dynamically at run time. For more material on running Java programs on Hadoop, please see the other related articles!