Submitting an Application to Hadoop (Detailed Steps)

This walkthrough submits wordcount, one of the example programs that ships with Hadoop.

Start HDFS and YARN

$HADOOP_HOME/sbin/start-dfs.sh

$HADOOP_HOME/sbin/start-yarn.sh

Prepare the data

Go to the $HADOOP_HOME directory

cd $HADOOP_HOME

Create the data file

vim input1.txt

Enter the following content

hello bigdata 2017
hello bigdata 2018
hello bigdata 2019
hello bigdata 2020
hello ynnu 2017
hello ynnu 2018
hello ynnu 2019
hello ynnu 2020

Create the file input2.txt in the same way
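If an interactive editor is inconvenient, the two files can also be written with a here-document. This is only a sketch: it writes into the current directory (the walkthrough uses $HADOOP_HOME), and it assumes input2.txt has the same content as input1.txt, which the text above does not actually state.

```shell
# Write the sample data without opening an editor.
cat > input1.txt <<'EOF'
hello bigdata 2017
hello bigdata 2018
hello bigdata 2019
hello bigdata 2020
hello ynnu 2017
hello ynnu 2018
hello ynnu 2019
hello ynnu 2020
EOF

# Assumption: input2.txt is a copy of input1.txt.
cp input1.txt input2.txt

# Show the line count of each file as a quick check.
wc -l input1.txt input2.txt
```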

Create directories on HDFS

Create the directory test under the HDFS root directory

hdfs dfs -mkdir /test

Create the directory input under /test

hdfs dfs -mkdir /test/input
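The two mkdir steps above can likely be combined, since `hdfs dfs -mkdir` accepts `-p` to create missing parent directories. A minimal sketch, guarded so that it does nothing on a machine where the `hdfs` command is not on the PATH; the variable name `target_dir` is only illustrative.

```shell
target_dir=/test/input

if command -v hdfs >/dev/null 2>&1; then
  # -p creates /test as well if it does not exist yet,
  # and does not fail when the directory is already there.
  hdfs dfs -mkdir -p "$target_dir"
  hdfs dfs -ls /test
else
  echo "hdfs not found on PATH; nothing was created" >&2
fi
```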

Upload the data

Copy input1.txt to /test/input on the HDFS cluster

hdfs dfs -copyFromLocal input1.txt /test/input

Copy input2.txt to /test/input on the HDFS cluster

hdfs dfs -copyFromLocal input2.txt /test/input

List the files on HDFS

hdfs dfs -ls /test/input

View the files in a browser

Open the HDFS web UI in a browser at http://10.103.105.62:50070, click the "Utilities" menu, then click "Browse the file system", select the directory "test", and then select the directory "input".

Run the program

Go to the $HADOOP_HOME directory

cd $HADOOP_HOME

Run the following command:

hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.6.jar wordcount /test/input /test/output
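One practical detail when repeating this step: a MapReduce job refuses to start if its output directory already exists, so a second run against /test/output fails until that directory is removed. The sketch below removes it first and then resubmits the job. The variable names are illustrative, and the block is guarded so it does nothing where the Hadoop commands are unavailable.

```shell
output_dir=/test/output
examples_jar=share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.6.jar

if command -v hdfs >/dev/null 2>&1 && command -v hadoop >/dev/null 2>&1; then
  # -r removes the directory recursively; -f suppresses the error
  # when the directory does not exist (for example on the first run).
  hdfs dfs -rm -r -f "$output_dir"
  # Run from $HADOOP_HOME so the relative jar path resolves.
  hadoop jar "$examples_jar" wordcount /test/input "$output_dir"
else
  echo "hadoop/hdfs not found on PATH; nothing was run" >&2
fi
```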

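Figures 1-5 and 1-6 show the output printed while the program runs. Figure 1-5 records the progress of the job, and Figure 1-6 shows the performance counters collected during execution. File System Counters records the number of bytes read and written; Job Counters records the number of Map tasks and Reduce tasks (2 and 1 here, respectively); Map-Reduce Framework records the number of records and bytes processed by the Map and Reduce tasks.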

View the results

List the output directory

hdfs dfs -ls /test/output

View the final result

hdfs dfs -cat /test/output/part-r-00000

The program created the directory /test/output, and the final result is saved in the file part-r-00000. A single result file means that there was only one Reduce task.
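To sanity-check the contents of part-r-00000, the same counts can be approximated locally without Hadoop. The sketch below is not how the example job is implemented; it simply counts whitespace-separated words with awk and sorts them by byte order, which should resemble the tab-separated "word, count" lines the job writes. The function name `local_wordcount` and the two-line sample are hypothetical.

```shell
# Print "word<TAB>count" for every whitespace-separated word in the given files.
local_wordcount() {
  awk '{ for (i = 1; i <= NF; i++) counts[$i]++ }
       END { for (w in counts) print w "\t" counts[w] }' "$@" | LC_ALL=C sort
}

# Small self-contained demo on a temporary file.
sample="$(mktemp)"
printf 'hello bigdata 2017\nhello ynnu 2017\n' > "$sample"
result="$(local_wordcount "$sample")"
printf '%s\n' "$result"
rm -f "$sample"
```

Running `local_wordcount input1.txt input2.txt` in the directory that holds the data files gives a list that can be compared by eye with the output of `hdfs dfs -cat /test/output/part-r-00000`.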

View the application in YARN

Open http://10.103.105.62:8088 in a browser to view the application's information.
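The same information should also be reachable from the command line with the `yarn application` subcommand. A minimal sketch, assuming the `yarn` command is on the PATH and the job has already completed; the variable name `app_states` is illustrative.

```shell
app_states=FINISHED

if command -v yarn >/dev/null 2>&1; then
  # List applications in the given state; the finished wordcount job
  # should appear here together with its application ID and final status.
  yarn application -list -appStates "$app_states"
else
  echo "yarn not found on PATH; nothing was listed" >&2
fi
```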
