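Test environment
Hadoop pseudo-distributed deployment
    Passwordless login
    Passwordless login not working
        Check the log
        Solution
    Deploy Hadoop
        hadoop-env.sh
        core-site.xml
        hdfs-site.xml
        mapred-site.xml
        yarn-site.xml
        slaves
        Configure the Hadoop environment variables
        Format the NameNode
        Start
Summary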
Test environment
CentOS 7 virtual machine
DolphinScheduler 2.0.5
MySQL database
Hadoop in pseudo-distributed mode (HDFS is required)
    [dolphinscheduler@host1 ~]$ java -version
    java version "1.8.0_151"
    Java(TM) SE Runtime Environment (build 1.8.0_151-b12)
    Java HotSpot(TM) 64-Bit Server VM (build 25.151-b12, mixed mode)
    [dolphinscheduler@host1 ~]$ python --version
    Python 2.7.5
    [dolphinscheduler@host1 ~]$
Hadoop pseudo-distributed deployment
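The Hadoop JARs bundled with DolphinScheduler are version 2.7.3, so install the matching Hadoop release.
Pick that version on the download page, or fetch it with wget: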
    wget https://archive.apache.org/dist/hadoop/core/hadoop-2.7.3/hadoop-2.7.3.tar.gz
Passwordless login
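Run ssh-keygen -t rsa to generate the .ssh directory and the public/private key pair.
Append the public key to the authorized_keys file.
Alternatively, run ssh-copy-id localhost, which appends the key for you.
Verify with ssh localhost. The first connection needs input (confirming the host key, and the password if the key is not yet authorized); later logins need none.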
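The steps above can also be run without prompts. The following is a minimal sketch, not the author's procedure: `setup_passwordless_ssh` is a hypothetical helper name, and the empty passphrase (`-N ""`) is an assumption that matches the empty passphrase entered in the interactive session.

```shell
# Sketch: set up key-based SSH login for the current user without prompts.
# setup_passwordless_ssh is a hypothetical helper name, not part of OpenSSH.
setup_passwordless_ssh() {
    local ssh_dir="${1:-$HOME/.ssh}"

    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"

    # Generate an RSA key pair with an empty passphrase only if none exists yet.
    if [ ! -f "$ssh_dir/id_rsa" ]; then
        ssh-keygen -q -t rsa -N "" -f "$ssh_dir/id_rsa"
    fi

    # Append the public key once; the grep guard avoids duplicates on re-runs.
    touch "$ssh_dir/authorized_keys"
    if ! grep -qxFf "$ssh_dir/id_rsa.pub" "$ssh_dir/authorized_keys"; then
        cat "$ssh_dir/id_rsa.pub" >> "$ssh_dir/authorized_keys"
    fi
    chmod 600 "$ssh_dir/authorized_keys"
}
```

Calling `setup_passwordless_ssh` with no argument works on `~/.ssh`; `ssh localhost` should then log in without a password once the host key has been accepted.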
    [dolphinscheduler@host1 ~]$ ssh-keygen -t rsa
    Generating public/private rsa key pair.
    Enter file in which to save the key (/home/dolphinscheduler/.ssh/id_rsa):
    Created directory '/home/dolphinscheduler/.ssh'.
    Enter passphrase (empty for no passphrase):
    Enter same passphrase again:
    Your identification has been saved in /home/dolphinscheduler/.ssh/id_rsa.
    Your public key has been saved in /home/dolphinscheduler/.ssh/id_rsa.pub.
    The key fingerprint is:
    SHA256:lwcGY8SCzQG+OSwJbfokOd4+ljnpM4lT/q/2VrG5hbM dolphinscheduler@host1
    The key's randomart image is:
    +---[RSA 2048]----+
    | .=.+= |
    | . .. +..o |
    |. o . . o |
    | = o o o o |
    |= + = S B . |
    |.=.o . B o |
    | .=.= . = |
    | o.% . . E |
    | +oBo=o |
    +----[SHA256]-----+
    [dolphinscheduler@host1 ~]$ cd .ssh/
    [dolphinscheduler@host1 .ssh]$ cat id_rsa.pub >> authorized_keys
    [dolphinscheduler@host1 .ssh]$ ll
    总用量 12
    -rw-r--r--. 1 dolphinscheduler dolphin 404 3月 8 17:07 authorized_keys
    -rw-------. 1 dolphinscheduler dolphin 1679 3月 8 17:05 id_rsa
    -rw-r--r--. 1 dolphinscheduler dolphin 404 3月 8 17:05 id_rsa.pub
    [dolphinscheduler@host1 .ssh]$ chmod 600 authorized_keys
    [dolphinscheduler@host1 .ssh]$ ssh localhost
    The authenticity of host 'localhost (127.0.0.1)' can't be established.
    ECDSA key fingerprint is SHA256:3a9OTk4w9slOAnXL2gLp4FxNga/wB4nR/9Ojh0n1+lY.
    ECDSA key fingerprint is MD5:06:79:53:e3:fe:35:bf:a9:11:6f:1d:b7:f8:87:88:a8.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
    Last login: Tue Mar 8 16:53:16 2022 from localhost
    [dolphinscheduler@host1 ~]$ 登出
    Connection to localhost closed.
    [dolphinscheduler@host1 .ssh]$ ssh localhost
    Last login: Tue Mar 8 17:07:49 2022 from localhost
    [dolphinscheduler@host1 ~]$
Passwordless login not working
Check the log
sudo cat /var/log/secure
The log contains "Authentication refused" entries:
    Mar 8 16:42:23 host1 sshd[8030]: pam_unix(sshd:session): session closed for user dolphinscheduler
    Mar 8 16:42:28 host1 sshd[8050]: reprocess config line 142: Deprecated option RSAAuthentication
    Mar 8 16:42:28 host1 sshd[8050]: Authentication refused: bad ownership or modes for directory /home/dolphinscheduler
    Mar 8 16:42:30 host1 sshd[8050]: Accepted password for dolphinscheduler from 127.0.0.1 port 33578 ssh2
    Mar 8 16:42:31 host1 sshd[8050]: pam_unix(sshd:session): session opened for user dolphinscheduler by (uid=0)
    Mar 8 16:42:32 host1 sshd[8054]: Received disconnect from 127.0.0.1 port 33578:11: disconnected by user
    Mar 8 16:42:32 host1 sshd[8054]: Disconnected from 127.0.0.1 port 33578
    Mar 8 16:42:32 host1 sshd[8050]: pam_unix(sshd:session): session closed for user dolphinscheduler
    Mar 8 16:42:33 host1 sshd[8075]: reprocess config line 142: Deprecated option RSAAuthentication
    Mar 8 16:42:33 host1 sshd[8075]: Authentication refused: bad ownership or modes for directory /home/dolphinscheduler
    Mar 8 16:42:34 host1 sshd[8075]: Connection closed by 127.0.0.1 port 33580 [preauth]
    Mar 8 16:42:49 host1 sudo: dolphinscheduler : TTY=pts/4 ; PWD=/home/dolphinscheduler ; USER=root ; COMMAND=/bin/cat /var/log/secure
Solution
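Set the StrictModes parameter in sshd_config to no and restart the sshd service (this turns off sshd's ownership and mode checks on the home directory and key files, so it is only appropriate on a test machine).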
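The log message points at the ownership or modes of the home directory, so an alternative that keeps StrictModes enabled is to tighten those permissions. This is a sketch of that alternative, not what the author did; `fix_ssh_modes` is a hypothetical helper name, and it assumes the directory is already owned by the login user.

```shell
# Sketch: tighten the permissions that sshd's StrictModes check inspects.
# fix_ssh_modes is a hypothetical helper name; pass the user's home directory.
fix_ssh_modes() {
    local home_dir="${1:-$HOME}"

    # The home directory must not be writable by group or others.
    chmod go-w "$home_dir"

    # .ssh should be accessible to the owner only.
    if [ -d "$home_dir/.ssh" ]; then
        chmod 700 "$home_dir/.ssh"
    fi

    # authorized_keys should be readable and writable by the owner only.
    if [ -f "$home_dir/.ssh/authorized_keys" ]; then
        chmod 600 "$home_dir/.ssh/authorized_keys"
    fi
}
```

After running `fix_ssh_modes /home/dolphinscheduler`, retry `ssh localhost` and check /var/log/secure again; if the refusal persists, the ownership of the directory is the next thing to inspect.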
    [dolphinscheduler@host1 ~]$ sudo vi /etc/ssh/sshd_config
    [dolphinscheduler@host1 ~]$ sudo systemctl restart sshd.service
    [dolphinscheduler@host1 ~]$ sudo grep StrictModes /etc/ssh/sshd_config
    StrictModes no
    [dolphinscheduler@host1 ~]$
Deploy Hadoop
Unpack and edit the configuration files
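The unpack command itself is not shown in the post. A minimal sketch follows, assuming the tarball downloaded earlier and the /home/dolphinscheduler/app target directory that appears in the later transcripts; `unpack_hadoop` is a hypothetical helper name.

```shell
# Sketch: unpack a Hadoop release tarball into a target directory.
# unpack_hadoop is a hypothetical helper name.
unpack_hadoop() {
    local tarball="$1"
    local dest="${2:-$HOME/app}"

    mkdir -p "$dest"
    # -x extract, -z gunzip, -f archive file, -C change into dest first
    tar -xzf "$tarball" -C "$dest"
}
```

For example, `unpack_hadoop hadoop-2.7.3.tar.gz "$HOME/app"` should leave the configuration files under `$HOME/app/hadoop-2.7.3/etc/hadoop`.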
hadoop-env.sh
    [dolphinscheduler@host1 hadoop]$ vi hadoop-env.sh
    [dolphinscheduler@host1 hadoop]$ grep JAVA_HOME hadoop-env.sh
    # The only required environment variable is JAVA_HOME. All others are
    # set JAVA_HOME in this file, so that it is correctly defined on
    export JAVA_HOME=/usr/local/java/jdk1.8.0_151
    [dolphinscheduler@host1 hadoop]$
core-site.xml
    [dolphinscheduler@host1 app]$ cd hadoop-2.7.3
    [dolphinscheduler@host1 hadoop-2.7.3]$ pwd
    /home/dolphinscheduler/app/hadoop-2.7.3
    [dolphinscheduler@host1 hadoop-2.7.3]$ mkdir -p data/tmp
    [dolphinscheduler@host1 hadoop-2.7.3]$ cd data/tmp/
    [dolphinscheduler@host1 tmp]$ pwd
    /home/dolphinscheduler/app/hadoop-2.7.3/data/tmp
    [dolphinscheduler@host1 tmp]$ cd ../../etc/hadoop/
    [dolphinscheduler@host1 hadoop]$ pwd
    /home/dolphinscheduler/app/hadoop-2.7.3/etc/hadoop
    [dolphinscheduler@host1 hadoop]$
    [dolphinscheduler@host1 hadoop]$ vi core-site.xml
    [dolphinscheduler@host1 hadoop]$ cat core-site.xml
    <configuration>
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://host1:8020</value>
        </property>
        <property>
            <name>hadoop.tmp.dir</name>
            <value>/home/dolphinscheduler/app/hadoop-2.7.3/data/tmp</value>
        </property>
    </configuration>
    [dolphinscheduler@host1 hadoop]$
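To double-check what a site file actually sets (for example, that fs.defaultFS points at host1), a value can be read back from the XML. The helper below is a rough sketch with a hypothetical name, `get_hadoop_property`; it assumes the usual name/value property layout of Hadoop site files and does simple text matching rather than real XML parsing.

```shell
# Sketch: print the <value> of a named property from a Hadoop *-site.xml file.
# get_hadoop_property is a hypothetical helper; it uses text matching, not an
# XML parser, and treats dots in the property name as regex wildcards.
get_hadoop_property() {
    local file="$1"
    local name="$2"

    # Join the file into one line, then capture the <value> that follows
    # the matching <name> element.
    tr -d '\n' < "$file" |
        sed -n "s|.*<name>[[:space:]]*$name[[:space:]]*</name>[[:space:]]*<value>\([^<]*\)</value>.*|\1|p"
}
```

On a machine where Hadoop is already on the PATH, `hdfs getconf -confKey fs.defaultFS` reports the value Hadoop itself resolves and is the more reliable check.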
The host name host1 resolves through this /etc/hosts entry:
    [dolphinscheduler@host1 hadoop]$ cat /etc/hosts
    127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
    #::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
    192.168.56.10 host1
    [dolphinscheduler@host1 hadoop]$
hdfs-site.xml
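    [dolphinscheduler@host1 hadoop]$ vi hdfs-site.xml
    [dolphinscheduler@host1 hadoop]$
    [dolphinscheduler@host1 hadoop]$ cat hdfs-site.xml
    <configuration>
        <property>
            <name>dfs.replication</name>
            <value>3</value>
        </property>
        <property>
            <name>dfs.permissions.enabled</name>
            <value>false</value>
        </property>
        <property>
            <name>dfs.namenode.http.address</name>
            <value>host1:50070</value>
        </property>
    </configuration>
    [dolphinscheduler@host1 hadoop]$
Note: with a single DataNode, a replication factor of 3 leaves every block under-replicated; 1 is the usual value for pseudo-distributed mode. Also, the key documented for Hadoop 2.7 is dfs.namenode.http-address (with a hyphen); the spelling above is probably ignored, which happens to leave the default port 50070 in effect.
mapred-site.xml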
    [dolphinscheduler@host1 hadoop]$ cp mapred-site.xml.template mapred-site.xml
    [dolphinscheduler@host1 hadoop]$ vi mapred-site.xml
    [dolphinscheduler@host1 hadoop]$ cat mapred-site.xml
    <configuration>
        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
        </property>
    </configuration>
    [dolphinscheduler@host1 hadoop]$
yarn-site.xml
    [dolphinscheduler@host1 hadoop]$ vi yarn-site.xml
    [dolphinscheduler@host1 hadoop]$ cat yarn-site.xml
    <configuration>
        <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
        </property>
    </configuration>
    [dolphinscheduler@host1 hadoop]$
slaves
This is a pseudo-distributed setup, so the only worker is the same machine, host1.
    [dolphinscheduler@host1 hadoop]$ vi slaves
    [dolphinscheduler@host1 hadoop]$ cat slaves
    host1
    [dolphinscheduler@host1 hadoop]$ ssh host1
    The authenticity of host 'host1 (192.168.56.10)' can't be established.
    ECDSA key fingerprint is SHA256:3a9OTk4w9slOAnXL2gLp4FxNga/wB4nR/9Ojh0n1+lY.
    ECDSA key fingerprint is MD5:06:79:53:e3:fe:35:bf:a9:11:6f:1d:b7:f8:87:88:a8.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added 'host1,192.168.56.10' (ECDSA) to the list of known hosts.
    Last login: Tue Mar 8 17:07:52 2022 from localhost
    [dolphinscheduler@host1 ~]$ 登出
    Connection to host1 closed.
    [dolphinscheduler@host1 hadoop]$ ssh host1
    Last login: Tue Mar 8 17:30:10 2022 from host1
    [dolphinscheduler@host1 ~]$
Configure the Hadoop environment variables
Hadoop is deployed as the dolphinscheduler user, so the variables go into that user's .bash_profile.
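Instead of editing the profile by hand, the two export lines can be appended by a script that is safe to run more than once. This is a small sketch; `add_hadoop_env` is a hypothetical helper name, and the profile path and Hadoop directory are passed in as arguments.

```shell
# Sketch: append HADOOP_HOME and PATH exports to a profile file, once.
# add_hadoop_env is a hypothetical helper name.
add_hadoop_env() {
    local profile="$1"
    local hadoop_home="$2"

    # Skip the append if HADOOP_HOME is already exported in the profile.
    if ! grep -q '^export HADOOP_HOME=' "$profile" 2>/dev/null; then
        {
            echo "export HADOOP_HOME=$hadoop_home"
            # Single quotes keep $HADOOP_HOME and $PATH unexpanded in the file.
            echo 'export PATH=$HADOOP_HOME/bin:$PATH'
        } >> "$profile"
    fi
}
```

For example, `add_hadoop_env ~/.bash_profile /home/dolphinscheduler/app/hadoop-2.7.3` followed by `source ~/.bash_profile` should make `hadoop version` available in the current shell.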
    [dolphinscheduler@host1 ~]$ vi .bash_profile
    [dolphinscheduler@host1 ~]$ grep HADOOP .bash_profile
    export HADOOP_HOME=/home/dolphinscheduler/app/hadoop-2.7.3
    export PATH=$HADOOP_HOME/bin:$PATH
    [dolphinscheduler@host1 ~]$ source .bash_profile
    [dolphinscheduler@host1 ~]$ hadoop version
    Hadoop 2.7.3
    Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r baa91f7c6bc9cb92be5982de4719c1c8af91ccff
    Compiled by root on 2016-08-18T01:41Z
    Compiled with protoc 2.5.0
    From source with checksum 2e4ce5f957ea4db193bce3734ff29ff4
    This command was run using /home/dolphinscheduler/app/hadoop-2.7.3/share/hadoop/common/hadoop-common-2.7.3.jar
    [dolphinscheduler@host1 ~]$
Format the NameNode
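Formatting is meant to be done once. Formatting an already formatted NameNode assigns a new cluster ID, after which DataNode directories left over from the previous format typically fail to register. A small guard like the sketch below can prevent an accidental second format; `namenode_is_formatted` is a hypothetical helper name, and the check relies on the VERSION file that a format writes under the name directory's current/ subdirectory.

```shell
# Sketch: report whether a NameNode storage directory has been formatted.
# namenode_is_formatted is a hypothetical helper; it returns 0 (true) when
# <name_dir>/current/VERSION exists.
namenode_is_formatted() {
    local name_dir="$1"
    [ -f "$name_dir/current/VERSION" ]
}
```

With the hadoop.tmp.dir used in this post, a guarded format would look like `namenode_is_formatted "$HADOOP_HOME/data/tmp/dfs/name" || hdfs namenode -format`, where `hdfs namenode -format` is the non-deprecated form of the command used in the transcript.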
    [dolphinscheduler@host1 ~]$ hadoop namenode -format
    DEPRECATED: Use of this script to execute hdfs command is deprecated.
    Instead use the hdfs command for it.
    22/03/08 17:36:45 INFO namenode.NameNode: STARTUP_MSG:
    22/03/08 17:36:45 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
    22/03/08 17:36:45 INFO namenode.NameNode: createNameNode [-format]
    Formatting using clusterid: CID-fdc4e4af-4153-4b72-8cb5-ffdee8693e48
    22/03/08 17:36:45 INFO namenode.FSNamesystem: No KeyProvider found.
    22/03/08 17:36:45 INFO namenode.FSNamesystem: fsLock is fair:true
    22/03/08 17:36:45 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
    22/03/08 17:36:45 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
    22/03/08 17:36:45 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
    22/03/08 17:36:45 INFO blockmanagement.BlockManager: The block deletion will start around 2022 三月 08 17:36:45
    22/03/08 17:36:45 INFO util.GSet: Computing capacity for map BlocksMap
    22/03/08 17:36:45 INFO util.GSet: VM type = 64-bit
    22/03/08 17:36:45 INFO util.GSet: 2.0% max memory 889 MB = 17.8 MB
    22/03/08 17:36:45 INFO util.GSet: capacity = 2^21 = 2097152 entries
    22/03/08 17:36:45 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
    22/03/08 17:36:45 INFO blockmanagement.BlockManager: defaultReplication = 3
    22/03/08 17:36:45 INFO blockmanagement.BlockManager: maxReplication = 512
    22/03/08 17:36:45 INFO blockmanagement.BlockManager: minReplication = 1
    22/03/08 17:36:45 INFO blockmanagement.BlockManager: maxReplicationStreams = 2
    22/03/08 17:36:45 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
    22/03/08 17:36:45 INFO blockmanagement.BlockManager: encryptDataTransfer = false
    22/03/08 17:36:45 INFO blockmanagement.BlockManager: maxNumBlocksToLog = 1000
    22/03/08 17:36:45 INFO namenode.FSNamesystem: fsOwner = dolphinscheduler (auth:SIMPLE)
    22/03/08 17:36:45 INFO namenode.FSNamesystem: supergroup = supergroup
    22/03/08 17:36:45 INFO namenode.FSNamesystem: isPermissionEnabled = false
    22/03/08 17:36:45 INFO namenode.FSNamesystem: HA Enabled: false
    22/03/08 17:36:45 INFO namenode.FSNamesystem: Append Enabled: true
    22/03/08 17:36:45 INFO util.GSet: Computing capacity for map INodeMap
    22/03/08 17:36:45 INFO util.GSet: VM type = 64-bit
    22/03/08 17:36:45 INFO util.GSet: 1.0% max memory 889 MB = 8.9 MB
    22/03/08 17:36:45 INFO util.GSet: capacity = 2^20 = 1048576 entries
    22/03/08 17:36:45 INFO namenode.FSDirectory: ACLs enabled? false
    22/03/08 17:36:45 INFO namenode.FSDirectory: XAttrs enabled? true
    22/03/08 17:36:45 INFO namenode.FSDirectory: Maximum size of an xattr: 16384
    22/03/08 17:36:45 INFO namenode.NameNode: Caching file names occuring more than 10 times
    22/03/08 17:36:45 INFO util.GSet: Computing capacity for map cachedBlocks
    22/03/08 17:36:45 INFO util.GSet: VM type = 64-bit
    22/03/08 17:36:45 INFO util.GSet: 0.25% max memory 889 MB = 2.2 MB
    22/03/08 17:36:45 INFO util.GSet: capacity = 2^18 = 262144 entries
    22/03/08 17:36:45 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
    22/03/08 17:36:45 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
    22/03/08 17:36:45 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension = 30000
    22/03/08 17:36:45 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10
    22/03/08 17:36:45 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10
    22/03/08 17:36:45 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25
    22/03/08 17:36:45 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
    22/03/08 17:36:45 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
    22/03/08 17:36:45 INFO util.GSet: Computing capacity for map NameNodeRetryCache
    22/03/08 17:36:45 INFO util.GSet: VM type = 64-bit
    22/03/08 17:36:45 INFO util.GSet: 0.029999999329447746% max memory 889 MB = 273.1 KB
    22/03/08 17:36:45 INFO util.GSet: capacity = 2^15 = 32768 entries
    22/03/08 17:36:46 INFO namenode.FSImage: Allocated new BlockPoolId: BP-866347120-192.168.56.10-1646732206003
    22/03/08 17:36:46 INFO common.Storage: Storage directory /home/dolphinscheduler/app/hadoop-2.7.3/data/tmp/dfs/name has been successfully formatted.
    22/03/08 17:36:46 INFO namenode.FSImageFormatProtobuf: Saving image file /home/dolphinscheduler/app/hadoop-2.7.3/data/tmp/dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
    22/03/08 17:36:46 INFO namenode.FSImageFormatProtobuf: Image file /home/dolphinscheduler/app/hadoop-2.7.3/data/tmp/dfs/name/current/fsimage.ckpt_0000000000000000000 of size 362 bytes saved in 0 seconds.
    22/03/08 17:36:46 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
    22/03/08 17:36:46 INFO util.ExitUtil: Exiting with status 0
    22/03/08 17:36:46 INFO namenode.NameNode: SHUTDOWN_MSG:
    [dolphinscheduler@host1 ~]$
Start
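The start-up transcript in this section ends with a jps listing, and a pseudo-distributed node is healthy only when all five daemons appear in it. The sketch below automates that check; `missing_daemons` is a hypothetical helper name, and the daemon list is taken from the jps output in this post.

```shell
# Sketch: list the Hadoop daemons that are absent from jps output.
# missing_daemons is a hypothetical helper; pass the text printed by jps.
missing_daemons() {
    local jps_output="$1"
    local daemon
    local missing=""

    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # Compare the whole second field so that NameNode does not match
        # SecondaryNameNode.
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            missing="$missing $daemon"
        fi
    done

    # Print the missing names separated by spaces (empty line if none).
    printf '%s\n' "${missing# }"
}
```

Running `missing_daemons "$(jps)"` after start-up prints an empty line when everything is up. The deprecation message in the transcript also suggests starting the services with start-dfs.sh and start-yarn.sh rather than start-all.sh.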
    [dolphinscheduler@host1 sbin]$ pwd
    /home/dolphinscheduler/app/hadoop-2.7.3/sbin
    [dolphinscheduler@host1 sbin]$ sh start-all.sh
    This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
    Starting namenodes on [host1]
    host1: starting namenode, logging to /home/dolphinscheduler/app/hadoop-2.7.3/logs/hadoop-dolphinscheduler-namenode-host1.out
    host1: starting datanode, logging to /home/dolphinscheduler/app/hadoop-2.7.3/logs/hadoop-dolphinscheduler-datanode-host1.out
    Starting secondary namenodes [0.0.0.0]
    0.0.0.0: starting secondarynamenode, logging to /home/dolphinscheduler/app/hadoop-2.7.3/logs/hadoop-dolphinscheduler-secondarynamenode-host1.out
    starting yarn daemons
    starting resourcemanager, logging to /home/dolphinscheduler/app/hadoop-2.7.3/logs/yarn-dolphinscheduler-resourcemanager-host1.out
    host1: starting nodemanager, logging to /home/dolphinscheduler/app/hadoop-2.7.3/logs/yarn-dolphinscheduler-nodemanager-host1.out
    [dolphinscheduler@host1 sbin]$ jps
    11267 DataNode
    11443 SecondaryNameNode
    11735 NodeManager
    11624 ResourceManager
    12042 Jps
    11135 NameNode
    [dolphinscheduler@host1 sbin]$
Summary
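My company currently uses DolphinScheduler only for simple stored-procedure and DataX tasks, and none of the HDFS-dependent features are in use yet. I deployed Hadoop locally to test them ahead of time, since we will probably need them later. Because of CSDN's length limits, the HDFS-related features are covered in "DolphinScheduler HDFS feature testing (part 2)".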



