| Component | Version | Download |
|---|---|---|
| Hadoop | 2.7.7 | hadoop 2.7.7 |
| JDK | 1.8 | jdk 8 |
| MySQL | 5.7 | MySQL 5.7 |
| Hive | 2.3.4 | Hive 2.3.4 |
| Spark | 2.1.1 | Spark 2.1.1 |

| IP | Hostname | Password |
|---|---|---|
| 192.168.222.201 | master | password |
| 192.168.222.202 | slave1 | password |
| 192.168.222.203 | slave2 | password |

Reference: https://blog.csdn.net/su_mingyang/article/details/118070573
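- Disable the firewall and keep it from starting at boot (do this on all three virtual machines)
- Configure the hosts file (so that the three machines can reach one another by hostname)
- Configure SSH
- Synchronize the clocks, either with NTP or by setting the time manually with date

Reference: https://blog.csdn.net/su_mingyang/article/details/120872313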
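
As an illustration of the hosts step in the list above, the sketch below appends the three node entries to a hosts file and skips hostnames that are already present. The function names `render_hosts` and `append_hosts` are made up for this example, and the third address assumes that all nodes sit in the 192.168.222.0/24 subnet. It writes to a scratch file; on a real node the target would be /etc/hosts and root privileges would be needed.

```shell
# Print one "IP hostname" line per cluster node (values taken from the host table).
render_hosts() {
  printf '%s %s\n' \
    192.168.222.201 master \
    192.168.222.202 slave1 \
    192.168.222.203 slave2
}

# Append the entries to the hosts file given as $1,
# skipping hostnames that the file already contains.
append_hosts() (
  target="$1"
  render_hosts | while read -r ip name; do
    grep -qw "$name" "$target" 2>/dev/null || echo "$ip $name" >> "$target"
  done
)

# Try it on a scratch file first instead of the real /etc/hosts.
demo_hosts="$(mktemp)"
append_hosts "$demo_hosts"
append_hosts "$demo_hosts"   # running it a second time adds nothing new
cat "$demo_hosts"
```

The same function would have to be run on each of the three machines, since every node needs to resolve the other two.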
3. Install Hadoop 2.7.7 (fully distributed)

Reference: https://blog.csdn.net/su_mingyang/article/details/120872850
4. Set up Spark in fully distributed mode

4.1 Extract the Spark archive

    [root@master ~]# tar -xzvf /chinaskills/spark-2.1.1-bin-hadoop2.7.tgz -C /usr/local/src/

4.2 Rename the directory
    [root@master ~]# mv /usr/local/src/spark-2.1.1-bin-hadoop2.7 /usr/local/src/spark

4.3 Configure the Spark environment variables
    [root@master ~]# vi /root/.bash_profile

Add the following:

    export SPARK_HOME=/usr/local/src/spark
    export PATH=$PATH:$SPARK_HOME/sbin:$SPARK_HOME/bin

4.4 Load the environment variables
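
Once the profile has been sourced in this step, a short check can confirm that SPARK_HOME is set and that both Spark directories are on PATH. This is only a minimal sketch; `check_spark_env` is a hypothetical helper name, not something Spark provides.

```shell
# Succeed only if SPARK_HOME is set and its bin and sbin directories are on PATH.
check_spark_env() (
  if [ -z "${SPARK_HOME:-}" ]; then
    echo "SPARK_HOME is not set" >&2
    exit 1
  fi
  for dir in "$SPARK_HOME/bin" "$SPARK_HOME/sbin"; do
    case ":$PATH:" in
      *":$dir:"*) ;;                                   # directory found on PATH
      *) echo "$dir is missing from PATH" >&2; exit 1 ;;
    esac
  done
  echo "SPARK_HOME=$SPARK_HOME"
)

# Example: simulate the two export lines in a subshell, then run the check.
(
  export SPARK_HOME=/usr/local/src/spark
  export PATH="$PATH:$SPARK_HOME/sbin:$SPARK_HOME/bin"
  check_spark_env
)
```

On the master itself the check would simply be run in a fresh login shell, or right after `source /root/.bash_profile`.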
    source /root/.bash_profile

4.5 Configure Spark
    [root@master ~]# cp /usr/local/src/spark/conf/spark-env.sh.template /usr/local/src/spark/conf/spark-env.sh
    [root@master ~]# vi /usr/local/src/spark/conf/spark-env.sh

Add the following:

    # Java installation directory
    export JAVA_HOME=/usr/local/src/java
    # IP address or hostname of the master node
    export SPARK_MASTER_IP=master
    # Memory available to each worker
    export SPARK_WORKER_MEMORY=1G
    # Number of CPU cores available to each worker
    export SPARK_WORKER_CORES=1
    # Directory containing the Hadoop configuration files
    export HADOOP_CONF_DIR=/usr/local/src/hadoop/etc/hadoop

4.6 Configure slaves
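
The worker list for this step can also be written without opening an editor. A minimal sketch, assuming the same three hostnames; `write_slaves` is a hypothetical helper, and the demo writes to a scratch directory rather than to /usr/local/src/spark/conf.

```shell
# Write one worker hostname per line to <conf_dir>/slaves.
# $1 is the configuration directory, the remaining arguments are hostnames.
write_slaves() (
  conf_dir="$1"
  shift
  printf '%s\n' "$@" > "$conf_dir/slaves"
)

# Demonstrated on a scratch directory.
demo_conf="$(mktemp -d)"
write_slaves "$demo_conf" master slave1 slave2
cat "$demo_conf/slaves"
```

Listing master here means the master node also runs a worker, which matches the start-up output shown in 4.8.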
    [root@master ~]# cp /usr/local/src/spark/conf/slaves.template /usr/local/src/spark/conf/slaves
    [root@master ~]# vi /usr/local/src/spark/conf/slaves

Add the following:

    master
    slave1
    slave2

4.7 Distribute the files to slave1 and slave2
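
Because the same two copies are made for every worker, this step can be written as a loop. The sketch below assumes passwordless SSH from master to the workers is already in place; `distribute_spark` and the `DRY_RUN` switch are inventions of this example, added so the commands can be previewed before anything is copied.

```shell
# Copy the Spark directory and root's profile to each worker named in the arguments.
# With DRY_RUN=1 the scp commands are printed instead of executed.
distribute_spark() (
  run=""
  if [ "${DRY_RUN:-0}" = "1" ]; then
    run="echo"
  fi
  for host in "$@"; do
    $run scp -r /usr/local/src/spark "$host:/usr/local/src/"
    $run scp /root/.bash_profile "$host:/root/"
  done
)

# Preview the commands for the two workers.
DRY_RUN=1
distribute_spark slave1 slave2
```

Setting `DRY_RUN=0` (or leaving it unset) and calling the function again would perform the actual copies.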
    scp -r /usr/local/src/spark slave1:/usr/local/src/
    scp -r /usr/local/src/spark slave2:/usr/local/src/
    scp /root/.bash_profile slave1:/root/
    scp /root/.bash_profile slave2:/root/

4.8 Start the Spark cluster
    /usr/local/src/spark/sbin/start-all.sh

Output:

    starting org.apache.spark.deploy.master.Master, logging to /usr/local/src/spark/logs/spark-root-org.apache.spark.deploy.master.Master-1-master.out
    slave2: starting org.apache.spark.deploy.worker.Worker, logging to /usr/local/src/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-slave2.out
    slave1: starting org.apache.spark.deploy.worker.Worker, logging to /usr/local/src/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-slave1.out
    master: starting org.apache.spark.deploy.worker.Worker, logging to /usr/local/src/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-master.out

4.9 Web access
Open the Spark master's web UI in a browser to confirm that the workers have registered. By default the standalone master serves it on port 8080, for example http://master:8080.
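
The web UI can also be probed from a shell. This sketch assumes Spark's default standalone master UI port (8080) and that the hostname master resolves from wherever it is run; `master_ui_url` and `check_master_ui` are hypothetical helper names, and `curl` is assumed to be installed.

```shell
# Build the web UI URL from an optional host ($1) and port ($2).
# Defaults: host "master", port 8080 (Spark's default master UI port).
master_ui_url() {
  echo "http://${1:-master}:${2:-8080}/"
}

# Return success if the UI answers within five seconds.
check_master_ui() {
  curl -fsS --max-time 5 -o /dev/null "$(master_ui_url "$@")"
}

master_ui_url
# On a machine that can reach the cluster:
# check_master_ui && echo "master UI is reachable"
```

If `SPARK_MASTER_WEBUI_PORT` has been changed in spark-env.sh, the port argument would need to match it.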



