# 春风十里不如你 —— Taozi - Hadoop
https://www.xiongan.host/index.php/tag/Hadoop/

## [Hadoop] Fully Distributed Cluster Installation
https://www.xiongan.host/index.php/archives/213/
2023-05-28T12:39:00+08:00

### Environment preparation

First set up passwordless SSH. Generate a key pair on each of the three VMs, then distribute the keys:

```shell
# Run on all three nodes
ssh-keygen -t rsa

# Run the following on all three nodes to distribute the keys
[root@tz1-123 ~]# ssh-copy-id tz1-123
[root@tz1-123 ~]# ssh-copy-id tz2-123
[root@tz1-123 ~]# ssh-copy-id tz3-123
```

Adjust the firewall settings, then check the firewall state on all three VMs:

```shell
# On all three nodes: disable firewalld at boot and stop it immediately
systemctl disable firewalld --now
```

Install the JDK on all three nodes. Upload the package to the root directory, extract it into /usr/lib/jvm, then set the environment variables:

```shell
# Edit /etc/profile
JAVA_HOME=/usr/lib/jvm/jdk1.8.0_152
JRE_HOME=$JAVA_HOME/jre
CLASS_PATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib
PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
export JAVA_HOME JRE_HOME CLASS_PATH PATH
```

### Deployment

Upload the Hadoop package to /root and extract it into /usr/local/src/.

Edit core-site.xml (add the following between the `<configuration></configuration>` tags):

```xml
<!-- HDFS address -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://tz1-123:9000</value>
</property>
<!-- Directory for files generated at Hadoop runtime -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/usr/local/src/hadoop/data/tmp</value>
</property>
```

Edit hadoop-env.sh (append this line at the end):

```shell
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_152
```

Edit hdfs-site.xml (between the `<configuration></configuration>` tags):

```xml
<!-- Replication factor -->
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>
<!-- HTTP address of the SecondaryNameNode -->
<property>
  <name>dfs.namenode.secondary.http-address</name>
  <value>tz3-123:50090</value>
</property>
```

Edit yarn-env.sh (append this line at the end):

```shell
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_152
```

Edit yarn-site.xml (between the `<configuration></configuration>` tags):

```xml
<!-- How reducers fetch data -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<!-- Address of the YARN ResourceManager -->
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>tz2-123</value>
</property>
```

Edit mapred-env.sh (append this line at the end):

```shell
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_152
```

Edit mapred-site.xml:

```shell
[root@tz1-123 hadoop]# cp mapred-site.xml.template mapred-site.xml
[root@tz1-123 hadoop]# vim mapred-site.xml
```

Add the following between the `<configuration></configuration>` tags:
```xml
<!-- Run MapReduce on YARN -->
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
```

Edit the slaves file:

```shell
[root@tz1-123 hadoop]# vim slaves
```

```
tz1-123
tz2-123
tz3-123
```

Distribute the Hadoop package:

```shell
[root@tz1-123 hadoop]# scp -r /usr/local/src/hadoop tz2-123:/usr/local/src/
[root@tz1-123 hadoop]# scp -r /usr/local/src/hadoop tz3-123:/usr/local/src/
```

Edit /etc/profile (append the following at the end):

```shell
export HADOOP_HOME=/usr/local/src/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
```

After editing, reload it and distribute it again:

```shell
[root@tz1-123 hadoop]# source /etc/profile
[root@tz1-123 hadoop]# scp /etc/profile tz2-123:/etc/
[root@tz1-123 hadoop]# scp /etc/profile tz3-123:/etc/
```

Format the NameNode on tz1-123 (the master):

```shell
hdfs namenode -format
```

### Start the cluster and test

```shell
[hadoop@tz1-123 ~]$ start-dfs.sh
[hadoop@tz2-123 ~]$ start-yarn.sh
[root@tz1-123 hadoop]# jps
8096 NameNode
24690 NodeManager
24882 Jps
8293 DataNode
[root@tz2-123 ~]# jps
30709 NodeManager
24086 DataNode
30567 ResourceManager
781 Jps
[root@tz3-123 ~]# jps
23988 DataNode
604 Jps
30494 NodeManager
```

### HDFS shell operations

HDFS shell commands start with `hadoop fs` or `hdfs dfs`.

```shell
# List the contents of a directory
hadoop fs -ls <HDFS path>
# Create a directory on HDFS
hadoop fs -mkdir <HDFS directory>
# Create a directory (creating missing parent directories)
hadoop fs -mkdir -p <HDFS directory>
# Upload a local file to HDFS
hadoop fs -put <local path> <HDFS path>
# Print the contents of a file on the cluster
hadoop fs -cat <HDFS file>
# Download a file from HDFS to the local machine
hadoop fs -get <HDFS file> <local path>
# Remove an empty directory on HDFS
hadoop fs -rmdir <HDFS directory>
# Remove a non-empty directory on HDFS
hadoop fs -rm -r <HDFS directory>
# Remove a file on HDFS
hadoop fs -rm <HDFS file>
# Move a path on HDFS to another path
hadoop fs -mv <HDFS source> <HDFS destination>
# Copy a path on HDFS to another path
hadoop fs -cp <HDFS source> <HDFS destination>
# Create an empty file on HDFS
hadoop fs -touchz <HDFS path>
```

## [HBase] Deployment and Installation
https://www.xiongan.host/index.php/archives/195/
2023-04-03T21:09:37+08:00

### About HBase

HBase (Hadoop Database) is a highly reliable, high-performance, column-oriented, scalable distributed storage system. With HBase you can build a large-scale structured-storage cluster on inexpensive PC servers.

### Installation

Upload, extract, and rename the HBase package. Configure the environment variables, then distribute and apply them:

```shell
[root@master-tz src]# vim /etc/profile
# Append the following two lines at the end
export HBASE_HOME=/usr/local/src/hbase
export PATH=$PATH:$HBASE_HOME/bin
```

```shell
[root@master-tz src]# source /etc/profile
```
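Side note: every site file edited in these posts (core-site.xml, hdfs-site.xml, yarn-site.xml, mapred-site.xml, and hbase-site.xml below) is built from the same `<property>` stanza. A throwaway shell helper can generate the stanzas and avoid typos; this is a hypothetical convenience sketch, not part of any official Hadoop tooling:

```shell
# hprop: print a Hadoop/HBase-style <property> stanza for a name/value pair.
# Hypothetical helper, for illustration only.
hprop() {
  printf '<property>\n  <name>%s</name>\n  <value>%s</value>\n</property>\n' "$1" "$2"
}

# For example, the fs.defaultFS entry from core-site.xml:
hprop fs.defaultFS hdfs://tz1-123:9000
```

Paste the output between the `<configuration></configuration>` tags as usual.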
Distribute /etc/profile to the slaves and apply it there:

```shell
[root@master-tz src]# scp /etc/profile slave01-tz:/etc/profile
[root@master-tz src]# scp /etc/profile slave02-tz:/etc/profile
[root@slave01-tz ~]# source /etc/profile
[root@slave02-tz ~]# source /etc/profile
```

Edit conf/hbase-env.sh and add the following:

```shell
[root@master-tz conf]# vim hbase-env.sh
```

```shell
export JAVA_HOME=/usr/local/src/java
export HADOOP_HOME=/usr/local/src/hadoop
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export HBASE_MANAGES_ZK=false
export HBASE_LOG_DIR=${HBASE_HOME}/logs
export HBASE_PID_DIR=${HBASE_HOME}/pid
```

Configure hbase-site.xml (add the following between the `<configuration></configuration>` tags):

```xml
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://master-tz:8020/hbase</value>
</property>
<property>
  <name>hbase.master.info.port</name>
  <value>16010</value>
</property>
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2181</value>
</property>
<property>
  <name>hbase.tmp.dir</name>
  <value>/usr/local/src/hbase/tmp</value>
</property>
<property>
  <name>zookeeper.session.timeout</name>
  <value>120000</value>
</property>
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>master-tz,slave01-tz,slave02-tz</value>
</property>
<property>
  <name>hbase.zookeeper.property.dataDir</name>
  <value>/usr/local/src/hbase/tmp/zookeeperhbase</value>
</property>
```

Edit the regionservers file:

```shell
[root@master-tz conf]# vim regionservers
[root@master-tz conf]# cat regionservers
master-tz
slave01-tz
slave02-tz
```

Copy the installation to the slaves and fix its ownership:

```shell
[root@master-tz conf]# scp -r /usr/local/src/hbase slave01-tz:/usr/local/src
[root@master-tz conf]# scp -r /usr/local/src/hbase slave02-tz:/usr/local/src
[root@slave01-tz ~]# chown -R hadoop:hadoop /usr/local/src/hbase/
[root@slave02-tz ~]# chown -R hadoop:hadoop /usr/local/src/hbase/
```

Switch to the hadoop user and make sure the firewall is off on all three nodes. Start ZooKeeper first, then the cluster:

```shell
[hadoop@master-tz ~]$ zkServer.sh start   # run on all three nodes
```
```shell
[hadoop@master-tz ~]$ start-all.sh     # run only on the master
[hadoop@master-tz ~]$ start-hbase.sh   # start HBase
```

Then check the web UI in a browser.

## [Hive] Deployment on Hadoop (continuation pending)
https://www.xiongan.host/index.php/archives/194/
2023-03-27T23:41:04+08:00

### MySQL installation

First upload the MySQL rpm bundle to /opt/software on the master:

```shell
# Unzip the bundle into the current directory
[root@master-tz software]# unzip mysql-5.7.18.zip
# Enter the rpm directory; check for mariadb first and remove it if present
# Install the packages in order
[root@master-tz mysql-5.7.18]# rpm -ivh mysql-community-common-5.7.18-1.el7.x86_64.rpm
[root@master-tz mysql-5.7.18]# rpm -ivh mysql-community-libs-5.7.18-1.el7.x86_64.rpm
[root@master-tz mysql-5.7.18]# rpm -ivh mysql-community-client-5.7.18-1.el7.x86_64.rpm
# The server rpm needs net-tools and perl installed first
[root@master-tz mysql-5.7.18]# yum install -y net-tools perl
[root@master-tz mysql-5.7.18]# rpm -ivh mysql-community-server-5.7.18-1.el7.x86_64.rpm
```

### Configuration

Edit /etc/my.cnf:

```shell
vim /etc/my.cnf
```

```ini
# Append these five lines, then save and exit
default-storage-engine=innodb
innodb_file_per_table
collation-server=utf8_general_ci
init-connect='SET NAMES utf8'
character-set-server=utf8
```

Start the service:

```shell
[root@master-tz mysql-5.7.18]# systemctl start mysqld
[root@master-tz mysql-5.7.18]# systemctl status mysqld
```

Look up the initial MySQL password:

```shell
[root@master-tz mysql-5.7.18]# cat /var/log/mysqld.log | grep password
2023-03-27T08:52:43.074230Z 1 [Note] A temporary password is generated for root@localhost: KbVXiHlul3:>
# Note the initial password; it is needed for the next step
[root@master-tz mysql-5.7.18]# mysql_secure_installation
# Reset the password to Password123$
# Answer n to the remote-connection prompt (n keeps remote connections allowed);
# answer y to everything else
```

Log in to the database client:

```shell
[root@master-tz mysql-5.7.18]# mysql -uroot -pPassword123$
```

Create the hive user and the metadata database:

```sql
mysql> create database hive_db;
mysql> create user hive identified by 'Password123$';
mysql> grant all privileges on *.* to hive@'%' identified by 'Password123$' with grant option;
mysql> grant all privileges on *.* to 'root'@'%' identified by 'Password123$' with grant option;
mysql> flush privileges;
```

### Hive installation

Upload the Hive package to the VM, extract it, rename it to hive, and set its ownership:

```shell
[root@master-tz ~]# tar -zxf apache-hive-2.0.0-bin.tar.gz -C /usr/local/src/
[root@master-tz ~]# cd /usr/local/src/
[root@master-tz src]# mv apache-hive-2.0.0-bin/ hive
```
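An aside on the MySQL setup above: the temporary root password does not have to be copied out of mysqld.log by eye. A small sketch, assuming the MySQL 5.7-style log line shown earlier (where the password is the last whitespace-separated field):

```shell
# extract_tmp_pw: read mysqld.log text on stdin and print the
# temporary root password (the last field of the matching line).
extract_tmp_pw() {
  grep 'A temporary password' | awk '{print $NF}'
}

# Usage on the real machine:
#   extract_tmp_pw < /var/log/mysqld.log
```

This only works because MySQL 5.7 puts the password at the very end of that log line; a password containing spaces would break the `$NF` assumption.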
```shell
[root@master-tz src]# chown -R hadoop:hadoop hive/
```

Edit the environment variables:

```shell
[root@master-tz src]# vim /etc/profile
# set Hive environment
export HIVE_HOME=/usr/local/src/hive   # Hive install directory
export PATH=$HIVE_HOME/bin:$PATH       # add Hive's bin directory to PATH
export HIVE_CONF_DIR=$HIVE_HOME/conf   # Hive configuration directory
[root@master-tz src]# source /etc/profile
```

Configure hive-site.xml. Switch to the hadoop user first:

```shell
[hadoop@master-tz conf]$ cd /usr/local/src/hive/conf
[hadoop@master-tz conf]$ vim hive-site.xml
```

```xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!-- Metastore database address -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://master-tz:3306/hive_db?createDatabaseIfNotExist=true</value>
  </property>
  <!-- MySQL user name -->
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
  </property>
  <!-- Password of the hive user in MySQL -->
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>Password123$</value>
  </property>
  <!-- MySQL driver -->
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>hive.downloaded.resources.dir</name>
    <value>/usr/local/src/hive/tmp</value>
  </property>
  <property>
    <name>hive.exec.local.scratchdir</name>
    <value>/usr/local/src/hive/tmp/${hive.session.id}_resources</value>
  </property>
  <property>
    <name>hive.querylog.location</name>
    <value>/usr/local/src/hive/tmp</value>
  </property>
  <property>
    <name>hive.server2.logging.operation.log.location</name>
    <value>/usr/local/src/hive/tmp/operation_logs</value>
  </property>
  <property>
    <name>hive.server2.webui.host</name>
    <value>master-tz</value>
  </property>
  <property>
    <name>hive.server2.webui.port</name>
    <value>10002</value>
  </property>
</configuration>
```

Then hive-env.sh:

```shell
[hadoop@master-tz conf]$ cp hive-env.sh.template hive-env.sh
[hadoop@master-tz conf]$ vim hive-env.sh
```

Add the following settings:

```shell
# Set JAVA
export JAVA_HOME=/usr/local/src/java
# Set HADOOP_HOME to point to a specific hadoop install directory
export HADOOP_HOME=/usr/local/src/hadoop
# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=/usr/local/src/hive/conf
# Folder containing extra libraries required for hive compilation/execution can be controlled by:
export HIVE_AUX_JARS_PATH=/usr/local/src/hive/lib
```

Upload the MySQL driver jar to the VM, then copy it into the lib folder of the Hive installation:

```shell
[root@master-tz software]# cp mysql-connector-java-5.1.46.jar /usr/local/src/hive/lib/
```

Make sure the Hadoop cluster is healthy, then initialize the Hive metastore:

```shell
[hadoop@master-tz conf]$ schematool -initSchema -dbType mysql
```

Enter the Hive shell:

```shell
[hadoop@master-tz ~]$ hive
hive>
```

If the error shown here occurs (the screenshot is not preserved in this copy), change the ConnectionURL value in hive-site.xml to:

```xml
<value>jdbc:mysql://master-tz:3306/hive_db?createDatabaseIfNotExist=true&amp;useSSL=false</value>
```
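One detail worth spelling out: inside an XML file a literal `&` must be escaped as `&amp;`, so the `useSSL=false` parameter is appended to the URL as `...createDatabaseIfNotExist=true&amp;useSSL=false`. A sed sketch of the edit, demonstrated on a temporary copy (on the real cluster the target would be /usr/local/src/hive/conf/hive-site.xml):

```shell
# Demo file standing in for hive-site.xml
conf=$(mktemp)
echo '<value>jdbc:mysql://master-tz:3306/hive_db?createDatabaseIfNotExist=true</value>' > "$conf"

# In a sed replacement, '&' expands to the matched text and '\&' is a
# literal ampersand, so this appends '&amp;useSSL=false' after the match.
sed -i 's|createDatabaseIfNotExist=true|&\&amp;useSSL=false|' "$conf"
cat "$conf"
```

The same expression, pointed at the real hive-site.xml, performs the fix described above in one step.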