Storm Multi-Node Deployment

Published 2023-06-29 19:02:13 · Author: Modest-Hamilton


Environment Preparation

Three virtual machines with CentOS 7 installed:

hadoop001 192.168.188.145

hadoop002 192.168.188.146

hadoop003 192.168.188.147

Configure hosts

Append the following to the end of /etc/hosts on every machine:

192.168.188.145 hadoop001
192.168.188.146 hadoop002
192.168.188.147 hadoop003
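
To avoid editing each file by hand, the entries can be pushed to all three nodes in one loop; a minimal sketch, assuming root SSH access by IP (password prompts are expected at this stage, since key-based login is only set up below):

for ip in 192.168.188.145 192.168.188.146 192.168.188.147; do
  ssh root@"$ip" 'cat >> /etc/hosts' <<'EOF'
192.168.188.145 hadoop001
192.168.188.146 hadoop002
192.168.188.147 hadoop003
EOF
done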

First, make sure the three machines can SSH into each other without a password.

Run ssh-keygen -t rsa on each of the three machines, pressing Enter through all the prompts; each machine ends up with its own key pair.

Copy the public keys from all three machines, concatenate them, and write the combined list into authorized_keys on every machine.

View the public key:
[root@hadoop001 logs]# cat /root/.ssh/id_rsa.pub
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCx4+SdBTgk/0biPOjBvPik3hECqKrlou0h+9tBK/wb/3QRLtYUYS7v9Womu7RcCkYOW26pQZ4pOUyJYJ2tgBZJMWQAMBic7CREtRCHPpvS2uhHCWal6NZ7tbhEBfNEgIIvmBkDb8L1qNfCg96z7EuyeLkJdf2iShA+Xvs7YRfdyA2MzSjudQIUGguFiXfiVh4Q+w2K4CIa/4EZDNT5EJbsfHbeuuY91my8l+S4wa3NzUCkW3ak24P5CsRFX/tzmwS/pAzmRdlhA/S+hz4sUCvvvi0mTsi4ISh3h/WLt0+9q48+o82wwdtIQ/iIysZQAbe3n7+U/GUFrviC7zKeUEzH root@hadoop001
Concatenate all of them and write the result to every machine:
[root@hadoop001 logs]# cat /root/.ssh/authorized_keys 
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCx4+SdBTgk/0biPOjBvPik3hECqKrlou0h+9tBK/wb/3QRLtYUYS7v9Womu7RcCkYOW26pQZ4pOUyJYJ2tgBZJMWQAMBic7CREtRCHPpvS2uhHCWal6NZ7tbhEBfNEgIIvmBkDb8L1qNfCg96z7EuyeLkJdf2iShA+Xvs7YRfdyA2MzSjudQIUGguFiXfiVh4Q+w2K4CIa/4EZDNT5EJbsfHbeuuY91my8l+S4wa3NzUCkW3ak24P5CsRFX/tzmwS/pAzmRdlhA/S+hz4sUCvvvi0mTsi4ISh3h/WLt0+9q48+o82wwdtIQ/iIysZQAbe3n7+U/GUFrviC7zKeUEzH root@hadoop001
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDfwC7fnRZ9w/3xErsLNEBry0E4LR4eqyPgl+FQdi6KdwAGEV/WsgU2HZAcURVzf+LiILut8ZmVT0XPHGZVLINb5XXr/QY3MuGFz0keiKH0LKg5Zch7nV2DX+r76ybCuz9Os3PK1I9AGMfm5F7pD+LAnQJ47PXtis7dnlz0+XBvWf22eDHn4dwDIeQfFVJXe3qp611r0CYG2qR15BkZ9E0S/7+ao1XKTlhcwl4SJPJoXz528Rmdm/moEAzlnEhhifdB1VRm6twIf9ljLpWmyGrPZ9sRBfN32IybQyh0Wx+HJbOsz437yaxPSaFV+8JYgNi0CxWqBaERRJTw/AsYWefj root@hadoop002
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC6uf2CMv9sXEKRoKnjSwyRWWkDNPfiF7oO9YPaIvZxkIgjE3p8oOfLTf5LCqlBa1Kavaj5uZnSG2IMobxShxG+e3Y8jAKffYc2oiXgQVMZHBYrugwNUIQIs2rdNGBwgJTwLRKOoYD4HPcfhXs4JPzwWMPGuuxJSB9u5hdcvBu1UbzeC86Lq15qxg2Q4Z+jy70pHSr7Lj62O8qRYuXpoq1+HmcbpJ51kXs4BZU0FigrYzrFPpvPmxR0TkpEaqTeV0e7JOkjsLv02TO1zGjGEjYRsNg6LLNzL56UWX4q2Fj/PA0d6o7P9Sv3sZ/HURvfGixilZqCpb8rqtLGkNVoeFDN root@hadoop003
Test:
[root@hadoop001 logs]# ssh hadoop002
Last login: Sat Apr 22 09:01:43 2023 from hadoop001
[root@hadoop002 ~]#
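
ssh-copy-id can replace the manual copy-and-paste of public keys; a minimal sketch, run on each of the three machines after ssh-keygen (it appends the local public key to the remote authorized_keys, prompting for the root password once per target):

for h in hadoop001 hadoop002 hadoop003; do
  ssh-copy-id root@"$h"
done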

Deploy the ZooKeeper Cluster

Download ZooKeeper

Be sure to download the tarball whose name ends in -bin; the source-only package fails at startup because the compiled Java classes are missing.

# Unpack (create the target directory first)
mkdir -p /opt/zookeeper
tar zxvf zookeeper-3.4.10-bin.tar.gz -C /opt/zookeeper
# Create the config file from the shipped sample
cp /opt/zookeeper/zookeeper-3.4.10-bin/conf/zoo_sample.cfg /opt/zookeeper/zookeeper-3.4.10-bin/conf/zoo.cfg
Edit the configuration (zoo.cfg):
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial 
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between 
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage; data there is lost on reboot
dataDir=/opt/zookeeper/data
# the port at which the clients will connect
clientPort=2181
# Note: newer ZooKeeper releases embed an admin console (the Jetty-based AdminServer);
# assign any unused port on each node
admin.serverPort=8001
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the 
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
dataLogDir=/opt/zookeeper/dataLog
server.1=hadoop001:2888:3888
server.2=hadoop002:2888:3888
server.3=hadoop003:2888:3888
Distribute the package:
scp -r /opt/zookeeper hadoop002:/opt
scp -r /opt/zookeeper hadoop003:/opt
Configure the Cluster

Create the following directories on all machines:

mkdir -p /opt/zookeeper/data
mkdir -p /opt/zookeeper/dataLog

Create a myid file under /opt/zookeeper/data on every machine; the id just has to match the server.N entries in the config above.

[root@hadoop001 logs]# echo 1 > /opt/zookeeper/data/myid
[root@hadoop002 logs]# echo 2 > /opt/zookeeper/data/myid
[root@hadoop003 logs]# echo 3 > /opt/zookeeper/data/myid
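
The id can also be derived from the hostname instead of being typed per node; a minimal sketch, assuming the hadoop00N naming used above, run on every node:

# Use the last digit of the hostname (hadoop001 -> 1) as the ZooKeeper id
echo "${HOSTNAME: -1}" > /opt/zookeeper/data/myid
cat /opt/zookeeper/data/myid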
Install the JDK

Install it on all machines and configure the environment variables.

Download link: http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html

Downloading JDK 8 (and older) from the official site currently requires an Oracle login; here is a shared account for convenience:
Account: 913898356@qq.com
Password: Oracle123.

After downloading, unpack it.

Configure the JDK environment variables:
[root@hadoop001 logs]# vim /etc/profile
export JAVA_HOME=/opt/java/jdk1.8.0_371
export PATH=$PATH:$JAVA_HOME/bin
[root@hadoop001 logs]# source /etc/profile
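
To confirm that every node picks up the JDK, a quick check over SSH (a sketch, assuming the same install path on all three machines; non-interactive SSH shells do not read /etc/profile, hence the explicit source):

for h in hadoop001 hadoop002 hadoop003; do
  echo "== $h =="
  ssh root@"$h" 'source /etc/profile && java -version'
done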
Start ZooKeeper

Start it on every node; the order does not matter.

cd /opt/zookeeper/zookeeper-3.4.10-bin/bin
# Start ZooKeeper
./zkServer.sh start
# Check status
./zkServer.sh status

Once the cluster is up, zkServer.sh status reports one node as leader and the others as follower.
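
Besides zkServer.sh status, the four-letter-word commands offer a quick scripted probe; a sketch using nc (note that ZooKeeper 3.5+ requires whitelisting these via 4lw.commands.whitelist in zoo.cfg, and some nc variants may need a timeout flag):

for h in hadoop001 hadoop002 hadoop003; do
  printf '%s: ' "$h"
  echo ruok | nc "$h" 2181   # a healthy server replies "imok"
  echo
done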

Debugging

If ZooKeeper misbehaves after startup, stop it and rerun it in the foreground to see the logs:

./zkServer.sh stop
# Show errors on the console
./zkServer.sh start-foreground

Install Storm

Download link

Unpack

mkdir -p /export/servers
tar zxvf apache-storm-1.2.4.tar.gz -C /export/servers
cd /export/servers
ln -s apache-storm-1.2.4 storm
Configure Storm
vi /export/servers/storm/conf/storm.yaml
########### These MUST be filled in for a storm configuration
storm.zookeeper.servers:
    - "hadoop001"
    - "hadoop002"
    - "hadoop003"

storm.local.dir: "/export/servers/workdir"
# Nimbus node(s); in Storm 1.x the key is nimbus.seeds (nimbus.host is the deprecated pre-1.0 name)
nimbus.seeds: ["hadoop001"]
# JVM heap for the nimbus daemon
nimbus.childopts: "-Xmx1024m"
# JVM heap for the supervisor daemon
supervisor.childopts: "-Xmx1024m"
# JVM heap for each worker on a supervisor
worker.childopts: "-Xmx768m"
# Worker ports on each supervisor node (one slot per port)
supervisor.slots.ports:
    - 6700
    - 6701
    - 6702
    - 6703
Distribute the package:
scp -r /export/servers/ hadoop002:/export/
scp -r /export/servers/ hadoop003:/export/
Start the Cluster
# On the nimbus host (hadoop001), start the nimbus service
cd /export/servers/storm/bin/
nohup ./storm nimbus &
# On the same host, start the UI service
cd /export/servers/storm/bin/
nohup ./storm ui &
# On the other nodes, start the supervisor service
cd /export/servers/storm/bin/
nohup ./storm supervisor &
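
All three daemons can also be launched in one pass from hadoop001; a minimal sketch, assuming the same /export/servers/storm path on every node:

STORM=/export/servers/storm/bin/storm
# nimbus and ui on this host (hadoop001)
nohup "$STORM" nimbus > /dev/null 2>&1 &
nohup "$STORM" ui     > /dev/null 2>&1 &
# supervisors on the remaining nodes
for h in hadoop002 hadoop003; do
  ssh root@"$h" "nohup /export/servers/storm/bin/storm supervisor > /dev/null 2>&1 &"
done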
Debug Information

If the cluster fails to start, check the logs:

# Nimbus log
# on the nimbus server
cd /export/servers/storm/logs
tail -100f /export/servers/storm/logs/nimbus.log

# UI log
# on the UI server, usually the same machine as nimbus
cd /export/servers/storm/logs
tail -100f /export/servers/storm/logs/ui.log

# Supervisor log
# on each supervisor node
cd /export/servers/storm/logs
tail -100f /export/servers/storm/logs/supervisor.log

# Worker log (one file per worker port)
# on each supervisor node
cd /export/servers/storm/logs
tail -100f /export/servers/storm/logs/worker-6702.log
Access the UI

Open port 8080 on the nimbus host to see the Storm UI.
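
The UI also exposes a REST API that is convenient for scripted health checks; a sketch using the cluster summary endpoint of the Storm 1.x UI:

curl -s http://hadoop001:8080/api/v1/cluster/summary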

Deploy Kafka

Since the application consumes data from Kafka, Kafka also needs to be deployed.

Download link

Unpack (creating /app first): mkdir -p /app && tar -zxvf kafka_2.13-3.0.0.tgz -C /app

Create the data directory: cd /app/kafka_2.13-3.0.0/ && mkdir -p /app/kafka_2.13-3.0.0/logs

Edit the config file config/server.properties:

# advertised.listeners is the address exposed to clients; the actual binding uses listeners.
# In a cluster, set this to each node's own IP or hostname.
advertised.listeners=PLAINTEXT://hadoop001:9092
# Globally unique broker id within the cluster; must not repeat
broker.id=0
# Enable topic deletion
delete.topic.enable=true
# Topic auto-creation; with false, producers can only send to topics that already exist
auto.create.topics.enable=false
# Number of threads handling network requests
num.network.threads=3
# Number of threads handling disk I/O
num.io.threads=8
# Send buffer size of the socket
socket.send.buffer.bytes=102400
# Receive buffer size of the socket
socket.receive.buffer.bytes=102400
# Maximum size of a socket request
socket.request.max.bytes=104857600
# Move the Kafka log directory out of /tmp, since /tmp is cleared on reboot
log.dirs=/app/kafka_2.13-3.0.0/logs
# Number of partitions per topic on this broker; usually kept in line with the broker count
num.partitions=3
# Threads used to recover and clean up data under log.dirs
num.recovery.threads.per.data.dir=1
# Maximum time a segment file is retained before deletion
log.retention.hours=168
# ZooKeeper cluster connection string (Kafka ships a bundled ZooKeeper, but we use the cluster deployed above)
zookeeper.connect=hadoop001:2181,hadoop002:2181,hadoop003:2181
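
broker.id and advertised.listeners must differ on every node; a minimal sketch that derives both from the hostname after the config is copied to each machine (assumes the hadoop00N naming and install path above, and that both lines exist uncommented in server.properties):

N="${HOSTNAME: -1}"                                  # hadoop001 -> 1
CFG=/app/kafka_2.13-3.0.0/config/server.properties
sed -i "s/^broker.id=.*/broker.id=$((N-1))/" "$CFG"
sed -i "s#^advertised.listeners=.*#advertised.listeners=PLAINTEXT://$HOSTNAME:9092#" "$CFG"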

Start Kafka

nohup bin/kafka-server-start.sh config/server.properties > logs/kafka.log 2>&1 &
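
Before creating topics, it is worth confirming the broker answers; kafka-broker-api-versions.sh ships with the Kafka distribution and fails fast if the broker is unreachable:

bin/kafka-broker-api-versions.sh --bootstrap-server localhost:9092 | head -n 1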

Create a Topic and Send Data

bin/kafka-topics.sh --bootstrap-server localhost:9092 --create --topic testport --partitions 1 --replication-factor 1

Send and consume data:

bin/kafka-console-producer.sh --broker-list localhost:9092 --topic testport < logs/producer.txt 
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic testport --from-beginning
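
An end-to-end smoke test (the topic name smoketest is just illustrative; auto-creation is disabled in the config above, so the topic must be created first):

bin/kafka-topics.sh --bootstrap-server localhost:9092 --create --topic smoketest --partitions 1 --replication-factor 1
echo "hello storm" | bin/kafka-console-producer.sh --broker-list localhost:9092 --topic smoketest
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic smoketest --from-beginning --max-messages 1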