Hadoop Storm (Part 8)

Storm: a distributed real-time analytics and computation system

Storm concepts
Basic Storm concepts

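Topologies: a topology is one job submitted to Storm
Spouts: the message sources of a topology
Bolts: the processing-logic units at the nodes of a topology
Configuration: the topology configuration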

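Tuple: a message tuple, the data format passed between Spouts and Bolts (a wrapper around values in a user-defined format)
Stream: a sequence of tuples; different streams take different paths through the topology
Stream groupings: the strategy that decides how a stream is divided among the tasks of the receiving Bolt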

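Tasks: the units of processing; each Spout or Bolt runs as one or more tasks
Executor: a worker thread, which runs tasks inside a worker process
Workers: worker processes, each hosting one or more executors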
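
To make the idea of a stream grouping concrete, here is a toy sketch in shell. It is not how Storm is implemented: the function names are made up for illustration, and cksum merely stands in for whatever hash Storm computes internally. It only shows the property that matters: a fields grouping sends tuples with the same key to the same task, while a shuffle grouping ignores tuple content.

```shell
#!/usr/bin/env bash
# Toy model of a fields grouping: derive a task index from a hash of the key.
# Assumption: cksum is only a stand-in hash chosen for this illustration.
task_for_key() {
  local key="$1" num_tasks="$2" hash
  hash=$(printf '%s' "$key" | cksum | cut -d' ' -f1)
  echo $(( hash % num_tasks ))
}

# Toy model of a shuffle grouping: the task is picked without looking at the tuple.
task_shuffle() {
  local num_tasks="$1"
  echo $(( RANDOM % num_tasks ))
}

# Route a few words to 4 hypothetical bolt tasks; both "storm" lines get the same task.
for word in storm hadoop storm zookeeper; do
  echo "$word -> task $(task_for_key "$word" 4)"
done
```

Keyed routing of this kind is what lets a counting Bolt keep a per-word total in local memory.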

Setting up a Storm cluster

First, install a ZooKeeper cluster on cluster1-3

# To avoid conflicts, first archive and remove any existing zookeeper directory (run in /root on cluster1-3)
tar -zcvf z.tar.gz zookeeper-3.4.10
rm -rf /root/zookeeper-3.4.10
# Run on cluster1
tar -zxvf zookeeper-3.4.10.tar.gz
cd zookeeper-3.4.10/conf/
mv zoo_sample.cfg zoo.cfg
vim zoo.cfg
# In zoo.cfg, set dataDir and add one server entry per node:
dataDir=/root/zookeeper-3.4.10/data
server.1=cluster1:2888:3888
server.2=cluster2:2888:3888
server.3=cluster3:2888:3888
# Back in the shell: create the data directory and this node's myid
cd ../ && mkdir data && cd data
echo 1 > myid
cd
# Copy ZooKeeper to cluster2 and cluster3
scp -r ./zookeeper-3.4.10 root@cluster2:/root
scp -r ./zookeeper-3.4.10 root@cluster3:/root
# On cluster2
echo 2 > /root/zookeeper-3.4.10/data/myid
# On cluster3
echo 3 > /root/zookeeper-3.4.10/data/myid
# Run on each of cluster1-3 (full path, in case bin/ is not on PATH)
/root/zookeeper-3.4.10/bin/zkServer.sh start
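
Each node's myid has to match the N of its own server.N line in zoo.cfg, and that is easy to get wrong when typing by hand. A minimal sketch of deriving both from one host list follows; the helper names render_zoo_servers and myid_for_host are hypothetical, and ports 2888 and 3888 are simply the ones used above.

```shell
#!/usr/bin/env bash
# Print one "server.N=host:2888:3888" line per host, numbering from 1.
render_zoo_servers() {
  local n=1 host
  for host in "$@"; do
    echo "server.${n}=${host}:2888:3888"
    n=$(( n + 1 ))
  done
}

# Read server.N lines on stdin and print the N that belongs to the given host.
myid_for_host() {
  local host="$1"
  sed -n "s/^server\.\([0-9][0-9]*\)=${host}:.*/\1/p"
}

# Example: generate the entries, then look up the id cluster2 should write to myid.
servers=$(render_zoo_servers cluster1 cluster2 cluster3)
echo "$servers"
echo "cluster2 myid: $(echo "$servers" | myid_for_host cluster2)"
```

Once the servers are started, running zkServer.sh status on each node should report one leader and two followers.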

Copy the installation package to cluster1

scp ./apache-storm-1.1.1.tar.gz root@cluster1:/root/

Run on cluster1

tar -zxvf apache-storm-1.1.1.tar.gz
mv apache-storm-1.1.1 storm
cd storm/conf
vim storm.yaml
# In storm.yaml, add:
storm.zookeeper.servers:
  - "cluster1"
  - "cluster2"
  - "cluster3"
nimbus.seeds: ["cluster1"]
# storm.zookeeper.root is a znode path inside ZooKeeper (default "/storm"), not a filesystem directory
storm.zookeeper.root: "/root/zookeeper-3.4.10"
# Back in the shell: copy Storm to cluster2 and cluster3
cd
scp -r storm root@cluster2:/root/
scp -r storm root@cluster3:/root/
# Add Storm to PATH (on cluster1-3)
echo "export PATH=\${PATH}:/root/storm/bin" >> ~/.bashrc
source ~/.bashrc
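
Before starting any daemons it can help to confirm that the edited storm.yaml really defines the keys set above, rather than leaving them commented out. The following is a rough sketch, not a YAML parser; check_storm_yaml is a hypothetical helper that only greps for uncommented key names.

```shell
#!/usr/bin/env bash
# Return 0 if the given storm.yaml has an uncommented entry for each required key.
check_storm_yaml() {
  local file="$1" key missing=0
  for key in storm.zookeeper.servers nimbus.seeds; do
    # A line such as "# nimbus.seeds: ..." does not match, because "#" is not whitespace.
    if ! grep -q "^[[:space:]]*${key}[[:space:]]*:" "$file"; then
      echo "missing key: ${key}" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example usage (the path assumes the layout used in this post):
# check_storm_yaml /root/storm/conf/storm.yaml && echo "storm.yaml looks complete"
```

Running the same check on cluster2 and cluster3 after the scp step confirms that the copies carry the same settings.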

Run on the Nimbus host (cluster1)

nohup storm nimbus 1>/dev/null 2>&1 &
nohup storm ui 1>/dev/null 2>&1 &

Open the Storm UI at http://cluster1:8080/

Run on the supervisor hosts (cluster2-3) once Nimbus is up; each supervisor you add registers itself and is then managed by Nimbus

nohup storm supervisor 1>/dev/null 2>&1 &
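
Since the supervisors should only be started after Nimbus is running, one option is to poll the Nimbus port first. This is a sketch under assumptions: it relies on bash's /dev/tcp redirection and the coreutils timeout command, the helper names are hypothetical, and 6627 is Storm's default nimbus.thrift.port, which applies only if storm.yaml does not override it.

```shell
#!/usr/bin/env bash
# Return 0 if a TCP connection to host:port succeeds within 2 seconds.
port_open() {
  local host="$1" port="$2"
  timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Poll once per second until the port opens or the attempts (default 30) run out.
wait_for_port() {
  local host="$1" port="$2" attempts="${3:-30}" i=0
  while [ "$i" -lt "$attempts" ]; do
    port_open "$host" "$port" && return 0
    i=$(( i + 1 ))
    sleep 1
  done
  return 1
}

# Example usage on a supervisor host:
# if wait_for_port cluster1 6627; then nohup storm supervisor 1>/dev/null 2>&1 & fi
```

The same helper can be pointed at cluster1 port 8080 to check that the UI has come up.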
