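RabbitMQ Learning Series (6): RabbitMQ High-Availability Cluster

Earlier posts covered installing and using RabbitMQ, and how to apply it in typical business scenarios. If you missed them, see my earlier posts: http://www.cnblogs.com/zhangweizhong/category/855479.html

I had long meant to write an article on RabbitMQ high-availability clusters. Then I found that a RabbitMQ expert on cnblogs had already written one, and by coincidence he is a former colleague. So I will not repeat the work and simply quote his article here. Original: http://www.cnblogs.com/flat_peach/archive/2013/04/07/3004008.html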

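RabbitMQ is written in Erlang, and clustering is very convenient because Erlang is inherently a distributed language. RabbitMQ itself, however, does not provide load balancing.

RabbitMQ deployments fall roughly into three modes: single-node mode, normal mode, and mirrored mode.

Single-node mode: the simplest case, with no clustering.

There is not much to say about it.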

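Normal mode: the default cluster mode.

For a queue, the message bodies exist on only one node. Nodes A and B share only the same metadata, that is, the queue structure.

When messages enter a queue on node A and a consumer pulls from node B, RabbitMQ transfers the messages between A and B on demand: it takes the message body out of A and delivers it to the consumer through B.

So consumers should connect to every node where possible and fetch messages from each. In other words, for one logical queue, physical queues should be created on several nodes. Otherwise, whether the consumer connects to A or B, the exit point is always A, which becomes a bottleneck.

One problem with this mode is that when node A fails, node B cannot obtain the unconsumed message bodies held on A.

If the messages were persisted, they can be consumed only after node A recovers. If they were not persisted, they are simply gone.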
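
The behaviour of normal mode can be illustrated with a toy model. This is only a sketch: the class and method names (`NormalCluster`, `declare_queue`, `consume`) are invented for illustration and are not part of RabbitMQ. It assumes each queue's message bodies live on exactly one owner node, while every node knows the queue metadata.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.messages = {}  # queue name -> message bodies stored on this node


class NormalCluster:
    """Toy model: metadata is shared, message bodies stay on the owner node."""

    def __init__(self, node_names):
        self.nodes = {name: Node(name) for name in node_names}
        self.queue_owner = {}  # shared metadata: queue name -> owner node name

    def declare_queue(self, queue, owner):
        self.queue_owner[queue] = owner
        self.nodes[owner].messages[queue] = []

    def publish(self, queue, body):
        owner = self.nodes[self.queue_owner[queue]]
        if not owner.up:
            raise RuntimeError("owner node %s is down" % owner.name)
        owner.messages[queue].append(body)

    def consume(self, queue, via):
        # The consumer may connect to any running node ...
        if not self.nodes[via].up:
            raise RuntimeError("node %s is down" % via)
        # ... but the body is always fetched from the owner node.
        owner = self.nodes[self.queue_owner[queue]]
        if not owner.up:
            raise RuntimeError("owner node %s is down" % owner.name)
        stored = owner.messages[queue]
        return stored.pop(0) if stored else None
```

With a queue owned by A, `consume("q", via="B")` still returns the message, relayed from A. Once A is marked down, the same call fails even though B is running, which is the weakness described above.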

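Mirrored mode: the queues that need it are made into mirrored queues that exist on multiple nodes. This is RabbitMQ's HA solution.

This mode solves the problem above. The essential difference from normal mode is that message bodies are actively synchronized between the mirror nodes, rather than pulled on demand when a consumer fetches data.

The side effects are also obvious. Besides lowering system performance, if there are too many mirrored queues and a large volume of messages arrives, this synchronization traffic consumes much of the cluster's internal network bandwidth.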
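
To get a feel for the bandwidth cost, here is a back-of-the-envelope estimate. It is a deliberate simplification: it assumes every published message body is sent once to each additional mirror and ignores protocol overhead, acknowledgements, and resynchronization. The function name is made up for this example.

```python
def mirror_sync_bytes_per_sec(publish_rate, message_bytes, mirror_nodes):
    """Rough estimate of intra-cluster replication traffic.

    publish_rate  -- messages published per second
    message_bytes -- average message body size in bytes
    mirror_nodes  -- number of nodes holding a copy of the queue
    """
    if mirror_nodes < 1:
        raise ValueError("a queue lives on at least one node")
    # Each message must reach every node except the one that received it.
    return publish_rate * message_bytes * (mirror_nodes - 1)


# 10,000 msg/s of 1 KiB messages mirrored across 3 nodes
print(mirror_sync_bytes_per_sec(10_000, 1024, 3))  # 20480000
```

Under these assumptions, three mirrors turn roughly 10 MB/s of publishing into about 20 MB/s of extra traffic inside the cluster, and the figure grows with every node added.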

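So it is suited to situations with high reliability requirements (this mode is described in detail later; the environment we built uses it).

Basic cluster concepts:

RabbitMQ cluster nodes are either RAM nodes or disk nodes. As the names suggest, a RAM node keeps all its data in memory and a disk node keeps its data on disk. However, as noted earlier, if message persistence is enabled when publishing, the data is still safely stored on disk even on a RAM node.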

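A RabbitMQ cluster shares users, vhosts, queues, exchanges, and so on. All data and state must be replicated across all nodes, with one exception: message queues, which currently reside only on the node that created them, although they are visible to and reachable from every node. Nodes can be added to a cluster dynamically, and a node can also be removed from it. The cluster performs a basic form of load balancing.
There are two kinds of nodes in a cluster:
1 RAM node: keeps state only in memory (one exception: the persistent contents of durable queues are saved to disk).
2 Disk node: keeps state both in memory and on disk.
Because a RAM node does not write to disk, it performs better than a disk node. A cluster needs only one disk node to persist its state.
If a cluster contains only RAM nodes, they must not all be stopped, or all state, messages, and so on will be lost.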

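Approach:

To make RabbitMQ highly available, we first build a normal-mode cluster and then configure mirrored mode on top of it. A reverse proxy is placed in front of the RabbitMQ cluster, and producers and consumers reach the cluster through it.

Architecture diagram (image from http://www.nsbeta.info):

In the diagram, three RabbitMQ instances run on the same host, each on a different service port. In production, of course, the instances must run on different physical servers; otherwise high availability is meaningless.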

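Cluster mode configuration

A possible architecture uses four machines. Within the cluster, one machine runs as a disk node and two run as RAM nodes. The two RAM nodes are undoubtedly faster, so clients (consumers and producers) connect to them. The disk node is slower because of disk I/O, so it is used only for data backup. The fourth machine serves as the reverse proxy.

The four servers' hostnames are: queue, panyuntao1, panyuntao2, and panyuntao3 (IP: 172.16.3.110).

Configuring a RabbitMQ cluster is simple and takes only a few commands. The steps are:

step1: queue, panyuntao1, and panyuntao2 serve as the RabbitMQ cluster nodes. Install rabbitmq-server on each of them, then start it on each.

Start command: # rabbitmq-server start. For the installation procedure and startup command, see: http://www.cnblogs.com/flat_peach/archive/2013/03/04/2943574.html

step2: On each of the three installed nodes, edit /etc/hosts to map queue, panyuntao1, and panyuntao2, for example: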

172.16.3.32 queue

172.16.3.107 panyuntao1

172.16.3.108 panyuntao2

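The hostname file must also be correct (queue, panyuntao1, and panyuntao2 respectively). If you need to change a hostname, do so before installing RabbitMQ.

Note that RabbitMQ cluster nodes must be on the same network segment; clustering across a WAN performs poorly.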
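
Before moving on, it can help to confirm that every node's hosts file actually contains all three names. The following is a minimal sketch with hypothetical helper names (`parse_hosts`, `missing_hosts`); it only inspects the text of a hosts file and does not test real name resolution.

```python
def parse_hosts(text):
    """Map each hostname in hosts-file text to its first IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, ip)
    return mapping


def missing_hosts(text, required):
    """Return the required hostnames that the hosts text does not define."""
    known = parse_hosts(text)
    return [name for name in required if name not in known]


sample = """
172.16.3.32 queue
172.16.3.107 panyuntao1
172.16.3.108 panyuntao2
"""
print(missing_hosts(sample, ["queue", "panyuntao1", "panyuntao2"]))  # []
```

On a real node the text would come from reading /etc/hosts, and an empty result means all cluster hostnames are defined.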

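step3: Set the cookie on each node

RabbitMQ clustering relies on Erlang clustering, so the Erlang cluster environment must be set up first. Nodes in an Erlang cluster authenticate to each other with a magic cookie, which is stored in /var/lib/rabbitmq/.erlang.cookie with permission 400. The cookie must be identical on all nodes; otherwise the nodes cannot communicate.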

-r--------. 1 rabbitmq rabbitmq 20 3月 5 00:00 /var/lib/rabbitmq/.erlang.cookie

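Copy the .erlang.cookie value from one node and save it on the other nodes. Using scp also works, but mind the file's permissions, owner, and group.

Here we copy the cookie from queue to panyuntao1 and panyuntao2. First change the permission of .erlang.cookie on panyuntao1 and panyuntao2: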

#chmod 777 /var/lib/rabbitmq/.erlang.cookie

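Copy /var/lib/rabbitmq/.erlang.cookie from queue to the same location on panyuntao1 and panyuntao2 (the reverse direction works too). This file is the authentication key for communication between cluster nodes and must be identical on all nodes. Restart RabbitMQ after copying.

After copying, remember to restore the permission of .erlang.cookie, or you may run into errors: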

#chmod 400 /var/lib/rabbitmq/.erlang.cookie
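
A quick way to confirm the copy worked is to compare a fingerprint of the cookie on each node and to check the file mode. The sketch below uses hypothetical helper names. It hashes the cookie rather than printing it, and it strips surrounding whitespace on the assumption that a stray trailing newline from copy and paste should not count as a difference.

```python
import hashlib
import os
import stat


def cookie_fingerprint(path):
    """SHA-256 of the cookie contents, ignoring surrounding whitespace."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read().strip()).hexdigest()


def cookie_mode_ok(path, expected=0o400):
    """True if the file's permission bits equal the expected mode."""
    return stat.S_IMODE(os.stat(path).st_mode) == expected
```

Run `cookie_fingerprint("/var/lib/rabbitmq/.erlang.cookie")` on each node and compare the results; all three nodes should print the same value, and `cookie_mode_ok` should return True once the permission has been restored to 400.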

Once the cookie is set, restart RabbitMQ on all three nodes:

# rabbitmqctl stop

# rabbitmq-server start

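step4: Stop the RabbitMQ service on all nodes, then start each one independently with the -detached option. This step is critical. In particular, if a node fails to start again after nodes have been added or stopped, follow this same sequence.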

queue# rabbitmqctl stop

panyuntao1# rabbitmqctl stop
panyuntao2# rabbitmqctl stop

queue# rabbitmq-server -detached

panyuntao1# rabbitmq-server -detached
panyuntao2# rabbitmq-server -detached

Check the status of each node:

queue# rabbitmqctl cluster_status

Cluster status of node rabbit@queue ...

[{nodes,[{disc,[rabbit@queue]}]},
{running_nodes,[rabbit@queue]},
{partitions,[]}]
...done.

panyuntao1# rabbitmqctl cluster_status

Cluster status of node rabbit@panyuntao1...

[{nodes,[{disc,[rabbit@panyuntao1]}]},

{running_nodes,[rabbit@panyuntao1]},

{partitions,[]}]
...done.

panyuntao2# rabbitmqctl cluster_status

Cluster status of node rabbit@panyuntao2...

[{nodes,[{disc,[rabbit@panyuntao2]}]},

{running_nodes,[rabbit@panyuntao2]},

{partitions,[]}]
...done.

step4: Join panyuntao1 and panyuntao2 to queue as RAM nodes by running the following commands on panyuntao1 and panyuntao2:

panyuntao1# rabbitmqctl stop_app

panyuntao1# rabbitmqctl join_cluster --ram rabbit@queue

panyuntao1# rabbitmqctl start_app

panyuntao2# rabbitmqctl stop_app

panyuntao2# rabbitmqctl join_cluster --ram rabbit@queue (panyuntao1 was already joined to queue above, so panyuntao2 could instead join via panyuntao1 and would end up in the same cluster)

panyuntao2# rabbitmqctl start_app

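The commands above first stop the RabbitMQ application, then call join_cluster to connect panyuntao1 to queue so that the two form a cluster, and finally restart the RabbitMQ application. With this command, panyuntao1 and panyuntao2 are RAM nodes and queue is a disk node (a node is a disk node by default after RabbitMQ starts).

To make panyuntao1 or panyuntao2 a disk node in the cluster as well, drop the --ram option from the join_cluster command: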

#rabbitmqctl join_cluster rabbit@queue

As long as a node includes itself in the node list, it becomes a disk node. A RabbitMQ cluster must contain at least one disk node.
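
The join sequence is the same three commands for every node, differing only in the --ram option. As a sketch, a small helper (the name `join_cluster_commands` is invented here) can build the sequence as argument lists so it can be reviewed before anything is executed:

```python
def join_cluster_commands(seed_node, ram=True):
    """Build the rabbitmqctl commands that join the local node to seed_node.

    ram=True adds --ram (RAM node); ram=False joins as a disk node.
    The commands are only returned, not executed.
    """
    join = ["rabbitmqctl", "join_cluster"]
    if ram:
        join.append("--ram")
    join.append(seed_node)
    return [
        ["rabbitmqctl", "stop_app"],
        join,
        ["rabbitmqctl", "start_app"],
    ]


for cmd in join_cluster_commands("rabbit@queue"):
    print(" ".join(cmd))
```

Each list could then be passed to `subprocess.run(cmd, check=True)` on the node being joined, so that a failure at any step stops the sequence.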

step5: On queue, panyuntao1, and panyuntao2, run the cluster_status command to view the cluster state:

[root@queue ~]# rabbitmqctl cluster_status
Cluster status of node rabbit@queue ...
[{nodes,[{disc,[rabbit@queue]},{ram,[rabbit@panyuntao2,rabbit@panyuntao1]}]},
{running_nodes,[rabbit@panyuntao2,rabbit@panyuntao1,rabbit@queue]},
{partitions,[]}]
...done.

[root@panyuntao1 rabbitmq]# rabbitmqctl cluster_status
Cluster status of node rabbit@panyuntao1 ...
[{nodes,[{disc,[rabbit@queue]},{ram,[rabbit@panyuntao2,rabbit@panyuntao1]}]},
{running_nodes,[rabbit@panyuntao2,rabbit@queue,rabbit@panyuntao1]},
{partitions,[]}]
...done.

[root@panyuntao2 rabbitmq]# rabbitmqctl cluster_status
Cluster status of node rabbit@panyuntao2 ...
[{nodes,[{disc,[rabbit@queue]},{ram,[rabbit@panyuntao2,rabbit@panyuntao1]}]},
{running_nodes,[rabbit@panyuntao1,rabbit@queue,rabbit@panyuntao2]},
{partitions,[]}]
...done.

We can now see the cluster information on each node: two RAM nodes and one disk node.
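
If you want to check this in a script rather than by eye, the output can be parsed with a few regular expressions. This sketch assumes the Erlang-term output format shown above; other RabbitMQ versions may print cluster_status differently, so treat the patterns as tied to this format. The function name is made up for the example.

```python
import re


def parse_cluster_status(text):
    """Extract disc, ram and running node names from cluster_status output."""

    def names(key):
        # Matches e.g. {ram,[rabbit@a,rabbit@b]} and captures the list body.
        match = re.search(r"\{%s,\[([^\]]*)\]\}" % key, text)
        if not match:
            return []
        return [n.strip() for n in match.group(1).split(",") if n.strip()]

    return {
        "disc": names("disc"),
        "ram": names("ram"),
        "running": names("running_nodes"),
    }


output = """Cluster status of node rabbit@queue ...
[{nodes,[{disc,[rabbit@queue]},{ram,[rabbit@panyuntao2,rabbit@panyuntao1]}]},
{running_nodes,[rabbit@panyuntao2,rabbit@panyuntao1,rabbit@queue]},
{partitions,[]}]
...done."""

status = parse_cluster_status(output)
stopped = set(status["disc"] + status["ram"]) - set(status["running"])
print(status["disc"], status["ram"], sorted(stopped))
```

An empty `stopped` set means every cluster member is currently running.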

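step6: Write messages to a queue on any cluster node. The queue appears on the other nodes as well, and every node reports the same message count (for how to send messages, see: http://www.cnblogs.com/flat_peach/archive/2013/03/04/2943574.html):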

root@panyuntao2 :~# rabbitmqctl list_queues -p hrsystem

Listing queues …
test_queue 10000
…done.

root@panyuntao1 :~# rabbitmqctl list_queues -p hrsystem

Listing queues …
test_queue 10000
…done.

root@queue:~# rabbitmqctl list_queues -p hrsystem

Listing queues …
test_queue 10000
…done.

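The -p option specifies the vhost name.

The RabbitMQ cluster is now working normally.

This mode is better suited to non-durable queues: only if the queue is non-durable can clients reconnect to another node in the cluster and re-create the queue. If the queue is durable, the only option is to bring the failed node back up.

Why doesn't RabbitMQ replicate queues to every node in the cluster? Doing so would conflict with the design goal of clustering, which is that adding nodes should linearly increase performance (CPU, memory) and capacity (memory, disk). The reasons are: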

1. storage space: If every cluster node had a full copy of every queue, adding nodes wouldn’t give you more storage capacity. For example, if one node could store 1GB of messages, adding two more nodes would simply give you two more copies of the same 1GB of messages.

2. performance: Publishing messages would require replicating those messages to every cluster node. For durable messages that would require triggering disk activity on all nodes for every message. Your network and disk load would increase every time you added a node, keeping the performance of the cluster the same (or possibly worse).

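Newer RabbitMQ versions do support queue replication within a cluster (through a configurable option). For example, in a five-node cluster you can specify that a queue's contents be stored on two nodes, striking a balance between performance and high availability.

Mirrored mode configuration

The default cluster mode configured above does not guarantee high availability for queues. Exchanges and bindings are replicated to every node in the cluster, but queue contents are not. Although this mode relieves some load on individual nodes, if the node hosting a queue goes down, the queue becomes unusable until that node restarts. To keep a queue usable when its node crashes or fails, the queue contents must be replicated to every node in the cluster, which requires creating mirrored queues.

Let us see how mirrored mode solves the replication problem and thereby improves availability.

step1: Add a load balancer

As for load balancers, commercial products such as F5's BIG-IP and Radware's AppDirector are hardware-based and offer very high throughput, but their price puts many people off, so software load balancing is the alternative. Software load balancers commonly used by internet companies include LVS, HAProxy, and Nginx. LVS works at the kernel level, forwarding packets mainly at layer 4, and is relatively complex to use. HAProxy and Nginx are application-layer products, but Nginx is mainly used for HTTP, so HAProxy is chosen here as the load balancer in front of RabbitMQ.

Installing and using HAProxy is simple. On CentOS, run yum install haproxy and then edit /etc/haproxy/haproxy.cfg. The file looks roughly like this: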

#---------------------------------------------------------------------
defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 3
    timeout http-request    10s
    timeout queue           1m
    timeout connect         10s
    timeout client          1m
    timeout server          1m
    timeout http-keep-alive 10s
    timeout check           10s
    maxconn                 3000

listen rabbitmq_cluster 0.0.0.0:5672

    mode tcp

    balance roundrobin

    server   rqslave1 172.16.3.107:5672 check inter 2000 rise 2 fall 3

    server   rqslave2 172.16.3.108:5672 check inter 2000 rise 2 fall 3

# server   rqmaster 172.16.3.32:5672 check inter 2000 rise 2 fall 3

#---------------------------------------------------------------------

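The load balancer listens on port 5672 and round-robins across port 5672 of our two RAM nodes, 172.16.3.107 and 172.16.3.108. 172.16.3.32 is the disk node; it serves only as a backup and is not exposed to producers or consumers. If server resources allow, you can also configure multiple disk nodes, so that the failure of one disk node has no impact unless all of them fail at the same time.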
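
The `balance roundrobin` and `check inter 2000 rise 2 fall 3` settings can be pictured with a small simulation. This is a toy model, not HAProxy code: the class names are invented, and it captures only the idea that a server is marked down after `fall` consecutive failed checks, marked up again after `rise` consecutive successful checks, and skipped by the round-robin picker while it is down.

```python
class Server:
    def __init__(self, name, rise=2, fall=3):
        self.name = name
        self.rise = rise
        self.fall = fall
        self.up = True
        self.streak = 0  # consecutive checks that contradict the current state

    def record_check(self, ok):
        if ok == self.up:
            self.streak = 0  # check agrees with current state
            return
        self.streak += 1
        threshold = self.fall if self.up else self.rise
        if self.streak >= threshold:
            self.up = ok
            self.streak = 0


class RoundRobin:
    def __init__(self, servers):
        self.servers = servers
        self.index = 0

    def pick(self):
        """Return the next healthy server name, or None if all are down."""
        for _ in range(len(self.servers)):
            server = self.servers[self.index % len(self.servers)]
            self.index += 1
            if server.up:
                return server.name
        return None
```

With two servers named rqslave1 and rqslave2, `pick()` alternates between them. After three failed checks on rqslave1, every connection goes to rqslave2 until rqslave1 passes two checks in a row.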

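step2: Configure a policy

RabbitMQ's mirroring feature is implemented through RabbitMQ policies. A policy controls and modifies the cluster-wide behavior of queues and exchanges in a given vhost.

Set the policy on any node in the cluster; it is synchronized automatically to the other cluster nodes.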

# rabbitmqctl set_policy -p hrsystem ha-allqueue "^" '{"ha-mode":"all"}'

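This command creates a policy named ha-allqueue in the vhost hrsystem. The policy mode is all, meaning queues are replicated to every node, including nodes added later. The policy's regular expression "^" matches all queue names.

For example: rabbitmqctl set_policy -p hrsystem ha-allqueue "^message" '{"ha-mode":"all"}'

Note: adjust the "^message" pattern to suit your own queues; it mirrors only queues whose names begin with "message". Our configuration applies to all queues, so the expression is "^".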
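
A pattern can be tried against your queue names before the policy is set. The sketch below uses Python's `re` module as a stand-in; RabbitMQ evaluates the pattern with Erlang's regular expression engine, so unusual syntax may behave differently, but simple prefixes such as "^" and "^message" behave the same way. The queue names other than test_queue are made up for the example.

```python
import re


def matching_queues(pattern, queue_names):
    """Return the queue names that a policy pattern would select."""
    regex = re.compile(pattern)
    # search() is unanchored, so only an explicit ^ ties the match to the start.
    return [name for name in queue_names if regex.search(name)]


queues = ["message.orders", "message.audit", "test_queue"]
print(matching_queues("^message", queues))  # ['message.orders', 'message.audit']
print(matching_queues("^", queues))         # all three names
```

Note that a pattern without "^", such as "message", would also match a queue named "old_message", which is usually not what is intended.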

For the official set_policy documentation, see:

set_policy [-p vhostpath] {name} {pattern} {definition} [priority]

(http://www.rabbitmq.com/man/rabbitmqctl.1.man.html)

ha-mode:

ha-mode	ha-params	Result
all	(absent)	Queue is mirrored across all nodes in the cluster. When a new node is added to the cluster, the queue will be mirrored to that node.
exactly	count	Queue is mirrored to count nodes in the cluster. If there are less than count nodes in the cluster, the queue is mirrored to all nodes. If there are more than count nodes in the cluster, and a node containing a mirror goes down, then a new mirror will not be created on another node. (This is to prevent queues migrating across a cluster as it is brought down.)
nodes	node names	Queue is mirrored to the nodes listed in node names. If any of those node names are not a part of the cluster, this does not constitute an error. If none of the nodes in the list are online at the time when the queue is declared then the queue will be created on the node that the declaring client is connected to.
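
The table can be read as a function from policy settings to a set of nodes. The sketch below is only an illustration of the table, not RabbitMQ's actual selection logic: the function name is invented, and for exactly mode it simply assumes the declaring node plus the next nodes in list order, whereas RabbitMQ chooses the nodes itself.

```python
def mirror_targets(ha_mode, ha_params, cluster_nodes, declaring_node):
    """Return the nodes that would hold a copy of a newly declared queue."""
    if ha_mode == "all":
        return list(cluster_nodes)
    if ha_mode == "exactly":
        # Assumption: keep the declaring node and fill up in list order.
        others = [n for n in cluster_nodes if n != declaring_node]
        return ([declaring_node] + others)[:ha_params]
    if ha_mode == "nodes":
        # Listed names outside the cluster are ignored, not an error.
        present = [n for n in ha_params if n in cluster_nodes]
        return present or [declaring_node]
    raise ValueError("unknown ha-mode: %r" % ha_mode)


cluster = ["rabbit@queue", "rabbit@panyuntao1", "rabbit@panyuntao2"]
print(mirror_targets("exactly", 2, cluster, "rabbit@panyuntao1"))
```

With count larger than the cluster size, the exactly branch returns every node, matching the second row of the table. With a nodes list that names no cluster member, the queue falls back to the node the declaring client is connected to.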