Friday, 7 May 2010

HA of an MQLink between SIBus and WMQ

Just answered a question from a colleague in Switzerland & I thought that it would be worth posting the answer here for posterity.

Suppose you have an SIBus with a Foreign Bus Connection, or MQLink, between a queue manager and the SIBus. And further suppose that you've created a cluster bus member that contains one messaging engine (ME), which hosts the SIBus end of the MQLink. The SIBus end of the MQLink knows the endpoint of the queue manager, and the queue manager end of the link knows the endpoint of the messaging engine, which is configured in the CONNAME attribute of the WMQ sender channel.

You can configure the messaging engine to fail over between the servers in the WAS cluster, for high availability. When the messaging engine moves (i.e. fails over) from one server to another, it will be listening on a different host/port. This raises the question of what you should configure for the host/port in the WMQ sender channel's CONNAME.

There are 3 solutions to this problem, depending on which version of WMQ you are using:

1. If you are using WMQ prior to v7.0.1 then:

a) you could use a shared disk style HA cluster to manage the ME and its endpoint (note: this is not a nice solution, I only mention it for completeness)

b) you could install WMQ SupportPac MR01 at the queue manager, which adds a channel exit to the sender channel. You can then configure a list of the endpoints of the WAS servers, and the exit uses this list to select the endpoint to put in the CONNAME when starting the channel. This has the advantage that the channel exit "remembers" the last known good endpoint, which minimises reconnect time.

2. If you are using WMQ 7.0.1 then you can configure a comma-separated list of endpoints in the CONNAME of the sender channel. A disadvantage of this is that when you start the sender channel it always searches from the beginning of the list, trying each endpoint in turn, so there can be a slight delay before a successful connection is made. This is not significant provided you don't disconnect an idle channel too eagerly, i.e. you set the DISCINT (disconnect interval) to be relatively long.
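To make the difference between options 1b and 2 a bit more concrete, here is a rough Python sketch. It is only a toy model: the host names and port are made up, `build_conname` and `connect_in_order` are hypothetical helpers of mine, and the "remembered endpoint" behaviour is my simplification of what the MR01 exit does, not its actual code. It shows the host(port),host(port) shape of a CONNAME list and counts how many endpoints get tried before the channel finds the server currently hosting the ME.

```python
from typing import Callable, Optional, Sequence, Tuple

Endpoint = Tuple[str, int]


def build_conname(endpoints: Sequence[Endpoint]) -> str:
    """Format endpoints as a comma-separated CONNAME list: host(port),host(port)."""
    return ",".join(f"{host}({port})" for host, port in endpoints)


def connect_in_order(
    endpoints: Sequence[Endpoint],
    is_listening: Callable[[Endpoint], bool],
    start: int = 0,
) -> Tuple[Optional[int], int]:
    """Try each endpoint once, beginning at index `start` and wrapping around.

    Returns (index of the first endpoint that answers, number of attempts),
    or (None, number of attempts) if nothing is listening.
    """
    n = len(endpoints)
    for attempts in range(1, n + 1):
        i = (start + attempts - 1) % n
        if is_listening(endpoints[i]):
            return i, attempts
    return None, n


# Hypothetical WAS cluster members; the ME is currently running on the third one.
endpoints = [("washost1", 5558), ("washost2", 5558), ("washost3", 5558)]
active = ("washost3", 5558)


def is_listening(ep: Endpoint) -> bool:
    # Toy stand-in for a real connection attempt.
    return ep == active


conname = build_conname(endpoints)

# Option 2 (CONNAME list): every channel start searches from the top of the list.
from_start = connect_in_order(endpoints, is_listening)

# Option 1b (assumed model of the exit): start from the last known good endpoint.
remembered = connect_in_order(endpoints, is_listening, start=2)
```

In this model the list search needs three attempts to reach the third server, whereas starting from the remembered endpoint needs one. Each failed attempt costs a connection timeout, which is why a longer DISCINT (fewer channel restarts) makes the list approach perfectly acceptable.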

[19/05/2010 - added paragraph describing overall topology]