The problem shows up like this. During normal operation of the environment, connections are made to a secure bus, authentication occurs, and messages are sent and received; all is good in the world. Then disaster strikes and for some reason the application server goes down, leaving uncommitted work in the bus. The node agent restarts the application server, which then connects to the bus and performs recovery. At least that is what should happen. In this case the recovery connection fails with a JMS SecurityException. The original connection was established without trouble, but recovery does not work.
So what went wrong here? During normal operation, when a connection is made to a transactional resource while a transaction is in effect, the connection factory is written into something called the partner log. The partner log contains details of how to connect to all the transactional partners that may be needed during recovery. In this case the connection factory does not contain any information about which security credentials to use, so recovery connects with no credentials and the secure bus rejects the connection.
So if you see this, how do you get the transactions resolved? There are two options, and the first is preferable:
- Grant the special Everyone group access to the bus. Assuming dynamic configuration is enabled, this will allow recovery to work quickly.
- Turn security off for the bus until recovery is complete. We generally advise restarting the whole bus, but restarting a single application server may work on a temporary basis.
In fact, the XA Recovery Alias is not JMS specific; it exists for all JCA resources, so it can also help when using WebSphere MQ and DB2.
Did I hear someone say "yuck"? You do not like this? Well, to be honest, neither did we. One of the themes of the WAS v7 release was to make it more usable (we use the term "consumable", but who'd want to eat WAS?), so we have tried to simplify here. In WAS v7 the XA recovery alias is no longer required; during recovery the application server will use the WAS server identity. There are two limitations. The first is that the special Server group needs to have bus connector authority, and the second is that the recovery server must be in the same cell as the bus. Other than that you are good to go. Oh, and do not worry: if you already have an XA recovery alias, we will continue to use it unless a problem occurs.
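If it helps to see the decision written down, here is a toy model of which identity recovery ends up using. To be clear, this is not WAS code: the class, the enum, and the `chooseSource` method are all names I made up for illustration, and the fallback from a failing alias to the server identity is my reading of "unless a problem occurs" rather than a documented algorithm.

```java
import java.util.Optional;

// Toy model only: every name here is hypothetical, nothing is a WAS API.
public class RecoveryCredentialSketch {

    /** Which identity the recovery connection would use in this simplified model. */
    enum Source { XA_RECOVERY_ALIAS, SERVER_IDENTITY, NO_USABLE_CREDENTIALS }

    /**
     * @param wasMajorVersion  major version of the application server, e.g. 6 or 7
     * @param xaRecoveryAlias  the configured XA recovery alias, if any
     * @param aliasHitProblem  assumption: true if using the alias ran into a problem
     */
    static Source chooseSource(int wasMajorVersion,
                               Optional<String> xaRecoveryAlias,
                               boolean aliasHitProblem) {
        // An existing alias keeps being used as long as it works.
        if (xaRecoveryAlias.isPresent() && !aliasHitProblem) {
            return Source.XA_RECOVERY_ALIAS;
        }
        // From v7 on, recovery can use the server identity instead
        // (subject to the two limitations described above).
        if (wasMajorVersion >= 7) {
            return Source.SERVER_IDENTITY;
        }
        // Before v7 with no working alias: the failure case this post is about.
        return Source.NO_USABLE_CREDENTIALS;
    }

    public static void main(String[] args) {
        System.out.println(chooseSource(6, Optional.empty(), false));
        System.out.println(chooseSource(6, Optional.of("recoveryAlias"), false));
        System.out.println(chooseSource(7, Optional.empty(), false));
        System.out.println(chooseSource(7, Optional.of("recoveryAlias"), true));
    }
}
```

The first call is the situation that ends in the SecurityException; the other three are the cases where recovery has an identity to connect with.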
Alasdair