Thursday, March 28, 2013

Problems with IBM Business Monitor Messaging Engine ( SI Bus ) following a teardown

*CAVEAT*

This post relates to my OWN individual experiences on my OWN personal VMware environment. This is NOT NOT NOT a recipe for everyone; your mileage may vary. If in doubt, PLEASE raise a PMR with IBM Support.

*CAVEAT*

Having performed a fresh installation of IBM Business Monitor 8.0.1.1 against Oracle 11g R2 after a "teardown" - where I cleaned up the database objects created the first time around - I noticed that the Messaging Engine cluster ( that hosts the Service Integration Bus ) kept restarting.

When I checked SystemOut.log for the offending cluster member, I found: -

...
[28/03/13 09:54:01:606 GMT] 0000001b SibMessage    I   [CEI.BAMCELL.BUS:BAMSR01.Messaging.000-CEI.BAMCELL.BUS] CWSIS1538I: The messaging engine, ME_UUID=3D59E737F07528C9, INC_UUID=62A8E276B06B1903, is attempting to obtain an exclusive lock on the data store.
[28/03/13 09:54:01:766 GMT] 0000001c SibMessage    I   [CEI.BAMCELL.BUS:BAMSR01.Messaging.000-CEI.BAMCELL.BUS] CWSIS1545I: A single previous owner was found in the messaging engine's data store, ME_UUID=09BF782E0B664719, INC_UUID=78437FD9A71F6596
[28/03/13 09:54:01:768 GMT] 0000001d SibMessage    I   [MONITOR.BAMCELL.Bus:BAMSR01.Messaging.000-MONITOR.BAMCELL.Bus] CWSIS1545I: A single previous owner was found in the messaging engine's data store, ME_UUID=E2ABE650D061BE5C, INC_UUID=ADD9DFC1AA982A5A
[28/03/13 09:54:01:771 GMT] 0000001c SibMessage    E   [CEI.BAMCELL.BUS:BAMSR01.Messaging.000-CEI.BAMCELL.BUS] CWSIS1535E: The messaging engine's unique id does not match that found in the data store. ME_UUID=3D59E737F07528C9, ME_UUID(DB)=09BF782E0B664719
[28/03/13 09:54:01:784 GMT] 0000001b SibMessage    I   [CEI.BAMCELL.BUS:BAMSR01.Messaging.000-CEI.BAMCELL.BUS] CWSIS1593I: The messaging engine, ME_UUID=3D59E737F07528C9, INC_UUID=62A8E276B06B1903, has failed to gain an initial lock on the data store.

[28/03/13 09:54:01:788 GMT] 0000001a SibMessage    I   [MONITOR.BAMCELL.Bus:BAMSR01.Messaging.000-MONITOR.BAMCELL.Bus] CWSIS1537I: The messaging engine, ME_UUID=E2ABE650D061BE5C, INC_UUID=5634F9A5B06B1901, has acquired an exclusive lock on the data store.

and: -

...
[28/03/13 09:55:53:555 GMT] 0000000f SibMessage    E   [CEI.BAMCELL.BUS:BAMSR01.Messaging.000-CEI.BAMCELL.BUS] CWSID0046E: Messaging engine BAMSR01.Messaging.000-CEI.BAMCELL.BUS detected an error and cannot continue to run in this server.
[28/03/13 09:55:53:555 GMT] 0000000f HAGroupImpl   I   HMGR0130I: The local member of group IBM_hc=BAMSR01.Messaging,WSAF_SIB_BUS=CEI.BAMCELL.BUS,WSAF_SIB_MESSAGING_ENGINE=BAMSR01.Messaging.000-CEI.BAMCELL.BUS,type=WSAF_SIB has indicated that is it not alive. The JVM will be terminated.
[28/03/13 09:55:53:566 GMT] 0000000f SystemOut     O Panic:component requested panic from isAlive
[28/03/13 09:55:53:567 GMT] 0000000f SystemOut     O java.lang.RuntimeException: emergencyShutdown called:
[28/03/13 09:55:53:567 GMT] 0000000f SystemOut     O    at com.ibm.ws.runtime.component.ServerImpl.emergencyShutdown(ServerImpl.java:632)
[28/03/13 09:55:53:567 GMT] 0000000f SystemOut     O    at com.ibm.ws.hamanager.runtime.RuntimeProviderImpl.panicJVM(RuntimeProviderImpl.java:92)
[28/03/13 09:55:53:569 GMT] 0000000f SystemOut     O    at com.ibm.ws.hamanager.coordinator.impl.JVMControllerImpl.panicJVM(JVMControllerImpl.java:56)
[28/03/13 09:55:53:569 GMT] 0000000f SystemOut     O    at com.ibm.ws.hamanager.impl.HAGroupImpl.doIsAlive(HAGroupImpl.java:882)
[28/03/13 09:55:53:569 GMT] 0000000f SystemOut     O    at com.ibm.ws.hamanager.impl.HAGroupImpl$HAGroupUserCallback.doCallback(HAGroupImpl.java:1388)
[28/03/13 09:55:53:569 GMT] 0000000f SystemOut     O    at com.ibm.ws.hamanager.impl.Worker.run(Worker.java:64)
[28/03/13 09:55:53:569 GMT] 0000000f SystemOut     O    at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1690)
...
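
The mismatch that CWSIS1535E reports in the first excerpt can be pulled out of SystemOut.log with a short script. What follows is only a minimal Python sketch, assuming the message layout shown in the excerpts above; the function name find_uuid_mismatches and the regular expression are my own illustration, not part of any IBM tooling.

```python
import re

# Pattern for the CWSIS1535E line as it appears in the excerpt above.
# Assumption: the layout "[bus:engine] CWSIS1535E: ... ME_UUID=X, ME_UUID(DB)=Y"
# holds for your log; adjust the pattern if your messages differ.
MISMATCH = re.compile(
    r"\[(?P<bus>[^\]:]+):(?P<engine>[^\]]+)\]\s+CWSIS1535E:.*?"
    r"ME_UUID=(?P<me>[0-9A-F]+),\s*ME_UUID\(DB\)=(?P<db>[0-9A-F]+)"
)


def find_uuid_mismatches(log_text):
    """Return one dict (bus, engine, me, db) per CWSIS1535E line in log_text."""
    results = []
    for line in log_text.splitlines():
        match = MISMATCH.search(line)
        if match:
            results.append(match.groupdict())
    return results


# The CWSIS1535E line from the excerpt above, used as sample input.
sample = (
    "[28/03/13 09:54:01:771 GMT] 0000001c SibMessage    E   "
    "[CEI.BAMCELL.BUS:BAMSR01.Messaging.000-CEI.BAMCELL.BUS] CWSIS1535E: "
    "The messaging engine's unique id does not match that found in the data store. "
    "ME_UUID=3D59E737F07528C9, ME_UUID(DB)=09BF782E0B664719"
)

for hit in find_uuid_mismatches(sample):
    print(hit["engine"], "expects", hit["me"], "but the data store holds", hit["db"])
```

In practice the text would come from reading SystemOut.log rather than from a sample string. An engine UUID that differs from the one in the data store suggests the data store was created by an earlier incarnation of the engine.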

The first set of messages ( CWSIS1545I and CWSIS1535E ) led me to the solution, aided by an IBM Technote.


Resolution

I realised that, when I'd cleaned down the database objects from the previous installation of BAM, I'd neglected to remove the schemas for the Messaging Engine.

In Oracle, I used SQL*Plus: -

sqlplus / as SYSDBA

and ran: -

SQL> select username from dba_users;

USERNAME
------------------------------
COGNOS
IBMBUSSP
MONITOR
MONME00
MONCM00

SCOTT

This showed the two leftover Messaging Engine schema users - MONME00 and MONCM00 - which I then removed: -

SQL> drop user MONCM00 cascade;

User dropped.

SQL> drop user MONME00 cascade;

User dropped.

and then restarted the ME cluster member.

This automatically recreated the objects ( this is almost certainly NOT the default behaviour - most DBAs would prefer to have more control over the creation of database objects such as schemas and users ) and the ME came up without exception.
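
One way to confirm that the engines really did come up is to look in the new SystemOut.log for CWSIS1537I ( exclusive lock acquired ) against each engine, and for the absence of CWSIS1535E. Here is a rough Python sketch, assuming the same message layout as the excerpts earlier in this post; engine_lock_status is a hypothetical helper name of my own, and the sample lines are copied from the excerpts above.

```python
import re

# Assumption: each SIB message carries "[bus:engine] CODE:" as in the excerpts in this post.
SIB_LINE = re.compile(r"\[[^\]:]+:(?P<engine>[^\]]+)\]\s+(?P<code>CWSIS\d{4}[IEW]):")


def engine_lock_status(log_text):
    """Map each engine name to 'locked', 'mismatch' or 'unknown'.

    'locked'   : CWSIS1537I seen and no CWSIS1535E seen for that engine
    'mismatch' : CWSIS1535E seen (UUID in the data store differs)
    'unknown'  : engine mentioned, but neither message seen
    """
    status = {}
    for line in log_text.splitlines():
        match = SIB_LINE.search(line)
        if not match:
            continue
        engine, code = match.group("engine"), match.group("code")
        current = status.setdefault(engine, "unknown")
        if code == "CWSIS1535E":
            status[engine] = "mismatch"
        elif code == "CWSIS1537I" and current != "mismatch":
            status[engine] = "locked"
    return status


# Two lines taken from the first log excerpt in this post.
sample = "\n".join([
    "[28/03/13 09:54:01:771 GMT] 0000001c SibMessage    E   "
    "[CEI.BAMCELL.BUS:BAMSR01.Messaging.000-CEI.BAMCELL.BUS] CWSIS1535E: "
    "The messaging engine's unique id does not match that found in the data store. "
    "ME_UUID=3D59E737F07528C9, ME_UUID(DB)=09BF782E0B664719",
    "[28/03/13 09:54:01:788 GMT] 0000001a SibMessage    I   "
    "[MONITOR.BAMCELL.Bus:BAMSR01.Messaging.000-MONITOR.BAMCELL.Bus] CWSIS1537I: "
    "The messaging engine, ME_UUID=E2ABE650D061BE5C, INC_UUID=5634F9A5B06B1901, "
    "has acquired an exclusive lock on the data store.",
])

for engine, state in sorted(engine_lock_status(sample).items()):
    print(engine, "->", state)
```

Run against the log from before the fix, the CEI engine shows as a mismatch; after a clean restart I would expect every engine to show as locked.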

Job done :-)

