
The basic highly available topology is a WebLogic Server homogeneous cluster (except for the Oracle Service Bus singleton services mentioned above) with one Administration Server and two managed servers running on different systems. Only one Oracle Service Bus cluster is supported in a WebLogic domain. Local data should be stored on shared storage, such as a SAN storage system, a multiported disk, or NAS storage. The local data, accessed as local files private to a managed server or the Administration Server, includes:

■ System files, such as WebLogic Server configuration files and server logs. Optionally, you can store WLS JMS data in an Oracle RAC database. JMS is used both internally in Oracle Service Bus and as a transport to external and proxy services.
■ Oracle Service Bus data, such as configuration files and logs
■ User-defined Oracle Service Bus configuration data
■ User files, such as data files, that are read or written by a proxy service with a File or FTP transport
■ The JMS persistent store, which stores alert logs and aggregated performance metrics

The Oracle RAC database is used for high availability of the reporting provider and for the leasing data source used for server migration. Optionally, a web server farm may be used as a front end to an Oracle Service Bus cluster. A hardware load balancer can load balance web servers or application servers directly. If web servers are used, the WebLogic proxy plug-in should be configured to load balance HTTP traffic among the Oracle Service Bus managed servers. Email, FTP, or NFS servers can exist in the network, and third-party JMS servers or server clusters can also exist in the network.
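The WLST sketch below illustrates one way to prepare the cluster for a web server or load balancer front end by enabling WebLogic plug-in header handling and setting the cluster front-end address. The connection details, the cluster name osb_cluster, and the front-end host are placeholders for this example, not values defined in this guide.

# WLST sketch: configure an Oracle Service Bus cluster for an HTTP front end.
# Connection details, 'osb_cluster', and the front-end address are assumptions
# for illustration; substitute the values used in your domain.
connect('weblogic', 'welcome1', 't3://adminhost:7001')
edit()
startEdit()

cd('/Clusters/osb_cluster')
cmo.setWeblogicPluginEnabled(true)        # trust WL-Proxy headers set by the web tier
cmo.setFrontendHost('osb.example.com')    # address published by the load balancer
cmo.setFrontendHTTPPort(80)

save()
activate()
disconnect()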

5.12.2.1 Oracle Service Bus Protection from Failures and Expected Behavior

This section describes how an Oracle Service Bus high availability cluster deployment protects components from failure and the expected behavior when a component fails. Oracle Service Bus is protected from all process failures by the WebLogic Server infrastructure.

5.12.2.1.1 WebLogic Server Failure  Oracle Service Bus does not maintain any state, nor does it support the concept of user sessions. Therefore, Oracle Service Bus does not implement session state replication and failover. For synchronous inbound transports such as HTTP, SB, and EJB, if the managed server processing a request goes down in the middle of request processing, the client receives a connection exception and must retry. Node Manager should be configured to perform automatic WebLogic server migration when a managed server fails. For details on WebLogic server migration, see Section 3.9, "Whole Server Migration." Use WebLogic server migration for JMS failover. There is no automatic failover for Oracle Service Bus singleton components such as the Aggregator, Alert Manager, and Reporting Message Purger; use WebLogic server migration to fail over these singleton components.
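As a rough illustration, the following WLST sketch shows the kind of configuration involved in enabling database-based whole server migration for the cluster. The cluster, server, and leasing data source names are assumptions for this example, and the complete procedure (Node Manager setup, floating IPs, and the leasing table) is described in Section 3.9.

# WLST sketch: enable database leasing and automatic whole server migration.
# 'osb_cluster', 'osb_server1'/'osb_server2', and 'leasing-ds' are illustrative
# names; the full procedure also requires Node Manager and network setup.
connect('weblogic', 'welcome1', 't3://adminhost:7001')
edit()
startEdit()

cd('/Clusters/osb_cluster')
cmo.setMigrationBasis('database')                                   # use database leasing
cmo.setDataSourceForAutomaticMigration(getMBean('/JDBCSystemResources/leasing-ds'))

for name in ['osb_server1', 'osb_server2']:
    cd('/Servers/' + name)
    cmo.setAutoMigrationEnabled(true)     # allow Node Manager to restart the server elsewhere

save()
activate()
disconnect()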

5.12.2.1.2 Node Failure  An external load balancer can be used to load balance HTTP requests directly to the Oracle Service Bus HTTP proxy services on the managed servers in the cluster, or to a web server/Oracle HTTP Server (OHS) cluster front-ending the Oracle Service Bus cluster. There are no sticky session routing requirements. Node Manager can be configured to fail over failed Oracle Service Bus nodes. If an Oracle Service Bus managed server fails in the middle of processing HTTP requests, clients need to submit those requests again. For the JEJB and SB transports, a client gets a connection error when the server goes down.

For all poller transports, the integrity of message processing is handled by transactional semantics, because the message is dequeued using an XA connection factory. Poller transports provide at-least-once semantics. However, after whole server migration of the failed node, the poller needs to have access to the resources it is polling. JTA transactions also preserve transactional integrity when using the JMS transport. JTA TLogs need to be recovered to recover in-flight JTA transactions.

Oracle Service Bus can guarantee exactly-once message delivery semantics. This behavior is controlled by the QoS property in the $outbound context variable configured in the proxy service. For more detail on the different QoS settings, see Oracle Fusion Middleware Administrator's Guide for Oracle Service Bus. When a managed server starts up, any in-flight global transactions are recovered. For example, recovery of global transactions may be required when the proxy service is bridging between two JMS providers using XA connection factories and a QoS of exactly once. Another example is the JMS reporting provider. With the default JMS reporting provider, the report message is first written to a JMS queue. An MDB dequeues the report message from the JMS queue and then writes it to the database. The database data source is configured with the Logging Last Resource (LLR) transaction optimization. In this case, the database must be running during recovery; otherwise, the server does not start. The TLog is used even when LLR is in effect, but only on a per-transaction basis. Therefore, the TLog must still be made highly available even when all transactions are LLR transactions. The Transaction Manager persists checkpoint TLog records that are unrelated to specific transactions but are still required to provide full transactional safety.

There are two cases where failure of the managed server causes certain state to be lost, which prevents correct processing of messages:

■ JMS request/response business service: Maintains an in-memory table of correlation information. This table is lost when the managed server goes down, so the response from the JMS service cannot be sent back to the original client of the proxy service that routed to the JMS request/response business service.
■ WS Reliable Messaging business service: Similar to the JMS request/response business service, this business service keeps an in-memory table of correlation information. The table is lost when the managed server goes down, and the response cannot be handled.
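As a sketch of the XA point above, the following WLST commands show how an existing JMS connection factory referenced by a JMS poller proxy service could be made XA-enabled, so that the dequeue participates in the global transaction. The module and connection factory names are placeholders, not names defined by Oracle Service Bus.

# WLST sketch: enable XA on a JMS connection factory used by a poller proxy service.
# 'OSBJMSModule' and 'PollerCF' are illustrative names.
connect('weblogic', 'welcome1', 't3://adminhost:7001')
edit()
startEdit()

cd('/JMSSystemResources/OSBJMSModule/JMSResource/OSBJMSModule' +
   '/ConnectionFactories/PollerCF/TransactionParams/PollerCF')
cmo.setXAConnectionFactoryEnabled(true)   # dequeue happens inside the JTA transaction

save()
activate()
disconnect()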

5.12.2.1.3 Database Failure  Beyond the implications for the database leasing used for server migration, database failures are relevant for Oracle Service Bus only in the context of the JMS reporting provider functionality. When there is a report action in the proxy service, the report data is enqueued into the reporting JMS queue. The reporting MDB dequeues the message from the JMS queue and inserts the data into the database using a data source configured with the Logging Last Resource (LLR) global transaction protocol. The reporting JMS queue is configured with a redelivery limit of 2 and an error queue. When a database failure occurs, report messages are moved to the error queue after the redelivery limit is reached. When the database is running again, you can move these report messages back to the JMS reporting queue so that they are inserted into the database. If the database fails in the middle of LLR transaction processing, transaction recovery is performed as described in the chapter on Logging Last Resource Transaction Optimization in Oracle Fusion Middleware Programming JTA for Oracle WebLogic Server.
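The move from the error queue back to the reporting queue can be performed from the Administration Console or scripted. The WLST sketch below illustrates the runtime moveMessages approach under assumed names; the managed server, JMS server, module, and queue names vary by domain, so verify them in your environment.

# WLST sketch (runtime tree): move report messages from the error queue back to
# the reporting queue once the database is available again. The server, JMS
# server, module, and queue names below are assumptions; check your domain.
connect('weblogic', 'welcome1', 't3://adminhost:7001')
domainRuntime()

base = ('ServerRuntimes/osb_server1/JMSRuntime/osb_server1.jms' +
        '/JMSServers/wlsbJMSServer_auto_1/Destinations/')

cd(base + 'jmsResources!wli.reporting.jmsprovider.queue')
target = cmo.getDestinationInfo()              # describes the reporting queue

cd(base + 'jmsResources!wli.reporting.jmsprovider_error.queue')
cmo.moveMessages('', target)                   # empty selector moves all messages

disconnect()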

5.12.2.2 Oracle Service Bus Cluster-Wide Deployment