
Configuring and Using the Diagnostics Framework for Oracle WebLogic Server

3.3 JRockit Flight Recorder Use Cases

This section summarizes three common business cases where using the JRockit Flight Recorder can help you resolve important diagnostic issues:

■ Section 3.3.1, Diagnosing a Critical Failure — The Black Box
■ Section 3.3.2, Profiling During Performance Testing or in Production
■ Section 3.3.3, Real-time Application Diagnostics and Reporting (RADAR)

For more information about these scenarios, see Flight Recorder Use Cases in Oracle JRockit Flight Recorder Run Time Guide.

3.3.1 Diagnosing a Critical Failure — The Black Box

When a catastrophic failure occurs, the content of the Flight Recorder buffer can be made available for post-failure analysis in a manner analogous to the use of an aircraft’s black box. Examples of such failures include a JVM crash or an out-of-memory error (OOME) that causes an application to terminate. When these situations arise, the flight recording contains the following information, which can be helpful in determining the cause of the failure:

■ JVM core dump, including metadata about the Flight Recorder configuration at the time of the crash. Furthermore, if the Flight Recorder was running in persistent storage mode, the data buffer file might contain a certain amount of data.
■ WebLogic Server events, captured by WLDF, that preceded the failure.

When running in persistent mode, the JFR uses a combination of memory and disk to store its buffer. The most recent data is stored in memory and is flushed out to disk as it ages. In this mode, the on-disk data will be available even after a power failure or similar catastrophic event; only the most recent data will be unavailable (for example, the data that had not yet been flushed to disk).

The text dump file will contain metadata about the Flight Recorder configuration at the time of the crash, including the path to the data buffer file when applicable. For more information about persistent mode, see Operating Modes in Oracle JRockit Flight Recorder Run Time Guide.
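The persistent-mode behavior described above can be pictured with a small toy model. This is only a sketch under stated assumptions, not the JFR implementation: the class name `PersistentBufferSketch`, the per-event flushing, and the fixed in-memory capacity are all hypothetical, and a real flight recording is not flushed one event at a time.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy model of a memory-plus-disk buffer. Hypothetical; not a JFR API. */
class PersistentBufferSketch {
    private final int memoryCapacity;                      // assumed limit on events held in memory
    private final Deque<String> memory = new ArrayDeque<>();
    private final List<String> disk = new ArrayList<>();   // stands in for the data buffer file

    PersistentBufferSketch(int memoryCapacity) {
        this.memoryCapacity = memoryCapacity;
    }

    /** Records an event; when memory is full, the oldest event ages out to disk. */
    void record(String event) {
        if (memory.size() == memoryCapacity) {
            disk.add(memory.removeFirst());
        }
        memory.addLast(event);
    }

    /** Events available after an abrupt stop: only those already flushed to disk. */
    List<String> recoverAfterCrash() {
        return new ArrayList<>(disk);
    }

    /** Events that would be unavailable: those still held only in memory. */
    List<String> lostOnCrash() {
        return new ArrayList<>(memory);
    }
}
```

With a capacity of two and five recorded events, the three oldest events are recoverable from the simulated disk and the two most recent are lost, which mirrors the statement that only data not yet flushed to disk is unavailable.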

3.3.2 Profiling During Performance Testing or in Production

Profiling involves capturing data beginning at a specific point in time so that you can later analyze the events that were generated after that point. In contrast to RADAR, described in the following section, profiling analyzes the diagnostic data generated after a particular event occurs, not the data that precedes it. Profiling with JRockit Flight Recorder is well suited to deep analysis of lock contention and the causes of latency.
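The difference between the two approaches can be sketched as two selections over the same stream of timestamped events. The following is an illustration only: the `Event` type, the method names, and the look-back window parameter are assumptions made for this sketch and are not part of WLDF or JRockit Flight Recorder.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative contrast between profiling and RADAR. Hypothetical names throughout. */
class RecordingWindowSketch {

    /** Hypothetical event: a timestamp in milliseconds and a label. */
    static final class Event {
        final long timeMillis;
        final String name;

        Event(long timeMillis, String name) {
            this.timeMillis = timeMillis;
            this.name = name;
        }
    }

    /** Profiling: keep the events generated at or after the chosen start point. */
    static List<Event> profilingView(List<Event> events, long startMillis) {
        List<Event> selected = new ArrayList<>();
        for (Event e : events) {
            if (e.timeMillis >= startMillis) {
                selected.add(e);
            }
        }
        return selected;
    }

    /** RADAR: keep the events leading up to (and including) the trigger, within a look-back window. */
    static List<Event> radarView(List<Event> events, long triggerMillis, long lookBackMillis) {
        List<Event> selected = new ArrayList<>();
        for (Event e : events) {
            if (e.timeMillis >= triggerMillis - lookBackMillis && e.timeMillis <= triggerMillis) {
                selected.add(e);
            }
        }
        return selected;
    }
}
```

Given the same trigger time, `profilingView` looks forward from that moment while `radarView` looks backward from it, which is the distinction this section draws.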

3.3.3 Real-time Application Diagnostics and Reporting (RADAR)

RADAR is the examination of diagnostic data generated during run time when a particular event occurs, for the purpose of understanding the system activity that preceded the event; for example, system activity occurring moments before a serious error message is generated. By using the diagnostic capabilities available in WLDF in conjunction with JRockit Flight Recorder, you can capture a large amount of system-wide diagnostic data the moment a problem occurs. You can then use JRockit Mission Control to correlate that event with other system activity and process execution data within the snapshot in time that the JFR file provides, enabling you to quickly isolate likely causes of the problem.

One WLDF feature whose usage with JRockit Flight Recorder makes for a powerful RADAR capability is image notification, which allows you to create a diagnostic image capture automatically in response to a particular event or error condition. A diagnostic image capture that is created as the result of an image notification automatically includes the JFR file. The JFR file can then be extracted from the diagnostic image capture and examined immediately in JRockit Mission Control or stored for later analysis. Image notification, used when WLDF data is captured by JRockit Flight Recorder, is particularly well suited for this sort of real-time diagnosis of intermittent problems.

Image notification is part of the Watch and Notifications system in WLDF. To set up image notification, you create one or more individual watch rules. A watch rule includes a logical expression that uses the WLDF query language to specify the event for the watch to detect.
For example, the following log event watch rule expression detects the server log message with severity level Critical and ID BEA-149618:

(SEVERITY = 'Critical') AND (MSGID = 'BEA-149618')

Watch rules can monitor any of the following:

■ Harvestable runtime MBean instances in the local runtime MBean server
A harvester watch can trigger an image notification if runtime MBean attributes indicate a performance issue, such as high memory utilization rates or problems with open socket connections to the server.
■ Messages published to the server log
A log watch can trigger an image notification if a specific message, severity level, or string is issued.
■ Events generated by the WLDF Instrumentation component
An event watch can trigger an image notification if an instrumentation service generates a particular event.

For more information, see the following topics:

■ Chapter 8, Configuring Watches and Notifications
■ Section 10.6, Configuring Image Notifications
■ Appendix A, WLDF Query Language

The following sections explain how to obtain the JFR file from the diagnostic image capture and provide an example of using JRockit Mission Control to examine the WebLogic Server events contained in the JFR file:

■ Section 3.4, Obtaining the JRockit Flight Recording File
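The log watch rule described earlier in this section (severity Critical and message ID BEA-149618) can be read as a simple predicate over log entries. The sketch below models only that logic in plain Java. It does not use the WLDF API or the WLDF query language parser: `LogEntry`, `RULE`, and `notificationsFired` are hypothetical names, and the counter merely stands in for the point at which an image notification would be triggered.

```java
import java.util.List;
import java.util.function.Predicate;

/** Plain-Java model of a log watch rule. Hypothetical; not the WLDF API. */
class LogWatchSketch {

    /** Hypothetical log entry carrying only the two fields the rule inspects. */
    static final class LogEntry {
        final String severity;
        final String msgId;

        LogEntry(String severity, String msgId) {
            this.severity = severity;
            this.msgId = msgId;
        }
    }

    /** Matches entries with severity Critical AND message ID BEA-149618. */
    static final Predicate<LogEntry> RULE =
            e -> "Critical".equals(e.severity) && "BEA-149618".equals(e.msgId);

    /** Counts the entries for which an image notification would be triggered. */
    static int notificationsFired(List<LogEntry> log) {
        int fired = 0;
        for (LogEntry e : log) {
            if (RULE.test(e)) {
                fired++; // in WLDF, this is where a diagnostic image capture would be created
            }
        }
        return fired;
    }
}
```

Both conditions must hold for the rule to match, so an entry with the right message ID but a different severity, or the right severity but a different message ID, does not trigger a notification.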