Java 2 cpu=samples Profile Output

The trace corresponding to this second entry in the summary example turns out to be another truncated trace, but the example shows the same method in 14th position, and the trace for that entry identifies the Double.equals call as coming from the Hashtable.put call. Unfortunately for tuning purposes, the Double.equals method itself is already quite fast and cannot be optimized further.

When methods cannot be directly optimized, the next best choice is to reduce the number of times they are called or even avoid the methods altogether. (In fact, eliminating method calls is actually the better tuning choice, but it is often considerably more difficult to achieve and so is not a first-choice tactic for optimization.) The object-creation profile and the method profile together point to the FloatingDecimal class as being a huge bottleneck, so avoiding this class is the obvious tuning tactic here. In Chapter 5, I employ this technique, avoiding the default call through the FloatingDecimal class for the case of converting floating-point numbers to Strings, and I obtain an order-of-magnitude improvement. Basically, the strategy is to create a more efficient routine to run the equivalent conversion functionality, and then replace the calls to the underperforming FloatingDecimal methods with calls to the more efficient optimized methods.

The best way to avoid the Double.equals method is to replace the hash table with another implementation that stores double primitive data types directly rather than requiring the doubles to be wrapped in a Double object. This allows the == operator to make the comparison in the put method, thus completely avoiding the Double.equals call. This is another standard tuning tactic, where a data structure is replaced with a more appropriate and faster one for the task.

The 1.1 profiling output is quite different and much less like a standard profiler's output.
Running the 1.1 profiler with this program (details of this output are given in Section 2.3.4) gives:

    count callee caller time
    21 java/lang/System.gc()V java/lang/FloatingDecimal.dtoa(IJI)V 760
    8 java/lang/System.gc()V java/lang/Double.equals(Ljava/lang/Object;)Z 295
    2 java/lang/Double.doubleToLongBits(D)J java/lang/Double.equals(Ljava/lang/Object;)Z 0

I have shown only the top four lines from the output. This output actually identifies both the FloatingDecimal.dtoa and the Double.equals methods as taking the vast majority of the time: the reported times correspond to around 70% and 25% of the total program time for the two methods, respectively. Since the callee for these methods is listed as System.gc, this also indicates that the methods are significantly involved in memory allocation and suggests that the next tuning step might be to analyze the object-creation output for this program.
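The tactic described earlier, replacing the hash table with one that stores double primitives directly, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation used in the book: the class name DoubleIntTable, the linear-probing layout, the int values, and the fixed capacity are all invented for the example. Note that == treats 0.0 and -0.0 as equal and NaN as unequal to itself, which differs from Double.equals, so the sketch normalizes zero and rejects NaN keys.

```java
/**
 * Hypothetical sketch: an open-addressing table mapping double keys to int
 * values. Keys are held in a double[] array, so lookups compare with == and
 * never create a Double wrapper or call Double.equals.
 */
final class DoubleIntTable {
    private final double[] keys;
    private final int[] values;
    private final boolean[] used;
    private int size;

    DoubleIntTable(int capacity) {
        // Assumption for the sketch: fixed capacity, no resizing.
        if (capacity < 2) throw new IllegalArgumentException("capacity must be >= 2");
        keys = new double[capacity];
        values = new int[capacity];
        used = new boolean[capacity];
    }

    /** Finds the slot holding key, or the empty slot where it would be stored. */
    private int slotFor(double key) {
        if (Double.isNaN(key)) throw new IllegalArgumentException("NaN keys not supported");
        // key + 0.0 turns -0.0 into 0.0, so keys that are equal under == hash alike.
        long bits = Double.doubleToLongBits(key + 0.0);
        int hash = (int) (bits ^ (bits >>> 32));
        int i = (hash & 0x7fffffff) % keys.length;
        // Linear probing; put() always leaves one slot empty, so this terminates.
        while (used[i] && keys[i] != key) {
            i = (i + 1) % keys.length;
        }
        return i;
    }

    /** Stores value under key; returns true if the key was not present before. */
    boolean put(double key, int value) {
        int i = slotFor(key);
        boolean isNew = !used[i];
        if (isNew) {
            if (size == keys.length - 1) throw new IllegalStateException("table full");
            used[i] = true;
            keys[i] = key;
            size++;
        }
        values[i] = value;
        return isNew;
    }

    /** Returns the value stored under key, or defaultValue if the key is absent. */
    int get(double key, int defaultValue) {
        int i = slotFor(key);
        return used[i] ? values[i] : defaultValue;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        DoubleIntTable table = new DoubleIntTable(16);
        table.put(1.5, 1);
        table.put(1.5, 2);                         // overwrites; no wrapper objects created
        System.out.println(table.get(1.5, -1));    // prints 2
    }
}
```

A production version would also need resizing and removal; the point here is only that the key comparison in put is a primitive == rather than a call to Double.equals.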

2.3.2 Java 2 cpu=samples Profile Output

The default profile output gained from executing with -Xrunhprof in Java 2 is not useful for method profiling. The default output generates object-creation statistics from the heap as the dump (output) occurs. By default, the dump occurs when the application terminates; you can trigger the dump at another time by typing Ctrl-\ on Solaris and other Unix systems, or Ctrl-Break on Win32. To get a useful method profile, you need to modify the profiler options to specify method profiling. A typical call to achieve this is:

    java -Xrunhprof:cpu=samples,thread=y classname

(Note that in a Windows command-line prompt, you need to surround the option with double quotes because the equals sign is considered a meta character.)

Note that -Xrunhprof has an "h" in it. There seems to be an undocumented feature of the VM in which the option -Xrunsomething makes the VM try to load a shared library called "something"; e.g., using -Xrunprof results in the VM trying to load a shared library called "prof". This can be quite confusing if you are not expecting it. In fact, -Xrunhprof loads the "hprof" shared library.

The profiling option in JDK 1.2/1.3 can be pretty flaky. Several of the options can cause the runtime to crash (core dump). The output is a large file, since huge amounts of trace data are written rather than summarized. Since the profile option is essentially a Sun engineering tool, it has had limited resources applied to it, especially as Sun has a separate (not free) profile tool that Sun engineers would normally use. Another tool that Sun provides to analyze the output of the profiler is called the heap-analysis tool (search http://www.java.sun.com for "HAT"). But this tool analyzes only the object-creation statistics output gained with the default profile output, and so is not that useful for method profiling (see Section 2.4 for slightly more about this tool). Nevertheless, I expect the free profiling option to stabilize and be more useful in future versions.
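To try the command line above, any class with a main method can stand in for classname. The following is a small hypothetical workload, not the program profiled in this chapter: the class name ProfileTarget and the default iteration count are invented, and the code is written for a current compiler (it uses generics). It exercises the same kinds of calls discussed earlier, converting doubles to Strings and storing Double keys in a Hashtable.

```java
import java.util.Hashtable;

/**
 * Hypothetical workload to use in place of "classname". It is not the
 * program whose profile is shown in the text.
 */
class ProfileTarget {
    /** Converts n doubles to Strings and stores them in a Hashtable keyed by Double. */
    static Hashtable<Double, String> fill(int n) {
        Hashtable<Double, String> table = new Hashtable<Double, String>();
        for (int i = 0; i < n; i++) {
            double d = i * 0.1;
            // String.valueOf(double) performs the double-to-String conversion;
            // Hashtable.put may call Double.equals on keys in the same bucket.
            table.put(Double.valueOf(d), String.valueOf(d));
        }
        return table;
    }

    public static void main(String[] args) {
        // Assumed default size; pass a different count as the first argument.
        int n = args.length > 0 ? Integer.parseInt(args[0]) : 200000;
        System.out.println("entries: " + fill(n).size());
    }
}
```

Assuming a VM that still includes the hprof agent, this would be run as java -Xrunhprof:cpu=samples,thread=y ProfileTarget, with the option quoted on Windows as noted above.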
The output when run with the options already listed (cpu=samples, thread=y) results in fairly usable information. This profiling mode operates by periodically sampling the stack. Each unique stack trace provides a TRACE entry in the second section of the file, describing the method calls on the stack for that trace. Multiple identical samples are not listed; instead, the number of their "hits" is summarized in the third section of the file. The profile output file in this mode has three sections:

Section 1
A standard header section describing possible monitored entries in the file. For example:

    WARNING! This file format is under development, and is subject to
    change without notice.
    This file contains the following types of records:
    THREAD START
    THREAD END      mark the lifetime of Java threads
    TRACE           represents a Java stack trace. Each trace consists
                    of a series of stack frames. Other records refer to
                    TRACEs to identify (1) where object allocations have
                    taken place, (2) the frames in which GC roots were
                    found, and (3) frequently executed methods.

Section 2
Individual entries describing monitored events, i.e., threads starting and terminating, but mainly sampled stack traces.
For example:

    THREAD START (obj=8c2640, id = 6, name="Thread-0", group="main")
    THREAD END (id = 6)
    TRACE 1:
        <empty>
    TRACE 964:
        java/io/ObjectInputStream.readObject(ObjectInputStream.java:Compiled method)
        java/io/ObjectInputStream.inputObject(ObjectInputStream.java:Compiled method)
        java/io/ObjectInputStream.readObject(ObjectInputStream.java:Compiled method)
        java/io/ObjectInputStream.inputArray(ObjectInputStream.java:Compiled method)
    TRACE 1074:
        java/io/BufferedInputStream.fill(BufferedInputStream.java:Compiled method)
        java/io/BufferedInputStream.read1(BufferedInputStream.java:Compiled method)
        java/io/BufferedInputStream.read(BufferedInputStream.java:Compiled method)
        java/io/ObjectInputStream.read(ObjectInputStream.java:Compiled method)

Section 3
A summary table of methods ranked by the number of times the unique stack trace for that method appears. For example:

    CPU SAMPLES BEGIN (total = 512371) Thu Aug 26 18:37:08 1999
    rank   self  accum   count trace method
       1 16.09% 16.09%   82426  1121 java/io/FileInputStream.read
       2  6.62% 22.71%   33926   881 java/io/ObjectInputStream.allocateNewObject
       3  5.11% 27.82%   26185   918 java/io/ObjectInputStream.inputClassFields
       4  4.42% 32.24%   22671   887 java/io/ObjectInputStream.inputObject
       5  3.20% 35.44%   16392   922 java/lang/reflect/Field.set

Section 3 is the place to start when analyzing this profile output. It consists of a table with six fields, headed rank, self, accum, count, trace, and method, as shown. These fields are used as follows:

rank
This column simply counts the entries in the table, starting with 1 at the top, and incrementing by 1 for each entry.

self
The self field is usually interpreted as a percentage of the total running time spent in this method. More accurately, this field reports the percentage of samples that have the stack given by the trace field.
Here's a one-line example:

    rank   self  accum   count trace method
       1 11.55% 11.55%   18382   545 java/lang/FloatingDecimal.dtoa

This example shows that stack trace 545 occurred in 18,382 of the sampled stack traces, and this is 11.55% of the total number of stack trace samples made. It indicates that this method was probably executing for about 11.55% of the application execution time, because the samples are taken at regular intervals. You can identify the precise trace from the second section of the profile output by searching for the trace with identifier 545. For the previous example, this trace was:

    TRACE 545: (thread=1)
        java/lang/FloatingDecimal.dtoa(FloatingDecimal.java:Compiled method)
        java/lang/FloatingDecimal.<init>(FloatingDecimal.java:Compiled method)
        java/lang/Double.toString(Double.java:Compiled method)
        java/lang/String.valueOf(String.java:Compiled method)

This TRACE entry clearly identifies the exact method and its caller. Note that the stack is reported to a depth of four methods. This is the default depth: the depth can be changed using the depth parameter to the -Xrunhprof option, e.g., -Xrunhprof:depth=6,cpu=samples,... .

accum
This field is a running additive total of all the self field percentages as you go down the table: for the Section 3 example shown previously, the third line lists 27.82% for the accum field, indicating that the sum total of the first three lines of the self field is 27.82%.

count
This field indicates how many times the unique stack trace that gave rise to this entry was sampled while the program ran.

trace
This field shows the unique trace identifier from the second section of profile output that generated this entry. The trace is recorded only once in the second section no matter how many times it is sampled; the number of times that this trace has been sampled is listed in the count field.
method
This field shows the method name from the top line of the stack trace referred to from the trace field, i.e., the method that was running when the stack was sampled.

This summary table lists only the method name and not its argument types. Therefore, it is frequently necessary to refer to the stack itself to determine the exact method, if the method is an overloaded method with several possible argument types. (The stack is given by the trace identifier in the trace field, which in turn references the trace from the second section of the profile output.) If a method is called in different ways, it may also give rise to different stack traces. Sometimes the same method call can be listed in different stack traces due to lost information. Each of these different stack traces results in a different entry in the third section of the profiler's output, even though the method field is the same. For example, it is perfectly possible to see several lines with the same method field, as in the following table segment:

    rank   self  accum   count trace method
      95   1.1% 51.55%     110   699 java/lang/StringBuffer.append
     110   1.0% 67.35%     100   711 java/lang/StringBuffer.append
     128   1.0% 85.35%      99   332 java/lang/StringBuffer.append

When traces 699, 711, and 332 are analyzed, one trace might be StringBuffer.append(boolean), while the other two traces could both be StringBuffer.append(int), but called from two different methods and so giving rise to two different stack traces and consequently two different lines in the summary example. Note that the trace does not identify actual method signatures, only method names. Line numbers are given if the class was compiled so that line numbers remain. This ambiguity can be a nuisance at times.

The profiler in this mode (cpu=samples) is useful enough to suffice when you have no better alternative. It does have an effect on real measured times, slowing down operations by variable amounts even within one application run.
But it normally indicates major bottlenecks, although sometimes a little extra work is necessary to sort out multiple identical method-name references. Using the alternative cpu=times mode, the profile output gives a different view of application execution. In this mode, the method times are measured from method entry to method exit, including the time spent in all other calls the method makes. This profile of an application gives a tree-like view of where the application is spending its time. Some developers are more comfortable with this mode for profiling the application, but I find that it does not directly identify bottlenecks in the code.
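Sorting out multiple summary lines that share a method name in the cpu=samples output can be partly automated. The following is a rough sketch under stated assumptions, not a tool that ships with the JDK: the class SampleSummary and its nested MethodTotal type are invented for this example, and the parser assumes each data row has the six whitespace-separated columns rank, self, accum, count, trace, and method. It adds up the count column per method name and remembers which trace identifiers contributed, so that you know which TRACE entries to look up in the second section.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: groups CPU SAMPLES summary rows by method name. */
class SampleSummary {
    /** Total sample count and contributing trace ids for one method name. */
    static final class MethodTotal {
        long count;
        final List<Integer> traces = new ArrayList<Integer>();
    }

    // Assumed row layout: rank self accum count trace method
    private static final Pattern ROW =
        Pattern.compile("^\\s*\\d+\\s+\\S+\\s+\\S+\\s+(\\d+)\\s+(\\d+)\\s+(\\S+)\\s*$");

    /** Sums counts per method; lines that are not data rows (headers etc.) are skipped. */
    static Map<String, MethodTotal> byMethod(List<String> lines) {
        Map<String, MethodTotal> result = new LinkedHashMap<String, MethodTotal>();
        for (String line : lines) {
            Matcher m = ROW.matcher(line);
            if (!m.matches()) {
                continue;
            }
            String method = m.group(3);
            MethodTotal total = result.get(method);
            if (total == null) {
                total = new MethodTotal();
                result.put(method, total);
            }
            total.count += Long.parseLong(m.group(1));
            total.traces.add(Integer.valueOf(m.group(2)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList(
            "rank self accum count trace method",
            "95 1.1% 51.55% 110 699 java/lang/StringBuffer.append",
            "110 1.0% 67.35% 100 711 java/lang/StringBuffer.append",
            "128 1.0% 85.35% 99 332 java/lang/StringBuffer.append");
        for (Map.Entry<String, MethodTotal> e : byMethod(rows).entrySet()) {
            // prints: java/lang/StringBuffer.append count=309 traces=[699, 711, 332]
            System.out.println(e.getKey() + " count=" + e.getValue().count
                + " traces=" + e.getValue().traces);
        }
    }
}
```

The combined count is only a starting point: because the summary lists method names without argument types, the listed traces still have to be inspected to tell overloaded methods and different callers apart.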
