JDK 1.1.x -prof and Java 2 cpu=old Profile Output

    0.7    2 + 3    java.lang.StrictMath.floor
    0.5    3 + 1    java.lang.Double.longBitsToDouble

Section 5

A list of internal VM function calls, listed in order of the total number of ticks counted while the method was at the top of the stack. Not tuneable. For example:

    Runtime stub + native    Method
    0.1    1 + 0    interpreter_entries
    0.1    1 + 0    Total runtime stubs

Section 6

Other miscellaneous entries not included in the previous sections:

    Thread-local ticks:
    1.4     10    classloader
    0.1      1    Interpreter
    11.7    86    Unknown code

Section 7

A global summary of ticks recorded. This includes ticks from the garbage collector, thread-locking overheads, and other miscellaneous entries:

    Global summary of 7.57 seconds:
    100.0  754    Received ticks
    1.9     14    Received GC ticks
    0.3      2    Other VM operations

The entries at the top of Section 3 are the methods that probably need tuning. Any method listed near the top of Section 2 should have been targeted by the HotSpot optimizer and may be listed lower down in Section 3. Such methods may still need to be optimized, but it is more likely that the methods at the top of Section 3 are what need optimizing. The ticks for the two sections are the same, so you can easily compare the time taken up by the top methods in the different sections and decide which to target.

2.3.4 JDK 1.1.x -prof and Java 2 cpu=old Profile Output

The JDK 1.1.x method-profiling output, obtained by running with the -prof option, is quite different from the normal 1.2 output. This output format is supported in Java 2, using the cpu=old variation of the -Xrunhprof option. This output file consists of four sections:

Section 1

The method profile table showing cumulative times spent in each method executed. The table is sorted on the first (count) field; for example:

    count callee caller time
    29 java/lang/System.gc()V java/io/FileInputStream.read([B)I 10263
    1 java/io/FileOutputStream.writeBytes([BII)V java/io/FileOutputStream.write([BII)V 0

Section 2

One line describing high-water gross memory usage. For example:

    handles_used: 1174, handles_free: 339046, heap-used: 113960, heap-free: 21794720

The line reports the number of handles and the number of bytes used by the heap memory storage over the application's lifetime. A handle is an object reference. The number of handles used is the maximum number of objects that existed at any one time in the application (handles are recycled by the garbage collector, so over its lifetime the application could have used many more objects than are listed). The heap measurements are in bytes.

Section 3

Reports the number of primitive data type arrays left at the end of the process, just before process termination. For example:

    sig    count    bytes    indx
    [C     174      19060    5
    [B     5        19200    8

This section has four fields. The first field is the primitive data type array dimensions and data type, given by the letter codes listed shortly (see Table 2-1); the second field is the number of arrays; and the third is the total number of bytes used by all the arrays. This example shows 174 char arrays taking a combined space of 19,060 bytes, and 5 byte arrays taking a combined space of 19,200 bytes. The reported data does not include any arrays that may have been garbage collected before the end of the process. For this reason, the section is of limited use.
You could use the -noasyncgc option to try to eliminate garbage collection if you have enough memory; you may also need -mx with a large number to boost the maximum memory available. If you do, also use -verbosegc so that if garbage collection is forced, you at least know that garbage collection has occurred and can get the basic number of objects and bytes reclaimed.

Section 4

The fourth section of the profile output is the per-object memory dump. Again, this includes only objects left at the end of the process just before termination, not objects that may have been garbage collected before the end of the process. For example:

    tab[267] p=4bba378 cb=1873248 cnt=219 ac=3 al=1103
      Ljava/util/HashtableEntry; 219 3504
      [Ljava/util/HashtableEntry; 3 4412

This dump is a snapshot of the actual object table. The fields in the first line of an entry are:

tab[index]
    The entry location as listed in the object table. The index is of no use for performance tuning.

p=hex value
    Internal memory locations for the instance and class; of no use for performance tuning.

cb=hex value
    Internal memory locations for the instance and class; of no use for performance tuning.

cnt=integer
    The number of instances of the class reported on the next line.

ac=integer
    The number of instances of arrays of the class reported on the next line.

al=integer
    The total number of array elements for all the arrays counted in the previous (ac) field.

This first line of the example is followed by lines consisting of three fields: first, the class name (prefixed by the array dimension if the line refers to the array data); next, the number of instances of that class or array class; and last, the total amount of space used by all the instances, in bytes.
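As a rough illustration of the field layout just described, the following sketch pulls the cnt, ac, and al values out of an entry's first line. The class name ObjectTableEntry and the regular expression are assumptions made for this example, based only on the single sample line shown; real dumps may vary in spacing or contain entries this pattern does not match. The handleBytes method assumes 4 bytes per array element, which is the handle size the text uses for this JVM.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: not part of the JDK or of the book's code.
class ObjectTableEntry {
    // Assumed layout, copied from the sample line:
    // tab[267] p=4bba378 cb=1873248 cnt=219 ac=3 al=1103
    private static final Pattern HEADER = Pattern.compile(
        "tab\\[(\\d+)\\]\\s+p=(\\p{XDigit}+)\\s+cb=(\\p{XDigit}+)"
        + "\\s+cnt=(\\d+)\\s+ac=(\\d+)\\s+al=(\\d+)");

    final int instanceCount;  // cnt: instances of the class
    final int arrayCount;     // ac: arrays of the class
    final int arrayElements;  // al: total elements across those arrays

    ObjectTableEntry(int instanceCount, int arrayCount, int arrayElements) {
        this.instanceCount = instanceCount;
        this.arrayCount = arrayCount;
        this.arrayElements = arrayElements;
    }

    /** Parses the first line of a Section 4 entry; tab, p, and cb are ignored. */
    static ObjectTableEntry parse(String line) {
        Matcher m = HEADER.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized entry line: " + line);
        }
        return new ObjectTableEntry(
            Integer.parseInt(m.group(4)),
            Integer.parseInt(m.group(5)),
            Integer.parseInt(m.group(6)));
    }

    /** Bytes used by the array elements, assuming 4-byte object handles. */
    long handleBytes() {
        return 4L * arrayElements;
    }
}
```

For the sample line, parse yields cnt=219, ac=3, and al=1103, and handleBytes returns 4412, which matches the byte total reported on the array line of the example.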
The Section 4 example reports that there are 219 HashtableEntry instances taking a total of 3504 bytes between them [5], and three HashtableEntry arrays with 1103 array elements between them, which amounts to 4412 bytes, since each element is a 4-byte object handle.

[5] A HashtableEntry has one int and three object handle instance variables, each of which takes 4 bytes, so each HashtableEntry is 16 bytes.

The last two sections, Sections 3 and 4, give snapshots of the object table memory and can be used in an interesting way: run a garbage collection just before your application terminates. That leaves in the object table all the objects that are rooted [6] by the system and by your application from static variables. If this snapshot shows significantly more objects than you expect, you may be referencing more objects than you realized.

[6] Objects rooted by the system are objects the JVM runtime keeps alive as part of its runtime system. Rooted objects are generally objects that cannot be garbage collected because they are referenced in some way from other objects that cannot be garbage collected. The roots of these non-garbage-collectable objects are normally objects referenced from the stack, objects referenced from static variables of classes, and special objects the runtime system ensures are kept alive.

The first section of the profile output is the most useful. It consists of multiple lines, each of which specifies a method and its caller, together with the total cumulative time spent in that method and the total number of times it was called from that caller. The first line of this section names the four fields in the profile table: count, callee, caller, and time. They are detailed here:

count
    The total number of times the callee method was called from the caller method, accumulating multiple executions of the caller method.
    For example, if foo1 calls foo2 10 times every time foo1 is executed, and foo1 was itself called three times during the execution of the program, the count field should hold the value 30 for the callee-caller pair foo2-foo1. The line in the table should look like this:

        30 x/y/Z.foo2()V x/y/Z.foo1()V 1263

    assuming the foo methods are in class x.y.Z and they both have a void return. The actual reported numbers may be less than the true number of calls: the profiler can miss calls.

callee
    The method that was called count times in total from the caller method. The callee can be listed in other entries as the callee method for different caller methods.

caller
    The method that called the callee method count times in total.

time
    The cumulative time (in milliseconds) spent in the callee method, including time when the callee method was calling other methods (i.e., when the callee method was in the stack but not at the top, and so was not the currently executing method).

If each of the count calls in one line took exactly the same amount of time, then one call from caller to callee took time divided by count milliseconds.

This first section is normally sorted into count order. However, for this profiler, the time spent in methods tends to be more useful. Because the times in the time field include the total time that the callee method was anywhere on the stack, interpreting the output of complex programs can be difficult without processing the table to subtract subcall times. This format is different from the 1.2 output with cpu=samples specified, and is closer to a 1.2 profile with cpu=times specified.

The lines in the profile output are unique for each callee-caller pair, but any one callee method and any one caller method can (and normally do) appear in multiple lines. This is because any particular method can call many other methods, and so the method registers as the caller for multiple callee-caller pairs.
Any particular method can also be called by many other methods, and so the method registers as the callee for multiple callee-caller pairs.

The methods are written out using the internal Java syntax listed in Table 2-1.

Table 2-1, Internal Java Syntax for -prof Output Format

    Internal Symbol    Java Meaning
    /                  Replaces the . character in package names (e.g., java/lang/String stands for java.lang.String)
    B                  byte
    C                  char
    D                  double
    I                  int
    F                  float
    J                  long
    S                  short
    V                  void
    Z                  boolean
    [                  One array dimension (e.g., [[B stands for a two-dimensional array of bytes, such as new byte[3][4])
    Lclassname;        A class (e.g., Ljava/lang/String; stands for java.lang.String)

There are free viewers, including source code, for viewing this format file:

• Vladimir Bulatov's HyperProf (search for HyperProf on the Web)
• Greg White's ProfileViewer (search for ProfileViewer on the Web)
• My own viewer (see ProfileStack: A Profile Viewer for Java 1.1)

ProfileStack: A Profile Viewer for Java 1.1

I have made my own viewer available, with source code. Under the tuning.profview package, the main class is tuning.profview.ProfileStack and takes one argument, the name of the prof file. All classes from this book are available by clicking the "Examples" link from this book's catalog page, http://www.oreilly.com/catalog/javapt.

My viewer analyzes the profile output file, combines identical callee methods to give a list of its callers, and maps codes into readable method names.
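To make the mapping in Table 2-1 concrete, here is a minimal sketch of how a single type code could be turned back into Java source syntax. It is not taken from ProfileStack; the class name InternalSyntax and the method decodeType are made up for this illustration, and the sketch handles only one type at a time, not a whole method signature.

```java
// Hypothetical helper illustrating Table 2-1; not the viewer's actual code.
class InternalSyntax {
    /** Decodes one type such as "[[B" or "Ljava/lang/String;". */
    static String decodeType(String sig) {
        // Each leading '[' is one array dimension.
        int dims = 0;
        while (dims < sig.length() && sig.charAt(dims) == '[') {
            dims++;
        }
        String base = sig.substring(dims);
        String name;
        if (base.startsWith("L") && base.endsWith(";")) {
            // Lclassname; is a class, with '/' standing in for '.'.
            name = base.substring(1, base.length() - 1).replace('/', '.');
        } else if (base.length() == 1) {
            switch (base.charAt(0)) {
                case 'B': name = "byte"; break;
                case 'C': name = "char"; break;
                case 'D': name = "double"; break;
                case 'I': name = "int"; break;
                case 'F': name = "float"; break;
                case 'J': name = "long"; break;
                case 'S': name = "short"; break;
                case 'V': name = "void"; break;
                case 'Z': name = "boolean"; break;
                default:
                    throw new IllegalArgumentException("Unknown code: " + sig);
            }
        } else {
            throw new IllegalArgumentException("Unknown code: " + sig);
        }
        StringBuilder sb = new StringBuilder(name);
        for (int i = 0; i < dims; i++) {
            sb.append("[]");
        }
        return sb.toString();
    }
}
```

With this sketch, decodeType("[[B") gives byte[][] and decodeType("Ljava/lang/String;") gives java.lang.String, the two examples used in the table.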
The viewer's output to System.out looks like this:

    time   count  localtime  callee
    19650  2607   19354      int ObjectInputStream.read()
        Called by
        %      time   count  caller
        98.3   19335  46     short DataInputStream.readShort()
        1.1    227    1832   int DataInputStream.readUnsignedByte()
        0.2    58     462    int DataInputStream.readInt()
        0.1    23     206    int DataInputStream.readUnsignedShort()
        0.0    4      50     byte DataInputStream.readByte()
        0.0    1      9      boolean DataInputStream.readBoolean()
    19342  387    19342      int SocketInputStream.socketRead(byte[],int,i
        Called by
        %      time   count  caller
        100.0  19342  4      int SocketInputStream.read(byte[],int,i
    15116  3      15116      void ServerSocket.implAccept(Socket)
        Called by
        %      time   count  caller
        100.0  15116  3      Socket ServerSocket.accept()

Each main (nonindented) line of this output consists of a particular method (callee) showing the cumulative time in milliseconds for all the callers of that method, the cumulative count from all the callers, and the time actually spent in the method itself (not in any of the methods that it called). This last noncumulative time is found by identifying the entries in which the method appears as the caller, that is, the calls it makes to other methods, and then subtracting the total time for all those calls from the cumulative time for this method.

Each main line is followed by several lines breaking down all the methods that call this callee method, giving the percentage amongst them in terms of time, the cumulative time, the count of calls, and the name of the caller method. The methods are converted into normal Java source code syntax. The main lines are sorted by the time actually spent in the method (the third field, localtime, of the nonindented lines).

The biggest drawback to the 1.1 profile output is that threads are not indicated at all. This means that it is possible to get time values for method calls that are longer than the total time spent in running the application, since all the call times from multiple threads are added together. It also means that you cannot determine from which thread a particular method call was made.
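The local-time calculation described above can be sketched as follows. This is an illustration of the idea under stated assumptions, not the ProfileStack source: the class LocalTimeSketch and its Call holder are hypothetical names, the input is assumed to be already parsed, and the sample data in main is invented. Recursive calls and calls missed by the profiler would distort the result.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; names and sample data are not from the book.
class LocalTimeSketch {
    /** One callee-caller line of the profile table. */
    static final class Call {
        final String callee;
        final String caller;
        final long time; // cumulative milliseconds, as in the time field

        Call(String callee, String caller, long time) {
            this.callee = callee;
            this.caller = caller;
            this.time = time;
        }
    }

    /** Local time = time summed as a callee minus time summed as a caller. */
    static Map<String, Long> localTimes(List<Call> calls) {
        Map<String, Long> cumulative = new HashMap<>();
        Map<String, Long> subcalls = new HashMap<>();
        for (Call c : calls) {
            cumulative.merge(c.callee, c.time, Long::sum);
            subcalls.merge(c.caller, c.time, Long::sum);
        }
        Map<String, Long> local = new TreeMap<>();
        for (Map.Entry<String, Long> e : cumulative.entrySet()) {
            long sub = subcalls.getOrDefault(e.getKey(), 0L);
            local.put(e.getKey(), e.getValue() - sub);
        }
        return local;
    }

    public static void main(String[] args) {
        // Invented data: main calls a; a calls b and c.
        List<Call> calls = new ArrayList<>();
        calls.add(new Call("a", "main", 100));
        calls.add(new Call("b", "a", 30));
        calls.add(new Call("c", "a", 20));
        System.out.println(localTimes(calls));
    }
}
```

In the invented data, method a has a cumulative time of 100 ms, of which 50 ms is spent in b and c, so its local time is 50 ms. Methods that never appear as a callee, such as main here, get no entry.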
Despite the lack of thread information, after re-sorting the first section on the time field, rather than the count field, the profile data is useful enough to suffice as a method profiler when you have no better alternative.

One problem I've encountered is the limited size of the list of methods that can be held by the internal profiler. Technically, this limitation is 10,001 entries in the profile table, and there is presumably one entry per method. There are four methods that help you avoid the limitation by profiling only a small section of your code:

    sun.misc.VM.suspendJavaMonitor
    sun.misc.VM.resumeJavaMonitor
    sun.misc.VM.resetJavaMonitor
    sun.misc.VM.writeJavaMonitorReport

These methods also allow you some control over which parts of your application are profiled and when to dump the results.
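The re-sorting on the time field mentioned in this section might be done along the following lines. This is a minimal sketch with assumptions: the class ProfTableSort and its Row holder are hypothetical, each table line is assumed to hold exactly four whitespace-separated fields (count, callee, caller, time) with no spaces inside the method signatures, and the caller is expected to pass in only the lines of the first section.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; not part of the JDK or of the book's viewer.
class ProfTableSort {
    /** One parsed line of the first section of the profile output. */
    static final class Row {
        final int count;
        final String callee;
        final String caller;
        final long time; // cumulative milliseconds

        Row(int count, String callee, String caller, long time) {
            this.count = count;
            this.callee = callee;
            this.caller = caller;
            this.time = time;
        }

        /** Average time per call, valid only if all calls took equally long. */
        double millisPerCall() {
            return (double) time / count;
        }
    }

    /** Parses "count callee caller time"; assumes four whitespace-separated fields. */
    static Row parse(String line) {
        String[] f = line.trim().split("\\s+");
        if (f.length != 4) {
            throw new IllegalArgumentException("Expected 4 fields: " + line);
        }
        return new Row(Integer.parseInt(f[0]), f[1], f[2], Long.parseLong(f[3]));
    }

    /** Returns the rows ordered by descending time, skipping blanks and the header. */
    static List<Row> sortByTime(List<String> lines) {
        List<Row> rows = new ArrayList<>();
        for (String line : lines) {
            String t = line.trim();
            if (t.isEmpty() || t.startsWith("count")) {
                continue;
            }
            rows.add(parse(t));
        }
        rows.sort(Comparator.comparingLong((Row r) -> r.time).reversed());
        return rows;
    }
}
```

Sorting by descending time puts the entries with the largest cumulative cost first. Because those times include subcalls, the top entries are often high-level methods, so the ordering is a starting point for investigation rather than a direct list of bottlenecks.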
