Disk IO Hard Disks
14.1.1 Disk IO
Do not underestimate the impact of disk writes on the system as a whole. For example, all database vendors strongly recommend that the system swap files [1] be placed on a separate disk from their databases. The impact of not doing so can decrease database throughput and system activity by an order of magnitude. This performance decrease comes from not splitting the IO of two disk- intensive applications in this case, OS paging and database IO. [1] The disk files for the virtual memory of the operating system; see Section 14.3 . Identifying that there is an IO problem is usually fairly easy. The most basic symptom is that things take longer than expected, while at the same time the CPU is not at all heavily worked. The disk- monitoring utilities will also tell you that there is a lot of work being done to the disks. At the system level, you should determine the average and peak requirements on the disks. Your disks will have some statistics that are supplied by the vendor, including: • The average and peak transfer rates , normally in megabytes MB per second, e.g., 5MBsec. From this, you can calculate how long an 8K page takes to be transferred from disk; for example, 5MBsec is about 5Kms, so an 8K page takes just under 2 ms to transfer. • Average seek time, normally in milliseconds ms, e.g., 10 ms. This is the time required for the disk head to move radially to the correct location on the disk. • Rotational speed, normally in revolutions per minute rpm, e.g., 7200 rpm. From this, you can calculate the average rotational delay in moving the disk under the disk-head reader, i.e., the time taken for half a revolution. For example, for 7200 rpm, one revolution takes 60,000 ms 60 seconds divided by 7200 rpm, which is about 8.3 ms. So half a revolution takes just over 4 ms, which is consequently the average rotational delay. This list allows you to calculate the actual time it takes to load a random 8K page from the disk, this being seek time + rotational delay + transfer time. Using the examples given in the list, you have 10 + 4 + 2 = 16ms to load a random 8K page almost an order of magnitude slower than the raw disk throughput. This calculation gives you a worst-case scenario for the disk-transfer rates for your application, allowing you to determine if the system is up to the required performance. Note that if you are reading data stored sequentially on disk as when reading a large file, the seek time and rotational delay are incurred less than once per 8K page loaded. Basically, these two times are incurred only at the beginning of opening the file and whenever the file is fragmented. But this calculation is confounded by other processes also executing IO to the disk at the same time. These overheads are some of the reasons why swap and other intensive IO files should not be put on the same disk. One mechanism for speeding up disk IO is to stripe disks. Disk striping allows data from a particular file to be spread over several disks. Striping allows reads and writes to be performed in parallel across the disks without requiring any application changes. This can speed up disk IO quite effectively. However, be aware that the seek and rotational overheads previously listed still apply, and if you are making many small random reads, there may be no performance gain from striping disks. Finally, note again that using remote disks affects IO performance very badly. You should not be using remote disks mounted from the network with any IO-intensive operations if you need good performance. - 307 -14.1.2 Clustering Files
Parts
» OReilly.Java.performance tuning
» The Tuning Game System Limitations and What to Tune
» A Tuning Strategy Introduction
» Threading to Appear Quicker Streaming to Appear Quicker
» User Agreements Starting to Tune
» Setting Benchmarks Starting to Tune
» The Benchmark Harness Starting to Tune
» Taking Measurements Starting to Tune
» What to Measure Introduction
» Dont Tune What You Dont Need to Tune
» Measurements and Timings Profiling Tools
» Garbage Collection Profiling Tools
» Profiling Methodology Method Calls
» Java 2 cpu=samples Profile Output
» HotSpot and 1.3 -Xprof Profile Output
» JDK 1.1.x -prof and Java 2 cpu=old Profile Output
» Object-Creation Profiling Profiling Tools
» Monitoring Gross Memory Usage
» Replacing Sockets ClientServer Communications
» Performance Checklist Profiling Tools
» Garbage Collection Underlying JDK Improvements
» Replacing JDK Classes Underlying JDK Improvements
» VM Speed Variations VMs with JIT Compilers
» Other VM Optimizations Faster VMs
» Inline calls Remove dynamic type checks Unroll loops Code motion
» Literal constants are folded String concatenation is sometimes folded Constant fields are inlined
» Optimizations Performed When Using the -O Option
» Performance Effects From Runtime Options
» Compile to Native Machine Code
» Native Method Calls Underlying JDK Improvements
» Uncompressed ZIPJAR Files Underlying JDK Improvements
» Performance Checklist Underlying JDK Improvements
» Object-Creation Statistics Object Creation
» Pool Management Object Reuse
» Reusable Parameters Object Reuse
» String canonicalization Changeable objects
» Weak references Canonicalizing Objects
» Avoiding Garbage Collection Object Creation
» Preallocating Objects Lazy Initialization
» Performance Checklist Object Creation
» The Performance Effects of Strings
» Compile-Time Versus Runtime Resolution of Strings
» Converting bytes, shorts, chars, and booleans to Strings Converting floats to Strings
» Converting doubles to Strings
» Converting Objects to Strings
» Word-Counting Example Strings Versus char Arrays
» Line Filter Example HotSpot 1.0
» String Comparisons and Searches
» Sorting Internationalized Strings Strings
» The Cost of try-catch Blocks Without an Exception
» The Cost of try-catch Blocks with an Exception
» Using Exceptions Without the Stack Trace Overhead Conditional Error Checking
» no JIT 1.3 Variables Strings
» Method Parameters Performance Checklist
» Exception-Terminated Loops Loops and Switches
» no JIT 1.3 Loops and Switches
» Recursion Loops and Switches
» no HotSpot 1.0 2nd Loops and Switches
» Recursion and Stacks Loops and Switches
» Performance Checklist Loops and Switches
» Replacing System.out IO, Logging, and Console Output
» Logging From Raw IO to Smokin IO
» no JIT HotSpot 1.0 no JIT HotSpot 1.0 Serialization
» no IO, Logging, and Console Output
» Clustering Objects and Counting IO Operations
» Compression IO, Logging, and Console Output
» Performance Checklist IO, Logging, and Console Output
» Avoiding Unnecessary Sorting Overhead
» An Efficient Sorting Framework
» no HotSpot Better Than Onlogn Sorting
» User-Interface Thread and Other Threads
» Desynchronization and Synchronized Wrappers
» Avoiding Serialized Execution HotSpot 1.0
» no JIT no JIT HotSpot 1.0 Timing Multithreaded Tests
» Atomic Access and Assignment
» Free Load Balancing from TCPIP
» Load-Balancing Classes Load Balancing
» A Load-Balancing Example Load Balancing
» Threaded Problem-Solving Strategies Threading
» Collections Appropriate Data Structures and Algorithms
» Java 2 Collections Appropriate Data Structures and Algorithms
» Hashtables and HashMaps Appropriate Data Structures and Algorithms
» Cached Access Appropriate Data Structures and Algorithms
» Caching Example I Appropriate Data Structures and Algorithms
» Caching Example II Appropriate Data Structures and Algorithms
» Finding the Index for Partially Matched Strings
» Search Trees Appropriate Data Structures and Algorithms
» Comparing Communication Layers Distributed Computing
» Batching I Application Partitioning
» Compression Caching Low-Level Communication Optimizations
» Transfer Batching Low-Level Communication Optimizations
» Batching II Distributed Garbage Collection
» Performance Checklist Distributed Computing
» When Not to Optimize Tuning Class Libraries and Beans
» Scaling Design and Architecture
» Distributed Applications Design and Architecture
» Object Design Design and Architecture
» Use simulations and benchmarks Consider the total work done and the design overhead
» Tuning After Deployment When to Optimize
» User Interface Usability Training Server Downtime
» Performance Checklist When to Optimize
» Clustering Files Cached Filesystems RAM Disks, tmpfs, cachefs
» Disk Fragmentation Disk Sweet Spots
» RAM Underlying Operating System and Network Improvements
» Network Bottlenecks Network IO
» Performance Checklist Underlying Operating System and Network Improvements
Show more