Understanding ‘iostat’ output for database I/O loads
From the Linux man page:
“The iostat command is used for monitoring system input/output device loading by observing the time the devices are active in relation to their average transfer rates. The iostat command generates reports that can be used to change system configuration to better balance the input/output load between physical disks.”
Reports from ‘iostat’ are really useful, but I had some trouble interpreting the results the first time I used it. Since then it has become my go-to tool for debugging disk overloads.
I usually use the iostat command with the following switches:
iostat -d -x <interval>
-d = gets rid of the CPU stats so that we can easily concentrate on the I/O only
-x = adds extended statistics such as ‘await’ and ‘svctm’ (discussed below)
<interval> = the time in seconds between reports, so you get a new ‘iostat’ report every <interval> seconds
Let’s now see a sample output of ‘iostat’:
Looking at the stats above, we would usually check %util first. A value close to 100% can identify the problem on a single-disk setup, but not in the usual multi-disk scenario.
The columns we look at to identify the problem are:
svctm: The average service time (in milliseconds) for I/O requests that were issued to the device.
await: The average time (in milliseconds) for I/O requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them.
This basically means:
await = svctm + wait time in queue
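As a quick worked example of the relation above, here is a minimal Python sketch. The function name and the sample numbers are made up for illustration; they are not taken from a real ‘iostat’ report.

```python
def queue_wait_ms(await_ms: float, svctm_ms: float) -> float:
    """Return the time a request spent waiting in the queue, in milliseconds.

    Rearranges: await = service time + wait time in queue.
    """
    return await_ms - svctm_ms


# Hypothetical values: 25 ms total (await), of which 5 ms is actual service time.
print(queue_wait_ms(25.0, 5.0))  # 20.0
```

So in this made-up case a request spends four times longer queued than it does being serviced.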
Using the above, we get a basic rule for identifying an overloaded setup:
…if you see a large difference between the values of ‘svctm’ and ‘await’ every now and then, it tells you that I/O requests are spending a long time waiting in the queue, and this should help you identify the problem.
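The rule above could be automated roughly as follows. This is only a sketch under a few assumptions: the report text has a header row starting with "Device" and contains columns literally named "await" and "svctm" (column names vary between sysstat versions, so check your own output), the 3x ratio is an arbitrary threshold chosen for illustration, and the sample report is hypothetical and trimmed to three columns.

```python
def find_long_waits(report: str, ratio: float = 3.0):
    """Return (device, await, svctm) for rows where await >= ratio * svctm."""
    header = None
    flagged = []
    for line in report.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The header row starts with "Device" (older versions print "Device:").
        if fields[0].rstrip(":") == "Device":
            header = [f.rstrip(":") for f in fields]
            continue
        # Skip anything that is not a data row matching the header.
        if header is None or len(fields) != len(header):
            continue
        row = dict(zip(header, fields))
        try:
            await_ms = float(row["await"])
            svctm_ms = float(row["svctm"])
        except (KeyError, ValueError):
            continue  # columns missing or not numeric
        if svctm_ms > 0 and await_ms >= ratio * svctm_ms:
            flagged.append((row["Device"], await_ms, svctm_ms))
    return flagged


# Hypothetical report, NOT real measurements.
SAMPLE = """\
Device: await svctm %util
sda 4.10 3.90 35.00
sdb 48.00 6.00 97.00
"""

for device, await_ms, svctm_ms in find_long_waits(SAMPLE):
    print(f"{device}: await={await_ms} ms, svctm={svctm_ms} ms")
```

With the made-up numbers, only sdb is flagged, because its await is eight times its svctm while sda's two values are nearly equal. To use it on live data you could pipe the output of ‘iostat -d -x <interval>’ into a script that calls this function on each report.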