Demonstrations of biolatency, the Linux eBPF/bcc version.


biolatency traces block device I/O (disk I/O), and records the distribution
of I/O latency (time), printing this as a histogram when Ctrl-C is hit.
For example:

# ./biolatency
Tracing block device I/O... Hit Ctrl-C to end.
^C
     usecs           : count     distribution
       0 -> 1        : 0        |                                      |
       2 -> 3        : 0        |                                      |
       4 -> 7        : 0        |                                      |
       8 -> 15       : 0        |                                      |
      16 -> 31       : 0        |                                      |
      32 -> 63       : 0        |                                      |
      64 -> 127      : 1        |                                      |
     128 -> 255      : 12       |********                              |
     256 -> 511      : 15       |**********                            |
     512 -> 1023     : 43       |*******************************       |
    1024 -> 2047     : 52       |**************************************|
    2048 -> 4095     : 47       |**********************************    |
    4096 -> 8191     : 52       |**************************************|
    8192 -> 16383    : 36       |**************************            |
   16384 -> 32767    : 15       |**********                            |
   32768 -> 65535    : 2        |*                                     |
   65536 -> 131071   : 2        |*                                     |

The latency of the disk I/O is measured from the issue to the device to its
completion. A -Q option can be used to include time queued in the kernel.

This example output shows a large mode of latency from about 128 microseconds
to about 32767 microseconds (33 milliseconds). The bulk of the I/O was
between 1 and 8 ms, which is the expected block device latency for
rotational storage devices.

The highest latency seen while tracing was between 65 and 131 milliseconds:
the last row printed, for which there were 2 I/O.
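
The distribution column appears to scale each row against the largest count
in the histogram. The following Python sketch reproduces that scaling for
three rows of the output above. It is an assumption inferred from the printed
output, not code taken from biolatency; render_bar and the 38 character width
are illustrative names and values:

```python
def render_bar(count, max_count, width=38):
    """Return a fixed-width bar of '*' scaled so that max_count fills it."""
    if max_count <= 0:
        return " " * width
    # Integer division truncates, so very small counts may show no stars.
    stars = width * count // max_count
    return "*" * stars + " " * (width - stars)


# Counts taken from three rows of the first example; 52 is the largest count.
for count in (1, 12, 52):
    print("%-8d |%s|" % (count, render_bar(count, 52)))
```

With these inputs, a count of 1 renders as an empty bar, which is why the
64 -> 127 row above shows a count but no stars.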

For efficiency, biolatency uses an in-kernel eBPF map to store timestamps
with requests, and another in-kernel map to store the histogram (the "count"
column), which is copied to user-space only when output is printed. These
methods lower the performance overhead when tracing is performed.
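
The two-map approach described above can be sketched in plain Python. This is
a user-space simulation under stated assumptions, not the tool's eBPF code:
start_ts and hist stand in for the two in-kernel maps, and on_issue,
on_complete, slot_for, and slot_bounds are hypothetical names:

```python
start_ts = {}      # stands in for the map of request -> issue timestamp
hist = [0] * 64    # stands in for the histogram map, one counter per slot


def slot_for(value):
    """Power-of-two slot index: 0-1 -> 0, 2-3 -> 1, 4-7 -> 2, and so on."""
    return max(value.bit_length() - 1, 0)


def slot_bounds(slot):
    """Inclusive (low, high) range covered by a slot, as printed per row."""
    low = 0 if slot == 0 else 1 << slot
    return low, (1 << (slot + 1)) - 1


def on_issue(req_id, now_us):
    """Record the time a request was issued to the device."""
    start_ts[req_id] = now_us


def on_complete(req_id, now_us):
    """Compute the latency and count it; only the counter is kept."""
    issued = start_ts.pop(req_id, None)
    if issued is None or now_us < issued:
        return  # issue was not seen, or the timestamps are inconsistent
    hist[slot_for(now_us - issued)] += 1


# Hypothetical request: issued at t=1000 us, completed at t=1100 us.
on_issue("req-a", 1000)
on_complete("req-a", 1100)

for slot, count in enumerate(hist):
    if count:
        low, high = slot_bounds(slot)
        print("%d -> %d : %d" % (low, high, count))
```

Only the small array of counters needs to be read when output is printed,
rather than one record per I/O, which is the point of the design.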


In the following example, the -m option is used to print a histogram using
milliseconds as the units (which eliminates the first several rows), the -T
option is used to print timestamps with the output, and the arguments "1 5"
are used to print 1 second summaries 5 times:

# ./biolatency -mT 1 5
Tracing block device I/O... Hit Ctrl-C to end.

06:20:16
     msecs           : count     distribution
       0 -> 1        : 36       |**************************************|
       2 -> 3        : 1        |*                                     |
       4 -> 7        : 3        |***                                   |
       8 -> 15       : 17       |*****************                     |
      16 -> 31       : 33       |**********************************    |
      32 -> 63       : 7        |*******                               |
      64 -> 127      : 6        |******                                |

06:20:17
     msecs           : count     distribution
       0 -> 1        : 96       |************************************  |
       2 -> 3        : 25       |*********                             |
       4 -> 7        : 29       |***********                           |
       8 -> 15       : 62       |***********************               |
      16 -> 31       : 100      |**************************************|
      32 -> 63       : 62       |***********************               |
      64 -> 127      : 18       |******                                |

06:20:18
     msecs           : count     distribution
       0 -> 1        : 68       |*************************             |
       2 -> 3        : 76       |****************************          |
       4 -> 7        : 20       |*******                               |
       8 -> 15       : 48       |*****************                     |
      16 -> 31       : 103      |**************************************|
      32 -> 63       : 49       |******************                    |
      64 -> 127      : 17       |******                                |

06:20:19
     msecs           : count     distribution
       0 -> 1        : 522      |*************************************+|
       2 -> 3        : 225      |****************                      |
       4 -> 7        : 38       |**                                    |
       8 -> 15       : 8        |                                      |
      16 -> 31       : 1        |                                      |

06:20:20
     msecs           : count     distribution
       0 -> 1        : 436      |**************************************|
       2 -> 3        : 106      |*********                             |
       4 -> 7        : 34       |**                                    |
       8 -> 15       : 19       |*                                     |
      16 -> 31       : 1        |                                      |

This shows how the I/O latency distribution changes over time.



The -Q option begins measuring I/O latency from when the request was first
queued in the kernel, and includes queuing latency:

# ./biolatency -Q
Tracing block device I/O... Hit Ctrl-C to end.
^C
     usecs           : count     distribution
       0 -> 1        : 0        |                                      |
       2 -> 3        : 0        |                                      |
       4 -> 7        : 0        |                                      |
       8 -> 15       : 0        |                                      |
      16 -> 31       : 0        |                                      |
      32 -> 63       : 0        |                                      |
      64 -> 127      : 0        |                                      |
     128 -> 255      : 3        |*                                     |
     256 -> 511      : 37       |**************                        |
     512 -> 1023     : 30       |***********                           |
    1024 -> 2047     : 18       |*******                               |
    2048 -> 4095     : 22       |********                              |
    4096 -> 8191     : 14       |*****                                 |
    8192 -> 16383    : 48       |*******************                   |
   16384 -> 32767    : 96       |**************************************|
   32768 -> 65535    : 31       |************                          |
   65536 -> 131071   : 26       |**********                            |
  131072 -> 262143   : 12       |****                                  |

This better reflects the latency suffered by the application (if it is
synchronous I/O), whereas the default mode, which excludes kernel queuing,
better reflects the performance of the device.

Note that storage devices (and storage device controllers) usually have
queues of their own, which are always included in the latency, with or
without -Q.
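
The difference between the two modes comes down to which timestamp is used
as the start of the measurement. A minimal Python sketch, using hypothetical
timestamps and a hypothetical io_latency_us helper (not part of biolatency):

```python
def io_latency_us(queued_us, issued_us, completed_us, include_queued=False):
    """Latency of one I/O, measured from queue time or from device issue."""
    start = queued_us if include_queued else issued_us
    return completed_us - start


# Hypothetical I/O: queued at t=0, issued to the device at t=15000 us,
# completed at t=17000 us.
default_us = io_latency_us(0, 15000, 17000)
queued_us = io_latency_us(0, 15000, 17000, include_queued=True)

print("default (device only):   %d us" % default_us)
print("with -Q (queued + device): %d us" % queued_us)
```

In this made-up case the device time is 2000 us, but the application waited
17000 us, so the same I/O would land in a much higher bucket with -Q.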


The -D option will print a histogram per disk. For example:

# ./biolatency -D
Tracing block device I/O... Hit Ctrl-C to end.
^C

Bucket disk = 'xvdb'
     usecs               : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 1        |                                        |
       256 -> 511        : 33       |**********************                  |
       512 -> 1023       : 36       |************************                |
      1024 -> 2047       : 58       |****************************************|
      2048 -> 4095       : 51       |***********************************     |
      4096 -> 8191       : 21       |**************                          |
      8192 -> 16383      : 2        |*                                       |

Bucket disk = 'xvdc'
     usecs               : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 1        |                                        |
       256 -> 511        : 38       |***********************                 |
       512 -> 1023       : 42       |*************************               |
      1024 -> 2047       : 66       |****************************************|
      2048 -> 4095       : 40       |************************                |
      4096 -> 8191       : 14       |********                                |

Bucket disk = 'xvda1'
     usecs               : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 18       |**********                              |
       512 -> 1023       : 67       |*************************************   |
      1024 -> 2047       : 35       |*******************                     |
      2048 -> 4095       : 71       |****************************************|
      4096 -> 8191       : 65       |************************************    |
      8192 -> 16383      : 65       |************************************    |
     16384 -> 32767      : 20       |***********                             |
     32768 -> 65535      : 7        |***                                     |

This output shows that xvda1 has much higher latency, usually between 0.5 ms
and 32 ms, whereas xvdc is usually between 0.2 ms and 4 ms.
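
Conceptually, a per-disk histogram only needs the disk name added to the
histogram key. The following Python sketch shows that idea with made-up
samples; per_disk, bucket_low, and record are hypothetical names, and this is
not how the tool itself is written:

```python
from collections import defaultdict

# disk name -> {bucket lower bound in usecs -> count}
per_disk = defaultdict(lambda: defaultdict(int))


def bucket_low(latency_us):
    """Lower bound of the power-of-two bucket holding latency_us."""
    if latency_us < 2:
        return 0
    return 1 << (latency_us.bit_length() - 1)


def record(disk, latency_us):
    """Count one I/O in the histogram belonging to its disk."""
    per_disk[disk][bucket_low(latency_us)] += 1


# Made-up (disk, latency in usecs) samples, for illustration only.
for disk, latency_us in [("xvdb", 300), ("xvdb", 1500), ("xvda1", 20000)]:
    record(disk, latency_us)

for disk in sorted(per_disk):
    print("disk = %r" % disk)
    for low in sorted(per_disk[disk]):
        print("  %8d usecs and up : %d" % (low, per_disk[disk][low]))
```

Keeping a separate set of counters per disk makes it possible to compare
devices side by side, as in the xvda1 and xvdc comparison above.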


USAGE message:

# ./biolatency -h
usage: biolatency [-h] [-T] [-Q] [-m] [-D] [interval] [count]

Summarize block device I/O latency as a histogram

positional arguments:
  interval            output interval, in seconds
  count               number of outputs

optional arguments:
  -h, --help          show this help message and exit
  -T, --timestamp     include timestamp on output
  -Q, --queued        include OS queued time in I/O time
  -m, --milliseconds  millisecond histogram
  -D, --disks         print a histogram per disk device

examples:
    ./biolatency            # summarize block I/O latency as a histogram
    ./biolatency 1 10       # print 1 second summaries, 10 times
    ./biolatency -mT 1      # 1s summaries, milliseconds, and timestamps
    ./biolatency -Q         # include OS queued time in I/O time
    ./biolatency -D         # show each disk device separately