Tuning vSCSI Disk Settings
Several settings can be tuned for vSCSI disks in AIX. This article explains these tuning options and the values allowed for each attribute.
What is algorithm?
The algorithm setting tells the device driver which load balancing and/or failover algorithms to use. With vSCSI disks, only fail_over is supported.
What is hcheck_cmd?
The hcheck_cmd setting tells the device driver which health check command to use to verify that the disk is still active and alive. The two valid options are test_unit_rdy and inquiry. If you have reservation locks on your disks, use the inquiry option, as the test_unit_rdy option will fail and log an error on the client.
What is hcheck_interval?
The hcheck_interval is the interval in seconds between health check polls to the disk using the hcheck_cmd above. If a failed path of an MPIO disk is polled and found to be responding, the failed path will automatically be re-enabled. A value of 0 (zero) disables the auto polling feature. This value should always be greater than the rw_timeout value of the parent adapter. A small value may add additional I/O load to the adapter and disk drivers, especially if you have a large number of disks and paths per adapter.
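The rule above (0 disables polling, otherwise the interval should exceed rw_timeout) can be sketched as a small shell check. This is only an illustration: check_hcheck_interval is a hypothetical helper name, and both values are passed in as plain numbers of seconds rather than read from a live system.

```shell
#!/bin/sh
# Sketch: classify an hcheck_interval value against an rw_timeout value.
# Both arguments are seconds and are supplied by the caller. On a live
# system the disk value could probably be read with something like
#   lsattr -El hdisk0 -a hcheck_interval -F value
# (an assumption, shown here only as a comment).

check_hcheck_interval() {
    interval=$1
    rw_timeout=$2
    if [ "$interval" -eq 0 ]; then
        echo "disabled"      # 0 turns the auto polling feature off
    elif [ "$interval" -gt "$rw_timeout" ]; then
        echo "ok"            # interval is greater than rw_timeout
    else
        echo "too_low"       # interval should be raised above rw_timeout
    fi
}

check_hcheck_interval 60 30
check_hcheck_interval 20 30
check_hcheck_interval 0 30
```

The three sample calls use made-up values purely to exercise each branch.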
What is hcheck_mode?
The hcheck_mode determines which types of device paths are health checked. The PCM checks the continuity of a path and the ability of the target device to process commands. If this check is successful, the path will be enabled. If the check fails, the path is left in its current state or marked as failed.
Select "enabled" to check only paths that are in the enabled state and have no active I/O.
Select "failed" to check only paths that are in the failed state.
Select "nonactive" to check only paths with no active I/O, including paths that are in the failed state.
What is queue_depth?
The queue_depth setting determines how many pending I/Os are allowed per disk device. The allowable range is 1 to 256. It is generally recommended to leave this setting at the default value as per the vendor's recommendation, unless advised otherwise by IBM or the vendor support teams. This is the number of concurrent outstanding I/O requests that can be queued on the disk. Once the queue is full, additional requests are blocked and the 'sqfull' value shown with 'iostat -DR' will increase.
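As a quick sanity check before changing anything, the allowed values described so far can be encoded in a small validation function. This is a sketch only: validate_attr is a hypothetical helper, and it covers just the values named in this article (fail_over for algorithm, the two hcheck_cmd options, the three hcheck_mode options, and 1 to 256 for queue_depth).

```shell
#!/bin/sh
# Sketch: validate an attribute/value pair against the values listed
# in this article. Returns 0 if valid, 1 if invalid, 2 if the attribute
# is not one this sketch knows about.

validate_attr() {
    attr=$1
    value=$2
    case "$attr" in
        algorithm)
            [ "$value" = "fail_over" ] ;;
        hcheck_cmd)
            case "$value" in
                test_unit_rdy|inquiry) return 0 ;;
                *) return 1 ;;
            esac ;;
        hcheck_mode)
            case "$value" in
                enabled|failed|nonactive) return 0 ;;
                *) return 1 ;;
            esac ;;
        queue_depth)
            # Reject empty or non-numeric input before the range test.
            case "$value" in
                ''|*[!0-9]*) return 1 ;;
            esac
            [ "$value" -ge 1 ] && [ "$value" -le 256 ] ;;
        *)
            return 2 ;;
    esac
}

validate_attr queue_depth 32 && echo "queue_depth=32 is valid"
validate_attr hcheck_mode active || echo "hcheck_mode=active is rejected"
```

A function like this could be run over a list of planned changes before they are applied with chdev.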
What is reserve_policy?
The reserve_policy is the disk reservation policy in force. This provides support for applications that are enabled to use SCSI-2 reserve functions. Typically clustering software will require the 'single_path' reserve option to be set.
What is max_transfer?
The max_transfer setting determines the amount of data that can be transferred to the disk in a single I/O operation. Increasing this value for all disks in a volume group allows the 'LTG size' value for the volume group to be increased. You may want to increase this value so that the AIX I/O size is equal to (or greater than) your array stripe size. Below is the size each hex value relates to.
0x20000 = 128KB
0x40000 = 256KB <== Generally the Default value.
0x80000 = 512KB
0x100000 = 1MB
The following may not be supported by AIX/LVM
0x200000 = 2MB
0x400000 = 4MB
0x800000 = 8MB
0x1000000 = 16MB
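The hex values above are byte counts, so the table can be reproduced with plain shell arithmetic. A minimal sketch follows, with max_transfer_kb being a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: convert a max_transfer hex value (bytes) to kilobytes.
# Shell arithmetic accepts 0x-prefixed hexadecimal constants.

max_transfer_kb() {
    echo $(( $1 / 1024 ))
}

for hex in 0x20000 0x40000 0x80000 0x100000; do
    echo "$hex = $(max_transfer_kb "$hex")KB"
done
```

The same function can be used to compare a candidate max_transfer value against an array stripe size expressed in KB.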
Load Balance with Multiple Paths
By default, when you run cfgmgr on an AIX LPAR with MPIO, all vSCSI disks will be set with a path priority of 1. This means that all your disk I/O (by default) will go to the first VIO server configured to service your disks. For best performance and to balance across multiple VIO servers, use the chpath command. I suggest all even-numbered disks have priority 1 to the first VIOS and priority 2 to the second VIOS, and all odd-numbered disks have priority 1 to the second VIOS and priority 2 to the first VIOS, as per the following example.
chpath -l hdisk0 -p vscsi0 -a priority=1
chpath -l hdisk0 -p vscsi1 -a priority=2
chpath -l hdisk1 -p vscsi0 -a priority=2
chpath -l hdisk1 -p vscsi1 -a priority=1
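For more than a couple of disks, the even/odd scheme can be generated rather than typed by hand. The sketch below only prints the chpath commands so they can be reviewed first; gen_chpath is a hypothetical helper, and it assumes vscsi0 leads to the first VIOS and vscsi1 to the second, as in the example above.

```shell
#!/bin/sh
# Sketch: print (do not run) chpath commands that alternate the
# preferred path by disk number. Even-numbered disks prefer vscsi0,
# odd-numbered disks prefer vscsi1. Adapter names are assumptions.

gen_chpath() {
    for disk in "$@"; do
        num=${disk#hdisk}                 # hdisk12 -> 12
        if [ $(( num % 2 )) -eq 0 ]; then
            p0=1; p1=2                    # even disk: vscsi0 is primary
        else
            p0=2; p1=1                    # odd disk: vscsi1 is primary
        fi
        echo "chpath -l $disk -p vscsi0 -a priority=$p0"
        echo "chpath -l $disk -p vscsi1 -a priority=$p1"
    done
}

gen_chpath hdisk0 hdisk1 hdisk2 hdisk3
```

Once the printed commands look right for your adapter layout, the output can be saved to a file and run on the client LPAR.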