
Copyright IBM Corporation, 2012

Linux I/O performance


An end-to-end methodology for maximizing Linux I/O performance on the
IBM System x servers in a typical SAN environment.
















David Quenzler
IBM Systems and Technology Group ISV Enablement
June 2012


Table of contents
Abstract
Introduction
External storage subsystem - XIV
External SAN switches
    Bottleneck monitoring
    Fabric parameters
    Basic port configuration
    Advanced port configuration
Host adapter placement rules
System BIOS settings
HBA BIOS settings
Linux kernel parameters
Linux memory settings
    Page size
    Transparent huge pages
Linux module settings - qla2xxx
Linux SCSI subsystem tuning - /sys
Linux XFS file system create options
Linux XFS file system mount options
Red Hat tuned
    ktune.sh
    ktune.sysconfig
    sysctl.ktune
    tuned.conf
Linux multipath
Sample scripts
Summary
Resources
About the author
Trademarks and special notices


Abstract
This white paper discusses an end-to-end approach for Linux I/O tuning in a typical data center
environment consisting of external storage subsystems, storage area network (SAN) switches,
IBM System x Intel servers, Fibre Channel host bus adapters (HBAs) and 64-bit Red Hat
Enterprise Linux.
Anyone with an interest in I/O tuning is welcome to read this white paper.

Introduction
Linux I/O tuning is complex. In a typical environment, I/O makes several transitions from the client
application out to disk and vice versa. There are many pieces to the puzzle.
We will examine the following topics in detail:
External storage subsystems
External SAN switches
Host adapter placement rules
System BIOS settings
Adapter BIOS settings
Linux kernel parameters
Linux memory settings
Linux module settings
Linux SCSI subsystem settings
Linux file system create options
Linux file system mount options
Red Hat tuned
Linux multipath
You should follow an end-to-end tuning methodology in order to minimize the risk of poor tuning.
Recommendations in this white paper are based on the following environment under test:
IBM System x 3850 (64 processors and 640 GB RAM)
Red Hat Enterprise Linux 6.1 x86_64
The Linux XFS file system
IBM XIV external storage subsystem, Fibre Channel (FC) attached
An architecture comprising IBM hardware and Red Hat Linux provides a solid framework for maximizing
I/O performance.



External storage subsystem - XIV
The XIV has few manual tunables. Here are a few tips:
Familiarize yourself with the XIV command-line interface (XCLI) as documented in the IBM XIV
Storage System User Manual.
Ensure that you connect the XIV system to your environment in the FC fully redundant
configuration as documented in the XIV Storage System: Host Attachment and Interoperability
guide from IBM Redbooks.

Figure 1: FC fully redundant configuration

Although you can define up to 12 paths per host, a maximum of six paths per host provides sufficient
redundancy and performance.
Useful XCLI commands:
# module_list -t all
# module_list -x
# fc_port_list
The XIV storage subsystem contains six FC data modules (4 to 9), each with 8 GB memory. The FC rate
is 4 Gbps and the data partition size is 1 MB.
Check the XIV HBA queue depth setting: The higher the host HBA queue depth, the more
parallel I/O goes to the XIV system, but each XIV port can only sustain up to 1400 concurrent
I/Os to the same type target or logical unit (LUN). Therefore, the number of connections
multiplied by the host HBA queue depth should not exceed that value. The number of
connections should take the multipath configuration into account.

Note: The XIV queue limit is 1400 per XIV FC host port and 256 per LUN per worldwide port
name (WWPN) per port.



Twenty-four multipath connections to the XIV system would dictate that the host queue depth be set to 58 (24 * 58 = 1392); see the short sketch after this list.
Check the operating system (OS) disk queue depth (see below)
Make use of the XIV host attachment kit for RHEL
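The arithmetic above can be scripted. A minimal sketch, assuming the 24 paths and the 1400-concurrent-I/Os-per-port limit quoted in the note:
#!/bin/sh
# Sketch: derive a per-host HBA queue depth from the number of multipath
# connections and the XIV per-port limit of 1400 concurrent I/Os.
paths=24
port_limit=1400
echo "host HBA queue depth <= $(( port_limit / paths ))"   # prints 58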

Useful commands:
# xiv_devlist



External SAN switches
As a best practice, set SAN Switch port speeds to Auto (auto-negotiate).
Typical bottlenecks are:
Latency bottleneck
Congestion bottleneck
Latency bottlenecks occur when frames are sent faster than they can be received. This can be due to
buffer credit starvation or slow drain devices in the fabric.
Congestion bottlenecks occur when the required throughput exceeds the physical data rate for the
connection.
Most SAN Switch web interfaces can be used to monitor the basic performance metrics, such as
throughput utilization, aggregate throughput, and percentage of utilization.
The Fabric OS command-line interface (CLI) can also be used to create frame monitors. These monitors
analyze the first 64 bytes of each frame and can detect various types of protocols that can be monitored.
Some performance features, such as frame monitor configuration (fmconfig), require a license.
Some of the useful commands:
switch:admin>perfhelp
switch:admin>perfmonitorshow
switch:admin>perfaddeemonitor
switch:admin>fmconfig

Bottleneck monitoring
Enable bottleneck monitoring on SAN switches by using the following command:
switch:admin> bottleneckmon --enable -alert

Useful commands
switch:admin> bottleneckmon --status
switch:admin> bottleneckmon --show -interval 5 -span 300
switch:admin> switchstatusshow
switch:admin> switchshow
switch:admin> configshow
switch:admin> configshow -pattern "fabric"
switch:admin> diagshow
switch:admin> porterrshow



Fabric parameters
Fabric parameters are described in the following table. Default values are in brackets []:
BBCredit: Increasing the buffer-to-buffer (BB) credit parameter may increase performance by buffering FC frames coming from 8 Gbps FC server ports and going to 4 Gbps FC ports on the XIV; SAN segments can run at different rates. Frame pacing (BB credit starvation) occurs when no more BB credits are available. The frame pacing delay (AVG FRAME PACING) should always be zero; if it is not, increase the buffer credits. Over-increasing the number of BB credits does not increase performance. [16]
E_D_TOV: Error Detect TimeOut Value [2000]
R_A_TOV: Resource Allocation TimeOut Value [10000]
dataFieldSize: 512, 1024, 2048, or 2112 [2112]
Sequence Level Switching: Under normal conditions, disable for better performance (interleave frames, do not group frames) [0]
Disable Device Probing: Set this mode only if N_Port discovery causes attached devices to fail [0]
Per-Frame Routing Priority: [0]
Suppress Class F Traffic: Used with ATM gateways only [0]
Insistent Domain ID Mode: fabric.ididmode [0]
Table 1: Fabric parameters (default values are in brackets)

Basic port configuration
Target rate limiting (ratelim) is used to minimize congestion at the adapter port caused by a slow-drain device operating in the fabric at a slower speed (for example, a 4 Gbps XIV system).

Advanced port configuration
Turning on Interrupt Control Coalesce and increasing the latency monitor timeout value can improve
performance by reducing interrupts and processor utilization.



Host adapter placement rules
It is extremely important for you to follow the adapter placement rules for your server in order to minimize
PCI bus saturation.

System BIOS settings
Use recommended CMOS settings for your IBM System x server.
You can use the IBM Advanced Settings Utility (asu64) to modify the System x BIOS settings from the
Linux command line. It is normally installed in /opt/ibm/toolscenter/asu.
ASU normally tries to communicate over the LAN through the USB interface. Disable the LAN over USB
interface with the following command:
# asu64 set IMM.LanOverUsb Disabled --kcs
The following settings can result in better performance:
uEFI.TurboModeEnable=Enable
uEFI.PerformanceStates=Enable
uEFI.PackageCState=ACPI C3
uEFI.ProcessorC1eEnable=Disable
uEFI.DDRspeed=Max Performance
uEFI.QPISpeed=Max Performance
uEFI.EnergyManager=Disable
uEFI.OperatingMode=Performance Mode

Depending on the workload, enabling or disabling Hyper-Threading can also improve application performance.

Useful commands:
# asu64 show
# asu64 show --help
# asu64 set IMM.LanOverUsb Disabled --kcs
# asu64 set uEFI.OperatingMode Performance
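A minimal sketch that applies several of the recommended settings in one pass. The setting names and values are taken from the list above, but valid names vary by System x model and firmware level, so verify them with asu64 show first.
#!/bin/sh
# Apply selected uEFI settings from the list above (verify names with "asu64 show").
asu64 set uEFI.OperatingMode Performance
asu64 set uEFI.TurboModeEnable Enable
asu64 set uEFI.ProcessorC1eEnable Disable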

HBA BIOS settings
You can use the QLogic SANSurfer command-line utility (scli) to show or modify HBA settings.


Task                                     Command
Display current HBA parameter settings   # scli -c
Display WWPNs only                       # scli -c | grep WWPN
Display settings only                    # scli -c | grep \: | grep -v WWPN | sort | uniq -c
Restore default settings                 # scli -n all default
Table 2: Modifying HBA settings

WWPNs can also be determined from the Linux command line by using a small script:
#!/bin/sh
### Print the WWPN of each FC host port found under /sys
hba_location=$(lspci | grep HBA | awk '{print $1}')

for adapter in $hba_location
do
    cat $(find /sys/devices -name \*${adapter})/host*/fc_host/host*/port_name
done
Listing 1: Determining WWPNs

HBA parameters as reported by the scli command appear in the following table:
Parameter Default value
Connection Options 2 - Loop Preferred, Otherwise Point-to-Point
Data Rate Auto
Enable FC Tape Support Disabled
Enable Hard Loop ID Disabled
Enable Host HBA BIOS Disabled
Enable LIP Full Login Yes
Enable Target Reset Yes
Execution Throttle 16
Frame Size 2048


Hard Loop ID 0
Interrupt Delay Timer (100ms) 0
Link Down Timeout (seconds) 30
Login Retry Count 8
Loop Reset Delay (seconds) 5
LUNs Per Target 128
Operation Mode 0
Out Of Order Frame Assembly Disabled
Port Down Retry Count 30 seconds
Table 3: HBA BIOS tunable parameters (sorted)

Use the lspci command to show which type(s) of Fibre Channel adapters exist in the system. For
example:
# lspci | grep HBA
Note: Adapters from different vendors have different default values.



Linux kernel parameters
The available Linux I/O schedulers are noop, anticipatory, deadline, and cfq.
echo "Linux: SCHEDULER"
cat /sys/block/*/queue/scheduler | grep -v none | sort | uniq -c
echo ""
Listing 2: Determining the Linux scheduler for block devices

The Red Hat enterprise-storage tuned profile uses the deadline scheduler. The deadline scheduler can
be enabled by adding the elevator=deadline parameter to the kernel command line in grub.conf.

Useful commands:
# cat /proc/cmdline
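Two common ways to select the deadline scheduler, sketched below; the device name sdb is a placeholder.
# Non-persistent, per block device (takes effect immediately):
echo deadline > /sys/block/sdb/queue/scheduler
cat /sys/block/sdb/queue/scheduler

# Persistent: append elevator=deadline to the kernel line in /boot/grub/grub.conf
# and reboot, or activate the Red Hat enterprise-storage tuned profile.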

Linux memory settings
This section describes the Linux memory settings that are relevant to I/O tuning.
Page size
The default page size for Red Hat Linux is 4096 bytes.
# getconf PAGESIZE
Transparent huge pages
The default huge page size is 2048 KB (2 MB) on x86_64 systems.
echo "Linux: HUGEPAGES"
cat /sys/kernel/mm/redhat_transparent_hugepage/enabled
echo ""
Listing 3: Determining the Linux huge page setting
The Red Hat enterprise-storage tuned profile enables huge pages.
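The same sysfs file can be written to toggle the setting at run time. A sketch using the RHEL 6 path shown above; the change does not persist across reboots.
# Enable transparent huge pages (the enterprise-storage profile does this):
echo always > /sys/kernel/mm/redhat_transparent_hugepage/enabled
# Disable them if a particular workload is known to regress with THP:
echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
cat /sys/kernel/mm/redhat_transparent_hugepage/enabled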



Linux module settings - qla2xxx
You can see the parameters for the qla2xxx module using the following script:
#!/bin/sh
###
for param in $(ls /sys/module/qla2xxx/parameters)
do
echo -n "${param} = "
cat /sys/module/qla2xxx/parameters/${param}
done
Listing 4: Determining qla2xxx module parameters

Disable QLogic failover. If the output of the following command shows the -k driver (not the -fo failover driver), then failover is disabled.
# modinfo qla2xxx | grep -w ^version
version: <some_version>-k

QLogic lists the following highlights for the 2400 series HBAs:
150,000 IOPS per port
Out-of-order frame reassembly
T10 CRC for end-to-end data integrity

Useful commands:
# modinfo -p qla2xxx

The qla_os.c file in the Linux kernel source contains information on many of the qla2xxx module
parameters. Some parameters as listed by modinfo -p do not exist in the Linux source code. Others
are not explicitly defined but may be initialized by the adapter firmware.

Descriptions of module parameters appear in the following table:
Parameter: description [value in Linux kernel source; module default]

ql2xallocfwdump: Allocate memory for a firmware dump during HBA initialization [1; 1 - allocate memory]
ql2xasynctmfenable: Issue TM IOCBs asynchronously via the IOCB mechanism [does not exist; 0 - issue TM IOCBs via mailbox mechanism]
ql2xdbwr: Scheme for request queue posting [does not exist; 1 - CAMRAM doorbell (faster)]
ql2xdontresethba: Reset behavior [does not exist; 0 - reset on failure]
ql2xenabledif: T10-CRC-DIF [does not exist; 1 - DIF support]
ql2xenablehba_err_chk: T10-CRC-DIF error isolation by HBA [does not exist; 0 - disabled]
ql2xetsenable: Firmware ETS burst [does not exist; 0 - skip ETS enablement]
ql2xextended_error_logging: Extended error logging [not explicitly defined; 0 - no logging]
ql2xfdmienable: FDMI registrations [1; 0 - no FDMI]
ql2xfwloadbin: Location from which to load firmware [not explicitly defined; 0 - use default semantics]
ql2xgffidenable: GFF_ID checks of port type [does not exist; 0 - do not use GFF_ID]
ql2xiidmaenable: iIDMA setting [1; 1 - perform iIDMA]
ql2xloginretrycount: Alternate value for NVRAM login retry count [0; 0]
ql2xlogintimeout: Login timeout value in seconds [20; 20]
ql2xmaxqdepth: Maximum queue depth for target devices -- used to seed the queue depth for SCSI devices [32; 32]
ql2xmaxqueues: MQ [1; 1 - single queue]
ql2xmultique_tag: CPU affinity [not defined; 0 - no affinity]
ql2xplogiabsentdevice: PLOGI [not defined; 0 - no PLOGI]
ql2xqfulrampup: Time in seconds to wait before ramping up the queue depth for a device after a queue-full condition has been detected [does not exist; 120 seconds]
ql2xqfulltracking: Track and dynamically adjust the queue depth for SCSI devices [does not exist; 1 - perform tracking]
ql2xshiftctondsd: Control shifting of command type processing based on the total number of SG elements [does not exist; 6]
ql2xtargetreset: Target reset [does not exist; 1 - use hw defaults]
qlport_down_retry: Maximum number of command retries to a port in PORT-DOWN state [not defined; 0]
Table 4: qla2xxx module parameters
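Module parameters such as ql2xmaxqdepth can be set persistently through a modprobe options file. A minimal sketch; the file name is illustrative and the value shown is simply the default, not a tuning recommendation.
# /etc/modprobe.d/qla2xxx.conf
options qla2xxx ql2xmaxqdepth=32

# Rebuild the initramfs so the option also applies at boot (RHEL 6):
# dracut -f
# Verify the running value after the module is reloaded:
# cat /sys/module/qla2xxx/parameters/ql2xmaxqdepth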

Linux SCSI subsystem tuning - /sys
See /sys/block/<device>/queue/<parameter>
Block device parameter values can be determined using a small script:
#!/bin/sh
###

param_list=$(find /sys/block/sda/queue -maxdepth 1 -type f -exec basename '{}' \; | sort)
dev_list=$(ls -l /dev/disk/by-path | grep -w fc | awk -F \/ '{print $3}')
dm_list=$(ls -d /sys/block/dm-* | awk -F \/ '{print $NF}')

for param in ${param_list}
do
    echo -n "${param} = "
    for dev in ${dev_list} ${dm_list}
    do
        cat /sys/block/${dev}/queue/${param}
    done | sort | uniq -c
done
echo -n "queue_depth = "
for dev in ${dev_list}
do
    cat /sys/block/${dev}/device/queue_depth
done | sort | uniq -c
Determining block device parameters
To send down large-size requests (greater than 512 KB on 4 KB page size systems):
Consider increasing max_segments to 1024 or greater
Set max_sectors_kb equal to max_hw_sectors_kb
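A minimal sketch of the second point, using the same /dev/disk/by-path discovery as the other scripts in this paper; the change is not persistent across reboots.
#!/bin/sh
# Set max_sectors_kb to the hardware limit for each FC-attached sd device.
for dev in $(ls -l /dev/disk/by-path | grep -w fc | awk -F'/' '{print $3}')
do
    cat /sys/block/${dev}/queue/max_hw_sectors_kb > /sys/block/${dev}/queue/max_sectors_kb
done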
SCSI device parameters appear in the following table. Values that can be changed are shown as (rw):
Parameter: description [value]

hw_sector_size (ro): Hardware sector size in bytes [512]
max_hw_sectors_kb (ro): Maximum number of kilobytes supported in a single data transfer [32767]
max_sectors_kb (rw): Maximum number of kilobytes that the block layer will allow for a file system request [512]
nomerges (rw): Enable or disable lookup logic [0 - all merges are enabled]
nr_requests (rw): Number of read or write requests that can be allocated in the block layer [128]
read_ahead_kb (rw): Read-ahead size in kilobytes [8192]
rq_affinity (rw): Complete a request on the same CPU that queued it; 1 - CPU group affinity, 2 - strict CPU affinity [1 - CPU group affinity]
scheduler (rw): I/O scheduler in use [deadline]
Table 5: SCSI subsystem tunable parameters



Using max_sectors_kb:
By default, Linux devices are configured for a maximum 512 KB I/O size. When using a larger file system block size, increase the max_sectors_kb parameter. The max_sectors_kb value must be less than or equal to max_hw_sectors_kb.

The default queue_depth is 32 and represents the total number of transfers that can be queued to a
device. You can check the queue depth by examining /sys/block/<device>/device/queue_depth.
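For example (sdb is a placeholder; the change is temporary and the driver must allow runtime updates):
cat /sys/block/sdb/device/queue_depth
echo 64 > /sys/block/sdb/device/queue_depth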

Linux XFS file system create options

Useful commands:
# getconf PAGESIZE
# man mkfs.xfs

Note: XFS writes are not guaranteed to be committed to disk unless the program issues an fsync() call afterwards.

Red Hat: Optimizing for a large number of files
If necessary, you can increase the amount of space allowed for inodes using the
mkfs.xfs -i maxpct= option. The default percentage of space allowed for inodes varies by file
system size. For example, a file system between 1 TB and 50 TB in size will allocate 5% of the total space for inodes.
Red Hat: Optimizing for a large number of files in a single directory
Normally, the XFS file system directory block size is the same as the file system block size.
Choose a larger value for the mkfs.xfs -n size= option, if there are many millions of directory
entries.
Red Hat: Optimizing for concurrency
Increase the number of allocation groups on systems with many processors.

Red Hat: Optimizing for applications that use extended attributes
1. Increasing inode size might be necessary if applications use extended attributes.
2. Multiple attributes can be stored in an inode provided that they do not exceed the maximum size
limit (in bytes) for attribute+value.



Red Hat: Optimizing for sustained metadata modifications
1. Systems with large amounts of RAM could benefit from larger XFS log sizes.
2. The log should be aligned with the device stripe size (the mkfs command may do this
automatically)

The metadata log can be placed on another device, for example, a solid-state drive (SSD) to reduce disk
seeks.
Specify the stripe unit and width for hardware RAID devices
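An illustrative mkfs.xfs invocation that combines several of the options discussed above: 16 allocation groups, a 64 KB stripe unit with an 8-way stripe width, 512-byte inodes, and a 128 MB internal log. The device name, geometry, and sizes are placeholders and must be matched to your array.
# mkfs.xfs -d agcount=16,su=64k,sw=8 -i size=512 -l size=128m /dev/mapper/mpathb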

Syntax (options not related to performance are omitted)
# mkfs.xfs [ options ] device

-b block_size_options
size=<int> -- size in bytes
default 4096
minimum 512
maximum 65536 (must be <= PAGESIZE)

-d data_section_options
More allocation groups imply that more parallelism can be achieved when
allocating blocks and inodes
agcount=<int> -- number of allocation groups
agsize
name
file
size
sunit
su
swidth
sw



-i inode_options
size
log
perblock
maxpct
align
attr

-l log_section_options
internal
logdev
size
version
sunit
su
lazy-count

-n naming_options
size
log
version

-r realtime_section_options
rtdev
extsize
size

-s sector_size


log
size

-N
Dry run. Print out filesystem parameters without creating the filesystem.
Listing 5: Create options for XFS file systems

Linux XFS file system mount options

Useful commands:
# xfs_info
# xfs_quota
# grep xfs /proc/mounts
# mount | grep xfs

nobarrier
noatime
inode64 -- XFS is allowed to create inodes at any location in the file system. Starting from kernel 2.6.35, XFS file systems will mount either with or without the inode64 option.
logbsize -- Larger values can improve performance. Smaller values should be used with fsync-heavy workloads.
delaylog -- RAM is used to reduce the number of changes to the log.

The Red Hat 6.2 Release Notes mention that XFS has been improved in order to better handle metadata-intensive workloads. The default mount options have been updated to use delayed logging.
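An illustrative /etc/fstab entry combining these options; the device and mount point are placeholders, and nobarrier should only be used when the storage provides protected write caching (for example, battery-backed or array-side caches):
/dev/mapper/mpathb  /data  xfs  noatime,nobarrier,inode64,logbsize=256k  0 0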



Red Hat tuned
Red Hat Enterprise Linux has a tuning package called tuned which sets certain parameters based on a
chosen profile.

Useful commands:
# tuned-adm help
# tuned-adm list
# tuned-adm active
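To activate and then verify the enterprise-storage profile discussed below:
# tuned-adm profile enterprise-storage
# tuned-adm active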

The enterprise-storage profile contains the following files. When comparing the enterprise-storage profile
with the throughput-performance profile, some files are identical:
# cd /etc/tune-profiles
# ls enterprise-storage/
ktune.sh ktune.sysconfig sysctl.ktune tuned.conf
# sum throughput-performance/* enterprise-storage/* | sort
03295 2 throughput-performance/sysctl.s390x.ktune
08073 2 enterprise-storage/sysctl.ktune
15419 2 enterprise-storage/ktune.sysconfig
15419 2 throughput-performance/ktune.sysconfig
15570 1 enterprise-storage/ktune.sh
43756 1 enterprise-storage/tuned.conf
43756 1 throughput-performance/tuned.conf
47739 2 throughput-performance/sysctl.ktune
57787 1 throughput-performance/ktune.sh

ktune.sh
The enterprise-storage ktune.sh is the same as the throughput-performance ktune.sh but adds
functionality for disabling or enabling I/O barriers. The enterprise-storage profile is preferred when using
XIV storage. Important functions include:
set_cpu_governor performance -- uses cpuspeed to set the CPU frequency governor
enable_transparent_hugepages -- enables transparent huge pages


remount_partitions nobarrier -- disables write barriers
multiply_disk_readahead -- modifies /sys/block/sd*/queue/read_ahead_kb

ktune.sysconfig
ktune.sysconfig is identical for both throughput-performance and enterprise-storage profiles:
# grep -h ^[A-Za-z] enterprise-storage/ktune.sysconfig \
    throughput-performance/ktune.sysconfig | sort | uniq -c
2 ELEVATOR="deadline"
2 ELEVATOR_TUNE_DEVS="/sys/block/{sd,cciss,dm-}*/queue/scheduler"
2 SYSCTL_POST="/etc/sysctl.conf"
2 USE_KTUNE_D="yes"
Listing 6: Sorting the ktune.sysconfig file

sysctl.ktune
sysctl.ktune is functionally identical for both throughput-performance and enterprise-storage profiles:
# grep -h ^[A-Za-z] enterprise-storage/sysctl.ktune \
    throughput-performance/sysctl.ktune | sort | uniq -c
2 kernel.sched_min_granularity_ns = 10000000
2 kernel.sched_wakeup_granularity_ns = 15000000
2 vm.dirty_ratio = 40
Listing 7: Sorting the sysctl.ktune file

tuned.conf
tuned.conf is identical for both throughput-performance and enterprise-storage profiles:
# grep -h ^[A-Za-z] enterprise-storage/tuned.conf \
    throughput-performance/tuned.conf | sort | uniq -c
12 enabled=False
Listing 8: Sorting the tuned.conf file




Linux multipath
Keep it simple: configure just enough paths for redundancy and performance.

Typical output from the multipath -ll command includes lines such as:
features='1 queue_if_no_path' hwhandler='0' wp=rw
policy='round-robin 0' prio=-1

If features='1 queue_if_no_path' is in effect, set 'no_path_retry N' and then remove the features='1 queue_if_no_path' option, or set 'features 0'.

Multipath configuration defaults
Parameter Default value
polling_interval 5
udev_dir /dev
multipath_dir /lib/multipath
find_multipaths no
verbosity 2
path_selector round-robin 0
path_grouping_policy failover
getuid_callout /lib/udev/scsi_id -- whitelisted --device=/dev/%n
prio const
features queue_if_no_path
path_checker directio
failback manual
rr_min_io 1000
rr_weight uniform
no_path_retry 0
user_friendly_names no
queue_without_daemon yes
flush_on_last_del no
max_fds determined by the calling process


checker_timer /sys/block/sdX/device/timeout
fast_io_fail_tmo determined by the OS
dev_loss_tmo determined by the OS
mode determined by the process
uid determined by the process
gid determined by the process
Table 6: Multipath configuration options
The default load balancing policy (path_selector) is round-robin 0. Other choices are queue-length 0 and
service-time 0.
Consider using the XIV Linux host attachment kit to create the multipath configuration file.
# cat /etc/multipath.conf
devices {
device {
vendor "IBM"
product "2810XIV"
path_selector "round-robin 0"
path_grouping_policy multibus
rr_min_io 15
path_checker tur
failback 15
no_path_retry 5
#polling_interval 3
}
}

defaults {
...
user_friendly_names yes
...


}
Listing 9: A sample multipath.conf file
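After editing multipath.conf, a typical sequence on RHEL 6 to apply and verify the configuration (a sketch; some changes may additionally require flushing and rescanning the multipath maps):
# service multipathd reload
# multipath -ll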

Sample scripts
You can use the following script to query various settings related to I/O tuning:
#!/bin/sh
# Query scheduler, hugepages, and readahead settings for fibre channel scsi devices
###

#hba_pci_loc=$(lspci | grep HBA | awk '{print $1}')

echo "Linux: HUGEPAGES"
cat /sys/kernel/mm/redhat_transparent_hugepage/enabled
echo ""

echo "Linux: SCHEDULER"
cat /sys/block/*/queue/scheduler | grep -v none | sort | uniq -c
echo ""

echo "FC: max_sectors_kb"
ls -l /dev/disk/by-path | grep -w fc | awk -F'/' '{print $3}' | xargs -n1 -i cat /sys/block/{}/queue/max_sectors_kb | sort | uniq -c
echo ""

echo "Linux: dm-* READAHEAD"
ls /dev/dm-* | xargs -n1 -i blockdev --getra {} | sort | uniq -c
blockdev --report /dev/dm-*
echo ""

echo "Linux: FC disk sd* READAHEAD"
ls -l /dev/disk/by-path | grep -w fc | awk -F'/' '{print $3}' | xargs -n1 -i blockdev --getra /dev/{} | sort | uniq -c
ls -l /dev/disk/by-path | grep -w fc | awk -F'/' '{print $3}' | xargs -n1 -i blockdev --report /dev/{} | grep dev
echo ""
Listing 10: Querying scheduler, huge page, and readahead settings



Summary
This white paper presented an end-to-end approach for Linux I/O tuning in a typical data center
environment consisting of external storage subsystems, storage area network (SAN) switches, IBM
System x Intel servers, Fibre Channel HBAs and 64-bit Red Hat Enterprise Linux.
Visit the links in the Resources section for more information on topics presented in this white paper.


Resources
The following websites provide useful references to supplement the information contained in this paper:
XIV Redbooks
ibm.com/redbooks/abstracts/sg247659.html
ibm.com/redbooks/abstracts/sg247904.html

Note: IBM Redbooks are not official IBM product documentation.

XIV Infocenter
http://publib.boulder.ibm.com/infocenter/ibmxiv/r2

XIV Host Attachment Kit for RHEL can be downloaded from Fix Central
ibm.com/support/fixcentral

Qlogic
http://driverdownloads.qlogic.com
ftp://ftp.qlogic.com/outgoing/linux/firmware/rpms

Red Hat Enterprise Linux Documentation
http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux

IBM Advanced Settings Utility
ibm.com/support/entry/portal/docdisplay?Indocid=TOOL-ASU

Linux
Documentation/kernel-parameters.txt
Documentation/block/queue-sysfs.txt
Documentation/filesystems/xfs.txt
drivers/scsi/qla2xxx
http://xfs.org/index.php/XFS_FAQ



About the author
David Quenzler is a consultant in IBM Systems and Technology Group ISV Enablement Organization.
He has more than 15 years of experience working with the IBM System x (Linux) and IBM Power Systems
(IBM AIX) platforms. You can reach David at [email protected].



Trademarks and special notices
Copyright IBM Corporation 2012.
References in this document to IBM products or services do not imply that IBM intends to make them
available in every country.
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business
Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked
terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these
symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information
was published. Such trademarks may also be registered or common law trademarks in other countries. A
current list of IBM trademarks is available on the Web at "Copyright and trademark information" at
www.ibm.com/legal/copytrade.shtml.
Java and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or
its affiliates.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the
United States, other countries, or both.
Intel, Intel Inside (logos), MMX, and Pentium are trademarks of Intel Corporation in the United States,
other countries, or both.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Linux is a trademark of Linus Torvalds in the United States, other countries, or both.
SET and the SET Logo are trademarks owned by SET Secure Electronic Transaction LLC.
Other company, product, or service names may be trademarks or service marks of others.
Information is provided "AS IS" without warranty of any kind.
All customer examples described are presented as illustrations of how those customers have used IBM
products and the results they may have achieved. Actual environmental costs and performance
characteristics may vary by customer.
Information concerning non-IBM products was obtained from a supplier of these products, published
announcement material, or other publicly available sources and does not constitute an endorsement of
such products by IBM. Sources for non-IBM list prices and performance numbers are taken from publicly
available information, including vendor announcements and vendor worldwide homepages. IBM has not
tested these products and cannot confirm the accuracy of performance, capability, or any other claims
related to non-IBM products. Questions on the capability of non-IBM products should be addressed to the
supplier of those products.
All statements regarding IBM future direction and intent are subject to change or withdrawal without
notice, and represent goals and objectives only. Contact your local IBM office or IBM authorized reseller
for the full text of the specific Statement of Direction.
Some information addresses anticipated future capabilities. Such information is not intended as a
definitive statement of a commitment to specific levels of performance, function or delivery schedules with
respect to any future products. Such commitments are only made in IBM product announcements. The


information is presented here to communicate IBM's current investment and development activities as a
good faith effort to help with our customers' future planning.
Performance is based on measurements and projections using standard IBM benchmarks in a controlled
environment. The actual throughput or performance that any user will experience will vary depending
upon considerations such as the amount of multiprogramming in the user's job stream, the I/O
configuration, the storage configuration, and the workload processed. Therefore, no assurance can be
given that an individual user will achieve throughput or performance improvements equivalent to the
ratios stated here.
Photographs shown are of engineering prototypes. Changes may be incorporated in production models.
Any references in this information to non-IBM websites are provided for convenience only and do not in
any manner serve as an endorsement of those websites. The materials at those websites are not part of
the materials for this IBM product and use of those websites is at your own risk.
