CX4 Performance

The document discusses performance characteristics for various components in a storage system, including Storage Processors (SPs), disks/RAID groups, LUNs, and front-end ports. It describes basic characteristics like utilization, queue length, response time, throughput and bandwidth. It also covers some advanced SP characteristics like dirty page percentage, flush ratio, and number of write cache flushes. Disk/RAID group characteristics are also provided, noting that utilization for these components can indicate bottlenecks.


Performance Characteristics

The Analyzer Performance characteristics apply to disks/RAID Groups, SPs, LUNs, metaLUNs, snapshot sessions, asynchronous mirrors, and/or front-end ports as shown in the sections and tables that follow. There are several categories of performance characteristics. The categories available to customers are Basic (default) and Advanced. The Advanced category includes both Basic and Advanced characteristics. You can change between the Basic and Advanced characteristic display by selecting Tools -> Analyzer -> Customize and checking or clearing the Advanced box. The Basic and Advanced performance characteristics are shown in separate tables for each item.

SP
The Storage Processor (SP) processes all I/Os within the storage system: host requests, management and maintenance tasks, and operations related to replication or migration features. In Navisphere Analyzer, the statistics for an SP are based on the I/O workload from its attached hosts. Utilization and write cache metrics, however, also reflect any internal processing that is occurring.

Basic Characteristics

Utilization
Description: The fraction of an observation period during which the component is busy serving incoming requests. An SP or disk that shows 100% (or close to 100%) utilization is a system bottleneck, since an increase in the overall workload will not increase the component throughput; the component has reached its saturation point. Since a LUN is considered busy if any of its disks is busy, LUN utilization usually presents a pessimistic view; a high LUN utilization value does not necessarily indicate that the LUN is approaching its maximum capacity.
Comment: When the SP becomes the bottleneck, utilization will be at or close to 100%. An increase in workload will have no further impact on SP throughput, but the I/O response time will start increasing more aggressively.

Queue Length
Description: The average number of requests within a certain time interval waiting to be served by the component, including the one in service.
Comment: A queue length of zero (as an average) indicates an idle SP. If three requests arrive at an idle SP at the same time, only one of them can be served immediately; the other two must wait in the queue, resulting in a queue length of three.

Response Time (ms)
Description: The average time, in milliseconds, required for one request to pass through the component, including its waiting time.
Comment: The higher the queue length for the SP, the more requests are waiting in its queue, increasing the average response time of a single request. For a given workload, queue length and response time are directly proportional.
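The statement that queue length and response time are directly proportional for a given workload follows from Little's law (average queue length equals throughput multiplied by average response time). A minimal sketch with invented SP figures, purely for illustration:

    # Little's law: average queue length = throughput (IO/s) * average response time (s).
    # Hypothetical SP figures, for illustration only.
    throughput_iops = 5000            # total throughput observed at the SP
    response_time_ms = 4.0            # average response time per request

    queue_length = throughput_iops * (response_time_ms / 1000.0)
    print(queue_length)               # 20.0 requests outstanding on average

    # For a fixed workload (constant throughput), doubling the queue length
    # implies the average response time doubles as well.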

Total Bandwidth (MB/s)
Description: The average amount of data, in Mbytes, passed through the component per second. Total bandwidth includes both read and write requests.
Comment: Larger requests usually result in a higher total bandwidth than smaller requests.

Total Throughput (IO/s)
Description: The average number of requests that pass through the component per second. Total throughput includes both read and write requests.
Comment: Since smaller requests need less processing time, they usually result in a higher total throughput than larger requests.

Read Bandwidth (MB/s)
Description: The average number of Mbytes read that were passed through the component per second.
Comment: Larger requests usually result in a higher bandwidth than smaller requests.

Read Size (KB)
Description: The average read request size in Kbytes.
Comment: This number indicates whether the overall read workload is oriented more toward throughput (I/Os per second) or bandwidth (Mbytes/second). For a finer distinction of I/O sizes, use an IO Size Distribution chart for the LUNs.

Read Throughput (IO/s)
Description: The average number of read requests passed through the component per second.
Comment: Since smaller requests need less processing time, they usually result in a higher read throughput than larger requests.

Write Bandwidth (MB/s)
Description: The average number of Mbytes written that were passed through the component per second.
Comment: Larger requests usually result in a higher bandwidth than smaller requests.

Write Size (KB)
Description: The average write request size in Kbytes.
Comment: This number indicates whether the overall write workload is oriented more toward throughput (I/Os per second) or bandwidth (Mbytes/second). For a finer distinction of I/O sizes, use an IO Size Distribution chart for the LUNs.

Write Throughput (IO/s)
Description: The average number of write requests passed through the component per second.
Comment: Since smaller requests need less processing time, they usually result in a higher write throughput than larger requests.

Service Time (ms)
Description: Time, in milliseconds, a request spent being serviced by the component. It does not include time waiting in a queue.
Comment: Larger requests usually have a longer service time than smaller requests. Service time is mainly a characteristic of the system component; however, larger I/Os take longer and therefore usually result in lower throughput (IO/s) but better bandwidth (Mbytes/s).

Advanced-only Characteristics (the Advanced category also includes the Basic characteristics)

Dirty Pages (%) (Advanced Only)
Description: Percentage of cache pages owned by this SP that are dirty (pages committed to cache but not yet written to disk).
Comment: In an optimal environment, the dirty-pages percentage will not exceed the high watermark for a long period. This metric shows the level of the write cache at the last poll time; it is not an average over the last polling interval.

Flush Ratio (Advanced Only)
Description: The fraction of flush operations performed compared to the number of write requests. Since the ratio is a measure of back-end activity compared to front-end activity, a lower number indicates better performance.
Comment: A flush operation is a write of a portion of the cache to make room for incoming write data.

MBs Flushed/s (MB/s) (Advanced Only)
Description: The number of megabytes per second written from the write cache to the disks.
Comment: A flush operation writes contiguous data out to disk. This includes forced flushes, flushes resulting from watermark processing, and flushes due to idleness. The value is a measure of back-end activity.

High Water Flush On (Advanced Only)
Description: Number of times, since the last sample, that the number of modified pages in the write cache reached the high watermark. The higher the number, the greater the write workload coming from the host.
Comment: This number will only increase if the percentage of Dirty Pages has previously reached the low watermark.

Idle Flush On (Advanced Only)
Description: Number of times, since the last sample, that the write cache started flushing dirty pages to disk due to a given idle period.
Comment: Idle flushes indicate a low workload.

Low Water Flush Off (Advanced Only)
Description: Number of times, since the last sample, that the number of modified pages in the write cache reached the low watermark, at which point the SP stops flushing the cache. The higher the number, the greater the write workload coming from the host.
Comment: This number will only increase if the percentage of Dirty Pages has previously reached the high watermark. It should be close to the High Water Flush On number.

Write Cache Flushes/s (Advanced Only)
Description: Number of times per second that the write cache performed a flush operation. A flush operation is a write of a portion of the cache for any reason; it includes forced flushes, flushes resulting from the high watermark, and flushes from an idle state.
Comment: This value indicates back-end workload.
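The flush ratio description above amounts to a simple division of two counters. A minimal sketch with invented counter values (the variable names are illustrative, not Analyzer metric identifiers):

    # Flush Ratio = flush operations / front-end write requests (per the description above).
    # Hypothetical counters over one polling interval, for illustration only.
    write_cache_flushes = 1200        # flush operations performed by the write cache
    host_write_requests = 6000        # write requests received from hosts

    flush_ratio = write_cache_flushes / host_write_requests
    print(flush_ratio)                # 0.2: back-end activity is 20% of front-end write activity
    # A lower ratio means more writes are absorbed or coalesced by the cache before reaching disk.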

Average Busy Queue Length (Advanced Only)
Description: Average number of requests waiting at a busy system component to be serviced, including the request that is currently in service.
Comment: Since this queue length is counted only when the SP is not idle, the value indicates the frequency variation (burstiness) of incoming requests. The higher the value, the bigger the burst and the longer the average response time at this component. In contrast, the average queue length also includes idle periods when no requests are pending. If there is just one outstanding request for 50% of the time and the SP is idle for the other 50%, the average busy queue length will be 1; the average queue length, however, will be 0.5.
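A minimal sketch of the 50/50 example above, assuming the interval is split into two equal halves (one busy with a single outstanding request, one idle):

    # Average queue length weights every sample, idle or not;
    # average busy queue length weights only the samples where the SP is busy.
    samples = [1, 0]                                               # 50% one request, 50% idle
    avg_queue_length = sum(samples) / len(samples)                 # 0.5
    busy_samples = [s for s in samples if s > 0]
    avg_busy_queue_length = sum(busy_samples) / len(busy_samples)  # 1.0
    print(avg_queue_length, avg_busy_queue_length)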

Disk/RAID Group
As the slowest devices in a storage system, disk drives are very often responsible for performance-related issues. We therefore recommend that you pay close attention to disk drives when analyzing performance problems. RAID group values are an aggregate of the performance values of their disks.

Basic Characteristics

Utilization
Description: The fraction of an observation period during which the component is busy serving incoming requests. An SP or disk that shows 100% (or close to 100%) utilization is a system bottleneck, since an increase in the overall workload will not increase the component throughput; the component has reached its saturation point. Since a LUN is considered busy if any of its disks is busy, LUN utilization usually presents a pessimistic view; a high LUN utilization value does not necessarily indicate that the LUN is approaching its maximum capacity.
Comment: Since a RAID group can have multiple partitions, the disk utilization is the result of servicing I/Os that belong to all LUNs within this RAID group.

Queue Length
Description: The average number of requests within a certain time interval waiting to be served by the component, including the one in service.
Comment: A queue length of zero (as an average) indicates an idle disk. If three requests arrive at an idle disk at the same time, only one of them can be served immediately; the other two must wait in the queue, resulting in a queue length of three.

Response Time (ms)
Description: The average time, in milliseconds, required for one request to pass through the component, including its waiting time.
Comment: The higher the queue length for the disk, the more requests are waiting in its queue, increasing the average response time of a single request. For a given workload, queue length and response time are directly proportional.

Total Bandwidth (MB/s)
Description: The average amount of data, in Mbytes, passed through the component per second. Total bandwidth includes both read and write requests.
Comment: Larger requests usually result in a higher total bandwidth than smaller requests.

Total Throughput (IO/s)
Description: The average number of requests that pass through the component per second. Total throughput includes both read and write requests.
Comment: Since smaller requests need less processing time, they usually result in a higher total throughput than larger requests.

Read Bandwidth (MB/s)
Description: The average number of Mbytes read that were passed through the component per second.
Comment: Larger requests usually result in a higher bandwidth than smaller ones.

Read Size (KB)
Description: The average read request size in Kbytes.
Comment: This number indicates whether the read workload is oriented more toward throughput (I/Os per second) or bandwidth (Mbytes/second). For a finer distinction of I/O sizes, use an IO Size Distribution chart.

Read Throughput (IO/s)
Description: The average number of read requests passed through the component per second.
Comment: Since smaller requests need less processing time, they usually result in a higher read throughput than larger requests.

Write Bandwidth (MB/s)
Description: The average number of Mbytes written that were passed through the component per second.
Comment: Larger requests usually result in a higher bandwidth than smaller ones.

Write Size (KB)
Description: The average write request size in Kbytes.
Comment: This number indicates whether the write workload is oriented more toward throughput (I/Os per second) or bandwidth (Mbytes/second). For a finer distinction of I/O sizes, use an IO Size Distribution chart. Sequential writes can get coalesced in the write cache and might result in larger disk requests when flushed to the disks.

Write Throughput (IO/s)
Description: The average number of write requests passed through the component per second.
Comment: Since smaller requests need less processing time, they usually result in a higher write throughput than larger requests.

Average Seek Distance (GB)
Description: Average seek distance in gigabytes.
Comment: Longer seek distances result in longer seek times and therefore higher response times. Defragmentation might help to reduce seek distances.

Service Time (ms)
Description: Time, in milliseconds, a request spent being serviced by the component. It does not include time waiting in a queue.
Comment: Service time is mainly a characteristic of the system component. However, larger I/Os take longer and therefore usually result in lower throughput (IO/s) but better bandwidth (Mbytes/s). If there are at least three outstanding requests to the disk, the drive will optimize the order of execution based on their locality, which can result in a shorter service time.

Advanced-only Characteristics (the Advanced category also includes the Basic characteristics)

Average Busy Queue Length (Advanced Only)
Description: Average number of requests waiting at a busy system component to be serviced, including the request that is currently in service.
Comment: Since this queue length is counted only when the disk is not idle, the value indicates the frequency variation (burstiness) of incoming requests. The higher the value, the bigger the burst and the longer the average response time at this component. In contrast, the average queue length also includes idle periods when no requests are pending. If there is just one outstanding request for 50% of the time and the disk is idle for the other 50%, the average busy queue length will be 1; the average queue length, however, will be 0.5.

Average Seek Distance (GB) (Advanced Only)
Description: Average seek distance in gigabytes.
Comment: Longer seek distances result in longer seek times and therefore higher response times. Defragmentation might help to reduce seek distances.
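A general queueing relationship, not a formula taken from this document, ties three of these metrics together: utilization is approximately throughput multiplied by average service time. A minimal sketch with invented disk figures:

    # Utilization law: utilization ~= throughput (IO/s) * average service time (s).
    # Hypothetical disk figures, for illustration only.
    disk_throughput_iops = 150
    service_time_ms = 6.0

    utilization = disk_throughput_iops * (service_time_ms / 1000.0)
    print(utilization)        # 0.9, i.e. the disk is busy ~90% of the time and close to saturation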

LUN
A LUN is an abstract object whose performance depends on various factors. The main question is whether a host I/O can be satisfied by the cache or not: a cache hit does not require any disk access, whereas a cache miss requires one or more disk accesses to complete the request. Most cache statistics are available only for the LUN object. Navisphere Analyzer reports performance statistics for three types of LUNs: regular host LUNs, metaLUNs, and component LUNs (the underlying LUNs of a metaLUN).

Basic Characteristics

Utilization
Description: The fraction of an observation period during which the component is busy serving incoming requests. An SP or disk that shows 100% (or close to 100%) utilization is a system bottleneck, since an increase in the overall workload will not increase the component throughput; the component has reached its saturation point. Since a LUN is considered busy if any of its disks is busy, LUN utilization usually presents a pessimistic view; a high LUN utilization value does not necessarily indicate that the LUN is approaching its maximum capacity.
Comment: When the LUN becomes the bottleneck, utilization will be at or close to 100%. However, since I/Os can get serviced by multiple disks, an increase in workload might still result in a higher throughput.

Queue Length
Description: The average number of requests within a certain time interval waiting to be served by the component, including the one in service.
Comment: A queue length of zero (as an average) indicates an idle LUN. Since there can be idle times during the observed time period, the average queue length can also be smaller than 1. If you want to know the average number of outstanding requests when the LUN is busy, look at the average busy queue length.

Response Time (ms)
Description: The average time, in milliseconds, required for one request to pass through the component, including its waiting time.
Comment: The higher the queue length for a LUN, the more requests are waiting in its queue, increasing the average response time of a single request. For a given workload, queue length and response time are directly proportional.

Total Bandwidth (MB/s)
Description: The average amount of data, in Mbytes, passed through the component per second. Total bandwidth includes both read and write requests.
Comment: Larger requests usually result in a higher total bandwidth than smaller requests.

Total Throughput (IO/s)
Description: The average number of requests that pass through the component per second. Total throughput includes both read and write requests.
Comment: Since smaller requests need less processing time, they usually result in a higher total throughput than larger requests.

Read Bandwidth (MB/s)
Description: The average number of Mbytes read that were passed through the component per second.
Comment: Larger requests usually result in a higher bandwidth than smaller requests.

Read Size (KB)
Description: The average read request size in Kbytes.
Comment: This number indicates whether the overall read workload is oriented more toward throughput (I/Os per second) or bandwidth (Mbytes/second). For a finer distinction of I/O sizes, use an IO Size Distribution chart for this LUN.

Read Throughput (IO/s)
Description: The average number of read requests passed through the component per second.
Comment: Since smaller requests need less processing time, they usually result in a higher read throughput than larger requests.

Write Bandwidth (MB/s)
Description: The average number of Mbytes written that were passed through the component per second.
Comment: Larger requests usually result in a higher bandwidth than smaller requests.

Write Size (KB)
Description: The average write request size in Kbytes.
Comment: This number indicates whether the overall write workload is oriented more toward throughput (I/Os per second) or bandwidth (Mbytes/second). For a finer distinction of I/O sizes, use an IO Size Distribution chart for the LUNs.

Write Throughput (IO/s)
Description: The average number of write requests passed through the component per second.
Comment: Since smaller requests need less processing time, they usually result in a higher write throughput than larger requests.

Service Time (ms)
Description: Time, in milliseconds, a request spent being serviced by the component. It does not include time waiting in a queue.
Comment: Larger requests usually have a longer service time than smaller requests. Service time is mainly a characteristic of the system component; however, larger I/Os take longer and therefore usually result in lower throughput (IO/s) but better bandwidth (Mbytes/s).

Advanced-only Characteristics (the Advanced category also includes the Basic characteristics)

Full Stripe Writes/s (Advanced Only)
Description: Average number of write requests per second that spanned a whole stripe (all disks in a LUN). This metric is applicable only to LUNs that are part of a RAID 5 or RAID 3 group.
Comment: For these RAID types, full stripe writes are the most efficient, since data and parity can be written out to the disks without having to pre-read any old data or parity first.

Prefetch Bandwidth (MB/s)
Description: The amount of data per second that has been prefetched for this particular LUN.
Comment: The amount of prefetched data depends on the prefetch settings for this LUN. It indicates the sequentiality of the workload for this LUN.

Used Prefetches (%) (Advanced Only)
Description: An indication of prefetching efficiency.
Comment: To improve read bandwidth, two consecutive requests trigger prefetching, filling the read cache with data before it is requested. Sequential requests therefore receive their data from the read cache instead of from the disks, which results in a lower response time and higher throughput. As the percentage of sequential requests rises, so does the percentage of used prefetches.

Read Cache Hits/s (Advanced Only)
Description: The number of read requests per second that were satisfied by either the write or the read cache.
Comment: A read cache hit occurs when recently accessed data is re-referenced while it is still in the cache.

Read Cache Misses/s (Advanced Only)
Description: The rate of read requests that could not be satisfied by the SP cache and therefore required a disk access.

Reads From Write Cache/s
Description: Average number of read requests per second that were satisfied by the write cache only.
Comment: Reads from write cache occur when recently written data is read again while it is still in the write cache. This is a subset of read cache hits, which includes requests satisfied by either the write or the read cache.
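A full stripe write is possible only when a write is aligned to, and covers, the data portion of a whole stripe. The following sketch checks that condition for a hypothetical 4+1 RAID 5 group with a 64 KB stripe element; the geometry values are assumptions for illustration, not taken from this document:

    # Hypothetical 4+1 RAID 5 group: 4 data disks, 64 KB stripe element.
    data_disks = 4
    element_kb = 64
    stripe_data_kb = data_disks * element_kb        # 256 KB of data per stripe

    def is_full_stripe_write(offset_kb, size_kb):
        # A full stripe write starts on a stripe boundary and covers whole stripes,
        # so parity can be computed without pre-reading old data or parity.
        return offset_kb % stripe_data_kb == 0 and size_kb > 0 and size_kb % stripe_data_kb == 0

    print(is_full_stripe_write(0, 256))    # True: covers one complete stripe
    print(is_full_stripe_write(64, 256))   # False: unaligned, needs a read-modify-write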

Reads From Read Cache/s
Description: Average number of read requests per second that were satisfied by the read cache only.
Comment: Reads from read cache occur when data that has been recently read or prefetched is re-read while it is still in the read cache. This is a subset of read cache hits, which includes requests satisfied by either the write or the read cache.

Read Cache Hit Ratio (Advanced Only)
Description: The fraction of read requests served from both the read and write caches versus the total number of read requests to this LUN.
Comment: A higher ratio indicates better read performance.

Write Cache Rehits/s (Advanced Only)
Description: The number of write requests per second that were satisfied by the write cache because they had been referenced before and not yet flushed to the disks.
Comment: Write cache rehits occur when recently accessed data is referenced again while it is still in the write cache. This is a subset of Write Cache Hits.

Write Cache Hit Ratio (Advanced Only)
Description: The fraction of write requests that were satisfied by the write cache without requiring any disk access, compared to the total number of write requests to this LUN.
Comment: A higher ratio indicates better write performance. Write requests that are not write cache hits are referred to as write cache misses.

Write Cache Rehit Ratio (Advanced Only)
Description: The fraction of write requests that were satisfied by the write cache because they had been referenced before and not yet flushed to the disks, compared to the total number of write requests to this LUN.
Comment: This is a measure of how often the write cache succeeded in eliminating a write operation to disk. While improving the rehit ratio is useful, it is more beneficial to reduce the number of forced flushes.
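A minimal sketch of how these hit ratios relate to the per-second counters described above, using invented sample values (Analyzer computes the ratios itself; this only restates the definitions):

    # Read cache hit ratio = read requests served from (read or write) cache / total reads.
    # Write cache hit ratio = writes satisfied by the write cache / total writes.
    read_cache_hits_per_s = 800       # reads satisfied by read or write cache
    read_throughput_iops = 1000       # total read requests per second
    write_cache_hits_per_s = 450
    write_throughput_iops = 500

    read_hit_ratio = read_cache_hits_per_s / read_throughput_iops     # 0.8
    write_hit_ratio = write_cache_hits_per_s / write_throughput_iops  # 0.9
    print(read_hit_ratio, write_hit_ratio)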

Write Cache Hits/s (Advanced Only)
Description: The number of write requests per second that were satisfied by the write cache without requiring any disk access. Write cache hits are either requests that have been referenced before and not yet flushed to the disks (rehits) or new write requests that did not trigger any forced flushes.

Write Cache Misses/s (Advanced Only)
Description: The number of write requests per second that could not be satisfied by the write cache alone and also required additional disk access.
Comment: Examples of write cache misses are write requests that bypass the write cache due to their size and write requests that trigger forced flushes.

Forced Flushes/s (Advanced Only)
Description: Number of times per second the cache had to flush pages to disk to free space for incoming write requests. Forced flushes indicate that the incoming workload is higher than the back-end workload. A relatively high number over a long period of time suggests that you spread the load over more disks.
Comment: Forced flushes are a measure of how often write requests have to wait for disk I/O rather than being satisfied by an empty slot in the write cache. In most well-performing systems this should be zero most of the time.

Disk Crossing (%) (Advanced Only)
Description: Percentage of requests that require I/O to at least two disks, compared to the total number of server requests.
Comment: A single disk crossing can involve more than two disk drives; that is, more than two stripe element crossings. Disk crossings relate to the LUN stripe element size. Generally, a low value is needed for good performance.

Average Busy Queue Length (Advanced Only)
Description: Average number of requests waiting at a busy system component to be serviced, including the request that is currently in service.
Comment: Since this queue length is counted only when the LUN is not idle, the value indicates the frequency variation (burstiness) of incoming requests. The higher the value, the bigger the burst and the longer the average response time at this component. In contrast, the average queue length also includes idle periods when no requests are pending. If there is just one outstanding request for 50% of the time and the LUN is idle for the other 50%, the average busy queue length will be 1; the average queue length, however, will be 0.5.
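A request causes a disk crossing when it spans a stripe element boundary and therefore touches at least two disks. The sketch below illustrates the boundary check with a hypothetical 64 KB stripe element size; the offsets and sizes are invented:

    # A request crosses disks when it spans a stripe element boundary.
    element_size_kb = 64                        # hypothetical stripe element size

    def crosses_element(offset_kb, size_kb, element_kb=element_size_kb):
        # True if the request touches more than one stripe element.
        first = offset_kb // element_kb
        last = (offset_kb + size_kb - 1) // element_kb
        return last > first

    print(crosses_element(0, 64))    # False: fits exactly within one element
    print(crosses_element(60, 16))   # True: spans the 64 KB boundary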

Explicit Trespass Count
Description: The result of an external command that you or the failover software issues. When an SP receives this command (from the failover software or from you issuing a LUN trespass in Navisphere), LUN ownership is transferred to that SP.
Comment: This host-side performance characteristic is displayed in both archive dump files and charts (runtime and archive).

Implicit Trespass Count
Description: The result of software controls within the storage system. An example of an implicit trespass operation is when LUN ownership is transferred to the SP (non-optimal paths) that receives the heaviest I/O activity. Once a threshold of I/Os is reached on the non-optimal paths, the CLARiiON implicitly trespasses the LUN to that SP.
Comment: This host-side performance characteristic is displayed in both archive dump files and charts (runtime and archive).

Optimal and Nonoptimal Characteristics (Advanced-only)

You will see the Optimal and Nonoptimal performance characteristics only if you enable the advanced mode in the Customize dialog box and are running the active/active feature. Optimal refers to the path that is ready to do I/O and will yield the best performance. Nonoptimal refers to the path that is ready to do I/O but may not yield the best performance. The following Optimal and Nonoptimal performance characteristics apply to LUN objects:

Optimal Performance Characteristics:
Utilization-Optimal [%], Queue Length-Optimal, Response Time-Optimal, Total Bandwidth-Optimal [MB/s], Total Throughput-Optimal [IO/s], Read Bandwidth-Optimal [MB/s], Read Size-Optimal [KB], Read Throughput-Optimal [IO/s], Write Bandwidth-Optimal [MB/s], Write Size-Optimal [KB], Write Throughput-Optimal [IO/s], Average Busy Queue Length-Optimal, Service Time-Optimal [ms], Explicit Trespass Count-Optimal, Implicit Trespass Count-Optimal

Nonoptimal Performance Characteristics:
Utilization-Nonoptimal [%], Queue Length-Nonoptimal, Response Time-Nonoptimal, Total Bandwidth-Nonoptimal [MB/s], Total Throughput-Nonoptimal [IO/s], Read Bandwidth-Nonoptimal [MB/s], Read Size-Nonoptimal [KB], Read Throughput-Nonoptimal [IO/s], Write Bandwidth-Nonoptimal [MB/s], Write Size-Nonoptimal [KB], Write Throughput-Nonoptimal [IO/s], Average Busy Queue Length-Nonoptimal, Service Time-Nonoptimal [ms], Explicit Trespass Count-Nonoptimal, Implicit Trespass Count-Nonoptimal

MetaLUNs
Navisphere Analyzer reports performance statistics for three types of LUNs: regular host LUNs, metaLUNs, and component LUNs (the underlying LUNs of a metaLUN).

Basic Characteristics

Utilization
Description: The fraction of an observation period during which the component is busy serving incoming requests. An SP or disk that shows 100% (or close to 100%) utilization is a system bottleneck, since an increase in the overall workload will not increase the component throughput; the component has reached its saturation point. Since a LUN is considered busy if any of its disks is busy, LUN utilization usually presents a pessimistic view; a high LUN utilization value does not necessarily indicate that the LUN is approaching its maximum capacity.
Comment: When the SP becomes the bottleneck, utilization will be at or close to 100%. An increase in workload will have no further impact on SP throughput, but the I/O response time will start increasing more aggressively.

Queue Length
Description: The average number of requests within a certain time interval waiting to be served by the component, including the one in service.
Comment: A queue length of zero (as an average) indicates an idle system. If three requests arrive at an idle SP at the same time, only one of them can be served immediately; the other two must wait in the queue, resulting in a queue length of three.

Response Time (ms)
Description: The average time, in milliseconds, required for one request to pass through the component, including its waiting time.
Comment: The higher the queue length for the SP, the more requests are waiting in its queue, increasing the average response time of a single request. For a given workload, queue length and response time are directly proportional.

Total Bandwidth (MB/s)
Description: The average amount of data, in Mbytes, passed through the component per second. Total bandwidth includes both read and write requests.
Comment: Larger requests usually result in a higher total bandwidth than smaller requests.

Total Throughput (IO/s)
Description: The average number of requests that pass through the component per second. Total throughput includes both read and write requests.
Comment: Since smaller requests need less processing time, they usually result in a higher total throughput than larger requests.

Read Bandwidth (MB/s)
Description: The average number of Mbytes read that were passed through the component per second.
Comment: Larger requests usually result in a higher bandwidth than smaller requests.

Read Size (KB)
Description: The average read request size in Kbytes.
Comment: This number indicates whether the overall read workload is oriented more toward throughput (I/Os per second) or bandwidth (Mbytes/second). For a finer distinction of I/O sizes, use an IO Size Distribution chart for the LUNs.

Read Throughput (IO/s)
Description: The average number of read requests passed through the component per second.
Comment: Since smaller requests need less processing time, they usually result in a higher read throughput than larger requests.

Write Bandwidth (MB/s)
Description: The average number of Mbytes written that were passed through the component per second.
Comment: Larger requests usually result in a higher bandwidth than smaller requests.

Write Size (KB)
Description: The average write request size in Kbytes.
Comment: This number indicates whether the overall write workload is oriented more toward throughput (I/Os per second) or bandwidth (Mbytes/second). For a finer distinction of I/O sizes, use an IO Size Distribution chart for the LUNs.

Write Throughput (IO/s)
Description: The average number of write requests passed through the component per second.
Comment: Since smaller requests need less processing time, they usually result in a higher write throughput than larger requests.

Service Time (ms)
Description: Time, in milliseconds, a request spent being serviced by the component. It does not include time waiting in a queue.
Comment: Larger requests usually have a longer service time than smaller requests. Service time is mainly a characteristic of the system component; however, larger I/Os take longer and therefore usually result in lower throughput (IO/s) but better bandwidth (Mbytes/s).

Advanced-only Characteristics (the Advanced category also includes the Basic characteristics)

Full Stripe Writes/s (Advanced Only)
Description: Average number of write requests per second that spanned a whole stripe (all disks in a LUN). This metric is applicable only to LUNs that are part of a RAID 5 or RAID 3 group.
Comment: For these RAID types, full stripe writes are the most efficient, since data and parity can be written out to the disks without having to pre-read any old data or parity first.

Used Prefetches (%) (Advanced Only)
Description: An indication of prefetching efficiency.
Comment: To improve read bandwidth, two consecutive requests trigger prefetching, filling the read cache with data before it is requested. Sequential requests therefore receive their data from the read cache instead of from the disks, which results in a lower response time and higher throughput. As the percentage of sequential requests rises, so does the percentage of used prefetches.

Read Cache Hits/s (Advanced Only)
Description: The number of read requests per second that were satisfied by either the write or the read cache.
Comment: A read cache hit occurs when recently accessed data is re-referenced while it is still in the cache.

Read Cache Misses/s (Advanced Only)
Description: The rate of read requests that could not be satisfied by the SP cache and therefore required a disk access.

Reads From Write Cache/s
Description: Average number of read requests per second that were satisfied by the write cache only.
Comment: Reads from write cache occur when recently written data is read again while it is still in the write cache. This is a subset of read cache hits, which includes requests satisfied by either the write or the read cache.

Reads From Read Cache/s
Description: Average number of read requests per second that were satisfied by the read cache only.
Comment: Reads from read cache occur when data that has been recently read or prefetched is re-read while it is still in the read cache. This is a subset of read cache hits, which includes requests satisfied by either the write or the read cache.

Read Cache Hit Ratio (Advanced Only)
Description: The fraction of read requests served from both the read and write caches versus the total number of read requests to this LUN.
Comment: A higher ratio indicates better read performance.

Write Cache Rehits/s (Advanced Only)
Description: The number of write requests per second that were satisfied by the write cache because they had been referenced before and not yet flushed to the disks.
Comment: Write cache rehits occur when recently accessed data is referenced again while it is still in the write cache. This is a subset of Write Cache Hits.

Write Cache Hit Ratio (Advanced Only)
Description: The fraction of write requests that were satisfied by the write cache without requiring any disk access, compared to the total number of write requests to this LUN.
Comment: A higher ratio indicates better write performance. Write requests that are not write cache hits are referred to as write cache misses.

Write Cache Rehit Ratio (Advanced Only)
Description: The fraction of write requests that were satisfied by the write cache because they had been referenced before and not yet flushed to the disks, compared to the total number of write requests to this LUN.
Comment: This is a measure of how often the write cache succeeded in eliminating a write operation to disk. While improving the rehit ratio is useful, it is more beneficial to reduce the number of forced flushes.

Write Cache Hits/s (Advanced Only)
Description: The number of write requests per second that were satisfied by the write cache without requiring any disk access. Write cache hits are either requests that have been referenced before and not yet flushed to the disks (rehits) or new write requests that did not trigger any forced flushes.

Write Cache Misses/s (Advanced Only)
Description: The number of write requests per second that could not be satisfied by the write cache alone and also required additional disk access.
Comment: Examples of write cache misses are write requests that bypass the write cache due to their size and write requests that trigger forced flushes.

Forced Flushes/s (Advanced Only)
Description: Number of times per second the cache had to flush pages to disk to free space for incoming write requests. Forced flushes indicate that the incoming workload is higher than the back-end workload. A relatively high number over a long period of time suggests that you spread the load over more disks.
Comment: Forced flushes are a measure of how often write requests have to wait for disk I/O rather than being satisfied by an empty slot in the write cache. In most well-performing systems this should be zero most of the time.

Disk Crossing (%) (Advanced Only)
Description: Percentage of requests that require I/O to at least two disks, compared to the total number of server requests.
Comment: A single disk crossing can involve more than two disk drives; that is, more than two stripe element crossings. Disk crossings relate to the LUN stripe element size. Generally, a low value is needed for good performance.

Average Busy Queue Length (Advanced Only)
Description: Average number of requests waiting at a busy system component to be serviced, including the request that is currently in service.
Comment: Since this queue length is counted only when the LUN is not idle, the value indicates the frequency variation (burstiness) of incoming requests. The higher the value, the bigger the burst and the longer the average response time at this component. In contrast, the average queue length also includes idle periods when no requests are pending. If there is just one outstanding request for 50% of the time and the LUN is idle for the other 50%, the average busy queue length will be 1; the average queue length, however, will be 0.5.

LUN Read Crossings/s (Advanced Only)
Description: The number of LUN crossings per second that read requests to the metaLUN caused.
Comment: Since metaLUNs consist of multiple LUNs, a single read request can access disk drives that belong to two or more of these conventional LUNs.

LUN Write Crossings/s (Advanced Only)
Description: The number of LUN crossings per second that write requests to the metaLUN caused.
Comment: Since metaLUNs consist of multiple LUNs, a single write request can access disk drives that belong to two or more of these conventional LUNs.

Explicit Trespass Count
Description: The result of an external command that you or the failover software issues. When an SP receives this command (from the failover software or from you issuing a LUN trespass in Navisphere), LUN ownership is transferred to that SP.
Comment: This host-side performance characteristic is displayed in both archive dump files and charts (runtime and archive).

Implicit Trespass Count
Description: The result of software controls within the storage system. An example of an implicit trespass operation is when LUN ownership is transferred to the SP (non-optimal paths) that receives the heaviest I/O activity. Once a threshold of I/Os is reached on the non-optimal paths, the CLARiiON implicitly trespasses the LUN to that SP.
Comment: This host-side performance characteristic is displayed in both archive dump files and charts (runtime and archive).
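For a concatenated metaLUN, a LUN crossing occurs when a single request spans the boundary between two component LUNs. The sketch below illustrates the idea with an assumed component LUN size; it is an interpretation of the definition above, not the algorithm Analyzer uses:

    # Hypothetical concatenated metaLUN made of equally sized component LUNs.
    component_lun_size_mb = 102400          # 100 GB per component LUN, for illustration

    def lun_crossings(offset_mb, size_mb, component_mb=component_lun_size_mb):
        # Number of component-LUN boundaries a single request spans.
        first = offset_mb // component_mb
        last = (offset_mb + size_mb - 1) // component_mb
        return last - first

    print(lun_crossings(102399, 2))         # 1: the request straddles two component LUNs
    print(lun_crossings(0, 64))             # 0: contained within the first component LUN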

Optimal and Nonoptimal Characteristics (Advanced-only)

You will see the Optimal and Nonoptimal performance characteristics only if you enable the advanced mode in the Customize dialog box and are running the active/active feature. Optimal refers to the path that is ready to do I/O and will yield the best performance. Nonoptimal refers to the path that is ready to do I/O but may not yield the best performance. The following Optimal and Nonoptimal performance characteristics apply to MetaLUN objects:

Optimal Performance Characteristics:
Utilization-Optimal [%], Queue Length-Optimal, Response Time-Optimal, Total Bandwidth-Optimal [MB/s], Total Throughput-Optimal [IO/s], Read Bandwidth-Optimal [MB/s], Read Size-Optimal [KB], Read Throughput-Optimal [IO/s], Write Bandwidth-Optimal [MB/s], Write Size-Optimal [KB], Write Throughput-Optimal [IO/s], Average Busy Queue Length-Optimal, Service Time-Optimal [ms], Explicit Trespass Count-Optimal, Implicit Trespass Count-Optimal

Nonoptimal Performance Characteristics:
Utilization-Nonoptimal [%], Queue Length-Nonoptimal, Response Time-Nonoptimal, Total Bandwidth-Nonoptimal [MB/s], Total Throughput-Nonoptimal [IO/s], Read Bandwidth-Nonoptimal [MB/s], Read Size-Nonoptimal [KB], Read Throughput-Nonoptimal [IO/s], Write Bandwidth-Nonoptimal [MB/s], Write Size-Nonoptimal [KB], Write Throughput-Nonoptimal [IO/s], Average Busy Queue Length-Nonoptimal, Service Time-Nonoptimal [ms], Explicit Trespass Count-Nonoptimal, Implicit Trespass Count-Nonoptimal

SnapSession
The snapshot characteristics apply to a Snapshot Session node only (SnapView software required); in Analyzer, these characteristics are shown under the Snapshot Session (SS) node. SnapSessions are available with the optional SnapView software package. During a SnapSession, data that a production host overwrites on a source LUN is saved in the SnapCache (part of the Reserved LUN Pool), thus preserving a point-in-time view of this LUN from the time the SnapSession was created (Start Session). This view can then be exported to another host (Activate Session) that needs to access the data. The following statistics are all that Navisphere Analyzer provides for a SnapSession. Since a SnapSession can contain multiple snapshot source LUNs, the statistics comprise all LUNs that are associated with this SnapSession.

Basic Characteristics

Reads From Snapshot LUN
Description: The number of read requests on snapshots during this snapshot session.
Comment: These are requests that originate, for instance, from a backup host and read data either from the source LUN or the SnapCache.

Reads From Snapshot Source LUN
Description: The number of reads during this snapshot session from the source LUN. It is calculated as the difference between Total Reads in Session and Reads From Cache.
Comment: These are read requests that originate from a backup host and access data that has not been overwritten since the SnapSession started; they are therefore satisfied by the source LUN.

Writes To Snapshot Source LUN
Description: The number of writes during this snapshot session to the source LUN (on the pertinent SP).
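A minimal sketch of the subtraction described for Reads From Snapshot Source LUN, with invented counter values:

    # Reads From Snapshot Source LUN = Total Reads in Session - Reads From Snapshot Cache.
    total_reads_in_session = 5000
    reads_from_snapshot_cache = 1200   # reads satisfied by the SnapCache (copied-on-write chunks)

    reads_from_source_lun = total_reads_in_session - reads_from_snapshot_cache
    print(reads_from_source_lun)       # 3800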

Advanced-only Characteristics (the Advanced category also includes the Basic characteristics)

Reads From Snapshot Cache (Advanced Only)
Description: The number of reads during this session that resulted in a read from the snapshot cache (instead of a read from the source LUN).
Comment: These are read requests that originate from a backup host and access data that has been overwritten on the source LUN during this SnapSession; they are therefore satisfied by one or more SnapCache LUNs.

Writes To Snapshot Cache (Advanced Only)
Description: The number of writes to the source LUN during this session that triggered a copy-on-first-write operation (the first write to each snapshot cache chunk region).
Comment: Write requests that trigger multiple copy-on-first-write operations due to misalignment or their size are counted only once.

Writes Larger Than Cache Chunk Size (Advanced Only)
Description: The number of writes to the source LUN during this session that were larger than the chunk size (they resulted in multiple writes to the cache).
Comment: These writes result in multiple copy-on-first-write operations.

Chunks Used in Snapshot Copy Session (Advanced Only)
Description: The number of snapshot cache chunks that this session has used.

Ports
Navisphere Analyzer reports performance statistics for front-end ports.

Advanced-only Characteristics

Queue Full Count
Description: The number of Queue Full events that occurred for a particular front-end port during a polling interval.
Comment: A Queue Full response is sent to the host when the port receives more requests than it can accept at once. This event should never occur in a properly configured SAN environment.
