Performance Counter Thresholds for Windows Server
When you need to measure how many system resources your application consumes, you need to pay
particular attention to the following:
o Disk I/O. Amount of read and write disk activity. I/O bottlenecks occur if read and write operations
begin to queue.
o Memory. Amount of available memory, virtual memory, and cache utilization.
o Network. Percent of the available bandwidth being utilized, network bottlenecks.
o Processor. Processor utilization, context switches, interrupts and so on.
The next sections describe the performance counters that help you measure the preceding metrics,
starting with a System Overview (a general analysis of operating system performance counters).
Each entry is formatted as:
o Counter (Explanation)
o Thresholds
o Disk
o \LogicalDisk(*)\Avg. Disk sec/Read (Avg. Disk sec/Read is the average time, in seconds, of a
read of data from the disk. This analysis determines if any of the logical disks are responding
slowly)
o Average disk responsiveness is slow – more than 15ms
o Average disk responsiveness is very slow – more than 25ms
o Disk responsiveness is very slow (spike of more than 25ms)
o \LogicalDisk(*)\Disk Transfers/sec (Disk Transfers/sec is the rate of read and write
operations on the disk)
o Less than 80 I/Os per second on average when disk latency is longer than 25ms.
This may indicate too many virtual LUNs using the same physical disks on a SAN.
o Less than 80 I/Os per second (as a spike, not an average) when disk latency is
longer than 25ms. This may indicate too many virtual LUNs using the same
physical disks on a SAN.
o \PhysicalDisk(*)\Avg. Disk sec/Read (Avg. Disk sec/Read is the average time, in seconds, of
a read of data from the disk. This analysis determines if any of the physical disks are
responding slowly)
o Average disk responsiveness is slow – more than 15ms
o Average disk responsiveness is very slow – more than 25ms
o Disk responsiveness is very slow (spike of more than 25ms)
o \PhysicalDisk(*)\Avg. Disk sec/Write (Avg. Disk sec/Write is the average time, in seconds, of
a write of data to the disk. This analysis determines if any of the physical disks are
responding slowly)
o Average disk responsiveness is slow – more than 15ms
o Average disk responsiveness is very slow – more than 25ms
o Disk responsiveness is very slow (spike of more than 25ms)
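The latency thresholds above can be sketched as a small classifier. This is an illustrative helper, not part of any Microsoft tooling; note that the Avg. Disk sec/Read and Avg. Disk sec/Write counters report seconds, so values are converted to milliseconds first.

```python
# Illustrative helper (not part of PerfMon): classify an average value of
# \PhysicalDisk(*)\Avg. Disk sec/Read or .../Avg. Disk sec/Write against
# the 15 ms / 25 ms thresholds above. The counter reports seconds.
def classify_disk_latency(avg_disk_sec: float) -> str:
    latency_ms = avg_disk_sec * 1000.0  # convert seconds to milliseconds
    if latency_ms > 25:
        return "very slow"
    if latency_ms > 15:
        return "slow"
    return "ok"

print(classify_disk_latency(0.030))  # a 30 ms average reads as "very slow"
```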
o Memory
Kernel Mode Memory
o \Memory\Free System Page Table Entries (Free System Page Table Entries is the number of
page table entries not currently in use by the system. This analysis determines if the
system is running out of free system page table entries (PTEs) by checking if there are fewer
than 5,000 free PTEs, with a warning if there are fewer than 10,000 free PTEs. Running out
of PTEs can result in system-wide hangs)
o Running low on PTEs – less than 10,000 (If the free PTEs are under 10,000, the
system is approaching a system-wide hang)
o Critically low on PTEs – less than 5,000 (If the free PTEs are under 5,000, the system
is close to a system-wide hang)
o \Memory\Pages Input/sec (Pages Input/sec is the rate at which pages are read from disk to
resolve hard page faults. Hard page faults occur when a process refers to a page in virtual
memory that is not in its working set or elsewhere in physical memory, and must be
retrieved from disk. When a page is faulted, the system tries to read multiple contiguous
pages into memory to maximize the benefit of the read operation. Compare the value of
Memory\Pages Input/sec to the value of Memory\Page Reads/sec to determine the
average number of pages read into memory during each read operation)
o More than 10 page file reads per second
o \Memory\Pages/sec (If this is high, the system is likely running low on memory and paging
memory to disk. Pages/sec is the rate at which pages are read from or
written to disk to resolve hard page faults. This counter is a primary indicator of the kinds
of faults that cause system-wide delays. It is the sum of Memory\Pages Input/sec and
Memory\Pages Output/sec. It is counted in numbers of pages, so it can be compared to
other counts of pages, such as Memory\Page Faults/sec, without conversion. It includes
pages retrieved to satisfy faults in the file system cache (usually requested by applications)
non-cached mapped memory files)
o High pages/sec – greater than 1000 (If this is higher than 1000, the system could be
beginning to run out of memory. Consider reviewing the processes to see which
processes are taking up the most memory or consider adding more memory)
o Very high average pages/sec – greater than 2500 (If this is greater than 2500, the
system could be experiencing system-wide delays due to insufficient memory.
Consider reviewing the processes to see which processes are taking up the most
memory or consider adding more memory)
o Critically high average pages/sec – greater than 5000 (If this is greater than 5000,
the system is most likely experiencing delays due to insufficient memory.
Consider reviewing the processes to see which processes are taking up the most
memory or consider adding more memory)
o Spike in pages/sec – greater than 1000 (If a spike exceeds 1000, the system could be
experiencing momentary delays due to insufficient memory. Consider reviewing
the processes to see which processes are taking up the most memory or consider
adding more memory)
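The pages/sec severity bands above can be summarized in a short, illustrative helper (the function name and structure are hypothetical, not part of any monitoring tool):

```python
# Illustrative mapping of an average \Memory\Pages/sec value onto the
# severity bands described above (1000 / 2500 / 5000).
def classify_pages_per_sec(pages_per_sec: float) -> str:
    if pages_per_sec > 5000:
        return "critically high"
    if pages_per_sec > 2500:
        return "very high"
    if pages_per_sec > 1000:
        return "high"
    return "normal"
```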
o \Memory\Pool Paged Bytes (This analysis determines if the system is getting close to the
maximum paged pool size. Pool Paged Bytes is the size, in bytes, of the paged pool, an area of system
memory (physical memory used by the operating system) for objects that can be written to
disk when they are not being used)
o Low on Pool Paged memory – less than 40% available
o Critically low on Pool Paged memory – less than 20% available
User Mode Memory
o \Process(*)\Private Bytes (Private Bytes is the current size, in bytes, of memory that this
process has allocated that cannot be shared with other processes)
o For Windows 32-bit: no more than a 250MB delta between minimum size and
maximum size (Maximum – Minimum should not exceed 250MB)
o For Windows 64-bit: no more than a 500MB delta between minimum size and
maximum size (Maximum – Minimum should not exceed 500MB)
o \Process(*)\Working Set (Working Set is the current size, in bytes, of the Working Set of this
process. The Working Set is the set of memory pages touched recently by the threads in the
process. If free memory in the computer is above a threshold, pages are left in the Working
Set of a process even if they are not in use. When free memory falls below a threshold,
pages are trimmed from Working Sets. If they are needed they will then be soft-faulted back
into the Working Set before leaving main memory)
o For Windows 32-bit: no more than a 250MB delta between minimum size and
maximum size (Maximum – Minimum should not exceed 250MB)
o For Windows 64-bit: no more than a 500MB delta between minimum size and
maximum size (Maximum – Minimum should not exceed 500MB)
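The delta rule for Private Bytes and Working Set can be expressed as a simple check over a series of sampled values. This is an illustrative sketch, not an official formula:

```python
# Illustrative check of the delta rule above: the spread between the
# smallest and largest sampled value of Private Bytes (or Working Set)
# should not exceed 250MB on 32-bit Windows or 500MB on 64-bit Windows.
MB = 1024 * 1024

def exceeds_delta(samples_bytes, is_64bit):
    limit_mb = 500 if is_64bit else 250
    return max(samples_bytes) - min(samples_bytes) > limit_mb * MB
```

A 300MB spread would therefore be flagged on a 32-bit system but not on a 64-bit one.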
o \Process(*)\Thread Count (The number of threads currently active in this
process. An instruction is the basic unit of execution in a processor, and a
thread is the object that executes instructions. Every running process has at
least one thread.)
o For Windows 32-bit: for 2GB of memory, a maximum of 2,048 threads
o For Windows 64-bit: for 2GB of memory, a maximum of 6,600 threads
o \Process(*)\Handle Count (This analysis checks how many handles each process has open and
determines if a handle leak is suspected. A process with a large number of
handles and/or an aggressive upward trend could indicate a handle leak, which
typically results in a memory leak. Handle Count is the total number of handles
currently open by this process. This number is equal to the sum of the handles
currently open by each thread in this process)
o For Windows 32 Bit: For most processes, if higher than
2,500 handles open, investigate.
Exceptions are:
System 10,000
lsass.exe 30,000
store.exe 30,000
sqlservr.exe 30,000
o For Windows 64 Bit: For most processes, if higher than
3,000 handles open, investigate.
Exceptions are:
System 20,000
lsass.exe 50,000
store.exe 50,000
sqlservr.exe 50,000
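The handle-count rules above, including the per-process exceptions, can be sketched as follows. This helper is hypothetical; note the canonical SQL Server process name is spelled sqlservr.exe:

```python
# Illustrative helper applying the handle-count thresholds, including the
# documented per-process exceptions (thresholds per the tables above).
EXCEPTIONS_32 = {"System": 10_000, "lsass.exe": 30_000,
                 "store.exe": 30_000, "sqlservr.exe": 30_000}
EXCEPTIONS_64 = {"System": 20_000, "lsass.exe": 50_000,
                 "store.exe": 50_000, "sqlservr.exe": 50_000}

def should_investigate(process, handle_count, is_64bit):
    exceptions = EXCEPTIONS_64 if is_64bit else EXCEPTIONS_32
    default = 3_000 if is_64bit else 2_500
    return handle_count > exceptions.get(process, default)
```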
Network
\Network Interface(*)\Output Queue Length
o High Network I/O – more than 1 thread waiting on network I/O (If the output queue length
is greater than 1, this system’s network is nearing capacity. Consider analyzing
network traffic to determine why network I/O is nearing capacity, such as *chatty* network
services and/or large data transfers)
o Very high network I/O – more than 2 threads waiting on network I/O (If the output queue
length is greater than 2, this system’s network is over capacity. Consider analyzing
network traffic to determine why network I/O is over capacity, such as *chatty* network
services and/or large data transfers)
o Network Utilization Analysis (Bytes Total/sec is the rate at which bytes are sent and received over
each network adapter, including framing characters. Network Interface\Bytes Total/sec is the sum
of Network Interface\Bytes Received/sec and Network Interface\Bytes Sent/sec. This counter
indicates the rate at which bytes are sent and received over each network adapter. This counter
helps you know whether the traffic at your network adapter is saturated and if you need to add
another network adapter. How quickly you can identify a problem depends on the type of network
you have as well as whether you share bandwidth with other applications)
o \Network Interface(*)\Bytes Total/sec
o \Network Interface(*)\Current Bandwidth
o Thresholds:
o High average network utilization – more than 50%
o Very high average network utilization – more than 80%
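Utilization is derived from the two counters above. Because Bytes Total/sec reports bytes while Current Bandwidth reports bits per second, the bytes value must be multiplied by 8; a hypothetical helper:

```python
# Illustrative utilization calculation: Bytes Total/sec is in bytes,
# Current Bandwidth is in bits per second, so multiply bytes by 8.
def network_utilization_pct(bytes_total_per_sec, current_bandwidth_bits):
    return (bytes_total_per_sec * 8.0) / current_bandwidth_bits * 100.0

# e.g. 62.5 MB/s on a 1 Gbit/s link works out to 50% utilization
```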
o Server\Bytes Total/sec (This counter indicates the number of bytes sent and received over the
network. Higher values indicate network bandwidth as the bottleneck. If the sum of Bytes
Total/sec for all servers is roughly equal to the maximum transfer rates of your network, you may
need to segment the network)
o Should not be more than 50 percent of network capacity.
o Processor:
o Processor\% Processor Time (This counter is the primary indicator of processor activity.
High values may not necessarily be bad. However, if other processor-related
counters are increasing linearly, such as % Privileged Time or Processor Queue Length,
high CPU utilization may be worth investigating)
o Less than 60% consumed = Healthy
o 60% – 90% consumed = Monitor or Caution
o 91% – 100% consumed = Critical or Out of Spec
o System\Processor Queue Length (If there are more tasks ready to run than there are
processors, threads queue up. The processor queue is the collection of threads that are
ready but not able to be executed by the processor because another active thread is
currently executing. A sustained or recurring queue of more than two threads is a clear
indication of a processor bottleneck. You may get more throughput by reducing
parallelism in those cases. You can use this counter in conjunction with the Processor\%
Processor Time counter to determine if your application can benefit from more CPUs.
There is a single queue for processor time, even on multiprocessor computers.
Therefore, in a multiprocessor computer, divide the Processor Queue Length (PQL) value
by the number of processors servicing the workload. If the CPU is very busy (90 percent
and higher utilization) and the PQL average is consistently higher than 2 per processor,
you may have a processor bottleneck that could benefit from additional CPUs. Or, you
could reduce the number of threads and queue more at the application level. This will
cause less context switching, and less context switching is good for reducing CPU load.
The common reason for a PQL of 2 or higher with low CPU utilization is that requests for
processor time arrive randomly and threads demand irregular amounts of time from the
processor. This means that the processor is not a bottleneck but that it is your
threading logic that needs to be improved)
o Each processor has 10 or more threads waiting (Determines if the average processor
queue length exceeds ten times the number of processors. If this threshold is broken, then
the processor(s) may be at capacity)
o Each processor has 20 or more threads waiting (Determines if the average processor queue
length exceeds twenty times the number of processors. If this threshold is broken, then the
processor(s) are beyond capacity)
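The per-processor queue guidance above can be sketched as an illustrative helper that divides System\Processor Queue Length by the logical processor count:

```python
# Illustrative per-processor queue check: divide the system-wide
# Processor Queue Length by the number of logical processors, then apply
# the 10 / 20 per-processor thresholds described above.
def classify_pql(queue_length, processors):
    per_cpu = queue_length / processors
    if per_cpu >= 20:
        return "beyond capacity"
    if per_cpu >= 10:
        return "at capacity"
    return "ok"
```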
Common Performance Monitor counter thresholds
Question
What are some commonly used Performance Monitor (perfmon) counter thresholds?
Answer
This article lists some common perfmon counters with descriptions and thresholds. The threshold
values listed here are meant for use as a general 'rule-of-thumb' and each should be interpreted in
context of the specific performance issue currently at hand.
1. The threshold values provided below are averages, not minimums/maximums, and are only
useful when looked at within a meaningful time period, as the following points help to clarify.
2. What is the time range captured in the perfmon data set?
3. Extreme highs and lows within the time-range of the capture can result in less-useful averages.
In these cases, try to narrow the time range and then look at counter values around the time(s)
when the poor performance was observed.
4. In general, to be considered a genuine bottleneck, a given counter's threshold must be
exceeded either on average, or on a frequent basis, during the time period being analyzed.
5. This is not meant to be a comprehensive nor definitive reference.
How to Measure Storage Performance and IOPS on Windows?
One of the main metrics that allows you to estimate the performance of an existing or planned storage
system is IOPS (Input/Output Operations Per Second). In simple terms, IOPS is the number of
read/write operations performed against a storage device, disk or file system per unit of time. The
larger this number, the greater the performance of your storage (strictly speaking, the IOPS value has
to be considered along with other storage performance characteristics, like latency, throughput, etc.).
In this article, we will look at several ways to measure storage performance (IOPS, latency,
throughput) in Windows (you can use this guide for a local hard drive, SSD, SMB network folder,
CSV volume or LUN on SAN/iSCSI storage).
You can roughly estimate the current storage I/O workload in Windows using the built-in disk
performance counters from Performance Monitor. To collect this counter data:
1. Open Performance Monitor (perfmon.exe) and expand Data Collector Sets -> User Defined;
2. Create a new Data Collector Set manually (right-click User Defined -> New -> Data Collector Set);
3. Select the checkbox Create data logs -> Performance counter;
4. Now in the properties of the new data collection set, add the following performance counters for the Physical
Disk object (you can select the counters for a specific disk or for all available local disks):
5. You can change other data collection properties. By default, counter values are collected every 15 seconds.
To display real time disk performance, you need to add the specified Perfmon counters in the Monitoring
Tools -> Performance Monitor section.
6. Start collecting performance counter data (select Start) and wait until sufficient information
for analysis has been gathered. After that, right-click your data collector set and select Stop;
7. To view the collected performance data, go to Perfmon -> Reports -> User Defined -> Data_Disk_IO
and open the latest data set. By default, disk data is displayed as graphs;
How to understand storage performance counters collected by Perfmon? For a quick analysis of the disk/storage
performance, you need to look at the values of at least the following 5 counters.
When analyzing the counter data, it is advisable for you to understand the current physical disks
(storage) configuration (whether RAID or Stripe is used, the number and types of disks, cache size,
etc.).
Disk sec/Transfer – the time required to perform one write/read operation with the storage device
or disk (disk latency). If the delay is more than 25 ms (0.025 s), then the disk array cannot handle the
I/O operation on time. For highly loaded servers, the disk latency value should not exceed 10 ms (0.010 s);
Disk Transfers/sec – (IOPS). The number of read/write operations per second. This is the main
indicator of the disk access intensity (approximate IOPS values for different disk types are listed at
the end of the article);
Disk Bytes/Sec – total disk throughput (read+write) per second. Maximum values depend on the
disk type (150-250 MB/s for a regular HDD and 500-10,000 MB/s for SSDs, depending on the interface);
Split IO/sec – a disk fragmentation indicator when the operating system has to split one I/O
operation into multiple disk requests. It may also indicate that the application is requesting too large
blocks of data that cannot be transferred in one operation;
Avg. Disk Queue Length – average number of read/write requests that were queued. For a single
disk, the queue length should not exceed 2. For a RAID array of 4 disks, the threshold value of disk
queue length is 8.
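Two useful derived values follow directly from these counters: the average I/O size (throughput divided by IOPS) and latency in milliseconds (Disk sec/Transfer reports seconds). A hypothetical sketch:

```python
# Illustrative derived metrics from the counters above.
def avg_io_size_bytes(disk_bytes_per_sec, disk_transfers_per_sec):
    # average size of a single I/O: throughput divided by IOPS
    return disk_bytes_per_sec / disk_transfers_per_sec

def latency_ms(disk_sec_per_transfer):
    # Disk sec/Transfer reports seconds; convert to milliseconds
    return disk_sec_per_transfer * 1000.0
```

For example, 4 MB/s of throughput at 64 IOPS implies an average I/O size of 64 KB.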
Microsoft recommends to use the DiskSpd (https://aka.ms/diskspd) utility for generating a load on a
disk (storage) system and measuring its performance. This is a command line interface tool that can
perform I/O operations with the specified drive target in several threads. I quite often use DiskSpd to
measure the storage performance and get the maximum available read/write speed and IOPS from the
specific server (of course you can measure the performance of storage as well, in this case diskspd will
be used to generate the storage load).
The DiskSpd does not require installation, just download and extract the archive to a local disk. For
x64 bit systems, use the version of diskspd.exe from the amd64fre directory.
I use the following command to test the performance of the disk:
diskspd.exe -c50G -d300 -r -w40 -t8 -o32 -b64K -Sh -L E:\diskpsdtmp.dat > DiskSpeedResults.txt
Important. When using diskspd.exe, a considerable load is generated on the disks and CPU of
the tested system. To avoid performance degradation for users, it is not recommended to run
it on production systems during peak hours.
-c50G – file size 50 GB (it is better to use a large file size so that it does not fit in the cache of the storage
controller);
-d300 – test duration in seconds;
-r – random read/write operations (if you need to test sequential access, use -s);
-t8 – number of threads;
-w40 – ratio of write to read operations 40%/60%;
-o32 – queue length;
-b64K – block size;
-Sh – do not use cache;
-L – measure latency;
E:\diskpsdtmp.dat – test file path.
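For repeated runs it can be convenient to assemble the command line above programmatically. The following sketch rebuilds the example's argument list from named parameters (the function and its defaults are hypothetical, mirroring the run shown above):

```python
# Hypothetical wrapper that rebuilds the example diskspd command line
# from named parameters; the defaults mirror the run shown above.
def build_diskspd_cmd(target, file_size="50G", duration_s=300,
                      write_pct=40, threads=8, queue_depth=32,
                      block="64K", random_io=True):
    cmd = ["diskspd.exe", f"-c{file_size}", f"-d{duration_s}"]
    cmd.append("-r" if random_io else "-s")  # random vs sequential access
    cmd += [f"-w{write_pct}", f"-t{threads}", f"-o{queue_depth}",
            f"-b{block}", "-Sh", "-L", target]
    return cmd
```

The resulting list can be passed to a process launcher (e.g. subprocess.run) on the target server.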
After the stress test is completed, average storage performance values can be obtained from the output tables.
In my test, the following performance data (check the Total IO table) was obtained:
You can get individual values for read (section Read IO) or write (section Write IO) operations.
Having tested several disks or storage LUNs using diskspd, you can compare them or select an array
with the desired performance for your tasks.
How to Measure Storage IOPS, Throughput and Latency Using
PowerShell?
I have found a PowerShell script (by Mikael Nystrom, Microsoft MVP), which is essentially a wrapper
around the SQLIO.exe utility (a set of file storage performance tests).
Note. In December 2015, Microsoft announced the end of support for this tool, replaced SQLIO
with the more universal DiskSpd tool, and removed the SQLIO distribution files from its website. So
you will have to search for sqlio.exe yourself, or download it from our website (it is located in the
archive with the PowerShell script).
Let’s consider the script arguments:
In our case (a vmdk virtual disk on the VMFS datastore on HP MSA 2040 connected over SAN is used)
the disk array showed the average IOPS value of about 15,000 and the data transmission rate
(throughput) about 5 Gbit/s.
In the following table, the approximate IOPS values for different disk types are shown:
Type IOPS
SSD(SLC) 6000
SSD(MLC) 1000
I have found some recommendations for disk performance in IOPS for some popular Microsoft
services:
1. Microsoft Exchange 2010 with 5,000 users, each of them receives 75 and sends 30 emails per day, will
require at least 3,750 IOPS;
2. Microsoft SQL 2008 Server with 3,500 SQL transactions per second (TPS) requires 28,000 IOPS;
3. A common Windows application server for 10-100 users requires 10-40 IOPS.
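The per-unit rates implied by these figures can be derived and reused for rough sizing. Note the 0.75 IOPS/user and 8 IOPS/transaction rates below are computed from the examples above for illustration; they are not published constants:

```python
# Rough sizing rates derived from the examples above: Exchange 2010
# works out to 3,750 / 5,000 = 0.75 IOPS per user, and SQL Server 2008
# to 28,000 / 3,500 = 8 IOPS per transaction.
def exchange_iops(users, iops_per_user=3750 / 5000):
    return users * iops_per_user

def sql_iops(transactions_per_sec, iops_per_tx=28000 / 3500):
    return transactions_per_sec * iops_per_tx
```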
Diagnosing Disk Performance Issues
Disk performance issues can be hard to track down but can also cause a wide variety of problems. The
disk performance counters available in Windows are numerous, and being able to select the right
counters for a given situation is a great troubleshooting skill. Here, we'll review two basic scenarios –
measuring overall disk performance and determining if the disks are a bottleneck.
When it comes to disk performance, there are two important considerations: IOPS and byte
throughput. IOPS is the raw number of disk operations that are performed per second. Byte
throughput is the effective bandwidth the disk is achieving, usually expressed in MB/s. These numbers
are closely related - a disk with more IOPS can provide better throughput.
Disk Transfers/sec
o Total number of IOPS. This should be about equal to Disk Reads/sec + Disk Writes/sec
Disk Reads/sec
o Disk read operations per second (IOPS which are read operations)
Disk Writes/sec
o Disk write operations per second (IOPS which are write operations)
Disk Bytes/sec
o Total disk throughput per second. This should be about equal to Disk Read Bytes/sec +
Disk Write Bytes/sec
Disk Read Bytes/sec
o Disk read throughput per second
Disk Write Bytes/sec
o Disk write throughput per second
These performance counters are available in both the LogicalDisk and PhysicalDisk categories. In a
standard setup, with a 1:1 disk-partition mapping, these would provide the same results. However, if
you have a more advanced setup with storage pools, spanned disks, or multiple partitions on a single
disk, you would need to choose the correct category for the part of the stack you are measuring.
Here are the results on a test VM. In this test, diskspd was used to simulate an average mixed
read/write workload. The results show the following:
3,610 IOPS
o 2,872 read IOPS
o 737 write IOPS
17.1 MB/s total throughput
o 11.2 MB/s read throughput
o 5.9 MB/s write throughput
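The identities from the counter list (total ≈ read + write) can be checked against these sampled numbers, allowing for perfmon's rounding:

```python
# Sanity-check the identities above (total ~ read + write) against the
# sampled numbers from the test VM, allowing for perfmon rounding.
read_iops, write_iops, total_iops = 2872, 737, 3610
assert abs((read_iops + write_iops) - total_iops) <= 1  # off by rounding

read_mbps, write_mbps, total_mbps = 11.2, 5.9, 17.1
assert abs((read_mbps + write_mbps) - total_mbps) < 0.05
```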
In this case, we're seeing a decent number of IOPS with fairly low throughput. The expected results
vary greatly depending on the underlying storage and the type of workload that is running. In any
case, you can use these counters to get an idea of how a disk is performing during real world usage.
Disk Bottlenecks
Determining if storage is a performance bottleneck relies on a different set of counters than the
above. Instead of looking at IOPS and throughput, latency and queue lengths need to be
checked. Latency is the amount of time it takes to get a piece of requested data back from the disk
and is measured in milliseconds (ms). Queue length refers to the number of outstanding IO requests
that are in the queue to be sent to the disk. This is measured as an absolute number of requests.
The specific perfmon counters are Avg. Disk sec/Read, Avg. Disk sec/Write and Avg. Disk
sec/Transfer for latency, plus Current Disk Queue Length and Avg. Disk Queue Length for queuing.
Here are the results on a test VM. In this test, diskspd was used to simulate an IO-intensive
read/write workload.
Generally speaking, the performance tests can be interpreted with the following:
Disk latency should be below 15 ms. Disk latency above 25 ms can cause noticeable
performance issues. Latency above 50 ms is indicative of extremely underperforming storage.
Disk queues should be no greater than twice the number of physical disks serving the
drive. For example, if the underlying storage is a 6-disk RAID 5 array, the total disk queue
should be 12 or less. For storage that isn't mapped directly to an array (such as in a private
cloud or in Azure), queues should be below 10 or so. Queue length isn't directly indicative of
performance issues but can help lead to that conclusion.
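The queue-length rule above can be sketched as an illustrative helper; the 10-request limit for abstracted storage is the rough guideline from the text:

```python
# Illustrative queue-length check: at most twice the number of physical
# disks backing the volume, or roughly 10 when the storage is abstracted
# (private cloud, Azure) and the physical disk count is unknown.
def queue_ok(avg_queue, physical_disks=None):
    limit = 2 * physical_disks if physical_disks else 10
    return avg_queue <= limit
```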
These are general rules and may not apply in every scenario. However, if you see the counters
exceeding the thresholds above, it warrants a deeper investigation.
If a disk performance issue is suspected to be causing a larger problem, we generally start off by
running the second set of counters above. This will determine if the storage is actually a bottleneck,
or if the problem is being caused by something else. If the counters indicate that the disk is
underperforming, we would then run the first set of counters to see how many IOPS and how much
throughput we are getting. From there, we would determine if the storage is under-spec'ed or if there
is a problem on the storage side. In an on-premises environment, that would be done by working with
the storage team. In Azure, we would review the disk configuration to see if we're getting the
advertised performance.
Configuring Windows Performance Monitor to Capture Disk I/O
Activity and Potential Disk Issues
As most SmarterMail administrators know, a server’s hard disks are some of the most heavily used
resources when it comes to managing a secure mail server and often have the biggest impact on how
that mail server performs. As such, it’s important to keep a close eye on your disk activity, ensuring
there are no bottlenecks or latencies that can cause SmarterMail’s performance to suffer.
And when it comes to monitoring server performance on Windows, there’s no better tool than the one
built-in: Windows Performance Monitor! PerfMon, as it’s commonly known, is a console snap-in that
provides tools for analyzing your system’s performance. It’s great to use for recording a performance
baseline, monitoring your daily activity, troubleshooting server data or discovering potential disk issues
before they occur.
Follow along to learn how to configure PerfMon to capture information pertaining to your email server’s
disk I/O utilization. There are three sections in this guide: (1) steps for configuring a monitor to view
data in real-time, (2) steps for configuring a data collection set in which data can be captured over a
period of time and (3) an explanation of the performance counters and their expected values.
Note: While the interface may vary slightly, the steps for configuring PerfMon remain consistent across
supported server versions. The screenshots provided here were taken from Windows Server 2008 R2.
To monitor your disk activity in real-time and catch disk I/O bottlenecks before they occur, you’ll need
to configure certain performance monitors within PerfMon:
1. On the Windows server where SmarterMail is installed, open Performance Monitor. This can
be opened from the Start menu by clicking on Administrative Tools and selecting Performance
Monitor OR by opening the Run command, entering “perfmon.exe” and clicking OK.
2. Once open, add a new counter. This is done by expanding the Monitoring Tools folder in the
navigation pane and clicking on Performance Monitor. In the toolbar of icons above the main
window, click the green plus sign (+) icon. The counter settings will load in a popup window.
3. In the popup window, find the ‘Instances of selected object’ section and select the physical
disk(s) you want to monitor. Highlight <All instances> to monitor all disks on the server in
the same report. To monitor one or multiple disks individually, select each individual volume.
By default, _Total will be selected; however, this is the sum of all your disks and won’t provide
meaningful data for this configuration. (It’s important to do this step before selecting
performance counters, as changing the selected instance could remove the highlighting from
the chosen performance counters.)
4. Next, go to the ‘Available counters’ section and find PhysicalDisk. Expand its additional
options and highlight the following counters: % Disk Read Time, % Disk Time, % Disk Write Time,
% Idle Time, Current Disk Queue Length, Disk Reads/sec, Disk Writes/sec and Split IO/sec.
5. Click Add >>. The highlighted counters will be shown in the ‘Added counters’ section on the
right-hand side of the window.
Monitoring Real-Time Disk Activity
As soon as you close the Add Counters window, you’ll be dropped back into the PerfMon section where
you can begin monitoring your results!
There are three types of graphs that you can choose to view: Line, Histogram Bar or Report. To toggle
through the options, use the Change graph type button to the left of the plus sign (+) or press
Ctrl+G on your keyboard. We prefer reviewing the Report type as this lays out your data in a neat
table; however, when you’re monitoring quite a few disks, you may not be able to view all disk data
simultaneously.
So, if you review your results using the Line or Histogram Bar graphs instead, here are some things to
be aware of: If you chose <All instances> or the individual disk(s) when adding your performance
counters, each counter will be listed one time for every disk you’re monitoring. Use the column’s
sorting options to group disks together by Instance for easier review.
You may also find the Highlight toolbar button to be extremely useful in these views. When enabled,
the performance counter currently selected at the bottom of the window will have its corresponding
line/bar highlighted in black within the graph.
Monitoring Disk I/O Activity Over a Period of Time
Now that your real-time monitoring is squared away, we can move on to capturing data sets over a
period of time. This configuration is extremely useful for those incidents that are tough to catch in
real-time. For example, an issue that occurs once every hour, happens sporadically or one that pops
up after-hours.
To capture disk data over a period of time, we’ll configure a Data Collector Set within PerfMon that can
be started and stopped as needed:
1. On the Windows server where SmarterMail is installed, open Performance Monitor. This can
be opened from the Start menu by clicking on Administrative Tools and selecting Performance
Monitor OR by opening the Run command, entering “perfmon.exe” and clicking OK.
2. Once open, add a new Data Collector Set. This can be done by expanding the Data Collector
Sets folder in the navigation pane. Then right-click on User Defined, hover over New and
select Data Collector Set. The collector set settings will load in a popup window.
3. In the Create new Data Collector Set dialog window, enter a friendly name for your report, such
as “IO Report.” Select the bulleted option to Create manually (Advanced). Click Next.
4. On the next screen, select the bulleted option to Create data logs and
checkmark Performance counter. Click Next.
5. Now, click on Add.... A window for the performance counter settings will appear.
6. In the popup window, find the ‘Instances of selected object’ section and select the physical
disk(s) you want to monitor. Highlight <All instances> to monitor all disks on the server in
the same report. To monitor one or multiple disks individually, select each individual volume.
By default, _Total will be selected; however, this is the sum of all your disks and won’t provide
meaningful data for this configuration. (It’s important to do this step before selecting
performance counters, as changing the selected instance could remove the highlighting from
the chosen performance counters.)
7. Next, go to the ‘Available counters’ section and find PhysicalDisk. Expand its additional
options and highlight the following counters:
1. % Disk Read Time
2. % Disk Time
3. % Disk Write Time
4. % Idle Time
5. Current Disk Queue Length
6. Disk Reads/sec
7. Disk Writes/sec
8. Split IO/sec
8. Click Add >>. The highlighted counters will be shown in the ‘Added counters’ section on the
right-hand side of the window.
9. Next, find SmarterMail within the list of ‘Available counters.’ Expand its additional options and
highlight each one. (These counters will be helpful by allowing you to compare the values of
normal disk activity versus high disk I/O. For example, during an instance of high disk I/O, you
could potentially see an influx of IMAP connections, SMTP connections, file handles, threads,
etc., allowing you to understand the SmarterMail sections impacted so you can troubleshoot the
root cause of the issue.)
10. In the ‘Instances of selected object’ section, select the mailservice instance. (If you click on
mailservice immediately after highlighting all SmarterMail counters, all counters should still be
highlighted.) Then click Add >>. The counters will be shown in the ‘Added counters’ section,
indicated with an asterisk (*).
11. Click OK to close the window and return to the Create new Data Collector Set dialog window.
The performance counters just added will be displayed.
12. Adjust the Sample Interval as desired. In most cases, 15-30 seconds is enough (and can be
adjusted in the future if needed). Click Next.
13. Set the Root directory path. This is where the actual report data will be saved. It’s
recommended to save this to a volume that is not low on disk space, as these reports can get
fairly large if left running for a long period of time (days). Notate this location so you can pull
from this path, if needed. Click Next.
14. Leave the Run as: option at <Default>, unless special permissions are necessary for your
environment. In the bulleted options below, select Save and close. Click Finish.
When the wizard has finished, you’ll be dropped back into the PerfMon window where you can begin
collecting data for your report! Find the “IO Report” (or whatever friendly name you used) by
expanding the Data Collector Sets folder and clicking on User Defined. To begin capturing the Disk
I/O and SmarterMail data, right-click on the report name at any time and click Start. Once the data
has been captured for the desired period, right-click again and choose Stop.
Reviewing the Data Collector Set Report
To review the data you’ve collected, head over to the Reports folder and expand User Defined. Here
you’ll see the name of your report and, below it, each set of data that has been collected. Select the
latest report to view its information.
There are five types of graphs that you can choose to view: Line, Histogram Bar, Report, Area or
Stacked Area. To toggle through the options, use the ‘Change graph type’ button to the left of the plus
sign (+) or press Ctrl+G on your keyboard. Again, we prefer the Report type as this lays out your data
in a neat table; however, when you’re monitoring quite a few disks, you may not be able to view all
disk data simultaneously.
So, if you review your results using the graphs instead, here are some things to be aware of: if you chose <All instances> or the individual disk(s) when adding your performance counters, each counter will be listed once for every disk you’re monitoring. Use the column’s sorting options to group disks together by Instance for easier review.
You may also find the Highlight toolbar button to be extremely useful in these views. When enabled,
the performance counter currently selected at the bottom of the window will have its corresponding
line/bar highlighted in black within the graph.
Finally, if the actual report data needs to be pulled -- either for a support ticket with the SmarterTools
Support Department or for you to review on an external system -- this can be obtained from the path
specified in step 13.
Understanding PerfMon Counters and their Results
So now that we have all the steps in place for monitoring your disk I/O activity, it’s important that you understand the information each performance counter provides, as well as the results that should be expected on a healthy installation that is capable of handling the I/O requirements:
*Though the % Disk Read Time and % Disk Write Time values can fluctuate up to 35-40%, this isn’t a firm indicator of true bottlenecking. However, if you see these values exceed 70-80%, this indicates the disk activity is VERY high. Chances are, during this same period, you will notice % Idle Time sitting around 0-10%.
**In combination with high Disk Read\Write percentages, if the Current Disk Queue Length exceeds 1-
2, noticeable slowness will occur within the SmarterMail web interface and many other aspects may be
affected, including message deliveries, IMAP\EWS\EAS synchronization and more. This is because the
OS would have to queue the Read\Write operations rather than committing said operations to the disk
in real time.
There we have it! Using the steps above, you’ve created real-time and historical monitors to keep a
close eye on your server’s disk activity, and with your hard disks performing at their best, you’re well
on your way to a healthy, reliable and high-performing mail server.
So what other tools do you use for maintaining your mail server performance? Are there any additional
performance counters you recommend monitoring? Let us know in the comments!
SQL Server disk performance metrics – Part 1 – the most
important disk performance metrics
So far, we have presented the most important memory and processor metrics. These metrics indicate
system and SQL Server performance, and are useful for troubleshooting performance issues and
bottlenecks. Besides memory and processor metrics, equally important are SQL Server disk metrics.
Sometimes a metric from one category can be masked by other events and be misleading – e.g. a disk
issue can cause processor bottlenecks. That’s why it’s necessary to understand the cause and effect of
each metric.
Disk metrics are not related only to disk itself, but to the whole disk subsystem which includes disk,
the disk controller card and the I/O disk system bus. For SQL Server disk performance monitoring, it’s
recommended to monitor the metrics for a while, determine the trend, and set a baseline for normal
operation. Then, compare the current metric values to baselines.
Most of these metrics are available in Windows Performance Monitor, where they are divided into 2
groups – Physical Disk and Logical Disk metrics. A Logical disk is a disk partition, while a physical disk
is the complete physical disk with all partitions created on it. The metrics in both groups are the same,
the only difference is whether they show the performance for a single partition, or for the entire disk.
Some physical disk metrics might not be sufficient for deeper investigation and troubleshooting if you have more than one logical partition on a disk. This is where logical disk metrics are useful, as they show more granular results and help determine the effect of SQL Server or any other application on disk performance.
SQL Server uses I/O calls to perform reads and writes on a disk: it defines and manages requests for reading and writing the data, while the operating system actually performs the I/O operations. Problems with disk I/O operations manifest as slow response times, operation timeouts, and system bottlenecks.
To troubleshoot SQL Server disk issues, besides total disk I/O activity, it’s recommended to monitor
and detect disk activity made by SQL Server.
Excessive disk usage by various applications can cause SQL Server performance degradation, as SQL Server might not be the master of disk resources and would have to wait for disk reads and writes.
The SQL Server activities that require disk access are creating database and transaction log backups
and saving them to disk, import/export processes, jobs that read or write large amounts of data
from/to disk, etc.
Average Disk sec/Read
The Average Disk sec/Read metric, along with Average Disk sec/Write (presented next), is one of the most important disk performance metrics. Both metrics can be tracked at the logical and physical disk levels and show disk latency. The shorter the time needed to read or write data, the faster the system.
“The value for this counter is generally the number of seconds it takes to do each read. On less-
complex disk subsystems involving controllers that do not have intelligent management of the I/O, this
value is a multiple of the disk’s rotation per minute. This does not negate the rule that the entire
system is being observed. The rotational speed of the hard drive will be the predominant factor in the
value with the delays imposed by the controller card and support bus system.” [1]
Average Disk sec/Read is proportional to the time needed for one disk rotation. For example, a disk that makes 3,600 rotations per minute needs 60s/3600 ≈ 0.0167 seconds, i.e. about 16.7 milliseconds, to make one rotation. Average Disk sec/Read for that disk should be a multiple of roughly 16.7 milliseconds. The time added to one disk rotation is the queuing time and the time needed for data transit across the I/O bus.
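The rotation-time arithmetic above can be sketched as a short helper (a hypothetical illustration, not part of any Windows API):

```python
def rotational_latency_ms(rpm: float) -> float:
    """Time in milliseconds for one full platter rotation at the given spindle speed."""
    return 60.0 / rpm * 1000.0

# A 3,600 RPM disk needs ~16.7 ms per rotation; a 7,200 RPM disk ~8.3 ms.
print(round(rotational_latency_ms(3600), 1))  # 16.7
print(round(rotational_latency_ms(7200), 1))  # 8.3
```

Actual Avg. Disk sec/Read values will be multiples of this figure once queuing and bus transit time are added.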
< 8 ms – Excellent
8 – 12 ms – OK
12 – 20 ms – Fair
> 20 ms – Bad
Maximum peaks during excessive I/O operations can be up to 25 milliseconds, but values constantly
higher than 20 milliseconds indicate poor performance.
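The thresholds above can be expressed as a small rating helper (a hypothetical sketch for non-cached reads, not from the article):

```python
def rate_read_latency(ms: float) -> str:
    """Classify an Avg. Disk sec/Read value (in milliseconds) per the thresholds above."""
    if ms < 8:
        return "Excellent"
    if ms <= 12:
        return "OK"
    if ms <= 20:
        return "Fair"
    return "Bad"

print(rate_read_latency(5))   # Excellent
print(rate_read_latency(15))  # Fair
print(rate_read_latency(25))  # Bad
```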
Average Disk sec/Write
Average Disk sec/Write is another useful disk performance metric that shows the average time in
seconds needed to write data to disk.
Usually, the read and write speeds on a disk are different. The recommended values for non-cached writes are the same as for Average Disk sec/Read. In the case of cached writes, the values are very different – values higher than 4 milliseconds indicate poor performance, while values less than 1 millisecond indicate the best performance.
< 1 ms – Excellent
1 – 2 ms – OK
2 – 4 ms – Fair
> 4 ms – Bad
If the Average Disk sec/Read and Average Disk sec/Write values are constantly above the
recommended values, it’s an indication of a disk bottleneck and additional analysis is required.
“After you have found the disks with high levels of read/write activity, look at the read-specific and
write-specific counters (for example, Logical Disk: Disk Write Bytes/sec) for the type of disk activity
that is causing the load on each logical volume.” [2]
If the Average Disk sec/Read and Average Disk sec/Write values are high for all or almost all disks, the problem is most probably caused by the disk communication medium. If only a specific disk shows poor performance, the problem is most probably in the disk itself.
Monitoring both values can help you determine if reconfiguration of the disk controller cache is needed. If, for example, the Average Disk sec/Read value is significantly higher than Average Disk sec/Write, you can consider cache optimization for reading.
Average Disk sec/Transfer
The Average Disk sec/Transfer metric shows disk efficiency as the average time needed for each read
and write.
“Measures the average time of each data transfer, regardless of the number of bytes read or written. Shows the total time of the read or write, from the moment it leaves the Diskperf.sys driver to the moment it is complete.
A high value for this counter might mean that the system is retrying requests due to lengthy queuing or, less commonly, disk failures.” [3]
The recommended value is the same as for the previous two metrics.
There’s no need to monitor this metric along with Average Disk sec/Read and Average Disk sec/Write,
as the latter two are combined in Average Disk sec/Transfer. But if you’re monitoring Average Disk
sec/Transfer and its values are higher than recommended, monitoring Average Disk
sec/Read and Average Disk sec/Write is the first step in further troubleshooting.
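Since transfers are just reads plus writes, Average Disk sec/Transfer is effectively the rate-weighted average of the read and write latencies. A hypothetical sketch of that relationship (the function name and sample figures are illustrative, not from the article):

```python
def avg_sec_per_transfer(reads_per_sec: float, avg_sec_read: float,
                         writes_per_sec: float, avg_sec_write: float) -> float:
    """Combine read and write latencies, weighted by their operation rates."""
    transfers = reads_per_sec + writes_per_sec
    if transfers == 0:
        return 0.0
    return (reads_per_sec * avg_sec_read + writes_per_sec * avg_sec_write) / transfers

# 300 reads/s at 10 ms plus 100 writes/s at 2 ms gives ~8 ms per transfer overall.
print(round(avg_sec_per_transfer(300, 0.010, 100, 0.002), 3))  # 0.008
```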
Disk Reads/sec and Disk Writes/sec
The Disk Reads/sec and Disk Writes/sec metrics show the rate of read and write operations on disk,
respectively.
The metric that shows the combined value of these two is Disk Transfers/sec – the total number of all I/O disk requests generated in a second.
If the values are low, they indicate slow disk I/O operation processing, and you should check processor usage parameters and disk-expensive queries.
There is no specific threshold, as it depends on disk specification and your server configuration. For an
array system, the values shown are for all disks. With that said, it’s recommended to monitor these
metrics for a while and to determine trends and set a baseline. Any unexpected peaks should be
investigated.
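The baselining approach described above can be sketched in a few lines: compute a baseline from a window of Disk Transfers/sec samples and flag values that stand out. This is a hypothetical illustration; the factor of 2 and the sample figures are arbitrary assumptions, not recommended thresholds.

```python
def find_peaks(samples: list, factor: float = 2.0) -> list:
    """Return samples that exceed the baseline (mean of the window) by the given factor."""
    baseline = sum(samples) / len(samples)
    return [s for s in samples if s > baseline * factor]

transfers_per_sec = [120, 135, 110, 128, 540, 125]  # one suspicious spike
print(find_peaks(transfers_per_sec))  # [540]
```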
In this part of the SQL Server performance metrics series, we presented the most important disk
performance metrics. All metrics show disk latency and if the latency is too high, the final solution is
upgrading the disk subsystem, or adding more disks.
SQL Server disk performance metrics – Part 2 – other important
disk performance measures
In the previous part of the SQL Server performance metrics series, we presented the most important
and useful disk performance metrics. Now, we’ll show other important disk performance measures.
Current Disk Queue Length
“Indicates the number of disk requests that are currently waiting as well as requests currently being
serviced. Subject to wide variations unless the workload has achieved a steady state and you have
collected a sufficient number of samples to establish a pattern.” [1]
The metric shows how many I/O operations are waiting to be written to or read from the hard drive
and how many are currently processed. If the hard drive is not available, these operations are queued
and will be processed when the disk becomes available. The whole disk subsystem has a single queue.
The Current Disk Queue Length metric in Windows Performance Monitor is available for both physical
and logical disk. In some earlier versions of Performance Monitor, this counter was named Disk Queue
Length.
The Current Disk Queue Length value should be less than 2 per disk spindle. Note that this is not per
logical, but per physical disk. If larger, this indicates a potential disk bottleneck, so further
investigation and monitoring other disk metrics is recommended. Start with monitoring %Disk
Time (explained below). Frequent peaks should also be investigated.
Disk array systems such as RAID or SAN have a large number of disks and controllers, which makes
queues on such systems shorter. Because the metric doesn’t indicate queuing per disk, but for the
whole array, some DBAs consider that monitoring Current Disk Queue Length on disk arrays is not
needed.
Another scenario where Current Disk Queue Length can be misleading is when data is stored in the
disk cache. It will be reported as being queued for writing and thus the Current Disk Queue
Length value will be higher than the actual value.
Average Disk Queue Length
The Average Disk Queue Length metric shows information similar to Current Disk Queue Length, only the value is an average over a specific time period rather than a point-in-time value. The threshold is the same as for the previous metric – less than 2 per individual disk drive in an array. For example, in a 6-disk array an Average Disk Queue Length value of 12 means that the queue is 2 per disk.
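The per-spindle normalization in the 6-disk example above can be sketched as follows (a hypothetical helper, not from the article):

```python
def queue_per_spindle(total_queue_length: float, spindles: int) -> float:
    """Normalize an array-wide queue length to a per-spindle value."""
    return total_queue_length / spindles

print(queue_per_spindle(12, 6))       # 2.0
print(queue_per_spindle(12, 6) <= 2)  # True, i.e. at the recommended threshold
```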
There are two more metrics similar to Average Disk Queue Length – Average Disk Read Queue
Length and Average Disk Write Queue Length. As their names indicate – they show the average queue
length for operations waiting for the disk to be read or written.
%Disk Time
“This counter indicates a disk problem, but must be observed in conjunction with the Current Disk
Queue Length counter to be truly informative. Recall also that the disk could be a bottleneck prior to
the %Disk Time reaching 100%.” [2]
The %Disk Time metric indicates how busy the disk is servicing read and write requests, but as stated above, it’s not a clear indication of a problem, as its values can be normal while there’s a serious disk performance issue. Its value is the Average Disk Queue Length value expressed as a percentage (i.e. multiplied by 100). If Average Disk Queue Length is 1, %Disk Time is 100%.
What can be confusing is that %Disk Time values can be over 100%, which isn’t logical. This happens
if the Average Disk Queue Length value is greater than 1. If Average Disk Queue Length is 3, %Disk
Time is 300%, which doesn’t mean that processes are using 3 times more disk time than available, nor
that there is a bottleneck.
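The relationship described above, %Disk Time as Average Disk Queue Length expressed as a percentage, is simple enough to show directly (a hypothetical illustration, not the counter’s actual implementation):

```python
def disk_time_percent(avg_disk_queue_length: float) -> float:
    """%Disk Time is the Average Disk Queue Length multiplied by 100."""
    return avg_disk_queue_length * 100.0

print(disk_time_percent(1))  # 100.0
print(disk_time_percent(3))  # 300.0 (over 100%, yet not necessarily a bottleneck)
```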
If you have a hard disk array, the total disk time for all disks is shown, without the indication of how
many disks are available and what disk is having the highest %Disk Time. For example, %Disk
Time equal to 500% might indicate good performance (in case you have 6 disks), or extremely bad (in
case you have only 1 disk). You cannot tell without knowing the machine hardware.
As this counter can be misleading, some DBAs don’t use it, as there are other more straightforward and indicative metrics that show disk performance.
If the value is higher than 90% per disk, additional investigation is needed. First, check the Current
Disk Queue Length value. If it’s higher than the threshold (2 per physical disk), monitor if the high
values occur frequently. If the machine is not used only for SQL Server, other resource-intensive applications might cause disk bottlenecks, and SQL Server performance will suffer. If this is the case, consider moving these applications to another machine and using a dedicated machine for SQL Server only.
If this is not the case, or cannot be done, consider moving some of the files – archived databases, database and transaction log backups – to another disk or machine, using a faster disk, or adding additional disks to the array.
%Disk Read Time and %Disk Write Time
The %Disk Read Time and %Disk Write Time metrics are similar to %Disk Time, but they show only the read or write operations, respectively. They are actually the Average Disk Read Queue Length and Average Disk Write Queue Length values expressed as percentages. The values these metrics show can be just as misleading as those of %Disk Time.
On a three-disk array system, if one disk reads 50% of the time (%Disk Read Time = 50%), another reads 85% of the time, and the third is idle, %Disk Read Time is 135% and Average Disk Read Queue Length is 1.35. At first glance, a %Disk Read Time of 135% looks like a problem, but it’s not. It doesn’t mean the disks are busy 135% of the time. To get a real value, divide the value by the number of disks: 135%/3 = 45%, which indicates normal performance.
%Idle Time
The disk is idle when it’s not processing read and write requests.
“This measures the percentage of time the disk was idle during the sample interval. If this counter falls
below 20 percent, the disk system is saturated. You may consider replacing the current disk system
with a faster disk system.” [3]
If the value is lower than 20%, the disk is not able to service all read and write requests in a timely fashion. Before opting for disk replacement, check whether it’s possible to move some applications to another machine.
%Free Space
Besides Windows Performance Monitor, this metric is available in Windows Explorer in the computer
and disk Properties tabs. While Performance Monitor shows the percentage of available free disk space,
Windows Explorer shows the amount in GB.
“This measures the percentage of free space on the selected logical disk drive. Take note if this falls
below 15 percent, as you risk running out of free space for the OS to store critical files. One obvious
solution here is to add more disk space.” [3]
If the value shows sudden peaks without obvious reasons, further investigation is required.
Unlike most of the memory and processor SQL Server performance metrics, disk metrics can be quite deceptive. They might not clearly indicate a performance problem; their values might look OK when there is actually a serious disk issue, while strangely high values might reflect normal performance, as they show values for an array of disks. When it comes to array metrics, knowledge of the hardware configuration is necessary to read them correctly. Despite these downsides, disk metrics are necessary for SQL Server performance troubleshooting.
https://sites.google.com/site/saifsqlserverrecipes/memory-performance-counters/key-performance-counters-and-their-thresholds-for-windows-server
https://knowledge.broadcom.com/external/article/181816/common-performance-monitor-counter-thres.html
http://woshub.com/how-to-measure-disk-iops-using-powershell/
https://www.concurrency.com/blog/september-2019/diagnosing-disk-performance-issues
https://www.smartertools.com/blog/2016/07/15-configure-perfmon-to-prevent-disk-issues
https://www.sqlshack.com/sql-server-disk-performance-metrics-part-1-important-disk-performance-metrics/
https://www.sqlshack.com/sql-server-disk-performance-metrics-part-2-important-disk-performance-measures/