May 05, 2015
 

My recent post on using the new DiskSpd utility to help you benchmark your storage is a great primer on how to use DiskSpd. But… what if you want to run multiple tests? For example, what if you want to run both read and write tests with a varying number of threads or operations per thread to see the ramp-up curve? What if you wanted these automated? What about putting the results from all of these tests into something that you can quickly review?

Now you can – with a PowerShell script that I’m releasing for free. I call it DiskSpd Batch.

Not only does the script help automate your test cycles, it leverages the great DiskSpd feature of saving the results to an XML file. After the testing cycles complete, it then extracts the relevant information from each test cycle and places it into a CSV output file. You can use this file to perform your own analysis on the results.
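
If you are curious what that extraction step looks like under the hood, here is a minimal PowerShell sketch of the idea. The XML element names here are assumptions based on DiskSpd’s XML output format, not the exact logic of DiskSpd Batch, so verify them against your own result files:

# Sketch: pull read/write IOPs out of a DiskSpd XML result file and append to a CSV.
# Element names (TestTimeSeconds, Thread, Target, ReadCount, WriteCount) are assumptions.
[xml]$result = Get-Content "C:\diskspd\results\test1.xml"
$seconds = [double]$result.Results.TimeSpan.TestTimeSeconds
$targets = $result.Results.TimeSpan.Thread.Target
$reads   = ($targets | ForEach-Object { [long]$_.ReadCount }  | Measure-Object -Sum).Sum
$writes  = ($targets | ForEach-Object { [long]$_.WriteCount } | Measure-Object -Sum).Sum
[pscustomobject]@{
    ReadIOps  = [math]::Round($reads / $seconds, 2)
    WriteIOps = [math]::Round($writes / $seconds, 2)
} | Export-Csv "C:\diskspd\results\summary.csv" -NoTypeInformation -Append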

This PowerShell script is available for free over at my business web site at Heraflux.com.

Usage

First, download DiskSpd from TechNet, and extract it to your hard drive on the server that you wish to test. Read the documentation that comes with it.

Next, find the subdirectory that matches your system architecture (32 or 64-bit). This path is the location of the DiskSpd executable.
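
If you’d rather script that step, something like this can pick the folder for you (the folder names are assumptions based on the DiskSpd archive layout at the time of writing, so adjust them to match your extract):

# Pick the DiskSpd binary that matches the OS architecture.
$arch = if ([Environment]::Is64BitOperatingSystem) { "amd64fre" } else { "x86fre" }
$diskspd = Join-Path "C:\diskspd\$arch" "diskspd.exe"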

Download and copy the DiskSpd Batch script into a folder on your file system.

From an elevated PowerShell prompt, execute the script with the parameters described below.

Syntax

  • -Time: Duration for each test cycle, measured in seconds
  • -DataFile: Path and filename for the workload file
  • -DataFileSize: Workload file size, in the format “500M” for 500MB or “10G” for 10GB
  • -OutPath: Results output file location (the output file is automatically named)
  • -SplitIO: “True” tests permutations of mixed read and write tests in the same test cycle, in increments of 10%. “False” tests only 100% read or 100% write test cycles.
  • -AllowIdle: Pauses for 20 seconds between test cycles, so as not to overwhelm a storage device’s ability to flush inbound I/O to disk

Example

A normal test cycle might resemble the following screenshot.
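
As a reference point, a run might be kicked off like this (the script filename and parameter values here are illustrative, so adjust them to match your download and environment):

.\DiskSpdBatch.ps1 -Time 60 -DataFile "E:\diskspd\io.dat" -DataFileSize "10G" -OutPath "C:\diskspd\results" -SplitIO "True" -AllowIdle "True"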

The script performs numerous tests in the testing cycle, and then extracts the relevant data from the resultant XML file and creates a CSV file with the information that matters.

The output file can be opened with your favorite spreadsheet program. The columns that you will find the most interesting are:

  • WriteRatio
  • IsRandom
  • MB/s
  • IOps
  • Read MB/s
  • Read IOps
  • Write MB/s
  • Write IOps
  • Read and write latencies, broken out by percentile


Download and experiment with it! Remember, storage testing can be dangerous to an IT infrastructure. Not only can you overwhelm one server that’s doing the testing, you can also negatively impact (or even bring offline) the entire storage device and all of the other dependent systems located on it. Do not execute any storage tests in your environment outside of your own workstation until you have the express permission to execute the tests during a pre-specified window of opportunity. Heraflux is not liable for any damage or disruption to your business from you executing these tests in your environment.


Special thanks goes to my friend Mike Fal ( b | t | l ) for helping me with the PoSH in this script. He’s an incredible PoSH resource!

If you have any feedback on these scripts, or have bugs or ideas for improvements, please don’t hesitate to contact me.

Apr 01, 2015
 

As I mentioned in my storage benchmarking post, storage performance is one of the critical infrastructure components underneath a mission-critical SQL Server.

My de facto storage benchmarking utility has recently been updated. Last October, Microsoft released a great free utility called diskspd, available at http://aka.ms/DiskSpd. I consider it a very solid, modern replacement for the much-loved SQLIO. It is a synthetic I/O subsystem workload generator that runs via a command line. It produces tests similar to SQLIO’s, such as read or write, random or sequential, varying numbers of threads and thread intensity, and configurable block sizes, but it also gives us some significant improvements.

The benefits of diskspd include:

  • Sub-millisecond granularity on all tests, extremely useful for local SSDs and flash/hybrid storage arrays
  • Ability to perform read AND write tests in the same test, similar to IOmeter
  • Latency output metrics per read and write portions of the test, with standard deviation values
  • CPU consumption analysis by thread during the tests
  • Latency percentile analysis with percentiles 0-99 and then 99.9 up to 99.99999999 and then 100%, which is very useful for finding inflection points at the extremes which can skew test averages
  • Can define the workload placement and size in the command line parameters, which is useful to keep the test cycles compact
  • Ability to set the entropy values used in the workload file generation
  • Output is in plain text, with an option to output to XML (see the example after this list), which is extremely useful for converting the results for use elsewhere
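
For example, to capture a run’s results as XML instead of plain text, add the XML result switch to the command line. A quick sketch (the -Rxml switch name is from the DiskSpd builds current at the time of writing; check diskspd’s help output to confirm):

diskspd -b8K -d30 -h -L -o4 -t4 -r -w20 -Z1G -c50G -Rxml E:\diskspd\io.dat > result.xml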

So, how do we use this utility? Simple. Download the executable file from TechNet (source code is available at GitHub for those who are interested) and extract the archive to your file system.

For this example, we’ll use c:\diskspd. Copy the diskspd.exe file from the platform folder of your choice into the c:\diskspd folder to keep the pathing simple.

Some of the command line parameters are as follows.

  • -b: Block size of the I/O. Specify your unit of size (K/M/G). For example, 8KB block tests should use the flag -b8K.
  • -c: Creates workload file(s) of the specified size. Specify your unit of size (K/M/G). For example, a 50GB workload file can be created with the parameter -c50G.
  • -d: Test duration, in seconds. Tests of at least 30 seconds each are usually sufficient; longer tests are suitable for production workload testing.
  • -h: Disables hardware and software caching from within the Windows OS layer. This mimics SQL Server behavior.
  • -L: Captures latency information. Vitally important for SQL Server performance testing.
  • -o: Outstanding I/Os (queue depth) per worker thread. Setting this higher increases the intensity of the test, which pushes your storage harder.
  • -r: Random or sequential. If -r is specified, random tests are performed. If this parameter is omitted, sequential tests are performed.
  • -t: Worker threads. I usually set this to the number of non-hyperthreaded cores on the server.
  • -w: Read and/or write percentage, based on the percentage of writes. If the test is a pure read test, set this to zero. If the test is a pure write test, set it to 100. You can also mix reads and writes. For example, if you want to perform a 20% write / 80% read test, set the parameter as -w20.
  • -Z: Sizes the write source buffers used to generate the workload data. Specify your unit of size (K/M/G). The larger the value, the more write entropy (randomness) your workload data file contains. Experiment with this value based on your system and database workload profiles. For example, 1GB source buffers would use the flag -Z1G.

At the end of the command line, specify the workload file’s location and file name.

Other parameters exist for more advanced workload simulations, so read the great documentation that accompanies the executable.

What if you want to simulate SQL Server? For OLTP-type workloads, use the following sample command as a starting point.

diskspd -b8K -d2 -h -L -o4 -t4 -r -w20 -Z1G -c50G e:\diskspd\io.dat > resultssql.txt

This test executes a two-second, 80% read / 20% write test with an 8KB block size against a 50GB workload file located on the E: drive, using four worker threads, each with four outstanding I/Os, and a 1GB write source buffer for entropy. It saves the output text into the resultssql.txt file for reference.

You can also save this into a batch or PowerShell script to make this test easily repeatable.
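
A minimal sketch of what that could look like in PowerShell, sweeping the queue depth to build a ramp-up curve (paths and parameter values here are examples only):

# Run the same diskspd test at increasing queue depths and save each result.
$diskspd = "C:\diskspd\diskspd.exe"
foreach ($depth in 1, 2, 4, 8, 16) {
    & $diskspd -b8K -d30 -h -L "-o$depth" -t4 -r -w20 -Z1G -c50G E:\diskspd\io.dat |
        Out-File "C:\diskspd\results_o$depth.txt"
}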

Execute this with Administrator privileges; otherwise, you might see an error about needing permissions to write the workload file, or the test might take longer to start.
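
If you want to verify before kicking off a long run, this quick check (a standard PowerShell idiom, not part of diskspd) returns True in an elevated session:

# Returns True when the current PowerShell session is elevated.
$identity = [Security.Principal.WindowsIdentity]::GetCurrent()
(New-Object Security.Principal.WindowsPrincipal($identity)).IsInRole(
    [Security.Principal.WindowsBuiltInRole]::Administrator)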

[Screenshot: diskspd test execution]

The output presents some fantastic granular data. Let’s investigate the sections.

[Screenshot: diskspd output header]

The header simply shows the parameters that you used to run the individual test, including the command line itself.

[Screenshot: CPU consumption by worker thread]

The next section shows the CPU consumption, broken out by user and kernel time by worker thread, for the test cycle.

[Screenshot: IOPs and throughput by thread, read and write]

The next section gets more interesting. We now have the IOPs and throughput metrics for the test, broken out by worker thread and by read and write. IOPs by read and write matter the most here, with throughput a close second. The operations by thread should be very similar. Higher IOPs and throughput values are better.

[Screenshot: latency percentile analysis]

The last section is the most interesting. It presents a percentile analysis of the storage performance from the minimum value up to the maximum value. Look for points of inflection in the data. In this test, you can see that we had a significant inflection point between the 99th and 99.9th percentile. Statistically speaking, less than one percent of our I/Os in this test had a latency greater than 4.123 milliseconds, and less than one tenth of one percent had a latency greater than 11.928 milliseconds.

In conclusion, storage testing with diskspd has never been easier! This utility is now my de facto storage benchmarking tool. Give it a try!

Stay tuned – I’ve got a surprise coming!

Dec 15, 2014
 


You should benchmark your storage immediately. If you are a database administrator, you should benchmark it yesterday. And today. And next week.

Well, that’s exaggerating things a bit… but storage performance should be at the core of your ongoing performance metric collection and testing process. Storage performance matters that much to the business your data powers. If it’s not performing well, it’s your duty to document the problem and then work with the folks responsible to do something about it.

Why does it matter so much?

Consider a modern application server stack.

[Diagram: the application server stack]

Profile the items in this stack for their performance characteristics. Generally speaking, storage performance is slower than memory or CPU. Application code can be all over the place so I’ll disregard that layer for this discussion.

Your storage slows down your SQL Server. It’s the inherent nature of the technology. So why is storage slow? It’s the most complex component in the stack, has the most moving parts, and historically has had the slowest components. Modern flash arrays are working in our favor nowadays, but storage is still slower than memory or CPUs.

Why is it so complex? It’s not just a black box, as usually represented on a lot of system diagrams. Expand the storage component in this stack. It’s not that simple.

A modern storage platform has many layers and parts, all of which are endlessly customizable by the organization and the storage administrators who architect and maintain such environments. As DBAs, we might not be able to see deep into this stack (or need/care to), but we get the end result of the stack in the form of performance underneath our data.

Three metrics matter to database servers (in this order):

  • Latency (measured in milliseconds)
  • I/Os per second (measured in IOPs)
  • Throughput (Megabytes per second)

The round-trip time from your server to disk and back is the latency. High latencies mean that your highly concurrent database server just cannot get data back from the storage fast enough. It could mean a lot of things – the path between the storage and server is clogged, the storage just can’t keep up with the request (or the requests running in the background), or something else is problematic. Usually these round trips are measured in milliseconds, unless you’re on a flash or flash-hybrid array, where the performance might be sub-millisecond.

I/Os per second (IOPs) is how many of these transactions you can process concurrently in a second. Faster is always better. The more you can support and sustain, the better the overall performance, concurrency, and scale of your servers. For perspective, a single desktop-grade SATA disk can sustain about 100 IOPs. A 15K RPM SAS disk can sustain 175 to 210 IOPs. A lower-end IP-based SAN can usually sustain something in the 2500 IOPs range. Flash SANs can handle 1M IOPs and beyond!

Throughput is a combination of block size and IOPs. If a storage unit can handle a 1,000 IOPs read stream with 4KB blocks, you should be reading at 4,000 KB/s, or roughly 4MB/s of throughput.
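
A quick sanity check of that arithmetic in PowerShell:

# 1,000 IOPs at a 4KB block size works out to roughly 4MB/s.
$iops = 1000; $blockKB = 4
"{0:N0} KB/s = {1:N1} MB/s" -f ($iops * $blockKB), ($iops * $blockKB / 1024)
# Output: 4,000 KB/s = 3.9 MB/s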

IOPs and throughput also vary with the block size of the transaction. Most Windows NTFS-formatted volumes are formatted with the default 4KB allocation unit size. Microsoft recommends a 64KB NTFS allocation unit size for SQL Server volumes. Generally speaking, if the block size of an I/O stream on a SAN/NAS doubles, either the IOPs are cut in half, or the throughput doubles if the storage can handle the additional load. For example, on the local SSD in my workstation, we can see this very quickly with one of the utilities I’ll be exploring on this blog soon – SQLIO.

[Screenshot: SQLIO sequential read tests at 4KB and 8KB block sizes]

The parameter -bX is the block size, in KB. The two tests show a sequential read test on my local SSD with both 4KB and 8KB block sizes. The first test did not stress the storage to its maximum performance, so we had headroom to go up, and it did: the throughput doubled when the block size doubled.

From within SQL Server, you can see this fairly easily. SQL Server keeps track of disk latencies by database file, and you can query to see the average values per drive and by file. This query came from Glenn Berry’s diagnostic queries.

-- SQL Server 2012 Diagnostic Information Queries
-- Glenn Berry
-- November 2014
-- Last Modified: November 3, 2014
-- http://sqlserverperformance.wordpress.com/
-- http://sqlskills.com/blogs/glenn/
-- Twitter: GlennAlanBerry
-- Drive level latency information (Query 24) (Drive Level Latency)
-- Based on code from Jimmy May
SELECT [Drive],
    CASE
        WHEN num_of_reads = 0 THEN 0
        ELSE (io_stall_read_ms/num_of_reads)
    END AS [Read Latency],
    CASE
        WHEN io_stall_write_ms = 0 THEN 0
        ELSE (io_stall_write_ms/num_of_writes)
    END AS [Write Latency],
    CASE
        WHEN (num_of_reads = 0 AND num_of_writes = 0) THEN 0
        ELSE (io_stall/(num_of_reads + num_of_writes))
    END AS [Overall Latency],
    CASE
        WHEN num_of_reads = 0 THEN 0
        ELSE (num_of_bytes_read/num_of_reads)
    END AS [Avg Bytes/Read],
    CASE
        WHEN io_stall_write_ms = 0 THEN 0
        ELSE (num_of_bytes_written/num_of_writes)
    END AS [Avg Bytes/Write],
    CASE
        WHEN (num_of_reads = 0 AND num_of_writes = 0) THEN 0
        ELSE ((num_of_bytes_read + num_of_bytes_written)/(num_of_reads + num_of_writes))
    END AS [Avg Bytes/Transfer]
FROM (SELECT LEFT(UPPER(mf.physical_name), 2) AS Drive, SUM(num_of_reads) AS num_of_reads,
        SUM(io_stall_read_ms) AS io_stall_read_ms, SUM(num_of_writes) AS num_of_writes,
        SUM(io_stall_write_ms) AS io_stall_write_ms, SUM(num_of_bytes_read) AS num_of_bytes_read,
        SUM(num_of_bytes_written) AS num_of_bytes_written, SUM(io_stall) AS io_stall
    FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
    INNER JOIN sys.master_files AS mf WITH (NOLOCK)
        ON vfs.database_id = mf.database_id AND vfs.file_id = mf.file_id
    GROUP BY LEFT(UPPER(mf.physical_name), 2)) AS tab
ORDER BY [Overall Latency] OPTION (RECOMPILE);

You can see each drive on my storage represented here (one of the volumes is busy in the background so I can drive up some latency counters). The latency counters are in milliseconds. Remember, these are averages, so a minor blip while the workload runs can throw off the averages significantly.

[Screenshot: per-drive latency query output]

That leads me to a good transition – what do you use to actually test the storage? A lot of options are available – and most of them are free. Over the next month or so, we’ll be exploring here how to use some of my favorite free disk benchmarking utilities, including the following:

  • Microsoft SQLIO (with SQLIO Batch)
  • FIO
  • IOMeter
  • DiskSpd (new from Microsoft)


Warning! Disk benchmarking utilities can put a significant strain on your storage subsystem. Unstable storage devices can actually crash under the load. Stable storage can suffer performance degradation of your workload, as well as everything else running on it. DO NOT run any storage test on a production environment without the express permission of those responsible for the environment. I will not be responsible for any disruption in service to any system that has a stress test run on it inappropriately.


Now, with that out of the way, stay tuned for more posts soon on how to benchmark the storage underneath your SQL Servers, and how to simulate OLTP traffic on various disks with some advanced parameters of these tools!
