Capturing SQL Server IO Latencies for a Period of Time with PowerShell

Posted in: Microsoft SQL Server, Technical Track

Today’s blog post is a demonstration of the PowerShell approach to Paul Randal’s recent blog post, Capturing IO Latencies for a Period of Time. Since wait statistics were properly implemented in SQL Server, it turned into a powerful resource to diagnose and troubleshoot SQL Server issues.

Paul Randal is, IMHO, one of  the best names in this technology.  I requested his approval to use and modify his code to demonstrate another approach that can be used not only in this situation, but in others where you need to capture data in an interval and  don’t want to use SQL Server resources. Paul, as always, was very kind to allow me to use his implementation to demonstrate mine—thanks a lot, Paul.

In his excellent blog post Capturing IO Latencies for a Period of Time, Paul demonstrates how to capture the wait stats in an interval of 30  minutes using only T-SQL.

If you have a very busy system perhaps leaving this job to trigger the gathering to another resource than SQL Server may be a good idea. Those who know me, know that I am PowerShell lover, so let’s take a look how to do that using PowerShell.

Before we proceed, I strongly encourage readers to check out my good friend Ed Wilson’s fantastic blog post Use Asynchronous Event Handling in PowerShell to understand timer event handlers, asynchronous .NET events, or the cmdlets to work with events.

A quick note about .NET System.Timers.Timer class. 

It is simple—the namespace System.Timers fires an event in a specific interval and the class Timer generates recurring events. In other words, I can set up an event to be fired on each x unit of time (milliseconds.)

The first thing to do it it is to create the object System.Timers.Timer and define the interval that we want. For now, let’s use 500 milliseconds  by setting the property interval.

$Timer= New-Object System.Timers.Timer -Property @{ Interval = 100;autoreset=$true }

We instanced the timer and set up the interval. Now we need to register the object (in the case object $timer) that will fire the event and execute the action. For that we will use the Register-Objectevent cmdlet The parameter Eventname we will use Elapsed that is detailed in Use Asynchronous Event Handling in PowerShell  and also in Manage Event Subscriptions with PowerShell.

First, let’s just print something on the screen:

Register-ObjectEvent -InputObject$timer  -Action { Write-Host‘Holy PowerShell. It Worked!!!! ‘} -EventName Elapsed -SourceIdentifier stateful

Id              Name            State      HasMoreData     Location             Command

—              —-            —–      ———–     ——–             ——-

1               93ce50c3-6f5… NotStarted False                                 Write-Host ‘Holy Powe…

If you simply run this code, you will realize that after 30 seconds nothing happens.

Wait, PowerShell is broken? No my dear friend, the .net class is so beautiful that it allows you start and stop the suppression of the event. Yes, you can start the event and will be recurring every 30 seconds, and also stop it.

So let’s start:

$Timer.Start()

Holy PowerShell. It Worked!!!!

Holy PowerShell. It Worked!!!!

Holy PowerShell. It Worked!!!!

Holy PowerShell. It Worked!!!!

Holy PowerShell. It Worked!!!!

Holy PowerShell. It Worked!!!!

Holy PowerShell. It Worked!!!!

And to stop it :

$Timer.Stop()

 

And to Unregister the event use :

Unregister-Event -SourceIdentifier stateful

How PowerCool is that?

Let’s get back to our example. Using Paul’s code, let’s first create the table to store the data. I did some changes to store the date and time so you also can have the historical data if you want and the table is physical.

SELECT [database_id], [file_id], [num_of_reads], [io_stall_read_ms],

[num_of_writes], [io_stall_write_ms], [io_stall],

[num_of_bytes_read], [num_of_bytes_written], [file_handle]

INTO IOLatency

FROM sys.dm_io_virtual_file_stats(NULL, NULL);

GO

alter table IOLatency add [TimeCapture] datetime default getdate()

alter table IOLatency add [Identifier] bigint

 

I added this column identifier because my TSQL it is not so great and I want (like Paul’s example) to get the difference between the last top samples between 30 minutes, so this column will have the same value for the last two gatherings and I can easily create the TSQL condition for that.

Now let’s play with PowerShell: We need to import the module SQLPS to use the invoke-sqlcmd cmdlet . Also in the Action parameter of the Register-Objectevent cmdlet I am using a scriptblock variable to be more friendly readable. Also I am using the interval as Paul, 30 minutes, so the difference always will be in the 2 last same date times with 30 minutes of difference.

 

$Timer=New-Object System.Timers.Timer -Property @{ Interval = 1800000 ;autoreset=$true } #1800000 30 minutes in milliseconds

#Create the ScriptBlock variable to run in the action parameter in the register-objectevent cmdlet

$Action= {

#Import the SQLPS Module

Import-Module SQLPS -ErrorAction SilentlyContinue -DisableNameChecking;

#variable to counter the numb er of the events

$Script:counter+= 1;

#if the events is more than 2 , increase the the identity

if ($counter-eq 3) {

$script:Identity+= 1;

$counter= 1

}

#Define the TSQL

$Tsql =@”

insert into IOLatency ([database_id], [file_id], [num_of_reads], [io_stall_read_ms],

[num_of_writes], [io_stall_write_ms], [io_stall],

[num_of_bytes_read], [num_of_bytes_written], [file_handle],[Identifier])

SELECT [database_id],[file_id], [num_of_reads], [io_stall_read_ms],

[num_of_writes], [io_stall_write_ms], [io_stall],

[num_of_bytes_read], [num_of_bytes_written], [file_handle],$($Identity)

FROM sys.dm_io_virtual_file_stats (NULL, NULL);

“@;

#Invoke the TSQL

invoke-sqlcmd -ServerInstance vader -Database Test -Query$Tsql

}

Register-ObjectEvent-InputObject$timer  -Action  $Action  -EventNameelapsed  -SourceIdentifierstateful

Not it is juts start ….

$Timer.Start()

And every 30 minutes the table IOLatency will be populated. If you want to stop the gathering but not unregistered the event :

$Timer.Stop()

This way if you want to start again, just timer.start().

Now it is time to get the data. Using The Paul´s query  with a change to get the same identifier :

 

WITH [DiffLatencies] AS

(SELECT

— Files that weren’t in the first snapshot

[ts2].[database_id],

[ts2].[file_id],

[ts2].[Identifier],

[ts2].[num_of_reads],

[ts2].[io_stall_read_ms],

[ts2].[num_of_writes],

[ts2].[io_stall_write_ms],

[ts2].[io_stall],

[ts2].[num_of_bytes_read],

[ts2].[num_of_bytes_written]

FROM IOLatency AS [ts2]

LEFT OUTER JOIN IOLatency AS [ts1]

ON [ts2].[file_handle] = [ts1].[file_handle]

and ts2.Identifier = ts1.Identifier

WHERE [ts1].[file_handle] IS NULL

UNION

SELECT

— Diff of latencies in both snapshots

[ts2].[database_id],

[ts2].[file_id],

[ts2].[Identifier],

[ts2].[num_of_reads] – [ts1].[num_of_reads] AS [num_of_reads],

[ts2].[io_stall_read_ms] – [ts1].[io_stall_read_ms] AS [io_stall_read_ms],

[ts2].[num_of_writes] – [ts1].[num_of_writes] AS [num_of_writes],

[ts2].[io_stall_write_ms] – [ts1].[io_stall_write_ms] AS [io_stall_write_ms],

[ts2].[io_stall] – [ts1].[io_stall] AS [io_stall],

[ts2].[num_of_bytes_read] – [ts1].[num_of_bytes_read] AS [num_of_bytes_read],

[ts2].[num_of_bytes_written] – [ts1].[num_of_bytes_written] AS [num_of_bytes_written]

FROM IOLatency AS [ts2]

LEFT OUTER JOIN IOLatency AS [ts1]

ON [ts2].[file_handle] = [ts1].[file_handle]

and ts2.Identifier = ts1.Identifier

WHERE [ts1].[file_handle] IS NOT NULL)

SELECT

DB_NAME([vfs].[database_id]) AS [DB],

LEFT([mf].[physical_name], 2) AS [Drive],

[mf].[type_desc],

[num_of_reads] AS [Reads],

[num_of_writes] AS [Writes],

[ReadLatency(ms)] =

CASE WHEN [num_of_reads] = 0

THEN 0 ELSE ([io_stall_read_ms] / [num_of_reads]) END,

[WriteLatency(ms)] =

CASE WHEN [num_of_writes] = 0

THEN 0 ELSE ([io_stall_write_ms] / [num_of_writes]) END,

/*[Latency] =

CASE WHEN ([num_of_reads] = 0 AND [num_of_writes] = 0)

THEN 0 ELSE ([io_stall] / ([num_of_reads] + [num_of_writes])) END,*/

[AvgBPerRead] =

CASE WHEN [num_of_reads] = 0

THEN 0 ELSE ([num_of_bytes_read] / [num_of_reads]) END,

[AvgBPerWrite] =

CASE WHEN [num_of_writes] = 0

THEN 0 ELSE ([num_of_bytes_written] / [num_of_writes]) END,

/*[AvgBPerTransfer] =

CASE WHEN ([num_of_reads] = 0 AND [num_of_writes] = 0)

THEN 0 ELSE

(([num_of_bytes_read] + [num_of_bytes_written]) /

([num_of_reads] + [num_of_writes])) END,*/

[mf].[physical_name],

[vfs].identifier

FROM [DiffLatencies] AS [vfs]

JOIN sys.master_files AS [mf]

ON [vfs].[database_id] = [mf].[database_id]

AND [vfs].[file_id] = [mf].[file_id]

— ORDER BY [ReadLatency(ms)] DESC

ORDER BY [Identifier],[WriteLatency(ms)] DESC;

GO

 

If you want to to get only one sample , just add a

ON [vfs].[database_id] = [mf].[database_id]

AND [vfs].[file_id] = [mf].[file_id]

AND [vfs].identifier = <IdentifierNumber>

 

This way you also can know what time was made this gathering, just checking the timecapture column in the identifier number. In other words, you also have a historical data that you can query and get the differences or make any calculations that you need.

The data is stored—you can play with whatever query you want and need.

If you are thinking of gathering perfmon counters, it is possible too. Take a look at my blog post on Simple-Talk, The PoSh DBA – Specifying and Gathering Performance Counters and just play with the timer class.

I guess that you realize you can use this approach in any scenario that you need to capture data in a busy system without using any kind (or a minimum)  of SQL Server resources. IMHO, that can be a good approach in those kind environments

Again, I would like to thank Paul Randal for kindly allowing me to use and modify to use his code and blog post to demonstrate my approach, and of course the Hey, Scripting Guy! blog.

 

Remember kids… If it is PowerCool, it is PowerShell!

email

Author

Interested in working with Laerte? Schedule a tech call.

No comments

Leave a Reply

Your email address will not be published. Required fields are marked *