The idea of this blog post is to explain what the delayed durability feature in SQL Server 2014 is and to walk through a use case from an application development perspective.
With every new SQL Server release we get a bunch of new features, and delayed durability of transactions really caught my attention. Most relational database engines handle transactions with the write-ahead logging method (https://en.wikipedia.org/wiki/Write-ahead_logging): when a transaction comes into the database, its log records must be written to the transaction log on disk before the commit is acknowledged, and the modified pages in memory are written to the data file later. Because the transaction log is pretty much a record of every transaction, recovery can replay the log to rebuild data whose pages were never written to the data file. In summary, this is the data protection method used to handle transactions, and MSDN calls this a transaction with FULL DURABILITY.
So what is Delayed Transaction Durability?
To accomplish delayed durability, log writes become asynchronous: log records are kept in an in-memory buffer until either the buffer is full or a flush takes place. Instead of waiting for the log write to be hardened on disk before acknowledging the commit, the data simply waits in memory and control of the transaction is returned to the requesting application sooner. A transaction that initially only touches memory and avoids a trip to the disk will, of course, complete faster.
But when is the data really stored on disk?
SQL Server handles this depending on how full the log buffer is, and will then execute asynchronous log writes to finally store the information on disk. You can always force this to happen with the system stored procedure sp_flush_log.
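For example, a minimal sketch that forces any pending delayed durable log records of the current database to disk (DelayedDB is the sample database used later in this post):
-- Switch to the database whose log buffer you want to flush.
USE DelayedDB;
-- Harden all in-memory (delayed durable) log records to disk.
EXEC sys.sp_flush_log;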
OK, but there is a risk, right?
Correct. Since the usual data protection method is essentially skipped, in the event of a system disruption, such as a SQL Server failover or a plain "unexpected shutdown", transactions that were committed but whose log records were still sitting in memory can be lost in the so-called limbo somewhere between the application pool and the network cable.
Why would I want to use this?
Microsoft recommends using this feature only if you can tolerate some data loss, if you are experiencing a bottleneck or performance issue related to log writes, or if your workload has a high contention rate (processes waiting for locks to be released).
How do I Implement it?
To use delayed transactions you enable this as a database property. You can use the FORCED option, which will try to handle all transactions as delayed durable, or the ALLOWED option, which lets you use delayed durable transactions that you then request explicitly in your T-SQL (this is called atomic block level control). See a sample taken from MSDN below:
CREATE PROCEDURE …
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS BEGIN ATOMIC WITH
(
DELAYED_DURABILITY = ON,
TRANSACTION ISOLATION LEVEL = SNAPSHOT,
LANGUAGE = N'English'
…
)
END
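Outside of natively compiled procedures, the database-level setting and (under ALLOWED) a per-commit opt-in look roughly like the sketch below; MyAppDB and the dbo.AppLog table are made-up names for illustration:
-- Database-level setting: DISABLED (default), ALLOWED or FORCED.
ALTER DATABASE MyAppDB SET DELAYED_DURABILITY = ALLOWED;

-- With ALLOWED, an individual transaction can opt in at commit time.
BEGIN TRANSACTION;
    INSERT INTO dbo.AppLog (LogDate, Message)   -- hypothetical logging table
    VALUES (SYSDATETIME(), N'A delayed durable write');
COMMIT TRANSACTION WITH (DELAYED_DURABILITY = ON);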
For more syntax and details, I invite you to check the ever-so-wise MSDN Library.
Enough of the background information, and let’s take this puppy for a ride, shall we?
Consider the following scenario: you manage a huge application, something between an ERP and a finance module. The company has developed this application from scratch, and each year more and more features are added to it. The company decides it wants to standardize procedures and have more control over the events of the application. They realize they do not have enough audit traces: if someone deletes data, or if a new deal or customer record is inserted, management needs a track record of almost everything that happens. They have some level of logging, but it is implemented differently depending on each developer's taste and mood.
So, Mr. MS Architect decides they will implement the Enterprise Library Logging Application Block and handle both exceptions and custom logging with this tool. After all this logging is added to the events, the system begins to misbehave, and the usual slow is now officially really slow. Mr. Consultant then comes in and suggests that the logging data be moved to a separate database, and that this database should use delayed durability. By doing so, transactions related to logging events will have less contention and will return control to the application faster, and since some level of data loss can be tolerated for logging data, the decision is even easier.
Let’s build a proof of concept and test it.
You can find a sample project attached: WebFinalNew
You need to have Enterprise Library installed for your Visual Studio environment. For this sample I am using Visual Studio 2010.
You need to create two databases, DelayedDB and NormalDB (of course, we need to use SQL Server 2014).
Use the attached script LoggingDatabase (which is part of the Enterprise Library scripts); it will create all the objects needed for the Logging Application Block.
In DelayedDB, edit the database properties and set Delayed Durability to FORCED; this will make all transactions delayed durable (please note that some transactions will never be delayed durable, such as system transactions, cross-database transactions, and operations involving FileTable, Change Tracking and Change Data Capture).
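If you prefer T-SQL over the properties dialog, the same step looks roughly like this sketch (the sys.databases query is just a quick way to confirm the setting took effect):
-- Treat every transaction in DelayedDB as delayed durable.
ALTER DATABASE DelayedDB SET DELAYED_DURABILITY = FORCED;

-- Confirm the setting on both test databases.
SELECT name, delayed_durability_desc
FROM sys.databases
WHERE name IN (N'DelayedDB', N'NormalDB');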
You need to create a web project; it should have a web.config, and if not you can manually add a configuration file:
Make sure you add all the application block references (Logging Block).
Now right-click the web.config or app.config file and edit your Enterprise Library configuration.
In the Database Settings block, add two new connections (one for NormalDB and the other for DelayedDB); make sure to specify each connection in the form of a connection string, like the picture below:
In the Logging block, create a new category called DelayedLogging; this will point to the database with delayed durability enabled.
Then add two Database Trace Listeners. Configure the General category to point to "Database Trace Listener" and the DelayedLogging category to point to "Database Trace Listener 2". Configure each listener to point to the corresponding database (one for each database previously configured in the Database Settings block).
Save all changes and go back to the application, then configure the design layout with something like the one below.
Add a code-behind handler to the button in the screen above and write code that iterates and sends X number of commands to each database, tracking in a variable the time it takes to send each transaction and regain control of the application. Check the attached project for more details, but use logWriter.Write and pass as a parameter the category you configured to connect to DelayedDB (DelayedLogging), and use the General category (default, no category parameter) to connect to NormalDB. See a sample of how a logging transaction is fired below:
logWriter.Write("This is a delayed transaction","DelayedLogging");
logWriter.Write("This is a transaction");
This code will call the Logging block and execute a transaction against each database, the fully durable one (NormalDB) and the delayed durable one (DelayedDB), and it will track the milliseconds it takes to return control to the application. Additionally, I will capture Performance Monitor counters and query statistics from the database engine to see the difference in behavior.
Quick Test
Batch insert of 1,000 rows: the fully durable (normal) database took, on average, 1 millisecond more per transaction to return control to the application:
What information do we have from sys.dm_io_virtual_file_stats?
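The numbers below can be pulled with a query along these lines, a sketch that reads the I/O statistics for the transaction log file of each test database:
-- I/O statistics for the log file of each test database.
SELECT  DB_NAME(vfs.database_id) AS [Database],
        vfs.io_stall_read_ms,
        vfs.num_of_writes,
        vfs.num_of_bytes_written,
        vfs.io_stall_write_ms,
        vfs.io_stall,
        vfs.size_on_disk_bytes
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
    ON mf.database_id = vfs.database_id
   AND mf.file_id = vfs.file_id
WHERE DB_NAME(vfs.database_id) IN (N'DelayedDB', N'NormalDB')
  AND mf.type_desc = N'LOG';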
Database | io_stall_read_ms | num_of_writes | num_of_bytes_written | io_stall_write_ms | io_stall | size_on_disk_bytes |
DelayedDB | 47 | 5126 | 13843456 | 4960 | 5007 | 1048576 |
Normal | 87 | 5394 | 14492160 | 2661 | 2748 | 1048576 |
We can see that the same amount of data was sent to both databases (last column, size_on_disk_bytes). An interesting observation is the stalls: in a delayed durable database the write stall is higher. This means that although the transaction executes "faster", what really happens is that control returns to the application faster, while the time it takes to actually store the data on disk can be higher, since the writes are done asynchronously.
Let’s see a quick graphic of the performance impact
Delayed Durability
With delayed durability the average disk queue length is higher, since SQL Server waits for the buffer to fill and then executes the write. You can see the yellow peak (within the red circle) after the batch completes, when the pending writes are executed (the moment where I issue sp_flush_log).
Full Durability
With full durability the average disk queue length is lower; since the writes are executed sequentially as each transaction commits, there are fewer pending transactions in memory.
Conclusion
The delayed durability feature is definitely a great addition to your DBA toolbelt. It needs to be used with all the risks involved taken into consideration, but if properly tested and implemented it can definitely improve the performance and architecture of certain applications. It is important to understand that this is not a turbo button (the way some people treat the NOLOCK hint); it should be used only for certain types of transactions and tables. Will this change your design methods and make you plan for a separate delayed durable database, or plan to implement certain modules with delayed durable transactions? This will surely have an interesting impact on software design and architecture.