Blogs

Buyer Beware: Azure SQL Managed Instance Storage is Regularly as Slow as 60 Seconds

What are your stories of unbelievably bad performance from cloud vendors? I’ll go first. For years, Azure SQL Managed Instance’s General Purpose Tier has documented “approximate” storage latency as being “5-10 ms." This week they added a footnote: “This is an average range. Although the vast majority of IO request durations will fall under the top of the range, outliers which exceed the range are possible.” How approximate is that 5-10 milliseconds, you might wonder?

Continue reading

Query Hash Values are Meaningless in SQL Server: They May be Reset to be the Same Value as the Query Plan Hash

This is the worst bug I’ve found in SQL Server to date. Previously, my top find was SQL Server Online Index Rebuild sometimes happens offline without warning. This one has taken top slot because it makes my life more difficult on a daily basis. Background: SQL Server generates a query_hash for each query. This is stored in sys.query_store_query and it’s one of the primary ways you can identify what a query is across different Query Stores, or even the same Query Store over time, as surrogate query_id values get reset if Query Store is cleared or data ages on.

Continue reading

What Is the CPU Wait in Datadog SQL Server Monitoring? How does it Compare to Waiting on CPU?

I use Datadog on a regular basis, and I’m a pretty huge fan. The monitoring pack for SQL Server (and its PAAS variants) is still pretty rudimentary, but it evolves regularly. That’s NOT what I’m a fan of, though. What makes me a raving fan is the flexibility of Datadog’s notebooks and dashboards, combined with the ability to create all sorts of custom metrics and monitors. There are always things in SQL Server monitoring packs that I have strong opinions about.

Continue reading

How to Survive Opening A Microsoft Support Ticket for SQL Server or Azure SQL

Asking Microsoft for support for SQL Server or Azure SQL is a lousy experience these days. This is true whether you are using a cheaper service tier or the more expensive support tier formerly known as “Premiere Support.” Either way, I’ve found the same issues: as the person requesting support, I must know a whole lot about the root cause of my problem and how to solve it, or my request will be dismissed with misinformation. I need to have data and metrics that back up my claims in order to get the ticket escalated to someone who can help, and I will need to provide those receipts three or four times. Once something is escalated to the Product Group, I may get a helpful response, but it will generally take a while. If I’m not engaged directly with the Product Group and the answer is being relayed through a lower support tier, it often won’t make much sense.

These issues don’t happen due to bad work ethics or personal failings of support workers. These are good humans, who are trying their best! The problem is worse, because it’s systemic.

Continue reading

SQL Server page compression: should you worry about CPU usage increasing on on inserts, updates, and deletes?

Every time I share a recommendation to use data compression in SQL Server to reduce physical IO and keep frequently accessed data pages in memory, I hear the same concern from multiple people: won’t this increase CPU usage for inserts, updates, and deletes? DBAs have been trained to ask this question by many trainings and a lot of online content – I used to mention this as a tradeoff to think about, myself– but I’ve found this is simply the wrong question to ask.

Continue reading

Free Script to Automate Unforcing Failed Forced Plans in Query Store (SQL Server)

tldr; I’ve published a script to loop through all databases on an instance, identify if there are any query plans in a problematic “failed” forced state (which can hurt query performance), and un-force them if found. Get the dbo.dba_QueryStoreUnforceFailed stored procedure on GitHub. This script is designed to work on SQL Server on-prem, in a VM, or in Azure SQL Managed Instance or SQL Server RDS. Since the script is instance-level and loops through all databases, this isn’t really designed for Azure SQL Database – and you don’t get a SQL Agent there anyway, so you probably want to change this around for that use case.

Continue reading

Please Compress Your Indexes and Shrink Your Databases if you use Azure SQL Managed Instance

Shrinking databases in SQL Server isn’t fun – it’s slow, it causes blocking if you forget to use the WAIT_AT_LOW_PRIORITY option, and sometimes it persistently fails and refuses to budge until you restart the instance. You only want to shrink a SQL Server database when you’ve got a good reason and a lot of patience.

If you’re using Azure SQL Managed Instance and you haven’t already used data compression on your indexes and shrunk your databases, you probably have two good reasons to do both of those things: performance and cost reduction.

Continue reading

Error 1119 when Shrinking Database: Removing IAM Page Failed

At times when shrinking a data file in a SQL Server or Azure SQL Managed Instance/Database, shrink operations may persistently fail with the error:

Msg 1119, Level 16, State 1, Line 11 Removing IAM page ([filenumber]:[pagenumber]]) failed because someone else is using the object that this IAM page belongs to. DBCC execution completed. If DBCC printed error messages, contact your system administrator.

There’s not much documented on this error anywhere that I can find, so I’m sharing my experience with this error. tldr: I was not able to get past this without restarting the SQL Server service.

Continue reading

General Failure Failed Forced Plans in Query Store Cause Even Slower Compile Times

This post demonstrates two related bugs with plan forcing in Query Store which increase the likelihood of slower query execution and application timeouts in SQL Server environments.

These bugs are most likely to impact you if:

  • You use the Automatic Plan Correction feature in SQL Server, which automatically forces query plans.
  • Anyone manually forces query plans with Query Store.
  • You have slow storage, which can increase your likelihood of having longer compilation times.

The General Purpose tier of Azure SQL Managed Instance and Azure SQL Database feature both slow storage and Automatic Plan Correction enabled by default. So, weirdly enough, your risks of suffering from this problem are high if you are an Azure SQL customer.

Thanks to Erik Darling for his help in diagnosing and reproducing these issues– and his “slow compiler” query used in this post was incredibly helpful to isolate and narrow down these problems.

Continue reading

Slow Storage Can Cause Slow Compilation Time in SQL Server

Up till now, I’ve thought of compilation time in SQL Server as being dependent only on CPU resources– not something that requires fast storage to be speedy. But that’s not quite right. Slow storage can result in periodic long compile time in SQL Server. And long compile time not only extends the runtime for the query, it can also result in blocking with waits for compile locks. Thanks to Erik Darling for helping me figure this out, and explaining this all in his video, What Else Happens When Queries Try To Compile In SQL Server: COMPILE LOCKS!

Continue reading