Please Compress Your Indexes and Shrink Your Databases if you use Azure SQL Managed Instance

By Kendra Little on August 26, 2024

Shrinking databases in SQL Server isn’t fun – it’s slow, it causes blocking if you forget to use the WAIT_AT_LOW_PRIORITY option, and sometimes it persistently fails and refuses to budge until you restart the instance. You only want to shrink a SQL Server database when you’ve got a good reason and a lot of patience.
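
If you do need to shrink, SQL Server 2022 and Azure SQL (including Managed Instance) let shrink operations wait at low priority instead of blocking everyone behind them. Here's a minimal sketch, using a hypothetical database and file name, and shrinking toward a target in stages rather than in one giant bite:

```sql
-- Sketch: shrink a data file with WAIT_AT_LOW_PRIORITY (SQL Server 2022+ /
-- Azure SQL). The database and file names are hypothetical placeholders.
USE MyDatabase;
GO

-- Target size is in MB. Shrinking in increments (say, 50-100 GB at a time)
-- keeps each pass shorter and easier to stop if something goes wrong.
DBCC SHRINKFILE (N'MyDataFile', 512000)
    WITH WAIT_AT_LOW_PRIORITY (ABORT_AFTER_WAIT = SELF);
```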

If you’re using Azure SQL Managed Instance and you haven’t already used data compression on your indexes and shrunk your databases, you probably have two good reasons to do both of those things: performance and cost reduction.

Use page compression on your data to make the most of memory

Compressing indexes reduces the number of data pages those indexes take up in storage. That reduces the physical reads your queries do to pull that data into memory. That’s terrific, but it’s only half the benefit: pages stay compressed in the buffer pool, so you can fit more data into memory at a given time, which further reduces physical IO. Physical IO is your enemy in the video game that is administering Managed Instance.
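
Here's a minimal sketch of what that looks like, using a hypothetical dbo.Sales table and IX_Sales_Date index: estimate the savings first, then rebuild with page compression.

```sql
-- Estimate how much space page compression would save for a table
-- (dbo.Sales and IX_Sales_Date are hypothetical placeholders).
EXEC sp_estimate_data_compression_savings
    @schema_name      = N'dbo',
    @object_name      = N'Sales',
    @index_id         = NULL,   -- NULL = estimate for all indexes on the table
    @partition_number = NULL,
    @data_compression = N'PAGE';

-- Rebuild an index with page compression. ONLINE = ON keeps the index
-- available to queries while the rebuild runs.
ALTER INDEX IX_Sales_Date ON dbo.Sales
    REBUILD WITH (DATA_COMPRESSION = PAGE, ONLINE = ON);
```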

You don’t get a ton of memory per vCore with Managed Instance:

  • Standard series hardware: 5.1 GB RAM per vCore. 16 vCores = 81.6 GB RAM.
  • Premium series hardware: 7 GB RAM per vCore. 16 vCores = 112 GB RAM.
  • Premium series memory-optimized hardware: 13.6 GB RAM per vCore. 16 vCores = 217.6 GB RAM.

All of these amounts of RAM make me sad. That memory-optimized option doesn’t seem very memory-optimized to me. But the lower options are basically anti-patterns. To make a SQL Server workload perform well, you want to make sure you can keep frequently accessed data in cache. This is especially true when you have slow storage, which is the case with Managed Instance.

Yes, the storage is faster on the Business Critical tier, but at a huge cost – and even then, it ain’t all that fast, and your writes will get gated by waits due to synchronous replicas.

If you can use a combination of compression, data pruning, and index tuning to make the General Purpose tier work for you, you will get a lot more value out of the service, without the synchronous commit waits. But you may have to shrink your databases, too, because…

Storage sizes are tied to vCore allocation on Managed Instance

My least favorite thing about Managed Instance is that the amount of storage you need on an instance can dictate the number of vCores you must allocate. This can really drive up your monthly costs if you have a significant amount of data that is not read often – and that’s the case for most databases that have been around for longer than a year or two.

Here are some storage limits based on vCores. These limits are for total space used across all system and user databases – including tempdb. The prices listed are at pay-as-you-go rates as of today in the East US region if you are using “memory-optimized” hardware:

General Purpose:

  • Need more than 2TB? 8 vCore minimum. $2,375.04/month for 8TB.
  • Need more than 8TB? 16 vCore minimum. $4,753.76/month for 16TB.
  • Need more than 16TB? Sorry, you can’t have that. (You can have more cores, but this list is based on storage sizes.)

Business Critical:

  • Need more than 1TB? 8 vCore minimum. $3,383.12/month for 2TB.
  • Need more than 2TB? 16 vCore minimum. $6,774.24/month for 4TB.
  • Need more than 4TB? 24 vCore minimum. $10,037.36/month for 5.5TB.
  • Need more than 5.5TB? 32 vCore minimum. $25,364.16/month for 8TB.
  • Need more than 8TB? 48 vCore minimum. $38,051.04/month for 12TB.
  • Need more than 12TB? 64 vCore minimum. $50,737.92/month for 16TB.
  • Need more than 16TB? You can’t have that. But you could still pay $96,570.24/month for 16TB at 128 vCores. Yikes.

Note: Business Critical SKUs with more than 24 vCores were not available for pricing at the time of this writing in the East US region. I had to look in the Central region to get pricing on them. I have found the availability of larger vCore Business Critical instances to vary a lot in East US.

Don’t shrink TOO far if you’re on General Purpose V1

General Purpose V1 is very odd: the IOPS and throughput for each of your files are determined by the file’s size.

You never want to shrink your files too far, only to end up in a cycle of constant shrinking and growing. But you also don’t want to accidentally dip under one of the file-size thresholds that determine your IOPS.
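
Before you shrink, check where your files sit relative to those size boundaries. Here's a sketch to run in each database; the exact size tiers have changed over time, so verify them against the current Managed Instance documentation:

```sql
-- Size and free space per file in the current database. On General Purpose V1,
-- a data file's IOPS/throughput tier depends on its size, so don't shrink a
-- file just below a tier boundary you rely on.
SELECT
    name                                           AS logical_name,
    type_desc,
    size / 128                                     AS size_mb,
    FILEPROPERTY(name, 'SpaceUsed') / 128          AS used_mb,
    (size - FILEPROPERTY(name, 'SpaceUsed')) / 128 AS free_mb
FROM sys.database_files;
```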

There is an additional limitation of “Max 120 MiB/s per instance” for transaction log files under GP V1, and I can confirm that I haven’t seen much of a difference in performance between a 130GB log file and a 514GB log file. (I haven’t been all that scientific about it, though.)

Thankfully, this whole “IOPS based on file size” thing is going away in General Purpose V2 (in Public Preview at this time). But it will always be useful to stay fresh on what limitations apply to your configuration, and what configurations are possible.

The cloud is different

With on-prem databases, it’s often not such a big deal to let “maybe someday we will need it” types of data collect in your database, sprawling in non-compressed fields of bytes. The cloud is different. With PaaS databases, it can take a bit of time to understand how the various SKU options impact your costs, and how to get the best deal out of them.

To recap, when it comes to Azure SQL Managed Instance, you can get good performance even on General Purpose, but to get there you really need to make your active working set of data fit into memory as much as it possibly can. This list will help you get there:

  • Maximize your memory per core with the memory-optimized SKU (even if it’s not all THAT optimized)
  • Tune your indexes to minimize reads for your frequent queries
  • Delete old data where you can, and truncate tables you don’t need
  • Use page compression on All The Things! (the query after this list helps you find what’s left to compress)
  • Shrink your databases if the steps above leave you with significant empty space
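
For the compression item above, here's a sketch of a per-database query to find what you haven't compressed yet. It assumes you want page compression nearly everywhere; skip columnstore indexes (which have their own compression) and any CPU-bound hot spots as needed.

```sql
-- List objects/partitions in this database that aren't page-compressed yet.
SELECT
    s.name AS schema_name,
    o.name AS object_name,
    i.name AS index_name,   -- NULL for heaps
    p.partition_number,
    p.data_compression_desc
FROM sys.partitions AS p
JOIN sys.objects AS o
    ON o.object_id = p.object_id
JOIN sys.schemas AS s
    ON s.schema_id = o.schema_id
LEFT JOIN sys.indexes AS i
    ON i.object_id = p.object_id
   AND i.index_id  = p.index_id
WHERE o.is_ms_shipped = 0
  AND p.data_compression_desc NOT IN (N'PAGE', N'COLUMNSTORE', N'COLUMNSTORE_ARCHIVE')
ORDER BY s.name, o.name, i.name, p.partition_number;
```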

These steps should at least help you lower that storage slider for some cost savings, and they can often help you fit into a smaller vCore count and save loads of money over time.

I do like Managed Instance – and I’m excited about the future of the General Purpose tier

While there are many things I want to change about Managed Instance, you can get good value from the service in the right configuration, especially if you work to minimize your costs.

Here are three things I love about Managed Instance:

  • It’s good at Availability Groups. Whether you are using Business Critical with synchronous AGs or asynchronous AGs in a Failover Group (or both), it really takes the headache out of managing those replicas. This is not an easy thing to automate. I’ve found a gotcha or two, but I’m super impressed with how well it usually works.
  • It’s not ALL Availability Groups. I love that the General Purpose tier in Managed Instance uses a failover-cluster-like architecture. I don’t want the overhead of synchronous Availability Group replicas unless there is a specific need for them. For many purposes, having a failover cluster architecture in the primary region and an asynchronous secondary using a Failover Group in another region is a terrific fit. This is not available in Amazon RDS for SQL Server.
  • It’s much faster than it used to be to create and scale instances. A lot of the pains there have been fixed – as long as you’ve got the 2022 feature wave applied to your subnets. If you’ve been using Managed Instance for a while and don’t have that, ask your account rep about upgrading; it makes a lot of things better.

I’ll expand on these in a future post after I look a bit more at General Purpose V2, which is in public preview.