This isn’t as simple to fix as you might think.
From subqueries to CTEs, the TSQL rewrites shown here don’t fix the problem.
Transcript
Let’s look at some query rewrites that don’t actually help, but seem like they might.
We’re looking at a version of our query that isn’t the stored procedure. I’ve rewritten this with the literal value just for convenience, to make it easy and lightweight.
This is our original query, but I have made one change here.
I moved that join criterion that used to be in the where clause, I moved it into the join.
I moved what used to be a predicate logically written into the where clause into the join between the two tables. I’m going to look at my estimated plan for this— I just did control + l, the same as this button up here.
Hmm, Matthew, he got that nested loop plan again. Hovering over our Select, I can see that my estimated subtree cost for the whole query is that familiar 19.75 that comes along with our familiar estimate of 6,016.2.
This moving of the predicate from the where clause into the join, it’s an inner join, and this is logically the same operation.
That didn’t do us any good.
A second attempt: let’s try a subquery
Well, what if we rewrote our query a little bit and this time, in our where clause we’re going to put a subquery in there and say, “I want you to select the FirstNameId from ref.FirstName where the first name is Matthew, and then, put that in the where clause.”
Well, let’s go ahead and this time let’s make sure that you’ve got the cold cache, and let’s do this one with actual execution plans for fun. This one is worth waiting for, so we’ll get that started, get it going and hmm, we’re already past four seconds there.
This is getting us the slow query for this guy. It’s still going, we’re at 13 seconds, man, this subquery plan didn’t work out. We ended up at 14 seconds. Looking at our plan, I actually do have a slightly different estimated subtree cost, but it went the wrong way.
This is now a subtree cost that’s lower instead of higher, and when I go back over and I look at things like my estimate, I’ve still got the same problematic estimate, it just played out into a slightly different, very slow execution plan.
This is still being executed as a single query.
It’s not able to just run the subquery first and then use the data and optimize more later.
This is all optimized at once before the start of execution.
Let’s look at a CTE
This is a different rewrite. This time I’ve taken my query for FirstNameId from the little table and it’s not a join, it’s not in the where clause, but this time it’s a common table expression at the top.
I’m just going to do our estimated plan for this one. We’ve got that familiar estimated subtree cost of 19.75. This is the same as our original query, and sure enough, there is our very familiar estimated number of rows, that didn’t change anything either.
What if we tell SQL Server to recompile?
I’m going to go ahead and make sure actual plans are on, and they are, and get this one running, and it’s going, it’s going and going. We’re going to talk more about recompile hints. Recompile means I want you to not reuse a plan you have before, and I want you to compile a special plan just for this query.
Sometimes, you find things on the internet that make it seem like maybe recompile’s going to be magical and will solve our problem, but in this case the recompile hint we get a special unique execution plan that’s only for us, and it still did that {same thing}. It’s the cost of 19.75.
It didn’t change our estimates at all. It didn’t make that FirstNameId any more visible or accessible than before. So these rewrites, although they might be like, “Well maybe if I do this…,” they don’t fix that essential problem of the fact that when we are expressing our predicate on the little table on the FirstName, we aren’t providing the FirstNameId at all.
We will take a look at different solutions that do have more of an effect though.