on December 1, 2017
I had a great time this week talking about isolation levels with Hekaton – aka In-Memory OLTP – aka Memory Optimized tables. Here’s a video of the recorded webcast.
Thanks to everyone who attended, I sincerely enjoyed your questions and comments – you make giving these webcasts fun. Links to scripts, blog posts, and a transcript are below the video.
Note: this webcast ran while I was still experimenting with different ways to capture audio and video for live events.
Scripts
Grab the scripts from this gist.
Links for In-Memory OLTP
“SQL Server In-Memory OLTP Internals for SQL Server 2016” Whitepaper by Kalen Delaney
“Considerations around validation errors 41305 and 41325 on memory optimized tables with foreign keys” by Denzil Ribeiro
“What’s New” books online for In-Memory OLTP (link on page for 2017 what’s new)
Two blog posts on Memory Optimized Table variables from Michael J Swart - why he likes them, and why he needs to postpone using them in his environment.
Ned Otter blogs about In-Memory OLTP, SQL Server, and more at http://nedotter.com
Transcript
This is a transcript of a live webcast, so it’s chatty and not written as a blog post would be. This transcript was prepared by robots and by one woman chasing a corgi, so please forgive word errors, atrocious spelling, inconsistencies, and grammatical goofs.
It is time to talk about isolation levels and memory optimized tables in SQL Server. Welcome to the webcast!
I am recording this, and I will attempt to publish this after the webcast. I’ll put a link in the quizletter as well as the follow-up email– so fingers crossed that it goes well, and that the recording is successful.
What is In-Memory OLTP?
First off, I’m not going to spend a ton of time on this, but I’m talking about isolation levels and In-Memory OLTP. Big picture, what is In-Memory OLTP in SQL Server? This is an entirely new way that data can be stored, accessed, and manipulated in SQL Server, first released in SQL Server 2014.
The name implies that what’s special about it is that the data is in memory. But, you know, we use memory for other forms of data storage. We use memory for columnstore, we use memory for disk based indexes.
What is special about In-Memory OLTP is not just that the data resides in memory, but that the data is really optimized so that we can do lots of little bitty transactions, lots of little bitty modifications, fast and without blocks. Without latches. So that blocking isn’t slowing down tons of tiny inserts.
Another really cool thing – something that I think I take for granted, but that is really cool – is that although this technology was redesigned from the ground up, you don’t have to create a database where only In-Memory OLTP objects reside. You can modify normal databases in SQL Server and have disk based indexes, columnstore indexes – which are also disk based but can be built on In-Memory tables as well – as well as In-Memory OLTP, all in the same database. And you can manage it with SQL Server Management Studio. It doesn’t take special tools to manage this. It’s all integrated with the same SQL Server surface area.
When can In-Memory OLTP be used?
This can be used starting in SQL Server 2014. It’s much less limited in SQL Server 2016 – they added a lot more to it – and in an important change in SQL Server 2016 Service Pack 1, we got the ability to use In-Memory OLTP in the quote “lower editions” of SQL Server. The less expensive editions of SQL Server.
So if you want to start experimenting with this, it becomes easier.
One of the reasons I was really glad about this – it’s not that I really want to use In-Memory OLTP on Web Edition. But with a new feature like this, where everything is completely new and completely different, we’re not always that excited to rush into changing our most critical tables in our most critical applications. When we start integrating this into an environment, we want to find places where it may make sense to use, where we can test it out and see: do we get benefits? Maybe it’s a logging usage, where if it goes down, our primary infrastructure isn’t impacted. It’s really healthy to start experimenting with new features on less critical parts of our environment, and if we aren’t lucky enough to have our whole environment on Enterprise Edition, that was really, really limiting before now.
There are some limits. You have quotas on these “cheaper” – and they shouldn’t really be in scare quotes, they are literally cheaper – editions of SQL Server. We do have limits on what we can do with In-Memory OLTP on those. We don’t get to use Resource Governor, either.
Historically I’ve not been the world’s biggest fan of Resource Governor, but for In-Memory OLTP – as well as for columnstore, for different reasons – Resource Governor is really, really useful. It can help both monitor and configure how much memory is available to and reserved for In-Memory OLTP. It’s a really useful feature that is still Enterprise Edition only, also accessible in Dev Edition.
So it’s still definitely worth it, once you get to critical applications, to have Enterprise Edition. But I’m really happy for that “dipping your toe in” reason, and starting to get used to this.
What scenarios may work for In-Memory OLTP?
So, when would we use this? In-Memory OLTP – and I’m not really thinking about the intentions of how it was designed – is NOT a replacement for all our existing stuff in SQL Server.
If you want a scenario where it does give you an improvement in performance – it’s not gonna give you an improvement in performance all the time, right? Sometimes, with any change, we can make things slower. The scenarios that currently are the best known possible fit for In-Memory OLTP are: if I have a ton of little-bitty transactions and I want to insert rapidly – and that’s why I mentioned the logging scenario might be a non-critical dip-your-toe-in scenario.
Something like caching. Now caching I am like, well, that’s an expensive cache! We’re paying a lot for Enterprise Edition, and we have limits on how much we can cache in lower editions. But depending on your environment, sometimes these things make sense. Fast ingestion of lots of little bits of data.
We have memory optimized table variables, which, if you have really heavy use of temporary objects, can take that out of tempdb. It’s cool because Michael J Swart has blogged about this: he had high contention in tempdb, and one way he could resolve that was by using memory optimized table variables. Now Michael wrote a recent blog post that’s really interesting about why it didn’t end up making sense yet in their environment, because they do need to support lots of little bitty databases, and there is a required footprint for In-Memory OLTP on their version of SQL Server. That created an issue for them. So you’ve got to be careful and test this out, and sometimes you find that maybe it’s not a fit for you yet.
Also we get with In Memory OLTP the ability to have– this is an option, they are not ALL non-durable– but you can create with In Memory OLTP an object that if the power gets cut on your SQL Server, the data is all lost. There’s some cases where we actually want this! A lot of times we’re loading data, and we’re manipulating it, and if something goes wrong we’re just gonna restart the whole process. We don’t care if the data is lost. We don’t want the overhead of logging. Non logged, non-durable objects can be great for this. Erin Stellato has written about some testing processes to see if that could be a win for some scenarios, and I would also just leave the door open for different types of things that you can imagine.
Ned Otter comments: your table variables in In-Memory OLTP will still have the same bad cardinality assumptions. Yeah, there’s a trace flag you can use to help it get an estimate of the number of rows in the table variable, but it’s not necessarily going to be faster than a temp table, right? This is true for almost any feature in SQL Server, and when we talk about temp tables vs table variables, this comes up a lot. One is not ALWAYS faster. And that remains true for all of this stuff! You may be able to engineer something really cool with this that you couldn’t do with something else, but it isn’t just like you swap it in and everything is faster.
John has a question: does the 32 GB quota affect the buffer pool limit in SQL Server Standard Edition? The quota for In-Memory OLTP is a per database quota. My memory – I’m going off my memory here – is that it is NOT. It is outside that buffer pool limit. The blog post I have here, the link on the screen, digs into that more. The scenarios post talks about those limits, and Ned, please correct me if I’m remembering wrong, but I believe it is a per database limit and it is not within the buffer pool limit – I believe it is outside of it. I don’t have In-Memory OLTP in my brain, I have to refresh everything from cold storage, and I was mainly focusing on the isolation levels for today. Because today’s fun is all about isolation levels!
Why Isolation Levels are worth studying for In-Memory OLTP
Whatever you’re testing, and I really think this is an exciting area of SQL Server to say, “hey could I engineer something really fast with this?” Whatever you’re testing, the isolation levels here can be confusing, because this is really different than disk based rowstore tables in SQL Server and disk based columnstore in SQL Server, as well.
It’s not bad, and I think once you start playing with it it isn’t even hard with the In-Memory OLTP isolation levels. In part, because there’s only three isolation levels! There’s actually fewer isolation levels, and once you get used to how they work, I think it makes sense.
But it is really different, it’s definitely worth playing around with.
Here are the demos we’re going to explore in this webcast
Most of what we’re doing today is playing around with different demos. You have the demo scripts in your– oh, and Ned just confirmed that my non-In-Memory-OLTP brain was correct. The quota is outside of the buffer pool, per DB. If you get into a columnstore index on In-Memory OLTP, it does count towards the per DB cap. That’s an extra layer of rocket science that you can play around with!
Today we’re gonna dig through these demos:
- We’ll be talking about different kinds of transactions, and why you should care.
- What isolation level escalation is and why you should care. And this is one of those things that, if you take care of it, you don’t necessarily have to worry about it later.
- What is an uncommittable transaction and why is that a good thing?
- Snapshot isolation and multi statement transactions.
- There are three isolation levels. We get snapshot, repeatable read, and serializable. We’ll see demos of why you might want repeatable read and serializable.
- Then finally we’ll close with a stupid pet trick, and we’ll do a recap.
In the demos today I am only doing these in interpreted T-SQL. In other words, I am not creating a bunch of natively compiled procedures. I do not mean that natively compiled procedures aren’t cool, though! In fact, natively compiled procedures, when you can use them, can be incredibly fast. But we can only access In-Memory tables from natively compiled procedures. There are limitations on what we can do, and I’m talking about being creative and exploring. So we are exploring in interpreted T-SQL, doing cross container stuff. I have a link if you want to learn more about what this “cross container” means – just please know that even though I’m not showing you natively compiled procedures, they are really interesting, they are really cool, you should explore them. We just want to play around with isolation levels today.
So let’s dig in now to our SQL Server instance here and bring up our virtual machine. I’m doing this on SQL Server 2017 today. Here we are in Management Studio. I am going to need a second Management Studio later, so I will just go ahead and get it open now. Let’s make this 100% font and connect up our second instance, and then we will be all sorts of ready to go.
We are on line 208 in the script. Before this I have done all sorts of setup. I have restored a tiny database called BabbyNames. I have configured it for In-Memory OLTP. I have created some tables that we’re gonna use, and I have loaded them. I have set up Resource Governor. You can use the downloadable script to explore all of that stuff; we just don’t have time to show it all today, so I did all that – I baked that cake before we started.
Autocommit, explicit, and implicit transactions
The first weird thing that can be confusing with In Memory OLTP is that what looks like very similar transactions can behave very differently by default. This query: select count star from ref.FirstNameXTP – I put xtp on the end of my in memory OLTP tables just to make it totally obvious that it’s an In Memory table for the purpose of demo, you don’t have to do that– but this is the same query, the only difference is I have it in an explicit transaction in #2.
In the first example, this is going to be an autocommit transaction – I haven’t explicitly said BEGIN TRAN, but it’s gonna run that way. When I run the first statement it works fine, right? No errors. I have ninety-five thousand and twenty-five rows in the ref.FirstNameXTP table.
But if I put an explicit transaction around that same query, I get an error, and the error is actually very clear. Which, you know, at least yay for clear errors! It says: accessing memory optimized tables using the read committed isolation level – my default isolation level in SQL Server is read committed, and I don’t have read committed snapshot enabled for this database, so by default I’m using read committed – is only supported for autocommit transactions. It is not supported for explicit or implicit transactions.
I need to provide a supported isolation level using a table hint, and it even goes so far as to suggest – like a waiter – perhaps you would like to use a WITH (SNAPSHOT) hint?
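If you want to try this outside of the full demo scripts, here’s a minimal sketch of the two statements I’m describing (the table name comes from the demo gist; your row counts and exact error text will depend on your setup):

```sql
-- 1. Autocommit: no BEGIN TRAN, runs fine under the default READ COMMITTED level
SELECT COUNT(*) FROM ref.FirstNameXTP;

-- 2. Explicit transaction: the very same query now throws the
--    "supported only for autocommit transactions" error described above
BEGIN TRAN;
    SELECT COUNT(*) FROM ref.FirstNameXTP;
COMMIT;
```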
Well, that’s interesting: I need to tell it to use snapshot isolation. And you might wonder from that error message – it says explicit or implicit transactions – okay, what is the difference between autocommit and implicit transactions? I have hardly ever seen people using implicit transactions in SQL Server, but what it means is you can change a setting and say that implicit transactions are on for my session.
Then you can run the query, what this essentially does is it secretly puts in a BEGIN TRAN / COMMIT around whatever I run. So it implicitly does that BEGIN TRAN, and I get the same error message as I do for the explicit transaction. I’m not saying you should use this, very few people do, I’m just showing you so the error message makes sense.
So, I had to turn my implicit transactions off to return to normal. I can turn them off as often as I want :D So what about these autocommit transactions? Let’s look at them a little bit. I do want to just show you and prove the settings for my database, because this is one of the things that really confused me at first. I can look at sys.databases and see the settings for snapshot isolation, read committed snapshot, and the one we’re going to talk about: whether or not “elevation to snapshot” is on.
Currently in my database that we’re using, I have NOT enabled snapshot isolation. This “have you enabled snapshot?” setting is the one we got in SQL Server 2005, when we got snapshot isolation for disk based tables. They were all we had! That snapshot isolation uses versioning in tempdb, right? I have not turned on tempdb versioning, and I want you to read this setting as “snapshot isolation using tempdb versioning”, even though that’s not what it says. Similarly, I’m not doing read committed snapshot, aka “RCSI”, in tempdb. And I’m currently not elevating explicit and implicit transactions to snapshot.
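Here’s roughly the query I use to check those settings – the column names below are from sys.databases as I remember them, so double check against Books Online:

```sql
SELECT name,
       snapshot_isolation_state_desc,               -- the 2005-era, tempdb-versioned snapshot isolation
       is_read_committed_snapshot_on,               -- RCSI, also tempdb versioning
       is_memory_optimized_elevate_to_snapshot_on   -- elevation for In-Memory OLTP
FROM sys.databases
WHERE name = DB_NAME();
```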
One of the confusing things to me was: can I tell? What it’s telling me is that autocommit transactions started in read committed are being elevated to snapshot. Now, part of my confusion was: do I have to enable snapshot isolation for this database, that old 2005 version? And then, can I even SEE that they’re using snapshot? Well, let’s go ahead and– I’m just running this query over and over and over again here. While one equals one, just keep running this query, and while it’s running I’m going to go into another window and use Adam Machanic’s sp_WhoIsActive, which is a great procedure – you can get it at WhoIsActive.com for free – and a great way to see what’s running in your SQL Server. And I’ve said @get_additional_info = 1, because I want to see information like the transaction isolation level.
This is completely accurate: the session that I’m running, that interpreted T-SQL, is in read committed. The statement is sort of secretly getting a WITH (SNAPSHOT) hint tacked onto it, but that doesn’t change the transaction isolation level for the session at all. Even if I do things like @get_plans = 1 – I need the plural there – to get an execution plan… an execution plan isn’t super concerned about transaction isolation. That’s just about optimization. I don’t see any hints tacked on here about WITH (SNAPSHOT).
I can’t tell from just looking at the query that this is anything different than plain old vanilla read committed, right? So that’s not really– if you were to just come across this, you wouldn’t be like oh that’s snapshot isolation, you have to actually play around with this.
We CAN prove, or at least demonstrate, the effects of snapshot isolation. One of the things that we can do with disk based tables – and I’m going to go ahead and pop this query in here and uncomment it – is this demo: I go to a disk based index with first names in it. This table has a lot of names in it, and I do have an index on FirstName; you’re gonna see I forced it to use it. If I take the row Aabin, which is the first row in here, and I update it to zzzAabin, it bounces to the end of that index. And if I’m reading in read committed, if I read it over and over while it’s bouncing from one side of the index to the other, and back, on a disk based b-tree index, sometimes I’ll count it twice. Sometimes I won’t see it at all. And that’s easy to demo. So what I’m going to show is using supposedly read committed, right? What we know, and what we’re gonna prove, is that with snapshot secretly being added on here – this query we’re selecting the name count from, I’m forcing it to use an index on the table that I created on FirstName, and if I look at my execution plan I can even see, yeah, I’m doing an index scan on FirstName – what we’re gonna see is that while we’re updating the name, it’s NOT bouncing back and forth! That’s because this is an In-Memory index.
This isn’t the same old b-tree rowstore updates! It doesn’t modify our row and move it back and forth.
In In-Memory OLTP we’re just adding new information with timestamps, so SQL Server can figure out what’s current. This isn’t happening in tempdb, this is a whole new way of the table working, right? So when I’m running these updates I’m actually ADDING new stuff and with timestamps about what the current information is. So here we go we are “updating”, which is really just adding new stuff, and then while that happens, I’m going to count how many first names there are 2000 times. I want to do this without execution plans on because that’ll tank my Management Studio.
So we have a table that we’ve created called NameCount, and we’re counting how many rows are in there. If we were really in read committed (against a disk based index) we would see different numbers result from this, because sometimes we would miss a name as it was bouncing back and forth in that disk based b-tree index, and sometimes we would count it twice. But here we just have inserts, we don’t have updates happening. So, while it’s still working, I have sp_WhoIsActive hot keyed, so we can actually watch it as it runs. We can see there’s no blocking. We are still sometimes having to wait on those updates, and In-Memory OLTP does still have to talk to my transaction log, and I am still having some waits on that, but I’m not having lock waits going on here, right? What I’m running here is sp_WhoIsActive, I just have it hot keyed. And we finished up there. Alright, so how many names did we count? If we always see 95,025 names – yeah, we are not in plain old read committed. And it’s not using tempdb to do versioning.
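The reader side of that demo is essentially a loop like this sketch – the NameCount table and its column name are just my shorthand here; the real definitions are in the downloadable scripts:

```sql
-- Count the first names 2,000 times and keep each result
DECLARE @i INT = 0;
WHILE @i < 2000
BEGIN
    INSERT dbo.NameCount (NamesCounted)        -- hypothetical table/column
    SELECT COUNT(*) FROM ref.FirstNameXTP;     -- autocommit, elevated to SNAPSHOT
    SET @i += 1;
END;

-- If every run saw exactly 95,025 names, we never double-counted or missed
-- the row being "moved" by the other session, unlike READ COMMITTED against
-- a disk based b-tree
SELECT NamesCounted, COUNT(*) AS times_seen
FROM dbo.NameCount
GROUP BY NamesCounted;
```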
The lowest isolation level that we can have with memory optimized tables is snapshot. That WITH (SNAPSHOT) hint is getting added on. What about explicit transactions? I’m going to review again – I haven’t changed any of my settings, right? I have not enabled elevation. The tempdb version of snapshot isolation is still off. Here’s that error I get from the explicit transaction, and it suggests that I add this WITH (SNAPSHOT) hint. It’s like, “you might like to have WITH (SNAPSHOT).” Well, okay, let’s go ahead and do that. And sure enough, that works. The table hint WITH (SNAPSHOT) works. Now, if you’re like me, you might have thought: another way I could do this is – if a table hint with snapshot works, maybe I can just set the isolation level to snapshot on my session? And then I don’t have to do all those table hints. Well, in fact, NO. This is part of why it’s confusing! It gives me a very clear error – it says you can’t use session-based snapshot on memory optimized tables, and natively compiled modules cannot be accessed or created when you’re using session-based snapshot.
The ‘snapshot’ has to happen as a table hint, not a session setting. And there are reasons for that. I’m going to give you a link to Kalen Delaney’s whitepaper, and she explains – there’s a very complicated chart – it has to do with the fact that if you’re accessing disk based stuff and In-Memory stuff, you need a single timestamp to refer to, and because of the different ways they work, session level snapshot just doesn’t work. So we have to go back to read committed for our session, and it wants the hint at the table level, not the session level. If we don’t want to type all those hints, here’s how we avoid typing all the hints: we say ALTER DATABASE – I am using ALTER DATABASE CURRENT, which we got in SQL Server 2012, but you could put the database name there – SET MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT ON. So now we can see that property on our database, that I do have elevation enabled, and now that I’ve set this on my database I can run this. It ran right through without that hint, and it automatically says, “oh, that’s an In-Memory table, I will add that hint for you.”
But it is adding that WITH (SNAPSHOT) hint invisibly in there to make it all work.
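Sketched out, the two ways to make an explicit transaction work look like this (table name from the demo; verify the syntax against Books Online for your version):

```sql
-- Option 1: add the table hint to each memory optimized table reference
BEGIN TRAN;
    SELECT COUNT(*) FROM ref.FirstNameXTP WITH (SNAPSHOT);
COMMIT;

-- Option 2: set the database option once, and the hint is added for you
ALTER DATABASE CURRENT SET MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT = ON;

BEGIN TRAN;
    SELECT COUNT(*) FROM ref.FirstNameXTP;   -- no hint needed now
COMMIT;
```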
An uncommittable transaction (the seat-stealing demo)
So what about – what does snapshot isolation give us for this? Well, this example is crappy code, but just for illustration purposes, we’re gonna talk about getting a seat on an airplane. I’m creating a table named SeatAssignmentsXTP. It has seats, who they’re assigned to, and just for fun we’re doing some system versioning on the table. You can do versioning on In-Memory OLTP tables; the history table is a disk based table. There’s lots of info on that online. I’m just doing that for fun, so we can see it. We’re gonna put a thousand seats into the table, and then we’re gonna update. Let’s say all the seats are assigned except for one.
Let’s take a look at our seats: there’s one seat left, it’s in the front of the airplane, and we want to get this seat! The code we have to get the seat is very simple, and it is just as crappy as the table design. We pass in the seat and the assignee, and we are doing an explicit transaction. This is not a natively compiled stored procedure, this is interpreted T-SQL in a procedure, so it will be automatically escalated to snapshot. I’m doing an update, and then I do a commit. So I want to reserve my seat, and our question is: what happens if we have a race condition, where I’m reserving my seat and somebody else comes in and tries to update while I’m committing my transaction? To simulate this, we’re gonna use the T-SQL debugger. I am trying to get my seat. I’m gonna say step into, and I’m going to step into this procedure. There’s the BEGIN TRAN. Now I step through the update, and I stop on the commit… at this point we are trying to commit our data, and while this is happening – I can close out this window, we don’t need it anymore – someone else tries to reserve a seat in our other Management Studio session.
I’m going to connect to the database, there we go, and uncomment the code – so while I’m committing my transaction, Nanners, my old cat, tries to get the seat as well! And she’s not debugging, she’s just going straight through. So she’s gonna go through the update and hit the commit, and when she tries to commit her seat, she gets an error. And this is a feature, I firmly believe this is a feature! Here is the error, here’s the number. It’s 41302: I tried to update a record that has been updated since this transaction started. I’m trying to modify a record that someone else has modified.
They’re in the process of committing, but I’m trying to change data that’s been modified since my transaction started. It is not safe to change that data, so I had to bail out. This is an uncommittable transaction; it has been rolled back. This is actually a good thing! You want to catch this error and be like, oh, I need to – actually, probably not retry in this case – maybe retry if the seat’s available. In theory, the other transaction could roll back too (for some other reason), but this seat is not safe to reserve for Nanners, because Kendra is in the process of committing the seat.
At commit time, In-Memory OLTP does validation as well as writing all the data to the log. And I am sitting on my commit, so it’s not safe for anybody else – she can’t commit this row, and that’s a good thing, right? Because what we don’t want to happen is: we don’t want me to commit to the seat and then have her grab the seat from me when it wasn’t available for her! So going back to my session, looking at who the seat is assigned to – our query says that it is null, right, and I am committing, I am getting that seat! So I finish up, I get my seat, my query executes successfully. I have one row affected. When we look at the table, yep, I have the seat! The seat is assigned to me. We can even use the history, the versioning of this table. I’m gonna now look at the history of this: was there ever a time – I mean, Nanners’ transaction aborted, but was there ever a time when the seat was assigned to anyone else but me? And we can see, looking at the history of that seat – FOR SYSTEM_TIME ALL for seat 1A – no, this was only ever assigned to me.
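The history check looks something like this sketch – the column names are my guesses; the demo script has the real table definition:

```sql
-- Was seat 1A ever assigned to anyone other than me?
SELECT SeatNumber, Assignee, SysStartTime, SysEndTime
FROM dbo.SeatAssignmentsXTP FOR SYSTEM_TIME ALL
WHERE SeatNumber = '1A'
ORDER BY SysStartTime;
```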
That commit violation is a useful thing. Don’t look at these errors as, “oh, it’s a problem, I have to handle these errors.” The errors are there for a reason, and they’re trying to be useful for you, because I don’t want her stealing my seat! Now, there’s lots of things my code doesn’t do, right? This is obviously not modeling code for anything you’d actually want to do, but I just like that example for showing, with race conditions, who gets the seat. What if I’m using snapshot and I have more than one statement in the transaction? How does snapshot work? With disk based tables and snapshot isolation, the way I usually do it is: I set transaction isolation level snapshot for my whole session, and then I start a transaction within that. Then you get a snapshot of the data – not from the time you run BEGIN TRAN, but from the time you do the first data access.
Multi-statement transactions in Snapshot Isolation (the lemonade stand demo)
You get the same view for the whole transaction, which can be useful for reporting – because let’s say I have started a lemonade stand for all of the babbies. This database contains all the baby names in the United States. Let’s say a lot of those babies are gonna be running lemonade stands for me. I have this great franchising business, and I need memory optimized tables because there are just so many babies selling lemonade. So we’re gonna create our LemonadeStandRevenueXTP table – we’re not going to include the comment ending in the definition, we’re actually gonna create the table now. On a memory optimized table I always have to have a primary key, and it has to be nonclustered – there’s no such thing as a clustered index. Oh, brave new world.
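If you’re following along without the gist, the table I’m describing looks roughly like this – column names and types are my placeholders, not the exact demo definition:

```sql
CREATE TABLE dbo.LemonadeStandRevenueXTP
(
    RevenueYear   INT   NOT NULL,
    RevenueAmount MONEY NOT NULL,
    CONSTRAINT pk_LemonadeStandRevenueXTP
        PRIMARY KEY NONCLUSTERED (RevenueYear)   -- no clustered indexes in this world
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
```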
So, I said WITH (MEMORY_OPTIMIZED = ON), and right now I only have a few rows of revenue. Perhaps we’re starting small, but we expect this to be extremely fast in the future! So I’ve got 2015, 2016, and 2017, and my total revenue is thirty bucks. We’re doing really well here. And we’ve got this report on our lemonade stand. The top is a summary query, where we’re selecting our total revenue, and then at the bottom we have detail. What we’re going to do is test what happens if, in the middle of our report, the data changes. First run, we’re just going to run our report with no data changing at all. What does our report look like by default? As you would expect, our total revenue is thirty, and then here’s our detail. Now let’s say we’re running our report – oops, I keep collapsing it – there we go – we start our report and it runs the summary bit, but before it can run the detail bit, some data comes in: an update and an insert! So in another window – there we go, fix our indenting – in another window, we’ve run the first part of our report, which is just automatically escalated to snapshot based on my database setting.
Meanwhile an insert and an update happen: for 2017 my revenue is adjusted to $100, and then revenue of $1 comes in for 2014. So our total is no longer $30. We have an extra row, and our row for 2017 has changed. When we now run the detail portion of our report, we don’t see the $100. We don’t see the row for 2014. This is snapshot isolation working: it sees a snapshot from the first time we accessed data in the transaction, and we see data consistent with that. Now I commit, and it completes successfully. In many situations this is gonna be exactly what we want, because we want consistency with the first time data access occurs in the transaction. But there might be some cases where we don’t want this – what if we want to make sure that at the time the report commits, at the time the report is actually returned to the user, it has current data?
Because in fact the insert and update had committed.
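Here’s the shape of the report demo, sketched with the same placeholder column names (no hint needed, because of the elevate-to-snapshot database setting):

```sql
-- Session 1: the report
BEGIN TRAN;
    -- Summary: this first data access pins the snapshot for the transaction
    SELECT SUM(RevenueAmount) AS TotalRevenue
    FROM dbo.LemonadeStandRevenueXTP;

    -- ... session 2 commits an UPDATE (2017 becomes $100) and an INSERT (2014, $1) ...

    -- Detail: still consistent with the first data access,
    -- so no $100 row and no 2014 row appear
    SELECT RevenueYear, RevenueAmount
    FROM dbo.LemonadeStandRevenueXTP
    ORDER BY RevenueYear;
COMMIT;   -- succeeds: snapshot does not validate the read set at commit
```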
Maybe we want Repeatable Read for our report
There are going to be some cases where we want the most recent stuff. We don’t want the snapshot from when data access first occurred; we want to say, “oh, if anything has changed, let me know, so I can actually redo it.” It’s not gonna go back and run that first query for me, but I might want it to give me an error and let me know: hey, something changed!
Isolation levels can help with this. The lowest isolation level we can use with In-Memory is snapshot. We can raise it one level to repeatable read.
Repeatable read says: for any of the rows that you have read, if reading them again at the time of commit wouldn’t return the exact same thing, I’ll throw an error. So it’s going to use a read set and check whether any of those rows would change. Let’s reset the table – oh, we can’t! Yeah, it’s different. When you look at this, you’re like, oh, TRUNCATE TABLE isn’t allowed?!? But this is a whole new different world – if we delete all the rows or recreate the table, it’s not like running through a disk based index in the same way. It stores things differently! There’s an interesting discussion as to whether it would be better to delete all rows or recreate the table – I’m not covering that today, but test whenever you need to do something like this, and see what’s better for you at your volume.
Now I’m going to go ahead and just reinsert my rows, so we’re back to the original revenue for our massive lemonade stand. This time I have put a REPEATABLEREAD hint on the first query. I’m gonna start up my report, and I get the summary amount at the top. Now I’m just gonna run the update statement, I’m not doing the insert. I’m just gonna do the update for 2017. A repeatable read hint doesn’t catch inserts.
It doesn’t catch what are called phantom rows – rows I didn’t read, because if the insert had run, that row wouldn’t have been there at the time I ran my summary query. The update for 2017, though, is a change to an existing row that this transaction did read. So now I can run the select query at the end that returns the detail data, and again it uses the snapshot – I get everything from the snapshot – but as it hits commit, it says: “Oh, something changed in the read set. You used repeatable read on that first query, and I went and validated what you read, and something’s different at the time of commit.” So you may want to go check what’s going on and rerun that report, because you don’t have the most current data. That’s a feature, that’s the cool feature. If I redo this and I only run the insert statement – let’s reset our table – repeatable read doesn’t catch the insert statement.
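Sketched out with the same placeholder names, the repeatable read variation looks like this:

```sql
-- Session 1: summary query with the REPEATABLEREAD table hint
BEGIN TRAN;
    SELECT SUM(RevenueAmount) AS TotalRevenue
    FROM dbo.LemonadeStandRevenueXTP WITH (REPEATABLEREAD);

    -- ... session 2 updates the 2017 row (a row the summary read) and commits ...

    SELECT RevenueYear, RevenueAmount
    FROM dbo.LemonadeStandRevenueXTP;   -- still the snapshot view
COMMIT;
-- The COMMIT fails with a repeatable read validation error (41305): a row in
-- the read set changed. If session 2 had only inserted the 2014 row, the
-- commit would have succeeded: repeatable read doesn't catch phantoms.
```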
Maybe we want SERIALIZABLE for our report
I’m going to BEGIN TRAN, do the summary with repeatable read, and then I’m going to do the insert. Now I haven’t changed any of the rows that I read, so it works just fine on the commit: it says, oh, I checked the read set, those rows are still the same. If I want to catch that insert I need to use the serializable isolation level. Serializable includes repeatable read protection, it gives us everything from snapshot, it validates that the rows we read would read the same again, and it also checks that no phantom rows have been inserted that you would now read if you ran the query again. So I’m gonna reset my table. I’m gonna start up my transaction with serializable, and now I’m going to just run the update. I’m using serializable, not repeatable read.
While I was doing this a row was updated, and when I go to commit it says – I’m using serializable, but serializable includes repeatable read – so the error message does tell me which type of failure it was, which type it found first. If I now rerun this and just run the insert: we reset the table, do the summary transaction, and meanwhile an insert comes in for 2014, which is before the period I read – it’s not even in the middle of the period. Now when I go to commit, it says, “oh, your query would have picked up another row,” and this is a serializable validation failure. So if I do need to make sure that nothing has changed – that if I were to run my select query at the top again it would return the same stuff – I really want to be in the serializable isolation level.
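And the serializable variation, with the same placeholders:

```sql
-- Session 1: summary query with the SERIALIZABLE table hint
BEGIN TRAN;
    SELECT SUM(RevenueAmount) AS TotalRevenue
    FROM dbo.LemonadeStandRevenueXTP WITH (SERIALIZABLE);

    -- ... session 2 inserts a 2014 row and commits: a phantom for our query ...

    SELECT RevenueYear, RevenueAmount
    FROM dbo.LemonadeStandRevenueXTP;
COMMIT;
-- The COMMIT fails with a serializable validation error (41325): rerunning
-- the summary would now return a row it did not see before.
```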
NOLOCK and In-Memory OLTP (the stupid pet trick demo)
We do have time for a stupid pet trick. I’m really pleased! Bonus!
Oddly enough, the NOLOCK hint is allowed with In-Memory OLTP. No other locking hints are allowed: you can’t use XLOCK – it doesn’t use locking – and you can’t tell it ROWLOCK or anything like that. But you can do NOLOCK, which – I wonder if it’s a joke? I don’t really know. Seems like a joke. What we’re gonna do is start our report; right at the beginning of the report I don’t want to put any hints on there, we’re just returning the summary data. Then we’re going to change some of the data: I’m going to update that row, I’m gonna insert a row, and then our second statement in the report uses a NOLOCK hint.
The question is: it seems like what NOLOCK would do is override the snapshot. You know, we’re automatically getting elevated to snapshot, and this seems like it might – I don’t know why I have two semicolons there, just for emphasis maybe???
It seems like maybe we could see those rows that were modified, because maybe the NOLOCK cancels out the snapshot. It can’t – let’s go ahead and run this: it does not see the updated row for 2017.
It does not see the row we inserted for 2014. The NOLOCK hint is “quietly ignored”, as Kalen Delaney puts it in her excellent white paper on In-Memory OLTP.
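For completeness, the stupid pet trick itself is just this (same placeholder names):

```sql
-- Inside the open report transaction: NOLOCK parses fine, but it is quietly
-- ignored and does NOT override the snapshot we are already reading under
SELECT RevenueYear, RevenueAmount
FROM dbo.LemonadeStandRevenueXTP WITH (NOLOCK);
-- Still no $100 row for 2017 and no 2014 row
```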
So maybe this is a feature meant so that, if you have existing code, you don’t have to clean out the NOLOCK hints? I’m kinda like: you don’t want to port existing code like that. If you’re lazy enough to not remove your NOLOCKs, you really shouldn’t be using In-Memory – it’s not the feature for you if you need to save time at that level. So it is more of a stupid pet trick than anything else, I think.
But, in fact, the no lock hint is allowed… trivia for you if you go to any sort of conference that has DBA trivia.
Recap time!
Let’s do a quick recap of what we covered in our isolation level adventure.
There are three isolation levels allowed with In-Memory OLTP. The lowest level is snapshot.
If you use autocommit transactions, when you start playing around with this you may not realize you need to do anything special, because little autocommit transactions are automatically elevated to snapshot. But as soon as you do a BEGIN TRAN, or if you’re one of the rare people who use implicit transactions, you may start getting errors saying that you need to add a WITH (SNAPSHOT) hint.
That is a table hint – SNAPSHOT. It’s not a session level setting.
The easiest way to handle this is to alter the database and set elevation on for those explicit transactions (not a lot of people use implicit transactions).
That table level snapshot hint is good for a lot of scenarios.
If you want to make sure that, at the time of commit, none of the rows that you read have changed, then you want to use repeatable read isolation. It gives you that additional protection.
If, further, you want to make sure not only that no rows you’ve read have changed, but also that no phantom rows popped in – rows that would be touched by your query if you were to run it again – then you need serializable isolation. This is the most isolated your transaction can be from the impacts of other transactions. If it detects any phantoms, you will get a serializable validation error.
References
(Readers, links to these are maintained at the top of this page. I’m not duplicating them here just so I don’t have to try to keep two copies of a link in sync.)
I really love Kalen Delaney’s whitepaper; there is a short link to it above. I show just one little bit of potential failure here – I showed you the uncommittable transaction violations. There are other examples of different errors that can happen, with the error numbers, things you need to think about, and things you need to catch, as well. Just a huge amount of information on how this all works, which is really helpful to know.
I believe Kalen’s updated book on In-Memory OLTP for 2016 is out as well, so you can check that out, too. Start with the whitepaper: it’s free, it’s shorter. This is something you can actually read and consume – I’m not saying it isn’t work, but I think it’s actually really readable. Kalen writes in a really approachable style, so I highly recommend that white paper. I used it for validating things like that NOLOCK hint: is it actually just ignored, or is something else going on? The white paper confirms that, yes, it is quietly ignored. Lots of info. I really love to learn by testing, but you also want to find out, “okay, is this by design?” It’s not always obvious from my tests, so it’s always great to have these sources. That is an official Microsoft whitepaper.
There’s another interesting post that I thought about demoing but didn’t have time to show. You can do foreign keys with memory optimized tables – in fact, the demo script you have today has some foreign keys in it. This is a really interesting post by Denzil Ribeiro about validation errors on tables with foreign keys, and how proper indexing can help minimize those validation errors by limiting the read set in some cases. Really fun, if you like isolation levels. I really like isolation levels – a little bit ;) If you’re using foreign keys with this, it’s very important!
Wrapping up and chatting
I have two more webcasts this year – oh, okay, so Ned says, “I just wanted to say that some of what’s in the white paper is dated, even maximum in-memory data, etc.” Do you mean because it’s been changed in 2017? I should have mentioned this: the white paper is for SQL Server 2016, and we now have SQL Server 2017 – more improvements and enhancements have been added in 2017. What I should do – let me make a quick note – is add a link to the “what’s new in 2017” page in Books Online to the follow-up email for this, because that’ll help cover what’s changed and what’s new. I believe Michael Swart talks about this as well in his recent article about the table variables – the footprint required for the databases, I believe, has been reduced in 2017. I think that was an example.
So yeah, Ned said the white paper was written before RTM. Okay, so some of the 2016 things – such as the maximum amount of in-memory data – even for 2016 there is no max amount of memory; it’s only limited by OS memory. You’d think that the Microsoft folks would just quickly update the white paper, because Kalen is super diligent about that stuff, but such is the world of publishing. It’s difficult with these things!
Okay, so I will now include the link to what’s new in 2017 as well as 2016. It’s always fun using a rapidly evolving technology, because it really is always a research project. And this isn’t just In-Memory OLTP; it also applies to columnstore. With columnstore, when I was doing testing for the rebuild-or-reorganize webcast, there is a whole page in Books Online about how to deal with maintaining columnstore indexes, and it didn’t even mention one of the big features in 2017. So, yeah. Even Microsoft’s documentation has a hard time keeping up with the rate of release. This is a good problem to have, because the rate of release is so fast. I don’t want to say we should slow down the rate of release and wait on the documentation, but it does mean that you’ve now got to look at all the blogs as well as the whitepapers.
So it is now 9:51. I’m looking to see if folks have questions or comments, but if you don’t, you get 9 minutes back, potentially, before your next meeting! So feel free, if you don’t have a question or comment, to bail out, refresh that coffee, take a walk before your next meeting, or grab these scripts. The link is in the chat window as well as in the answered questions. Grab the scripts and just start playing around with this. If you’re like, “oh, I don’t have an instance to play around with this on,” I would really encourage you – for a scratch sandbox instance (this actually wasn’t what I was demoing on today, but it’s one of the things I like to use) – to do a real quick Docker install. I have a video on SQLWorkbooks.com showing just how fast it is to deploy those. And if you don’t want to have a permanent 2017 instance around – Developer Edition is free, so you could just do a permanent 2017 instance, I’m not dissuading you from that – but if you just want a sandbox, going through the effort to figure out how to do a quick Docker pull of the latest image and set it up means that any time you can really quickly spin one up, play around with something, get rid of it, and it’s gone. I like that! You don’t have to do it that way, of course.
John says don’t tell my boss I have nine extra minutes. Yeah, no one say anything. A couple folks saying thanks – I’m so glad that you enjoyed this! Thanks for coming!
Alice asks: will you be presenting at SQLBits in London next year? I will not be at SQLBits next year. I do love London, I do love SQLBits – it’s not that I don’t love it, it’s just that next year didn’t work out for me. What I’ve decided I want to do – I don’t think I’ve told anyone this, but it’s not like it’s a big secret – I finally made the decision that I want to do a day-long session on isolation levels. If you’ve been watching the webcasts, I’ve been preparing this material on why read committed is not everything we want it to be, and what can go wrong with read committed; snapshot isolation and read committed snapshot; and now I’m starting to build this In-Memory OLTP piece, and I do touch on columnstore in other modules as well. Then coming up on December 14 I’ll be talking about serializable and repeatable read in the context of disk based tables. So, adding to that piece – a lot of people inherit databases, maybe a vendor database, maybe a legacy database that’s using serializable, and they don’t even realize it, and then they start getting locking problems and, “Whoa, what’s going on with this?” I do want to add some more modules, like maybe some examples with replication subscribers and different isolation levels.
Ned says “thanks for presenting on my favorite topic.” Isolation levels and indexes – which is my favorite? I don’t know, maybe they’re both my favorite. Ned, is In-Memory OLTP your favorite topic, rather than isolation levels?
It’s not yet MY favorite topic, but I find it really intriguing. To me it’s really interesting, and I am super glad that it’s now available in Standard Edition and Web Edition, because if I was still a production DBA I would totally be like: now that this is available in cheaper editions, okay, where can I play around with it? But if it explodes, I want it to be no big thing. I want to get ahead of it – I may not be able to use it in critical stuff soon, depending on what my environment is on, but how can I learn as much as I can about this, so I can identify places where it may work and learn how to manage it? I can figure out the nature of this beast. I’m really glad that they did that. Ned says it’s a study unto itself – yes. Everything is different. I was lucky enough to get to go to Kalen Delaney’s precon day on In-Memory OLTP. It was called “What the Hekaton?!?” and I got to attend it in Portugal. That’s kind of the cool thing about it: spending time on In-Memory OLTP is like entering a totally different world of SQL Server, but inside our familiar SQL Server databases. It’s like everything’s a different color, you know what I mean? You get to explore it and you find everything’s different. You have to learn to relax a lot of your assumptions (like “clustered indexes exist”), and then you’re like, well, how does it work? It’s crazy, so it’s really fun. It’s really interesting.
Simon says the whole thing is a great topic – it’s great to chat up girls and see how they respond. So I don’t know if you saw the meme that Rimma Nehme used. Dr. Nehme – hopefully I’m saying her name right – gave a keynote at the SQL PASS conference this year, and one of the slides she had was a meme about how one of the impressive things to talk about at parties is how many different isolation levels there are in the Cosmos database.
At the right party – at the right job interview, more likely… The original reason that I started learning about isolation levels in SQL Server was that I had a particularly disastrous job interview. I was interviewing for a job that I really wanted – I thought it was my dream job – and it went so badly, I was totally wrong for the position. At one point I was asked just to name the isolation levels in SQL Server, and I was just like, “uh… NOLOCK?”
Okay, I did a LITTLE better than that, but after that interview – which really did humble me on many topics, and humility is good! – I was like, you know, maybe I should learn about those, because I really feel like I should be able to answer that question better. And thus, I started learning about isolation levels.
Simon is working on his isolation level pick up lines. Do not use those in job interviews, that’s my biggest piece of advice.
So alright, I thank you all for joining me, this has been so much fun, and I’ll be back on December 7 talking about indexing for windowing functions. We will be back in disk-based b-tree tables – or rather, rowstore tables. The terminology is so funny, I’m still like, “oh wait, no, I meant ROWSTORE.” I may have kind of slipped up on this. The proper terminology is: our old tables are disk based rowstore. Now we have In-Memory OLTP, and then we have columnstore. Columnstore is usually disk based, unless we are combining these. So hopefully I got the terminology right :D
Thank you all for joining, have a great week, and maybe I’ll figure out how to turn off….. (yes, I figured out how to turn off the webcast)