The query was perfectly broken…

I have been teaching a T-SQL 101 class, and for homework we asked the students to return every record where one of our heroes had a birthdate from 1995 through 1999. I expected something like this:

SELECT FirstName, LastName, Birthdate
FROM Heroes
WHERE Birthdate BETWEEN '1/1/1995' AND '12/31/1999'

OR

SELECT FirstName, LastName, Birthdate
FROM Heroes
WHERE Birthdate >= '1/1/1995' AND Birthdate <= '12/31/1999'


Imagine my surprise when one of the students turned in this:

SELECT FirstName, LastName, Birthdate
FROM Heroes
WHERE Birthdate BETWEEN '1995' AND '1999'

When I first saw the query I thought, “There is no way they ran that and it worked.” So I wrote it up and ran it on my data. Guess what? IT RUNS AND RETURNS DATA! I was shocked.

I started looking at the plan and what it did to the data, and found that it had done an implicit conversion on the dates and assumed 1/1/1995 to 1/1/1999 based on the year. So we were missing data from the results (nearly all of 1999, in fact), but I was still in shock that it had run in the first place. I shared this with my co-worker, who reminded me that dates get crazy: if I only put in '12/31/1999' and there is a time in the field, it will cut off most of the times within that day because it will assume I want '12/31/1999 00:00:00'. If I want the full day, I need to get in the habit of specifying '12/31/1999 23:59:59', or better yet asking for everything earlier than '1/1/2000 00:00:00', and then test my results to make sure I am getting what I truly want back from the database.
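Here is a small sketch of both points. It assumes Birthdate is a datetime column (the post does not say which type it is) and uses unseparated yyyymmdd literals, which the datetime type reads the same way regardless of language or DATEFORMAT settings.

```sql
-- A four-digit string is read as a year and becomes January 1 of that year at midnight,
-- which is why BETWEEN '1995' AND '1999' stops at the very start of 1999.
SELECT CAST('1995' AS datetime) AS RangeStart,
       CAST('1999' AS datetime) AS RangeEnd;

-- Half-open range: on or after the first day wanted, strictly before the first day NOT wanted.
-- This keeps every time of day on 12/31/1999 without guessing at 23:59:59.
SELECT FirstName, LastName, Birthdate
FROM Heroes
WHERE Birthdate >= '19950101'
  AND Birthdate <  '20000101';
```

The half-open form also behaves the same whether the column stores whole seconds or fractions of a second, so it is the version I would test first.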

The song for this post is BANNERS – Perfectly Broken.

We are never coming undone with NOT IN…

Last summer I wrote about NOT EXISTS and how I could speed up the query by adding a left join and adding an IS NULL to the WHERE clause. Last week, I had a similar situation, but with a NOT IN. It was killing my performance. I wanted to remove all my test emails as I prepared my production email list and was calling them out specifically.

SELECT *
FROM dbo.SuperHeroes s 
INNER JOIN dbo.Contact c ON s.ContactId = c.ContactId
  AND c.EmailAddress NOT IN ('Batman@TestEmail.com', 'Superman@TestEmail.com', 'WonderWoman@TestEmail.com', 'Aquaman@TestEmail.com')

So I tried my super cool trick:

CREATE TABLE #SuperHeroTestEmail (EmailAddress varchar(50));
  INSERT INTO #SuperHeroTestEmail  (EmailAddress) VALUES ('Batman@TestEmail.com');
  INSERT INTO #SuperHeroTestEmail  (EmailAddress) VALUES ('Superman@TestEmail.com');
  INSERT INTO #SuperHeroTestEmail  (EmailAddress) VALUES ('WonderWoman@TestEmail.com');
  INSERT INTO #SuperHeroTestEmail  (EmailAddress) VALUES ('Aquaman@TestEmail.com');

SELECT *
FROM dbo.SuperHeroes s 
INNER JOIN dbo.Contact c ON s.ContactId = c.ContactId
LEFT OUTER JOIN #SuperHeroTestEmail em ON c.EmailAddress = em.EmailAddress
WHERE em.EmailAddress IS NULL;

Here I create my temp table so I can load in the email addresses that I want to remove. Then I do a LEFT JOIN on those email addresses and in my where clause I force that join to only bring me back the records that show NULL for the temp email table. This way, I am telling SQL exactly what I want it to bring back instead of telling it what I don’t want. This makes my query run faster.

I was so excited, but when I started to test I noticed my counts were way off. I couldn’t figure out why until I realized my NOT IN was removing all my NULL email records. I don’t have email addresses for all of my super heroes! As soon as I figured that out, I knew what I had to do.

CREATE TABLE #SuperHeroTestEmail (EmailAddress varchar(50));
  INSERT INTO #SuperHeroTestEmail  (EmailAddress) VALUES ('Batman@TestEmail.com');
  INSERT INTO #SuperHeroTestEmail  (EmailAddress) VALUES ('Superman@TestEmail.com');
  INSERT INTO #SuperHeroTestEmail  (EmailAddress) VALUES ('WonderWoman@TestEmail.com');
  INSERT INTO #SuperHeroTestEmail  (EmailAddress) VALUES ('Aquaman@TestEmail.com');

SELECT *
FROM dbo.SuperHeroes s 
INNER JOIN dbo.Contact c ON s.ContactId = c.ContactId
LEFT OUTER JOIN #SuperHeroTestEmail em ON c.EmailAddress = em.EmailAddress
LEFT OUTER JOIN dbo.Contact c2 ON c2.ContactId = c.ContactId AND c2.EmailAddress IS NULL
WHERE em.EmailAddress IS NULL
AND c2.ContactId IS NULL;

Here I am using the primary key in Contact to join back to it, but I am also telling the join to only bring me records where the email is NULL. Then, in my WHERE clause, I tell it to only keep the records where c2.ContactId IS NULL. This gets me around using an “IS NOT NULL”, which is much slower in this case.
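To see why the NOT IN version was dropping the NULL emails in the first place, here is a tiny self-contained sketch. The table variable and the Flash address are made up for illustration; only the behavior of NOT IN with NULL is the point.

```sql
-- Hypothetical three-row sample: one test email, one real email, one missing email.
DECLARE @Contact TABLE (ContactId int, EmailAddress varchar(50) NULL);

INSERT INTO @Contact (ContactId, EmailAddress)
VALUES (1, 'Batman@TestEmail.com'),
       (2, 'Flash@JusticeLeague.com'),
       (3, NULL);

-- NULL NOT IN (...) evaluates to UNKNOWN, not TRUE, so ContactId 3 is filtered out.
-- Only ContactId 2 comes back.
SELECT ContactId
FROM @Contact
WHERE EmailAddress NOT IN ('Batman@TestEmail.com');

-- Asking for the NULLs explicitly keeps them: ContactId 2 and 3 come back.
SELECT ContactId
FROM @Contact
WHERE EmailAddress IS NULL
   OR EmailAddress NOT IN ('Batman@TestEmail.com');
```

A plain LEFT JOIN ... IS NULL behaves like the second query, which is why the counts changed when the NOT IN was swapped out.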

In the end, it cut my query run time in half, which was a win for me. The main idea is that you want to tell SQL Server what you want, not what you don’t want. It helps the engine be more precise and know exactly what to bring you. I compare this to going to a restaurant and telling the waiter what dish I want instead of telling the waiter, “I don’t want anything with fish.” Then the waiter has to check every dish and ask me if that is what I want. It takes longer. Instead, I just tell the waiter and SQL Server what I want to consume.

The song today is a mashup of Taylor Swift (We Are Never Ever Getting Back Together) and Korn (Coming Undone) and has been living rent free in my head for weeks. Huge thanks to my awesome co-worker, Bree, who showed it to me.

Transactions follow me left and right but who did that over here?

I have been working my way through a fantastic training on SQL Internals and when I saw this trick, I had to write it down so I wouldn’t forget it.

Say you have a user come to you because they dropped a table sometime yesterday, but they don’t remember when, and now they need it back. You could start the restore process and roll through logs until you see the drop and then restore to the hour before, or you could run this super cool query to get the time the table was dropped.

(Before I ran this, I set up a test database, created a table, filled it with stuff, took a full backup and a transaction log backup, dropped the table, and then took another transaction log backup.)

SELECT [Current LSN]
		,[Operation]
		,[Context]
		,[Transaction ID]
		,[Description]
		,[Begin Time]
		,[Transaction SID]
FROM fn_dblog(NULL, NULL)
INNER JOIN (SELECT [Transaction ID] AS tid
		FROM fn_dblog(NULL, NULL)
		WHERE [Transaction Name] LIKE 'DROPOBJ%') fd ON [Transaction ID] = fd.tid

See that Begin Time? We want to roll our logs forward to right before that started. How cool is that?!!! We get point-in-time recovery to the nearest possible moment, all because of reading through the log to see when the drop occurred.
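Rolling forward to just before that Begin Time could look something like the sketch below. It follows my test setup (one full backup, one log backup before the drop, one after) and restores to a side-by-side copy. Every database name, file path, logical file name, and the timestamp here is hypothetical, so swap in your own.

```sql
-- Hypothetical names and paths throughout; the STOPAT value is a made-up moment
-- just before the Begin Time that the fn_dblog query returned.
RESTORE DATABASE HeroTest_Recovered
FROM DISK = N'D:\Backups\HeroTest_Full.bak'
WITH MOVE N'HeroTest'     TO N'D:\Data\HeroTest_Recovered.mdf',
     MOVE N'HeroTest_log' TO N'D:\Data\HeroTest_Recovered_log.ldf',
     NORECOVERY;

-- Log backup taken before the drop: apply all of it and stay in restoring mode.
RESTORE LOG HeroTest_Recovered
FROM DISK = N'D:\Backups\HeroTest_Log1.trn'
WITH NORECOVERY;

-- Log backup that contains the drop: stop right before it and bring the copy online.
RESTORE LOG HeroTest_Recovered
FROM DISK = N'D:\Backups\HeroTest_Log2.trn'
WITH STOPAT = '2022-06-01T14:29:59', RECOVERY;
```

Restoring to a copy means the live database stays untouched, and the dropped table can be copied back from the recovered one.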

But this next part was the piece that blew my mind. What if I didn’t know who dropped the table, but wanted to talk to them so they didn’t do it again? I can add one more column to my query.

SELECT [Current LSN]
		,[Operation]
		,[Context]
		,[Transaction ID]
		,[Description]
		,[Begin Time]
		,[Transaction SID]
		,SUSER_SNAME ([Transaction SID]) AS WhoDidIt
FROM fn_dblog(NULL, NULL)
INNER JOIN (SELECT [Transaction ID] AS tid
		FROM fn_dblog(NULL, NULL)
		WHERE [Transaction Name] LIKE 'DROPOBJ%') fd ON [Transaction ID] = fd.tid

I am passing that Transaction SID into the SUSER_SNAME built in function.

Probably shouldn’t be surprised by that answer.

The song for this post is Left and Right by Charlie Puth.

I’m Gonna Spend My Time Speeding that Query Up, Like It’s Never Enough, Like it’s Born to Run…

Have I mentioned that I like query tuning? One of my favorite tuning tricks is removing Sub-queries from WHERE clauses. Let me give an example:

SELECT HeroName
	,HasCape
	,FavoriteColor
	,LairId
FROM [dbo].[SuperHero] s 			
WHERE HeroType = 2
AND NOT EXISTS(SELECT 1 
		FROM [dbo].[SuperHero] x 								
		WHERE x.HeroID = s.HeroID 
			 AND x.IsHuman = 1 AND x.Weakness = 'Lack of Control')

Notice the “NOT EXISTS *Sub-Query*” section. Any time I see this pattern or even a “NOT IN *Sub-Query*” pattern, I know I can fix it like this:

SELECT s.HeroName
		, s.HasCape
		, s.FavoriteColor
		, s.LairId
FROM [dbo].[SuperHero] s 
	LEFT JOIN [dbo].[SuperHero] x ON x.HeroID = s.HeroID 
		 AND x.IsHuman = 1
		 AND x.Weakness = 'Lack of Control'	
WHERE s.HeroType = 2 /*alias is required here: both s and x have a HeroType column*/
	AND x.HeroID IS NULL

In this second example, I have moved the sub-query to be in a LEFT JOIN with the same criteria, and then in the WHERE I use one of the columns that should be populated (I favor ID columns here) and look to see if it “IS NULL”. That “IS NULL” works the same way as the “NOT EXISTS” (and as the “NOT IN”, as long as the column being compared has no NULLs in it).

This allows me to remove the non-sargable arguments from the WHERE clause and takes my query from non-sargable to sargable. (From Wikipedia: the term is derived from a contraction of Search ARGument ABLE. A query failing to be sargable is known as a non-sargable query and typically has a negative effect on query time, so one of the steps in query optimization is to convert them to be sargable.)

With simple queries that have a low number of records, I hardly notice a difference in performance. As the queries become more complex or the row numbers increase, the difference begins to show in the query run time and the IO statistics.
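One way to check whether the rewrite actually helps on a given table is to run both versions back to back with the statistics switched on and compare the numbers in the Messages tab. This is only a sketch of how I would measure it, reusing the example table from this post; the results will depend on your data and indexes.

```sql
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Version 1: NOT EXISTS sub-query
SELECT s.HeroName, s.HasCape, s.FavoriteColor, s.LairId
FROM dbo.SuperHero s
WHERE s.HeroType = 2
  AND NOT EXISTS (SELECT 1
                  FROM dbo.SuperHero x
                  WHERE x.HeroID = s.HeroID
                    AND x.IsHuman = 1
                    AND x.Weakness = 'Lack of Control');

-- Version 2: LEFT JOIN with IS NULL
SELECT s.HeroName, s.HasCape, s.FavoriteColor, s.LairId
FROM dbo.SuperHero s
LEFT JOIN dbo.SuperHero x ON x.HeroID = s.HeroID
    AND x.IsHuman = 1
    AND x.Weakness = 'Lack of Control'
WHERE s.HeroType = 2
  AND x.HeroID IS NULL;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

I compare the logical reads and the CPU and elapsed times for each statement, and I also confirm both return the same row count before trusting either one.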

The song for this post is I’m Born to Run by American Authors.

I’m Beggin, Beggin you, to stop using VarChars as IDs

As I was troubleshooting a performance issue, I noticed that there was an implicit conversion (SQL Server automatically converts the data from one data type to another) happening in my join. The join was on a column that was named the same in both tables, but one was datatype INT (integer) and the other was a datatype of VARCHAR(50) (variable character up to 50 places).

While the implicit conversion was happening transparently to our coders and users, it was causing performance impacts to the query. I wanted to change the datatype from VARCHAR(50) to an INT, not only to match the other table, but also because INTs are faster to join on than VARCHARs in the SQL engine.

My first step was to make sure there weren’t any values in the column that would have an issue changing to an int. For this task, I am using TRY_CAST to make my life easier.

SELECT TRY_CAST(SuperHeroId as INT) as Result, SuperHeroId
FROM dbo.Lair
WHERE TRY_CAST(SuperHeroId as INT) IS NULL
AND SuperHeroId IS NOT NULL

The TRY_CAST above is checking to see if I can CAST the value as an INT. If it can’t, it will return a NULL value. My WHERE clause will help me quickly identify the values that are failing which will allow me to fix the data before I change the data type on the column.

Once my query doesn’t return any rows, I am ready to change my datatype, which will remove that implicit conversion and increase the performance of any queries using that join.
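The change itself can be as small as the sketch below. I am assuming the column should stay nullable (my check query skips NULLs, so there may be some) and that no index, constraint, or foreign key references the column; anything like that has to be dropped first and recreated afterward.

```sql
-- Assumes SuperHeroId allows NULLs and nothing (index, constraint, key) depends on it.
ALTER TABLE dbo.Lair
ALTER COLUMN SuperHeroId INT NULL;

-- Quick confirmation of the new data type.
SELECT c.name AS ColumnName, t.name AS DataType, c.is_nullable
FROM sys.columns c
JOIN sys.types t ON t.user_type_id = c.user_type_id
WHERE c.object_id = OBJECT_ID('dbo.Lair')
  AND c.name = 'SuperHeroId';
```

On a large table this rewrites every row, so I would try it in a test environment first and pick a quiet window for production.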

The song for this post is Beggin’ by Måneskin.

I can query multiple instances, I am King!

In the past, I have talked about CMS (Central Management Servers), but now I don’t have CMS configured and still want to query multiple instances at once. Local Server Groups are my friend.

In SSMS, I start by selecting View>>Registered Servers.

I then right click on “Local Server Groups” and select “New Server Group”.

Next, I right click on the group I just created (in this case “Production”) and select “New Server Registration”. I then fill in my server name, the type of authentication (in this case SQL Server Authentication), and my login and password. I also save my password, which will help in the future. The Registered Server Name can be different from the server name. In the real world, my server names are weird, so the Registered Server Name is the easy-to-remember name or nickname I use for the server (all of my servers have nicknames). The description will come up when I hover over the server name once I have it registered.

Then I repeat this process until I have registered all my servers for Production under the Production group.

Now comes the cool part. I right click on my Production Server Group and select “New Query”. Because I saved my password, it connects to all my Production instances in one window. By default, it creates a pink bar at the bottom showing how many instances connected and the name of the Server Group.

Now I can run all my queries at once and the results will have the instance name prepended to each row. A word of warning: I never leave this connection open. I open it when I need it and then close it again so I don’t accidentally run something against all my servers.
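A harmless first query to try in that multi-server window is something read-only like the sketch below. The column aliases are my own; the functions and catalog view are standard.

```sql
-- Read-only inventory check, safe to run against every registered instance at once.
-- SSMS adds the server name column to the combined results on its own.
SELECT SERVERPROPERTY('ProductVersion') AS ProductVersion,
       SERVERPROPERTY('Edition')        AS Edition,
       (SELECT COUNT(*) FROM sys.databases) AS DatabaseCount;
```

Because every instance returns the same column layout, the results merge into one grid that is easy to sort and compare.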

The song for this post, King, is by Florence + the Machine.

I’m going on down to New Orleans and renaming some database things…

This week, I had a co-worker that was stuck. They no longer use SSMS and needed to rename a database. They asked if I had a script and so I wrote one. Here it is:

USE master /*Use the master database when renaming a database*/
GO

DECLARE @SQL VARCHAR(8000)

SELECT @SQL=COALESCE(@SQL,'')+'Kill '+CAST(spid AS VARCHAR(10))+ '; '
FROM sys.sysprocesses
WHERE DBID=DB_ID('DatabaseName')

EXEC(@SQL) /*This will kill all the connections to the database, which will allow it to be renamed*/

ALTER DATABASE DatabaseName /*This is the start of the rename*/
SET SINGLE_USER /*keep everyone out while we rename*/
WITH ROLLBACK IMMEDIATE;
GO
ALTER DATABASE DatabaseName MODIFY NAME = NewDatabaseName /*All the magic has been preparing for this moment, the rename*/
GO
ALTER DATABASE NewDatabaseName /*Make sure to use the new name*/
SET MULTI_USER; /*Back to letting others into the newly named database*/
GO

So useful, I had to save it.

The song for this post is Goin’ Down by the Monkees.

Now a story about the song. The Monkees’ TV Show came back on the air when I was little. I was immediately hooked and LOVED their music. I could relate a lot to Micky Dolenz because he was a prankster like me and my family. Goin’ Down was a song I didn’t pay much attention to because I could hardly understand what Micky was singing.

Recently, I read a news story about a time that Micky was doing a concert and there were deaf people in the audience. There was an ASL interpreter who had done a wonderful job with all the songs. Just as Micky was getting ready to sing this one, he looked over at the interpreter and said, “Good Luck”. After the first few lines, she gave up and just clapped along. He ended up standing next to her while singing the rest of the song. How awesome is that?

The story made me want to listen to the song more and it has been one of my favorite fast moving songs the last few weeks.

Oh my my, yeah I’m loving extra tuning time. ‘Cause I’m a sucker for Auto Tuning life.

A few weeks ago, we were talking to a new employee about how much time we spend with Query Store and they asked, “Why aren’t you using the Auto Tuning?”

THE WHAT NOW?!!!!

This awesome, Enterprise-only feature has been a bit of a trial and error for me.

Let’s start with turning it on. The only way I have found to turn it on is with T-SQL:

ALTER DATABASE <DatabaseThatNeedsTuning> SET AUTOMATIC_TUNING ( FORCE_LAST_GOOD_PLAN = ON ); 
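To confirm the setting actually took, I can check the automatic tuning options for the database I am connected to. This is a small sketch using the catalog view for those options; if the desired and actual states differ, the reason column usually explains why (for example, Query Store being off or read-only).

```sql
-- Run in the database that was just altered.
SELECT name,
       desired_state_desc,
       actual_state_desc,
       reason_desc
FROM sys.database_automatic_tuning_options;
```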

Now, there are a couple of cool things that we can check behind the scenes to see what is driving the auto-tuning. Once Auto-Tuning is enabled, it will collect information that can be viewed by running this query:

SELECT *
FROM sys.dm_db_tuning_recommendations

The first column (name) is the QueryID with “PR_” added to it. I like to read through the columns: the “reason” the plan was chosen, the current “state” of the plan, when it was initiated, and when it was reverted. All of this is fun for me to dig through to see what better plans my system is finding.

I also really like the Microsoft example with the JSON:

SELECT reason, score,
      script = JSON_VALUE(details, '$.implementationDetails.script'),
      planForceDetails.*,
      estimated_gain = (regressedPlanExecutionCount + recommendedPlanExecutionCount)
                  * (regressedPlanCpuTimeAverage - recommendedPlanCpuTimeAverage)/1000000,
      error_prone = IIF(regressedPlanErrorCount > recommendedPlanErrorCount, 'YES','NO')
FROM sys.dm_db_tuning_recommendations
CROSS APPLY OPENJSON (Details, '$.planForceDetails')
    WITH (  [query_id] int '$.queryId',
            regressedPlanId int '$.regressedPlanId',
            recommendedPlanId int '$.recommendedPlanId',
            regressedPlanErrorCount int,
            recommendedPlanErrorCount int,
            regressedPlanExecutionCount int,
            regressedPlanCpuTimeAverage float,
            recommendedPlanExecutionCount int,
            recommendedPlanCpuTimeAverage float
          ) AS planForceDetails;

Now to the other stuff. It isn’t perfect. Sometimes I have to manually go in and pin plans that are better than what the system is finding. If I manually pin a plan, it will honor it and not unpin or try to find a better plan for that query. It has helped me spend a bunch less time on tuning, but since many of my servers are on Standard Edition I am still using Query Store a lot.
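Pinning a plan by hand can also be done in T-SQL instead of through the Query Store reports. The sketch below uses made-up IDs (query 42, plan 7), so look up the real ones first.

```sql
-- Hypothetical query_id; list the plans Query Store has captured for it.
SELECT q.query_id, p.plan_id, p.is_forced_plan, p.last_execution_time
FROM sys.query_store_query q
JOIN sys.query_store_plan p ON p.query_id = q.query_id
WHERE q.query_id = 42;

-- Pin (force) the plan I want. Both IDs are hypothetical.
EXEC sp_query_store_force_plan @query_id = 42, @plan_id = 7;

-- To release it later:
-- EXEC sp_query_store_unforce_plan @query_id = 42, @plan_id = 7;
```

I keep a note of anything I pin manually so I remember to revisit it after big data or index changes.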

Happy Tuning!

The song for this post is Oh My My by Blue October.

And it was never a question, Query Store was crowing for repair. You gave it space and direction but you couldn’t keep it there…

Yes! Back to Query Store! I have had this problem for months where one of my Query Store databases grows by a gig each week! It completely fills up and goes into a read-only state (which sets off an alarm that I built to tell me when it switches to read-only), and the only way I could get it to work again was to add space. I would add a gig and think, “Surely that will be enough to feed the hunger”. The next week, the alarm would go off again and I would feed it again!

I adjusted how often stats were collected, how frequently data was flushed, the max plans per query and anything else I could think to do, and still, it was hungry.

I had searched, read, googled, and kept coming up with nothing. I finally found something on corruption in the Query Store. CORRUPTION? Could it be possible? It was worth a try. My Query Store was in need of a serious diet and I still needed it to function.

The next time it went into read-only mode, I turned it off (it has to be off to fix the corruption) and ran this:

EXEC sp_query_store_consistency_check;

Guess what happened next?!!!! My Query Store had a full gig free! I have left it alone for a few weeks, and today I was able to shrink it by 5 GB! It has been glorious to have it working and not be worried about why it was growing out of control.
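Put together, the whole sequence looks roughly like the sketch below. The database name HeroDb is a placeholder, and the final statement that switches back to read-write is my own addition for cases where Query Store stays read-only after being turned back on.

```sql
USE [HeroDb];  -- placeholder database name
GO

-- Check the current state, why it is read-only, and how full it is.
SELECT actual_state_desc, readonly_reason,
       current_storage_size_mb, max_storage_size_mb
FROM sys.database_query_store_options;

-- Query Store has to be off for the consistency check.
ALTER DATABASE [HeroDb] SET QUERY_STORE = OFF;
GO
EXEC sp_query_store_consistency_check;
GO
ALTER DATABASE [HeroDb] SET QUERY_STORE = ON;
ALTER DATABASE [HeroDb] SET QUERY_STORE (OPERATION_MODE = READ_WRITE);
```

Running the first SELECT again afterward shows whether the state is back to READ_WRITE and how much space was freed.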

The song for this post is Toad the Wet Sprocket’s Crowing.

Now a personal note about Toad the Wet Sprocket. They are one of my favorite bands. Last night, as I was listening to “Crowing”, I looked up the lyrics to figure out one of the words and realized I had been singing along to the wrong words. I thought it was “crowing for her” when it is actually “crowing for repair”. That completely changed the meaning of the song for me and made me love that song even more. It also made me realize I need to read Toad lyrics more often.

This also took my mind to the time that Ryan surprised me with tickets to go see them at a reunion show in Vegas. After the show, fortune shone on me and I got to meet Glen Phillips, the lead singer. He was super kind and gracious and let us take a picture, and right after that I fan-girled out a lot and started crying while trying to tell him how much I appreciated his and the band’s music. Huge apology to all the people I have completely scared with a fan-girl episode. I promise I try not to, it’s just that sometimes I can’t put into words how important that moment is to me.

Starting now, is the wrong date for insert…

This is part 2 of my log-shipping journey. If you missed part 1, you can find it here.

I collected all of the file names I need, but you will notice I left the dates off. When my files are moved from one domain to another, their created dates are being changed. I needed the real dates for the correct restore order and to match the backups to the logs. If I were a Dark Knight PowerShell master, I am sure I would have figured out how to do it. Every time I started to get close, I would have a production issue or another distraction that needed my time. In the end I landed in my happy place, so we are fixing the dates in the database!

How do I get the right date for a file when the created date is being changed after it has moved? I was super lucky that the date is being stored in the filename too! (Huge thank you to Ola for his awesome database maintenance solution.)

An example of my filename is this:

Batcentral$Alfred_Batman_LOG_20210610_224501.trn

This is how I dissect the filename to get the date and time from it:

  UPDATE [DBAStuff].[dbo].[LogshippingFile]
  SET CreatedDate = CAST(Substring(FileName, (LEN(FileName)-18),8)  +' '+ (Substring(FileName, (LEN(FileName)-9),2)+ ':' + Substring(FileName, (LEN(FileName)-7),2) + ':' + Substring(FileName, (LEN(FileName)-5),2)) AS DATETIME)
  WHERE CreatedDate IS NULL

My filenames are different lengths, which means the dates won’t always be in the same place. Instead, I go to the end of the string and count backwards, because the date and time at the end of the name are always in the same format. Then I add all the parts back together to get my datetime and update it into my table.
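To see the counting-backwards trick on its own, here is the same expression run against a single variable instead of the table. The variable name is mine; the filename is the example from above.

```sql
DECLARE @FileName varchar(260) = 'Batcentral$Alfred_Batman_LOG_20210610_224501.trn';

SELECT SUBSTRING(@FileName, LEN(@FileName) - 18, 8) AS DatePart,   -- 20210610
       SUBSTRING(@FileName, LEN(@FileName) - 9, 2)  AS HourPart,   -- 22
       SUBSTRING(@FileName, LEN(@FileName) - 7, 2)  AS MinutePart, -- 45
       SUBSTRING(@FileName, LEN(@FileName) - 5, 2)  AS SecondPart, -- 01
       CAST(SUBSTRING(@FileName, LEN(@FileName) - 18, 8) + ' '
          + SUBSTRING(@FileName, LEN(@FileName) - 9, 2) + ':'
          + SUBSTRING(@FileName, LEN(@FileName) - 7, 2) + ':'
          + SUBSTRING(@FileName, LEN(@FileName) - 5, 2) AS DATETIME) AS CreatedDate; -- 2021-06-10 22:45:01.000
```

The offsets only work because the name always ends in an 8-digit date, an underscore, a 6-digit time, and a 4-character extension. A file with a different ending would need different numbers, so I would spot check a few rows before running the UPDATE.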

Are we done yet? Nope, there is more.

The song for this post is Toad the Wet Sprocket’s Starting Now.