And it was never a question, Query Store was crowing for repair. You gave it space and direction but you couldn’t keep it there…

Yes! Back to Query Store! I have had this problem for months where one of my Query store databases grows by a gig each week! It completely fills up, goes into a Read only state (which sets off the an alarm that I built to tell me when it switches to read only) and the only way I could get it to work again was to add space. I would add a gig and think, “Surely that will be enough to feed the hunger”. The next week, the alarm would go off again and I would feed it again!

I adjusted how often stats were collected, how frequently data was flushed, the max plans per query and anything else I could think to do, and still, it was hungry.

I had searched, read, googled, and kept coming up with nothing. I finally found something on corruption in the query store. CORRUPTION? Could it be possible? It was worth a try, my query store was in need of a serious diet and I still needed it to function.

The next time it went in to read only mode, I turned it off (it has to be off to fix the corruption) and ran this:

sp_query_store_consistency_check

Guess what happened next?!!!! My query store had a full gig free! I have left it alone for a few weeks and today I was able to shrink it by 5 gb! It has been glorious to have it working and not being worried as to why it was growing out of control.

The song for this post is Toad the Wet Sprocket’s Crowing

Now a personal note about Toad the Wet Sprocket. They are one of my favorite bands. Last night as I was listening to “Crowing”. I looked up the lyrics to figure out one of the words and realized I had be singing along to the wrong words. I thought it was “crowing for her” when it is actually “crowing for repair”. That completely changed the meaning of the song for me and made me love that song even more. It also made me realize I need to read Toad lyrics more often.

This also took my mind to the time that Ryan surprised me with tickets to go see them at a Reunion show in Vegas. After the show, fortune shown on me and I got to meet Glen Phillips the lead singer. He was super kind and gracious and let us take a picture and right after I fan-girled out a lot and started crying while trying to tell him how much I appreciated his and the band’s music. Huge apology to all the people I have completely scared with a fan-girl episode. I promise I try not to, just sometimes I can’t word how important that moment is to me.

When I add a table new, that missing piece is found…

This is part 3 of my log-shipping journey, if you missed part 1, you can find it here and if you missed part 2, you can find it here.

As I starting working with my log-shipping job, I realized that I needed a second table. One to stage all my powershell loaded filenames and one that I could use to build the restores. I also wanted to be able to truncate the Logshipping Table to empty it before loading the filenames so I won’t miss any, but not have to redo work in my main table that had already been done. At the beginning of my SQL Agent job, I have the step that Truncates the Logshipping Table. Then I load the filenames with the powershell script, I update the created dates and then I load it all into my new RestoreFile Table.

CREATE TABLE [dbo].[RestoreFile](
	[Id] [int] IDENTITY(1,1) NOT NULL,
	[FileName] [varchar](500) NULL,
	[FileLocation] [varchar](1000) NULL,
	[FileType] [char](1) NULL,
	[Origin] [varchar](50) NULL,
	[ForDatabase] [varchar](50) NULL,
	[CreatedDate] [datetime2](7) NULL,
	[IsApplied] [bit] NOT NULL,
	[IsFinalLogFile] [bit] NOT NULL,
 CONSTRAINT [PK_RestoreFile] PRIMARY KEY CLUSTERED 
(
	[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO

ALTER TABLE [dbo].[RestoreFile] ADD  CONSTRAINT [DF_RestoreFile_IsApplied]  DEFAULT ((0)) FOR [IsApplied]
GO

ALTER TABLE [dbo].[RestoreFile] ADD  CONSTRAINT [DF_RestoreFile_IsFinalLogFile]  DEFAULT ((0)) FOR [IsFinalLogFile]
GO

I removed the IsApplied column from the Logshipping table because I didn’t need it. The IsFinalLogFile becomes super important when we go to bring the database into Standby mode.

Next, I need to move the data from Logshipping into RestoreFile:

INSERT INTO RestoreFile(FileName, FileLocation, FileType, Origin, ForDatabase, CreatedDate)
SELECT A.FileName, A.FileLocation, A.FileType, A.Origin, A.ForDatabase, A.CreatedDate	
FROM [dbo].[LogshippingFile] A
LEFT JOIN [dbo].[RestoreFile] B ON A.FileName = B.FileName 
WHERE B.FileName IS NULL

I am comparing the filenames so that I don’t get duplicates. Thanks again to the date stamping in the names, I won’t get the same name twice, but I run this load frequently during the day and don’t want to bloat my table.

Now, we have all the pieces in place to do our restores! Get ready for some fun!

The song for this post is Vance Joy’s Missing Piece

Starting now, is the wrong date for insert…

This is part 2 of my log-shipping journey, if you missed part 1, you can find it here.

I collected all of the file names I need, but you will notice I left the dates off. When my files are moved from one domain to another, their created dates are being changed. I needed the real dates for the correct restore order and to match the backups to the logs. If I were a Dark Knight Powershell master, I am sure I would have figured out how to do it. Every time I started to get close, I would would have a production issue or another distraction that needed my time. In the end I landed in my happy place, so we are fixing the dates in the database!

How do I get the right date for a file when the created date is being changed after it has moved? I was super lucky that the date is being stored in the filename too! (Huge thank you to Ola for his awesome database maintenance solution.)

An example of my filename is this:

Batcentral$Alfred_Batman_LOG_20210610_224501.trn

This is how I dissect the filename to get the date and time from it:

  UPDATE [DBAStuff].[dbo].[LogshippingFile]
  SET CreatedDate = CAST(Substring(FileName, (LEN(FileName)-18),8)  +' '+ (Substring(FileName, (LEN(FileName)-9),2)+ ':' + Substring(FileName, (LEN(FileName)-7),2) + ':' + Substring(FileName, (LEN(FileName)-5),2)) AS DATETIME)
  WHERE CreatedDate IS NULL

My filenames are different lengths which means the the dates won’t always be in the same place, instead I go to the end of the string and count backwards because my dates are always consistent. Then I add all the parts back together to get my datetime and update it into my table.

Are we done yet? Nope, there is more.

The song for this post is Toad the Wet Sprockets’ Starting Now

I’m Going Under and This Time DBCC Can’t Save Me…

Over the weekend I was troubleshooting an issue and found that part of it was being caused by corruption on one of my databases.  I have seen DBCC CheckDB fix corruption a few times and thought, “No problem, I’ve got this.”

I ran:


ALTER DATABASE CorruptDB SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
BEGIN TRANSACTION;
DBCC CHECKDB ('CorruptDB', REPAIR_REBUILD);
ALTER DATABASE CorruptDB SET MULTI_USER;
COMMIT

I waited for it to finish because the rule is, “Let it finish”.

There were still errors. So I ran it again.

The same errors appeared saying that some errors were waiting on others to be fixed first.  But nothing had fixed and my heart sank.  I had a moment of panic as I furiously googled.  I saw post after post telling me to restore the backup…Did I mention the small problem that the corruption had prevented the backups from completing for this database but didn’t throw an error? My backups were over two weeks old and a roll back was not an option.

I found a post by Paul Randal from 2006 talking about when DBCC CheckDB won’t solve the issue.  He mentioned that sometimes it is just an index or statistic that is corrupted.  This gave me hope and I started tracing my error.  I found it was on a Change Data Capture table that is being used in our replication.  I began pondering an idea.  I removed replication and ran the DBCC CheckDB.  It was clean! No corruption! I breathed and took a backup. Then I added replication back on and ran the DBCC CheckDB again.  It was clean again! Everything was healthy and fine!

Thank you to everyone for your past posts that save my sanity and thank you for DBCC CheckDB!!!

 

The song for this post is Someone You Loved by Lewis Capaldi

If I need to rearrange my fragmentation, I will for you….

Oh my goodness, I have been buried and learning tons about Indexing.  Something super cool that I learned was how to know if you need to adjust your fillfactors on your indexes.  My whole DBA career I have usually set it 85 and forgotten it. I learned that I can check my CommandLog and see how often that index is being rebuilt.   I am currently using Ola and when my indexes are rebuilt for maintenance (rebuilt nightly in this case), it is all logged to a table named CommandLog in the master database.  If an index is rebuilt everyday, adjust the fillfactor  down (75) so that it will rebuild less often.  If it nearly never rebuilds, adjust the fillfactor up (90) so it gets rebuilt once in a while and to avoid wasting space in an index. Here is the query I am using to see how often an index has rebuilt in the last 10 days.


SELECT Command, COUNT(StartTime) AS Rebuilt
FROM master.[dbo].[CommandLog]
WHERE CommandType LIKE '%Index%' AND StartTime > GETDATE()-10
GROUP BY Command
ORDER BY Rebuilt DESC

When I see ones that have a rebuilt number of 10, 9 or 8, I know those are rebuilding nearly everyday.  I will adjust their fillfactors down so that I don’t rebuild as often.  My sweet spot right now is once a week, but that doesn’t work for every index (or environment).  This is where the art part comes into the DBA world and I have to think about what is best for the system.  If I have already adjusted something down to 75 and it is still rebuilding frequently, I will adjust that number down more (65, then 55), and back up if it is too low.  I also evaluate how much that index is used and how important it is to keep healthy. If it is hardly ever used, do I want to waste fillfactor space?

Indexing is absolutely an art, but now I have a new brush for my kit by being able to query the CommandLog.

This post’s song is Particles by Nothing But Thieves

It’s different for Availability Groups when service packs get broke…

Last week I did a few things wrong.  The good news is I learned from it and now can prevent myself from repeating it.

So I have this AG, it is kind of important, hence the AG part but after 5pm I have two hours that it can be down, or so I thought.  We recently added new functionality that requires it to be up all the time.  I have been applying SQL 2014 SP2 to all of my 2014 servers all of which have been in AGs and it has been super easy, less than 10 minutes of work and only a reboot of downtime. (I am a little old fashioned and always reboot after a service pack.) My mind said, “Sweet, you can get this done and no one will notice and if you do both at the same time and delay the reboot on one of them, there will be no down time.”  That was my first mistake.  My second one was starting a few minutes early. I was excited to have it done because it was my last round of service packs. My final mistake was not realizing that SSISDDB is considered a system database and should NEVER be in an availability group. It may be super awesome because you can keep your SSIS catalog completely in sync. You could maybe do it if you plan to remove it before doing service packs or any kind of upgrades to the server, bur as far as setting it and forgetting it, you are in trouble if you do it.

Here are a few of the fun errors that we saw.

“Script level upgrade for database ‘master’ failed because upgrade step ‘SSIS_hotfix_install.sql’ encountered error 942, state 4, severity 25. This is a serious error condition which might interfere with regular operation and the database will be taken offline. If the error happened during upgrade of the ‘master’ database, it will prevent the entire SQL Server instance from starting. Examine the previous errorlog entries for errors, take the appropriate corrective actions and re-start the database so that the script upgrade steps run to completion.”

“Cannot recover the master database. SQL Server is unable to run. Restore master from a full backup, repair it, or rebuild it. For more information about how to rebuild the master database, see SQL Server Books Online.”

Huge thank you to all the people that have blogged about Trace Flag 902.  It allowed us to start up SQL Server and find the errors and pull SSISDB out of the AG and get the service packs to finish running and everything was happy and great.  Here is the list of steps from the Microsoft KB article:

Enable trace flag 902 on the instance of SQL Server. To do this, follow these steps:

  1. Open SQL Server Configuration Manager.
  2. In SQL Server Configuration Manager, click SQL Server Services.
  3. Double-click the SQL Serverservice.
  4. In the SQL Server Properties dialog box, click the Advanced tab.
  5. On click the Advanced tab, locate the Startup Parameters item.
  6. Add ;-T902 to the end of the existing string value, and then click OK.

Lessons Learned:

  1. Don’t start earlier than you say you are going to start.
  2. Don’t do both sides of an AG at the same time.  Do the Passive one first and make sure everything is up and working and the AG is healthy before you do the primary node. Verify the fail-over works to the passive node so that you have no down time.
  3. Don’t get too comfortable with Service Packs and Cumulative Updates. They are still a big change even though they are pretty stable.
  4. Don’t have any system databases in the AG while applying patches and know that SSISDB is a part of the system databases.
  5. An awesome team that can back you up and help you trouble shoot can make all the difference.  It is amazing to have a boss that believes in you and is encouraging to help you keep going even when you want to give up and go cry in the closet.

Also for those of you following along at home and what to know what song goes with this post: It’s Different for Girls

Happy Service Packing!

Takes as Long as it Takes SQL, Break on Me, I have a DAC…

The DAC, what is it? It is the Dedicated Administrator Console. Basically it is a way to connect to your SQL Server when all the other connections are tied up. But it takes a little bit of pre-planning so that you can use it when things go bad with your SQL Server.  Let’s enable it so you can test using it and know that it is there in the future.


EXEC sp_configure 'remote admin connections', 1;
GO
RECONFIGURE
GO

I was always afraid to use the DAC because I thought I had to use the command line and I am terrible with cmd. But did you know, you can connect to it through Management Studio? Now I can feel right at home when I trouble shoot a sick server.  To connect to a server using the DAC put “admin:” in front of the name of the server like this:

DAC

I use my elevated windows login for this point, but you can use what ever sys admin account that you have.

There are a few words of caution. You can only have one connection to your DAC on each server at a time. So don’t try to connect Object explorer to it. I have SQLPrompt and noticed that my server would send an alert on Severity Error 20 because SQL Prompt was trying to connect at the same time and was failing. Just be aware that you might get an error, but if everything is correct you can look down at the connection at the bottom of the query window and see you are connected.  If someone else is using the DAC, you won’t be able to use the connection, so it is a good idea to always close it when you are done.

Kendra Little b|t has a fantastic post for Brent Ozar Unlimited about the DAC and most importantly to me, how to tell who is connected.  This is the code that she wrote to help identify who is connected:


SELECT
CASE
WHEN ses.session_id= @@SPID THEN 'It''s me! '
ELSE '' END
+ coalesce(ses.login_name,'???') as WhosGotTheDAC,
ses.session_id,
ses.login_time,
ses.status,
ses.original_login_name
from sys.endpoints as en
join sys.dm_exec_sessions ses on
en.endpoint_id=ses.endpoint_id
where en.name='Dedicated Admin Connection'

Now when SQL Server Breaks on you, there is a tool that prevents us from shattering like glass.

If Crazy = Genius, I’m rebuilding one partition at a time!

We have had an index job that has been failing for a while.  This is one of those things you really don’t want to clean up because no one is complaining, but you know you should.  I had heard that I could rebuild one partition at a time, but where to start?  Today, I worked my way through it, so here it is so that you can do it too.

First you need to find the biggest indexes, there is a good chance those are the ones that live on partitions. I am removing Primary Keys.


SELECT i.[name] AS IndexName
,SUM(ps.[used_page_count]) * 8 AS IndexSizeKB
FROM sys.dm_db_partition_stats AS ps
INNER JOIN sys.indexes AS i ON ps.[object_id] = i.[object_id]
AND ps.[index_id] = i.[index_id]
WHERE i.name NOT LIKE 'PK%'
GROUP BY i.[name]
ORDER BY IndexSizeKB DESC

The top ones are most likely the ones you want to focus on.  Next, we need to track down the partition function.


SELECT * FROM sys.partition_functions

Next we are going to figure out which partition we want to rebuild.


SELECT sprv.value AS [Value], sprv.boundary_id AS [ID]
FROM sys.partition_functions AS spf
INNER JOIN sys.partition_range_values sprv ON sprv.function_id = spf.function_id
WHERE (spf.name = N'NameOfFunctionFromPartitionFunctions')
ORDER BY [ID] ASC

Pick the number of the partition you want to rebuild.

Next use your index name from earlier.  You will also need the table name and the partition number.


ALTER INDEX [IndexName]
ON [dbo].[TableName]
REBUILD PARTITION = 3

This will rebuild just the partition that you requested.

She keeps her management studio error to minimal, Southern Style…

It is painful when you have been working on a tough query and you start to get an out of memory error.
“An error occurred while executing batch. Error message is: Exception of type ‘System.OutOfMemoryException’ was thrown”

What in the world? But it was working before! Why is this happening? Is it happening for everyone? Do I need to reboot the SQL Server?

This is a SQL Server Management Studio Error. Usually it means that you have filled up the local memory cache on your machine. Save all your queries and close Management Studio. Once you open it again you “should” be fine, but famous last words right? If that doesn’t work, just reboot. Nice clean memory and you can get back to playing with your super tough query.