Listen to the AG that’s tricking you…

 

Another AG (Availability Group) post? Yes, I learned something new and it must be cataloged. When you are failing AGs back and forth really fast and a major indexing job kicks off in the middle, it can force a transaction to roll back. That rollback may take a REALLY long time, even if you were only on the node for 10 minutes and the large transaction had only been running for about 5 of them. When I failed back to my preferred primary node and the AG Dashboard didn’t go completely green, I got worried. Why in the world would it not go green? I had just failed over to the preferred secondary, applied a patch (see? I learned.), and was failing back. It had been green when I started, green when I failed over to the secondary, and now one of my biggest databases was not synchronizing on the primary… *sigh*

I panicked. In this situation I would normally pull the database out of the AG and then re-add it. I didn’t have that option because it is a HUGE database and I didn’t have the time or space to move it around. I knew a large transaction had kicked off (thank you, alert email that I created to warn me about such things), but I thought surely the rollback would have cleared quickly. That led me to look for rolling-back transactions.

I ran this on the alarming secondary node:

SELECT R.session_id, R.command, R.status, R.percent_complete
FROM sys.dm_exec_requests AS R
WHERE R.command IN ('killed/rollback', 'rollback')

To my surprise, there were no results. Nothing was killed or rolling back; or was it? I ran the query again, but this time without the WHERE clause.

SELECT R.session_id, R.command, R.status, R.percent_complete
FROM sys.dm_exec_requests AS R

I saw one command listed as “UNKNOWN TOKEN” with a percent_complete of about 5%, and that percentage was rising. I theorized that this was my rolling-back process and that when it finished, my AG would be healthy again. The system isn’t used overnight, we had started the maintenance in the late afternoon, and it was the secondary node in trouble, so I had time to test my theory. It was an agonizing 8 hours as I kept checking the percent_complete all evening. It finally completed and the AG went green.

My lesson learned: when my AG isn’t healthy and I have already resumed data movement, before I pull the database out of the AG, I need to check for processes that have a percent_complete on the secondary node. Being patient is really hard, but it is necessary with AGs.
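
For next time, here is the quick check I plan to run first on the struggling secondary. It is just a small sketch built from the queries above; it filters for anything that is reporting progress:

SELECT R.session_id, R.command, R.status, R.percent_complete
FROM sys.dm_exec_requests AS R
WHERE R.percent_complete > 0 -- anything still grinding through a rollback or redo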

The song that goes with this post is Listen to the Man.

Nothing can stop me, nothing holds me back from changing recovery mode and getting development on track…

Greetings and other salutations,

Today I found out that part of the development environment was in “Full Recovery Mode”. This means that if someone isn’t taking log backups, the log files get huge, and it also means that the backups were much bigger than they should have been. They don’t need point-in-time recovery in our development environment, so we decided to move them to “Simple Recovery”. This could have been a big all-day job if I went through the GUI, but you know me, I found a way to script it out and thought I would share it. I am showing you how to do it on one server at a time:

Connect to your development server in the master database and run this query to see how many are in “Full Recovery”:

 SELECT name, recovery_model_desc
 FROM sys.databases
 WHERE recovery_model_desc = 'FULL'

When I ran it on one of my servers, there were 24 databases that needed to be adjusted. So I built this:

 SELECT 'ALTER DATABASE [' + name + '] SET RECOVERY SIMPLE;'
 FROM sys.databases
 WHERE recovery_model_desc = 'FULL'

Then I took the results from that query, copied them into a new window, and ran them. Just like that, all my development databases were in “Simple Mode”. I ran the first query one more time to make sure everything updated as expected.
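
One extra thought: if you are worried about catching a system database (model ships in full recovery by default), a small variation on the script above filters them out. This is just a sketch; adjust it to taste.

 SELECT 'ALTER DATABASE [' + name + '] SET RECOVERY SIMPLE;'
 FROM sys.databases
 WHERE recovery_model_desc = 'FULL'
 AND database_id > 4 -- skip master, tempdb, model and msdb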

It is a beautiful thing. I hope this helps you clean up development too!

The song for this post is Kongos’ Take It from Me.

I can make your logs clap…

The SQL error log has this nasty habit of getting big when I am not looking. There are only two ways to keep it at a normal size. One is to stop and start your SQL instance (reboot, restart, stop and start) and the other is to run this handy little script:


EXEC sp_cycle_errorlog;

This will end the current log and start a new one. Why does this matter? The SQL error log holds information about your backups, failed logins, SQL errors, edition information and other fun stuff. The bigger it is, the longer it takes SQL to load it into memory so that you can read it. Usually when you need to read it, you are in trouble, so the slower it is, the more stressed you will be.

What is a good size?  I usually try to get it to roll over around 10 MB.  I use a monitoring tool and when the large error log alert is triggered, I have it run sp_cycle_errorlog for me so mine always stay a healthy size.  You don’t need fancy tools to do this though.  If you know about how fast your logs grow, you can set up a SQL Agent job to run it on a schedule to keep your logs healthy.
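
If you want to go the SQL Agent route, a job like this sketch would do it. The job name and schedule below are made up; point the schedule at however often your logs need to roll.

USE msdb;
GO
EXEC dbo.sp_add_job
@job_name = N'Cycle SQL Error Log';

EXEC dbo.sp_add_jobstep
@job_name = N'Cycle SQL Error Log',
@step_name = N'Cycle error log',
@subsystem = N'TSQL',
@command = N'EXEC sp_cycle_errorlog;',
@database_name = N'master';

EXEC dbo.sp_add_schedule
@schedule_name = N'Weekly - Sunday 2am',
@freq_type = 8, -- weekly
@freq_interval = 1, -- Sunday
@freq_recurrence_factor = 1,
@active_start_time = 020000; -- 2:00 AM

EXEC dbo.sp_attach_schedule
@job_name = N'Cycle SQL Error Log',
@schedule_name = N'Weekly - Sunday 2am';

EXEC dbo.sp_add_jobserver
@job_name = N'Cycle SQL Error Log';
GO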

How many logs should I keep? This is completely up to you, but since I keep my logs so small, I keep 15 of them. Why so many? So that I can go back and see issues further back if needed. You can adjust the number you keep by right-clicking on SQL Server Logs in SSMS and selecting “Configure”.

(Screenshot: Configure SQL Server Error Logs)

Super cool, but what about the Agent error logs? There is a script for them as well!


USE msdb;
GO
EXEC sp_cycle_agent_errorlog;

See? Healthy and Happy Logs! Your Logs will be clapping with joy.

If Crazy = Genius, I’m rebuilding one partition at a time!

We have had an index job that has been failing for a while.  This is one of those things you really don’t want to clean up because no one is complaining, but you know you should.  I had heard that I could rebuild one partition at a time, but where to start?  Today, I worked my way through it, so here it is so that you can do it too.

First you need to find the biggest indexes; there is a good chance those are the ones that live on partitions. I am filtering out primary keys.


SELECT i.[name] AS IndexName
,SUM(ps.[used_page_count]) * 8 AS IndexSizeKB -- pages are 8 KB each
FROM sys.dm_db_partition_stats AS ps
INNER JOIN sys.indexes AS i ON ps.[object_id] = i.[object_id]
AND ps.[index_id] = i.[index_id]
WHERE i.[name] NOT LIKE 'PK%'
GROUP BY i.[name]
ORDER BY IndexSizeKB DESC

The top ones are most likely the ones you want to focus on.  Next, we need to track down the partition function.


SELECT * FROM sys.partition_functions

Next we are going to figure out which partition we want to rebuild.


SELECT sprv.value AS [Value], sprv.boundary_id AS [ID]
FROM sys.partition_functions AS spf
INNER JOIN sys.partition_range_values sprv ON sprv.function_id = spf.function_id
WHERE (spf.name = N'NameOfFunctionFromPartitionFunctions')
ORDER BY [ID] ASC

Pick the number of the partition you want to rebuild.
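
If it helps to see how the data is spread out before you pick one, this little sketch (plug in the index name you found earlier) shows the row count in each partition:

SELECT p.partition_number, p.rows
FROM sys.partitions AS p
INNER JOIN sys.indexes AS i ON p.[object_id] = i.[object_id]
AND p.index_id = i.index_id
WHERE i.[name] = 'IndexName' -- the index name from the first query
ORDER BY p.partition_number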

Next use your index name from earlier.  You will also need the table name and the partition number.


ALTER INDEX [IndexName]
ON [dbo].[TableName]
REBUILD PARTITION = 3

This will rebuild just the partition that you requested.
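
One extra note: if you are on Enterprise edition and SQL Server 2014 or later, you should be able to rebuild a single partition online so readers aren’t blocked while it runs. Treat this as a sketch and test it in your environment first:

ALTER INDEX [IndexName]
ON [dbo].[TableName]
REBUILD PARTITION = 3
WITH (ONLINE = ON)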

Strip it down and remove the bad query plan

Today I got to play with some really bad queries. But the queries weren’t necessarily bad; it was more that they had bad plans. I thought I had already blogged about it and tried to find my code. Alas, it wasn’t there, so let’s strip it down and walk through how you would remove a bad query plan. I am leaving out the troubleshooting part of how to determine whether it is a bad plan, because so much of that “depends”.
First you have to find the bad query plan. Get a unique line from your query and paste it into the query below.

USE master;
GO

SELECT usecounts, cacheobjtype, objtype, text, plan_handle
FROM sys.dm_exec_cached_plans
CROSS APPLY sys.dm_exec_sql_text(plan_handle)
WHERE usecounts > 1
AND TEXT LIKE '%Unique part of query%' --put the unique part of the query here.
ORDER BY usecounts DESC;

Now, copy the plan handle and paste it over the plan handle that I have listed here:

SELECT * FROM sys.dm_exec_query_plan (0x060001004DE4D526F0BEA28F05000);

If you click on the query_plan link, you can see what the plan looks like. After you have reviewed it and determined that the plan is bad, you can paste your plan handle over the one below to remove it from the proc cache.

DBCC FREEPROCCACHE (0x060001004DE4D526F0BEA28F05000)
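
If you want to be sure it is really gone, you can check the cache again for your plan handle (just a sketch; swap in your own handle) and confirm that nothing comes back:

SELECT usecounts, cacheobjtype, objtype, plan_handle
FROM sys.dm_exec_cached_plans
WHERE plan_handle = 0x060001004DE4D526F0BEA28F05000 -- your plan handle here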

There you have it. Best of luck with your bad plans.

For one good, naughty little girl found a diamond…Object Explorer Details

It’s Christmas time again and time to listen to my FAVORITE Christmas song, Joel the Lump of Coal.
Just before Thanksgiving we had our SQLSaturday\Big Mountain Data event and I spoke! This was my third time speaking at this event, and every year I have regretted speaking and felt like everyone would have been better off in another session, every year, that is, until this one! I loved my session and I will actually be submitting it to PASS Summit this year. Keep your fingers crossed with me. It was on SQL Server Management Studio tips and tricks.

One of the tips that I was super surprised many people didn’t know about is the Object Explorer Details window. It lets you delete multiple objects at once, script out multiple objects at once, and do some other really cool stuff. How do you access this magic, you ask? In Management Studio, click View >> Object Explorer Details.

 

(Screenshot: Object Explorer Details)

Now you can have a diamond that will help you too!

And you said you are unconsolable…clean up before you leave

Greetings,

I am approaching my last day at my current job. I love it here and will be really sad to leave, but I have an awesome opportunity to grow my knowledge and career with a company on my “want to work for” list.

There are a lot of things to take care of before I leave. I have been updating documentation (with memes) so that it is useful and fun. I am trying to wrap up all my tickets and outstanding items, and last night I woke up and realized I was still the owner of some databases. This is how I fixed it:

I launched a query window on my Central Management Server to save time, but you can run this on one server at a time if you want.  I used the syntax from sp_helpdb to find out what I wanted to query:

SELECT name, ISNULL(SUSER_SNAME(sid), '~~UNKNOWN~~') AS Owner,
CONVERT(nvarchar(11), crdate) AS Created, dbid, cmptlevel
FROM master.dbo.sysdatabases
WHERE SUSER_SNAME(sid) = 'domain\MyUserName'
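
Side note: sysdatabases is the old backward-compatibility view. If you would rather use the newer catalog view, this sketch gives the same answer:

SELECT name, SUSER_SNAME(owner_sid) AS Owner, create_date, database_id, compatibility_level
FROM sys.databases
WHERE SUSER_SNAME(owner_sid) = 'domain\MyUserName'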

Some of the applications in my environment run under a special user and I didn’t want to interfere with those; I just wanted to fix the ones owned by me. Then I borrowed some code from Brent Ozar:

SELECT 'ALTER AUTHORIZATION ON DATABASE:: ['+ name +'] to sa;'
FROM master.dbo.sysdatabases
WHERE suser_sname(sid) = 'domain\MyUserName'

Here’s one I run on the CMS to find any SQL Agent jobs that I own across my enterprise; then I can run the generated update scripts on the individual servers.


SELECT 'EXEC MSDB.dbo.sp_update_job ' + char(13) +
'@job_name = ' + char(39) + [Name] + char(39) + ',' + char(13) +
'@owner_login_name = ' + char(39) + 'sa' + char(39) + char(13) + char(13)+';'
FROM msdb.dbo.sysjobs
WHERE SUSER_SNAME(owner_sid) = 'domain\MyUserName'

What you gonna do? Memory, I’m coming for you!

Today I needed to quickly check the min and max memory settings on 68 SQL Server instances. I didn’t have time to go through each one, and I know I will need to do this again in the future. Thank goodness I have my Central Management Server configured with all those servers. I was able to connect to my main CMS server and run this simple query that tells me the min and max memory setting for every server:

SELECT ServerName, [Max], [Min]
FROM
(SELECT @@ServerName AS ServerName, LEFT(name,3) AS Memory, value
FROM sys.configurations
WHERE name like '%server memory%') AS SourceTable
PIVOT
(
MAX(Value)
FOR Memory IN ([Max], [Min])
) AS PivotTable;

Then, I also used this code from my last post but added a server name to it so I could see what memory was available on each server.

SELECT @@SERVERNAME AS ServerName,
physical_memory_kb * 9.5367431640625E-7 AS PhysicalMemoryGB -- KB to GB
FROM sys.dm_os_sys_info
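
If you want the setting and the hardware side by side, something like this sketch (the column names are just ones I picked) makes the comparison easy on each server:

SELECT @@SERVERNAME AS ServerName,
(SELECT CONVERT(int, value_in_use)
FROM sys.configurations
WHERE name = 'max server memory (MB)') AS MaxServerMemoryMB,
CONVERT(decimal(10,1), physical_memory_kb / 1024.0 / 1024.0) AS PhysicalMemoryGB
FROM sys.dm_os_sys_info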

Ima make a deal with the SA so the SA don’t bite no more

T-SQL Tuesday

TSQL Tuesday #63 – How do you manage security?

Yay, so this is my first official blog party post.  I was late with my post last month because I didn’t have it figured out until I saw everyone posting and wondered what was going on.

One of my goals for this year is to increase security across my 50 SQL servers. The SA log-in is a dangerous one and I don’t want it out there on my servers; the trouble is that it is hard to exterminate. Here is my new plan.

I am going to script out my SA log-in and password using this script.

Next I am going to rename the SA log-in to something else and give it a much stronger password. This information will go into my password safe so that I have it just in case I need it.

I am going to create a dummy SA. I have many vendor databases that claim they need SA, but they really don’t, and I don’t want them to have high privileges. I am going to work on lowering the privileges to only give what they need (this will be a slow server-by-server process to get the permissions right). The reason I am keeping a log-in with the name “SA” is because of vendors who hard-code that user name.
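
Here is a rough sketch of what the rename and the dummy log-in might look like. The new name and the passwords are placeholders; the real ones will live in my password safe.

ALTER LOGIN [sa] WITH NAME = [NotTheRealAdmin]; -- rename the real SA log-in
ALTER LOGIN [NotTheRealAdmin] WITH PASSWORD = 'ReplaceWithAMuchStrongerPassword!';

CREATE LOGIN [sa] WITH PASSWORD = 'AnotherStrongPlaceholderPassword!', CHECK_POLICY = ON; -- the decoy, with no extra permissions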

As I was talking to people in the #SLCSQL user group last night, someone suggested that we can also monitor that dummy SA log-in to watch for attacks. It is a great idea and I plan to include monitoring on the new log-in.

Lars Rasmussen suggested I have a user on each server that will always be there to handle running jobs and other database needs. I plan to include this too, so that when I don’t have a proper SA log-in, I will still have a log-in that can handle all my fun stuff.

By doing all of this, my SA won’t bite me anymore.

T-SQL Tuesday is a blog party started by Adam Machanic (b|t) just over five years ago.  This month it is hosted by Kenneth Fisher (b|t).

We could be immortals, just not for long when using a duplicate delete!

Sometimes awesome things just happen.  Today Rob Farley (@Rob_Farley) was helping me with a previous post about my dates table and just as a side note he said, “Oh, let me show you something else cool.” It was really cool and so I asked if I could add it to my blog since I never know when this problem will strike.

I have a table about superheroes and their cape colors. I made a mistake and put Batman in there twice. But since there is no primary key, how do I tell SQL which one to delete?

(Screenshot: Hero Table)
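
If you want to play along at home, here is a rough sketch of a table shaped like mine (the column names match the query below, and note there is no primary key):

CREATE TABLE dbo.HeroCapeColor
(
HeroName varchar(50),
CapeColor varchar(20)
);

INSERT INTO dbo.HeroCapeColor (HeroName, CapeColor)
VALUES ('Batman', 'Black'),
('Superman', 'Red'),
('Batman', 'Black'); -- oops, Batman is in there twice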

Since they are exactly the same, I can let SQL sort it out. This uses both a CTE (I hadn’t ever used one without some kind of join before today) and OVER, which I am learning about. So cool!

WITH SuperHeroDuplicates
AS
(
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY HeroName, CapeColor
                              ORDER BY HeroName, CapeColor) AS rownum
    FROM dbo.HeroCapeColor
)
DELETE
FROM SuperHeroDuplicates
WHERE rownum > 1;

Thanks Rob!

—————UPDATE—————

Kenneth Fisher (@sqlstudent144) also wrote a blog post about another way you can accomplish this task if you only have a few of them to delete.

He has been a super big help to me.  He taught me how to display my code better in my blog so it is easier to read and copy. He also encourages me, builds my confidence and even included a link to one of my posts in an article. It made me feel special and like what I have to say matters to other people.  I love how he comments on my posts and gives me ideas on how to make them better. I feel so lucky to be a part of such a great community of people that are so thoughtful, selfless and giving.  Thanks Kenneth for being such a great example to me!