I tried to Move SSISDB, It Hit Me So Hard, How Can It Be This Heavy?

Recently I needed to move a dev instance from a physical server to a virtual one, and it also needed a SQL upgrade. We decided it was time to spin up a new dev instance, and that required moving SSISDB. I will be honest, I was not prepared for all the things. I also did not have the password for the encryption, so I had to force things to work. Because this was a dev server, I wasn't as worried about what I might lose; I just wanted it to work. Here is what I did.

I spun up my new instance and restored SSISDB, which was wrong, so I deleted it. I found that I had missed the SSIS piece during the install process, so I ran setup again and installed SSIS.

Next I configured CLR on the new server:

EXEC sp_configure 'clr enabled', 1;  
RECONFIGURE;  
GO

Since I didn’t have the password from the old server, I ran this on the old server:

USE SSISDB;
BACKUP MASTER KEY TO FILE = 'D:\Backup\SSISDB_key' --Replace with the location where you want to save it
    ENCRYPTION BY PASSWORD = 'StrongPassword'; --Replace with your password

Then I created a new backup of the SSISDB on the old server.
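That backup itself was nothing special; a minimal sketch of it (with an example path) looks like this:

BACKUP DATABASE SSISDB
TO DISK = N'D:\Backup\SSISDB.bak' --Replace with the location you want
WITH COMPRESSION, STATS = 10;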

Because I had restored my SSISDB when I shouldn't have, I ran this PowerShell code on the new server to create the catalog properly and pick up all the little things I would otherwise have missed:

# Load the IntegrationServices Assembly
[Reflection.Assembly]::LoadWithPartialName("Microsoft.SqlServer.Management.IntegrationServices")

# Store the IntegrationServices Assembly namespace to avoid typing it every time
$ISNamespace = "Microsoft.SqlServer.Management.IntegrationServices"

Write-Host "Connecting to server ..."

# Create a connection to the server
$sqlConnectionString = "Data Source=localhost;Initial Catalog=master;Integrated Security=SSPI;"
$sqlConnection = New-Object System.Data.SqlClient.SqlConnection $sqlConnectionString

# Create the Integration Services object
$integrationServices = New-Object $ISNamespace".IntegrationServices" $sqlConnection

# Provision a new SSIS Catalog
$catalog = New-Object $ISNamespace".Catalog" ($integrationServices, "SSISDB", "StrongPassword")  #REPLACE THE PASSWORD
$catalog.Create()

I restored the SSISDB to the new server from the backup I took above. Then I restored the SSISDB_key that I created:

USE SSISDB;
RESTORE MASTER KEY FROM FILE = 'D:\Install\SSISDB_key' --Change to the location of your master key backup
    DECRYPTION BY PASSWORD = 'StrongPassword' --Password used when it was backed up
    ENCRYPTION BY PASSWORD = 'StrongPassword' --New password
    FORCE;

Then I opened and altered the master key:

OPEN MASTER KEY DECRYPTION BY PASSWORD = 'StrongPassword' --Password used when creating SSISDB in the PowerShell script
ALTER MASTER KEY ADD ENCRYPTION BY SERVICE MASTER KEY

Next, I verified my keys were working:

OPEN MASTER KEY DECRYPTION BY PASSWORD = 'StrongPassword';

Finally, I synchronized logins and made sure my new version was compatible by checking the compatibility status (0 means all is good):

EXEC catalog.check_schema_version 0 --The parameter is use64bitruntime: 1 = 64-bit, 0 = 32-bit
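The login piece isn't shown above; one simple way to handle the SSISDB side of it (a sketch, assuming the server logins already exist on the new instance) is to re-map any orphaned database users after the restore:

USE SSISDB;
--List database users whose SIDs no longer match a server login
EXEC sp_change_users_login 'Report';
--Re-map a user to its matching login (the user name here is just an example)
ALTER USER [SSISAppUser] WITH LOGIN = [SSISAppUser];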

It was a lot, but I was so thrilled when it all started working.

The song for this post is Dancing with My Eyes Closed – Performed by Wind Walkers

Give me a minute to restore my logs…

This is part 4 of my log-shipping journey, if you missed part 1, you can find it here, part 2 is here, and part 3 is here.

We have one more setup piece to finish before we do our restores. We need to tell my scripts which logfile is going to be the final one for now, so the restore can finish with STANDBY and I can read my databases.

UPDATE RF1
SET RF1.IsFinalLogFile = 1
FROM RestoreFile RF1
INNER JOIN (SELECT RF.Origin, MAX(RF.CreatedDate) AS CreatedDate
            FROM RestoreFile RF
            INNER JOIN (SELECT ForDatabase, MAX(CreatedDate) AS CreatedDate
                        FROM [dbo].[RestoreFile]
                        WHERE FileType = 'B'
                        GROUP BY ForDatabase) BF ON RF.ForDatabase = BF.ForDatabase AND RF.CreatedDate > BF.CreatedDate
            WHERE RF.FileType = 'L'
            GROUP BY RF.Origin) RF2 ON RF1.Origin = RF2.Origin AND RF1.CreatedDate = RF2.CreatedDate

There are a few things going on here. First, I need to make sure that my logfile is more recent than my most recently restored full backup. Because I am restoring more than one database with this script, I have to join on both ForDatabase and CreatedDate, making sure the logfile's date is greater than the full backup's and, for the final logfile, equal to the latest log date for that origin. When I find the last logfile that is newer than the full backup's created date, I mark it with IsFinalLogFile = 1 (true).

Now we are ready to build our full database restores.

DECLARE @SQLToExecute nvarchar(max), @RestoreFileId int
DECLARE RestoreBackups CURSOR FAST_FORWARD
FOR
SELECT 
'RESTORE DATABASE ' + RF.ForDatabase + ' FROM DISK = N''' + RF.FileLocation + ''' WITH FILE =  1, MOVE N''' + MF.name + ''' TO N''D:\SQLData\' + RF.ForDatabase + '.mdf'', MOVE N''' + MF.name + '_log'' TO N''L:\SQLLogs\'+ RF.ForDatabase + '_log.ldf'', NORECOVERY,  NOUNLOAD,  REPLACE,  STATS = 5'
, Id

  FROM [dbo].[RestoreFile] RF
	INNER JOIN sys.databases d ON RF.ForDatabase = d.name
	INNER JOIN sys.master_files mf on d.database_id = mf.database_id and mf.type = 0
  WHERE RF.FileType = 'B' AND RF.IsApplied = 0
OPEN RestoreBackups
FETCH NEXT FROM RestoreBackups INTO @SQLToExecute, @RestoreFileId
WHILE (@@FETCH_STATUS = 0)
BEGIN
    EXEC sp_executeSQL @SQLToExecute
	
	UPDATE [dbo].[RestoreFile]  SET IsApplied = 1 WHERE Id = @RestoreFileId

    FETCH NEXT FROM RestoreBackups INTO @SQLToExecute, @RestoreFileId
END
CLOSE RestoreBackups
DEALLOCATE RestoreBackups
GO

I am using a cursor to run each restore one at a time. I make sure that I only pick up FileType = 'B' rows, so I know they are full backups, and only ones that haven't been applied yet. I build my restore string to include a MOVE, since the file locations on the new server may be different from what was used in the past. I update the applied status as I do each restore so I won't apply the same one twice; for this I use the RestoreFileId, which lets me get at the specific record I want. This is the main reason I wanted a new table. If I had stayed with one table, there is a chance I would have had duplicate filenames, and that would have meant duplicate chances to restore records. It made everything so much cleaner to have one stage table and one production table.

For the logfile restore, I followed a similar pattern:

DECLARE @SQLToExecute nvarchar(max), @RestoreFileId int
DECLARE RestoreBackups CURSOR FAST_FORWARD
FOR
SELECT 
'RESTORE LOG ' + RF.ForDatabase + ' FROM DISK = N''' + RF.FileLocation + ''' WITH NORECOVERY;'
, Id

  FROM [dbo].[RestoreFile] RF
	INNER JOIN sys.databases d ON RF.ForDatabase = d.name
	INNER JOIN sys.master_files mf on d.database_id = mf.database_id and mf.type = 0
	INNER JOIN	(SELECT ForDatabase, MAX(CreatedDate) AS CreatedDate
					FROM [dbo].[RestoreFile] RF
					WHERE RF.FileType = 'B' 
					GROUP BY ForDatabase) BF ON RF.ForDatabase = BF.ForDatabase AND RF.CreatedDate > BF.CreatedDate
  WHERE RF.FileType = 'L' AND RF.IsApplied = 0 AND RF.IsFinalLogFile = 0
  ORDER BY RF.ForDatabase, RF.CreatedDate
OPEN RestoreBackups
FETCH NEXT FROM RestoreBackups INTO @SQLToExecute, @RestoreFileId
WHILE (@@FETCH_STATUS = 0)
BEGIN

BEGIN TRY
    EXEC sp_executeSQL @SQLToExecute

		UPDATE [dbo].[RestoreFile]  SET IsApplied = 1 WHERE Id = @RestoreFileId
END TRY
BEGIN CATCH
    --SELECT   
        --ERROR_NUMBER() AS ErrorNumber,ERROR_MESSAGE() AS ErrorMessage
		UPDATE [dbo].[RestoreFile]  SET IsApplied = 0 WHERE Id = @RestoreFileId
END CATCH 

    FETCH NEXT FROM RestoreBackups INTO @SQLToExecute, @RestoreFileId
END
CLOSE RestoreBackups
DEALLOCATE RestoreBackups
GO

The pattern here is almost the same. I build my simple restore script, but only for FileType = 'L' (logfile) rows, and only if they haven't been applied and aren't the final logfile. I also check that the full backup has been restored and is earlier than the logfile. Then I execute the string of SQL that I built and update the record to applied. I ended up having issues with IsApplied being marked true even when the restore didn't apply correctly, so I added error handling to change it back when that happens. I may go back and add this to the full backup restore script.

Next, I run the script that will do my final logfile restore for each database:

DECLARE @SQLToExecute nvarchar(max), @RestoreFileId int
DECLARE RestoreBackups CURSOR FAST_FORWARD
FOR
SELECT 
'RESTORE LOG ' + RF.ForDatabase + ' FROM DISK = N''' + RF.FileLocation + ''' WITH STANDBY=''Z:\Standby\' + RF.ForDatabase + '_Standby.bak'';'
, Id

  FROM [dbo].[RestoreFile] RF
	INNER JOIN sys.databases d ON RF.ForDatabase = d.name
	INNER JOIN sys.master_files mf on d.database_id = mf.database_id and mf.type = 0
	INNER JOIN	(SELECT ForDatabase, MAX(CreatedDate) AS CreatedDate
					FROM [dbo].[RestoreFile] RF
					WHERE RF.FileType = 'B' 
					GROUP BY ForDatabase) BF ON RF.ForDatabase = BF.ForDatabase AND RF.CreatedDate > BF.CreatedDate
  WHERE RF.FileType = 'L' AND RF.IsApplied = 0 AND RF.IsFinalLogFile = 1
  ORDER BY RF.ForDatabase, RF.CreatedDate
OPEN RestoreBackups
FETCH NEXT FROM RestoreBackups INTO @SQLToExecute, @RestoreFileId
WHILE (@@FETCH_STATUS = 0)
BEGIN

BEGIN TRY
    EXEC sp_executeSQL @SQLToExecute

		UPDATE [dbo].[RestoreFile]  SET IsApplied = 1 WHERE Id = @RestoreFileId
END TRY
BEGIN CATCH
    --SELECT   
        --ERROR_NUMBER() AS ErrorNumber,ERROR_MESSAGE() AS ErrorMessage
		UPDATE [dbo].[RestoreFile]  SET IsApplied = 0 WHERE Id = @RestoreFileId
END CATCH 

    FETCH NEXT FROM RestoreBackups INTO @SQLToExecute, @RestoreFileId
END
CLOSE RestoreBackups
DEALLOCATE RestoreBackups
GO

The only differences between these last two are the "WITH STANDBY" instead of "WITH NORECOVERY" and the IsFinalLogFile filter being set to true.

Last, but not least, I clean old records out of my RestoreFile table so it doesn't get large. I debated how much data to keep, and for me it makes sense to delete anything older than two days. I am doing a full restore daily; if I change that to weekly, I will keep this data longer.

  DELETE
  FROM [Maintenance].[dbo].[RestoreFile]
  WHERE CreatedDate < GETDATE()-2

This process allows me to stay flexible. If I don’t get a new full backup, this process will keep restoring logfiles until a new full backup file appears. If the backup chain is broken, this process will also break, but it will work as long as that chain is healthy and it gets all the files it needs.
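When I want a quick look at whether anything is falling behind, a simple check against the RestoreFile table does the job (a sketch I run by hand, not part of the job itself):

SELECT ForDatabase,
       MAX(CreatedDate) AS NewestFile,
       SUM(CASE WHEN IsApplied = 0 THEN 1 ELSE 0 END) AS UnappliedFiles
FROM [dbo].[RestoreFile]
GROUP BY ForDatabase
ORDER BY ForDatabase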

Whew! We did it! That is my full poor girl log shipping process.

The song for this post is George Ezra’s Hold My Girl.

When I add a table new, that missing piece is found…

This is part 3 of my log-shipping journey, if you missed part 1, you can find it here and if you missed part 2, you can find it here.

As I started working with my log-shipping job, I realized that I needed a second table: one to stage all the filenames loaded by PowerShell, and one that I could use to build the restores. I also wanted to be able to truncate the Logshipping table to empty it before loading the filenames, so I won't miss any, without redoing work in my main table that had already been done. At the beginning of my SQL Agent job, I have the step that truncates the Logshipping table. Then I load the filenames with the PowerShell script, update the created dates, and load it all into my new RestoreFile table.
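The truncate step is the simplest piece of the job; a minimal sketch of it (using the LogshippingFile table from part 1) is just:

TRUNCATE TABLE [dbo].[LogshippingFile];

Here is the new RestoreFile table: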

CREATE TABLE [dbo].[RestoreFile](
	[Id] [int] IDENTITY(1,1) NOT NULL,
	[FileName] [varchar](500) NULL,
	[FileLocation] [varchar](1000) NULL,
	[FileType] [char](1) NULL,
	[Origin] [varchar](50) NULL,
	[ForDatabase] [varchar](50) NULL,
	[CreatedDate] [datetime2](7) NULL,
	[IsApplied] [bit] NOT NULL,
	[IsFinalLogFile] [bit] NOT NULL,
 CONSTRAINT [PK_RestoreFile] PRIMARY KEY CLUSTERED 
(
	[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO

ALTER TABLE [dbo].[RestoreFile] ADD  CONSTRAINT [DF_RestoreFile_IsApplied]  DEFAULT ((0)) FOR [IsApplied]
GO

ALTER TABLE [dbo].[RestoreFile] ADD  CONSTRAINT [DF_RestoreFile_IsFinalLogFile]  DEFAULT ((0)) FOR [IsFinalLogFile]
GO

I removed the IsApplied column from the Logshipping table because I didn’t need it. The IsFinalLogFile becomes super important when we go to bring the database into Standby mode.

Next, I need to move the data from Logshipping into RestoreFile:

INSERT INTO RestoreFile(FileName, FileLocation, FileType, Origin, ForDatabase, CreatedDate)
SELECT A.FileName, A.FileLocation, A.FileType, A.Origin, A.ForDatabase, A.CreatedDate	
FROM [dbo].[LogshippingFile] A
LEFT JOIN [dbo].[RestoreFile] B ON A.FileName = B.FileName 
WHERE B.FileName IS NULL

I am comparing the filenames so that I don’t get duplicates. Thanks again to the date stamping in the names, I won’t get the same name twice, but I run this load frequently during the day and don’t want to bloat my table.

Now, we have all the pieces in place to do our restores! Get ready for some fun!

The song for this post is Vance Joy’s Missing Piece

Starting now, is the wrong date for insert…

This is part 2 of my log-shipping journey, if you missed part 1, you can find it here.

I collected all of the file names I need, but you will notice I left the dates off. When my files are moved from one domain to another, their created dates are changed. I needed the real dates for the correct restore order and to match the backups to the logs. If I were a Dark Knight PowerShell master, I am sure I would have figured out how to do it there. Every time I started to get close, I would have a production issue or another distraction that needed my time. In the end I landed in my happy place, so we are fixing the dates in the database!

How do I get the right date for a file when the created date is being changed after it has moved? I was super lucky that the date is being stored in the filename too! (Huge thank you to Ola for his awesome database maintenance solution.)

An example of my filename is this:

Batcentral$Alfred_Batman_LOG_20210610_224501.trn

This is how I dissect the filename to get the date and time from it:

  UPDATE [DBAStuff].[dbo].[LogshippingFile]
  SET CreatedDate = CAST(SUBSTRING(FileName, LEN(FileName)-18, 8) + ' '
                       + SUBSTRING(FileName, LEN(FileName)-9, 2) + ':'
                       + SUBSTRING(FileName, LEN(FileName)-7, 2) + ':'
                       + SUBSTRING(FileName, LEN(FileName)-5, 2) AS DATETIME)
  WHERE CreatedDate IS NULL

My filenames are different lengths, which means the dates won't always be in the same place. Instead, I go to the end of the string and count backwards, because the date and time portion is always consistent. Then I put all the parts back together to get my datetime and update it into my table.
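A quick way to sanity check the math is to run it against the sample filename above (just a throwaway test, not part of the job):

DECLARE @FileName varchar(500) = 'Batcentral$Alfred_Batman_LOG_20210610_224501.trn'
SELECT CAST(SUBSTRING(@FileName, LEN(@FileName)-18, 8) + ' '
          + SUBSTRING(@FileName, LEN(@FileName)-9, 2) + ':'
          + SUBSTRING(@FileName, LEN(@FileName)-7, 2) + ':'
          + SUBSTRING(@FileName, LEN(@FileName)-5, 2) AS DATETIME) AS ParsedCreatedDate
--Returns 2021-06-10 22:45:01.000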

Are we done yet? Nope, there is more.

The song for this post is Toad the Wet Sprocket's Starting Now

Load quick, got a list of files that need to be in a table…

Yes, I am trying to do a version of log-shipping that doesn’t require linking any servers and allows them to be in different domains. I also know I am crazy.

Now that we have that out of the way, let me show you some of the stuff I am doing! I am loading a table from multiple file locations using PowerShell. It reads what is in the directories and loads the filenames into a table for me. (Oh my goodness, I knew PowerShell was cool, but it took me so long to figure this out that I was super proud of how fast it loaded.)

First, the housekeeping. Create two tables in a DBA Database:

CREATE TABLE [dbo].[LogshippingFile](
	[FileName] [varchar](500) NULL,
	[FileLocation] [varchar](1000) NULL,
	[FileType] [char](1) NULL,
	[Origin] [varchar](50) NULL,
	[ForDatabase] [varchar](50) NULL,
	[CreatedDate] [datetime2](7) NULL,
	[Applied] [bit] NOT NULL
) ON [PRIMARY]
GO

ALTER TABLE [dbo].[LogshippingFile] ADD  CONSTRAINT [DF_LogshippingFiles_Applied]  DEFAULT ((0)) FOR [Applied]
CREATE TABLE [dbo].[Config](
	[Origin] [varchar](50) NULL,
	[FileLocation] [varchar](500) NULL,
	[Type] [char](1) NULL
) ON [PRIMARY]

In the Config table, I am going to put the common name for the database I am restoring so the rows will look like this:

Origin      FileLocation               Type
Batman      J:\FullBackups\Batman      B
Batman      J:\LogBackups\Batman       L
Superman    J:\FullBackups\Superman    B
Superman    J:\LogBackups\Superman     L
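Loading those rows is a quick insert (a sketch; swap in your own origins and paths):

INSERT INTO [dbo].[Config] (Origin, FileLocation, [Type])
VALUES ('Batman',   'J:\FullBackups\Batman',   'B'),
       ('Batman',   'J:\LogBackups\Batman',    'L'),
       ('Superman', 'J:\FullBackups\Superman', 'B'),
       ('Superman', 'J:\LogBackups\Superman',  'L')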

Yay! Now we can get to the fun stuff. This is how I set up the Powershell:



# Credentials for connection to SQL Server. 
$username = 'Batgirl'
$password = '!Pennyworth54'
#Connecting to the SQL Server and Query to get FileLocation and Origin from the Config table. 
$DS = Invoke-Sqlcmd -ServerInstance "BatCave01" -Database "DBAStuff" -Username $username -Password $password -Query "SELECT FileLocation, Origin FROM Config" -As DataSet

#Magic that puts my two columns into variables
$data = $DS.Tables[0].Rows
$data
ForEach ($Row in $data) {
$FileLocation = $Row.FileLocation 
$Origin = $Row.Origin



#Build my insert statement so Powershell knows what to expect. 
#gets all the filenames and loads them into the table. 
$sqlstatement=@'
INSERT INTO LogshippingFile
(
FileName,	
FileLocation, 
FileType,
Origin,
ForDatabase 
) 
VALUES (
	'{0}',
	'{1}',
	'{2}',
	'{3}',
	'{4}'

   
)
'@
Get-ChildItem -Recurse $FileLocation  | 
select Name,FullName, 
    @{
    Label='FileType'
    Expression={($FileLocation.Substring(10,1))}}, @{
    Label='Origin'
    Expression={($Origin)}}, @{
    Label='ForDatabase'
    Expression={'Gryphon_'+($Origin)}} |
   # @{Label="Length";Expression={$_.Length / 1024KB -as [int] }}  |
ForEach-Object {
          $SQL = $sqlstatement -f  $_.name,$_.FullName, $_.FileType, $_.Origin, $_.ForDatabase #,  $_.Length		
    Invoke-Sqlcmd -Username $username -Password $password -Query $SQL -ServerInstance "BatCave01" -Database "DBAStuff" -QueryTimeout 0
    }
    }

Now that we have all this data in the table, what do we do with it? Guess you will have to tune in again to this same Bat channel.

The song for this post is Matt Nathanson’s Kiss Quick

What can I say except “You’re Welcome for the AG voting script”

We recently had an issue where the network between our geo-cluster nodes went down and both Availability Group instances thought they were supposed to take charge. When the network came back up, both of them still thought they were in charge. You can imagine that with an AG, you can't have two instances that think they are in charge without problems. This brought up the question of how voting was configured between the two of them. This script helped a bunch:

SELECT member_name, member_state_desc, number_of_quorum_votes
 FROM sys.dm_hadr_cluster_members;

We found that the File Share Witness wasn't working properly by checking the member state. In a simple AG, a good practice is to have a vote for each instance plus a File Share Witness, which keeps each side from accidentally taking over. You're welcome.

The song for this post is You're Welcome from Moana

Lay your weary head to rest and run your query more…

Just coming off a wonderful week at PASS Summit, I sat in two separate sessions where this trick came up and watched as people furiously wrote it down. Did you know that you can set a query to run multiple times in one command? All you have to do is add "GO" and the number of times you want it to run. For example:

SELECT GETDATE()
GO 5

Will return something like this:

[Screenshot: the results pane after running the batch above with GO 5, showing five GETDATE() result sets]

Super cool, right? It was mentioned in both Brent Ozar's and Argenis Fernandez's presentations and I didn't want to forget it. Now you can easily run your queries more and create load on your test system without having to click run over and over again.
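For load testing, the same trick works with anything; a sketch (the table name is made up, and it needs identity or default columns for DEFAULT VALUES to work) could be:

INSERT INTO dbo.LoadTest DEFAULT VALUES
GO 1000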

Today’s post features Lyrics from Panic! At the Disco’s Cover of Carry On My Wayward Son

Mayday this is an emergency, my linked server is using ODBC to connect to an AG…

Availability Groups (AG) and linked servers can get really tricky, especially if you are dealing with multi-subnet failover. The only way we have figured out how to do this is with ODBC. Here is how.

First, we have to create an ODBC connection on our SQL Server. The single server in the diagram below is the server we are going to create the ODBC connection and the linked server on. It will go over to the AG Listener.

[Diagram: a single server connecting to the AG Listener]

We start by creating the ODBC connection on the server that is going to link to the AG.

[Screenshot: ODBC Data Source Administrator]

Make sure you are in the System DSN section. Click Add.

[Screenshot: Create New Data Source, driver selection]

We can pick either ODBC Driver 11 or 13. This is a separate driver install that we can get here. We want to install the driver, and then we will see it as an option in the screen above. Click Finish.

[Screenshot: data source name, description, and server]

Give it whatever name and description you want, but save the name for later. The server should be the name of the AG Listener. Click Next.

[Screenshot: SQL Server authentication settings]

Technically we don’t have to put in a login and password, but I like to test the login and password that I am going to use for the linked server. It won’t be saved here. Click Next.

[Screenshot: additional connection options with Multi-subnet failover checked]

Here is the magic part: make sure to check Multi-subnet failover. That is what will make the connection automatically fail over between the two nodes. Click Next, test the connection, and then click Finish. The ODBC connection is ready to be used by the linked server connection. Let's build that part now.

Go into SQL Server Management Studio and under Server Objects, right click on Linked Servers and select “New Linked Server”.

[Screenshot: New Linked Server dialog, General page]

The provider needs to be the OLE DB Provider for ODBC Drivers. Remember the name we gave the ODBC connection? We are going to use that here. Then go to the Security tab.

[Screenshot: New Linked Server dialog, Security page]

This is where we put in the login, and it will be stored here. I also make sure this user name and password exists on both nodes of the AG with the permissions that I need. Click Server Options.

[Screenshot: New Linked Server dialog, Server Options page]

The above is what I need, but check that you are only giving access to what is needed and not more. When we click OK, it will test our connection. If everything works with no errors, we are ready to go.
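If you would rather script it than click through the GUI, something like this should build the same linked server (a sketch; the linked server name, DSN name, login, and password are placeholders):

EXEC master.dbo.sp_addlinkedserver
    @server = N'AGByODBC',            --Whatever you want to call the linked server
    @srvproduct = N'',
    @provider = N'MSDASQL',           --OLE DB Provider for ODBC Drivers
    @datasrc = N'MyAGListenerDSN';    --The System DSN name created above

EXEC master.dbo.sp_addlinkedsrvlogin
    @rmtsrvname = N'AGByODBC',
    @useself = N'False',
    @locallogin = NULL,
    @rmtuser = N'LinkedServerUser',
    @rmtpassword = N'StrongPassword';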

Some of the problems that we have noticed are with querying tables that have big datatypes like time(3-7), timestamp, and a few others. Casting or converting the datatypes doesn't help. If we pull the table into a view without the big datatype columns, we are able to query the view from another server, but never the base table. It has been a bit frustrating, but we are still hopeful that we can find a solution or that Microsoft will fix ODBC connections. If there is a better way to do this, please reach out to me. We have things we need to solve and could use some help.
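For what it's worth, the view workaround is nothing fancy; on the AG side it looks roughly like this (the table and column names are made up):

CREATE VIEW dbo.Orders_ForLinkedServer
AS
SELECT OrderId,
       CustomerName,
       OrderTotal     --Leave out the time(7)/timestamp columns that cause trouble over ODBC
FROM dbo.Orders;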

The song for this post is Mayday by Cam.

Coming soon: Count down to PASS Summit 2016 with more pictures from PASS Summit 2015. Watch twitter and the Magic Mirrors page for more.

Listen to the AG that’s tricking you…

 

Another AG (Availability Group) post? Yes, I learned something new and it must be cataloged. When you are failing AGs back and forth really fast and a major indexing job kicks off in the middle, it can cause a transaction to have to roll back. This rollback may take a REALLY long time, even if you were only on the node for 10 minutes and the large transaction had only been running for about 5 minutes. When I failed back to my preferred primary node and the AG dashboard didn't go completely green, I got worried. Why in the world would it not go green? I had just failed over to the preferred secondary, applied a patch (see? I learned.), and was failing back. It had been green when I started, green when I failed over to the secondary, and now one of my biggest databases was not synchronizing on the primary... *sigh*

I panicked. In this situation I would normally pull the database out of the AG and then re-add it. I didn't have that option because it is a HUGE database and I didn't have that much time and space to move it around. I knew a large transaction had kicked off (thank you, alert email that I created to warn me about such things) but thought that surely the rollback would have cleared quickly. That led me to looking for rolling-back transactions.
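(As an aside, that alert email is driven by nothing fancier than a check for long-open transactions; a rough sketch of that kind of query, not my exact job, is below.)

--Find transactions that have been open longer than 5 minutes
SELECT st.session_id,
       t.transaction_id,
       t.name,
       t.transaction_begin_time,
       DATEDIFF(MINUTE, t.transaction_begin_time, GETDATE()) AS MinutesOpen
FROM sys.dm_tran_active_transactions t
INNER JOIN sys.dm_tran_session_transactions st ON t.transaction_id = st.transaction_id
WHERE DATEDIFF(MINUTE, t.transaction_begin_time, GETDATE()) > 5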

I ran this on the alarming secondary node:

SELECT R.session_id, R.command ,R.status, R.percent_complete
FROM sys.dm_exec_requests R
WHERE R.command IN ('killed/rollback','rollback')

To my surprise, there were no results. Nothing was killed or rolling back; or was it? I ran the query again, but this time without the WHERE clause.

SELECT R.session_id, R.command ,R.status, R.percent_complete
FROM sys.dm_exec_requests R

I saw one command listed as "UNKNOWN TOKEN" with a percent_complete of about 5%, and that percent was rising. I theorized that this was my rolling-back process and that when it finished, my AG would be healthy again. The system isn't used overnight, we had started the maintenance in the late afternoon, and it was the secondary node in trouble, so I had time to test my theory. It was an agonizing 8 hours as I kept checking on the percent_complete all evening. It finally completed and the AG went green.

My lesson learned: when my AG isn't healthy and I have already resumed data movement, before I pull the database out of the AG, I need to check for processes that have a percent_complete on the secondary node. Being patient is really hard, but it is necessary with AGs.

The song that goes with this post is Listen to the Man.