Now all that’s left of our Community is Playlists and a Pile of Cheese….

I have spent the last month thinking about why PASS Summit 2016 felt so empty to me. I loved seeing my friends from all over the world. The sessions were hit and miss, just like always, but I left feeling empty and disappointed. Summit is normally my time to recharge and get excited about SQL Server again. This year I had to find other things to get me recharged, and it left me wondering: what was different?

I talked to a few other people who had the same experience, and after talking it out, I think I found the problem: there wasn't a community party. There were tons of parties. Every vendor seemed to have one, and I spent a lot of time trekking around Seattle to attend them. I saw a few people at each one, but I never really got the opportunity to converse with the people I only see at the community party. I heard (mostly through the grapevine, so it could all be misinformation) that dropping the community party was a way for PASS to save money and to let people attend more vendor parties. I think this may have been a misstep, and my parents taught me never to bring up a problem without proposing a solution.

I would suggest that the vendors who throw their own parties change course and instead sponsor a section of the EMP and bring back the community party. But Andrea, isn't that the point of the vendor reception? If it is, why are they having separate parties? The vendor reception appears to me to be a time for vendors to sell things; the community party is a time to come together as a community and celebrate. I feel like this year our community was fractured all over Seattle and never really had a chance to connect.

Our Community is what sets SQL Server apart from other database platforms. I want to encourage the growth of this community and offer this humble post as a way to make PASS Summit more united in the future.

Also, in no way is this meant as an attack on anyone who serves PASS. I know everyone works really hard to make it a success and I would offer a huge THANK YOU!!! to all those that make PASS possible.

*Hugs* to my awesome #SQLFamily, #DataFamily

 

Today’s lyrics are from Matt Nathanson’s Playlist and Apologies

Lay your weary head to rest and run your query more…

I'm just back from a wonderful week at PASS Summit. I sat in two separate sessions where this trick came up and watched as people furiously wrote it down. Did you know that you can set a query to run multiple times with one command? All you have to do is add "GO" and the number of times you want it to run. For example:

SELECT GETDATE()
GO 5

Will return something like this:

[Screenshot: the GETDATE() result returned five times]

Super cool, right?  It was mentioned in both Brent Ozar’s and Argenis Fernandez’s presentations and I didn’t want to forget. Now you can easily run your queries more and create load on your test system without having to click run over and over again.
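If you want to use it to create a little load, here is a minimal sketch; the #LoadTest table and the count of 100 are just made-up examples, and because GO is a batch separator handled by SSMS/sqlcmd, every batch runs in the same session so the temp table survives between them:

CREATE TABLE #LoadTest (id INT IDENTITY(1,1), created_at DATETIME2 DEFAULT SYSDATETIME());
GO
-- The GO 100 below repeats this single-row insert 100 times
INSERT INTO #LoadTest DEFAULT VALUES;
GO 100
SELECT COUNT(*) AS rows_inserted FROM #LoadTest;  -- returns 100
GO
DROP TABLE #LoadTest;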

Today's post features lyrics from Panic! At the Disco's cover of Carry On My Wayward Son.

Mayday this is an emergency, my linked server is using ODBC to connect to an AG…

Availability Groups (AGs) and linked servers can get really tricky, especially if you are dealing with multi-subnet failover. The only way we have figured out how to do this is with ODBC. Here is how.

First, we have to create an ODBC connection on our SQL Server. The single server in the picture below is the server we are going to create the ODBC connection and the linked server on; it will go over to the AG listener.

[Diagram: a single server connecting to the AG listener]

We start by creating an ODBC connection on the server that will link to the AG.

[Screenshot: ODBC Data Source Administrator]

Make sure to be in the System DSN section, then click Add.

[Screenshot: Create New Data Source driver list]

We can pick either ODBC Driver 11 or 13 for SQL Server. This is a separate driver install from Microsoft; once the driver is installed, it shows up as an option in the screen above. Click Finish.

[Screenshot: data source name and server fields]

Give it whatever name and description you want, but remember the name for later. The server should be the name of the AG listener. Click Next.

[Screenshot: SQL Server authentication settings]

Technically we don’t have to put in a login and password, but I like to test the login and password that I am going to use for the linked server. It won’t be saved here. Click Next.

[Screenshot: connection options with the Multi-subnet failover checkbox]

Here is the magic part: make sure to check Multi-subnet failover. That is what makes the connection automatically fail over between the two nodes. Click Next, test the connection, and then click Finish. The ODBC connection is ready to be used by the linked server connection. Let's build that part now.

Go into SQL Server Management Studio and, under Server Objects, right-click on Linked Servers and select "New Linked Server".

[Screenshot: New Linked Server dialog, General page]

The provider needs to be OLE DB Provider for ODBC Drivers. Remember the name we gave the ODBC connection? We are going to use that as the data source here. Then go to the Security tab.

[Screenshot: New Linked Server dialog, Security page]

This is where we put in the login, and it will be stored here. I also make sure this user name and password exists on both nodes of the AG with the permissions that I need. Click Server Options.

[Screenshot: New Linked Server dialog, Server Options page]

The above is what I need, but I check that I am only giving access to what is needed and not more. When we click OK, it will test our connection. If everything works with no errors, we are ready to go.
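If you would rather script the linked server piece instead of clicking through the dialogs, something like this should do the same thing; AG_LINK, MyAGListenerDSN, and the login are placeholder names for whatever you set up above:

-- Create the linked server over the System DSN (MSDASQL = OLE DB Provider for ODBC Drivers)
EXEC master.dbo.sp_addlinkedserver
    @server = N'AG_LINK',
    @srvproduct = N'',
    @provider = N'MSDASQL',
    @datasrc = N'MyAGListenerDSN';

-- Map local logins to the SQL login that exists on both AG nodes
EXEC master.dbo.sp_addlinkedsrvlogin
    @rmtsrvname = N'AG_LINK',
    @useself = N'FALSE',
    @locallogin = NULL,
    @rmtuser = N'linked_login',
    @rmtpassword = N'StrongPasswordHere';

-- Quick smoke test through the new linked server
SELECT * FROM OPENQUERY(AG_LINK, 'SELECT name FROM sys.databases');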

Some of the problems we have noticed are with querying tables that have big data types like time(3-7), timestamp, and a few others. Casting or converting the data types doesn't help. If we pull the table into a view without the big data type columns, we are able to query the view from another server, but never the base table. It has been a bit frustrating, but we are still hopeful that we can find a solution or that Microsoft will fix ODBC connections. If there is a better way to do this, please reach out to me. We have things we need to solve and could use some help.

The song for this post is Mayday by Cam.

Coming soon: Count down to PASS Summit 2016 with more pictures from PASS Summit 2015. Watch Twitter and the Magic Mirrors page for more.

Listen to the AG that’s tricking you…

 

Another AG (Availability Group) post? Yes, I learned something new and it must be cataloged. When you are failing AGs back and forth really fast and a major indexing job kicks off in the middle, it can cause a transaction to have to roll back. This rollback may take a REALLY long time, even if you were only on the node for 10 minutes and the large transaction had only been running for about 5 minutes. When I failed back to my preferred primary node and the AG dashboard didn't go completely green, I got worried. Why in the world would it not go green? I had just failed over to the preferred secondary and applied a patch (see? I learned.) and then was failing back. It had been green when I started, green when I failed over to the secondary, and now one of my biggest databases was not synchronizing on the primary… *sigh*

I panicked. In this situation I would normally pull the database out of the AG and then re-add it. I didn't have that option because it is a HUGE database and I didn't have that much time and space to move it around. I knew a large transaction had kicked off (thank you, alert email that I created to warn me about such things) but thought that surely the rollback would have cleared quickly. That led me to looking for rolling-back transactions.

I ran this on the alarming secondary node:

SELECT R.session_id, R.command, R.status, R.percent_complete
FROM sys.dm_exec_requests R
WHERE R.command IN ('killed/rollback', 'rollback');

To my surprise, there were no results. Nothing was killed or rolling back; or was it? I ran the query again, but this time without the WHERE clause.

SELECT R.session_id, R.command, R.status, R.percent_complete
FROM sys.dm_exec_requests R;

I saw one command listed as “UNKNOWN TOKEN” that had a percent complete at about 5%. That percent was rising. I theorized that this was my rolling back process and when it finished, my AG would be healthy again.  The system isn’t used overnight. We had started the maintenance in the late afternoon and it was the secondary node in trouble, so I had time to test my theory.  It was an agonizing 8 hours as I kept checking on the percent_complete all evening.  It finally completed and the AG went green.

My lesson learned: when my AG isn't healthy and I have already resumed data movement, before I pull the database out of the AG, I need to check for processes that have a percent complete on the secondary node. Being patient is really hard but necessary with AGs.
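For reference, this is roughly the check I run now before pulling anything out of an AG; anything on the secondary with a nonzero percent_complete is probably the rollback (or a backup/restore) you are waiting on:

SELECT R.session_id, R.command, R.status, R.percent_complete
FROM sys.dm_exec_requests R
WHERE R.percent_complete > 0;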

The song that goes with this post is Listen to the Man.

It’s different for Availability Groups when service packs get broke…

Last week I did a few things wrong.  The good news is I learned from it and now can prevent myself from repeating it.

So I have this AG. It is kind of important, hence the AG part, but after 5 pm I have two hours when it can be down, or so I thought. We recently added new functionality that requires it to be up all the time. I have been applying SQL 2014 SP2 to all of my 2014 servers, all of which have been in AGs, and it has been super easy: less than 10 minutes of work and only a reboot of downtime. (I am a little old-fashioned and always reboot after a service pack.) My mind said, "Sweet, you can get this done and no one will notice, and if you do both at the same time and delay the reboot on one of them, there will be no downtime." That was my first mistake. My second one was starting a few minutes early; I was excited to have it done because it was my last round of service packs. My final mistake was not realizing that SSISDB is considered a system database and should NEVER be in an availability group. It may be super awesome because you can keep your SSIS catalog completely in sync, and you could maybe do it if you plan to remove it before doing service packs or any kind of upgrades to the server, but as far as setting it and forgetting it, you are in trouble if you do it.

Here are a few of the fun errors that we saw.

“Script level upgrade for database ‘master’ failed because upgrade step ‘SSIS_hotfix_install.sql’ encountered error 942, state 4, severity 25. This is a serious error condition which might interfere with regular operation and the database will be taken offline. If the error happened during upgrade of the ‘master’ database, it will prevent the entire SQL Server instance from starting. Examine the previous errorlog entries for errors, take the appropriate corrective actions and re-start the database so that the script upgrade steps run to completion.”

“Cannot recover the master database. SQL Server is unable to run. Restore master from a full backup, repair it, or rebuild it. For more information about how to rebuild the master database, see SQL Server Books Online.”

Huge thank you to all the people who have blogged about trace flag 902. It allowed us to start SQL Server, find the errors, pull SSISDB out of the AG, and let the service pack finish running, and then everything was happy and great. Here is the list of steps from the Microsoft KB article:

Enable trace flag 902 on the instance of SQL Server. To do this, follow these steps:

  1. Open SQL Server Configuration Manager.
  2. In SQL Server Configuration Manager, click SQL Server Services.
  3. Double-click the SQL Server service.
  4. In the SQL Server Properties dialog box, click the Advanced tab.
  5. On the Advanced tab, locate the Startup Parameters item.
  6. Add ;-T902 to the end of the existing string value, and then click OK.
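Once SQL Server is back up under trace flag 902, pulling SSISDB out of the AG is a single statement; which one you use depends on the replica you are connected to (substitute your own AG name for MyAG):

-- On the primary replica:
ALTER AVAILABILITY GROUP [MyAG] REMOVE DATABASE [SSISDB];

-- Or, on a secondary replica:
ALTER DATABASE [SSISDB] SET HADR OFF;

After the service pack finishes, remember to remove ;-T902 from the startup parameters and restart the service normally.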

Lessons Learned:

  1. Don’t start earlier than you say you are going to start.
  2. Don’t do both sides of an AG at the same time. Do the passive one first and make sure everything is up and working and the AG is healthy before you do the primary node. Verify that failover to the passive node works so that you have no downtime.
  3. Don’t get too comfortable with Service Packs and Cumulative Updates. They are still a big change even though they are pretty stable.
  4. Don’t have any system databases in the AG while applying patches, and know that SSISDB counts as a system database.
  5. An awesome team that can back you up and help you troubleshoot can make all the difference. It is amazing to have a boss who believes in you and encourages you to keep going even when you want to give up and go cry in the closet.

Also, for those of you following along at home who want to know what song goes with this post: It’s Different for Girls

Happy Service Packing!

Nothing can stop me, nothing holds me back from changing recovery mode and getting development on track…

Greetings and other salutations,

Today I found out that part of the development environment was using the full recovery model. This means that if someone isn't taking log backups, their transaction logs get huge, and it also means that the backups were much bigger than they should have been. They don't need point-in-time recovery in our development environment, so we decided to move them to simple recovery. This could have been a big all-day job if I went through the GUI, but you know me, I found a way to script it out and thought I would share it. I am showing you how to do it one server at a time:

Connect to your development server in the master database and run this query to see how many are in “Full Recovery”:

 SELECT name, recovery_model_desc
 FROM sys.databases
 WHERE recovery_model_desc = 'FULL'

When I ran it on one of my servers, there were 24 databases that needed to be adjusted. So I built this:

 SELECT 'ALTER DATABASE [' + name + '] SET RECOVERY SIMPLE;'
 FROM sys.databases
 WHERE recovery_model_desc = 'FULL'

Then I took the results from that query, copied them into a new window, and ran them, and just like that, all my development databases are now in simple recovery. I ran the first query one more time to make sure everything updated as expected.
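If you would rather skip the copy-and-paste step, a sketch like this builds and runs all of the ALTER statements in one batch; the system database filter is my own addition, and you can drop it if you really do want everything switched:

DECLARE @sql NVARCHAR(MAX) = N'';

SELECT @sql += N'ALTER DATABASE ' + QUOTENAME(name) + N' SET RECOVERY SIMPLE;' + CHAR(13)
FROM sys.databases
WHERE recovery_model_desc = 'FULL'
  AND name NOT IN ('master', 'model', 'msdb', 'tempdb');

EXEC sys.sp_executesql @sql;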

It is a beautiful thing. I hope this helps you clean up development too!

The song from this post is Kongos’ Take It From Me.

I can make your logs clap…

The SQL Server error log has this nasty habit of getting big when I am not looking. There are only two ways to keep it at a normal size. One is to stop and start your SQL instance (reboot, restart, stop and start) and the other is to run this handy little script:


EXEC sp_cycle_errorlog;

This will end the current log and start a new one. Why does this matter? The SQL Server error log holds information about your backups, failed logins, SQL errors, edition information, and other fun stuff. The bigger it is, the longer it will take SQL Server to load it into memory so that you can read it. Usually when you need to read it, you are in trouble, so the slower it is, the more stressed you will be.
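As a side note, you don't even have to open the log viewer to search it. xp_readerrorlog (undocumented, but it has been around forever) will search a log for you; for example, to look for failed logins in the current SQL Server error log:

EXEC master.dbo.xp_readerrorlog 0, 1, N'Login failed';
-- 0 = current log, 1 = SQL Server log (2 = Agent log), then an optional search string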

What is a good size? I usually try to get it to roll over around 10 MB. I use a monitoring tool, and when the large error log alert is triggered, I have it run sp_cycle_errorlog for me so mine always stays a healthy size. You don't need fancy tools to do this, though. If you know about how fast your logs grow, you can set up a SQL Agent job to run it on a schedule to keep your logs healthy.
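If you want to roll your own version of that alert, here is a rough sketch of what a scheduled SQL Agent job step could run. It uses xp_enumerrorlogs (also undocumented) to look up the size of the current log and cycles it once it passes the 10 MB mark:

-- xp_enumerrorlogs returns the archive number, date, and size in bytes of each error log
CREATE TABLE #ErrorLogs (ArchiveNumber INT, LogDate NVARCHAR(30), LogSizeBytes BIGINT);

INSERT INTO #ErrorLogs
EXEC master.dbo.xp_enumerrorlogs;

-- Archive number 0 is the log currently in use
IF EXISTS (SELECT 1 FROM #ErrorLogs
           WHERE ArchiveNumber = 0 AND LogSizeBytes > 10 * 1024 * 1024)
    EXEC sp_cycle_errorlog;

DROP TABLE #ErrorLogs;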

How many logs should I keep? This is completely up to you, but since I keep my logs so small, I try to keep 15 of them. Why so many? I do it so I can go back and see issues further back if needed. You can adjust the number you keep by right-clicking on SQL Server Logs in SSMS and selecting "Configure".

[Screenshot: Configure SQL Server Error Logs dialog]
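That dialog just writes a registry value, so it can be scripted too; this sketch uses xp_instance_regwrite (undocumented) to keep 15 logs, the same number I use above:

USE master;
GO
EXEC xp_instance_regwrite
    N'HKEY_LOCAL_MACHINE',
    N'Software\Microsoft\MSSQLServer\MSSQLServer',
    N'NumErrorLogs',
    REG_DWORD,
    15;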

Super cool, but what about the Agent error logs? There is a script for them as well!


USE msdb;
GO
EXEC sp_cycle_agent_errorlog;

See? Healthy and Happy Logs! Your Logs will be clapping with joy.