We could be immortals, just not for long when using a duplicate delete!

Sometimes awesome things just happen.  Today Rob Farley (@Rob_Farley) was helping me with a previous post about my dates table and just as a side note he said, “Oh, let me show you something else cool.” It was really cool and so I asked if I could add it to my blog since I never know when this problem will strike.

I have a table about Super Heroes and their cape colors.  I made a mistake and put Batman in there twice.  But since there is no primary key, how do I tell it which one to delete?  Hero Table

Since they are exactly the same, I can let SQL sort it out. This uses both a CTE (I hadn’t ever used it without some kind of join before today) and OVER which I am learning about.  So cool!

WITH SuperHeroDuplicates
AS
(SELECT *, ROW_NUMBER() over (partition by HeroName,CapeColor order by HeroName,CapeColor) as rownum
FROM dbo.HeroCapeColor)
DELETE
FROM SuperHeroDuplicates
where rownum > 1;

Thanks Rob!

—————UPDATE—————

Kenneth Fisher (@sqlstudent144) also wrote a blog post about another way you can accomplish this task if you only have a few of them to delete.

He has been a super big help to me.  He taught me how to display my code better in my blog so it is easier to read and copy. He also encourages me, builds my confidence and even included a link to one of my posts in an article. It made me feel special and like what I have to say matters to other people.  I love how he comments on my posts and gives me ideas on how to make them better. I feel so lucky to be a part of such a great community of people that are so thoughtful, selfless and giving.  Thanks Kenneth for being such a great example to me!

Yeah, that’s my kind of T-SQL!

I have a Dates table and I really love it. This week I needed to do a calculation where I take my monthly budget, divide it by the number of working days in a month and then times it by the working days for the week. That will give me a weekly budget number without having to store it, plus my company only gives me monthly numbers. I had already figured out the daily number and had the calculation working for a 5 day work week when New Years day made everyone notice that it was 5 and I was cheating. So I added a Column to my Dates table to tell me on any given day the number of working days in that week. My weeks run from Sunday to Saturday. I have a Date in my dates table for both the start of the week and the start of the next week. I have a flag (1,0) that says whether or not a day is a weekday and another for whether a day is a holiday. Using this flags, I can pass in a range of dates and get the number of working weekdays.

UPDATE Dates 
SET WorkDaysInWeek = WeekDays 
FROM Dates D 
INNER JOIN  
(SELECT BegW, SUM(IsWeekDay) - SUM(IsHoliday) AS WeekDays 
FROM Dates 
WHERE FullDate Between BegW AND NextWeekStart 
GROUP BY BegW) W ON D.BegW = W.BegW

—————UPDATE—————

But wait, there’s more. Rob Farley who is on twitter @rob_farley sent me an even better way to do it. It gets rid of the need for a WHERE or GROUP BY because we are using OVER. We hoped it would eliminate the sub-query and it does if our query is only a SELECT, but when I go to do an UPDATE, it says that “Windows functions can only appear in the SELECT and ORDER BY clauses”. Rob suggested I use a CTE instead.  I hadn’t ever used a CTE without joining back to it so he taught me how.

This is the super awesome SELECT:

SELECT BegW,SUM(IsWeekday - IsHoliday) OVER (PARTITION BY BegW) AS WeekDays
FROM Dates 

Here is what the UPDATE ended up looking like.

WITH d AS (
SELECT *,SUM(IsWeekday - IsHoliday) OVER (PARTITION BY BegW) AS WeekDays
FROM Dates)
UPDATE d SET WorkDaysInWeek = WeekDays;

The other question that Rob had was what if we have holidays on weekends. This is a great question. At my company, the holiday would be shifted to one of the days of the week, so the counts would still be correct. But if you are in a situation where that is not the case you can change the where clause like this:

WHERE [DayOfWeek] NOT IN (1,7)

As I was writing my where clause, I noticed I did something bad and used a reserved word as a column name. If you happen to make this mistake as well, just make sure to put square brackets [] around the word when you use it.

Huge thank you to Rob for being so kind to help me be better! One of my new goals is to play with OVER and understand how to use it and when. I am also going to be learning more about CTE’s and not having to use joins. Yay! New toys!!!!!