Have I mentioned that I like query tuning? One of my favorite tuning tricks is removing Sub-queries from WHERE clauses. Let me give an example:
SELECT HeroName
,HasCape
,FavoriteColor
,LairId
FROM [dbo].[SuperHero] s
WHERE HeroType = 2
AND NOT EXISTS(SELECT 1
FROM [dbo].[SuperHero] x
WHERE x.HeroID = s.HeroID
AND x.IsHuman = 1 AND x.Weakness = 'Lack of Control')
Notice the “NOT EXISTS *Sub-Query* section. Any time I see this pattern or even a “NOT IN *Sub-Query*” pattern, I know I can fix it like this:
SELECT s.HeroName
, s.HasCape
, s.FavoriteColor
, s.LairId
FROM [dbo].[SuperHero] s
LEFT JOIN [dbo].[SuperHero] x ON x.HeroID = s.HeroID
AND x.IsHuman = 1
AND x.Weakness = 'Lack of Control'
WHERE HeroType = 2
AND x.HeroId IS NULL
In this second example, I have moved the sub-query to be in a LEFT JOIN with the same criteria and then in the WHERE I use one of the columns that should be populated (I favor ID columns here) and look to see if it “IS NULL”. That “IS NULL” works the same way as the “NOT EXISTS” and the “NOT IN”.
This allows me to remove the non-sargable arguments from the where clause and takes my query from non-sargable to sargable. (from Wikipedia- The term is derived from a contraction of Search ARGument ABLE. A query failing to be sargable is known as a non-sargable query and typically has a negative effect on query time, so one of the steps in query optimization is to convert them to be sargable).
With simple queries that have a low number of records, I hardly notice a difference in performance. As the queries become more complex or the row numbers increase, the difference begins to show in the query run time and the IO statistics.
The song for this post is I’m Born to Run by American Authors.
[…] summer I wrote about NOT EXISTS and how I could speed up the query by adding a left join and adding an IS NULL to the WHERE clause. […]