In preparation for my Groupby.org presentation on Friday, I wanted to do a post about why SELECT * is bittersweet.
Let’s start with the bitter first:
- If I have a wide table (lots of columns or columns that have large data types) and I select them all, I am reading A LOT more data than I probably need to read which takes up processor. This will also take up space in my buffer cache and can knock other things out. I am making this sound extra scary to discourge anyone from doing it for a whole table or even a lot of rows.
- Indexing goes out the window with SELECT * because SQL will do a full table scan. If I create an index using every column to make it work (and please don’t do this, it stores the table twice) , someone will come along one day and add a column to my table and it will do a full table scan again. There will be a domino effect in my system leading to the end of the world or at least poor performance.
- Using SELECT * in a Stored Procedure or View can seem simple and easy. I automatically have all the columns I need. What happens when I add a column to my table that lives underneathe? It can change the behavior or even break the Stored Procedure or View and anything that relys on them. Some of the issues may not show up at first for example, it may sort differently, or I may get duplicate data. Then, I have to go back through everything trying to figure out what changed and where to fix it. Huge headache ahead on this one.
We have seen the bitter, now let me show you the sweet:
- When I have a small table and can’t remember the names of all the columns, SELECT * is a quick way for me to see what the columns are named.
- When I need to see what the data looks like in the table, I can use SELECT Top 5 * and get an idea of the data. By limiting the data I am bringing back, I don’t cause as many of the scary reads which uses less processor and I don’t fill up my buffer cache with useless stuff.
SELECT * Can be a friend if it is used carefully and knowingly.
*Update* Here is the GroupBy Presentation I did on T-SQL for Beginners if you would like to watch it.
The song for this post is Bittersweet by Panic! At the Disco
Hi, I was listening to your GroupBy presentation today via podcast, and I wanted to offer a correction. The BETWEEN operator *does* include the end-points. I.e. “myInteger between 1 and 5” includes all integers in (1,2,3,4,5). I like your presentation style, very newbie-friendly!
It looks like the host stepped in and clarified, nice. Kudos on a great session!
Thank you for the kind words! Have an awesome day!