Generally, when we write simple and small queries, we omit small details that make a huge difference. A few weeks ago I found an interesting query that was returning million of rows, the problem with this query is that it runs very often and we were not expecting that amount of data... In fact, we were getting the whole table instead of couple of rows.
The Lab
Let's create two tables, one table is for users and the other one for some kind of log per user.
Here is the query that is running very often.
Query is requesting all the logs for the users that were created through an external application, we know that only four records qualify for, but query is returning 1,000,000 rows. Let's take a look at the execution plan.
There is a warning for missing join predicate.
FK_UserID does not exists in table T2, but query runs without error.
Row Count Spool is used to satisfy the query (Left Semi Join), therefore all rows are returned and no error.
If you run the sub-query then you are going to get an error, this because the field does not exists.
If you add the missing qualifiers for the select statement, then you are going to get and error too, now SQL server knows what you are trying to do.
Here is the right query and execution plan.
Bonus
SQL server ignores completely the qualifications for the columns in insert statement, this makes sense, since the table has been specified already.
If you have any questions or suggestion, please let me know in the comments.
Enjoy!!
Comments