Victor Escott

Every time I tell somebody that the cost percentages in an execution plan lie, and that you should not trust them when tuning a query, I get different reactions and questions, such as:

You are the one lying!

Can you prove it?

Why would they put them there if they are not right?

Every time I get these questions, I share a nice blog post by my friend Forrest [t,b]. You should read it before you start reading my craziness; he is way smarter than me :).

Yesterday, someone came back asking a ton of questions, insisting that the percentages were wrong because the statistics were forced, and so on. I tried to explain that the main point had been proven, and that statistics are not even close to perfect in a production environment, but it was not enough to convince them, so I decided to share my screen and demonstrate it in a live session.


Here is a script to create our lab, one table and one UDF.

--\\Create table
CREATE TABLE [dbo].[UserProfile] (UserID INT  PRIMARY KEY CLUSTERED IDENTITY(1,1)
,AboutMe NVARCHAR(1500) NOT NULL
, MyValue1 INT NOT NULL);
GO
--\\ Insert some data :)
INSERT INTO [dbo].[UserProfile](AboutMe,MyValue1)
SELECT TOP (2000000)    AboutMe= REPLICATE( LEFT(CAST(NEWID() AS VARCHAR(520)), 5), 3)
				,   ROW_NUMBER() OVER (ORDER BY a.name)%3
FROM sys.all_objects a
CROSS JOIN sys.messages c
UNION ALL
SELECT TOP (2000000)    AboutMe= REPLICATE( LEFT(CAST(NEWID() AS VARCHAR(520)), 5), 3)
				,   3
FROM sys.all_objects a
CROSS JOIN sys.messages c;
GO
--\\ Create an index to help our query.
CREATE INDEX IX_Value1 ON [dbo].[UserProfile] (MyValue1) WITH FILLFACTOR=100;
GO
--\\ Create scalar function.
CREATE OR ALTER FUNCTION dbo.Total(@MyValue1 INT)  
RETURNS BIGINT  
WITH SCHEMABINDING
AS   
BEGIN    
    DECLARE @Total BIGINT;        
	SELECT @Total=COUNT_BIG(*) 
	FROM [dbo].[UserProfile]
	WHERE MyValue1=@MyValue1;

    RETURN @Total;    
END;
GO

Once the table is ready, we are going to run the following queries. Please make sure you include the Actual Execution Plan.

Note: Compatibility level is set to 140 (SQL 2017)
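In case your database is on a different compatibility level, you can switch it like this (replace YourDatabase with your actual database name):

--\\ Set compatibility level to SQL Server 2017.
ALTER DATABASE [YourDatabase] SET COMPATIBILITY_LEVEL = 140;
GO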
--\\ Compare the function and raw query
--\\ Include Actual Execution Plan (Ctrl + M)

SELECT Total=COUNT_BIG(*) 
FROM [dbo].[UserProfile]
WHERE MyValue1=3;

SELECT dbo.Total(3) as Total;

Both queries are identical: same logic, same results, and the same number of reads; the only difference is the UDF. What about the percentages? Are those identical too? Not so sure about those, please see below...


SQL Server cannot expand the logic behind the UDF; therefore, the percentages are lying.


I am aware of SQL Server 2019 and FROID, but let's be honest, we are not upgrading soon, and FROID is not going to fix every UDF for you.

The above queries were not manipulated, so what happens if we confuse SQL Server a little bit?


Let's run these three queries:

SELECT  A.MyValue1
	   ,A.AboutMe
FROM [dbo].[UserProfile] A  
WHERE A.MyValue1=3
ORDER BY A.MyValue1
OPTION(FAST 10); --\\ Give me the first 10 rows as fast as you can.

SELECT TOP (10) A.MyValue1
              , A.AboutMe
FROM [dbo].[UserProfile] A
WHERE A.MyValue1=3
ORDER BY A.MyValue1; --\\ Let SQL Server handle it.


SELECT TOP (10) A.MyValue1
              , B.AboutMe 
FROM [dbo].[UserProfile] A 
		INNER LOOP JOIN [dbo].[UserProfile] B --\\ Forcing a Loop Join and a self-reference, expecting SQL Server to seek using the clustered and nonclustered indexes.
				ON A.UserID= B.UserID
WHERE A.MyValue1=3
ORDER BY A.MyValue1;
  1. Query #1 returns 2,000,000 rows, but I asked SQL Server to return the first 10 rows as fast as possible (query hint).

  2. Query #2 returns 10 rows; a very simple query without tricks.

  3. Query #3 returns 10 rows; I forced a Loop Join and a self-reference.

Indeed, the percentages are lying again... Please see below.

If you have any questions or suggestions, please let me know in the comments.

Enjoy!!


Generally, when we write simple and small queries, we omit small details that make a huge difference. A few weeks ago I found an interesting query that was returning millions of rows. The problem with this query is that it runs very often, and we were not expecting that amount of data... In fact, we were getting the whole table instead of a couple of rows.


The Lab

Let's create two tables, one table is for users and the other one for some kind of log per user.
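The original script was posted as an image, so here is a sketch of what the two tables might look like. The table and column names are my assumption, except FK_UserID, which is referenced later in the post:

--\\ Hypothetical lab tables (names assumed).
CREATE TABLE [dbo].[Users] (UserID INT PRIMARY KEY CLUSTERED IDENTITY(1,1)
,UserName NVARCHAR(100) NOT NULL
,CreatedBy NVARCHAR(50) NOT NULL);
GO
CREATE TABLE [dbo].[UserLog] (LogID INT PRIMARY KEY CLUSTERED IDENTITY(1,1)
,FK_UserID INT NOT NULL
,LogDate DATETIME NOT NULL DEFAULT GETDATE());
GO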


Here is the query that is running very often.

The query is requesting all the logs for the users that were created through an external application. We know that only four records qualify, but the query is returning 1,000,000 rows. Let's take a look at the execution plan.
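The query itself was shown as a screenshot; assuming a hypothetical log table dbo.UserLog (which has an FK_UserID column) and a hypothetical dbo.Users table (which does not), it would look something like this:

SELECT L.FK_UserID, L.LogDate
FROM [dbo].[UserLog] L
WHERE L.FK_UserID IN (SELECT FK_UserID --\\ Resolves to the outer table!
                      FROM [dbo].[Users]
                      WHERE CreatedBy = 'ExternalApp');

Since FK_UserID does not exist in dbo.Users, SQL Server resolves it against the outer table, so the IN predicate effectively compares the column to itself and every row qualifies.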


  1. There is a warning for missing join predicate.

  2. FK_UserID does not exist in table T2, but the query runs without error.

  3. A Row Count Spool is used to satisfy the query (Left Semi Join); therefore, all rows are returned and there is no error.

If you run the sub-query by itself, you are going to get an error, because the field does not exist.

If you add the missing qualifiers to the select statement, then you are going to get an error too; now SQL Server knows what you are trying to do.
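For example, with hypothetical table names (the original was a screenshot), qualifying the column inside the sub-query makes SQL Server raise the error instead of silently resolving it to the outer table:

SELECT L.FK_UserID, L.LogDate
FROM [dbo].[UserLog] L
WHERE L.FK_UserID IN (SELECT U.FK_UserID --\\ Now this fails: invalid column name.
                      FROM [dbo].[Users] U
                      WHERE U.CreatedBy = 'ExternalApp');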


Here is the right query and execution plan.
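The corrected query was also a screenshot; using the same hypothetical names, it would be along these lines, comparing the log's FK_UserID against the users' actual key column:

SELECT L.FK_UserID, L.LogDate
FROM [dbo].[UserLog] L
WHERE L.FK_UserID IN (SELECT U.UserID
                      FROM [dbo].[Users] U
                      WHERE U.CreatedBy = 'ExternalApp');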


Bonus

SQL Server completely ignores the qualifications for the columns in an insert statement. This makes sense, since the table has already been specified.
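To illustrate with a made-up qualifier (the original example was a screenshot, and the table name here is my hypothetical one):

--\\ 'X' is not a valid alias anywhere, but SQL Server ignores the qualifier.
INSERT INTO [dbo].[UserLog] (X.FK_UserID, X.LogDate)
VALUES (1, GETDATE());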


If you have any questions or suggestions, please let me know in the comments.

Enjoy!!


A couple of weeks ago I attended a training session taught by Tim Chapman (t). Parameter sniffing was part of the agenda, and when we were asked to share our thoughts and ideas on how to tackle it, I suggested that if you know the distribution of your data very well and you cannot afford to recompile the plan every time, then you can add a single comment to generate a new plan, and that plan will be reused.


I thought most of the people there would know about this and that I was going to get a lot of feedback and a couple of good reasons why they do not use it, but it was the other way around: most of them did not know about it.


The lab

We are going to be working with a single table, one stored procedure, and a simple query to retrieve the plans from cache. The scripts are as follows:
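The scripts were originally posted as images; here is a sketch of what the table probably resembles. The names are assumed from the AboutMe and Score columns mentioned in the text, and the skewed data load is illustrative, sized to roughly match the row counts shown below:

--\\ Hypothetical lab table with skewed data in Score.
CREATE TABLE [dbo].[UserProfile] (UserID INT PRIMARY KEY CLUSTERED IDENTITY(1,1)
,AboutMe NVARCHAR(1500) NOT NULL
,Score INT NOT NULL);
GO
--\\ Few rows for Score=1, many rows for Score=4.
INSERT INTO [dbo].[UserProfile](AboutMe,Score)
SELECT TOP (200) AboutMe=REPLICATE(LEFT(CAST(NEWID() AS VARCHAR(520)), 5), 3), 1
FROM sys.all_objects a CROSS JOIN sys.messages c
UNION ALL
SELECT TOP (799000) AboutMe=REPLICATE(LEFT(CAST(NEWID() AS VARCHAR(520)), 5), 3), 4
FROM sys.all_objects a CROSS JOIN sys.messages c;
GO
CREATE INDEX IX_Score ON [dbo].[UserProfile] (Score);
GO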

Here is our stored procedure; please note that I am using dynamic SQL:
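The procedure was shown as an image; a minimal sketch, with an assumed name and the dynamic SQL pattern described in the post, could be:

CREATE OR ALTER PROCEDURE dbo.GetProfilesByScore @Score INT
AS
BEGIN
    DECLARE @SQL NVARCHAR(MAX);
    SET @SQL = N'SELECT UserID, AboutMe, Score FROM [dbo].[UserProfile] WHERE Score=@Score;';
    PRINT @SQL; --\\ This is why the Messages tab shows the executed query.
    EXEC sys.sp_executesql @SQL, N'@Score INT', @Score=@Score;
END;
GO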

This is the query that we are going to use to retrieve the plans from cache.
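That query was a screenshot as well; one along these lines, using the plan cache DMVs, returns the text, hashes, and plan (the LIKE filter is just to narrow the results to our lab table):

SELECT st.text
      ,qs.query_hash
      ,qs.query_plan_hash
      ,qs.execution_count
      ,qp.query_plan
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
WHERE st.text LIKE N'%UserProfile%';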


Let's review our data distribution.
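A simple GROUP BY is enough to see the skew (assuming the hypothetical Score column):

SELECT Score, TotalRows=COUNT_BIG(*)
FROM [dbo].[UserProfile]
GROUP BY Score
ORDER BY Score;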

Results are shown below:


Executing this stored procedure with Score=1 produces the following results, please note that we have to clear the cache for this execution.

200 rows were returned and a new plan was created; this plan is compiled for Score=1, please see below.

Messages tab will show the query that was executed.

Execution plan: an Index Seek, and therefore a Nested Loops join is required due to the Key Lookup to get the "AboutMe" field.

Executing this stored procedure with any other value will produce the same results, since the plan will be pulled from cache. Let's try with a different value.


799K rows were returned and the plan was pulled from cache; note that the Query Hash and Query Plan Hash are the same as in the previous execution.

Messages tab will show the query that was executed. (Same query)

Execution plan: an Index Seek, and therefore a Nested Loops join is required due to the Key Lookup to get the "AboutMe" field... Not good, we should be scanning, but parameter sniffing is doing its job.


The trick

Let's modify the stored procedure: we are going to add a comment and, if we are lucky enough, SQL Server will hear our prayers.
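The modified procedure was also a screenshot. The idea is to append a different comment to the dynamic SQL depending on how selective we know the value to be, so each group of values gets its own query text, and therefore its own cached plan. A sketch, with assumed names and an assumed split of the values:

CREATE OR ALTER PROCEDURE dbo.GetProfilesByScore @Score INT
AS
BEGIN
    DECLARE @SQL NVARCHAR(MAX);
    SET @SQL = N'SELECT UserID, AboutMe, Score FROM [dbo].[UserProfile] WHERE Score=@Score;';
    IF @Score IN (1, 6) --\\ Values we know are selective (seek-friendly).
        SET @SQL = @SQL + N' --\\ Selective value';
    ELSE                --\\ Everything else should scan.
        SET @SQL = @SQL + N' --\\ Not selective';
    PRINT @SQL;
    EXEC sys.sp_executesql @SQL, N'@Score INT', @Score=@Score;
END;
GO

Because the query text now differs between the two branches, the cache lookup misses on the first execution of each branch, and each one gets its own plan.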

Executing this stored procedure with Score=1 produces the following results, please note that we have to clear the cache for this execution.

200 rows were returned and a new plan was created; this plan is compiled for Score=1, please see below.

Messages tab will show the query that was executed. (Now with our comment)

Execution Plan: It seems that our prayers are working out :)

If we run the stored procedure now using Score=6, we get the same execution plan since it was reused from cache, so our comment is working wonders, please see below.

Executing this stored procedure with Score=4 produces the following results.

799K rows were returned and a new plan was created; note that the Query Hash is the same as in the previous execution, but now we have a new Query Plan Hash.

Messages tab will show the query that was executed. (Now with our comment)

Execution Plan: It seems that our prayers are working out :)

The magic is here:


Basically, the comment persuades SQL Server to create a new plan: because the query text changed, the cache lookup does not find a match, and SQL Server has to compile again, which gives it a chance to build a better plan :).


If you have any questions or suggestions, please let me know in the comments.

Enjoy!!
