top of page
Search
Writer's pictureVictor Escott

Percentage Cost lie and are good at it.

Every time when I tell somebody that percentage cost in execution plan lie and you should not trust them when tuning a query, I get different reactions/question such as:

You are the one lying!

Can you prove it?

Why they put it there if they are not right?

Every time I got this question, I share a nice blog post by my friend Forrest [t,b] . You should read it before you start reading my craziness, he is way smarter than me :).

Yesterday, someone came back asking a ton of questions, assuring that percentages were wrong because the statistics were forced, etc. I tried to explain that the main point was proved, and statistics are not even close to perfect in a production environment, but it was not enough to convince them, so I decided to share my screen and demonstrate it over a live session.


Here is a script to create our lab, one table and one UDF.

--\\Create table
CREATE TABLE [dbo].[UserProfile] (UserID INT  PRIMARY KEY CLUSTERED IDENTITY(1,1)
,AboutMe NVARCHAR(1500) NOT NULL
, MyValue1 INT NOT NULL);
GO
--\\ Insert some data :)
INSERT INTO [dbo].[UserProfile](AboutMe,MyValue1)
SELECT TOP (2000000)    AboutMe= REPLICATE( LEFT(CAST(NEWID() AS VARCHAR(520)), 5), 3)
				,   ROW_NUMBER() OVER (ORDER BY a.name)%3
FROM sys.all_objects a
CROSS JOIN sys.messages c
UNION ALL
SELECT TOP (2000000)    AboutMe= REPLICATE( LEFT(CAST(NEWID() AS VARCHAR(520)), 5), 3)
				,   3
FROM sys.all_objects a
CROSS JOIN sys.messages c;
GO
--\\Create and index to help our query.
CREATE INDEX IX_Value1 ON [dbo].[UserProfile] (MyValue1) WITH FILLFACTOR=100;
GO
--\\ Create scalar function.
CREATE OR ALTER FUNCTION dbo.Total(@MyValue1 INT)  
RETURNS BIGINT  
WITH SCHEMABINDING
AS   
BEGIN    
    DECLARE @Total BIGINT;        
	SELECT @Total=COUNT_BIG(*) 
	FROM [dbo].[UserProfile]
	WHERE MyValue1=@MyValue1;

    RETURN @Total;    
END;
GO

Once the table is ready, we are going to run the following queries, please make sure you include the Actual Execution Plan.

Note: Compatibility level is set to 140 (SQL 2017)
--\\ Compare the function and raw query
--\\ Include Actual Execution Plan (Ctrl + M)

SELECT Total=COUNT_BIG(*) 
FROM [dbo].[UserProfile]
WHERE MyValue1=3;

SELECT dbo.Total(3) as Total;

Both queries are identical, same logic, same results and same number of reads, the only difference is the UDF. What about percentages? Are those identical? Not sure about those, please see below...


SQL Server cannot expand the logic behind the UDF, therefore, the percentages are lying.


I am aware of SQL Server 2019 and FROID, but let's be honest, we are not upgrading soon and FROID it's not going to fix every UDF for you.

The above queries were not manipulated ,so what happens if we confuse SQL Server a little bit?


Let's run these three queries:

SELECT  A.MyValue1
	   ,A.AboutMe
FROM [dbo].[UserProfile] A  
WHERE A.MyValue1=3
ORDER BY A.MyValue1
OPTION(FAST 10); --\\ Give me the first 10 rows as fast as you can.

SELECT TOP (10) A.MyValue1
              , A.AboutMe
FROM [dbo].[UserProfile] A
WHERE A.MyValue1=3
ORDER BY A.MyValue1; --\\ Let SQL server handle.


SELECT TOP (10) A.MyValue1
              , B.AboutMe 
FROM [dbo].[UserProfile] A 
		INNER LOOP JOIN [dbo].[UserProfile] B --\\Forcing Loop Join and selference, expecting SQL Server seeks using cluster and noncluster indexes.
				ON A.UserID= B.UserID
WHERE A.MyValue1=3
ORDER BY A.MyValue1;
  1. Query #1, returns 2,000,000 rows, but I asked SQL server to return the first 10 rows as fast as possible (Query Hint).

  2. Query #2, returns 10 rows, very simple query without tricks.

  3. Query #3, returns 10 rows, forced a Loop Join and self-reference.

Indeed, the percentages are lying again... Please see below.

If you have any questions or suggestions, please let me know in the comments.

Enjoy!!

25 views0 comments

Comments


bottom of page