Every time when I tell somebody that percentage cost in execution plan lie and you should not trust them when tuning a query, I get different reactions/question such as:
You are the one lying!
Can you prove it?
Why they put it there if they are not right?
Every time I got this question, I share a nice blog post by my friend Forrest [t,b] . You should read it before you start reading my craziness, he is way smarter than me :).
Here you go: Percentage Non Grata
Yesterday, someone came back asking a ton of questions, assuring that percentages were wrong because the statistics were forced, etc. I tried to explain that the main point was proved, and statistics are not even close to perfect in a production environment, but it was not enough to convince them, so I decided to share my screen and demonstrate it over a live session.
Here is a script to create our lab, one table and one UDF.
--\\Create table
CREATE TABLE [dbo].[UserProfile] (UserID INT PRIMARY KEY CLUSTERED IDENTITY(1,1)
,AboutMe NVARCHAR(1500) NOT NULL
, MyValue1 INT NOT NULL);
GO
--\\ Insert some data :)
INSERT INTO [dbo].[UserProfile](AboutMe,MyValue1)
SELECT TOP (2000000) AboutMe= REPLICATE( LEFT(CAST(NEWID() AS VARCHAR(520)), 5), 3)
, ROW_NUMBER() OVER (ORDER BY a.name)%3
FROM sys.all_objects a
CROSS JOIN sys.messages c
UNION ALL
SELECT TOP (2000000) AboutMe= REPLICATE( LEFT(CAST(NEWID() AS VARCHAR(520)), 5), 3)
, 3
FROM sys.all_objects a
CROSS JOIN sys.messages c;
GO
--\\Create and index to help our query.
CREATE INDEX IX_Value1 ON [dbo].[UserProfile] (MyValue1) WITH FILLFACTOR=100;
GO
--\\ Create scalar function.
CREATE OR ALTER FUNCTION dbo.Total(@MyValue1 INT)
RETURNS BIGINT
WITH SCHEMABINDING
AS
BEGIN
DECLARE @Total BIGINT;
SELECT @Total=COUNT_BIG(*)
FROM [dbo].[UserProfile]
WHERE MyValue1=@MyValue1;
RETURN @Total;
END;
GO
Once the table is ready, we are going to run the following queries, please make sure you include the Actual Execution Plan.
Note: Compatibility level is set to 140 (SQL 2017)
--\\ Compare the function and raw query
--\\ Include Actual Execution Plan (Ctrl + M)
SELECT Total=COUNT_BIG(*)
FROM [dbo].[UserProfile]
WHERE MyValue1=3;
SELECT dbo.Total(3) as Total;
Both queries are identical, same logic, same results and same number of reads, the only difference is the UDF. What about percentages? Are those identical? Not sure about those, please see below...
SQL Server cannot expand the logic behind the UDF, therefore, the percentages are lying.
I am aware of SQL Server 2019 and FROID, but let's be honest, we are not upgrading soon and FROID it's not going to fix every UDF for you.
The above queries were not manipulated ,so what happens if we confuse SQL Server a little bit?
Let's run these three queries:
SELECT A.MyValue1
,A.AboutMe
FROM [dbo].[UserProfile] A
WHERE A.MyValue1=3
ORDER BY A.MyValue1
OPTION(FAST 10); --\\ Give me the first 10 rows as fast as you can.
SELECT TOP (10) A.MyValue1
, A.AboutMe
FROM [dbo].[UserProfile] A
WHERE A.MyValue1=3
ORDER BY A.MyValue1; --\\ Let SQL server handle.
SELECT TOP (10) A.MyValue1
, B.AboutMe
FROM [dbo].[UserProfile] A
INNER LOOP JOIN [dbo].[UserProfile] B --\\Forcing Loop Join and selference, expecting SQL Server seeks using cluster and noncluster indexes.
ON A.UserID= B.UserID
WHERE A.MyValue1=3
ORDER BY A.MyValue1;
Indeed, the percentages are lying again... Please see below.
If you have any questions or suggestions, please let me know in the comments.
Enjoy!!
Comments