
A couple of weeks ago I attended a training session taught by Tim Chapman (t). Parameter sniffing was part of the agenda, and when we were asked to share our thoughts and ideas on how to tackle it, I suggested that if you know the distribution of your data very well and you cannot afford to recompile the plan every time, you can add a single comment to generate a new plan, and that plan will be reused.


I thought most of the people there would know about this and that I was going to get a lot of feedback and a couple of good reasons why they do not use it, but it was the other way around: most of them did not know about it.


The lab

We are going to be working with a single table, one stored procedure, and a simple query to retrieve the plans from cache. The scripts are as follows:
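The original scripts were posted as images, so here is a minimal sketch of the kind of table the demo needs; the table and column names are my assumptions based on the rest of the post (a skewed "Score" column and an "AboutMe" column that is not covered by the index):

```sql
-- Hypothetical lab table: Score is skewed, and AboutMe lives only in the base
-- table so the nonclustered index forces a Key Lookup.
CREATE TABLE dbo.Users
(
    UserId  INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    Score   INT NOT NULL,
    AboutMe NVARCHAR(2000) NULL
);

CREATE NONCLUSTERED INDEX IX_Users_Score ON dbo.Users (Score);
```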

Here is our stored procedure; please note that I am using dynamic SQL:
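A minimal sketch of such a procedure, assuming the hypothetical dbo.Users table above; the key point is that the query is built as a string and executed with sp_executesql:

```sql
CREATE OR ALTER PROCEDURE dbo.GetUsersByScore
    @Score INT
AS
BEGIN
    DECLARE @sql NVARCHAR(MAX) = N'
        SELECT UserId, Score, AboutMe
        FROM dbo.Users
        WHERE Score = @Score;';

    -- Parameterized dynamic SQL: the plan for this text is cached and reused.
    EXEC sys.sp_executesql @sql, N'@Score INT', @Score = @Score;
END;
```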

This is the query that we are going to use to retrieve the plans from cache.
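Something along these lines works (a sketch; the exact script in the post was an image):

```sql
SELECT st.text,
       qs.query_hash,
       qs.query_plan_hash,
       qs.execution_count,
       qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE st.text LIKE N'%dbo.Users%';
```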


Let's review our data distribution.
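For example, assuming the hypothetical dbo.Users table above:

```sql
SELECT Score, COUNT(*) AS UserCount
FROM dbo.Users
GROUP BY Score
ORDER BY Score;
```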

Results are shown below:


Executing this stored procedure with Score=1 produces the following results; please note that we have to clear the cache for this execution.
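A sketch of the test run; note that DBCC FREEPROCCACHE wipes the entire plan cache, so only do this on a test instance:

```sql
DBCC FREEPROCCACHE; -- test instance only!
EXEC dbo.GetUsersByScore @Score = 1;
```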

200 rows were returned and a new plan was created; this plan is compiled for Score=1. Please see below.

The Messages tab will show the query that was executed.

Execution plan: an Index Seek, with a Nested Loops required for the Key Lookup that retrieves the "AboutMe" field.

Executing this stored procedure with any other value will produce the same results, since the plan will be pulled from cache. Let's try a different value.


799K rows were returned and the plan was pulled from cache; note that the Query Hash and Query Plan Hash are the same as in the previous execution.

The Messages tab will show the query that was executed. (Same query)

Execution plan: an Index Seek, with a Nested Loops required for the Key Lookup that retrieves the "AboutMe" field... Not good. We should be scanning, but parameter sniffing is doing its job.


The trick

Let's modify the stored procedure: we are going to add a comment and, if we are lucky enough, SQL Server will hear our prayers.
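Here is one plausible reconstruction of the trick, reusing the hypothetical procedure from above. Because the query is dynamic SQL, we can append a different comment depending on what we know about the distribution; the comment does not change what the query does, but the plan cache lookup is based on the query text, so each bucket of parameters gets its own reusable plan:

```sql
CREATE OR ALTER PROCEDURE dbo.GetUsersByScore
    @Score INT
AS
BEGIN
    DECLARE @sql NVARCHAR(MAX) = N'
        SELECT UserId, Score, AboutMe
        FROM dbo.Users
        WHERE Score = @Score;';

    -- We know the distribution: these scores return few rows, the rest return
    -- many. A different comment is enough to get a separate cache entry.
    IF @Score IN (1, 6)
        SET @sql += N' /* selective */';
    ELSE
        SET @sql += N' /* not selective */';

    EXEC sys.sp_executesql @sql, N'@Score INT', @Score = @Score;
END;
```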

Executing this stored procedure with Score=1 produces the following results; please note that we have to clear the cache for this execution.

200 rows were returned and a new plan was created; this plan is compiled for Score=1. Please see below.

The Messages tab will show the query that was executed. (Now with our comment)

Execution Plan: It seems that our prayers are working out :)

If we run the stored procedure now using Score=6, we get the same execution plan since it was reused from cache, so our comment is working wonders. Please see below.

Executing this stored procedure with Score=4 produces the following results.

799K rows were returned and a new plan was created; note that the Query Hash is the same as in the previous execution, but now we have a new Query Plan Hash.

The Messages tab will show the query that was executed. (Now with our comment)

Execution Plan: It seems that our prayers are working out :)

The magic is here:


Basically, the comment persuades SQL Server to create a new plan: the plan cache lookup is based on the query text, so a query with a different comment counts as a "new" query and has to be compiled, giving us a chance at a better plan :).


If you have any questions or suggestions, please let me know in the comments.

Enjoy!!

A few days ago, I was asked to look at a stored procedure that had started to have problems. At first glance it seemed to be the typical parameter-sniffing problem, so I was not excited at all.


Man, how wrong I was. Indeed, parameter sniffing was playing a role, but then I noticed something that blew my mind: some statistics were way off! Yeah, I know what you are thinking, but what if I tell you that the statistics I am talking about are for a temporary table (#NowWeAreTalking)?


Here is a replication of what I saw.

The statistics are way off even for the insert statement: SQL Server estimated 8 rows when 250 were inserted. The same statistics were used to choose between Nested Loops and Hash Match; you know who won, and why the SORT spilled to tempdb.


Temporary tables support statistics; this is their main advantage over table variables.

Awesome, but what happened here?


The statistics of the temporary table were cached and are being reused every time the stored procedure is executed. Say what? Yes, they are reused every single time, even if you explicitly DROP the temporary table.


The Lab.


Let’s replicate the issue. In order to do so, we must create two tables:
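A sketch of the two tables, with names and row counts taken from the post (the exact DDL and data load were images):

```sql
CREATE TABLE dbo.MainUsers
(
    uId    INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    uName  VARCHAR(50) NOT NULL,
    uLevel INT NOT NULL
);

CREATE TABLE dbo.UserBadges
(
    bId   INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    uId   INT NOT NULL,
    bName VARCHAR(50) NOT NULL
);

-- Load dbo.MainUsers with 508 rows skewed on uLevel (e.g. 8 users with
-- uLevel = 6 and 500 with uLevel = 5) and dbo.UserBadges with 110,100 rows.
```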

Let’s review the data, so we know what we are dealing with.
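For example:

```sql
SELECT uLevel, COUNT(*) AS UserCount
FROM dbo.MainUsers
GROUP BY uLevel
ORDER BY uLevel;
```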

Results are shown below:

The table “MainUsers” has 508 rows and a skewed distribution based on “uLevel” (parameter-sniffing).
The table “UserBadges” has 110,100 rows, this table contains badges per user.

Now that we have the data in place, let’s create a stored procedure.

This procedure returns a list of badges filtered by uLevel, with the uLevel passed in as a parameter; it also returns the object_id of the temporary table that was created. For example:
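A minimal sketch of such a procedure; the name and exact shape are my assumptions, but the essential parts are the temporary table, the sorted join, and the OBJECT_ID call:

```sql
CREATE OR ALTER PROCEDURE dbo.GetBadgesByLevel
    @uLevel INT
AS
BEGIN
    CREATE TABLE #Users (uId INT NOT NULL);

    INSERT INTO #Users (uId)
    SELECT uId
    FROM dbo.MainUsers
    WHERE uLevel = @uLevel;

    SELECT ub.uId, ub.bName
    FROM #Users AS u
    INNER JOIN dbo.UserBadges AS ub ON ub.uId = u.uId
    ORDER BY ub.bName;

    -- Return the temp table's object_id so we can inspect it afterwards.
    SELECT OBJECT_ID('tempdb..#Users') AS Obj_ID;

    DROP TABLE #Users; -- dropped explicitly, and yet...
END;
GO

EXEC dbo.GetBadgesByLevel @uLevel = 6;
```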

Results are shown below:


Now, let's review what happened with the temporary table we created. In order to do so, we are going to use the Obj_ID to retrieve the table info, statistics, and histogram. For example:
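A sketch of the inspection script; run it in the context of tempdb, and note that sys.dm_db_stats_histogram requires SQL Server 2016 SP1 CU2 or later:

```sql
USE tempdb;
GO
DECLARE @Obj_ID INT = -1452344724; -- replace with the Obj_ID from your run

-- The cached temporary table.
SELECT name, object_id, create_date
FROM sys.tables
WHERE object_id = @Obj_ID;

-- Its statistics, including the modification counter.
SELECT s.name, sp.rows, sp.last_updated, sp.modification_counter
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = @Obj_ID;

-- The histogram.
SELECT h.step_number, h.range_high_key, h.equal_rows, h.range_rows
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_histogram(s.object_id, s.stats_id) AS h
WHERE s.object_id = @Obj_ID;
```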

Results are shown below:

Notice the following interesting details:

  • The temporary table was renamed to "#A6928DC1"; it was not dropped, so it is now cached.

  • There is a relationship between the name and the obj_id: the name is the obj_id written in hex as a signed two's complement value (see the snippet after this list).

  • Stats were created and the modification counter is set to 25; there is a lot going on with this counter, so let's keep an eye on it.

  • We are going to use this script several times, so keep it in a separate window.
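The name/obj_id relationship from the list above is easy to verify; the cached name is just the object_id rendered as a signed 32-bit value in hex:

```sql
DECLARE @Obj_ID INT = -1452344724; -- replace with the Obj_ID from your run
SELECT '#' + CONVERT(CHAR(8), CONVERT(BINARY(4), @Obj_ID), 2) AS CachedName;
```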

Now, let's start the party!


Executing this stored procedure with uLevel=6 produces the following results; please note that we have to clear the cache for this execution.

No data was returned, but the temporary table was created. Obj_id=-1452344724



Let's review the temporary table and notice that the stats were created based on the parameter (6). If we go back to our initial analysis, we can corroborate that the number of users with uLevel=6 is 8.


The execution plan is good and the stats are amazing.


Executing this stored procedure with uLevel=5 produces the following results; please note that we did not clear the cache for this execution.

Now data was returned, and the temporary table is the same. Obj_id=-1452344724


Let's review the temporary table; notice that the stats were created based on the parameter (6) and have not changed after this execution. But there is something new: the modification counter has increased by 500.

The execution plan is the same (parameter-sniffing), but the stats are a mess and the sort spilled to tempdb. Now we can confirm that the stats are cached and reused.

Interesting. Let's run the same stored procedure multiple times and see if the stats are updated once the RT (Recompilation Threshold) is reached, but first we have to calculate the RT.


In order to do so, let's follow the formula that I found here (Plan Caching and Recompilation in SQL Server 2012).
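For temporary tables, the whitepaper defines the RT in terms of n, the cardinality of the table when the plan was compiled:

  • n < 6: RT = 6

  • 6 <= n <= 500: RT = 500

  • n > 500: RT = 500 + 0.20 * n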

For our case, N=8 at the first execution, so the RT was set to 500: the stats and the plan should recompile once our modification counter reaches 508 (8 + 500).


Let's run the SP five times.
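In SSMS this is a one-liner, since GO accepts a repeat count:

```sql
EXEC dbo.GetBadgesByLevel @uLevel = 5;
GO 5
```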

We reached the RT, but the stats and histogram did not change.

The execution plan is still the same; nothing changed.

I know what you are thinking right now: we are dealing with parameter sniffing, so we should recompile the plan. Alright, let's update the stored procedure with a hint (RECOMPILE) at the query level.
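A sketch of the updated procedure, with the hint on the query that consumes the temporary table:

```sql
CREATE OR ALTER PROCEDURE dbo.GetBadgesByLevel
    @uLevel INT
AS
BEGIN
    CREATE TABLE #Users (uId INT NOT NULL);

    INSERT INTO #Users (uId)
    SELECT uId FROM dbo.MainUsers WHERE uLevel = @uLevel;

    SELECT ub.uId, ub.bName
    FROM #Users AS u
    INNER JOIN dbo.UserBadges AS ub ON ub.uId = u.uId
    ORDER BY ub.bName
    OPTION (RECOMPILE); -- recompile this statement on every execution

    SELECT OBJECT_ID('tempdb..#Users') AS Obj_ID;

    DROP TABLE #Users;
END;
```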

Now let's run our first test one more time, this time using OPTION (RECOMPILE).

We see the same stats and the same plan as before; everything here is good since we cleared the cache.

Now let's see what happens when we run this stored procedure with uLevel=5, the problematic parameter.


  1. The stats are way off, but we did not recompile this query.

  2. Cardinality is fine now; this is why the plan changed and it is now using a Hash Match.

  3. There is a warning for missing stats. How come? Yeah, we are still using the old stats.

  4. The stats are way off for the Hash Match, so everything after it is going to be just trash; this is because there are no stats for uLevel=5.

  5. The stats are the same even though we reached 508 for the RT.

  6. There is even a missing-index warning/request.

So we have reached the RT, but this is not enough to cause a recompilation. In order to test that, let's run the same procedure one more time.

Now the stats were updated and the plan with the query hint is working fine.

But unfortunately, this is not the solution: if you run the stored procedure with a different parameter, you will see the same behavior until the stats are recompiled again. Please see below.


So what's the solution?

There are several workarounds to avoid this behavior, but I do not want to keep all the fun to myself, so I am going to list a couple of things that I tried and that worked like a charm; you have to decide which one to use in your environment.


This list requires the query hint (RECOMPILE):

  • Create a nonclustered index on the temporary table; this must be done after the table is created.

  • Update the stats of the temporary table inside the stored procedure (see the sketch right after this list).
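To illustrate the second item, here is the sketch mentioned above: refreshing the temporary table's statistics inside the procedure, combined with the RECOMPILE hint:

```sql
CREATE OR ALTER PROCEDURE dbo.GetBadgesByLevel
    @uLevel INT
AS
BEGIN
    CREATE TABLE #Users (uId INT NOT NULL);

    INSERT INTO #Users (uId)
    SELECT uId FROM dbo.MainUsers WHERE uLevel = @uLevel;

    UPDATE STATISTICS #Users; -- refresh the cached stats before the join

    SELECT ub.uId, ub.bName
    FROM #Users AS u
    INNER JOIN dbo.UserBadges AS ub ON ub.uId = u.uId
    ORDER BY ub.bName
    OPTION (RECOMPILE);

    DROP TABLE #Users;
END;
```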

Other stuff that I tested:

  • Dynamic SQL.

  • SP executed WITH RECOMPILE.

  • Trace Flag 2371, but since our table is too small, you are not going to notice a difference.

Bonus.

We have been using uLevel=5 to trigger this behavior; if you use uLevel=4 instead, you are going to see a new friend join the party.



If you have any questions or suggestions, please let me know in the comments.

Enjoy!!

Victor Escott, Apr 13, 2019

This post is part 2 in a series about physical join operators; please be sure to check out the rest of the series.


The Nested Loops operator is one of the four operators in charge of combining data and generating an output stream.


The Nested Loops operator is by far the simplest algorithm. Let's take a look at its visual appearance.

How to read the plan above?

Everyone has their own way to read execution plans, but generally we read them from right to left, following the data flow arrows. In the image above we see two data flow arrows passing data to the Nested Loops: the one on top is called the Outer Input and the one on the bottom is called the Inner Input. To the left of the Nested Loops we see the output stream (the combined data).


How does it work?

The logic is very simple to understand, since it works like two nested cursors: the outer cursor iterates over the data from the Outer Input, and the nested cursor iterates over the data from the Inner Input. Take the first value from the Outer Input and compare it to each value in the Inner Input; once that inner loop is completed, take the next value from the Outer Input and compare it to each value in the Inner Input, and so on. Please see below.
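To make the two-nested-cursors analogy concrete, here is a didactic T-SQL emulation of the algorithm; the table and column names are hypothetical, and of course the real operator runs inside the engine, not as cursors:

```sql
DECLARE @OuterKey INT, @InnerKey INT;

DECLARE OuterInput CURSOR FOR SELECT JoinKey FROM dbo.OuterTable;
OPEN OuterInput;
FETCH NEXT FROM OuterInput INTO @OuterKey;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- For every outer row, walk the whole inner input.
    DECLARE InnerInput CURSOR FOR SELECT JoinKey FROM dbo.InnerTable;
    OPEN InnerInput;
    FETCH NEXT FROM InnerInput INTO @InnerKey;

    WHILE @@FETCH_STATUS = 0
    BEGIN
        IF @OuterKey = @InnerKey
            PRINT CONCAT('Match: ', @OuterKey); -- emit a combined row
        FETCH NEXT FROM InnerInput INTO @InnerKey;
    END;

    CLOSE InnerInput;
    DEALLOCATE InnerInput;

    FETCH NEXT FROM OuterInput INTO @OuterKey;
END;

CLOSE OuterInput;
DEALLOCATE OuterInput;
```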


Can Nested Loops seek?

Yes. In fact, there are several optimizations that allow Nested Loops to seek directly to the rows it needs, for example (but not limited to):

  • Clustered Index

  • Nonclustered Index

  • Spool

Do you wonder how this works? Please see below.
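As a hypothetical repro, reusing the lab tables from the previous post: with a nonclustered index on the inner table's join column, the operator can seek instead of scanning the inner input (the LOOP JOIN hint is there only to pin the algorithm for the demo):

```sql
CREATE NONCLUSTERED INDEX IX_UserBadges_uId ON dbo.UserBadges (uId);

SELECT mu.uId, ub.bName
FROM dbo.MainUsers AS mu
INNER JOIN dbo.UserBadges AS ub ON ub.uId = mu.uId
WHERE mu.uLevel = 6
OPTION (LOOP JOIN);
```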



SQL Server still loops through the Outer Input one row at a time, but it no longer loops through the whole Inner Input; instead, the seek was executed four times (once for each row in the Outer Input).


Considerations.

  • If the Outer Input is small (low cardinality) and the Inner Input is cheap to loop over or seek into (low cost), then Nested Loops will be very efficient.

  • The right indexes allow seeking through the Inner Input.

  • Under normal circumstances the Nested Loops is a non-blocking operator.

Last but not least, I want to clarify that the main purpose of this post is to explain, in the simplest way possible, how the Nested Loops operator does its job; this is why I did not cover other cool stuff (dynamic vs. static Inner Inputs, prefetching, rebinds and rewinds, etc.).


Enjoy!!

