top of page
Search

A few days ago, I was asked to look at stored procedure that started to have problems, at first glance it seemed to be the typical problem with parameter-sniffing, so I was not excited at all.


Man, how wrong I was; indeed, parameter-sniffing was playing a role, but then I noticed something that blow my mind, some statistics were way off! Yeah, I know what you are thinking, but what if I tell you that the statistics that I am talking about it’s for temporary table (#NowWeAreTalking).


Here is a replication of what I saw.

The statistics are way off even for the insert statement, SQL estimated 8 rows when 250 were inserted, the same statistics were used to choose between Nested Loops and Hash Match, you know who won and why the SORT spilled to tempdb.


Temporary tables support statistics, this is the main advantage over variable tables.

Awesome, but what happened here?


The statistics of the temporary table were cached, and are being reused every time that the stored procedure is execute. Say what? Yes, they are reused every single time, even if you explicit Drop the temporary table.


The Lab.


Let’s replicate the issue, in order to do so, we must create two tables:

Let’s review the data, so we know what we are dealing with.

Results are shown below:

The table “MainUsers” has 508 rows and a skewed distribution based on “uLevel” (parameter-sniffing).
The table “UserBadges” has 110,100 rows, this table contains badges per user.

Now that we have the data in place, let’s create a stored procedure.

This procedure returns a list of badges filtered by uLevel, with the uLevel passed in as a parameter, also returns the object_id for the temporary table that was created. For example:

Results are shown below:


Now, let's review what happened with the temporary table that we created, in order to do so, we are going to use the Obj_ID to retrieve the table info, statistics and histogram. For example:

Results are shown below:

Notice the following interesting details:

  • The temporary table was renamed to "#A6928DC1", it was no dropped, therefore now it's in cache.

  • There is a relationship between the name and obj_id, the name = Hex signed 2's complement.

  • Stats were created and the modification counter is set to 25, there is a lot going on with this counter, so let's keep an eye on this.

  • We are going to use this script several times, so keep it in a separate window.

Now, let's start the party!


Executing this stored procedure with uLevel=6 produces the following results, please noted that we have to clear the cache for this execution.

No data was returned, but temporary table was created. Obj_id=-1452344724



Let's review the temporary table, notice that stats were created based on the parameter (6).If we go back to our initial analysis, we can corroborate that number of users with uLevel=6 is 8.


The execution plan is good and stats are amazing.


Executing this stored procedure with uLevel=5 produces the following results, please noted that we did not clear the cache for this execution.

Now data was returned and temporary table is the same. Obj_id=-1452344724


Let's review the temporary table, notice that stats were created based on the parameter (6) and have not changed after this execution, but there is something new, the modification counter has increased by 500.

The execution plan is the same (parameter-sniffing), but the stats are a mess and the sort spilled to tempdb. Now we can confirm that the stats are cached and reused.

Interesting, let's run the same stored procedure multiple times and see if stats are updated once the RT(Recompilation Threshold ) is reached, but first we have to calculate the RT.


In order to do so, let's follow this formula that I found here ( Plan Caching and Recompilation in SQL Server 2012 ).

For our case, the RT is 508 --> N=8 (first execution), this means that RT was set to 500, so stats and plan will recompile when our modification counter reaches that number.


Let's run the SP five times.

We reached the RT, but the stats and histogram did not change.

Execution plan still the same, nothing changed.

I know what you are thinking right now, we are dealing with parameter-sniffing, so we should recompile the plan, alright let's update the stored procedure with a hint (RECOMPILE) at query level.

Now let's run our first test one more time, now using OPTION(RECOMPILE).

We see the same stats and same plan as before, everything here is good since we cleared the cache.

Now let's see what happens when we run this stored procedure with uLevel=5, this was the problematic parameter.


  1. Stats are way off, but we did not recompile this query.

  2. Cardinality is fine now , this is why the plan changed and now it's using a Hash Match.

  3. There is a warning for missing stats, how come? Yeah, we are still using the old stats.

  4. The stats are way off for the Hash Match, so everything after that it's going to be just trash, this is because there is not stats for uLevel=5.

  5. Stats are the same even when we reached the 508 for RT.

  6. There is even a missing index warning/request.

So we have reached the RT, but this is not enough to cause recompilation, in order to test it, let's run the same procedure one more time.

Now the stats were updated and the plan with the query hint is working fine.

But unfortunately, this is not the solution, since if you run the stored procedure with a different parameter , you will see the same behavior until the stats are recompiled again. Please see below.


So what's the solution?

There are several workarounds to avoid this behavior, but I do not want keep all the fun for myself, so I am going to list a couple of things that I tried and worked like a charm, but you have to decide which one to use in your environment.


This list requires the query hint (RECOMPILE).

  • Create a non cluster index on temporary table, this must be done after creation.

  • Update stats for temporary table (inside the stored procedure).

Other stuff that I tested:

  • Dynamic SQL.

  • SP executed WITH RECOMPILE

  • Trace Flag 2371, but since our table it's too small you are not going to notice difference.

Bonus.

We have been using the uLevel =5 to trigger this behavior, if you use uLevel=4 instead, then you are going to see a new friend who joined the party.



If you have any questions or suggestion, please let me know in the comments.

Enjoy!!

52 views0 comments
Writer's pictureVictor Escott

This post is part 2 in a series about physical join operators, please be sure to check out:


The Nested Loops operator is one of four operators in charge of combine data and generate an output stream.


The Nested Loops operator is by far the simplest algorithm, let’s take a look at its visual appearance.

How to read plan above?

Everyone has their own way to read execution plans, but generally we have to read execution plans from right to left, following the data flow arrows. Therefore, in the image above we see two data flows arrows passing data to the Nested Loops, the one in the top is called Outer Input the one in the bottom is called Inner Input, then to the left of the Nested Loops we see the output stream (the combined data).


How it works?

The logic is very simple to understand, since it’s like two nested cursors, the main cursor is the data from the Outer Input and the nested cursor is the data from the Inner Input. So take the first value from the Outer Input, and compare it to each value in the Inner Input (nested cursor), once the first loop is completed then take next value from Outer Input and compare to each value in the Inner Input, and so on. Please see below.


Can Nested Loops seek?

In fact, there are several optimizations that allow a Nested Loops to seek directly to the rows, for example (but not limited to):

  • Cluster Index

  • Nonclustered Index

  • Spool

Do you wonder how this works? Please see below.



SQL Server still loops through the Outer input one row at a time, but it does not loop through the Inner input anymore, however, the seek was executed four times (for each row in the Outer Input)


Considerations.

  • If Outer Input is small (low cardinality) and Inner Input is cheap to loop or seek (low cost), then Nested Loops will be very efficient.

  • Right indexes allow to seek through the Inner Input.

  • Under normal circumstances the Nested Loop is a not blocking operator.

Last, but not least, I want to clarify that the main purpose of this post is to explain in the simplest way possible the way in which the Nested Loops operator does it job, this is the reason why I did not cover other cool stuff( Dynamic vs Static Inner inputs, Prefetching, Rebind and Rewinds, etc).


Enjoy!!

51 views0 comments
Writer's pictureVictor Escott

Esta publicación es la segunda parte de una serie sobre operadores de unión física, asegúrese de revisar:


El Nested Loops es uno de los cuatro operadores a cargo de combinar datos y generar un flujo de salida.


El operador Nested Loops es, con mucho, el algoritmo más simple, echemos un vistazo a su apariencia visual.

¿Cómo leer el plan de arriba?

Cada persona tiene su propia forma de leer los planes de ejecución, pero generalmente tenemos que leer los planes de ejecución de derecha a izquierda, siguiendo las flechas de flujo de datos. Por lo tanto, en la imagen de arriba vemos dos flechas de flujos de datos que pasan datos al Nested Loops, la flecha que está en la parte superior se llama Outer Input, la que está en la parte inferior se llama Inner Input, luego a la izquierda de el operador Nested Loops vemos la salida final (los datos combinados).


¿Cómo funciona?

La lógica es muy simple de entender, ya que es como tener dos cursores anidados, el cursor principal son los datos de la entrada externa (Outer Input) y el cursor anidado son los datos de la entrada interna (Inner Input). Entonces, tome el primer valor de la entrada externa (Outer Input) y compárelo con cada valor de la entrada interna (cursor anidado), una vez que se complete el primer bucle, tome el siguiente valor de la entrada externa (Outer Input) y compárelo con cada valor de la entrada interna (Inner Input). en. Por favor ver más abajo.


¿Los bucles anidados pueden hacer Seek?

De hecho, hay varias optimizaciones que permiten que el operador Nested Loops busque directamente las filas, por ejemplo (pero sin limitarse a):

  • Cluster Index

  • Nonclustered Index

  • Spool

¿Te preguntas cómo funciona esto? Por favor ver más abajo.



SQL Server aún tiene que recorrer la entrada externa (Outer Input) una fila a la vez, pero ya no lo hace en la entrada interna (Inner Input), sin embargo, la búsqueda se ejecutó cuatro veces (para cada fila en la entrada externa "Inner Input")


Consideraciones.

  • Si la entrada externa es pequeña (cardinalidad baja) y la entrada interna tiene un bajo costo, entonces el operador Nested Loops será muy eficiente.

  • Los índices correctos permiten buscar a través de la entrada interna.

  • En circunstancias normales, el operador Nested Loops no es un operador que cause bloqueo.

Por último, pero no menos importante, quiero aclarar que el propósito principal de este post es explicar de la manera más simple posible la forma en que el operador Nested Loops funciona, esta es la razón por la que no cubrí otras cosas interesante( Dynamic vs Static Inner inputs, Prefetching, Rebind and Rewinds, etc).


¡Diviértanse aprendiendo!

368 views0 comments
bottom of page