BorderLayoutBoxedLayoutOpenLayoutMaximum textMedium textSmall text


Register
Thursday, March 11, 2010
MyStreamMinimize
Print  

Entries for the 'query optimizer' Category

Key Lookup Threshold Part 2

Posted by Jason Massie Click to IM Jason Massie on Monday, February 09, 2009 at 7:57 AM to query optimizer, SQL performance tuning
1077 Views | 1 Comments | Article Rating

Grant Fritchey(@gfritchey) posed this question to me on twitter after my posted on the key lookup threshold post.

@statisticsio Interesting post. Do you think that threshold is dependent on the data involved? Or maybe on the size of the key?8:28 AM Feb 5th from TwitterGadget in reply to statisticsio

I kind of assumed so which is why I did char(1000). However, I have just tested with char(1). The numbers are interesting. Please refer to the original post for repro code.

 

 IOIO
RowsCHAR(1000)CHAR(1)
Scan4229
1 row44
5 rows1212
10 rows2222
25 rows5252
50 rows114102
100 rows217202
250 Rows526502
500 Rows10431003
1000 Rows41414007

 

The threshold is crossed much earlier because the scan is smaller. This is a small table especially with CHAR(1) so take this test with a grain of salt. Run your own tests when you are working with large production data.

I would also venture to guess that as the complexity of the query increases the threshold gets lower especially on more complex queries like when grouping especially when grouping and outer joining.

A covering index is probably the best solution in most cases unless you need to touch all rows even then, it might be better,.

email it! |   |   | 

Key Lookup Threshold

Posted by Jason Massie Click to IM Jason Massie on Thursday, February 05, 2009 at 8:05 AM to SQL Server 2008, SQL Server 2005, query optimizer, SQL performance tuning, Indexes
2292 Views | 1 Comments | Article Rating

Gail talks about bookmark lookups…. err.. key lookups in this post. So are they good or bad? Well, like many things in SQL, it depends. The main factor is the number of rows returned. A few rows are fine but the cost rises sharply with larger result sets. There comes a point where the threshold is crossed and a scan is more efficient. This is because a scan leverages sequential IO while a lookup does random IO.

Here are the results of the code at the end of the post.

RowsIO 
Scan422 
1 row4 
5 rows12 
10 rows22 
25 rows52 
50 rows114 
100 rows217 
250 Rows526<--Threshold
500 Rows1043 
1000 Rows4141 

 

As you can see, at 250 rows, we have crossed the threshold and it is cheaper do a scan. If you are passing in a literal, the optimizer can detect this and switch to a scan. If it is a stored proc or parameterized SQL, a plan is cached the first time it is run. Problems happen when the result size greatly varies depending on the parameter. There are ways around this all with their pro’s and con’s. Here are some:

  • A covering index.
  • A statement level recompile hint
  • Plan guides and hints

Here is the sample code that can repro these numbers on SQL Server 2008.

CREATE TABLE #temp
(
id INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
c1 CHAR(1000) DEFAULT( 'blah'),
c2 INT
)
INSERT INTO #temp(c2)
VALUES
(1)
GO 
INSERT INTO #temp(c2)
VALUES
(5)
GO 5
INSERT INTO #temp(c2)
VALUES
(10)
GO 10
INSERT INTO #temp(c2)
VALUES
(25)
GO 25
INSERT INTO #temp(c2)
VALUES
(50)
GO 50
INSERT INTO #temp(c2)
VALUES
(100)
GO 100
INSERT INTO #temp(c2)
VALUES
(250)
GO 250
INSERT INTO #temp(c2)
VALUES
(500)
GO 500
INSERT INTO #temp(c2)
VALUES
(1000)
GO 1000
INSERT INTO #temp(c2)
VALUES
(1000)
GO 1000

CREATE INDEX ix ON #temp(c2)

--The baseline
SET STATISTICS io ON
SELECT
*
FROM #temp WITH (INDEX=1)
WHERE c2 = 1

--1 row returned
SELECT *
FROM #temp
WHERE c2 = 1

--5 rows returned
SELECT *
FROM #temp
WHERE c2 = 5

--10 rows returned
SELECT *
FROM #temp
WHERE c2 = 10

--25 rows returned
SELECT *
FROM #temp
WHERE c2 = 25

--50 rows returned
SELECT *
FROM #temp
WHERE c2 = 50

--100 rows returned
SELECT *
FROM #temp
WHERE c2 = 100

--250 rows returned
--Must begin using hints because the optimizer can tell that a scan is better
SELECT *
FROM #temp WITH (INDEX=ix, forceseek)
WHERE c2 = 250

--1000 rows returned
--Must begin using hints because the optimizer can tell that a scan is better
SELECT *
FROM #temp WITH (INDEX=ix, forceseek)
WHERE c2 = 1000

email it! |   |   | 

Finding Index Scans due to Implicit Conversions

Posted by Jason Massie Click to IM Jason Massie on Monday, January 12, 2009 at 7:00 PM to SQL Server 2008, SQL Server 2005, query optimizer, SQL performance tuning, tsql, Indexes, Procedure Cache, DMVs
2461 Views | 8 Comments | Article Rating

It is fairly widely known that implicit conversions of data types can cause scans. However, you may not be able to control what goes in your database, you may have inherited an app and we just are not perfect.

On top of that some can be tricky. For example, did you know that scope_identity and @@identity both return numeric data types. An AVG of a SMALLINT column returns an INT.

Here is a simple repro which the same query does an index scan instead of an index seek

CREATE TABLE t1
(
id INT IDENTITY(1,1) PRIMARY KEY CLUSTERED
, c1 VARCHAR(100)
)

GO

INSERT INTO t1(c1)
SELECT 1
GO 10000

INSERT INTO t1(c1)
SELECT 0
GO 10

CREATE INDEX ix ON t1 (c1)

CREATE PROC p1
@blah NVARCHAR(100)
AS
SELECT
id FROM t1
WHERE c1 = @blah

EXEC p1 N'0'

CREATE PROC p2
@blah VARCHAR(100)
AS
SELECT
id FROM t1
WHERE c1 = @blah

EXEC p2 '0'

DROP TABLE t1
DROP PROC p1
DROP PROC p2

 

I found a sample query from Umachandar Jayachandran and made some mod’s to it. It basically scans your procedure cache for index scans containing implicit conversions.

I have posted it at the SQLServerPedia Wiki here. It is the 2nd query listed.

Note: Implicit conversions may not always result in a scan.

email it! |   |   | 

Why do I have to use a MERGE\HASH JOIN Hint?

Posted by Jason Massie Click to IM Jason Massie on Saturday, November 15, 2008 at 8:04 PM to SQL Server 2008, SQL Server 2005, query optimizer, SQL performance tuning
2444 Views | 4 Comments | Article Rating

Hash match operator iconI have run into this several times lately so I thought I would whip out a quick post.

The scenario

A query runs much faster with a hash join hint than a nested loop but that is what the optimizer is choosing.

High level background

Merge and hash are good when you are working with lots of rows because it can leverage sequential IO. If you do a nested loop join with lookups on 100-1000+ rows, the random IOs can be 10-1000% more expensive than a sequential scan.

Why?

So why does the optimizer choose a nested loop? I have seen it is because the optimizer had to make a best guess at how make rows were being returned. It most cases, the optimizer guesses 1 row will be returned so a nested loop is the best choice as far as the optimizer can tell.

Identifying the problem 

Identifying the problem is pretty easy. All you need to do is look at the query plan. It can be text, xml or graphical. Look actual and estimated rows returned. If the actual rows are 11teen,000,000 but the estimated rows is 1 then this may be the problem

Causes

I most often see this when large table variables, local variables or modified stored procedure parameters are involved. There can be other causes. See my favorite white paper.

Fix?

Easy. Don’t use table variables and local variables. If you have to modify sp parameters, pass them to a nested stored proc. Make sure stats are up to date. If the WHERE cause is complex, try putting the logic in a persisted computed column or indexed view. Another option is to use a covering index. Of course, the last resort is the JOIN hint.

Need more?

Want to get real deep on stuff like this? Check out Craig Freedman’s blog or his Inside SQL Server 2005 chapters.

email it! |   |   | 

5 Quick Tips for the Query using the “Wrong” Index

Posted by Jason Massie Click to IM Jason Massie on Thursday, October 30, 2008 at 10:59 PM to SQL Server 2008, SQL Server 2005, query optimizer, Indexes
1546 Views | 1 Comments | Article Rating

I quote “wrong” because with the info the optimizer has, it thinks it has the right index. :)

1. Look for cardinality problems. You can do this by looking at the actual rows versus estimated rows in the query plan. If they are off, chances are you are not following the best practices listed here.

2. Conversions in the WHERE clause can cause issues. Be sure to look at query plan for implicit conversions

3. Complicated logic in the WHERE clause can also cause problems. Consider indexed persisted computed columns or indexed views.

4. Constraints give the optimizer more info when choosing a plan with the “right” index. It can take a PK or unique constraint into consideration but it does not know about similar logic in a trigger.

5. Large datasets are being returned but the index is not covering. An index with just the columns in the WHERE clause is ok if just a few rows are returned but it does not take much for an index scan to be more efficient.

#1 is the most important one because the link gives 5 or 7 more reason why the “wrong” index is chosen.

P.S. If all else fails, update stats and if that does not work, make the index covering.

email it! |   |   | 

NTSSUG June Presentation

Posted by Jason Massie Click to IM Jason Massie on Thursday, July 17, 2008 at 2:37 PM to query optimizer, SQL performance tuning, Indexes, PASS, Data collection
1201 Views | 0 Comments | Article Rating

Here are the slides and scripts from my June presentation at the North Texas SQL Server Users Group. The presentation was on the Data Collector in SQL Server 2008 but it was short so I also touched on filtered indexes, query hashes and troubleshooting. Enjoy and let me know if you have any questions.

Also Peter Debetta is presenting TONIGHT at 6:30 on the following topics:

"We'll be discussing the various ways you can secure your SQL Server data via encryption, including the new Transparent Data Encryption feature of SQL Server 2008."

 Stop by.

email it! |   |   | 

CXPACKET, MAXDOP and your OLTP system

Posted by Jason Massie Click to IM Jason Massie on Thursday, June 26, 2008 at 5:15 PM to SQL Server 2005, query optimizer, SQL performance tuning, tsql, DMVs
7094 Views | 6 Comments | Article Rating

So you are experiencing CXPACKET wait types? If you run a google search, you will quickly find out you are experiencing the dreaded SQL Server parallelism problem and you must reduce MAXDOP to 1. While that is possible, I say most likely not. I say that, most of the time, SQL is doing the best it can with what it has to work with. That is the query, the data and the schema.

Microsoft has come a long way since version <= 7.0 when it comes to parallelism. If your OLTP queries follow best practices and are well indexed, they probably will never generate a parallel plan. This is because they are fast and access a small amount of rows. If they are missing indexes or SQL overestimates cardinality, SQL might decide to do scans, sorts, hashes, spools etc. These iterators, among others, can go parallel to reduce execution time at the cost of system resources. These iterators are not bad and they do have their place. It just isn’t on OLTP type of queries most of the time.

So dropping the MAXDOP on an OLTP system to 1 probably won’t hurt much because most of the time there is an IO bottleneck. If the CXPACKET wait types are a symptoms of poor indexing and row count estimation, it won’t help either. Whole books have been written on indexing, query tuning and there is a nice whitepaper on stats best practices to avoid estimation problems so I am not going to go into that. I will give you some pointers on quickly identifying whether the CXPACKET is a symptom or the cause.

So your boss and boss's boss run over to your desk yelling about the customer complaints of slowness with SQL. You run this query or something similar:

select r.cpu_time 
, r.logical_reads
, r.session_id 
into #temp
from sys.dm_exec_sessions as s 
inner join sys.dm_exec_requests as r 
on s.session_id =r.session_id --and s.last_request_start_time=r.start_time
where is_user_process = 1 
 
waitfor delay '00:00:01'
 
select substring(h.text, (r.statement_start_offset/2)+1 , ((case r.statement_end_offset when -1 
then datalength(h.text)  else r.statement_end_offset end - r.statement_start_offset)/2) + 1) as text
, r.cpu_time-t.cpu_time as CPUDiff 
, r.logical_reads-t.logical_reads as ReadDiff
, p.query_plan
, r.wait_type
, r.wait_time
, r.last_wait_type
, r.wait_resource
, r.command
, r.database_id
, r.blocking_session_id
, r.granted_query_memory
, r.session_id
, r.reads
, r.writes
, r.row_count
, s.[host_name]
, s.program_name
, s.login_name
from sys.dm_exec_sessions as s 
inner join sys.dm_exec_requests as r 
on s.session_id =r.session_id and s.last_request_start_time=r.start_time
full outer join #temp as t on t.session_id=s.session_id
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) h
cross apply sys.dm_exec_query_plan(r.plan_handle) p
order by 3 desc
 
drop table #temp
 

You notice that you have several rows with CXPACKET wait types. With this query, you click the xml show plan link and:
1.    search the xml for missing indexes.
2.    Save as a .sqlplan and reopen in SSMS
3.    Compare estimated and actual rows in the iterators on the right side of the plan. Poor estimates may bubble to the left as well.
4.    If there are no missing indexes, estimates and actual are fairly close, reducing MAXDOP may help if it is not a huge report or query.

5. If there are missing index or bad estimations, fix it! :)


Of course, these are not rules set in stone. Just a style in the art of database administration. Just don't blame it on a "parallelism bug" because it is a poor musician that blames his instrument.

One thing to note, if you do turn down MAXDOP server-wide, turn in on at the query level(enterprise edition) on your index operation because they are optimized for it.
Alter index all on tblBlah rebuild with (maxdop=32);

email it! |   |   | 

RE: Filtered indexes not compatible plan reuse

Posted by Jason Massie Click to IM Jason Massie on Friday, March 21, 2008 at 11:40 PM to SQL Server 2008, query optimizer, Indexes, CTP6, Procedure Cache
1742 Views | 1 Comments | Article Rating

Before the original post, I had submitted a connect item for something similar.

It turns out that filtered indexes may not be used when auto-parameterization(or for stored proc) occurs without a recompile hint. This is sort of understandable but I can think of workarounds and I suspect additional logic will be added in the future as mentioned below. Here is the feedback from a connect item I submitted.

"

Thanks for your feedback. The query in question, select lastname from Person.Contact where ContactID <=(100-20) is auto-parameterized by SQL Server into the following form, using the standard auto-parameterization rules: select lastname from Person.Contact where ContactID <=(@1-@2) These rules are designed to minimize compile time cost for simple queries like this, possibly at the expense of additional optimizations such as using a filtered index. Admittedly, it is a difficult tradeoff. In a future release, we'll consider extending the design to make a better decision in a cost-based way. For now, this behavior is by design.

"

alt head: You can't have your cake and eat it too.

email it! |   |   | 

Filtered indexes not compatible plan reuse??

Posted by Click to IM Jason Massie on Sunday, March 16, 2008 at 5:31 AM to SQL Server 2008, query optimizer, Indexes, CTP6, Procedure Cache
2449 Views | 0 Comments | Article Rating

So I was messing around with filtered indexes tonight. Yes, beta software is my idea of a wild Saturday night. Hey, I played some Smash Bros first. ;) I was trying to figure out what happens if a plan is created using a covering but filtered index and try to reuse it with a parameter that is covered vertically but not horizontally. Let's look at this example: 

create database dbtest01;

--Force parameterization

alter database dbtest01 set parameterization forced;

 

use dbtest01

--create test table 1

create table t1

(id int identity(1,1),

c1 int,

c2 int)

 

--insert dummy data

declare @ctr int = 2000000

while @ctr > 0

begin

insert into t1(c1, c2) values (@ctr, @ctr)

select @ctr=@ctr-1

end

 

 

--create clustered indexes

create clustered index CI on t1(id)

 

 

--Let create a covering indexes that is pretty extreme.

create index ix01 on t1(c1) where c1 = 1990001

 

 

--Test queries

--ctrl + m

dbcc freeproccache

set statistics io on

 

--Does a CI scan.

select c1 from t1 

where c1 = 1990001

 

--Nope, index hints returns an error

--Query processor could not produce a query plan because of the hints defined in this query. Resubmit the query without specifying any hints and without using SET FORCEPLAN.

select c1 from t1  with (index=ix01)

where c1 = 1990001

 

--Ahh, recompile works. Now it uses the filtered index

select c1 from t1 

where c1 = 1990001

option(recompile)

 

 

--What about simple parameterization and a stored proc?

alter database dbtest01 set parameterization simple;

 

create proc proc1

@p1 int

as

select c1 from t1 

where c1 = @p1

 

--It does the CI scan

exec proc1 1990001

 

 

alter proc proc1

@p1 int

as

select c1 from t1 

where c1 = @p1

option(recompile)

 

--It uses the filtered index :)

exec proc1 1990001

email it! |   |   | 

TechEd 2008

Posted by Jason Massie Click to IM Jason Massie on Tuesday, February 19, 2008 at 11:10 PM to SQL Server 2008, SQL Server 2005, query optimizer, WMI, TechEd
1030 Views | 0 Comments | Article Rating

Two of my session submissions got accepted for the TechEd Developer conference database track.

  • Troubleshooting Query Plans Gone Wrong
  • Windows Management Instrumentation(WMI) for the Command Line DBA

I am really pumped up about this. I have seen the other sessions that they have selected so far and they look great. I highly suggest you check this out! Orlando over the Memorial Day Weekend followed by some hardcore tech goodness. That sounds like a great vacation.

email it! |   |   | 

Never use table variables?

Posted by Jason Massie Click to IM Jason Massie on Wednesday, January 30, 2008 at 4:20 AM to SQL Server 2005, query optimizer, SQL performance tuning
1494 Views | 0 Comments | Article Rating

Itzik Ben-Gan describes a method of analyzing trace files in his book "Inside Microsoft SQL Server 2005: T-SQL Querying" He provides a couple functions for stripping the literals from queries in the TextData column. If you do not use one of these functions or one of your own similar tools, you are not being very effective in your trace analyisis. The first one is from Microsoft PSS and can be downloaded here. The second one is a CLR function that uses regex. It performs much better. His book is worth it just for this function.

Here is the concept: Tuning single queries can be insane ineffective for multiple reasons. When facing a performance problem, you should be looking at query patterns that way to can get the biggest bang in the shortest amount of time.

Here is a simple trace analysis query using the PSS function.

-- Generate pattern and order by highest CPU

-- This is long running so you may want to dump it into a table.

--http://statisticsio.com/files/patterns/fn_tsqlsig.sql

SELECT dbo.fn_SQLSigTSQL(textdata, 4000)

, SUM(CPU)

, COUNT(*)

FROM dbo.trace_table

GROUP BY dbo.fn_SQLSigTSQL(textdata, 4000)

order by CPU DESC

 

Check out Itzak's book for more info on trace analysis. In this series on blog's, we will look at applying this concept to data in the DMV.

email it! |   |   | 

The problem with local variables

Posted by Jason Massie Click to IM Jason Massie on Friday, January 25, 2008 at 9:32 AM to SQL Server 2005, query optimizer, SQL performance tuning, Indexes
1238 Views | 0 Comments | Article Rating

Have you ever been writing a query and just cannot get it to use the right index. This could be one of the reasons why. Let's use this query with local variables as our example.

declare @Start datetime
declare @End datetime
select @Start = '2004-08-01 00:00:00.000'
select @End = '2004-07-28 00:00:00.000'
select ProductID from sales.SalesOrderDetail where ModifiedDate >= @End and ModifiedDate <= @Start
It generates this plan:

SQL Server Clustered index scan

but we have an index on ModifiedDate. There are many reason why SQL would not use this index but, for this post, we will assume we have eliminated them. Finally, we hard code the dates and we get this plan.

SQL Server index seek with bookmark lookup

So why is it doing this? The reason is because the query optimizer cannot accurately use the statistics to estimate how many rows are returned with local variables. Let's look at how we can tell there is a problem with the cardinality estimates. In the query with the local variables, the optimizer thinks we are getting 10918.5 rows so we do the index scan. In the query with hard coded literals, the estimated rows and actual rows are the same and accurate.

SQL Server Cardinality underestimation

How can we fix this? There are several way. This is another situation that makes a case for stored procedures or parameterized queries.

create proc pDemo01 @Start datetime, @End datetime
as
select ProductID from sales.SalesOrderDetail
where ModifiedDate >= @End and ModifiedDate <= @Start

The stored proc generates the proper plan. However, you will run into the same problem if you modify the parameter within the stored proc like select @start = @start-90. In this case, to should use sp_executeSQL. What if you cannot use a stored proc because it is a 3rd party app or some other reason? A covering index is probably the solution. Once we create this index, it will always be used:

create index ix01 on sales.SalesOrderDetail(ModifiedDate) include (ProductID)

We could use a plan guide or an index hint with a forceseek(SQL 2008) but performance will be really bad when we really do need to get 10k rows. The same problem can happen with stored proc's but that is another post.

To get deeper into this subject, check out this.

email it! |   |   | 

SQL Server 2008 Plan Guides from Cache

Posted by Jason Massie Click to IM Jason Massie on Friday, January 18, 2008 at 5:47 AM to SQL Server 2008, query optimizer, SQL performance tuning
1313 Views | 0 Comments | Article Rating

Uhoh... I can see some junior developers going crazy with this. One of the things that kept plan guides from being over used was the fact that they are kind of hard :) Well, Microsoft built their empire making hard stuff easy. They do it again with sp_create_plan_guide_from_cache.

Let's look at this BOL sample.

 

USE AdventureWorks;

GO

SELECT WorkOrderID, p.Name, OrderQty, DueDate

FROM Production.WorkOrder AS w

JOIN Production.Product AS p ON w.ProductID = p.ProductID

WHERE p.ProductSubcategoryID > 4

ORDER BY p.Name, DueDate;

GO

-- Inspect the query plan by using dynamic management views.

SELECT * FROM sys.dm_exec_query_stats AS qs

CROSS APPLY sys.dm_exec_sql_text(sql_handle)

CROSS APPLY sys.dm_exec_text_query_plan(qs.plan_handle, qs.statement_start_offset, qs.statement_end_offset) AS qp

WHERE text LIKE N'SELECT WorkOrderID, p.Name, OrderQty, DueDate%';

GO

-- Create a plan guide for the query by specifying the query plan in the plan cache.

DECLARE @plan_handle varbinary(64);

DECLARE @offset int;

SELECT @plan_handle = plan_handle, @offset = qs.statement_start_offset

FROM sys.dm_exec_query_stats AS qs

CROSS APPLY sys.dm_exec_sql_text(sql_handle) AS st

CROSS APPLY sys.dm_exec_text_query_plan(qs.plan_handle, qs.statement_start_offset, qs.statement_end_offset) AS qp

WHERE text LIKE N'SELECT WorkOrderID, p.Name, OrderQty, DueDate%';

 

EXECUTE sp_create_plan_guide_from_cache

    @name =  N'Guide1',

    @plan_handle = @plan_handle,

    @statement_start_offset = @offset;

GO

-- Verify that the plan guide is created.

SELECT * FROM sys.plan_guides

WHERE scope_batch LIKE N'SELECT WorkOrderID, p.Name, OrderQty, DueDate%';

GO

 

--Let's verify it actually worked.

--Click the xml link

--Save as a .sqlplan, reopen in SSMS and then hit f4

set statistics xml on

go

SELECT WorkOrderID, p.Name, OrderQty, DueDate

FROM Production.WorkOrder AS w

JOIN Production.Product AS p ON w.ProductID = p.ProductID

WHERE p.ProductSubcategoryID > 4

ORDER BY p.Name, DueDate;

GO

So when would you use this? I would say hardly ever hopefully. However, it could solve the really tough problems.

Let's say you have a 3rd party application that generates adhoc dynamic SQL. You cannot modify the code or schema. Index changes are not supported. Sometimes parameter sniffing causes unpredictable performance. Sound like a nightmare? Welcome to most CRM apps.

Other scenarios that come to mind are when best practices are not or cannot be followed. Let's say you just cannot update stats often enough with a large enough sample on a very very VERY large table to get a consistently optimal plan. Use a plan guide!

Here are some other times that the optimizer might have trouble and a plan guide may be a good option.

  • Use of local variables
  • Modifying stored proc parameters.
  • Ascending keys
  • Complex queries with table variables

There are usually better solutions than plan guides so save them for times that best practices are not an option. sp_create_plan_guide_from_cache makes using plan guides so much easier. Put it in your toolbox!

 

 

email it! |   |   | 

Queries using a lot of memory using dm_exec_query_memory_grants and dm_exec_sql_text

Posted by Jason Massie Click to IM Jason Massie on Thursday, December 06, 2007 at 11:56 PM to SQL Server 2005, query optimizer, SQL performance tuning, memory bottleneck
1579 Views | 0 Comments | Article Rating

Here is query I relied on heavily while troubleshooting a customer with  resource_semaphore_query_compile wait types. It is useful in other low memory conditions.

select text, query_plan, requested_memory_kb, granted_memory_kb, used_memory_kb from sys.dm_exec_query_memory_grants MG
CROSS APPLY sys.dm_exec_sql_text(sql_handle)  t
CROSS APPLY sys.dm_exec_query_plan(MG.plan_handle)

High granted memory and used memory is what I looked at. Once you find the culprits, you can look through the plan for the usual suspects like hash joins. Just click on the xml link and save and a .sqlplan. Reopen it in SSMS.

email it! |   |   | 

The forceseek hint in SQL Server 2008

Posted by Jason Massie Click to IM Jason Massie on Thursday, December 06, 2007 at 10:38 PM to Windows Server 2008, query optimizer, SQL performance tuning, tsql, Indexes
1291 Views | 0 Comments | Article Rating

Have you every had a query that should totally be using an existing index and doing a seek but it keeps doing a clustered index scan? In SQL Server 2008, you can force it to do a seek. That may or may not be a good thing. Let's take a look at an AdventureWorks example.

--logical reads 1246

select ProductID, OrderQty from sales.SalesOrderDetail

where ProductID > 776

This query generates this plan.


Now let's add the hint.

--logical reads 271264

select ProductID, OrderQty from sales.SalesOrderDetail with (forceseek)

where ProductID > 776


This query generates this plan.

The second query plan contains the index seek but logical reads are 20 times more. This is because it has to do lookup for OrderQty. Obviously using the hint would be a bad thing.

So when would be a good time to use this hint? Let's look at this query
create proc p1
@p int
as
select c1, c2 from t1 where c3 = @p

t1 has 1 million rows
99% of c3 =1
.5% of c3 = 2
.5% of c3 = 3
99% of the queries run with @p = 2 or 3

The problem is if the query compiles with c3 = 1. The optimizer will choose a CI scan. In SQL 2005, we could recompile each time at the statement level or we could use an optimize for hint. Now we have another tool in the bag of tricks.

Use with caution.

email it! |   |   | 

Take it a step further with the unused and missing index DMV's

Posted by Jason Massie Click to IM Jason Massie on Sunday, November 25, 2007 at 10:33 AM to SQL Server 2005, query optimizer, SQL performance tuning, tsql, memory bottleneck, Indexes
3601 Views | 0 Comments | Article Rating

In SQL 2000 I wanted to find unused indexes but found it to be difficult. I had to capture a huge trace that fully represents the workload and “trust” that the ITW knows what it’s talking about. You could also use the scan started trace event filtered by dbid, objectid, indexid. This procedure was very tedious.

In SS2005, a quick query of the dynamic management views lets you know which indexes are not being used. You can do this in a few minutes what had previously taken days or weeks. This alone is a very powerful feature.

I am proposing that you take it a step further. We are going to use a little logic and the missing index DMV’s to combine indexes and remove indexes that are still used but redundant. The steps in this process would look like this:

1.       Remove unused indexes with the unused index script

2.       Get your list of tables to analyze.

3.       Remove redundant but used indexes.

4.       Revue missing index DMV’s for mistaken index drops.

5.       Combine indexes that where it is logical to do so.

6.       Revue missing index DMV’s for mistaken index drops.

This methodology is most effective and viable when these conditions are met:

·         The server has been online and thus collecting stats for a long time.

·         The server is not pushing a hardware bottleneck. If so, this should be done during maint window.

·         The server is enterprise edition and the tables allow online operations (i.e. no LOB data or partitions).

What are the benefits of removing unused indexes?

·         Reduced writes during updates

·         Reduced space usage

·         Reduced backup\restore space and time

·         Reduced index maintenance time

What are the benefits of removing unused indexes, used but redundant indexes and combining indexes?

·         Less memory footprint for the same amount of data

·         A different angle to index tuning

·         Indexes are more likely to be covering

·         Reduced writes during updates

·         Reduced space usage

·         Reduced backup\restore space and time

·         Reduced index maintenance time

Now, let’s get down to business…

 

Remove Unused Indexes

This is pretty straightforward and relatively safe. You want to make sure that your server has been up long enough to get good index usage. This includes infrequent operations like month end reporting, etc.

--Unused indexes

declare @dbid int 

select @dbid = db_id() 

 

select object_name(s.object_id) as ObjName

, i.name as IndName

, i.index_id 

, user_seeks + user_scans + user_lookups  as reads

, user_updates as writes 

, sum(p.rows) as rows

from sys.dm_db_index_usage_stats s join sys.indexes i on s.object_id = i.object_id  and i.index_id = s.index_id 

join sys.partitions p on s.object_id = p.object_id and p.index_id= s.index_id

where objectproperty(s.object_id,'IsUserTable') = 1 and s.index_id> 0 and s.database_id = @dbid 

group by object_name(s.object_id), i.name, i.index_id, user_seeks+ user_scans + user_lookups, user_updates

order by reads, writes desc 

 

You can most likely drop any indexes with zero or close to zero reads. The more rows, the more space you will reclaim. The more writes, the better write performance you get by removing them. If the index is not being read but writes are minimal and rows are low, there is little benefit of dropping it. Keep in mind that not having an index that you need is a lot worse than having an index you don’t need in most situations. I suggest starting off with a conservative approach.

Now that we have gotten rid of the low hanging fruit, we can get deeper. I get a list of tables by size and work my way down.

select object_name(object_id), max(rows)

from sys.partitions

group by object_name(object_id)

order by 2 desc

After running that query we see that tEvent is the largest table and should provide nice gains. This is where we start. Let’s use this schema for our example.

create table tEvent

(EventID int primary key clustered,

EventType int,

EventName varchar(100),

EventDetailID int,

CustomerID int,

CompanyID int,

DateOpen datetime,

DateClose datetime)

 

create index ind1 on tEvents(EventDetailID);

 

create index ind2 on tEvents(CustomerID);

 

create index ind3 on tEvents(EventDetailID, CustomerID, CompanyID);

 

create index ind4 on tEvents(CompanyID, DateClose, EventType);

 

create index ind5 on tEvents(CustomerID, DateClose, EventType);

Removing used but redundant indexes

If we do a sp_helpindex, we can see that ind3 should satisfy queries currently using ind1 and ind2. The keyword is “should”. It’s a fairly safe bet so we drop ind1 and ind2. Now we should monitor the missing index DMV to see if there is any negative impact. We should also see the read count substantially increase by using the unused index query filtered by tEvent. Here is the missing index monitoring query:

--Missing indexes

SELECT sys.objects.name

, (avg_total_user_cost * avg_user_impact) * (user_seeks + user_scans) as Impact

, mid.equality_columns

, mid.inequality_columns

, mid.included_columns

FROM sys.dm_db_missing_index_group_stats AS migs

INNER JOIN sys.dm_db_missing_index_groups AS mig ON migs.group_handle = mig.index_group_handle

INNER JOIN sys.dm_db_missing_index_details AS mid ON mig.index_handle = mid.index_handle

INNER JOIN sys.objects WITH (nolock) ON mid.object_id = sys.objects.object_id

WHERE     (migs.group_handleIN

(

SELECT     TOP (5000) group_handle

FROM sys.dm_db_missing_index_group_stats WITH (nolock)

ORDER BY (avg_total_user_cost * avg_user_impact) * (user_seeks + user_scans) DESC)

)

and objectproperty(sys.objects.object_id, 'isusertable')=1and name = 'tEvent'

ORDER BY 2 DESC , 3 desc

As long as the indexes do not come back up in this report, we should be ok. Now we are servicing the same queries with 2 less indexes. Those are index pages that are no longer taking up buffer pool space!

Combining indexes

The same concept can be applied to the following scenario but a little more “feel” and understanding of how the app accesses the data is needed.

create index ind4 on tEvents(CompanyID, DateClose, EventType);

 

create index ind5 on tEvents(CustomerID, DateClose, EventType);

We know a former DBA added these. It is logical for us, based on our knowledge of the app and the cardinality of the data, to try to replace these indexes with this one.

create index ind6 on tEvents(CustomerID, CompanyID,  DateClose, EventType) with (online=on, maxdop=8);

This index should satisfy all queries using both indexes. Again, we want go back to the missing index report to see if SQL thinks it needs one of those indexes.

Now work your way down your list of tables by row count. I was able to reduce the size of a 200GB database by 15% in addition to removing the unused indexes. That’s a big gain! The only problem I ran into was changing the name of an index that had a hint. There were a few indexes that I had to add back but since my approach was conservative it was nothing drastic and completely online.

Let me know if you have any other tips or questions regarding this topic.

email it! |   |   | 

SQL Server 2008 NOV CTP changes to sys.dm_exec_query_memory_grants

Posted by Jason Massie Click to IM Jason Massie on Wednesday, November 21, 2007 at 11:25 AM to SQL Server 2008, query optimizer, SQL performance tuning, memory bottleneck
1169 Views | 1 Comments | Article Rating

I got well aquainted with this DMV when I was wrestling with resource_semaphore and resource_semaphore_query_compile wait types pre-sp2.

It looks like there are a few changes to it. Most notably, hooks for the resource govenor and ideal_memory_kb. BOL defines ideal_memory_kb as "Size, in kilobytes (KB), of the memory grant to fit everything into physical memory. This is based on the cardinality estimate." I think this will be very useful when troubleshooting plan/stats problems.

email it! |   |   | 

New DMF in SP2 - sys.dm_exec_text_query_plan

Posted by Jason Massie Click to IM Jason Massie on Saturday, January 13, 2007 at 9:25 AM to SQL Server 2005, query optimizer
978 Views | 0 Comments | Article Rating

Here is a DMF of interest in sp2. This could be applied some of my previous blogs.


sys.dm_exec_text_query_plan

Returns the Showplan in text format for a Transact-SQL batch or for a specific statement within the batch. The query plan specified by the plan handle can either be cached or currently executing. This table-valued function is similar to sys.dm_exec_query_plan, but has the following differences:

select * from
sys.dm_exec_requests cross apply sys.dm_exec_text_query_plan(plan_handle)

email it! |   |   | 

A tool - pExecStats

Posted by Jason Massie Click to IM Jason Massie on Sunday, December 31, 2006 at 9:03 AM to SQL Server 2005, query optimizer, SQL performance tuning, tsql
1061 Views | 0 Comments | Article Rating

Itzik Ben-Gan describes a method of analyzing trace files in his book "Inside Microsoft SQL Server 2005: T-SQL Querying" He provides a couple functions for stripping the literals from queries in the TextData column. If you do not use one of these functions or one of your own similar tools, you are not being very effective in your trace analyisis. The first one is from Microsoft PSS and can be downloaded here. The second one is a CLR function that uses regex. It performs much better. His book is worth it just for this function.

Here is the concept: Tuning single queries can be insane ineffective for multiple reasons. When facing a performance problem, you should be looking at query patterns that way to can get the biggest bang in the shortest amount of time.

Here is a simple trace analysis query using the PSS function.

-- Generate pattern and order by highest CPU

-- This is long running so you may want to dump it into a table.

--http://statisticsio.com/files/patterns/fn_tsqlsig.sql

SELECT dbo.fn_SQLSigTSQL(textdata, 4000)

, SUM(CPU)

, COUNT(*)

FROM dbo.trace_table

GROUP BY dbo.fn_SQLSigTSQL(textdata, 4000)

order by CPU DESC

 

Check out Itzak's book for more info on trace analysis. In this series on blog's, we will look at applying this concept to data in the DMV.

email it! |   |   | 

Page 1 of 1First   Previous   Next   Last   


Copyright 2006 by Statistics IO, My SQL Server Blog