BorderLayoutBoxedLayoutOpenLayoutMaximum textMedium textSmall text


Register
Thursday, September 02, 2010
MyStreamMinimize
Print  

Entries for the 'SQL performance tuning' Category

Greater Fort Worth SQL Server Users Group

Posted by Jason Massie Click to IM Jason Massie on Saturday, April 11, 2009 at 2:07 PM to SQL Server 2008, SQL performance tuning, Indexes, PASS, DMVs
2157 Views | 0 Comments | Article Rating

I will be speaking at the Greater Fort Worth SQL Server Users group this Wednesday. The talk will be on Troubleshooting with the DMV’s. If you are a DFW SQL’er come on out. Directions and more info can be found here.

If you don’t come, I am sending my boy after you with blow torch and a pair of pliers.

email it! |   |   | 

Mixed Workloads Part 3

Posted by Jason Massie Click to IM Jason Massie on Tuesday, March 31, 2009 at 8:17 AM to SQL Server 2008, SQL Server 2005, SQL performance tuning, tsql, Indexes, consolidation, editorial, syndicate
2353 Views | 0 Comments | Article Rating

In part 1, I talk about what I consider feeble attempts at implementing a reporting server through log shipping, mirroring\snapshots and, to a lesser extent, replication. Unless you invest in a real ETL solution, I argued that it is better to run a mixed workload. I talked about the architectural advantages of running mixed workloads in part 2. In a nutshell, doubling the hardware and cutting the data in half.

In this final post, we will talk about new features in SQL Server 2008 and some features that have been around a while that can help with mixed workloads.There are also some bad practices that could be the right answer that we won’t  talk about but let us mention triggers, table valued functions, 20 table outer joins, some correlated subqueries and table variables. These are options but usually not good ones. In the right circumstances, they could be right like an end of year report.

Here are the main tools in your arsenal:

  • Resource governor
  • Covering filtered Indexes
  • Indexed views
  • Partitioning + data compression
  • Persisted computed columns
  • Archiving on the same server
  • After hours denormalization(indexes views with deferred updates)

Resource Governor

Chances are you do not want to limit throughput of your OLTP queries. The resource governor does not do a good job with these queries anyway because their duration is usually so short. However, let’s say you have reports that run by executives. You can put them in a workload group that gives them as much resources as possible without affecting OLTP traffic. You may also have a less important group of reports from the marketing or sales teams that you can limit further. One caveat to the resource governor is it cannot limit, disk IO so if that is your bottleneck, this will not help much.

Covering Filtered indexes

Filtered indexes are a great new feature in SQL Server 2008. When optimizing for reporting queries on your OLTP system, you are probably going to be touching a lot of rows so covering the query is important. For example, the order fulfillment team works off a report of unfulfilled orders that pulls in order data, customer data, shipping data etc. In this case, you would add covering filtering indexes on each of those tables. The filtered indexes reduce write overhead on your OLTP writes and reduces read overhead of your reporting.

Indexed Views

Indexed views take filtered indexed view a step further. You can create indexes on multiple tables. Think of it as denormalization alongside your OLTP optimized schema. In the previous order fulfillment example, we can basically persist that report and have it updated in real time. There is more overhead to your OLTP transactions so weigh the pro’s and con’s. Test if possible.Unfortunately, you cannot defer changes to your indexed views but I believe there is a feature request for this on Connect and I will tell you about a workaround shortly.

Partitioning and Compression

This is the dynamic duo when mixing workloads. Unfortunately, the nitty gritty details would require their own post. For example, one mixed workload may benefit from compression on the hottest partition while the older data should be uncompressed. However, another workload may benefit from the opposite. The key here is really understanding your workload, data and hardware limitations. Most importantly, plan then TEST, TEST, TEST! Once you partition, you loose online operations so if you do it wrong, you are stuck.

 Persisted Computed Columns

This is an easy one. It is a simple trade off. Writes take a little more CPU and space in exchange for reduced CPU time when you report. Take your orders table, for example. You could calculate and save shipping costs when you insert the rows. If it adds a few milliseconds to the insert but shaves seconds off the hourly open orders report that the execs are looking at, it may be an easy decision.

Archival

This might not always be possible depending on your data. It may not be necessary if you have finely tuned indexes. However, it could make a night and day difference. If you need the data, UNION ALL’ing the production table with the archive table has little overhead. I do suggest you keep the archive database on the same server unless it will rarely be accessed. Trying to do this with linked servers is bad.

After hours denormalization

This is basically precreating reports during off hours. Think of it as indexed view with deferred updates. You can UNION with the OLTP tables if you need realtime data in your report. In an ideal world, touching less rows in the OLTP table and then UNIONing with the denormalized data will result in the best of both worlds if you need real time data.

The final word

As the concurrency and size of data scales, both a pseudo reporting database and a mixed work load scenario will not meet business requirements. A business requirement of real time data may dictate a mixed workload. There may be plenty of workloads where scaling out and scaling up both meet performance demands. I just wanted to play devil’s advocate and  let you know there is another option when planning reporting.

email it! |   |   | 

Mixed Workloads Part 2

Posted by Jason Massie Click to IM Jason Massie on Tuesday, March 24, 2009 at 7:56 PM to SQL Server 2008, SQL performance tuning, hardware, editorial
1774 Views | 2 Comments | Article Rating

I am presenting at the Ft. Worth SQL Server Users Group in April. Details forthcoming but it is basically going to be on running mixed workloads(OLTP and DSS) on the same server. This is part two of what I will be pulling into the presentation. Click here for part one.

In this part we will look at the typical hardware configurations used in part one(logshipping, mirroring, replication etc.). and what a better configuration would be especially if you leverage some new features in SQL Server 2008 to run mixed workloads.

Unless you invest the time to create a real reporting solution with an ETL, you end up with the following solutions based off of the methods described in part one.I see it all the time. On top of that, I see reporting queries still running on the production server because there is a need for real time data.

TypicalReporting

So what do we get with this solution? Most reporting queries are offloaded from production. However, there are a lot of con’s. The schema usually is not optimized for reporting. There is overhead in getting the data to the reporting server. The data is stored twice on disk and more importantly, memory. Finally, resource utilization is usually lopsided. For example, first thing in the morning the reporting server may be hammered while production traffic is just ramping up. During peak production traffic, the reporting server can be underutilized.

mixed

I submit to you that combing reporting and production is a better configuration if you do not invest in an ETL solution that creates a real reporting database. In the next post, we will talk about features to optimize this configuration but lets talk about what we gain just by using this architecture.

  • The hardware is doubled and the amount of data is cut in half.
  • In the configurations that this would replace, the memory gain is huge. Unless the reporting database is optimized for reporting, the data pages in memory are going to be the same. This allows the same page in memory to satisfy both OLTP and reporting traffic.
  • This configuration gives more cores for parallel plans that are common in reporting.
  • Most of all, it make available hardware that could otherwise be idle.

 

Note: I use direct attached storage(DAS) in these examples because that is where the biggest gains are to be had. However, the same benefits apply if you are on an enterprise level SAN with some caveats.

Note 2: This series is generalized and your mileage may vary based on your particular environment, business requirements and workload.

In part three, we will talk about features that will help optimize a mixed workload on a single instance with feature in SQL Server 2008.

email it! |   |   | 

Another Overlooked Windows Setting for the DBA

Posted by Jason Massie Click to IM Jason Massie on Friday, March 20, 2009 at 5:33 PM to SQL Server 2008, SQL Server 2005, SQL performance tuning, hardware, syndicate
2096 Views | 1 Comments | Article Rating

*Warning* Only use if your array controller has a battery backed cache. *Warning*

The settings are “Enable write caching on the disk” and “Enable advanced performance”. You can access these through device manager on the properties of the disk. These settings mostly apply to direct attached storage and are unavailable for most enterprise SAN lun’s that I have seen.

While we are at it, if your RAID controller cache has a read\write ratio, it is a good idea to set it to 0% read\100% write as long as you do not have a memory bottleneck. SQL uses RAM as its read buffer.

So is this a silver bullet for performance? Definitely not especially if you are not hitting a disk write bottleneck. However, every little bit helps and if it knocks 5-10% off of your 3 hour long full backup to disk, that is a win!

Happy Friday!

email it! |   |   | 

An Often Overlooked Windows Setting for the DBA

Posted by Jason Massie Click to IM Jason Massie on Thursday, March 19, 2009 at 9:07 AM to SQL Server 2008, SQL Server 2005, Windows Server 2008, SQL performance tuning, memory bottleneck, syndicate
7956 Views | 8 Comments | Article Rating

The default setting is the wrong setting for SQL Server. However, unless this has caused you a problem or you are thorough to point of OCD, this may not be set on your server.

Unless you are fighting a memory bottleneck, it probably won’t affect you too much but it is hard to give SQL too much memory.

The setting is “Maximize Data Throughput for Network Applications” and on by default. It sounds like a good thing. To the contrary, here is documentation from MSDN.

http://msdn.microsoft.com/en-us/library/ms178067.aspx

Maximize Data Throughput for Network Applications

To optimize system memory use for SQL Server, you should limit the amount of memory that is used by the system for file caching. To limit the file system cache, make sure that Maximize data throughput for file sharing is not selected. You can specify the smallest file system cache by selecting Minimize memory used or Balance.

To check the current setting on your operating system

1. Click Start, then click Control Panel, double-click Network Connections, and then double-click Local Area Connection.

2. On the General tab, click Properties, select File and Printer Sharing Microsoft Networks, and then click Properties.

3. If Maximize data throughput for network applications is selected, choose any other option, click OK, and then close the rest of the dialog boxes.

Happy Tweaking.

email it! |   |   | 

Key Lookup Threshold Part 2

Posted by Jason Massie Click to IM Jason Massie on Monday, February 09, 2009 at 7:57 AM to query optimizer, SQL performance tuning
1407 Views | 1 Comments | Article Rating

Grant Fritchey(@gfritchey) posed this question to me on twitter after my posted on the key lookup threshold post.

@statisticsio Interesting post. Do you think that threshold is dependent on the data involved? Or maybe on the size of the key?8:28 AM Feb 5th from TwitterGadget in reply to statisticsio

I kind of assumed so which is why I did char(1000). However, I have just tested with char(1). The numbers are interesting. Please refer to the original post for repro code.

 

 IOIO
RowsCHAR(1000)CHAR(1)
Scan4229
1 row44
5 rows1212
10 rows2222
25 rows5252
50 rows114102
100 rows217202
250 Rows526502
500 Rows10431003
1000 Rows41414007

 

The threshold is crossed much earlier because the scan is smaller. This is a small table especially with CHAR(1) so take this test with a grain of salt. Run your own tests when you are working with large production data.

I would also venture to guess that as the complexity of the query increases the threshold gets lower especially on more complex queries like when grouping especially when grouping and outer joining.

A covering index is probably the best solution in most cases unless you need to touch all rows even then, it might be better,.

email it! |   |   | 

Key Lookup Threshold

Posted by Jason Massie Click to IM Jason Massie on Thursday, February 05, 2009 at 8:05 AM to SQL Server 2008, SQL Server 2005, query optimizer, SQL performance tuning, Indexes
3088 Views | 1 Comments | Article Rating

Gail talks about bookmark lookups…. err.. key lookups in this post. So are they good or bad? Well, like many things in SQL, it depends. The main factor is the number of rows returned. A few rows are fine but the cost rises sharply with larger result sets. There comes a point where the threshold is crossed and a scan is more efficient. This is because a scan leverages sequential IO while a lookup does random IO.

Here are the results of the code at the end of the post.

RowsIO 
Scan422 
1 row4 
5 rows12 
10 rows22 
25 rows52 
50 rows114 
100 rows217 
250 Rows526<--Threshold
500 Rows1043 
1000 Rows4141 

 

As you can see, at 250 rows, we have crossed the threshold and it is cheaper do a scan. If you are passing in a literal, the optimizer can detect this and switch to a scan. If it is a stored proc or parameterized SQL, a plan is cached the first time it is run. Problems happen when the result size greatly varies depending on the parameter. There are ways around this all with their pro’s and con’s. Here are some:

  • A covering index.
  • A statement level recompile hint
  • Plan guides and hints

Here is the sample code that can repro these numbers on SQL Server 2008.

CREATE TABLE #temp
(
id INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
c1 CHAR(1000) DEFAULT( 'blah'),
c2 INT
)
INSERT INTO #temp(c2)
VALUES
(1)
GO 
INSERT INTO #temp(c2)
VALUES
(5)
GO 5
INSERT INTO #temp(c2)
VALUES
(10)
GO 10
INSERT INTO #temp(c2)
VALUES
(25)
GO 25
INSERT INTO #temp(c2)
VALUES
(50)
GO 50
INSERT INTO #temp(c2)
VALUES
(100)
GO 100
INSERT INTO #temp(c2)
VALUES
(250)
GO 250
INSERT INTO #temp(c2)
VALUES
(500)
GO 500
INSERT INTO #temp(c2)
VALUES
(1000)
GO 1000
INSERT INTO #temp(c2)
VALUES
(1000)
GO 1000

CREATE INDEX ix ON #temp(c2)

--The baseline
SET STATISTICS io ON
SELECT
*
FROM #temp WITH (INDEX=1)
WHERE c2 = 1

--1 row returned
SELECT *
FROM #temp
WHERE c2 = 1

--5 rows returned
SELECT *
FROM #temp
WHERE c2 = 5

--10 rows returned
SELECT *
FROM #temp
WHERE c2 = 10

--25 rows returned
SELECT *
FROM #temp
WHERE c2 = 25

--50 rows returned
SELECT *
FROM #temp
WHERE c2 = 50

--100 rows returned
SELECT *
FROM #temp
WHERE c2 = 100

--250 rows returned
--Must begin using hints because the optimizer can tell that a scan is better
SELECT *
FROM #temp WITH (INDEX=ix, forceseek)
WHERE c2 = 250

--1000 rows returned
--Must begin using hints because the optimizer can tell that a scan is better
SELECT *
FROM #temp WITH (INDEX=ix, forceseek)
WHERE c2 = 1000

email it! |   |   | 

Updated Unused Index Query

Posted by Jason Massie Click to IM Jason Massie on Tuesday, January 13, 2009 at 5:41 PM to SQL Server 2008, SQL Server 2005, SQL performance tuning, tsql, Indexes
3330 Views | 4 Comments | Article Rating

I have cleaned up and updated my unused index query. It returns a list of indexes ordered by ascending reads. If the instance has been up for a long time, it is probably safe to drop indexes with zero or close to zero reads. If the reads are low and the writes are high, this may help improve your transactions per second count. Otherwise, you are just cleaning up unused space. Use caution though. A missing index is worse than an unused index.

SELECT objectname=OBJECT_NAME(s.OBJECT_ID)
,
indexname=i.name
, i.index_id  
, reads=user_seeks + user_scans + user_lookups  
, writes user_updates  
, p.rows
FROM sys.dm_db_index_usage_stats s JOIN sys.indexes i 
ON i.index_id = s.index_id AND s.OBJECT_ID = i.OBJECT_ID  
JOIN sys.partitions p ON p.index_id = s.index_id AND s.OBJECT_ID = p.OBJECT_ID
WHERE OBJECTPROPERTY(s.OBJECT_ID,'IsUserTable') = 1  
AND s.database_id = DB_ID()  
AND
i.type_desc = 'nonclustered'
AND i.is_primary_key = 0
AND i.is_unique_constraint = 0
AND p.rows > 10000
ORDER BY reads, rows DESC

I will also be putting this query up at the SSP WIKI.

email it! |   |   | 

Finding Index Scans due to Implicit Conversions

Posted by Jason Massie Click to IM Jason Massie on Monday, January 12, 2009 at 7:00 PM to SQL Server 2008, SQL Server 2005, query optimizer, SQL performance tuning, tsql, Indexes, Procedure Cache, DMVs
3183 Views | 8 Comments | Article Rating

It is fairly widely known that implicit conversions of data types can cause scans. However, you may not be able to control what goes in your database, you may have inherited an app and we just are not perfect.

On top of that some can be tricky. For example, did you know that scope_identity and @@identity both return numeric data types. An AVG of a SMALLINT column returns an INT.

Here is a simple repro which the same query does an index scan instead of an index seek

CREATE TABLE t1
(
id INT IDENTITY(1,1) PRIMARY KEY CLUSTERED
, c1 VARCHAR(100)
)

GO

INSERT INTO t1(c1)
SELECT 1
GO 10000

INSERT INTO t1(c1)
SELECT 0
GO 10

CREATE INDEX ix ON t1 (c1)

CREATE PROC p1
@blah NVARCHAR(100)
AS
SELECT
id FROM t1
WHERE c1 = @blah

EXEC p1 N'0'

CREATE PROC p2
@blah VARCHAR(100)
AS
SELECT
id FROM t1
WHERE c1 = @blah

EXEC p2 '0'

DROP TABLE t1
DROP PROC p1
DROP PROC p2

 

I found a sample query from Umachandar Jayachandran and made some mod’s to it. It basically scans your procedure cache for index scans containing implicit conversions.

I have posted it at the SQLServerPedia Wiki here. It is the 2nd query listed.

Note: Implicit conversions may not always result in a scan.

email it! |   |   | 

Never Index a BIT?

Posted by Jason Massie Click to IM Jason Massie on Tuesday, December 09, 2008 at 8:03 AM to SQL Server 2008, SQL performance tuning, Indexes
2455 Views | 4 Comments | Article Rating

Never say never, right? For example, you have an orders table. It has an IsShipped bit column which is what marks an order as done to the business. You might have a more complex version of this example so employees can see the outstanding order count in the app:

CREATE TABLE#temp
   
(
               
c1INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
               
c2 VARCHAR(10) DEFAULT  'bob',
               
c3 bit
   
)

INSERT INTO#temp(c3)
       
VALUES
   
(1)
GO 5000000

INSERT INTO#temp(c3)
       
VALUES
   
(0)
GO 500 

--13,000 IO's
SET STATISTICS io ON
SELECT
COUNT(*)
   
FROM #temp
   
WHEREc3 = 0

CREATE INDEXix ON#temp(c3) include (c2

--6 IO's 
SET STATISTICS io ON
SELECT
COUNT(*)
   
FROM #temp
   
WHEREc3 = 0
           

In SQL 2008, we can get a little better with filtered indexes but really all we are saving is space.

CREATE INDEXix2 ON#temp(c3) include (c2
   
WHEREc3 = 0

--4 IO's

SET STATISTICS ioON
SELECT
COUNT(*)
   
FROM #temp
   
WHEREc3 = 0

email it! |   |   | 

Why do I have to use a MERGE\HASH JOIN Hint?

Posted by Jason Massie Click to IM Jason Massie on Saturday, November 15, 2008 at 8:04 PM to SQL Server 2008, SQL Server 2005, query optimizer, SQL performance tuning
2989 Views | 4 Comments | Article Rating

Hash match operator iconI have run into this several times lately so I thought I would whip out a quick post.

The scenario

A query runs much faster with a hash join hint than a nested loop but that is what the optimizer is choosing.

High level background

Merge and hash are good when you are working with lots of rows because it can leverage sequential IO. If you do a nested loop join with lookups on 100-1000+ rows, the random IOs can be 10-1000% more expensive than a sequential scan.

Why?

So why does the optimizer choose a nested loop? I have seen it is because the optimizer had to make a best guess at how make rows were being returned. It most cases, the optimizer guesses 1 row will be returned so a nested loop is the best choice as far as the optimizer can tell.

Identifying the problem 

Identifying the problem is pretty easy. All you need to do is look at the query plan. It can be text, xml or graphical. Look actual and estimated rows returned. If the actual rows are 11teen,000,000 but the estimated rows is 1 then this may be the problem

Causes

I most often see this when large table variables, local variables or modified stored procedure parameters are involved. There can be other causes. See my favorite white paper.

Fix?

Easy. Don’t use table variables and local variables. If you have to modify sp parameters, pass them to a nested stored proc. Make sure stats are up to date. If the WHERE cause is complex, try putting the logic in a persisted computed column or indexed view. Another option is to use a covering index. Of course, the last resort is the JOIN hint.

Need more?

Want to get real deep on stuff like this? Check out Craig Freedman’s blog or his Inside SQL Server 2005 chapters.

email it! |   |   | 

This Post Needs More Stored Procedures

Posted by Jason Massie Click to IM Jason Massie on Tuesday, September 23, 2008 at 9:53 PM to SQL Server 2008, SQL performance tuning, tsql, internals, Procedure Cache
2502 Views | 5 Comments | Article Rating

New in SQL Server 2008 is a server setting called “Optimize for adhoc workloads”. I was happy to see this. You should be too especially if you have ever had arm wrestle an app that causes a bloated proc cache on an x86 box. Ugghh… Adam blogged on it here, Bob blogged on it here and here is the documentation.

I was wondering how this setting would play with the forced parameterization database setting. It looks like forced parameterization trumps the new server wide setting. If my simple tests below are right, it could present an interesting problem. What if your app generates a lot of totally unique adhoc queries that have no benefit from parameterization and you have queries that would benefit from parameterization? Well, the answer is still the same. Write stored procedures. :)

In the real world, sometimes you get a pig and all you can do is put lipstick on it. If you have to choose, “Optimize for Adhoc” would most benefit memory\IO bound servers while “forced parameterization” could help both the CPU and memory usage. However, that is a very general statement. Other things should be considered so it will depend.

 

--So what happens when you turn on "Optimize for adhoc" and forced parameterization 
exec sp_configure  'show advanced options', 1
reconfigure with override
go
exec sp_configure 'optimize', 1
reconfigure with override
alter database master set parameterization forced
dbcc freeproccache

--Let's run a query will not generate a trivial plan.
select name, object_id, create_date
from sys.all_objects
where object_id = 3 and create_date = '2008-07-09 16:19:59.943'
go

--The CacheObjType is a compiled plan and 57344 bytes
select p.cacheobjtype, p.size_in_bytes ,  s.*
from sys.dm_exec_query_stats s
join sys.dm_exec_cached_plans p
on s.plan_handle = p.plan_handle
cross apply sys.dm_exec_sql_text(sql_handle)
where text like '%select name , object_id , create_date from%'

--Now, let’s turn off forced parameterization

alter database master set parameterization simple
dbcc freeproccache

select name, object_id, create_date
from sys.all_objects
where object_id = 3 and create_date = '2008-07-09 16:19:59.943'
go 

--The CacheObjType is a Compiled Plan Stub and 320 bytes
select p.cacheobjtype, p.size_in_bytes ,  s.*
from sys.dm_exec_query_stats s
join sys.dm_exec_cached_plans p
on s.plan_handle = p.plan_handle
cross apply sys.dm_exec_sql_text(sql_handle)
where text like '%select name , object_id , create_date from%'

email it! |   |   | 

Indexed Views vs. Filtered Indexes Part 2

Posted by Jason Massie Click to IM Jason Massie on Monday, August 25, 2008 at 9:21 AM to SQL Server 2008, SQL performance tuning, tsql, Indexes
1663 Views | 1 Comments | Article Rating

image In the first part, we looked at the a very simple example with a single table SELECT using covering indexes. This one is a little more complex. We are still covering but we are joining two tables, ordering and grouping. Filtered indexes do not support joins so we have to create two indexes. In this test, the indexed view wins but filtered indexes come a respectable 2nd place compared to normal indexes.

 

 

The DDL(2.5M in tblitem and 25M in tblitemdetails)

CREATE TABLE [dbo].[tblItem](
 
    [Itemid] [int] IDENTITY(1,1) NOT NULL,
 
    [ItemName] [varchar](100) NULL,
 
    [Datecreated] [datetime] NULL,
 
    [Closed] [bit] NULL,
 
PRIMARY KEY CLUSTERED 
 
(
 
    [Itemid] ASC
 
) ON [PRIMARY]
 
) ON [PRIMARY]
 
GO
 
CREATE TABLE [dbo].[tblItemDetails](
    [ItemDetailsID] [int] IDENTITY(1,1) NOT NULL,
    [ItemID] [int] NULL,
    [Color] [varchar](20) NULL,
    [Size] [varchar](20) NULL,
    [Flavor] [varchar](20) NULL,
PRIMARY KEY CLUSTERED 
(
    [ItemDetailsID] ASC
) ON [PRIMARY]
) ON [PRIMARY]
 
GO
 
SET ANSI_PADDING OFF
GO
 
ALTER TABLE [dbo].[tblItemDetails]  WITH CHECK ADDCONSTRAINT [fk01] FOREIGN KEY([ItemID])
REFERENCES [dbo].[tblItem] ([Itemid])
GO
 
ALTER TABLE [dbo].[tblItemDetails] CHECK CONSTRAINT [fk01]
GO
 
 
--Create normal and filtered indexes
create index ix01 on tblitem(Datecreated, ItemName)  
 
create index ix02 on tblitem(Datecreated, ItemName) where Datecreated > '2007-08-22' and itemid > 0 and itemid < 50 
 
create index ix01 on tblitemdetails(itemid, color)  
 
create index ix02 on tblitemdetails(itemid, color) where itemid > 0 and itemid < 50
 
--Create indexed views
alter view ivw01
 
with schemabinding
 
as 
 
select  a.ItemID, b.Color, COUNT_big(*) as cnt
from dbo.tblItem a join dbo.tblItemDetails b on a.Itemid= b.ItemID 
where Datecreated > convert(datetime, '2007-08-22', 101) and a.itemid > 0 and a.itemid < 50
group by a.ItemID, b.Color
 
create unique clustered index ix01 on ivw01(itemid)
create index ix02 on ivw01(itemid, color) 


Test Queries

set statistics io on
 
--Using a normal covering index
 
select  a.ItemID, b.Color, COUNT_big(*) as cnt
from dbo.tblItem a with (index=ix01) join dbo.tblItemDetails b with (index=ix01) on a.Itemid= b.ItemID 
where Datecreated > convert(datetime, '2007-08-22', 101) and a.itemid > 0 and a.itemid < 50
group by a.ItemID, b.Color
order by cnt
 
 
 
--using a filtered index
 
select  a.ItemID, b.Color, COUNT_big(*) as cnt
from dbo.tblItem a with (index=ix02) join dbo.tblItemDetails b with (index=ix02) on a.Itemid= b.ItemID 
where Datecreated > convert(datetime, '2007-08-22', 101) and a.itemid > 0 and a.itemid < 50
group by a.ItemID, b.Color
order by cnt
 
 
--using an indexed view
 
select * from ivw01
order by cnt

IO numbers:

Normal covering indexes

(8 row(s) affected)
Table 'tblItemDetails'. Scan count 8, logical reads 24, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblItem'. Scan count 1, logical reads 991, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Filtered covering indexes

(8 row(s) affected)
Table 'tblItem'. Scan count 1, logical reads 99, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblItemDetails'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Indexed View

(8 row(s) affected)
Table 'ivw01'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Note: By increasing the result set 100 fold(itemid > 0 and itemid < 5000), we see very similar results except with merge joins.

 

Query Plans:

The query plans are different. Most notably being the scans for the filtered index and indexed view vs. the normal index. This is fine though. They as just a fraction of the size of the normal index.

image

 

Conclusion

The mere fact that the indexed view runs with 50 times less IO overhead does not make it the automatic choice. What if you are on standard edition? Filtered indexes work on standard edition. Also we filtered on a date column. What if you need a real time rolling total? You have to use a normal index. Know you tools and then test, test and test some more.

email it! |   |   | 

Breaking: Quest-SolidQ Partnership

Posted by Jason Massie Click to IM Jason Massie on Friday, August 15, 2008 at 4:20 PM to SQL Server 2008, SQL Server 2005, SQL performance tuning, Rumor mill, Rumor mill
7147 Views | 88 Comments | Article Rating

The podcast from Brian Moran and Bill Bosworth provides good insight into the the deal. Click here.

Here is an excerpt for the forthcoming press release:

Quest Software is announcing its new partnership with Solid Quality Mentors, a global provider of education and solutions for Microsoft data and development platforms.

This is not a sales reseller relationship, but a true joining of forces to provide new real value to the SQL Server community. Specifically, our plans are to work together in three specific areas: community outreach, product development and professional services.

Community outreach: The companies will work together on joint programs including speaking at Quest and Solid Quality Mentors events, authoring technical briefs and white papers and providing joint webcasts and podcasts for the SQL Server community.

Product development: Quest will collaborate with Solid Quality Mentors on product road maps and feature sets, drawing on their combined experience of real-world SQL Server challenges to provide the best solutions possible for SQL Server DBAs and developers.

Professional services: Quest customers will benefit from consulting services offered by Solid Quality Mentors to help implement tools and manage complex environments.

Peep http://quest.com Monday for more details.

email it! |   |   | 

NTSSUG June Presentation

Posted by Jason Massie Click to IM Jason Massie on Thursday, July 17, 2008 at 2:37 PM to query optimizer, SQL performance tuning, Indexes, PASS, Data collection
1648 Views | 0 Comments | Article Rating

Here are the slides and scripts from my June presentation at the North Texas SQL Server Users Group. The presentation was on the Data Collector in SQL Server 2008 but it was short so I also touched on filtered indexes, query hashes and troubleshooting. Enjoy and let me know if you have any questions.

Also Peter Debetta is presenting TONIGHT at 6:30 on the following topics:

"We'll be discussing the various ways you can secure your SQL Server data via encryption, including the new Transparent Data Encryption feature of SQL Server 2008."

 Stop by.

email it! |   |   | 

CXPACKET, MAXDOP and your OLTP system

Posted by Jason Massie Click to IM Jason Massie on Thursday, June 26, 2008 at 5:15 PM to SQL Server 2005, query optimizer, SQL performance tuning, tsql, DMVs
9477 Views | 6 Comments | Article Rating

So you are experiencing CXPACKET wait types? If you run a google search, you will quickly find out you are experiencing the dreaded SQL Server parallelism problem and you must reduce MAXDOP to 1. While that is possible, I say most likely not. I say that, most of the time, SQL is doing the best it can with what it has to work with. That is the query, the data and the schema.

Microsoft has come a long way since version <= 7.0 when it comes to parallelism. If your OLTP queries follow best practices and are well indexed, they probably will never generate a parallel plan. This is because they are fast and access a small amount of rows. If they are missing indexes or SQL overestimates cardinality, SQL might decide to do scans, sorts, hashes, spools etc. These iterators, among others, can go parallel to reduce execution time at the cost of system resources. These iterators are not bad and they do have their place. It just isn’t on OLTP type of queries most of the time.

So dropping the MAXDOP on an OLTP system to 1 probably won’t hurt much because most of the time there is an IO bottleneck. If the CXPACKET wait types are a symptoms of poor indexing and row count estimation, it won’t help either. Whole books have been written on indexing, query tuning and there is a nice whitepaper on stats best practices to avoid estimation problems so I am not going to go into that. I will give you some pointers on quickly identifying whether the CXPACKET is a symptom or the cause.

So your boss and boss's boss run over to your desk yelling about the customer complaints of slowness with SQL. You run this query or something similar:

select r.cpu_time 
, r.logical_reads
, r.session_id 
into #temp
from sys.dm_exec_sessions as s 
inner join sys.dm_exec_requests as r 
on s.session_id =r.session_id --and s.last_request_start_time=r.start_time
where is_user_process = 1 
 
waitfor delay '00:00:01'
 
select substring(h.text, (r.statement_start_offset/2)+1 , ((case r.statement_end_offset when -1 
then datalength(h.text)  else r.statement_end_offset end - r.statement_start_offset)/2) + 1) as text
, r.cpu_time-t.cpu_time as CPUDiff 
, r.logical_reads-t.logical_reads as ReadDiff
, p.query_plan
, r.wait_type
, r.wait_time
, r.last_wait_type
, r.wait_resource
, r.command
, r.database_id
, r.blocking_session_id
, r.granted_query_memory
, r.session_id
, r.reads
, r.writes
, r.row_count
, s.[host_name]
, s.program_name
, s.login_name
from sys.dm_exec_sessions as s 
inner join sys.dm_exec_requests as r 
on s.session_id =r.session_id and s.last_request_start_time=r.start_time
full outer join #temp as t on t.session_id=s.session_id
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) h
cross apply sys.dm_exec_query_plan(r.plan_handle) p
order by 3 desc
 
drop table #temp
 

You notice that you have several rows with CXPACKET wait types. With this query, you click the xml show plan link and:
1.    search the xml for missing indexes.
2.    Save as a .sqlplan and reopen in SSMS
3.    Compare estimated and actual rows in the iterators on the right side of the plan. Poor estimates may bubble to the left as well.
4.    If there are no missing indexes, estimates and actual are fairly close, reducing MAXDOP may help if it is not a huge report or query.

5. If there are missing index or bad estimations, fix it! :)


Of course, these are not rules set in stone. Just a style in the art of database administration. Just don't blame it on a "parallelism bug" because it is a poor musician that blames his instrument.

One thing to note, if you do turn down MAXDOP server-wide, turn in on at the query level(enterprise edition) on your index operation because they are optimized for it.
Alter index all on tblBlah rebuild with (maxdop=32);

email it! |   |   | 

Reminder: SSWUG Virtual Conference Starts Tomorrow

Posted by Jason Massie Click to IM Jason Massie on Monday, June 23, 2008 at 2:35 PM to SQL Server 2008, SQL performance tuning, Indexes, SSRS
1200 Views | 0 Comments | Article Rating

Just a quick reminder that the SSWUG virtual conference starts tomorrow. I hope you can attend. They have given me a discount code to distribute: VIPJM2008DIS I do not get any extra cha-ching from you using it. I just state that so this post doesn't sound spammy.

Actually hearing my recorded voice makes me cringe so this will definitely be interesting for me. :)

 

Here are the sessions I am doing:

Filtered Indexes and Statistics in SQL 2008

SQL Reporting Services for the DBA

SQL Profiler: Configuration, Analysis, and SQL Server 2008 Enhancements

http://vconferenceonline.com/sswug/demo.asp

email it! |   |   | 

Attn DFW SQL'ers

Posted by Jason Massie Click to IM Jason Massie on Wednesday, June 18, 2008 at 2:58 PM to SQL Server 2008, SQL performance tuning, Indexes, PASS, Data collection, RC0
1696 Views | 0 Comments | Article Rating

Reminder: You can catch me at the North Texas SQL Server User's Group tomorrow night. The Data Collection presentation that I put together may be short so I have a few backup slides and demo's on filtered indexes, queries hashes and the 2 second SQL Server 2008 upgrade. Directions and details can be found at http://northtexas.sqlpass.org

Hope to see you there.

email it! |   |   | 

Query Hashes in SQL 2008 RC0

Posted by Jason Massie Click to IM Jason Massie on Tuesday, June 10, 2008 at 9:43 PM to SQL Server 2008, SQL performance tuning, DMVs, RC0
1139 Views | 0 Comments | Article Rating

I believe this is a new feature RC0. If I missed it before, shame on me because this is awesome. My first 2 blog posts(Part 1 and Part 2) were on doing a similar things with query patterns. It is basically an additional column on sys.dm_exec_requests and sys.dm_exec_query_stats that hold a binary hash. The hash is same for queries that are the same or similar. For example, these queries should all have the same hash(disregard parameterization and trivial plans for simplicity):

select c1 from t1 where c2 = 1

select c1 from t1 where c2 = 1

select c1 from t1 where c2 = 2

But this query will have a different hash:

select c1 from t1 where c2 > 1

This allows you to group and aggregate similar queries by the hash. It is really useful in OLTP environments where the same query might only take milliseconds but execute 1000's of times a second. If you don't aggregate, those those types of queries may not show up on the radar but might be responsible 90% of the CPU usage. Shaving a couple of cycles of CPU off of one of those and server-wide utilization drops.

Here is a nice sample from BOL.

 

   1:SELECT TOP 5 query_stats.query_hash AS "Query Hash", 
   2:SUM(query_stats.total_worker_time) / SUM(query_stats.execution_count) AS "Avg CPU Time",
   3:MIN(query_stats.statement_text) AS "Statement Text"
   4:FROM 
   5:    (SELECT QS.*, 
   6:SUBSTRING(ST.text, (QS.statement_start_offset/2) + 1,
   7:    ((CASE statement_end_offset 
   8:WHEN -1 THEN DATALENGTH(st.text)
   9:ELSE QS.statement_end_offset END 
  10:            - QS.statement_start_offset)/2) + 1) AS statement_text
  11:FROM sys.dm_exec_query_stats AS QS
  12:CROSS APPLY sys.dm_exec_sql_text(QS.sql_handle) as ST) as query_stats
  13:GROUP BY query_stats.query_hash
  14:ORDER BY 2 DESC;
  15:GO

email it! |   |   | 

TechEd! Let's do this.

Posted by Jason Massie Click to IM Jason Massie on Friday, May 30, 2008 at 9:05 PM to SQL Server 2008, SQL performance tuning, tsql, WMI, CLR, internals, TechEd, Procedure Cache, DMVs
2467 Views | 2 Comments | Article Rating

I am pumped up about TechEd. Bill Gates's last technical keynote. He has to have a bomb to drop... The launch of Windows\VS\SQL Server 2008. All of the speakers are reaching deep into their bag of tricks and bringing new hardcore sessions. The universal party. The blogger's lounge. Twitter, FriendFeed, MSDN, and Technet are all buzzing with posts.

I found out a couple of days ago that a speaker would not be able to make it due to a family emergency. I offered to present his session on DMV's so I will be presenting these sessions:

 

Using Dynamic Management Views to Improve Your Development

Dynamic Management Views were added to SQL Server 2005 and have been enhanced in SQL Server 2008. While they provide great functionality and usefulness, it appears they are not widely understood or implemented as yet. This session provides an overview of the Dynamic Management Objects available (both Views and Functions) and describes those considered the most useful. It shows how they can be utilized in monitoring, troubleshooting, and ongoing management of SQL Server systems and in gaining an understanding of SQL Server’s operation. The session also illustrates how they can be used to improve the database development experience and how they can be used in custom reports within SQL Server Management Studio.

Database Platform
300
BRK
Jason Massie

Windows Management Instrumentation (WMI) for the Command Line DBA

This is a demo filled session on how a command line DBA can unleash the power of WMI. WMI allows the DBA access to information that was previously hard to get or unavailable like the application event logs, OS perfmon counters, and OS events. This session covers WQL queries from SSIS, accessing OS performance counters from SQL, automated reactions to WMI events, WMI from Windows PowerShell and much more.

Database Platform
400
BRK
Jason Massie

Troubleshooting Query Plans Gone Wrong

Have you ever flushed the procedure cache to resolve a performance problem? Have you ever had to use a join or index hint when you should not have to? Have you ever updated statistics to fix a performance problem and wondered why it worked? In this session, we get deep into how the query optimizer decides how to execute a SQL statement. We look at common problems that cause the optimizer to choose the wrong access paths like underestimation. We go over a detailed demo on parameter sniffing, problems with local variables and statistics. We also look at solutions to these problems in several areas including good design at the application, schema, and query levels, proactive maintenance and reactive fixes. This session applies to Microsoft SQL Server 2000 through 2008 but we pay special attention to new features in SQL Server 2008 like plan freezing, new hints, filter indexes, and statistics among others.

Database Platform
400
BRK
Jason Massie

I'll be there Monday through Saturday. Hope to see you there!

email it! |   |   | 

Tech∙ED database platform track

Posted by Jason Massie Click to IM Jason Massie on Sunday, March 02, 2008 at 11:15 AM to SQL Server 2008, SQL Server 2005, SQL performance tuning, WMI, TechEd
1938 Views | 0 Comments | Article Rating

It looks like they posted the sessions for Tech∙ED. Get ready for some hardcore SQL mayhem. All of the Database Platform are sessions are 300 and 400 level. Not a single 200. The line up is heavy weight and it looks like they are all pulling out new sessions. It is an honor just to be presenting with this crew. I was planning on taking in some sessions on VS2008, sharepoint and other topics I don't get to see much action in but I am afraid I would miss something if I did.

My sessions will be "Troubleshooting query plans gone wrong." and "WMI for the command line DBA". I hope to see you there. It is truly going to be a great year with the new products. 

 

email it! |   |   | 

SQL Server 2008 filtered indexes in 5 minutes

Posted by Jason Massie Click to IM Jason Massie on Saturday, February 23, 2008 at 11:11 PM to SQL Server 2008, SQL performance tuning, tsql, Indexes, CTP6, In 5 minutes
4717 Views | 1 Comments | Article Rating

My jaw literally dropped when I saw it. If this works as advertised, it has the potential of changing everything. The days of over indexing will be over. Dynamic indexing off of the missing and unused index DMV's could be possible especially with added support. I also think this will better accomplish what people tried to do with partitioning for performance reasons in SQL Server 2005.

It will be really interesting to see how it gets applied in production and where this leads in the next versions. Maybe an autoindex checkbox will replace the DBA :)

This is a small and quick example. We will have to see how it scales to larger environments.

 

use adventureworks

--Let's nullify a random 400

update  Production.WorkOrder

set EndDate = null

where WorkOrderID in (select top 400 WorkOrderID from Production.WorkOrder where enddate is not null)


--Let's give a random 200 a recent date to mimic prod data

update top (200) Production.WorkOrder

set duedate = getdate()-10

where enddate is null

--Base query
--This is query type that should be simple yet common
--Let's get open WO's that will be due soon

set statistics io on

select p.Name, wo.OrderQty, wo.DueDate

from Production.WorkOrder wo JOIN Production.Product p on wo.ProductID=p.ProductID

where wo.EndDate is null and wo.DueDate >= '2007-11-24'

 --CI Scan with loop join

--Table 'WorkOrder'. Scan count 1, logical reads 530, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
--Now let's create a covering index

create index ix01 on Production.WorkOrder(EndDate,DueDate, ProductID) include (OrderQty)

--Run the base query

set statistics io on
select
p.Name, wo.OrderQty, wo.DueDate
from
Production.WorkOrder wo JOIN Production.Product p on wo.ProductID=p.ProductID
where
wo.EndDate is null and wo.DueDate >= '2007-11-24'

--Does NCI seek

--Table 'WorkOrder'. Scan count 1, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 

--Clean up

drop index [Production].[WorkOrder].[ix01]

 

--Create Filter index

create index ix01 on Production.WorkOrder(EndDate,DueDate, ProductID) include (OrderQty) where EndDate is null and DueDate >= '2007-11-24'

--Run base query

set statistics io on
select
p.Name, wo.OrderQty, wo.DueDate

from Production.WorkOrder wo JOIN Production.Product p on wo.ProductID=p.ProductID

where wo.EndDate is null and wo.DueDate >= '2007-11-24'

 

--It uses it. We have a real small sample but it performs better

--Table 'WorkOrder'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
--Now let's look at size
--Create unfiltered index for comparison

create index ix02 on Production.WorkOrder(EndDate,DueDate, ProductID) include (OrderQty)

--note the new syntax

declare @dbid int = db_id()

declare @objid int = object_id('[Production].[WorkOrder]')

select * from sys.dm_db_index_physical_stats(@dbid, @objid, null, null, 'detailed') ps join sys.indexes i

on ps.object_id=i.object_id and ps.index_id=i.index_id

and i.name in ('ix01', 'ix02') and i.type_desc='NONCLUSTERED'

--We are looking at 2 page vs. 301 pages

That is a huge difference in size. The major point is not the fact that we retrieve the same while taking up so much less space on disk but so much less space in memory as well. We are going to get into this much more!

 

 Technorati Tags: ,,

email it! |   |   | 

SQL Shorts

Posted by Jason Massie Click to IM Jason Massie on Sunday, February 10, 2008 at 9:47 PM to SQL Server 2008, SQL Server 2005, Heroes Happen Here, SQL performance tuning, SQLOS, internals, scheduler
2295 Views | 0 Comments | Article Rating

Kalen Delaney's first DVD on SQL Server internals is available for viewing online. More details here.

Connor Cunningham, formerly on the SQL query optimizer team, is now blogging at SQL Skills and is whipping out posts on optimizer related topics at a frantic pace. Check out the latest here or all of them here.

Dan Jones rebuffs the idea that SQL Server 2008 is a "dot release".

CTP6 may or may not be coming soon. Chris says that CTP5 will be given away at the HHH launch. This is in contrast to the "feature complete" CTP that was mentioned in the roadmap but who know...

Mario B has raised some eyebrows with his post on scheduler troubleshooting and SOS_SCHEDULER_YIELD waits and the possibility that is is not CPU pressure always. Linchi posted about it here and then SQLServerPedia posts shortly afterwards. Anyway, check out Mario's tools.

email it! |   |   | 

Never use table variables?

Posted by Jason Massie Click to IM Jason Massie on Wednesday, January 30, 2008 at 4:20 AM to SQL Server 2005, query optimizer, SQL performance tuning
1909 Views | 0 Comments | Article Rating

Itzik Ben-Gan describes a method of analyzing trace files in his book "Inside Microsoft SQL Server 2005: T-SQL Querying" He provides a couple functions for stripping the literals from queries in the TextData column. If you do not use one of these functions or one of your own similar tools, you are not being very effective in your trace analyisis. The first one is from Microsoft PSS and can be downloaded here. The second one is a CLR function that uses regex. It performs much better. His book is worth it just for this function.

Here is the concept: Tuning single queries can be insane ineffective for multiple reasons. When facing a performance problem, you should be looking at query patterns that way to can get the biggest bang in the shortest amount of time.

Here is a simple trace analysis query using the PSS function.

-- Generate pattern and order by highest CPU

-- This is long running so you may want to dump it into a table.

--http://statisticsio.com/files/patterns/fn_tsqlsig.sql

SELECT dbo.fn_SQLSigTSQL(textdata, 4000)

, SUM(CPU)

, COUNT(*)

FROM dbo.trace_table

GROUP BY dbo.fn_SQLSigTSQL(textdata, 4000)

order by CPU DESC

 

Check out Itzak's book for more info on trace analysis. In this series on blog's, we will look at applying this concept to data in the DMV.

email it! |   |   | 

The problem with local variables

Posted by Jason Massie Click to IM Jason Massie on Friday, January 25, 2008 at 9:32 AM to SQL Server 2005, query optimizer, SQL performance tuning, Indexes
1651 Views | 0 Comments | Article Rating

Have you ever been writing a query and just cannot get it to use the right index. This could be one of the reasons why. Let's use this query with local variables as our example.

declare @Start datetime
declare @End datetime
select @Start = '2004-08-01 00:00:00.000'
select @End = '2004-07-28 00:00:00.000'
select ProductID from sales.SalesOrderDetail where ModifiedDate >= @End and ModifiedDate <= @Start
It generates this plan:

SQL Server Clustered index scan

but we have an index on ModifiedDate. There are many reason why SQL would not use this index but, for this post, we will assume we have eliminated them. Finally, we hard code the dates and we get this plan.

SQL Server index seek with bookmark lookup

So why is it doing this? The reason is because the query optimizer cannot accurately use the statistics to estimate how many rows are returned with local variables. Let's look at how we can tell there is a problem with the cardinality estimates. In the query with the local variables, the optimizer thinks we are getting 10918.5 rows so we do the index scan. In the query with hard coded literals, the estimated rows and actual rows are the same and accurate.

SQL Server Cardinality underestimation

How can we fix this? There are several way. This is another situation that makes a case for stored procedures or parameterized queries.

create proc pDemo01 @Start datetime, @End datetime
as
select ProductID from sales.SalesOrderDetail
where ModifiedDate >= @End and ModifiedDate <= @Start

The stored proc generates the proper plan. However, you will run into the same problem if you modify the parameter within the stored proc like select @start = @start-90. In this case, to should use sp_executeSQL. What if you cannot use a stored proc because it is a 3rd party app or some other reason? A covering index is probably the solution. Once we create this index, it will always be used:

create index ix01 on sales.SalesOrderDetail(ModifiedDate) include (ProductID)

We could use a plan guide or an index hint with a forceseek(SQL 2008) but performance will be really bad when we really do need to get 10k rows. The same problem can happen with stored proc's but that is another post.

To get deeper into this subject, check out this.

email it! |   |   | 

SQL Server 2008 Plan Guides from Cache

Posted by Jason Massie Click to IM Jason Massie on Friday, January 18, 2008 at 5:47 AM to SQL Server 2008, query optimizer, SQL performance tuning
1694 Views | 0 Comments | Article Rating

Uhoh... I can see some junior developers going crazy with this. One of the things that kept plan guides from being over used was the fact that they are kind of hard :) Well, Microsoft built their empire making hard stuff easy. They do it again with sp_create_plan_guide_from_cache.

Let's look at this BOL sample.

 

USE AdventureWorks;

GO

SELECT WorkOrderID, p.Name, OrderQty, DueDate

FROM Production.WorkOrder AS w

JOIN Production.Product AS p ON w.ProductID = p.ProductID

WHERE p.ProductSubcategoryID > 4

ORDER BY p.Name, DueDate;

GO

-- Inspect the query plan by using dynamic management views.

SELECT * FROM sys.dm_exec_query_stats AS qs

CROSS APPLY sys.dm_exec_sql_text(sql_handle)

CROSS APPLY sys.dm_exec_text_query_plan(qs.plan_handle, qs.statement_start_offset, qs.statement_end_offset) AS qp

WHERE text LIKE N'SELECT WorkOrderID, p.Name, OrderQty, DueDate%';

GO

-- Create a plan guide for the query by specifying the query plan in the plan cache.

DECLARE @plan_handle varbinary(64);

DECLARE @offset int;

SELECT @plan_handle = plan_handle, @offset = qs.statement_start_offset

FROM sys.dm_exec_query_stats AS qs

CROSS APPLY sys.dm_exec_sql_text(sql_handle) AS st

CROSS APPLY sys.dm_exec_text_query_plan(qs.plan_handle, qs.statement_start_offset, qs.statement_end_offset) AS qp

WHERE text LIKE N'SELECT WorkOrderID, p.Name, OrderQty, DueDate%';

 

EXECUTE sp_create_plan_guide_from_cache

    @name =  N'Guide1',

    @plan_handle = @plan_handle,

    @statement_start_offset = @offset;

GO

-- Verify that the plan guide is created.

SELECT * FROM sys.plan_guides

WHERE scope_batch LIKE N'SELECT WorkOrderID, p.Name, OrderQty, DueDate%';

GO

 

--Let's verify it actually worked.

--Click the xml link

--Save as a .sqlplan, reopen in SSMS and then hit f4

set statistics xml on

go

SELECT WorkOrderID, p.Name, OrderQty, DueDate

FROM Production.WorkOrder AS w

JOIN Production.Product AS p ON w.ProductID = p.ProductID

WHERE p.ProductSubcategoryID > 4

ORDER BY p.Name, DueDate;

GO

So when would you use this? I would say hardly ever hopefully. However, it could solve the really tough problems.

Let's say you have a 3rd party application that generates adhoc dynamic SQL. You cannot modify the code or schema. Index changes are not supported. Sometimes parameter sniffing causes unpredictable performance. Sound like a nightmare? Welcome to most CRM apps.

Other scenarios that come to mind are when best practices are not or cannot be followed. Let's say you just cannot update stats often enough with a large enough sample on a very very VERY large table to get a consistently optimal plan. Use a plan guide!

Here are some other times that the optimizer might have trouble and a plan guide may be a good option.

  • Use of local variables
  • Modifying stored proc parameters.
  • Ascending keys
  • Complex queries with table variables

There are usually better solutions than plan guides so save them for times that best practices are not an option. sp_create_plan_guide_from_cache makes using plan guides so much easier. Put it in your toolbox!

 

 

email it! |   |   | 

SQL Performance Tuning

Posted by Jason Massie Click to IM Jason Massie on Thursday, January 03, 2008 at 6:06 AM to SQL performance tuning
978 Views | 0 Comments | Article Rating

Wow, the holidays have been busy for me. I hope everybody had a great time and I want to wish everyone a great 2008.

I have been busy working on a two new services. It is highly specialized performance tuning. One specializes only on SQL 2005. It leverages the new features to get results very fast. The other focuses on individual query tuning on both 2000\2005.

http://sqlvelocity.com

http://tunemyquery.com

I think I am starting to get good with Dotnetnuke. That is scary. I would love to hear feedback on the site design. After looking at it for weeks while designing it, I might not have an objective opinion anymore. Thanks. 

email it! |   |   | 

Finding the cause of high CPU or high IO on SQL Server 2005

Posted by Jason Massie Click to IM Jason Massie on Friday, December 07, 2007 at 3:24 PM to SQL Server 2005, SQL performance tuning
1643 Views | 0 Comments | Article Rating

Note: For those of you on the RSS feed, this is a repost that I had to delete during a server move.

You can use this to get a quick and dirty picture of what is going on with the server. We basically grab a snapshot, wait a second and then get a diff. Sort as needed.

 

select r.cpu_time

                ,r.logical_reads

                , r.session_id

into #temp

from sys.dm_exec_requests as r

 

waitfor delay '00:00:01'

 

select substring(h.text, (r.statement_start_offset/2)+1 , ((case r.statement_end_offset when -1 then datalength(h.text)  else r.statement_end_offset end - r.statement_start_offset)/2) + 1) as text

                , r.cpu_time-t.cpu_time as CPUDiff

                , r.logical_reads-t.logical_reads as ReadDiff

                , r.wait_type, r.wait_time

                , r.last_wait_type

                , r.wait_resource

                , r.command

                , r.database_id

                , r.blocking_session_id

                , r.granted_query_memory,r.session_id

                , r.reads

                , r.writes

                , r.row_count

                , s.[host_name]

                , s.program_name

                , s.login_name

from sys.dm_exec_sessions as s inner join sys.dm_exec_requests as r

on s.session_id =r.session_id and s.last_request_start_time=r.start_time

left join #temp as t on t.session_id=s.session_id

CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) h

where is_user_process = 1

order by 3 desc

 

drop table #temp

 

email it! |   |   | 

Queries using a lot of memory using dm_exec_query_memory_grants and dm_exec_sql_text

Posted by Jason Massie Click to IM Jason Massie on Thursday, December 06, 2007 at 11:56 PM to SQL Server 2005, query optimizer, SQL performance tuning, memory bottleneck
2117 Views | 0 Comments | Article Rating

Here is query I relied on heavily while troubleshooting a customer with  resource_semaphore_query_compile wait types. It is useful in other low memory conditions.

select text, query_plan, requested_memory_kb, granted_memory_kb, used_memory_kb from sys.dm_exec_query_memory_grants MG
CROSS APPLY sys.dm_exec_sql_text(sql_handle)  t
CROSS APPLY sys.dm_exec_query_plan(MG.plan_handle)

High granted memory and used memory is what I looked at. Once you find the culprits, you can look through the plan for the usual suspects like hash joins. Just click on the xml link and save and a .sqlplan. Reopen it in SSMS.

email it! |   |   | 

The forceseek hint in SQL Server 2008

Posted by Jason Massie Click to IM Jason Massie on Thursday, December 06, 2007 at 10:38 PM to Windows Server 2008, query optimizer, SQL performance tuning, tsql, Indexes
1810 Views | 0 Comments | Article Rating

Have you every had a query that should totally be using an existing index and doing a seek but it keeps doing a clustered index scan? In SQL Server 2008, you can force it to do a seek. That may or may not be a good thing. Let's take a look at an AdventureWorks example.

--logical reads 1246

select ProductID, OrderQty from sales.SalesOrderDetail

where ProductID > 776

This query generates this plan.


Now let's add the hint.

--logical reads 271264

select ProductID, OrderQty from sales.SalesOrderDetail with (forceseek)

where ProductID > 776


This query generates this plan.

The second query plan contains the index seek but logical reads are 20 times more. This is because it has to do lookup for OrderQty. Obviously using the hint would be a bad thing.

So when would be a good time to use this hint? Let's look at this query
create proc p1
@p int
as
select c1, c2 from t1 where c3 = @p

t1 has 1 million rows
99% of c3 =1
.5% of c3 = 2
.5% of c3 = 3
99% of the queries run with @p = 2 or 3

The problem is if the query compiles with c3 = 1. The optimizer will choose a CI scan. In SQL 2005, we could recompile each time at the statement level or we could use an optimize for hint. Now we have another tool in the bag of tricks.

Use with caution.

email it! |   |   | 

Take it a step further with the unused and missing index DMV's

Posted by Jason Massie Click to IM Jason Massie on Sunday, November 25, 2007 at 10:33 AM to SQL Server 2005, query optimizer, SQL performance tuning, tsql, memory bottleneck, Indexes
4687 Views | 0 Comments | Article Rating

In SQL 2000 I wanted to find unused indexes but found it to be difficult. I had to capture a huge trace that fully represents the workload and “trust” that the ITW knows what it’s talking about. You could also use the scan started trace event filtered by dbid, objectid, indexid. This procedure was very tedious.

In SS2005, a quick query of the dynamic management views lets you know which indexes are not being used. You can do this in a few minutes what had previously taken days or weeks. This alone is a very powerful feature.

I am proposing that you take it a step further. We are going to use a little logic and the missing index DMV’s to combine indexes and remove indexes that are still used but redundant. The steps in this process would look like this:

1.       Remove unused indexes with the unused index script

2.       Get your list of tables to analyze.

3.       Remove redundant but used indexes.

4.       Revue missing index DMV’s for mistaken index drops.

5.       Combine indexes that where it is logical to do so.

6.       Revue missing index DMV’s for mistaken index drops.

This methodology is most effective and viable when these conditions are met:

·         The server has been online and thus collecting stats for a long time.

·         The server is not pushing a hardware bottleneck. If so, this should be done during maint window.

·         The server is enterprise edition and the tables allow online operations (i.e. no LOB data or partitions).

What are the benefits of removing unused indexes?

·         Reduced writes during updates

·         Reduced space usage

·         Reduced backup\restore space and time

·         Reduced index maintenance time

What are the benefits of removing unused indexes, used but redundant indexes and combining indexes?

·         Less memory footprint for the same amount of data

·         A different angle to index tuning

·         Indexes are more likely to be covering

·         Reduced writes during updates

·         Reduced space usage

·         Reduced backup\restore space and time

·         Reduced index maintenance time

Now, let’s get down to business…

 

Remove Unused Indexes

This is pretty straightforward and relatively safe. You want to make sure that your server has been up long enough to get good index usage. This includes infrequent operations like month end reporting, etc.

--Unused indexes

declare @dbid int 

select @dbid = db_id() 

 

select object_name(s.object_id) as ObjName

, i.name as IndName

, i.index_id 

, user_seeks + user_scans + user_lookups  as reads

, user_updates as writes 

, sum(p.rows) as rows

from sys.dm_db_index_usage_stats s join sys.indexes i on s.object_id = i.object_id  and i.index_id = s.index_id 

join sys.partitions p on s.object_id = p.object_id and p.index_id= s.index_id

where objectproperty(s.object_id,'IsUserTable') = 1 and s.index_id> 0 and s.database_id = @dbid 

group by object_name(s.object_id), i.name, i.index_id, user_seeks+ user_scans + user_lookups, user_updates

order by reads, writes desc 

 

You can most likely drop any indexes with zero or close to zero reads. The more rows, the more space you will reclaim. The more writes, the better write performance you get by removing them. If the index is not being read but writes are minimal and rows are low, there is little benefit of dropping it. Keep in mind that not having an index that you need is a lot worse than having an index you don’t need in most situations. I suggest starting off with a conservative approach.

Now that we have gotten rid of the low hanging fruit, we can get deeper. I get a list of tables by size and work my way down.

select object_name(object_id), max(rows)

from sys.partitions

group by object_name(object_id)

order by 2 desc

After running that query we see that tEvent is the largest table and should provide nice gains. This is where we start. Let’s use this schema for our example.

create table tEvent

(EventID int primary key clustered,

EventType int,

EventName varchar(100),

EventDetailID int,

CustomerID int,

CompanyID int,

DateOpen datetime,

DateClose datetime)

 

create index ind1 on tEvents(EventDetailID);

 

create index ind2 on tEvents(CustomerID);

 

create index ind3 on tEvents(EventDetailID, CustomerID, CompanyID);

 

create index ind4 on tEvents(CompanyID, DateClose, EventType);

 

create index ind5 on tEvents(CustomerID, DateClose, EventType);

Removing used but redundant indexes

If we do a sp_helpindex, we can see that ind3 should satisfy queries currently using ind1 and ind2. The keyword is “should”. It’s a fairly safe bet so we drop ind1 and ind2. Now we should monitor the missing index DMV to see if there is any negative impact. We should also see the read count substantially increase by using the unused index query filtered by tEvent. Here is the missing index monitoring query:

--Missing indexes

SELECT sys.objects.name

, (avg_total_user_cost * avg_user_impact) * (user_seeks + user_scans) as Impact

, mid.equality_columns

, mid.inequality_columns

, mid.included_columns

FROM sys.dm_db_missing_index_group_stats AS migs

INNER JOIN sys.dm_db_missing_index_groups AS mig ON migs.group_handle = mig.index_group_handle

INNER JOIN sys.dm_db_missing_index_details AS mid ON mig.index_handle = mid.index_handle

INNER JOIN sys.objects WITH (nolock) ON mid.object_id = sys.objects.object_id

WHERE     (migs.group_handleIN

(

SELECT     TOP (5000) group_handle

FROM sys.dm_db_missing_index_group_stats WITH (nolock)

ORDER BY (avg_total_user_cost * avg_user_impact) * (user_seeks + user_scans) DESC)

)

and objectproperty(sys.objects.object_id, 'isusertable')=1and name = 'tEvent'

ORDER BY 2 DESC , 3 desc

As long as the indexes do not come back up in this report, we should be ok. Now we are servicing the same queries with 2 less indexes. Those are index pages that are no longer taking up buffer pool space!

Combining indexes

The same concept can be applied to the following scenario but a little more “feel” and understanding of how the app accesses the data is needed.

create index ind4 on tEvents(CompanyID, DateClose, EventType);

 

create index ind5 on tEvents(CustomerID, DateClose, EventType);

We know a former DBA added these. It is logical for us, based on our knowledge of the app and the cardinality of the data, to try to replace these indexes with this one.

create index ind6 on tEvents(CustomerID, CompanyID,  DateClose, EventType) with (online=on, maxdop=8);

This index should satisfy all queries using both indexes. Again, we want go back to the missing index report to see if SQL thinks it needs one of those indexes.

Now work your way down your list of tables by row count. I was able to reduce the size of a 200GB database by 15% in addition to removing the unused indexes. That’s a big gain! The only problem I ran into was changing the name of an index that had a hint. There were a few indexes that I had to add back but since my approach was conservative it was nothing drastic and completely online.

Let me know if you have any other tips or questions regarding this topic.

email it! |   |   | 

SQL Server 2008 NOV CTP changes to sys.dm_exec_query_memory_grants

Posted by Jason Massie Click to IM Jason Massie on Wednesday, November 21, 2007 at 11:25 AM to SQL Server 2008, query optimizer, SQL performance tuning, memory bottleneck
1608 Views | 1 Comments | Article Rating

I got well aquainted with this DMV when I was wrestling with resource_semaphore and resource_semaphore_query_compile wait types pre-sp2.

It looks like there are a few changes to it. Most notably, hooks for the resource govenor and ideal_memory_kb. BOL defines ideal_memory_kb as "Size, in kilobytes (KB), of the memory grant to fit everything into physical memory. This is based on the cardinality estimate." I think this will be very useful when troubleshooting plan/stats problems.

email it! |   |   | 

Summarizing data in the DMV's Part 3

Posted by Jason Massie Click to IM Jason Massie on Sunday, December 31, 2006 at 9:53 AM to SQL Server 2005, SQL performance tuning, tsql, memory bottleneck
1660 Views | 0 Comments | Article Rating

Itzik Ben-Gan describes a method of analyzing trace files in his book "Inside Microsoft SQL Server 2005: T-SQL Querying" He provides a couple functions for stripping the literals from queries in the TextData column. If you do not use one of these functions or one of your own similar tools, you are not being very effective in your trace analyisis. The first one is from Microsoft PSS and can be downloaded here. The second one is a CLR function that uses regex. It performs much better. His book is worth it just for this function.

Here is the concept: Tuning single queries can be insane ineffective for multiple reasons. When facing a performance problem, you should be looking at query patterns that way to can get the biggest bang in the shortest amount of time.

Here is a simple trace analysis query using the PSS function.

-- Generate pattern and order by highest CPU

-- This is long running so you may want to dump it into a table.

--http://statisticsio.com/files/patterns/fn_tsqlsig.sql

SELECT dbo.fn_SQLSigTSQL(textdata, 4000)

, SUM(CPU)

, COUNT(*)

FROM dbo.trace_table

GROUP BY dbo.fn_SQLSigTSQL(textdata, 4000)

order by CPU DESC

 

Check out Itzak's book for more info on trace analysis. In this series on blog's, we will look at applying this concept to data in the DMV.

email it! |   |   | 

A tool - pExecStats

Posted by Jason Massie Click to IM Jason Massie on Sunday, December 31, 2006 at 9:03 AM to SQL Server 2005, query optimizer, SQL performance tuning, tsql
1483 Views | 0 Comments | Article Rating

Itzik Ben-Gan describes a method of analyzing trace files in his book "Inside Microsoft SQL Server 2005: T-SQL Querying" He provides a couple functions for stripping the literals from queries in the TextData column. If you do not use one of these functions or one of your own similar tools, you are not being very effective in your trace analyisis. The first one is from Microsoft PSS and can be downloaded here. The second one is a CLR function that uses regex. It performs much better. His book is worth it just for this function.

Here is the concept: Tuning single queries can be insane ineffective for multiple reasons. When facing a performance problem, you should be looking at query patterns that way to can get the biggest bang in the shortest amount of time.

Here is a simple trace analysis query using the PSS function.

-- Generate pattern and order by highest CPU

-- This is long running so you may want to dump it into a table.

--http://statisticsio.com/files/patterns/fn_tsqlsig.sql

SELECT dbo.fn_SQLSigTSQL(textdata, 4000)

, SUM(CPU)

, COUNT(*)

FROM dbo.trace_table

GROUP BY dbo.fn_SQLSigTSQL(textdata, 4000)

order by CPU DESC

 

Check out Itzak's book for more info on trace analysis. In this series on blog's, we will look at applying this concept to data in the DMV.

email it! |   |   | 

Summarizing data in the DMV's Part 2

Posted by Jason Massie Click to IM Jason Massie on Saturday, December 30, 2006 at 7:43 PM to SQL Server 2005, SQL performance tuning, tsql
2051 Views | 0 Comments | Article Rating

Itzik Ben-Gan describes a method of analyzing trace files in his book "Inside Microsoft SQL Server 2005: T-SQL Querying" He provides a couple functions for stripping the literals from queries in the TextData column. If you do not use one of these functions or one of your own similar tools, you are not being very effective in your trace analyisis. The first one is from Microsoft PSS and can be downloaded here. The second one is a CLR function that uses regex. It performs much better. His book is worth it just for this function.

Here is the concept: Tuning single queries can be insane ineffective for multiple reasons. When facing a performance problem, you should be looking at query patterns that way to can get the biggest bang in the shortest amount of time.

Here is a simple trace analysis query using the PSS function.

-- Generate pattern and order by highest CPU

-- This is long running so you may want to dump it into a table.

--http://statisticsio.com/files/patterns/fn_tsqlsig.sql

SELECT dbo.fn_SQLSigTSQL(textdata, 4000)

, SUM(CPU)

, COUNT(*)

FROM dbo.trace_table

GROUP BY dbo.fn_SQLSigTSQL(textdata, 4000)

order by CPU DESC

 

Check out Itzak's book for more info on trace analysis. In this series on blog's, we will look at applying this concept to data in the DMV.

email it! |   |   | 

Summarizing data in the DMV's Part 1

Posted by Jason Massie Click to IM Jason Massie on Saturday, December 30, 2006 at 10:52 AM to SQL Server 2005, SQL performance tuning
1219 Views | 2 Comments | Article Rating

Itzik Ben-Gan describes a method of analyzing trace files in his book "Inside Microsoft SQL Server 2005: T-SQL Querying" He provides a couple functions for stripping the literals from queries in the TextData column. If you do not use one of these functions or one of your own similar tools, you are not being very effective in your trace analyisis. The first one is from Microsoft PSS and can be downloaded here. The second one is a CLR function that uses regex. It performs much better. His book is worth it just for this function.

Here is the concept: Tuning single queries can be insane ineffective for multiple reasons. When facing a performance problem, you should be looking at query patterns that way to can get the biggest bang in the shortest amount of time.

Here is a simple trace analysis query using the PSS function.

-- Generate pattern and order by highest CPU

-- This is long running so you may want to dump it into a table.

--http://statisticsio.com/files/patterns/fn_tsqlsig.sql

SELECT dbo.fn_SQLSigTSQL(textdata, 4000)

, SUM(CPU)

, COUNT(*)

FROM dbo.trace_table

GROUP BY dbo.fn_SQLSigTSQL(textdata, 4000)

order by CPU DESC

 

Check out Itzak's book for more info on trace analysis. In this series on blog's, we will look at applying this concept to data in the DMV.

email it! |   |   | 

Page 1 of 1First   Previous   Next   Last   


Copyright 2006 by Statistics IO, My SQL Server Blog