BorderLayoutBoxedLayoutOpenLayoutMaximum textMedium textSmall text


Register
Saturday, July 04, 2009
 
MyStreamMinimize
Print  

The Perils of SQL Profiler

Posted by Jason on Monday, July 14, 2008 to SQL Server 2008, SQL Server 2005, Procedure Cache, DMVs, Captain Varchar(MAX)
1546 Views | 8 Comments | Article Rating

profiler I posted the first Captain VarChar(MAX) comic last Friday. It takes a swing at profiler. A lot of you are like “Duh!”. I have just done too many dba interviews where the candidate’s default answer is profiler. The interview usually gets tough or really short at that point. Just say no to SQL profiler.

I really have to stretch my imagination to come up with a situation where using SQL Profiler sounds like a good idea. For example, you quickly want to capture “ordered” events when repro’ing a problem in your *test* environment AND you filter it by your SPID. Any other time, you are wasting lots of time, cause performances overhead and probably missing what you are looking for while digging for the needle in the haystack. Also contrary to popular belief, running the GUI on another machine DOES NOT help much. For the rest of the post when I refer to a trace, I am referring to one captured with a server side trace unless I say “profiler” or “GUI”

I will admit that I have brought a production app to its knees with profiler. Picture this. You get a call while you are running profiler that the app is having issues. At first you think, “Hey, I have a trace running so at least I will be able to pinpoint the problem.” Then you see TRACE_WRITE wait types. Your stomach knots up because you realize that you just caused a production outage. Don’t be that guy(or gal).

SQL 2005+ just gives you too much info through the DMV’s. 90%+ of the time you can use the DMV’s instead of Profiler. The other 10% of the time, use a server side trace not the Profiler GUI. I have never seen a server side trace bring down an app. Even unfiltered traces on highly transactional systems where a gigabyte+ is collected every minute. I can see it causing an issue but the server would have to be pretty max’ed out to begin with.

Lastly, a trace only reports CPU, Reads, Duration etc when the query completes. Query completion for a misbehaving query could take 10x to 100x longer than normal. This makes traces unusable in a lot of cases when troubleshooting.

Let’s look at some scenario’s:

Troubleshooting a performance issue that is currently happening

A SQL trace in this situation is almost never needed on SQL 2005+. I troubleshoot like this:

1) See if\what hardware subsystem is pegged out with about 5 or 6 perfmon counters.

a) Verify with DMV data or wait stats if needed.

2) Identify the problematic queries with the sys.dm_exec_% DMV's

3) Fix.

Here is the query I use. In a nutshell, it grabs a snapshot, waits a second and does a diff so I can see what is currently slamming the server. Otherwise, the cumulative CPU and IO columns might throw me off. I also grab the query plan which is easy here but very expensive in a trace and hard to correlate with the SQL completed event.

select r.cpu_time 
, r.logical_reads
, r.session_id 
into #temp
from sys.dm_exec_sessions as s 
inner join sys.dm_exec_requests as r 
on s.session_id =r.session_id --and s.last_request_start_time=r.start_time
where is_user_process = 1 
 
waitfor delay '00:00:01'
 
select 

substring(h.text, (r.statement_start_offset/2)+1 ,

((case r.statement_end_offset when -1 then datalength(h.text)

else r.statement_end_offset end - r.statement_start_offset)/2) + 1) as text
, r.cpu_time-t.cpu_time as CPUDiff 
, r.logical_reads-t.logical_reads as ReadDiff
, p.query_plan
, r.wait_type
, r.wait_time
, r.last_wait_type
, r.wait_resource
, r.command
, r.database_id
, r.blocking_session_id
, r.granted_query_memory
, r.session_id
, r.reads
, r.writes
, r.row_count
, s.[host_name]
, s.program_name
, s.login_name
from sys.dm_exec_sessions as s 
inner join sys.dm_exec_requests as r 
on s.session_id =r.session_id and s.last_request_start_time=r.start_time
full outer join #temp as t on t.session_id=s.session_id
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) h
cross apply sys.dm_exec_query_plan(r.plan_handle) p
order by 3 desc
 
drop table #temp

 

Proactive performance tuning

Now this starts to get in to fuzzy territory. Technically, a SQL trace is more thorough than doing your analysis off of the procedure cache. However, I would say benefits outweigh the losses in most but not all cases.

Procedure Cache Pro’s

· No need to capture a trace

· No need to load a trace

· Aggregation already exists so the data can be smaller assuming that the application has good plan reuse.

· Easy access to XML Showplan

Procedure Cache Con’s

· Queries can be flushed until next execution for many reasons

· Some queries are not cached. These are usually queries that you don’t care about.

To replace SQL trace analysis with a DMV query, use something like this:

select total_worker_time/execution_count as AvgCPU  
, total_elapsed_time/execution_count as AvgDuration  
, (total_logical_reads+total_physical_reads)/execution_count as AvgReads
, execution_count   
, substring(st.text, (qs.statement_start_offset/2)+1 , 
((case qs.statement_end_offset when -1 then datalength(st.text) 
else qs.statement_end_offset end - qs.statement_start_offset)/2) + 1) as txt  
, query_planfrom sys.dm_exec_query_stats as qscross apply sys.dm_exec_sql_text(qs.sql_handle) as stcross apply sys.dm_exec_query_plan (qs.plan_handle) as qp
--where – filter as needed.
--group by – Consider striping literals and grouping by the SQL text. 
--See Inside SQL Server 2005 for more info.
--Order by – Sort as needed.

Uncommon cases

There are some less common uses of SQL traces like the blocked process report, deadlock report, recompile tracing and many others. Make sure you review all of your options and don’t forget about the WMI Provider for Server Events. Use your best judgment and avoid the GUI like the plague.

Conclusion

To summarize very briefly: Never use the SQL Profiler GUI, know your tools, and pick the best one for the job.

email it! |   |   |   |  | 
Permalink     8 Comments  

Rate this Post:
COMMENTS:

Amen. Junior DBAs - you'll hear people (even MS) telling you that a profile will only add 5-10% load to a server, max. In theory, that's true, but any number of things can make that amount skyrocket. A great example is storing the trace data in a database across the network to a slow server with slow drives. The profile won't add much load to the profiled SQL Server, but if the trace table target can't keep up, then the profiled server will slow to a crawl while it waits for the data to hit the target.

posted @ Monday, July 14, 2008 3:23 PM by Brent Ozar


There are cases where it makes a lot of sense. One such case is troubleshooting connectivity issues, especially when you've got a lot of moving parts like firewall ACLs, IPSEC policies on the servers, etc. In that case, running profiler and seeing the connection come in verifies a lot. Also, it helps troubleshoot rare cases like when you know the credentials are correct but for some reason you're getting a connection failed. Default database not existing and things like that are easily seen in those cases where they may not show up using other techniques.

posted @ Monday, July 14, 2008 3:27 PM by K. Brian Kelley


If you're running sever side traces with judicious filtering, performance impact should be small.

I'd never use the client, except to view a *.trc file. I'd also always use a trc file instead of writing to a table.

Here's a script to throw a bunch of trc files into a table written by Linchi Shea. Don't recall if I tested it on 2005. Think I did...

http://chuckboyce.blogspot.com/2007/03/how-to-create-trace-table-from-multiple.html

posted @ Monday, July 14, 2008 3:53 PM by Chuck Boyce


Good point on the connection\security related use of profiler although it could be argued that your should still use a server side trace and then read with fn_trace_gettable().

posted @ Monday, July 14, 2008 4:17 PM by Jason


I'm an unapologetic SQL Server Profiler user. I am fully aware of it's drawbacks and it's those drawbacks that I avoid like the plague. And I agree that there are more than a few of them.

You've covered the drawbacks of SQL Server, I've found myself facing every one of those drawbacks: I've written my own "Current wait stats script" which is almost identical to yours. (http://dbwhisperer.blogspot.com/2008/04/what-are-you-waiting-for.html) and I've seen memory consumption needed to remember the thousands of trace rows skyrocket (luckily not in production).

But what I do use sql profiler for is for those blocked process reports like you mention. They're pretty lightweight traces. I also use it heavily during development. Profiler isn't perfect, but it has its place.

To be clear, I don't use profiler as a tool for emergency performance troubleshooting. That's like using a hammer to drive a screw. And so I would agree with your dark cloud comic (BTW, keep those comics coming!!)

posted @ Tuesday, July 15, 2008 7:10 AM by Michael Swart


Generally speaking, yes -- a server-side trace is preferable to Profiler. It has a lower impact on system resources (but certainly not no impact). But that said, an improperly filtered server-side trace is MUCH WORSE than an improperly filtered Profiler session. The rowset provider used by Profiler will start dropping events if it can't keep up. The file provider, on the other hand, has a guarantee of capturing ALL events that come through. It does this by queuing incoming requests if it gets backed up. While Profiler can cause things to slow down, it won't cause things to actually stop, as can a server-side trace. So either way, proceed with caution.

posted @ Tuesday, July 15, 2008 9:32 AM by Adam Machanic


Yah but what are you going to filter? What if you have a proc that executes a 100 of times a second but only takes a fraction of a second. However, the total CPU usage of all of the executions accounts for most of the CPU usage on the system. If you filter, you could miss this. Luckily, you can identify these with sys.dm_exec_query_stats most of the time.

posted @ Tuesday, July 15, 2008 10:03 AM by Jason


In an extremely active system like one you describe, if you want to take a general picture using a trace, the key is not filtering but rather limiting trace duration or trace file size. If a system is doing thousands of batch requests per second, I would tend to switch to a sampling methodology and do numerous very brief traces over the course of a day, then merge them to get a more complete picture. If you keep your trace brief, there is no chance of backing the system up, no matter how unfiltered it is... Of course, this all depends on what you're trying to accomplish by tracing. The trickiest scenario I've encountered was trying to capture (and, hopefully, debug) an error that was occurring occasionally and unpredictably, and which the application was not properly logging the source of. In that case filtering is out, and short traces are problematic as you might miss the error...

http://sqlblog.com

posted @ Tuesday, July 15, 2008 10:25 AM by Adam Machanic


Name (required)

Email (required)

Website


Simple BBCode can be used like [url=http://example.com]Example[/url] and [B]

Copyright 2006 by Statistics IO, My SQL Server Blog