Tuesday, May 28, 2013

Hmm... Users Experience the Average SQL Elapsed Time

You're joking. Right?


Oracle Database performance tuning and analysis has come a long way in the last 20 years. First there was the “just add more resources” approach and tuning the blatantly poor SQL. Then there was ratio analysis, followed by wait event analysis, time based analysis, and unit of work time based analysis. In addition to the performance diagnosis and analysis evolution, the Oracle Database as a product has changed, and architectures are more diverse. Yet with all this change, some things are in many ways timeless. They relate to complexity, basic mathematical statistics, efficiency, and doctrinal purity. Over the next few weeks, I'll post four different "myths." Last week I posted about increased complexity. This posting centers on a user's experience.

Myth #2

The second myth is Users experience the average SQL elapsed time. Concurrency, multiple execution plans, different bind variables, and the current cache situation make the average elapsed time less relevant—and perhaps even misleading. Let me explain.

If I tell someone the average elapsed time for their key SQL statement is 10 seconds, 99.9% of the time they will picture in their minds a bell curve. They will think to themselves, “OK. This means that usually the statement runs for about 10 seconds, sometimes less and sometimes more.” Unless verified, my research clearly shows that there is likely to be a significant difference between the average and the typical elapsed time(s). This means we are missetting expectations and flat-out misleading users. Not the place we want to be!

It is very simple to calculate the average SQL statement elapsed time. Even a Statspack report will show you the total elapsed time and the total number of executions over the report interval. The average elapsed time is simply the total elapsed time divided by the total number of executions. So it’s very enticing to make a quick statement about the elapsed time using the average.

Now suppose this SQL statement has two execution plans: one typically completes in 2 seconds and the other completes in 18 seconds. Now also assume that they both get executed the same number of times (don’t count on this in your environment!). The average elapsed time would then be 10 seconds. Now I picture myself telling my client that I had discovered the key SQL statement and its average elapsed time is 10 seconds. The “2 seconds” group would think I’m making the situation look worse than it is so I can claim an amazing performance improvement. The “12 seconds” group would think I’m simply an idiot. Either way, I lose. So I needed a way to find out the truth.

What I learned through my research is that if the statement is important to the users and I care about not misleading them, I need to collect some data, calculate the statistics, and create a histogram. I even blogged about the problem here and and provided solutions here! My research clearly showed three reliable ways to collect elapsed times: Oracle tracing by SQL_ID, instrumenting application code, and sampling running SQL. I have created a low overhead tool called “SQL Sampler” to sample SQL statement elapsed times. You can download it for free from my website here. The moral of this story is Do not expect users to experience average elapsed times, and if appropriate, provide details about the typical elapsed times.

Thanks for reading!

Craig.

If you enjoy my blog, I suspect you'll get a lot out of my courses; Oracle Performance Firefighting,  Advanced Oracle Performance Analysis, and my One-Day Oracle Performance Research Seminar. I teach these classes around the world multiple times each year. For the latest schedule, go to www.orapub.com . I also offer on-site training and consulting services.

P.S. If you want me to respond to a comment or you have a question, please feel free to email me directly at craig@orapub .com. Another option is to send an email to OraPub's general email address, which is currently orapub.general@comcast .net.



Monday, May 20, 2013

Hmm... Better Tools Are the Answer to Increased Complexity?

You're joking. Right?

Oracle Database performance tuning and analysis has come a long way in the last 20 years. First there was the “just add more resources” approach and tuning the blatantly poor SQL. Then there was ratio analysis, followed by wait event analysis, time based analysis, and unit of work time based analysis. In addition to the performance diagnosis and analysis evolution, the Oracle Database as a product has changed, and architectures are more diverse. Yet with all this change, some things are in many ways timeless. They relate to complexity, basic mathematical statistics, efficiency, and doctrinal purity. Over the next few weeks, I'll post four different "myths". Here is the first.

Myth #1

The first myth is Better tools are the solution to increasing architecture complexity. I was attending an Oracle Corporation product demonstration a few years ago, and the presenter said something like, “Architectures are increasing in complexity. The solution is better tools.” I looked around and everyone was agreeing, like in the Apple “1984” commercial, “Yes ... Complexity is progress ... We need more and better tools ... Who will bring us these new and better tools?”

I was thinking to myself, how about we focus on reducing the complexity?! Am I alone in thinking that with each increase in complexity, there is an increase in potential problems, which means an increase in risk? Contrary to popular opinion, Oracle DBAs don't typically enjoying a 3am login to fix a down production system.

A few years ago I did some consulting for a very large and well-known ecommerce company. I was amazed at the lengths they went to keep complexity and risk low, uptime high, and performance consistently good. In addition, they architected their systems so the workload could be easily and quickly partitioned. By keeping the complexity low, they were able to manage performance more simply and adjust more quickly. Their transaction throughput levels and on-line brand presence led them to the path of minimizing architectural complexity, resulting in an amazing uptime.

While advanced tools are fantastic (see: stori.orapub.com ), the true answer is to start with simplifying the underlying architecture.

Thanks for reading!

Craig.


If you enjoy my blog, I suspect you'll get a lot out of my courses; Oracle Performance Firefighting,  Advanced Oracle Performance Analysis, and my One-Day Oracle Performance Research Seminar. I teach these classes around the world multiple times each year. For the latest schedule, go to www.orapub.com . I also offer on-site training and consulting services.

P.S. If you want me to respond to a comment or you have a question, please feel free to email me directly at craig@orapub .com. Another option is to send an email to OraPub's general email address, which is currently orapub.general@comcast .net.