Thursday, May 31, 2012

Row versus Set Processing, Surprise!

You're row processing on the Java client? Surprise!


Two weeks ago I joined a good friend on-site at one of his clients. Upon arriving, we were met by a discouraged developer who wanted to know why his report was suddenly running so slowly when it had previously run just fine. Surely it was the database, right?

Long story short, the report sucked data from a couple of database tables into a Java client front end, placed it into memory structures, and then the Java code did all sorts of (I'm sure) amazing sequential row/record processing. This worked great with smaller quantities of data, but as the developer confessed, this past run was processing a larger volume. The database processing took only a few seconds, while the Java portion took much longer. We're not sure just how much longer, since the report never finished!

Once the database versus Java processing times were explained and, we think, understood, the developer was still visibly unsettled. He had been taught that the database was there simply to retrieve, insert, or change mass amounts of data, and that all the complex business rule processing should be done in client-side Java code. It was obvious to both my friend and me that the benefits of set processing compared to row processing were not something the developer had thought about, or perhaps even understood.

We gently suggested that perhaps he could do some of the processing in the database, using the benefits of set processing. After a short discussion it was clear that, to him, his solution was the way it "should be done," and that meant all the business processing belonged in the Java client.

There are two things he just didn't get. First, set processing rocks! And second, row processing is not likely to scale as data volumes increase. He was running headlong into both of these truths.

While all Oracle DBAs have at one time or another been told about the benefits of set processing and the scalability issues with row processing, no one has ever pointed me to a real test. And I have never personally created a test to prove (at least in a test situation) the benefits of set processing.

This posting presents the results of a simple experiment comparing row to set processing, so I could see for myself:

1. Does set processing truly bring increased performance, in terms of increased throughput?

2. Does set processing scale better than row processing, in relation to increased data processing volume?

If you want to know what happened... read on! (You can also skip to the Conclusion section below.)

Experimental Design


I wanted to keep the experimental design simple and give row processing as much advantage as possible.

You can view the complete analysis script here; below I pull out just the key snippets needed for this posting.
You can view the raw experimental data results from the op_results table here.

The processing is pretty straightforward. I'm inserting rows into the op_interim table containing the number and amount of orders for each customer who has a status of '1'.
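For reference, the table definitions aren't reproduced in this posting (they live in the complete analysis script). A minimal sketch consistent with the statements below might look like this; the column names and data types are my guesses, not necessarily the actual experiment's DDL:

create table op_customer (
  id     number,
  status varchar2(1),
  name   varchar2(30)
);

create table op_order (
  id      number,
  cust_id number,   -- joins to op_customer.id
  amount  number
);

create table op_interim (
  id               number,
  status           varchar2(1),
  name             varchar2(30),
  order_count      number,
  order_sum_amount number
);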

Below is the set processing query that runs on the database server.

insert into op_interim
  select c.id, c.status, c.name, count(o.amount), sum(o.amount)
  from   op_customer c,
         op_order o
  where  c.status = '1'
    and  c.id = o.cust_id (+)
  group by c.id, c.status, c.name;
commit;

Below is the row processing code, written in simple plsql. The plsql code was run on the database server, not from a separate machine.
order_sum_check_v := 0;

-- outer loop: fetch customers one row at a time
open cust_cur for
  select id, status, name
  from   op_customer
  order by id;
loop
  fetch cust_cur into
    cust_cur_id_v, cust_cur_status_v, cust_cur_name_v ;
  exit when cust_cur%NOTFOUND;

  order_count_v      := 0 ;
  order_sum_amount_v := 0 ;

  -- inner loop: scan every order row for each customer
  open order_cur for
    select id, cust_id, amount
    from   op_order
    order by cust_id, id;
  loop
    fetch order_cur into
      order_cur_id_v, order_cur_cust_id_v, order_cur_amount_v ;
    exit when order_cur%NOTFOUND;

    if (cust_cur_status_v = '1') AND (cust_cur_id_v = order_cur_cust_id_v) then
      order_count_v      := order_count_v + 1 ;
      order_sum_amount_v := order_sum_amount_v + order_cur_amount_v ;
      order_sum_check_v  := order_sum_check_v + order_cur_amount_v ;
    end if;
  end loop;
  close order_cur;

  insert into op_interim values
  (cust_cur_id_v, cust_cur_status_v, cust_cur_name_v, order_count_v,
     order_sum_amount_v);
end loop;
close cust_cur;
commit;

A big benefit handed to the row processing is that I pull the data from Oracle in sorted order. This makes the row-by-row nested loop easy to code and "fast." To ensure the row and set processing did equivalent work from a business perspective, I kept a counter, order_sum_check_v, so I knew the total order amount processed. To my relief, comparing set to row processing, the counter always matched perfectly.
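If you want to verify the equivalence yourself, one simple (hedged) check is to total the inserted amounts after each run and compare them. Note the column name below is my guess based on the insert statements, not necessarily what the actual analysis script uses:

select sum(order_sum_amount) from op_interim;

If the set based run and the row based run report the same total, the business results match.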

Just comparing the two snippets of code above makes me run towards set processing. The chance of buggy code is obviously far greater in the row processing. (...reminds me a lot of my first "MIS" class on COBOL programming! Run away!!!)

For my experiment I varied the number of rows processed from 100K to 1M in 10 steps: 100K, 200K, ..., 1000K. I took 12 samples for both row and set processing at each level. I would have liked to take more samples, but the row processing took a long time; the entire experiment ran for around 40 hours.

The detailed results were stored in the op_results table. I wrote a couple of simple queries to make analyzing the data in Mathematica easier. I ran the test on (cat /proc/version): Linux version 2.6.32-300.3.1.el6uek.x86_64 (mockbuild@ca-build44.us.oracle.com) (gcc version 4.4.4 20100726 (Red Hat 4.4.4-13) (GCC) ) #1 SMP Fri Dec 9 18:57:35 EST 2011. The Dell Precision T5500 box has 6 CPU cores, 24GB of RAM, and runs Oracle 11.2.0.1.0.

Experimental Results


The experiment was designed to answer these two questions:

1. Does set processing truly bring increased performance, in terms of increased throughput?

2. Does set processing scale better than row processing, in relation to increased data processing volume?

Let's look at each of these in more detail (not that much detail though).

You can download the raw experimental data here.
You can download the throughput and scalability details in PDF or native Mathematica.
You can download the 100K row vs set processing significance test details in either PDF or native Mathematica.

1. Does set processing truly bring increased performance?

Take a quick look at the below chart. Each point represents 12 samples for a set number of rows to process. The red square points are the set processing results and the round blue points are the row processing results.

The chart above clearly shows set processing operating at a much higher throughput than row processing. In fact, with 100K rows to process, set processing throughput was 3741 times greater than row processing throughput (1526476 rows/sec vs 408 rows/sec).

Our experiment showed from 100K to 1M rows processed, set based processing's throughput was much higher compared to row based processing.

2. Does set processing scale better than row processing, in relation to increased data processing volume?

Have you ever heard someone say, "I can't understand why it's taking so much longer now. It took 10 minutes to process 10K rows, but with 50K rows it still isn't finished after four hours!"? What that person doesn't understand is application scalability (among other things).

The misconception is believing that if 10K rows are processed in 10 minutes, then 50K rows will be processed in 50 minutes, and so on. Set processing is much more likely to scale linearly than row processing because it operates on groups, or sets, of rows. In contrast, row processing just plows through the data one row at a time, missing out on the natural grouping of rows that set processing exploits.
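To make the scalability danger concrete, here is a purely illustrative bit of arithmetic (a hedged model, not measured data): suppose the row-by-row code must touch every order row for every customer row, as in a nested loop. The work then grows roughly with the square of the data volume:

10K rows : 10 minutes
50K rows : (50K / 10K)^2 x 10 minutes = 25 x 10 minutes = 250 minutes

A 5X increase in data volume can easily mean a 25X increase in run time, which is exactly the kind of surprise our discouraged developer ran into.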

The above chart shows the time required to process a given number of rows. As in the first chart, the row processing points are the blue circles and the set processing points are the red squares. The set based results are difficult to see because all the points lie virtually on the x-axis! So obviously, set based processing is faster and stays faster as the number of rows increases. Don't ever let someone convince you row based processing is faster than set based processing unless they can clearly demonstrate it using your real data.

For the statistically minded: the time difference is smallest with the fewest rows processed, that is, 100K. Because I'm plotting averages, we can't visually tell for sure whether there is a statistically significant difference between row and set processing. For example, the variance could be massive, in which case there might be no real difference. So I performed a statistical significance test (download link at the top of the Experimental Results section) using the 12 elapsed time samples from both the row and the set based processing. The p-value is 0.00000 (really), indicating it is highly unlikely the two sample sets came from the same population. In other words, they are different. OK... no surprise here.

The contrast between set and row processing scalability is striking, both visually and numerically. Looking at the above graph, you can see the difference between row and set processing grows as the number of rows to process increases. Numerically, fitting a linear trend line, the slope for the row based processing is 0.00259 and the slope for the set processing data is 0.00000. (For set processing, the slope is actually 6.17x10^-7.)
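To put those slopes in perspective, treat each one as seconds of elapsed time per row in a simple linear model (my back-of-the-envelope reading of the trend lines; intercepts ignored):

row processing : 0.00259 sec/row x 1,000,000 rows = 2,590 sec, about 43 minutes
set processing : 0.000000617 sec/row x 1,000,000 rows = 0.617 sec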

This means that as the number of rows to process increases, the elapsed time difference between row and set processing continues to grow. So not only is row based processing slower, it falls further and further behind as the amount of work increases.

Conclusions


If you're a DBA you already knew in your gut what the conclusions would be. But now for both of us, we can stand in front of people and point to experimental evidence clearly showing:

1. Set based processing will likely be much faster than row based processing. Our experiment of processing 100K rows showed row based processing was 3700 times slower than set based processing. Not twice as slow, or even 10 times slower... 3700 times slower!

2. Row based processing times degrade much more quickly than set based processing times. That is, row based processing does not scale nearly as well as set based processing. We saw this in the linear trend line slopes: 0.00259 for row based processing compared to 0.00000 (actually 6.17x10^-7) for set based processing.

Thanks for reading!


Craig.


If you enjoyed this blog entry, I suspect you'll get a lot out of my courses; Oracle Performance Firefighting and Advanced Oracle Performance Analysis. I teach these classes around the world multiple times each year. I also occasionally teach my Oracle Forecasting and Predictive Analysis class and also a one day Performance Research Seminar. For the latest schedule, click here. I also offer on-site training and consulting services.

P.S. If you want me to respond with a comment or you have a question, please feel free to email me directly at craig @orapub .com.



Tuesday, May 8, 2012

Creating realistic think times for research projects

Creating sensible think times is kind of tricky

My job requires that I spend a tremendous amount of time doing performance analysis research. Over the years I have developed a list of things I do to help ensure my research is up to the level I need for a specific experiment. One of the more interesting items on my list is creating a more realistic workload by using sleep times (i.e., think times) that are not constant. When the sleep times are all the same, the system behaves with an unrealistic constancy. On real systems, even a CPU bottlenecked system exhibits an ebb and flow.

Here's an example of what I'm referring to. Let's say we need to create a logical IO (buffer gets) intensive workload. One way to do this is:
  • Create a table with 1M rows.
  • Create a big enough buffer cache to ensure all the rows will always be cached.
  • Create a query that SELECTs from this table.
  • Vary the workload intensity by either altering the query itself, increasing the number of sessions executing the query, or by decreasing the "think time" between queries.
All OLTP systems contain think time because there are real users pushing work into the system...and they have to occasionally think. (key word: occasionally) If we want a more realistic behaving system, we need think times to be more than just a constant value.

How you can create more realistic think times is what this blog is all about.

The trick is, how do we model this think time?

You can download all the plsql mentioned below in this single text file here. If you need a quick refresher regarding histograms, read the first part of this blog post.

Constant think time.

An easy option is to simply use a constant time, either by calling dbms_lock.sleep(x) or by creating the values using dbms_random.value(x,x). The problem with this is that the resulting workload is not very realistic, because users don't exhibit constant think times.

When you observe the load using a tool like vmstat, while there can be utilization ups and downs, you'll notice the utilization stays unrealistically constant; real systems ebb and flow. Including multiple SQL statements can make for more variable utilization, but it will likely be erratic rather than the smooth ebb and flow that tends to occur on a real system. And while some of a real system's ebb and flow comes from the workload mix changing, some of it comes from the fact that user think times are not constant, plus a variety of other reasons.

A more formal way to describe this constant type of think time is each sample think time is pulled from a uniform distribution with the same minimum and maximum values. Here's the plsql to create the sample data:
--
-- Code for UNIFORM sample values with min and max of 1.0.
--
declare
  max_samples  number := 1000;
  sample_value number;
  i            number;
begin
  dbms_random.seed(13);
  for i in 1..max_samples
  loop
    sample_value := dbms_random.value(1.0, 1.0) ;
    dbms_output.put_line(round(sample_value,4));
  end loop;
end;
/
A histogram was not created because all 1000 samples are exactly the same. Picture the Washington Monument and you have a pretty good picture of what it will look like.

Uniform think time.

The next step for many of us is to provide a range of think times. We think, "Users will think between 0.5 and 1.5 seconds." This can be implemented simply by using dbms_random.value(0.50, 1.50). We can create uniformly distributed samples between 0.50 and 1.50 seconds using the code below.
--
-- Code for UNIFORM sample values with min of 0.50 and max of 1.5.
--
declare
  max_samples  number := 1000;
  sample_value number;
  i            number;
begin
  dbms_random.seed(13);
  for i in 1..max_samples
  loop
    sample_value := dbms_random.value(0.50, 1.5) ;
    dbms_output.put_line(round(sample_value,4));
  end loop;
end;
/
Looking at the histogram above, is this what you expected? Probably not. Surely this can't be production-system realistic. While the observed OS utilization will not be as constant as with the "constant think time," the ebb and flow is still far from real and will likely be spiky, with abrupt increases and decreases.

Normal think time.

Referring to the histogram above, I suspect most readers were expecting to see what is called "normal think time." The rationale follows this line of thinking: "Users will think an average of one second, sometimes more and sometimes less." The hidden assumption is that the more and the less are equally likely.

Oracle provides the dbms_random.normal function that returns samples with an average of 0 and standard deviation of 1. This is called a standard normal distribution. The code and histogram are shown directly below.
--
-- Code for NORMAL sample values with average of 0.0 and stdev of 1.0.
--
--
declare
  max_samples  number := 1000;
  sample_value number;
  i            number;
begin
  dbms_random.seed(13);
  for i in 1..max_samples
  loop
    sample_value := dbms_random.normal;
    dbms_output.put_line(round(sample_value,4)||',');
  end loop;
end;
/
Important: But this is not what we want! We need a normal distribution with an average of 1.0 and a standard deviation of 0.5. We can easily shift the average by adding to (or subtracting from) the resulting sample value. But what about altering the dispersion, that is, the variance and standard deviation? That's very simple in Mathematica or R, because the standard deviation is one of the inputs to their normal distribution functions. Not so with dbms_random: there, altering the dispersion is not obvious. Surprisingly, searching Google, I could find no references!

So I started digging... and figured it out. Not being a statistician, for me this was a discovery. I recalled that we can standardize values, which results in the infamous z value, using the following formula:

z = ( x - m ) / s

Where:
z is the standardized sample value
x is the unstandardized sample value
m is the sample set average
s is the sample set standard deviation

In our situation, we are trying to determine the un-standardized value (x) while dbms_random.normal provides us with a standardized value (z) with a default average (m) of zero, and a default standard deviation (s) of 1. So I figured I would simply set z, m, and s and then solve for x... the unstandardized value!

With a little algebra, solving for x becomes:

x = z * s + m

This is coded in the plsql below followed by the histogram.
declare
  max_samples  number := 1000;
  sample_value number;
  i            number;
  z            number;          -- standardized value
  m            number  := 1.0 ; -- average
  s            number  := 0.5 ; -- standard deviation
  x            number;          -- un-standardized value
begin
  dbms_random.seed(13);
  for i in 1..max_samples
  loop
    z := dbms_random.normal;
    x := z * s + m ;
    dbms_output.put_line(round(x,4)||',');
  end loop;
end;
/
And it worked! The figure above displays an average of 0.991 and a standard deviation of 0.499! I tried the above code snippet with a variety of averages and standard deviations, and it worked every time. Wow, that is so cool!

But we are not finished, because user think time is not normally distributed, that is, balanced. In fact, it's very unbalanced, that is, skewed. For sure the CPU utilization ebb and flow will occur, but we are not likely to get the dramatic yet infrequent workload bursts that all OLTP systems exhibit (a result of the transaction arrival rate pattern).

Log Normal think time.

Log normal sample values are always greater than zero, are infrequently less than the typical values (think: median), definitely cluster around those typical values, and, strangely, crazy large values do occur, but very infrequently! Sound like Oracle performance? An example log normal histogram looks like this:
I've done quite a bit of research on this topic. Transaction arrival rates and transaction service times are supposed to be exponentially distributed, yet my experiments and production system data beg to differ. I blogged about this in reference to transaction arrival rates here, SQL statement elapsed times here (and here and here), and wait event times here. While none of my sample sets were truly statistically log normal, they were kinda close and sure looked log normal in many cases.

Once we have a normal distribution, it's easy to convert it to a log normal distribution. I discuss this in some detail in my posting, Important Statistical Distribution..really. But to summarize: simply take the normal sample value and apply the exponential function to it. Really, it's that simple.

Here's the sample plsql code, followed by the histogram, for a log normal distribution with an associated normal average of 1.0 and standard deviation of 0.50.
--
-- Code for LOG NORMAL sample values with normal average of 1.0 and stdev of 0.5.
--
declare
  max_samples  number := 1000;
  sample_value number;
  i            number;
  z            number;          -- standardized value
  m            number  := 1.0 ; -- average
  s            number  := 0.5 ; -- standard deviation
  x            number;          -- un-standardized value
  log_normal   number;          -- log normal sample value
begin
  dbms_random.seed(13);
  for i in 1..max_samples
  loop
    z          := dbms_random.normal;
    x          := z * s + m ;
    log_normal := exp(x) ;
    dbms_output.put_line(round(log_normal,4)||',');
  end loop;
end;
/
It worked! Notice how the quick think times (low sample values) do not occur that often... but they do occur. Also notice the most likely values (the median region) are less than the average. And finally, while infrequent, sometimes the think time is really long (think: someone just walked into my office).

But how can we use this in our performance research? Read on!

How to actually use this

If you've made it this far, you are probably wondering how to use all this seemingly not-so-useful statistical knowledge. It's actually pretty straightforward.

Create a think time function. Notice the below function is based on the log normal plsql above.

create or replace function op_think_time(avg_in number, stdev_in number)
return number as
  z          number; -- standardized value
  x          number; -- un-standardized value
  log_normal number; -- log normal sample value
begin
  z          := dbms_random.normal;
  x          := z * stdev_in + avg_in ;
  log_normal := exp(x) ;
  return log_normal;
end;
/
Use think time as the sleep time. Place the op_think_time function result into the sleep call to simulate a user's think time. Below is a simple example, written as an anonymous plsql block (it loops forever, so kill the session to stop it).
declare
  cnt_v number;
begin
  loop
    select count(*) into cnt_v from big_table;
    dbms_lock.sleep( op_think_time(0.50, 0.2) );
  end loop;
end;
/
Pick a good think time. You may have noticed that the op_think_time function inputs are a normal distribution average and standard deviation. These will not be the average and standard deviation of the log normal sample values. Below is a table created by running the op_think_time function to generate 500 sample values per row. It will give you an idea of the inputs (in_m, in_s) you may want to use based on your desired think times (out_m, out_s).
in_m  in_s   out_m  out_s
----  ----   -----  -----
 0.5   0.2    1.68   0.35
 1.0   0.5    3.12   1.65
 1.5   0.5    5.05   2.61
 1.5   1.0    7.10   8.98
 2.0   0.1    7.44   0.73
Where:
in_m is the normal distribution input average
in_s is the normal distribution input standard deviation
out_m is the log normal distribution sample values average
out_s is the log normal distribution sample values standard deviation
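If you'd rather compute the inputs than read them off a table, the log normal distribution has well-known closed-form moments (standard statistics, not something I derived from these samples). Given the normal input average m and standard deviation s:

out_m = exp( m + s^2 / 2 )
out_s = out_m * sqrt( exp(s^2) - 1 )

For example, in_m = 0.5 and in_s = 0.2 gives out_m = exp(0.52) = 1.68 and out_s = 1.68 * sqrt( exp(0.04) - 1 ) = 0.34, which lines up nicely with the sampled 1.68 and 0.35 in the table above.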

So now you know one way to create more realistic think times when you want to place a specific load on a system during your experiments. I suspect you can also use this in a variety of other performance analysis situations.

Thanks for reading!

Craig.

If you enjoyed this blog entry, I suspect you'll get a lot out of my courses; Oracle Performance Firefighting and Advanced Oracle Performance Analysis. I teach these classes around the world multiple times each year. I also occasionally teach my Oracle Forecasting and Predictive Analysis class and also a one day Performance Research Seminar. For the latest schedule, click here. I also offer on-site training and consulting services.

P.S. If you want me to respond with a comment or you have a question, please feel free to email me directly at orapub.craig@gmail .com.