Thursday, June 23, 2011

Impact of Consistent Reads

The Purpose

For sure there is a cost for read consistency. But is that cost significant? Can users feel the difference? Does it require significant computing resources? Like many things in Oracle, perhaps this entire discussion is more academic and just makes for interesting presentations and internals discussions... Delving in these issues is what this blog entry is all about!

Some Internals Background

If you've studied Oracle internals you will eventually learn that by default, query results reflect the committed state of the database system when the query started. (Oracle can also provide this capability at the transaction level.) This implies that if a needed buffer has a "change date" (think: SCN) AFTER the query started or  BEFORE the query started yet contains uncommitted data, Oracle must rebuild an image of the buffer containing committed data consistent with the query start time.

But it can get more complicated. In addition to the above, Oracle can cache multiple images of blocks, representing the block at various times (think: SCN) in both the buffer cache and a server process's PGA. Oracle can decide to use any appropriate buffer to fulfill a read consistent query. This is important to understand, because it is common to see consistent read activity when the data you are querying has been committed well before the query begin or even more surprising after the instance has been cycled! This is why you can see cr (consistent read) activity when running a tkprof even though you know there is no dml occurring related to the table. If you really want to dig into this, start by looking at a handful of instance statistics starting with "consistent".

I need to interject a couple definitions. The rebuilt or copied buffers are commonly called cloned buffers because they are a clone of the buffer at a point in time. When a cloned buffer is created, at a minimum, the instance statistic consistent change and consistent get are incremented. If an already buffered clone is accessed the consistent get statistic is increased and probably other consistent related instance statistics.

Even with multiple Oracle instances, every block that is buffered has a single changeable latest and greatest version in the entire database system, called the current buffer or the cu buffer. If you have done a tkprof and see values in the cu column you know your SQL has accessed cu buffers.

The performance implications of clones are massive. For example and to summarize, to make a clone a free buffer must be secured, the cu buffer is copied into cloned (cr) buffer, the most recent buffer chance undo address is retrieved from the clones' itl, and the server process checks is to see if the undo block is in the buffer cache. If the undo block is not in the buffer cache, the server process must find a free buffer for it, make an IO request (and probably wait a bit), place the block into the buffer cache, update various internal structures, and finally apply the undo to the cloned buffer. Now suppose after applying the undo to the cloned buffer it is discovered that additional undo is required! Then this process cycles again and potentially again and again. This "again and again" is following what is called a "chain of undo."

So you can see that creating a lot of cr buffers will increase both CPU and IO requirements. There is a cost for read consistency. But is that cost significant? Or perhaps its more academic and just makes for interesting presentations and internals discussions... Let's find out!

The Experimental Design

You can download the entire script that is somewhat documented HERE.

The experiment is actually quite simple. I created and loaded a table with 18M rows and committed. (The actual experimental 18M table did have DML and commits against it before the experiment began.) To buffer the data, I ran a full table scan query on the table five times. I then took my initial sample and called this my baseline sample, since updates have not yet occurred. The baseline query required no cr buffers as a result of uncommitted data or because of buffers changing after the query started and then clones had to be created. However, as I mentioned in the internals discussion above, Oracle can keep older versions of the buffer and when they are accessed they count as consistent reads. This indeed occurred because after an update occurred, the session committed (and even the instance cycled), Oracle recorded consistent read activity when the table was queried.

Before the second, third, and fourth loops were run, in a different Oracle session I updated 1M rows twice, but did not commit. It's important to notice the same 1M are being updated twice. This will likely force a "chain of undo" which will intensify read consistent activity.

As you can see below, when I "run the query" I actually run the full stats collection and actual query 60 times, providing me with 60 samples for each of the four loops.

I am expecting to see an increasingly larger elapsed time gap between the baseline query and each subsequent query.

So the statistics collection and query looping looks something like this:

load data...(18M rows)
cache data: select sum(update_seq) from rc_test; 5X
for i in 1..4 loop
  for sample in 1..60
    gather time t0 performance data 
    select sum(update_seq) into bogus from rc_test;
    gather time t1 performance data
    calculate deltas and store
  end loop
  pause this script and do the update in another session
end loop

The performance data gathered was:
  • time stamp based on the timestamp data type for maximum precision.
  • instance CPU consumption. I want to know the impact on the entire instance, not just my session. The data source is the v$sys_time_model, statistics "DB CPU" and "background cpu time".
  • instance non-idle wait time. Again, I wanted to know the impact on the entire instance, not just my session. The data source is v$system_event.
  • consistent gets. The number of cr buffers accessed in the entire instance. My sessions where the only sessions connected as I was in a very controlled environment. The data source is v$sysstat and the statistics is "consistent gets".
The update, which is performed in a different Oracle session, looks like this:

declare
  cntr number;
begin
for cntr in 1..2
loop
  update rc_test
  set    update_stamp=sysdate
  where  rownum<1000000;
end loop;
end;
/

You may notice I updated the same row set twice. As I stated above, this can cause a "chain of undo" to occur which increases consistent ready intensity.

Experimental Results (Summary)

The results are very clear, although a little more statistically tricky than I anticipated. First the results and then I'll provide some numeric and visual details.

There was both a visual and significant statistical difference between the initial baseline query (sample set 1) and the each query run after the three update (sample sets 2, 3, 4). This means consistent read statements take longer to run. The non-baseline samples were at least 1.5X slower than our baseline! Consistent read activity clearly made an elapsed time impact that users will notice.

There is also another key take away, that is not so apparent. There is a real (statistically significant) difference between each of the samples. Not just between the baseline and the others but between each of the four sample sets. This means that each time I ran the 1M update twice, it caused a real slowdown in the query.
Figure 1.
While you can download all the raw experimental data here, Figure 1 shows some of the numeric results. The columns should be self explanatory with the exception of the "P Value". This the result of the statistical significance test (details in the following section). If the P Value was less than 0.05 then we can say there is a real difference between the baseline sample set and the other sample set. It's important to understand that the P Values are between the baseline sample and the other sample set. For example, between set 1 and set 2, set 1 and set 3, and set 1 and set 4.
Figure 2.
Figure 2 above is a smooth histogram of each of the four sample set's elapsed times. Our baseline line sample set appears on the far left, then to the right set 2, set 3, and finally set 4. Visually it appear there is a clear and real difference between these sample sets...and the statistics agree!
Figure 3.
Figure 3 above is a consistent read histogram containing sample sets 2, 3, and 4. If you look closely at sample set 1's (our baseline) average consistent read value of 308390 it dwarfs in comparison to the other samples. So much so, when the histogram includes all four sample sets, because of the required scaling, the histogram is visually worthless. This implies that visually there is a real difference between the baseline sample set and the others. And as Figure 3 shows, clearly there is also a visual difference between each of the other sample sets as well...and the statistics agree!

The actual statistical analysis was more difficult than it may appear. If you are interested, read the next section. Otherwise, just skip to the Conclusion.

Experimental Results (Detailed)

By looking at the numerical and visual experimental results, it appears that indeed when consistent buffers are involved, query performance (our query to be exact) was indeed negatively impacted. But visually comparing values and histograms is not enough to make a strong statement. In fact, it's rather weak. So I needed to perform a statistical hypothesis test.

My plan was perform a simple t-test comparing sample set 1 (our baseline) to sample set 2, set 1 to set 3, and set 1 to set 4. But the t-test has a key requirement which, our data clearly fails to demonstrate.
Figure 4.                                  Figure 5.
As you'll recall, each sample set consists of 60 samples. Figure 4 above is the histogram for our baseline consistent reads and Figure 5 is the histogram for our baseline elapsed times. 

A key t-test requirement is our sample set must be normally distributed. Figure 4 looks extremely exponential and Figure 5 is some inverse normal distribution looking monstrosity-thing! Oh boy... talk about disappointment! After doing some research I found a non-parametric significance test needed to be performed. While I summarize this in the actual Mathematica notepad (notepad file, pdf file) there is a very nice explanation of non-parametric significance testing here.

Thankfully Mathematica has built in functions for this type of significance testing. In fact, I can tell it to choose what it thinks to be the most appropriate statistical test and do the test. It choose the Kruskal-Wallis test, which was one of the tests mentioned in the URL above.

To show a real difference exists between the sample sets, the P-Value needed to be less than 0.05. All the significance tests I performed resulted in a P-Value rounded to 0.0000. (I'm not joking.) The tests where between set 1 and set 2, set 1 and set 3, set 1 and set 4, and also between set 2 and set 3, and set 3 and set 4. For every test, statistically there is a real difference between the sample sets.

Just in case you're wondering, I too thought a consistent 0.0000 value was perhaps too good to be true. So I did a significance test between set 1 and set 1, which would have resulted in a P-Value of 1. And it did! So I'm comfortable with the results.

I must say though, I am not a statistician but based on my research and Mathematica, I believe I did the correct statistical significance test and represented the results correctly. If you think I make a mistake, please email me at OraPub's general email (orapub@comcast.net) and I'll be more then happy to make any required changes.

Conclusion

Many DBAs don't realize how resource intensive building and accessing consistent read buffers are. But once the Oracle internals is understood it becomes clear that intensive consistent read activity can significantly increase both CPU and IO consumption, which results in an elapsed time increase. But is this really true and is there is an increase make a real performance difference?
This experiment demonstrated that:
  1. Consistent read buffer access significantly increases query elapsed time.
  2. An increase in the "chain of undo" also significantly increases consistent read access and elapsed time.
  3. The elapsed time difference could definitely be felt by a user as our simple test showed a 1.5X elapsed time increase.
One important and final comment: My experiment did not include the creation of the consistent read buffers (statistics consistent changes), but only the access of already created cloned buffers. I ran a few other tests just to be sure the consistent read statistics did not include consistent changes...in fact, it was usually zero. So if my experiments shows a significant elapsed time increase when the consistent read buffers have already been created, we know that creating and accessing consistent read buffers will result in even longer elapsed times.

To minimize the impact of consistent reads, keep your SQL tuned, commit often (but often enough to cause frequent commit issues!), and try not to query blocks that another transaction is changing.

Thanks for reading.

Craig.



If you enjoy my blog, I suspect you'll get a lot out of my courses; Oracle Performance Firefighting and Advanced Oracle Performance Analysis. I teach these classes around the world multiple times each year. For the latest schedule, click here. I also offer on-site training and consulting services.

P.S. If you want me to respond to a comment or you have a question, please feel free to email me directly at craig@orapub .com. I use a challenge-response spam blocker, so you'll need to open the challenge email and click on the link or I will not receive your email. Another option is to send an email to OraPub's general email address, which is currently orapub@comcast .net.

1 comment:

  1. hey ,nice post. I love the style of blogging I see here :)
    you see, there are 2 blogs that i've found so far to be very helpful and have something interesting for me whenever i visit, this one and http://danieluyi.com
    Keep it up. I will be visiting again.

    ReplyDelete