Monday, November 17, 2014

Off May Not Be Totally Off: Is Oracle In-Memory Database 12c (12.1.0.2.0) Faster?



Most Oracle 12c installations will NOT be using the awesome Oracle Database in-memory features available starting in version 12.1.0.2.0. This experiment is about the performance impact of upgrading to 12c but disabling the in-memory features.

Every experiment I have performed comparing buffer processing rates clearly shows that any version of 12c performs better than 11g. However, in my previous post, my experiment clearly showed a performance decrease after upgrading from 12.1.0.1.0 to 12.1.0.2.0.

This posting is about why this occurred and what to do about it. The bottom line is this: make sure "off" is "totally off."

Turn it totally off, not partially off


What I discovered is that by default the in-memory column store feature is not "totally disabled." My experiment clearly indicates that unless the DBA takes action, not only could there be a license agreement violation, but a partially disabled in-memory column store slightly slows logical IO processing compared to the 12c non in-memory column store option. Still, any 12c version processes buffers faster than 11g.

My experiment: specific and targeted


This is important: The results I published are based on a very specific and targeted test and not on a real production load. Do not use my results in making a "should I upgrade" decision. That would be stupid and an inappropriate use of my experimental results. But because I publish every aspect of my experiment and it is easily reproducible, it is a valid data point with which to have a discussion and also highlight various situations that DBAs need to know about.

You can download all my experimental results HERE. This includes the raw sqlplus output, the data values, the free R statistics package commands, spreadsheet with data nicely formatted and lots of histograms.

The instance parameter settings and results


Let me explain this by first showing the instance parameters and then the experimental results. There are some good lessons to learn!

Pay close attention to the inmemory_force and inmemory_size instance parameters.

SQL> show parameter inmemory

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
inmemory_clause_default              string
inmemory_force                       string      DEFAULT
inmemory_max_populate_servers        integer     0
inmemory_query                       string      ENABLE
inmemory_size                        big integer 0
inmemory_trickle_repopulate_servers_ integer     1
percent
optimizer_inmemory_aware             boolean     TRUE

SQL> show sga

Total System Global Area 7600078848 bytes
Fixed Size                  3728544 bytes
Variable Size            1409289056 bytes
Database Buffers         6174015488 bytes
Redo Buffers               13045760 bytes

In my experiment using the above settings, the median buffer processing rate was 549.4 LIO/ms. Looking at the inmemory_size and the SGA contents, I assumed the in-memory column store was disabled. If you look at the actual experimental result file "Full ds2-v12-1-0-2-ON.txt", which contains the explain plan of the SQL used in the experiment, there is no mention of the in-memory column store being used. My assumption, which I think is a fair one, was that the in-memory column store had been disabled.

As you'll see I was correct, but only partially correct.

The parameter settings below are from when the in-memory column store was totally disabled. The key is changing the inmemory_force parameter value from its default, DEFAULT, to OFF.

SQL> show parameter inmemory

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
inmemory_clause_default              string
inmemory_force                       string      OFF
inmemory_max_populate_servers        integer     0
inmemory_query                       string      ENABLE
inmemory_size                        big integer 0
inmemory_trickle_repopulate_servers_ integer     1
percent
optimizer_inmemory_aware             boolean     TRUE

SQL> show sga

Total System Global Area 7600078848 bytes
Fixed Size                  3728544 bytes
Variable Size            1291848544 bytes
Database Buffers         6291456000 bytes
Redo Buffers               13045760 bytes

Again, the SGA does not show any in-memory memory space. In my experiment with the above "totally off" settings, the median buffer processing rate was 573.5 LIO/ms compared to the "partially off" rate of 549.4 LIO/ms. Lesson: Make sure off is truly off.
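To put the two medians side by side, here is a minimal Python sketch of the arithmetic. It is not part of the original experiment; the function name percent_faster is made up for illustration, and only the two median rates come from the results above.

```python
def percent_faster(new_rate: float, old_rate: float) -> float:
    """Return how much faster new_rate is than old_rate, as a percentage."""
    return (new_rate - old_rate) / old_rate * 100.0


# Median buffer processing rates (LIO/ms) quoted in this post.
partially_off = 549.4  # inmemory_force = DEFAULT, inmemory_size = 0
totally_off = 573.5    # inmemory_force = OFF, inmemory_size = 0

gain = percent_faster(totally_off, partially_off)
print(f"Totally off is {gain:.1f}% faster than partially off")
```

In other words, the "totally off" configuration processed roughly 4% more buffers per millisecond in this specific test.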

It is an unfair comparison!


It is not fair to compare the "partially off" with the "totally off" test results. Now that I know the default inmemory_force must be changed to OFF, the real comparison should be made with the non in-memory column store version 12.1.0.1.0 and the "totally disabled" in-memory column store version 12.1.0.2.0. This is what I will summarize below. And don't forget all 12c versions showed a significant buffer processing increase compared to 11g.

The key question: Should I upgrade?


You may be thinking, if I'm NOT going to license and use the in-memory column store, should I upgrade to version 12.1.0.2.0? Below is a summary of my experimental results followed by the key points.


1. The non column store version 12.1.0.1.0 was able to process 1.1% more buffers/ms (median: 581.7 vs 573.5) compared to the "totally disabled" in-memory column store version 12.1.0.2.0. While this is statistically significant, a 1.1% buffer processing difference is probably not going to make-or-break your upgrade.

2. Oracle Corporation, I'm told, knows about this situation and is working on a fix. But even if they don't fix it, in my opinion my experimental "data point" is not a reason to avoid upgrading to the in-memory column store version 12.1.0.2.0, even if you are NOT going to use the in-memory features.

3. Visually (see below) the non in-memory version 12.1.0.1.0 and the "totally off" in-memory version 12.1.0.2.0 sample sets look different. But they are pretty close. And as I mentioned above, statistically they are "different."

Note for the statistically curious: The red color 12.1.0.1.0 non in-memory version data set is highly variable. I don't like to see this in my experiments. Usually this occurs when a mixed workload sometimes impacts performance, I don't take enough samples or my sample time duration is too short. To counteract this, in this experiment I captured 31 samples. I also performed the experiment multiple times and the results were similar. What I could have done was use more application data to increase the sample duration time. Perhaps that would have made the data clearer. I could have also used another SQL statement and method to create the logical IO load.

What I learned from this experiment


To summarize this experiment, four things come to mind:

1. If you are not using an Oracle Database feature, completely disable it. My mistake was thinking the in-memory column store was disabled when I set its memory size to zero and "confirmed" it was off by looking at the SGA contents.

2. All versions of 12c I have tested are clearly faster at processing buffers than any version of 11g.

3. There is a very slight performance decrease when upgrading from Oracle Database version 12.1.0.1.0 to 12.1.0.2.0.

4. It is amazing to me that with all the new features poured into each new Oracle Database version the developers have been able to keep the core buffer processing rate nearly at or above that of the previous version. That is an incredible accomplishment. While some people may view this posting as a negative hit against the Oracle Database, it is actually a confirmation about how awesome the product is.

All the best in your Oracle performance tuning work!

Craig.




Monday, November 10, 2014

Is Oracle Database 12c (12.1.0.2.0) Faster Than Previous Releases?



I was wondering if the new Oracle Database 12c version 12.1.0.2.0 in-memory column store feature will SLOW performance when it is NOT being used. I think this is a fair question because most Oracle Database systems will NOT be using this feature.

While the new in-memory column store feature is awesome and significant, with each new Oracle feature there is additional kernel code. And if Oracle is not extremely careful, these new lines of Oracle kernel code can slow down the core of Oracle processing, that is, buffer processing in Oracle's buffer cache.

Look at it this way, if a new Oracle release requires 100 more lines of kernel code to be executed to process a single buffer, that will be reflected in how many buffers Oracle can process per second.
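As a rough illustration of that point, the sketch below turns a per-buffer CPU cost into a buffer rate. Both cost figures are hypothetical, chosen only to make the arithmetic easy to follow; they are not measurements from this experiment.

```python
def gets_per_ms(cost_us_per_get: float) -> float:
    """Buffer gets one CPU-bound process completes per millisecond."""
    return 1000.0 / cost_us_per_get


baseline_cost_us = 2.0  # hypothetical CPU cost of one buffer get (microseconds)
added_cost_us = 0.1     # hypothetical cost of the extra kernel code per get

before = gets_per_ms(baseline_cost_us)
after = gets_per_ms(baseline_cost_us + added_cost_us)
drop_pct = (before - after) / before * 100.0

print(f"before: {before:.1f} gets/ms, after: {after:.1f} gets/ms")
print(f"throughput drop: {drop_pct:.2f}%")
```

Under these assumed numbers, a 5% increase in per-buffer cost shows up as a slightly smaller percentage drop in throughput, because the rate is the reciprocal of the cost.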

To put it bluntly, this article is the result of my research comparing core buffer processing rates between Oracle Database versions 11.2.0.2.0, 12.1.0.1.0 and 12.1.0.2.0.

With postings like this, it is very important for everyone to understand the results I publish are based on a very specific and targeted test and not on a real production load. Do not use my results in making a "should I upgrade" decision. That would be stupid and an inappropriate use of my experimental results. But because I publish every aspect of my experiment and it is easily reproducible, it is a valid data point with which to have a discussion and also highlight various situations that DBAs need to know about.

There are two interesting results from this research project. This article is about the first discovery and my next article will focus on the second. The second is by far the most interesting!

FYI. Back in August of 2013 I performed a similar experiment where I compared Oracle Database versions 11.2.0.2.0 and 12.1.0.1.0. I posted the article HERE.

Why "Faster" Means More Buffer Gets Processed Per Second


For this experiment when I say "faster" I am referring to raw buffered block processing. When a buffer is touched in the buffer cache it is sometimes called a buffer get or a logical IO. But regardless of the name, every buffer get increases the instance statistic, session logical reads.

I like raw logical IO processing experiments because they are central to all Oracle Database processing. Plus with each new Oracle release, as additional functionality is inserted, it is likely more lines of Oracle kernel code will exist. To maintain performance with added functionality is an incredible feat. It's more likely the core buffer processing will be slower because of the new features. Is this the case with Oracle's in-memory column store?

How I Set Up The Experiment


I have included all the detailed output, scripts, R commands and output, data plots and more in the Analysis Pack that can be downloaded HERE.

There are a lot of ways I could have run this experiment. But two key items must exist for a fair comparison. First, all the processing must be in cache. There can be no physical read activity. Second, the same SQL must be run during the experiment and have the same execution plan. This implies the Oracle 12c column store will NOT be used. A different execution plan is considered "cheating" as a bad plan will clearly lose. Again, this is a very targeted and specific experiment.

The experiment compares the buffer get rates for a given SQL statement. For each Oracle version, I gathered 33 samples and excluded the first two, just to ensure caching was not an issue. The SQL statement runs for around 10 seconds, processes around 10.2M rows and touches around 5M buffers. I checked to ensure the execution plans are the same for each Oracle version. (Again, all the details are in the Analysis Pack for your reading pleasure.)
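The rate being compared is simply the change in session logical reads divided by the elapsed time. A minimal Python sketch of that calculation follows; the function name is made up, and the inputs are the approximate figures quoted above (around 5M buffers in around 10 seconds), not an actual sample.

```python
def lio_per_ms(reads_before: int, reads_after: int, elapsed_seconds: float) -> float:
    """Buffer gets (logical IOs) processed per millisecond over one sample."""
    delta_reads = reads_after - reads_before
    return delta_reads / (elapsed_seconds * 1000.0)


# Approximate figures from the text: ~5M buffers touched in ~10 seconds.
rate = lio_per_ms(reads_before=0, reads_after=5_000_000, elapsed_seconds=10.0)
print(f"{rate:.1f} LIO/ms")
```

That puts a single run in the neighborhood of 500 LIO/ms, which is the scale of the results discussed in this post.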

I ran the experiment on a Dell server. Here are the details:

$ uname -a
Linux sixcore 2.6.39-400.17.1.el6uek.x86_64 #1 SMP Fri Feb 22 18:16:18 PST 2013 x86_64 x86_64 x86_64 GNU/Linux

To make this easier for myself, to perform the test I used my CPU Speed Test tool (version 1i). I blogged about this last month HERE. The latest version of this tool can be downloaded HERE.

The Results, Statistically


Shown below are the experimental results. Remember, the statistic I'm measuring is buffer gets per millisecond.


Details about the above table: The "Normal" column indicates whether the statistical distribution of the 31 samples is normal. If the p-value (far right column) is greater than 0.05 then I'll say they are normal. In all three cases, the p-value is less than 0.05. In fact, if you look at the histograms contained in the Analysis Pack, every histogram is visually clearly not normal. As you would expect, the "Average" and the "Median" are the statistical mean and median. The "Max" is the largest value in the sample set. The "Std Dev" is the standard deviation, which doesn't mean much since our sample sets are not normally distributed.
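To see why the median is the more useful summary when samples are skewed, here is a small Python sketch using only the standard library. The sample values are hypothetical, invented to show a skewed shape; the real data is in the Analysis Pack.

```python
import statistics

# Hypothetical, deliberately skewed sample of buffer gets per millisecond.
samples = [548.1, 548.9, 549.2, 549.4, 549.6, 550.0, 550.3, 561.7, 575.2]

mean = statistics.mean(samples)
median = statistics.median(samples)
stdev = statistics.stdev(samples)  # sample standard deviation

print(f"mean={mean:.1f} median={median:.1f} stdev={stdev:.1f}")
```

The two high values pull the mean well above the median, while the median stays with the bulk of the samples. That is the reason to lean on the median, and to be wary of the standard deviation, when the distribution is not normal.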

As I blogged about before, Oracle Database 12c buffer processing is faster than Oracle Database 11g. However, the interesting part is that the Oracle version with the in-memory column store, 12.1.0.2.0, is slower than the previous version of 12c, 12.1.0.1.0. In fact, in my experiment the in-memory column store version is around 5.5% slower! This means version 12.1.0.1.0 "out of the box" can process logical buffers around 5.5% faster! Interesting.

In case you're wondering, I used the default out-of-the-box in-memory column store settings for version 12.1.0.2.0. I checked the in-memory size parameter, inmemory_size, and it was indeed set to zero. Also, when I start up the Oracle instance there is no mention of the in-memory column store.

Statistically Comparing Each Version


As an important side bar, I did statistically compare the Oracle Database versions. Why? Because while a 5.5% decrease in buffer throughput may seem important, it may not be statistically significant, meaning our sample sets would not support the claim that the difference is real.

If that were the case, going around saying version 12.1.0.2.0 is "slower" by 5.5% would be misleading. But in my experiment, it would NOT be misleading because the differences in buffer processing are statistically significant. The relevant experimental details are shown below.

Version A   Version B   Statistical  p-value
                         Difference
----------  ----------  -----------  -------
11.2.0.1.0  12.1.0.1.0      YES       0.0000
11.2.0.1.0  12.1.0.2.0      YES       0.0000
12.1.0.1.0  12.1.0.2.0      YES       0.0000

In all three cases the p-value was less than 0.05, signifying the two sample sets are statistically different. Again, all the details are in the Analysis Pack.
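For readers who want to try a comparison like this without R, below is a sketch of one common test for non-normal samples, the Mann-Whitney U (Wilcoxon rank-sum) test, using only the Python standard library. This is an assumption about a suitable technique, not necessarily the test used in the Analysis Pack. It uses the normal approximation without a tie correction, and the two sample sets are hypothetical.

```python
import math
from statistics import NormalDist


def mann_whitney_p(a, b):
    """Two-sided p-value, Mann-Whitney U test (normal approximation, no tie correction)."""
    # Tag each value with its group (0 = a, 1 = b) and sort by value.
    combined = sorted((value, group) for group, sample in enumerate((a, b)) for value in sample)
    rank_sum_a = 0.0
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j for tied values
        rank_sum_a += avg_rank * sum(1 for k in range(i, j) if combined[k][1] == 0)
        i = j
    n1, n2 = len(a), len(b)
    u = rank_sum_a - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Hypothetical sample sets (31 values each) with no overlap between them.
version_a = [581.0 + 0.1 * k for k in range(31)]
version_b = [549.0 + 0.1 * k for k in range(31)]

p_value = mann_whitney_p(version_a, version_b)
print("statistically different" if p_value < 0.05 else "no significant difference")
```

With two sample sets that do not overlap at all, the p-value is far below 0.05, which matches the intuition that fully separated histograms represent a real difference.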

The chart above shows the histograms of both Oracle Database 12c version sample sets together. Visually they look very separated and different with no data crossover. So from both a numeric and visual perspective there is a real difference between 12.1.0.1.0 and 12.1.0.2.0.


What Does This Mean To Me


To me this is surprising. First, there is a clear buffer processing gain upgrading from Oracle 11g to 12c. That is awesome news! But I was not expecting a statistically significant 5.5% buffer processing decrease upgrading to the more recent 12.1.0.2.0 version. Second, this has caused me to do a little digging to perhaps understand the performance decrease. The results of my experimental journey are really interesting...I think more interesting than this posting! But I'll save the details for my next article.

Remember, if you have any questions or concerns about my experiment you can run the experiment yourself. Plus all the details of my experiment are included in the Analysis Pack.

All the best in your Oracle performance tuning work!

Craig.





Saturday, November 1, 2014

Does Increasing An Oracle Background Process OS Priority Improve Performance?



Does increasing an Oracle Database background process operating system priority improve performance? As you might expect, the answer is, "It depends."

In this posting I will explain the results of an experiment where I increased the operating system priority of the Oracle Database 12c log writer background processes.

In my experiment I created a clear CPU bottleneck and the top wait event was log file parallel write. I gathered some data. Next I increased all the log writer background process priorities. Again, I gathered some data and then I analyzed the two data sets. The results were disappointing, though not surprising, and a fundamental rule in performance tuning was demonstrated and reinforced.

You can download the "analysis pack" which contains the raw experimental data, histogram, statistical R results and the R statistical package commands HERE. You can download the total time delta reporting script (ttpctx.sql) I show below, which is contained within my OraPub System Monitor (OSM) Toolkit, HERE.

Before I could gather some data, I needed to create the appropriate system load, the data gather scripts and the data analysis scripts. Here's a quick overview of each.

Increasing The LGWR Processes Priority

If you are not familiar with changing Oracle Database background OS process priority, I blogged about how to do this HERE.

My experiment consisted of creating two situations and statistically comparing them to see if increasing the LGWR background process OS priority increased performance. The only difference in the "two situations" was the instance parameter, _high_priority_processes. For the "normal priority" part of the experiment, the default "LMS*|VKTM" was used. For the "high priority" part of the experiment the parameter was changed to "LMS*|VKTM|LG*". The "LG*" caused the increase in the Linux OS priority of all log writer processes from the default 19 to 41.

Ready for some version specifics? Oracle continues to make OS priority related instance parameter changes... even within 12c releases. Since this experiment was done with Oracle Database version 12.1.0.1.0 the parameter default was "LMS*|VKTM" not "LMS*" as with version 12.1.0.2.0. Also, in 12.1.0.2.0 VKTM is placed into a new parameter, _highest_priority_processes.

Generating The DML With CPU Bottleneck Load


To generate the DML workload, I used my OP Load Generator. You can download it HERE. It's quick, easy and I'm familiar with it. I kept increasing the number of DML processes until the CPU utilization was pegged at 100%. While the database server was clearly out of CPU power, the log writer background processes were rarely seen as the top CPU consuming process. I used the "top" program to monitor the process activity. The top CPU consuming processes were almost always the Oracle server/foreground/shadow processes. As I'll explain in the Results section below, this has significant ramifications on the results.

Oracle Time Based Analysis Summary


The data collection script was similar to the one I used back in 2012 when collecting data regarding Oracle's commit write facility. Essentially, I collected multiple three minute samples of the delta elapsed time, user commits, total non-idle wait time and CPU consumption. In both the normal and high priority runs, around 99% of the database time was CPU consumption and 1% was Oracle non-idle wait time. Also, the top wait event (~70%) was log file parallel write (display name: log file redo write).

If you want to master an Oracle Time Based Analysis, check out my online seminar, Tuning Oracle Using An AWR Report. It will teach you how to use an AWR report to optimize Oracle performance so users will feel the difference.

The OraPub System Monitor Toolkit script ttpctx.sql report below was taken during the "normal priority" log writer load.

SQL> @ttpctx

Database: prod35                                                 16-MAY-14 06:07pm
Report:   ttpctx.sql                OSM by OraPub, Inc.             Page         1
                           Total Time Activity (39 sec interval)

                                                         Avg Time        Time     Wait
Time Component                           % TT    % WT Waited (ms)       (sec) Count(k)
------------------------------------- ------- ------- ----------- ----------- --------
CPU consumption: Oracle SP + BG procs   98.91    0.00       0.000     238.716        0
log file redo write                      0.73   67.56       8.082       1.770        0
control file parallel write              0.18   16.41      33.077       0.430        0
target log write size                    0.05    4.20       6.111       0.110        0
oracle thread bootstrap                  0.03    3.05      40.000       0.080        0
os thread creation                       0.02    1.53      20.000       0.040        0
commit: log file sync                    0.01    0.76      10.000       0.020        0
enq: CR - block range reuse ckpt         0.01    0.76      20.000       0.020        0
Disk file operations I/O                 0.00    0.00       0.000       0.000        0

The OSM report below was taken during the "high priority" log writer load.

SQL> @ttpctx

Database: prod35                                                 16-MAY-14 09:25pm
Report:   ttpctx.sql                OSM by OraPub, Inc.             Page         1
                           Total Time Activity (41 sec interval)

                                                         Avg Time        Time     Wait
Time Component                           % TT    % WT Waited (ms)       (sec) Count(k)
------------------------------------- ------- ------- ----------- ----------- --------
CPU consumption: Oracle SP + BG procs   98.92    0.00       0.000     238.733        0
log file redo write                      0.83   77.01       8.272       2.010        0
control file parallel write              0.08    7.28      14.615       0.190        0
target log write size                    0.05    4.98       5.909       0.130        0
oracle thread bootstrap                  0.03    3.07      40.000       0.080        0
os thread creation                       0.02    1.92      25.000       0.050        0
commit: log file sync                    0.01    0.77      10.000       0.020        0
enq: CR - block range reuse ckpt         0.01    0.77      20.000       0.020        0
enq: RO - fast object reuse              0.00    0.38      10.000       0.010        0
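The percentages in these reports follow from simple arithmetic: total time is CPU consumption plus non-idle wait time. The Python sketch below redoes that arithmetic using only the rows shown in the "normal priority" report above. It is an illustration, not the ttpctx.sql logic. Its percentages differ slightly from the report's, presumably because the report's totals include wait events that are not listed.

```python
# Time (sec) column from the "normal priority" report above.
cpu_sec = 238.716
waits_sec = {
    "log file redo write": 1.770,
    "control file parallel write": 0.430,
    "target log write size": 0.110,
    "oracle thread bootstrap": 0.080,
    "os thread creation": 0.040,
    "commit: log file sync": 0.020,
    "enq: CR - block range reuse ckpt": 0.020,
}

total_wait = sum(waits_sec.values())
total_time = cpu_sec + total_wait  # total time = CPU + non-idle wait

cpu_pct_tt = cpu_sec / total_time * 100.0
top_event, top_sec = max(waits_sec.items(), key=lambda item: item[1])
top_pct_wt = top_sec / total_wait * 100.0

print(f"CPU: {cpu_pct_tt:.1f}% of total time")
print(f"Top wait: {top_event} ({top_pct_wt:.1f}% of listed wait time)")
```

Either way the picture is the same: nearly all the time is CPU consumption, and the redo write event dominates the small amount of wait time that remains.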


Data Collection


For the normal priority load, 27 three minute samples were collected. For the high priority situation, there were 30 three minute samples collected. (I forgot why there were only 27 samples collected for the normal priority.)  I collected the elapsed time, total non-idle wait time, total CPU consumption (v$sys_time_model: DB CPU + background cpu time) and total user commits.

In this experiment more user commits processed per second means better performance.


Experimental Results


I used the free statistics package "R" (www.r-project.org) to analyze the data. I demonstrate how to get, install and use "R" in my online video seminar, Using Skewed Data To Your Advantage.

With the normal log writer process priority, an average of 984.5 commits/sec and a median of 983.0 commits/sec occurred. With the LG* high process priority, an average of 993.6 commits/sec and a median of 991.0 commits/sec occurred. While the "high priority" situation was able to process more commits per second, is this statistically significant?

The red "smoothed" histogram is the normal priority situation and the blue smoothed histogram is when the log writers were set to the higher priority. The more separated the two histograms the more "different" the sample sets, the more likely there is a statistically significant difference and the more likely a user would feel the difference. Looking at the above histograms plot, there does not appear to be a real difference. But let's do a proper significance test!

Because both sample sets are normally distributed (details are in the Analysis Pack), I could use a simple t-test. R produced a p-value of 0.04451. To be statistically "different" I want the p-value to be less than 0.05 and it is. What does this mean?

While statistically and numerically the commit rates are different, I wouldn't expect any special performance tuning award! In fact, the hassles with cycling a production instance and setting underscore/hidden parameters would make it very unlikely I would increase the OS priority of the log writer background processes. I want to see a big performance difference.
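To put a number on how small the improvement is, here is a short Python sketch of the relative gain, using the average and median commit rates quoted above. The 180 second figure is the three minute sample length; the variable names are made up for illustration.

```python
# Commit rates (commits/sec) quoted in the results above.
normal = {"mean": 984.5, "median": 983.0}
high = {"mean": 993.6, "median": 991.0}

gains = {}
for stat in ("mean", "median"):
    gains[stat] = (high[stat] - normal[stat]) / normal[stat] * 100.0
    print(f"{stat}: {gains[stat]:.2f}% more commits/sec at high priority")

# Extra commits over one three minute (180 second) sample, based on the means.
extra_commits = round((high["mean"] - normal["mean"]) * 180)
print(f"about {extra_commits} extra commits per three minute sample")
```

A gain of less than 1% is statistically detectable here, but it is well below what a user would notice.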

To Summarize... What I Learned... Again


This situation is a perfect example of focusing on the wrong thing! While there is a clear operating system CPU bottleneck and the top wait event is about redo, the log writers are not suffering from a want/need of CPU resources. I suspect the server processes want more CPU resources, but they are NOT the processes whose OS priority we increased.

If the log writers were suffering from a lack of CPU resources and fighting for CPU resources, I would expect to see them consuming CPU resources along with the Oracle server processes. And I would definitely expect to see them near the top of the "top" process monitor... especially when their priority has been increased!

Because of this "misguided" tuning effort, this experiment does not build a case for or against changing the log writer priority. What it reinforces is in our quest to optimize performance, make sure we focus on the right thing.

As a side note, this is a great statistical analysis example for two reasons. First, our sample sets look similar, but statistically they are not. Second, while they are statistically different, the performance impact will not be very different. And my guess is the users won't feel a thing... except frustration and anger.

To Super Summarize


When considering increasing a background process's operating system priority, make sure the process is in need of CPU and is not able to get it. In this situation, the DBA could have been misled by the CPU bottleneck. But upon closer inspection of the log writers from an operating system perspective, and knowing the wait event "log file parallel write" is probably more about IO than CPU (Oracle perspective), it would be unlikely that increasing the log writer processes' OS priority would help increase the commits per second.

Thanks for reading!

Craig.