Monday, October 27, 2014

How To Change The Priority Of Oracle Background Processes

This page has been permanently moved. Please CLICK HERE to be redirected.

Thanks, Craig.


Before you get in a huff, it can be done! You can change an Oracle Database background process priority through an instance parameter! I'm not saying it's a good idea, but it can be done.

In this post I explore how to make the change, just how far you can take it and when you may want to consider changing an Oracle background process priority.

To get your adrenaline going, check out the instance parameter _highest_priority_processes on one of your production Oracle systems running version 11 or greater. Here is an example using my OSM tool, ipx.sql, on my Oracle Database version 12.1.0.2.0.
SQL> @ipx _highest_priority_processes
Database: prod40                                               27-OCT-14 02:22pm
Report:   ipx.sql              OSM by OraPub, Inc.                Page         1
                         Display ALL Instance Parameters

Instance Parameter and Value                       Description          Dflt?
-------------------------------------------------- -------------------- -----
_highest_priority_processes         = VKTM         Highest Priority     TRUE
                                                   Process Name Mask
Then at the Linux prompt, I did:
$ ps -eo pid,class,pri,nice,time,args | grep prod40
 2879 TS   19   0 00:00:00 ora_pmon_prod40
 2881 TS   19   0 00:00:01 ora_psp0_prod40
 2883 RR   41   - 00:02:31 ora_vktm_prod40
 2889 TS   19   0 00:00:01 ora_mman_prod40
 2903 TS   19   0 00:00:00 ora_lgwr_prod40
 2905 TS   19   0 00:00:01 ora_ckpt_prod40
 2907 TS   19   0 00:00:00 ora_lg00_prod40
 2911 TS   19   0 00:00:00 ora_lg01_prod40
...
Notice the "pri" (priority) column for the ora_vktm_prod40 process? It is set to 41, while all the rest of the Oracle background processes are at the default of 19. Its scheduling class is also RR (round robin, a real-time class) rather than TS (time sharing). Very cool, eh?
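To spot elevated processes without reading every row, the same ps output can be filtered on the scheduling class column. This is a minimal sketch, assuming the column layout shown above (class in the second column) and that TS is the default class on your host. The function name non_default_class is made up for illustration and is not part of the OSM toolkit.

```shell
# Print only the rows whose scheduling class (2nd column) is not the
# default time-sharing class (TS). "CLS" is the ps header for that column,
# so the header row is dropped as well.
non_default_class() {
  awk '$2 != "TS" && $2 != "CLS"'
}

# Example use on a live system (prod40 is the instance name used in this post):
#   ps -eo pid,class,pri,nice,time,args | grep prod40 | non_default_class
```

On the listing above, only the ora_vktm_prod40 row would survive the filter.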

Surprised By What I Found


Surprised? Yes, surprised because changing Oracle process priority is a Pandora's box. Just imagine an Oracle server (i.e., foreground) process has its priority lowered just a little and then attempts to acquire a latch or a mutex. If it doesn't get the latch quickly, it might never ever get it!

From a user experience perspective, sometimes performance is really quick and other times the application just hangs.

This actually happened to a customer of mine years ago when the OS started reducing a process's priority after it consumed a certain amount of CPU. I learned that Oracle processes are programmed to expect an even process priority playing field. If you try to "game" the situation, do so at your own risk... not Oracle's.

Then why did Oracle Corporation allow background process priority to be changed? And why did Oracle Corporation actually change a background process's priority?!

Doing A Little Exploration


It turns out there are a number of "priority" related underscore instance parameters! On my 11.2.0.1.0 system there are 6 "priority" parameters. On my 12.1.0.1.0 system there are 8 "priority" parameters. On my 12.1.0.2.0 system there are 13 "priority" parameters! So clearly Oracle is making changes! In all cases, the parameter I'm focusing on, "_high_priority_processes", exists.

In this posting, I'm going to focus on my Oracle Database 12c version 12.1.0.2.0 system. While you may see something different in your environment, the theme will be the same.

While I'll be blogging about all four of the below parameters, in this posting my focus will be on the _high_priority_processes parameter. Below are the defaults on my system:
_high_priority_processes        LMS*
_highest_priority_processes     VKTM
_os_sched_high_priority         1
_os_sched_highest_priority      1
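
If you do not have the OSM toolkit handy, a common way to list underscore parameters is to join the X$KSPPI and X$KSPPCV fixed tables while connected as SYS. The sketch below only prints the query; it is not the ipx.sql script, and the function name priority_params_sql is an assumption made for this example.

```shell
# Emit a query listing instance parameters (underscore ones included)
# whose name contains "priority". X$ tables are visible only to SYS.
priority_params_sql() {
  cat <<'EOF'
select i.ksppinm  name,
       v.ksppstvl value,
       v.ksppstdf is_default
from   x$ksppi i, x$ksppcv v
where  i.indx = v.indx
and    i.ksppinm like '%priority%'
order  by i.ksppinm;
EOF
}

# Example use (assumes OS authentication as a DBA account):
#   priority_params_sql | sqlplus -s / as sysdba
```

The number of rows returned will differ by release, as noted above.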

Messing With The LGWR Background Processes


I'm not testing this on a RAC system, so I don't have an LMS background process. When I saw the "LMS*" I immediately thought, "regular expression." Hmmm... I wonder if I can change the priority of the LGWR background process. So I made the instance parameter change and recycled the instance. Below shows the instance parameter change:
SQL> @ipx _high_priority_processes
Database: prod40                                               27-OCT-14 02:36pm
Report:   ipx.sql              OSM by OraPub, Inc.                Page         1
                         Display ALL Instance Parameters

Instance Parameter and Value                       Description          Dflt?
-------------------------------------------------- -------------------- -----
_high_priority_processes            = LMS*|LGWR    High Priority        FALSE
                                                   Process Name Mask

Below is an operating system perspective using the ps command:

$ ps -eo pid,class,pri,nice,time,args | grep prod40
...
 5521 RR   41   - 00:00:00 ora_vktm_prod40
 5539 TS   19   0 00:00:00 ora_dbw0_prod40
 5541 RR   41   - 00:00:00 ora_lgwr_prod40
 5545 TS   19   0 00:00:00 ora_ckpt_prod40
 5547 TS   19   0 00:00:00 ora_lg00_prod40
 5551 TS   19   0 00:00:00 ora_lg01_prod40
...
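
The listings above show the result of the parameter change but not the statement that made it. The following is a sketch of one way to do it, assuming the instance runs from an spfile and that you connect as SYSDBA. Underscore parameters must be double-quoted in ALTER SYSTEM, and scope=spfile is used because the instance is recycled afterwards. The helper name high_priority_sql is hypothetical.

```shell
# Emit the ALTER SYSTEM statement for a given process name mask.
# The mask is passed through unchanged; nothing is executed here.
high_priority_sql() {
  printf "alter system set \"_high_priority_processes\"='%s' scope=spfile;\n" "$1"
}

# Example use, followed by an instance restart:
#   high_priority_sql 'LMS*|LGWR' | sqlplus -s / as sysdba
```

As with any underscore parameter, try this on a test instance first.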

How Far Can I Take This?


At this point in my journey, my mind was ablaze! The log file sync wait event can be really difficult to deal with, especially when there is a CPU bottleneck. Hmmm... Perhaps I can increase the priority of all the log writer background processes?

So I made the instance parameter change and recycled the instance. Below shows the instance parameter change:
SQL> @ipx _high_priority_processes
Database: prod40                                               27-OCT-14 02:44pm
Report:   ipx.sql              OSM by OraPub, Inc.                Page         1
                         Display ALL Instance Parameters

Instance Parameter and Value                       Description          Dflt?
-------------------------------------------------- -------------------- -----
_high_priority_processes            = LMS*|LG*     High Priority        FALSE
                                                   Process Name Mask

Below is an operating system perspective using the ps command:

$ ps -eo pid,class,pri,nice,time,args | grep prod40
...
 5974 TS   19   0 00:00:00 ora_psp0_prod40
 5976 RR   41   - 00:00:00 ora_vktm_prod40
 5994 TS   19   0 00:00:00 ora_dbw0_prod40
 5996 RR   41   - 00:00:00 ora_lgwr_prod40
 6000 TS   19   0 00:00:00 ora_ckpt_prod40
 6002 RR   41   - 00:00:00 ora_lg00_prod40
 6008 RR   41   - 00:00:00 ora_lg01_prod40
 6014 TS   19   0 00:00:00 ora_lreg_prod40
...

So now all the log writer background processes have a high priority. My hope would be that if there is an OS CPU bottleneck and the log writer background processes wanted more CPU, I now have the power to give that to them! Another tool in my performance tuning arsenal!

Security Hole?


At this point, my exuberance began to turn into paranoia. I thought, "Perhaps I can increase the priority of an Oracle server process or perhaps any process." If so, that would be a major Oracle Database security hole.

With fingers trembling, I changed the instance parameters to match an Oracle server process and recycled the instance. Below shows the instance parameter change:

SQL> @ipx _high_priority_processes
Database: prod40                                               27-OCT-14 02:52pm
Report:   ipx.sql              OSM by OraPub, Inc.                Page         1
                         Display ALL Instance Parameters

Instance Parameter and Value                       Description          Dflt?
-------------------------------------------------- -------------------- -----
_high_priority_processes            =              High Priority        FALSE
LMS*|LG*|oracleprod40                              Process Name Mask

Below is an operating system perspective using the ps command:

$ ps -eo pid,class,pri,nice,time,args | grep prod40
...
 6360 TS   19   0 00:00:00 ora_psp0_prod40
 6362 RR   41   - 00:00:00 ora_vktm_prod40
 6366 TS   19   0 00:00:00 ora_gen0_prod40
 6382 RR   41   - 00:00:00 ora_lgwr_prod40
 6386 TS   19   0 00:00:00 ora_ckpt_prod40
 6388 RR   41   - 00:00:00 ora_lg00_prod40
 6394 RR   41   - 00:00:00 ora_lg01_prod40
 6398 TS   19   0 00:00:00 ora_reco_prod40
...
 6644 TS   19   0 00:00:00 oracleprod40...
...

OK, that didn't work so how about this?

SQL> @ipx _high_priority_processes
Database: prod40                                               27-OCT-14 02:55pm
Report:   ipx.sql              OSM by OraPub, Inc.                Page         1
                         Display ALL Instance Parameters

Instance Parameter and Value                       Description          Dflt?
-------------------------------------------------- -------------------- -----
_high_priority_processes            =              High Priority        FALSE
LMS*|LG*|*oracle*                                  Process Name Mask

Let's see what happened at the OS.

$ ps -eo pid,class,pri,nice,time,args | grep prod40
...
 6701 RR   41   - 00:00:00 ora_vktm_prod40
 6705 RR   41   - 00:00:00 ora_gen0_prod40
 6709 RR   41   - 00:00:00 ora_mman_prod40
 6717 RR   41   - 00:00:00 ora_diag_prod40
 6721 RR   41   - 00:00:00 ora_dbrm_prod40
 6725 RR   41   - 00:00:00 ora_vkrm_prod40
 6729 RR   41   - 00:00:00 ora_dia0_prod40
 6733 RR   41   - 00:00:00 ora_dbw0_prod40
...
 6927 RR   41   - 00:00:00 ora_p00m_prod40
 6931 RR   41   - 00:00:00 ora_p00n_prod40
 7122 TS   19   0 00:00:00 oracleprod40 ...
 7124 RR   41   - 00:00:00 ora_qm02_prod40
 7128 RR   41   - 00:00:00 ora_qm03_prod40

Oh Oh... That's not good! Now EVERY Oracle background process has a higher priority and my Oracle server process does not.

So my "*" wildcard caused all the Oracle background processes to be included. If all the processes have a high priority, then the log writer processes have no advantage over the others. And to make matters even worse, my goal of increasing the server process priority was not achieved.

However, this is actually very good news because it appears this is not an Oracle Database security hole! To me, it looks like the priority parameter is applied during instance startup to just the background processes. Since my server process was started after the instance was started, and for sure is not in the list of background processes, its priority was not affected. Good news for security, not such good news for a performance optimizing fanatic such as myself.

Should I Ever Increase A Background Process Priority?


Now that we know how to increase an Oracle Database background process priority, when would we ever want to do this? The short answer is probably never. But the long answer is the classic, "it depends."

Let me give you an example. Suppose there is an OS CPU bottleneck and the log writer background processes are consuming lots of CPU while handling all the associated memory management when server processes issue commits. In this situation, making it easier for the log writer processes to get CPU cycles may improve performance. But don't even think about doing this unless there is a CPU bottleneck. And even then, be very very careful.

In my next blog posting, I'll detail an experiment where I changed the log writer background processes priority.

Thanks for reading!

Craig.



Wednesday, October 8, 2014

11 Tips To Get Your Conference Abstract Accepted


This is what happens when your abstract is selected!
Ready for some fun!? It's that time of year again and the competition will be intense. The "call for abstracts" for a number of Oracle Database conferences is about to close.

The focus of this posting is how you can get a conference abstract accepted.

As a mentor, Track Manager and active conference speaker, I've been helping DBAs get their abstracts accepted for many years. If you follow my 11 tips below, I'm willing to bet you will get a free pass to any conference you wish in any part of the world.

1. No Surprises! 


Track Manager After A Surprise
The Track Manager wants no surprises, great content and a great presentation. Believe me when I say they are looking for ways to reduce the risk of a botched presentation, a cancellation or a no-show. Your abstract submission is your first chance to show you are serious and will help make the track incredibly awesome.

Tip: In all your conference communications, demonstrate a commitment to follow through.

2. Creative Title.


The first thing everyone sees is the title. I can personally tell you, if the title does not pique my curiosity without sounding stupid, then unless I know the speaker is popular I will not read the abstract. Why do I do this? Because as a Track Manager, I know conference attendees will do the same thing! And as a Track Manager, I want attendees to want to attend sessions in my track.

Tip: Find two people, read the title to them and ask what they think. If they say something like, "What are you going to talk about?" that's bad. Rework the title.

3. Tell A Story


The abstract must tell a compelling story. Oracle conferences are not academic conferences! There needs to be some problem along with a solution complete with drama woven into the story.

Tip: People forget bullet points, but they never forget a good story.

4. Easy To Read


The abstract must be easy to review. The abstract reviewers may have over a hundred abstracts to review. Make it a good quick read for the reviewers and your chances increase.

Tip: Have your computer read your abstract back to you. If you don't say, "Wow!" rework the abstract. 

5. Be A Grown-Up


You can increase the perception that you will physically show up and put on a great show at the conference by NOT putting emoji, bullet points, your name and title, or a product or service pitch into your abstract. NEVER copy/paste from a PowerPoint outline into the abstract or outline. (I've seen people do this!)

Tip: Track Managers do not want to babysit you. They want an adult who will help make their track great.

6. Submit Introductory Level Abstracts


I finally figured this out a couple years ago. Not everyone is ready for a detailed understanding of cache buffer chain architecture, diagnosis, and solution development. Think of it from a business perspective. Your market (audience) will be larger if your presentation is less technical. If this bothers you, read my next point.

Tip: Submit both an introductory level version and advanced level version of your topic.

7. Topics Must Be Filled


Not even the Track Manager knows what people will submit. And you do not know what the Track Manager is looking for. And you do not know what other people are submitting. Mash this together and it means you must submit more than one abstract. I know you really, really want to present on topic X. But would you rather not have an abstract accepted?

Tip: Submit abstracts on multiple topics. It increases your chances of being accepted.

8. Submit Abstract To Multiple Tracks


This is similar to submitting both an introductory and an advanced version of your abstract. Here's an example: If there is a DBA Bootcamp track and a Performance & Internals track, craft your Bootcamp version to have a more foundational/core feel to it. And craft your Performance & Internals version to feel more technical and advanced.

Do not simply change the title; the abstracts cannot be the same. If the conference managers or the Track Manager feel you are trying to game the conference, you present a risk to the conference and their track, and your abstracts will be rejected. So be careful and thoughtful.

Tip: Look for ways to adjust your topic to fit into multiple tracks.

9. Great Outline Shows Commitment


If the reviewers have read your title and abstract, they are taking your abstract seriously. Now is the time to close the deal by demonstrating you will put on a great show. And this means you already have in mind an organized and well thought out delivery. You convey this with a fantastic outline. I know it is difficult to create an outline BUT the reviewers also know this AND having a solid outline demonstrates to them you are serious, you will show up, and put on a great show.

Tip: Develop your abstract and outline together. This strengthens both and develops a kind of package the reviewers like to see.

10. Learning Objectives Show Value


You show the obvious value of your topic through the learning objectives. Personally, I use these to help keep me focused on my listener, not just what I'm interested in at the moment. Because I love my work, I tend to think everyone also does... not so. I must force myself to answer the question, "Why would a DBA care about this topic?"

Tip: Develop your learning objectives by asking yourself, "When my presentation is over, what do I want the attendees to remember?"

11. Submit About Problems You Solved


Submit on the topics you have personally explored and found fascinating. Every year, every DBA has had to drill deep into at least one problem. This concentrated effort means you know the topic very well. And this means you are qualified to tell others about it! People love to hear from people who are fascinated about something. Spread the good news resulting from a "bad" experience.

Tip: Submit on topics you have explored and are fascinated with.

How Many Abstracts Should I Submit?


It depends on the conference, but for big North American conferences like ODTUG, RMOUG and IOUG I suggest at least four.

Based on what I wrote above, pick three topics, perhaps create both an introductory and advanced version, and look to see if it makes sense to submit to multiple tracks. That means you'll probably submit at least four abstracts. It's not as bad as it sounds, because you will only have perhaps three core abstracts. All the others are modifications to fit a specific need. Believe me, when you receive the acceptance email, it will all be worth it!

See you at the conference!

Craig.


Monday, October 6, 2014

Comparing SQL Execution Times From Different Systems


Suppose it's your job to identify SQL that may run slower in the about-to-be-upgraded Oracle Database. It's tricky because no two systems are alike. Just because the SQL run time is faster in the test environment doesn't mean the decision to upgrade is a good one. In fact, it could be disastrous.

For example, if a SQL statement runs 10 seconds in production and 20 seconds in QAT, but the production system is twice as fast as QAT, is that a problem? It's difficult to compare SQL run times when the same SQL resides in different environments.

In this posting, I present a way to remove the CPU speed differences, so an appropriate "apples to apples" SQL elapsed time comparison can be made, thereby improving our ability to more correctly detect risky SQL that may be placed into the upgraded production system.

And, there is a cool, free, downloadable tool involved!

Why SQL Can Run Slower In Different Environments


There are a number of reasons why a SQL's run time is different in different systems. An obvious reason is a different execution plan. A less obvious and much more complex reason is a workload intensity or type difference. In this posting, I will focus on CPU speed differences. Actually, what I'll show you is how to remove the CPU speed differences so you can appropriately compare two SQL statements. It's pretty cool.

The Mental Gymnastics


If a SQL statement's elapsed time in production is 10 seconds and 20 seconds in QAT, that’s NOT an issue IF the production system is twice as fast.

If this makes sense to you, then what you did was mentally adjust one of the systems so it could be appropriately compared. This is how I did it:

10 seconds in production * 2 (production is 2 times as fast as QA) = 20 seconds

And in QA the SQL ran in 20 seconds… so really it ran “the same” in both environments. If I am considering placing the SQL from the test environment into the production environment, then this scenario does not raise any risk flags. The "trick" is determining "production is 2 times as fast as QA" and then creatively using that information.

Determining The "Speed Value"


Fortunately, there are many ways to determine a system's "speed value." Basing the speed value on Oracle's ability to process buffers in memory has many advantages: a real load is not required or even desired, real Oracle code is being run at a particular version, real operating systems are being run and the processing of an Oracle buffer highly correlates with CPU consumption.

Keep in mind, this type of CPU speed test is not an indicator of scalability (the benefit of adding additional CPUs) in any way, shape or form. It is simply a measure of brute force Oracle buffer cache logical IO processing speed based on a number of factors. If you are architecting a system, other tests will be required.

As you might expect, I have a free tool you can download to determine the "true speed" rating. I recently updated it to be more accurate, require fewer Oracle privileges, and also show the execution plan of the speed test tool SQL. (A special thanks to Steve for the execution plan enhancement!) If the execution plan used in the speed tool is different on the various systems, then obviously we can't expect the "true speeds" to be comparable.

You can download the tool HERE.

How To Analyze The Risk


Before we can analyze the risk, we need the "speed value" for both systems. Suppose a faster system means its speed rating is larger. If the production system speed rating is 600 and the QAT system speed rating is 300, then production is deemed "twice as fast."

Now let's put this all together and quickly go through three examples.

This is the core math:

standardized elapsed time = sql elapsed time * system speed value

So if the SQL elapsed time is 25 seconds and the system speed value is 200, then the standardized "apples-to-apples" elapsed time is 5000 which is 25*200. The "standardized elapsed time" is simply a way to compare SQL elapsed times, not what users will feel and not the true SQL elapsed time.
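
The core math is easy to script. Here is a minimal sketch at the shell prompt. The function name standardized_elapsed is made up for this example, and awk is used only so that fractional seconds work too.

```shell
# standardized elapsed time = sql elapsed time * system speed value
# $1 = SQL elapsed time in seconds, $2 = system speed value
standardized_elapsed() {
  awk -v secs="$1" -v speed="$2" 'BEGIN { print secs * speed }'
}

standardized_elapsed 25 200   # prints 5000
```

The result is only useful for comparing the same SQL across systems. It is not a time anyone will experience.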

To make this a little more interesting, I'll quickly go through three scenarios focusing on identifying risk.

1. The SQL truly runs the same in both systems.

Here is the math:

QAT standardized elapsed time = 20 seconds X 300 = 6000 seconds

PRD standardized elapsed time = 10 seconds X 600 = 6000 seconds

In this scenario, the true speed situation is, QAT = PRD. This means, the SQL effectively runs just as fast in QAT as in production. If someone says the SQL is running slower in QAT and therefore this presents a risk to the upgrade, you can confidently say it's because the PRD system is twice as fast! In this scenario, the QAT SQL will not be flagged as presenting a significant risk when upgrading from QAT to PRD.

2. The SQL runs faster in production.

Now suppose the SQL runs for 30 seconds in QAT and for 10 seconds in PRD. Suppose someone were to say, "Well of course it runs slower in QAT because QAT is slower than the PRD system." Really? Everything is OK? Again, to make a fair comparison, we must compare the systems using a standardizing metric, which I have been calling the "standardized elapsed time."

Here are the scenario numbers:

QAT standardized elapsed time = 30 seconds X 300 = 9000 seconds
PRD standardized elapsed time = 10 seconds X 600 = 6000 seconds

In this scenario, the QAT standardized elapsed time is greater than the PRD standardized elapsed time. This means the SQL is truly running slower in QAT compared to PRD. Specifically, this means the slower SQL in QAT cannot be fully explained by the slower QAT system. Said another way, while we expect the SQL in QAT to run slower than in the PRD system, we didn't expect it to be quite so slow in QAT. There must be another reason for this slowness, which we are not accounting for. In this scenario, the QAT SQL should be flagged as presenting a significant risk when upgrading from QAT to PRD.

3. The SQL runs faster in QAT.

In this final scenario, the SQL runs for 15 seconds in QAT and for 10 seconds in PRD. Suppose someone was to say, "Well of course the SQL runs slower in QAT. So everything is OK." Really? Everything is OK? To get a better understanding of the true situation, we need to look at their standardized elapsed times.

QAT standardized elapsed time = 15 seconds X 300 = 4500 seconds
PRD standardized elapsed time = 10 seconds X 600 = 6000 seconds 

In this scenario, the QAT standardized elapsed time is less than the PRD standardized elapsed time. This means the SQL is actually running faster in QAT, even though the QAT wall time is 15 seconds and the PRD wall time is only 10 seconds. So while most people would flag this QAT SQL as "high risk" we know better! We know the SQL is actually running faster in QAT than in production! In this scenario, the QAT SQL will not be flagged as presenting a significant risk when upgrading from QAT to PRD.
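
The three scenarios above can be run through one small comparison. This is a sketch only: the function name flag_risk and the RISK/OK labels are assumptions for illustration, and the rule is simply "flag the SQL when the QAT standardized elapsed time is greater than the PRD one."

```shell
# Usage: flag_risk QAT_SECS QAT_SPEED PRD_SECS PRD_SPEED
# Prints RISK or OK, followed by the QAT and PRD standardized elapsed times.
# Equal values count as OK, matching scenario 1.
flag_risk() {
  awk -v qsecs="$1" -v qspeed="$2" -v psecs="$3" -v pspeed="$4" 'BEGIN {
    qat = qsecs * qspeed
    prd = psecs * pspeed
    flag = (qat > prd) ? "RISK" : "OK"
    print flag, qat, prd
  }'
}

flag_risk 20 300 10 600   # scenario 1: OK 6000 6000
flag_risk 30 300 10 600   # scenario 2: RISK 9000 6000
flag_risk 15 300 10 600   # scenario 3: OK 4500 6000
```

In practice you would probably want a tolerance rather than a strict comparison, since measured elapsed times and speed ratings both vary from run to run.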

In Summary...


Identifying risk is extremely important while planning for an upgrade. It is unlikely the QAT and production systems will be identical in every way. This mismatch makes identifying risk more difficult. One of the common differences in systems is their CPU processing speeds. What I demonstrated was a way to remove the CPU speed differences, so an appropriate "apples to apples" SQL elapsed time comparison can be made, thereby improving our ability to more correctly detect risky SQL that may be placed into the upgraded production system.

What's Next?


Looking at the "standardized elapsed time" based on Oracle LIO processing is important, but it's just one reason why a SQL may have a different elapsed time in a different environment. One of the big "gotchas" in load testing is comparing production performance to a QAT environment with a different workload. Creating an equivalent workload on different systems is extremely difficult to do. But with some very cool math and a clear understanding of performance analysis, we can also create a more "apples-to-apples" comparison, just like we have done with CPU speeds. But I'll save that for another posting.

All the best in your Oracle performance work!

Craig.