Monday, January 12, 2015

Do The LGWRs Always Sleep For The Full Three Seconds?

This page has been permanently moved. Please CLICK HERE to be redirected.

Thanks, Craig.

Do Oracle Database LGWRs (10g, 11g, 12c) Always Sleep For The Full Three Seconds?


Back in June I wrote (included a video) about the Oracle Database log writer(s) "3 second sleep rule." That's the rule we were all taught by our instructors when we started learning about Oracle yet never really knew if it was true. In that post, I demonstrated Oracle Database log writer background processes are normally put to sleep for three seconds.

In this post, I want to answer a related but different question.

Do Oracle Database log writer background processes ALWAYS sleep for the full three seconds? Our initial response would likely be, "Of course not! Because what if a foreground process commits during the three second sleep? The log writer(s) must wake up." That would make sense.

But, is this really true and what else could we learn by digging into this? I created an experiment to check this out, and that is what this post is all about.

The Experiment


In my June post I demonstrated the Three Second Rule. You will see this again below. But in this experiment we are looking for a situation when one of the 12c log writers wakes BEFORE their three second sleep.

You can download the experimental script I detail below HERE.

This is really tricky to demonstrate because of all the processes involved. There is a the Oracle foreground process and in 12c, there are multiple log writer background processes. Because this is experiment follows a timeline, I needed to gather the process activity data and then somehow merge it all together in a way that we humans can understand.

What I did was to do an operating system trace ( strace ) each process ( strace -p $lgwr )  with the timestamp option ( strace -p $lgwr -tt ) sending each process's the output to a separate file ( strace -p $lgwr -tt -o lgwr.txt ). This was done to all four processes and of course, I needed to start the scripts to run in the background. Shown directly below are the log writer strace details.

lgwr=`ps -eaf | grep $sid | grep lgwr | awk '{print $2}'`
lg00=`ps -eaf | grep $sid | grep lg00 | awk '{print $2}'`
lg01=`ps -eaf | grep $sid | grep lg01 | awk '{print $2}'`

echo "lgwr=$lgwr  lg00=$lg00  lg01=$lg01"

strace -p $lgwr -tt -o lgwr.str &
strace -p $lg00 -tt -o lg00.str &
strace -p $lg01 -tt -o lg01.str &

Once the log writers were being traced, I connected to sqlplus and launched the below text in the background as well.

drop table bogus;
create table bogus as select * from dba_objects where object_id in (83395,176271,176279,176280);
select * from bogus;
commit;
exec dbms_lock.sleep(2.1);

exec dbms_lock.sleep(2.2);
exec dbms_lock.sleep(2.3);
update bogus set object_name='83395' where object_id=83395;
exec dbms_lock.sleep(3.1);
update bogus set object_name='176271' where object_id=176271;
exec dbms_lock.sleep(3.2);
update bogus set object_name='176279' where object_id=176279;
exec dbms_lock.sleep(3.3);
update bogus set object_name='176280' where object_id=176280;
exec dbms_lock.sleep(3.4);
commit;
exec dbms_lock.sleep(3.5);
update bogus set object_name='89567' where object_id=89567;
exec dbms_lock.sleep(3.6);
commit;
exec dbms_lock.sleep(3.7);
exit;

Once the sqlplus session was connected,

sqlplus system/manager @/tmp/runit.bogus &
sleep 2

I grabbed it's OS process id and started an OS trace on it as well:

svpr=`ps -eaf | grep -v grep | grep oracle$sid | awk '{print $2}' `
echo "svpr=$svpr"

strace -p $svpr -tt -o svpr.str &

Then I slept for 30 seconds, killed the tracing processes (not the log writers!):

sleep 30

for pid in `ps -eaf | grep -v grep | grep strace | awk '{print $2}'`
do
  echo "killing pid $pid"
  kill -2 $pid
done

Then I merged the trace files, sorted them by time, got rid of stuff in the trace files I didn't want to see and put the results into a final "clean" file.

merge=/tmp/strace.merge.bogus
rm -f $merge
for fn in lgwr lg00 lg01 svpr
do
  cat ${fn}.str | awk -v FN=$fn '{print $1 " " FN " " $2 " " $3 " " $4 " " $5 " " $6 " " $7 " " $8 " " $9}' >> $merge
done

ls -ltr $merge
date
cat $merge | sort > /tmp/final.bogus

cat /tmp/final.bogus | grep -v times | grep -v getrusage | grep -v "svpr lseek" | grep -v clock_gettime | grep -v gettimeofday | grep -v "svpr read" | grep -v "svpr write" > /tmp/final.bogus.clean

The amazing thing is... this actually worked! Here is the output below:

19:11:41.981934 svpr semtimedop(7503875, {{34, -1, 0}}, 1, {2, 200000000}) =
19:11:42.859905 lg01 semtimedop(7503875, {{19, -1, 0}}, 1, {3, 0}) =
19:11:43.986421 lgwr semtimedop(7503875, {{16, -1, 0}}, 1, {3, 0}) =
19:11:44.186404 svpr semtimedop(7503875, {{34, -1, 0}}, 1, {2, 300000000}) =
19:11:44.982768 lg00 semtimedop(7503875, {{18, -1, 0}}, 1, {3, 0}) =
19:11:45.860871 lg01 semtimedop(7503875, {{19, -1, 0}}, 1, {3, 0}) =
19:11:46.499014 svpr semtimedop(7503875, {{34, -1, 0}}, 1, {3, 100000000}) =
19:11:46.989885 lgwr semtimedop(7503875, {{16, -1, 0}}, 1, {3, 0}) =
19:11:47.983782 lg00 semtimedop(7503875, {{18, -1, 0}}, 1, {3, 0}) =
19:11:48.861837 lg01 semtimedop(7503875, {{19, -1, 0}}, 1, {3, 0}) =
19:11:49.608154 svpr semtimedop(7503875, {{34, -1, 0}}, 1, {3, 200000000}) =
19:11:49.993520 lgwr semtimedop(7503875, {{16, -1, 0}}, 1, {3, 0}) =
19:11:50.984737 lg00 semtimedop(7503875, {{18, -1, 0}}, 1, {3, 0}) =
19:11:51.862921 lg01 semtimedop(7503875, {{19, -1, 0}}, 1, {3, 0}) =
19:11:52.817751 svpr semtimedop(7503875, {{34, -1, 0}}, 1, {3, 300000000}) =
19:11:52.997116 lgwr semtimedop(7503875, {{16, -1, 0}}, 1, {3, 0}) =
19:11:53.985784 lg00 semtimedop(7503875, {{18, -1, 0}}, 1, {3, 0}) =
19:11:54.863809 lg01 semtimedop(7503875, {{19, -1, 0}}, 1, {3, 0}) =
19:11:55.998974 lgwr open("/proc/41955/stat", O_RDONLY) = 19
19:11:55.999029 lgwr read(19, "41955 (ora_pmon_prod35) S 1 4195"..., 999) =
19:11:55.999075 lgwr close(19) = 0
19:11:55.999746 lgwr semtimedop(7503875, {{16, -1, 0}}, 1, {3, 0}) =
19:11:56.127326 svpr semtimedop(7503875, {{34, -1, 0}}, 1, {3, 400000000}) =
19:11:56.986935 lg00 semtimedop(7503875, {{18, -1, 0}}, 1, {3, 0}) =
19:11:57.864930 lg01 semtimedop(7503875, {{19, -1, 0}}, 1, {3, 0}) =
19:11:59.003212 lgwr semtimedop(7503875, {{16, -1, 0}}, 1, {3, 0}) =
19:11:59.531161 svpr semctl(7503875, 16, SETVAL, 0x7fff00000001) = 0
19:11:59.531544 lgwr semctl(7503875, 18, SETVAL, 0x7fff00000001) = 0
19:11:59.532204 lg00 pwrite(256, "\1\"\0\0\311\21\0\0\354\277\0\0\20\200{\356\220\6\0\0\r\0\0\0\367^K\5\1\0\0\0"..., 2048, 2331136) = 2048
19:11:59.532317 lgwr semtimedop(7503875, {{16, -1, 0}}, 1, {2, 480000000}) =
19:11:59.532680 svpr semtimedop(7503875, {{34, -1, 0}}, 1, {0, 100000000}) =
19:11:59.537202 lg00 semctl(7503875, 34, SETVAL, 0x7fff00000001) = 0
19:11:59.537263 lg00 semctl(7503875, 16, SETVAL, 0x7fff00000001) = 0
19:11:59.537350 lg00 semtimedop(7503875, {{18, -1, 0}}, 1, {3, 0}) =
19:11:59.538483 lgwr semtimedop(7503875, {{16, -1, 0}}, 1, {2, 470000000}) =
19:11:59.540574 svpr semtimedop(7503875, {{34, -1, 0}}, 1, {3, 500000000}) =
19:12:00.865928 lg01 semtimedop(7503875, {{19, -1, 0}}, 1, {3, 0}) =
19:12:02.011876 lgwr semtimedop(7503875, {{16, -1, 0}}, 1, {3, 0}) =
19:12:02.537887 lg00 semtimedop(7503875, {{18, -1, 0}}, 1, {3, 0}) =
19:12:03.050381 svpr semtimedop(7503875, {{34, -1, 0}}, 1, {3, 600000000}) =
19:12:03.866796 lg01 semtimedop(7503875, {{19, -1, 0}}, 1, {3, 0}) =
19:12:05.014819 lgwr semtimedop(7503875, {{16, -1, 0}}, 1, {3, 0}) =
19:12:05.538797 lg00 semtimedop(7503875, {{18, -1, 0}}, 1, {3, 0}) =
19:12:06.657075 svpr semtimedop(7503875, {{34, -1, 0}}, 1, {3, 700000000}) =
19:12:06.867922 lg01 semtimedop(7503875, {{19, -1, 0}}, 1, {3, 0}) =
19:12:08.017814 lgwr semtimedop(7503875, {{16, -1, 0}}, 1, {3, 0}) =
19:12:08.539750 lg00 semtimedop(7503875, {{18, -1, 0}}, 1, {3, 0}) =
19:12:09.868825 lg01 semtimedop(7503875, {{19, -1, 0}}, 1, {3, 0} 

There is a lot of detail in the above output. I'm only going to make a few comments that pertain to the objectives of this post.

Oracle is using the semaphore call semtimedop to sleep. The beauty of this call, is it allow the process to be woken (that is, signaled) by another process! Keep that mind as you follow the timeline.

Here we go:

19:11:41.981934. Notice the server process' "2, 2" and later the "2,3" and "3, 1" and "3, 2"? This is the result of the dbms_lock.sleep commands contained in the sqlplus script!

19:11:42.859905. Notice lg01 and the other log writer background processes always have a "3, 0" semtimedop call? That is their "3 second sleep."

Look at the first few lgwr entries. I've listed them here:

19:11:43.986421
19:11:46.989885
19:11:49.993520
19:11:52.997116

Notice anything strange about the above times? They are all just about 3 seconds apart of from each other. That's the 3 second sleep in action. But that's not the focus of this post. So let's move on.

Read this slow: I want to focus on just one part of the output which, is shown below. Notice the server process is sleeping for 3.4 seconds. If you look at the sqlplus script (near the top of this post), immediately after the 3.4 second sleep the server process issues a commit. Therefore, because the 3.4 sleep starts at 19:11:56.1 and I'm expecting to see some log writer activity in 3.4 seconds. This would be at 19.11.59.5. This could occur in the middle of the log writer 3 second sleep, which means we will likely see a log writer kick into action before their 3 second sleep completes! Let's take a look.

19:11:56.127326 svpr semtimedop(7503875, {{34, -1, 0}}, 1, {3, 400000000}) =
19:11:56.986935 lg00 semtimedop(7503875, {{18, -1, 0}}, 1, {3, 0}) =
19:11:57.864930 lg01 semtimedop(7503875, {{19, -1, 0}}, 1, {3, 0}) =
19:11:59.003212 lgwr semtimedop(7503875, {{16, -1, 0}}, 1, {3, 0}) =
19:11:59.531161 svpr semctl(7503875, 16, SETVAL, 0x7fff00000001) = 0
19:11:59.531544 lgwr semctl(7503875, 18, SETVAL, 0x7fff00000001) = 0
19:11:59.532204 lg00 pwrite(256, "\1\"\0\0\311\21\0\0\354\277\0\0\20\200{\356\220\6\0\0\r\0\0\0\367^K\5\1\0\0\0"..., 2048, 2331136) = 2048
19:11:59.532317 lgwr semtimedop(7503875, {{16, -1, 0}}, 1, {2, 480000000}) 

We can see the server process 3.4 second sleep starting at time 19:11:56.1 and we can see the sleep end and the server process' next command begin at the expected time of 19:11:59.5. Next in the trace file output is result of the commit. The commit results in the wake of both the lgwr and lg00 background processes.

But notice the lgwr background process started one of its 3 second sleeps at 19:11:59.0 which means it doesn't want to wake until 19:12:02.0. But look at when the lgwr process woke up. It woke up at 19.11.59.5 which is clearly before the expected time of 19:12:02.0. What you just noticed was the lgwr background process was signaled to wake up before its three second sleep completed.

But why did the lgwr need to be woken up? Because the server process' redo must be immediately written.

But it gets even better because the lgwr background process doesn't do the redo write! The lgwr process signals the lg00 process to do the write, which we can see occurs at time 19:11:59:5. Wow. Amazing!

What We Can Learn From This


Personally, I love these kinds of postings because we can see Oracle in action and demonstrating what we believe to be true. So what does all this actually demonstrate? Here's a list:

  1. We can see the 12c log writers involved. Not only lgwr.
  2. All log writer background process initiate a sleep for the default three seconds. I have seen situations where it is not three seconds, but it appears the default is three seconds.
  3. The server process signals the lgwr process to write immediately after a commit is issued.
  4. The server process signals the lgwr process to write using a semaphore.
  5. The log writers (starting in 12c) can signal each other using semaphores. We saw lgwr signal the lg00 background process to write.
  6. The server process was performing updates over 10+ a second period, yet its redo was not written to disk until it committed. This demonstrates that ALL redo is not flushed every three seconds. (This is probably not what you learned... unless you joined one of my Oracle Performance Firefighting classes.)
  7. The log writers while normally put to sleep for three seconds, can be woken in the middle for an urgent task (like writing committed data to an online redo log).

I hope you enjoyed this post!

Thanks for reading,

Craig.

1 comment:

  1. One more observation.
    The lgwr after wokeup and instructing lg00 to flush redo went to sleep for 2.48 secs, may be (3.0 - time spent on redo flush).
    19:11:59.532317 lgwr semtimedop(7503875, {{16, -1, 0}}, 1, {2, 480000000}

    My question.
    1. We learnt that lgwr will flush redo every 3 secs or 1/3 rd of the log buffer is full and whichever is earlier.
    Is this not correct in 12c, especially when 1/3rd log buffer is full.

    2. The log buffer is cyclic buffer with redo generated from all the processes.
    When the commit happens it will flush redo of the committed process and also the redo generated from the uncommitted processes.
    Is this same or modified in 12c. Is redo maintained separate for each process and on commit will flush only redo generated for that process.

    ReplyDelete