Not every commit results in a redo write. This is because there are multiple optimizations (some controlled by the user e.g. with COMMIT_LOGGING parameter, some automatic) that aim at reducing the number of redo writes caused by commits by grouping redo records together. Such group or “piggyback” commits are important for understanding log file sync waits and various statistics around it. In particular, “piggyback” commits play a key role when many sessions commit concurrently at a high rate, as described in my previous post. I made myself a little demo to actually see this mechanism in work with my own eyes. I think it could be of interest for others, so I’m sharing it here. Since the demo involves stopping and resuming background process, I wouldn’t recommend running it on anything other than a designated private sandbox environment.
The basic idea is very simple:
1) Enable LGWR trace
2) Freeze the LGWR using oradebug SUSPEND command
3) Insert same data in 3 different sessions and issue commits
4) Resume LGWR
5) Look at LGWR trace to make sure that all 3 commits are processed in one go.
Before the test, I ran the autotrace on the INSERT statement to find redo size (so that I could cross-check data in LGWR trace file later). It was 11 kB, or 23 log writer blocks.
Now we are ready to run the test. As a first step, let’s find some information about LGWR:
select sid, serial# from v$session where program like '%LGWR%'; select pid from v$process where program like '%LGWR%';
We’ll need the output for enabling tracing and setting oradebug process id:
exec dbms_monitor.session_trace_enable(<sid>, <serial#>); oradebug setorapid <pid>
Now suspend the log writer:
oradebug suspend
and do the INSERTs in 3 different sessions, with a commit after each INSERT:
insert into t (text) select rpad('x', 1000, 'x') from dual connect by level < = 10; commit;
Since LGWR is frozen, the sessions hang on “log file sync” waits. Then I resume LGWR:
oradebug resume
and examine the LGWR trace file contents:
*** 2014-09-07 16:42:25.109 Received ORADEBUG command (#18) 'suspend' from process 'Windows thread id: 4616, image: <none>' *** 2014-09-07 16:42:49.961 Received ORADEBUG command (#19) 'resume' from process 'Windows thread id: 4616, image: <none>' *** 2014-09-07 16:42:49.969 Finished processing ORADEBUG command (#19) 'resume' *** 2014-09-07 16:42:49.969 Finished processing ORADEBUG command (#19) 'suspend' WAIT #0: nam='rdbms ipc message' ela= 25086817 timeout=300 p2=0 p3=0 obj#=-1 tim=32472186328 WAIT #0: nam='log file parallel write' ela= 451 files=1 blocks=72 requests=1 obj#=-1 tim=32472186937
As we can see, all 3 commits were serviced by a single “log file parallel write”! We can also see that the amount of redo written matches redo generated: 23 x 3 + 3 “commit records” = 72.
Pretty neat, huh?
Can you get the same effect with kill -STOP ? Then follow with -CONT? LGWR won’t know.
I ran this test on a Windows machine, so kill -STOP is not an option. I can run a test on a Linux machine, but not at this time. Do you expect it to behave any differently? I don’t think the commit optimization algorithm cares why LGWR can’t process the writes — as long as it can’t, and there’s a queue of user processes waiting on log file sync, it would group the outstanding redo records together to the write operations (and hence LFS waits)
When I do this sort of stuff I find STOP/CONT is the same as oradebug. I suppose I could put oradebug under a debugger and find that it is sending -STOP/-CONT to the named PID on Unix ports…but…no time :)
Thanks for the tip — might come handy some time!
Q: 23 x 3 + 3 “commit records” = 72.
Why plus 3?
Hi Lonion,
when a transaction is commited, it requires an update to the undo segment header block. Updates to undo blocks, just like updates to regular data blocks, generate redo. 3 commits = 3 updates to the undo segment header block = 3 additional redo records, that’s why “+3”.
Best regards,
Nikolay