Observer effect

2 Apr

In physics, one important limitation of any experiment is the fact that an act of observation inevitably interferes with the observed process (“observer effect”). Same thing is true about databases. It is a well known fact that, for example, turning on tracing can significantly slow down the process for which it’s enabled. But there exist even nastier forms of this effect: for example, when you try to trace a SQL statement using nested loop batching mechanism, apparently, the batching gets disabled! This is extremely confusing, because plan statistics in the trace file indicate that the batching takes place (i.e. the plan shows the “double nested loop” shape characteristic for NLJ batching), and if you look at the plan using dbms_xplan.display_cursor using “outline” option, you’ll see NLJ_BATCHING among other hints.

This effect was discovered by Randolf Geist, and published in the part 3 of his wonderful series on Logical I/O evolution (see here, here and here). It is very thorough and very detailed. Because of the level of detail in there it’s rather lengthy, which is why not everyone gets to that particular paragraph, so let me quote it in here:

…be aware of an odd behaviour I’ve experienced during my tests: If any kind of row source statistics sampling was enabled by either using STATISTICS_LEVEL = ALL, the GATHER_PLAN_STATISTICS hint or even enabling (extended) SQL trace, the optimized, batched form of physical I/O could not be reproduced. You could tell this from the session statistics that start with “Batched IO%” – these all stayed at 0. Only when disabling all these things the effects were visible and the corresponding statistics where non-zero. I don’t know why this is the case, but it is an important detail when testing this feature.

I think it is important that the community be aware of this behavior, because not knowing about it made me waste several days of work, and I know at least one other person who also had difficulties of the same type. But aside from this particular oddity, this is also a reminder of importance of the “observer effect” in general. So any findings obtained with intrusive methods (like extended SQL trace or statistlcs_level = ALL etc.) need to be cross-checked with less intrusive methods (e.g. dumps from V$MYSTAT, V$SESSION_EVENT, V$SESS_TIME_MODEL etc., or SQL real-time monitor reports).

The obvious downside for using V$ views is necessity to take the baseline, but there is a simple workaround if taking a baseline is inconvenient: one can simply establish a new session, thus resetting all counters and statistics to zero.

Optimizing work environment. Part II, command line

10 Mar

(the first part can be found here)

Now let’s consider command line. There is a lot of things that can be done to improve user experience here as well.

1) Maintain a local script repository on the machine(s) you’re using the most. To make sqlplus search there, define SQLPATH variable (using Control Panel => System => Advanced => Environment variables on Windows, or shell initialization scripts such as .profile or .bash_profile on Unix).

2) To facilitate synchronization of scripts among different environments, use web-repositories. One option here is to go with a plain http (or ftp) repository e.g. using any (free or paid) hosting service, or your own web server if you have any. The other option is to go with a code repository tool like git that utilizes the secure (https) protocol. Plain http has the advantage of you being able to run scripts directly from the online repository (see here for more detail). But if your internet connection requires proxy, this may prove difficult (setting HTTP_PROXY variable might help, but I myself had no luck with it).

Continue reading

Optimizing working environment. Part I, GUI

6 Mar

Working with database performance troubleshooting means dealing with challenging and exciting problems. But it also means doing certain things over and over. And over. Like typing the same query or clicking the same button zillion times a day. And when you’re doing something very frequently, anything that can be done to simplify or automate such tasks becomes very important. So I decided to share some of the tricks I use to optimize my working experience in a short mini-series. This post will be on GUI, and the part II will be on command line.

Continue reading

HotSOS symposium 2015

5 Mar

This week I participated (online option) in HotSOS conference — the only conference that is focused entirely around Oracle database performance and internals. Like a year ago, it was a great conference, I really enjoyed it. Predictably, the number one topic was the in-memory option. Probably every third talk was about it. I learned a lot of stuff I didn’t know before (and I strongly suspect that this stuff wouldn’t be easy to find anywhere else).

Continue reading

Analyzing segment content by block type

26 Feb

Occasionally one might want to know what a segment is made of in terms of block types. For example, you notice that the number of blocks in an index segment is somewhat larger than the number of branch and leaf blocks, and wonder what kind of blocks accounts for the difference. The only way to do this is by dumping index blocks (e.g. as described in Richard Foote’s blog here). Dumping blocks is easy, but analyzing them — not so much. Sure, there exists a plethora of tools that allow to parse text from the OS side (awk, perl, sed and whatnot), but this leads to usual problems: OS access, scripting skills, certain platforms may not have the scripting tool you’re most comfortable with, and even more importantly: scripts cannot do cool stuff that Oracle can (like joining data to other data) . Fortunately, those difficulties can be circumvented by using regexp + external files as I already posted in my blog here. This time, I’d like to show how this technique can be adjusted for index block dumps.

Continue reading

MOS “log file parallel write” reference note updated

23 Feb

Last year, I spent some time researching redo log related performance problems, which resulted in a mini-series, including one post devoted specifically to one previously unknown scenario of excessive log file sync waits. I am happy to announce that a service request opened on the back of this research resulted in the MOS note on “log file parallel write” wait event (Doc ID 34583.1) having been updated with a general description of this scenario and factors that may contribute to it. Unfortunately, more specific information regarding this issue has been put to an internal Oracle note because of the limitations that concern underscore parameters. So if you are dealing with log file sync (or log buffer space) issues, then I strongly recommend to log in to MOS and familiarize yourself with the updated version of the note as soon as possible.

Logical I/O

17 Feb

Except a few special cases, optimizing SQL is about minimizing I/O. And by “I/O” we normally mean “physical I/O”, because everybody knows that logical I/O (LIO) is much, much faster. But how much faster exactly? For a long time, this question has been bothering me. It looks like there has been little research in this area. Basically the only thorough investigation I managed to find on the subject was one by Cary Millsap and co-authors. So I conducted some research on my own. Continue reading

High CPU during I/O (continued)

4 Feb

Last week I posted a blog on high CPU utilization during I/O. The blog title contained a question mark because it seemed like there were more questions than answers. Were the CPU utilization numbers real or just an instrumentation artifact? Were they distorted by double-counting? If real, where was it coming from? About a dozen top experts participated in that discussion, and everybody had a different theory. I’ve spent some time looking at the problem on different test environments from the OS angle, so now I have some answers I’d like to share.

Continue reading

High CPU during I/O?

28 Jan

Yesterday, I posted a trace file that seemed odd to me Twitter. This resulted in a very lively discussion, and several very interesting theories were put forward. Even though the issue hasn’t been solved yet, I would like to post a summary of this discussion (plus my findings) so that more people would be able to give their inputs. Also, some of the theories put forward to explain the observed anomaly are very interesting on their own.

The issue is very simple (which I think is one of the reasons why it resulted in such a heated debate). It has to do with unreasonably high CPU utilization with low values of the db_file_multiblock_read_count (from that point on, MBRC) parameter. Here is the setup I used in my tests:

Continue reading


22 Jan

Disjunction (logical OR) is known to cause various performance problems, from extreme parse times (e.g. here) to sub-optimal plans. A common solution to such problems is getting rid of OR’s by “OR expansion” (i.e. rewrite via UNION ALL), although it doesn’t work in 100% cases. In this post, I will consider an example of an OR problem that can be solved differently.

Continue reading


Get every new post delivered to your Inbox.

Join 568 other followers