SQL performance, Part III. Data storage strategies

Imagine that you’re on a desert island where some pirates hid their treasure. You don’t know where exactly it is hidden, but you want to find it. How should you approach this – e.g. should you dig randomly? Or should you follow some sort of a system – and if yes, then which?

I think most readers have smelled a trap right away (it’s not a very sophisticated trap anyway) – of course it doesn’t matter how you do it. If it takes you X minutes to examine 1 square foot of the surface, and the total surface is Y square feet, it will take you anywhere from 0 to Y/X minutes (so, Y/X/2 on the average) to explore the entire island and find the treasure, no matter which system you’re following (or using no system at all). When there is no information whatsoever about the object(s) sought, all search strategies are equally (in)efficient.

Continue reading “SQL performance, Part III. Data storage strategies”

SQL Performance, Part I. Data access efficiency.

Database performance optimization deals with the problem of reducing time (and/or other resources) required to complete database queries by resolving various inefficiencies. There are many kinds of performance issues, and they can be categorized in many different ways. One important way is by scope: statement-level or instance-level. Instance-level performance tuning (which we are not covering in this mini-series) deals with such problems as excessive parsing, cache efficiency, concurrency etc. Statement-level performance tuning deals mostly with inefficient data access.

Continue reading “SQL Performance, Part I. Data access efficiency.”

Giving back

Occasionally, I receive requests to explain stuff I talk about in my blog posts on a level accessible to beginners. Unfortunately, it’s not always possible to do so. You start to explain one little thing, and then you realize that in order to understand that thing, one would to understand a dozen other little things, and before you know it, instead of writing a short blog post you find yourself writing something of the size of Encyclopedia Britannica. I decided to try to go in a different direction. I’ve accumulated some beginner-to-intermediate-level material that hopefully should make certain topics in database performance optimization clearer. I will be posting them in form of short “lessons”. I don’t have how many such lessons will be in total — I guess this depends a lot on the feedback that I’ll be receiving. My plan is to start with statement-level optimization topics, and then, if this goes well, to move on to instance-level stuff. I will try to: 1) provide a high-level overview without getting caught up into details 2) present a holistic view of Oracle database performance rather than asystematic assembly of performance-related topics 3) “derive” facts from higher principles rather than simply state them 4) use analogies to make basic performance concepts more intuitive 5) be as quantitative and specific as possible 6) heavily use demos. Hope you’ll find it useful.

 

Links:

Part I, Data access efficiency

Part II, Disk I/O: metrics and scales

Part III, Data storage strategies

Observer effect

In physics, one important limitation of any experiment is the fact that an act of observation inevitably interferes with the observed process (“observer effect”). Same thing is true about databases. It is a well known fact that, for example, turning on tracing can significantly slow down the process for which it’s enabled. But there exist even nastier forms of this effect: for example, when you try to trace a SQL statement using nested loop batching mechanism, apparently, the batching gets disabled! This is extremely confusing, because plan statistics in the trace file indicate that the batching takes place (i.e. the plan shows the “double nested loop” shape characteristic for NLJ batching), and if you look at the plan using dbms_xplan.display_cursor using “outline” option, you’ll see NLJ_BATCHING among other hints.

This effect was discovered by Randolf Geist, and published in the part 3 of his wonderful series on Logical I/O evolution (see here, here and here). It is very thorough and very detailed. Because of the level of detail in there it’s rather lengthy, which is why not everyone gets to that particular paragraph, so let me quote it in here:

…be aware of an odd behaviour I’ve experienced during my tests: If any kind of row source statistics sampling was enabled by either using STATISTICS_LEVEL = ALL, the GATHER_PLAN_STATISTICS hint or even enabling (extended) SQL trace, the optimized, batched form of physical I/O could not be reproduced. You could tell this from the session statistics that start with “Batched IO%” – these all stayed at 0. Only when disabling all these things the effects were visible and the corresponding statistics where non-zero. I don’t know why this is the case, but it is an important detail when testing this feature.

I think it is important that the community be aware of this behavior, because not knowing about it made me waste several days of work, and I know at least one other person who also had difficulties of the same type. But aside from this particular oddity, this is also a reminder of importance of the “observer effect” in general. So any findings obtained with intrusive methods (like extended SQL trace or statistlcs_level = ALL etc.) need to be cross-checked with less intrusive methods (e.g. dumps from V$MYSTAT, V$SESSION_EVENT, V$SESS_TIME_MODEL etc., or SQL real-time monitor reports).

The obvious downside for using V$ views is necessity to take the baseline, but there is a simple workaround if taking a baseline is inconvenient: one can simply establish a new session, thus resetting all counters and statistics to zero.

Optimizing work environment. Part II, command line

(the first part can be found here)

Now let’s consider command line. There is a lot of things that can be done to improve user experience here as well.

1) Maintain a local script repository on the machine(s) you’re using the most. To make sqlplus search there, define SQLPATH variable (using Control Panel => System => Advanced => Environment variables on Windows, or shell initialization scripts such as .profile or .bash_profile on Unix).

2) To facilitate synchronization of scripts among different environments, use web-repositories. One option here is to go with a plain http (or ftp) repository e.g. using any (free or paid) hosting service, or your own web server if you have any. The other option is to go with a code repository tool like git that utilizes the secure (https) protocol. Plain http has the advantage of you being able to run scripts directly from the online repository (see here for more detail). But if your internet connection requires proxy, this may prove difficult (setting HTTP_PROXY variable might help, but I myself had no luck with it).

Continue reading “Optimizing work environment. Part II, command line”

HotSOS symposium 2015

This week I participated (online option) in HotSOS conference — the only conference that is focused entirely around Oracle database performance and internals. Like a year ago, it was a great conference, I really enjoyed it. Predictably, the number one topic was the in-memory option. Probably every third talk was about it. I learned a lot of stuff I didn’t know before (and I strongly suspect that this stuff wouldn’t be easy to find anywhere else).

Continue reading “HotSOS symposium 2015”

Logical I/O

Except a few special cases, optimizing SQL is about minimizing I/O. And by “I/O” we normally mean “physical I/O”, because everybody knows that logical I/O (LIO) is much, much faster. But how much faster exactly? For a long time, this question has been bothering me. It looks like there has been little research in this area. Basically the only thorough investigation I managed to find on the subject was one by Cary Millsap and co-authors. So I conducted some research on my own. Continue reading “Logical I/O”

High CPU during I/O (continued)

Last week I posted a blog on high CPU utilization during I/O. The blog title contained a question mark because it seemed like there were more questions than answers. Were the CPU utilization numbers real or just an instrumentation artifact? Were they distorted by double-counting? If real, where was it coming from? About a dozen top experts participated in that discussion, and everybody had a different theory. I’ve spent some time looking at the problem on different test environments from the OS angle, so now I have some answers I’d like to share.

Continue reading “High CPU during I/O (continued)”