Tuning Analytic Functions

In general, tuning analytic functions (and more generally, all sort operations) is rather difficult. While for most poorly performing queries it’s relatively straightforward to gain some improvements by applying “eliminate early” principle one way or another, for slow sort operations it’s rarely applicable. Usually options are limiting to rewriting a query without analytics (e.g. using self-joins or correlated subqueries to achieve the same goal) or manually resizing the workarea to reduce/eliminate the use of disk. Recently, however, I had a case where I managed to obtain an excellent performance gain using a different technique that I would like to share in this post.

The original query was selecting about 100 columns using the LAG function on one of the columns in the WHERE clause, but in my test case I’ll both simplify and generalize the situation. Let’s create a table with a sequential id, three filtering columns x, y and z, and 20 sufficiently lengthy columns.

Continue reading “Tuning Analytic Functions” →

Efficiency-based SQL tuning

Introduction

In order to tune a query, you need to know two things:
– can it be tuned, and if yes, then by how much
– which part of the query (which operation, or which data object) is most promising from the tuning point of view.

Currently existing tuning methods don’t really answer these questions. You either focus on the most expensive operation(s), and hope that you can eliminate them (or transform them into something less cosly), or you focus on the ones where you see a large discrepancy between actual rowcounts and optimizer predictions (cardinality feedback tuning). Either way, you can’t be sure that you’ve set your priorities right. It could well be the case that the cost of the most expensive operation cannot be reduced by much, but you can win back enough performance elsewhere. With the cardinality feedback tuning, you also don’t have any guarantee that improving accuracy of optimizer estimates would eventually transform into acceptable level of performance.

Of course, if the plan only contains a few operations, this is not a big issue, and after a few trials you will usually get to the bottom of the problem. However, when dealing with very complex plans, hundreds operations long, this is not really an option. When dealing with such plans a few months back, I developed for myself a simple tuning method that allows to evaluate with high accuracy potential tuning benefits of plan operations, using rowsource stats and optimizer as input. In this post, I’m sharing this method, as well as a script that implements it.

Continue reading “Efficiency-based SQL tuning” →

CPU starvation disguised as an I/O issue (yet another AWR case study)

In AWR analysis, what appears to be the root cause of the issue, can easily turn out to be just a symptom. Last week, Rajat sent me an AWR report which is a perfect illustration of this (thanks Rajat), I posted the key sections from this report below (sorry for less than perfect formatting — I had to manually re-format the HTML version of the report into text).


WORKLOAD REPOSITORY report for
DB Name      DB Id           Instance       Inst num        Release          RAC           Host
DSS          37220993      dss              1               10.2.0.4.0       NO            dssdbnz

                  Snap Id      Snap Time             Sessions      Cursors/Session
Begin Snap:       18471      12-Oct-12 08:30:28      131              1.5
End Snap:         18477      12-Oct-12 14:30:24      108              1.8
Elapsed:          359.93 (mins)
DB Time:          25,730.14 (mins)

Load Profile
                              Per Second      Per Transaction
Redo size:                    325,282.85      103,923.02
Logical reads:                33,390.52       10,667.77
Block changes:                1,307.95        417.87
Physical reads:               1,927.33        615.75
Physical writes:              244.65          78.16
User calls:                   391.34          125.03
Parses:                       68.14           21.77
Hard parses:                  3.33            1.06
Sorts:                        47.86           15.29
Logons:                       3.15            1.01
Executes:                     234.32          74.86
Transactions:                 3.13
% Blocks changed per Read:       3.92       Recursive Call %:      61.11
Rollback per transaction %:      24.71      Rows per Sort:         3325.52

Top 5 Timed Events
Event                               Waits          Time(s)      Avg Wait(ms)      % Total Call Time      Wait Class
free buffer waits              10,726,838      344,377     32      22.3      Configuration
db file sequential read        6,122,262      335,366      55      21.7      User I/O
db file scattered read         3,597,607      305,576      85      19.8      User I/O
CPU time                                      161,491              10.5
read by other session          2,572,875      156,821     61       10.2      User I/O

Operating System Statistics
Statistic                                 Total
AVG_BUSY_TIME                             2,093,109
AVG_IDLE_TIME                             63,212
AVG_IOWAIT_TIME                           18,463
AVG_SYS_TIME                              87,749
AVG_USER_TIME                             2,004,722
BUSY_TIME                                 16,749,988
IDLE_TIME                                 510,692
IOWAIT_TIME                               152,594
SYS_TIME                                  707,137
USER_TIME                                 16,042,851
LOAD                                      4
OS_CPU_WAIT_TIME                          ###############
RSRC_MGR_CPU_WAIT_TIME                    0
VM_IN_BYTES                               5,503,492,096
VM_OUT_BYTES                              2,054,414,336
PHYSICAL_MEMORY_BYTES                     34,288,209,920
NUM_CPUS                                  8
NUM_CPU_SOCKETS                           8

Continue reading “CPU starvation disguised as an I/O issue (yet another AWR case study)” →

A map to AWR report

Introduction

An average 11g AWR report spans 40 screens broken into approximately 50 sections. That’s a lot, especially for someone who’s not very well familiar with AWR reports, so I decided to make a some sort of a map. The purpose is to show that this report has a certain structure (which may not be obvious at first sight), and knowing this structure can help extract the most essential information in the fastest way possible.

Types of sections

For simplicity, I break AWR report sections into following categories:

1) basic (key information)

2) detalization (provides details on a specific topic briefly covered in the basic section, such as latches, enqueues etc.)

3) advisories (helps find optimal values of parameters)

4) advanced (stuff that is not generally needed, but can be useful on certain occasions — basically, everything not covered in 1-3).

Continue reading “A map to AWR report” →

AWR report: load profile

Introduction

“Load profile” section of the AWR report contains some extremely useful information, and yet it is very often overlooked (often in favor of instance efficiency percentages, which is easier to read but much more likely to mislead). I decided to make some sort of a short guide for it, describing how different statistics in it can be used to better understand performance of a database.

Continue reading “AWR report: load profile” →

AWR report case study: stating the obvious

It’s been a while since I came across an interesting and complex case. However, yesterday on an OTN forum thread I saw a case which was interesting by its simplicity. Even though it’s almost trivial on the technical level, it’s very useful to highlight typical tuning “rookie mistakes” (I can remember quite a few cases from not so long ago, when I did similar mistakes, too).

The author posts a question about “library cache: mutex X” events that are ruining performance of his 2-node RAC cluster. The original post doesn’t contain any specifics except for CPU utilization percentage on both nodes.

Within a few hours, a few replies appear, most of them either trying to shed light on this particular wait event or sharing similar experiences. I asked the original poster to provide key sections of an AWR report (workload profile, top events, database/host CPU, top SQL), which he soon did:

Continue reading “AWR report case study: stating the obvious” →

Troubleshooting log file sync waits — a simple case study

Yesterday I was asked to give my opinion on the AWR below:

DB Name	DB Id	Instance	Inst num	Release	        RAC	Host
*****	*****	*****   	1	        10.2.0.4.0	NO	******

	        Snap Id	Snap Time	   Sessions  Cursors/Session
Begin Snap:	66340	6/20/2012 14:30	   117	  9.3
End Snap:	66341	6/20/2012 15:00	   115	  9.1
Elapsed:		30.16 (mins)
DB Time:		35.57 (mins)

Report Summary

Cache Sizes

	                Begin	End
Buffer Cache:	        1,888M	1,888M	Std Block Size:	8K
Shared Pool Size:	1,024M	1,024M	Log Buffer:	3,168K

Load Profile

	                Per Second	Per Transaction
Redo size:	        504,437.43	3,150.59
Logical reads:	        23,547.86	147.07
Block changes:	        1,900.01	11.87
Physical reads:	        1,931.69	12.06
Physical writes:	50.85	        0.32
User calls:	        478.35	        2.99
Parses:	                20.39	        0.13
Hard parses:	        0.05	        0
Sorts:	                8.67	        0.05
Logons:	                0.1	        0
Executes:	        250.29	        1.56
Transactions:	        160.11

% Blocks changed per Read:	8.07	Recursive Call %:	18.31
Rollback per transaction %:	0	Rows per Sort:	178.25

...

Top 5 Timed Events

Event	                Waits	        Time(s)	Avg Wait(ms)	% Total Call Time	Wait Class
CPU time		                1,882		        88.2
log file sync	        293,178	        1,217	4	        57	                Commit
log file parallel write	290,903	        961	3	        45	                System I/O
db file scattered read	301,788	        45	0	        2.1	                User I/O
db file parallel write	11,839	        31	3	        1.5	                System I/O

Continue reading “Troubleshooting log file sync waits — a simple case study” →

Histograms for strongly skewed columns

On a recent OTN thread, I learned a nice trick by J. Lewis that allows to circumvent certain problems with histograms.

Histograms were designed to solve the problem of estimating cardinality for skewed columns (i.e. where some values occur much more frequently than the others). For columns with low number of distinct values (NDV) Oracle collects a frequency histogram, which can be thought of as a set of two one-dimensional arrays: one containing all possible values, the other containing their frequency (i.e. how many rows have this value). However, if sample size is small, then Oracle can miss rare values, and they won’t be reflected in the histogram. As a result, the cardinality estimates for those values will be wrong (depending on version Oracle will either set it to either 1 or to half of the frequency for the rarest value found). A detailed explanation of the issues with examples can be found in blog posts by J. Lewis and R. Geist.

Continue reading “Histograms for strongly skewed columns” →

Library cache locks: a case study

Recently I was asked to help with a case involving library cache locks, and even though it was really simple, I think it’s good for illustrating a few points in performance analysis.

As it often happens, it all started with customers complaining about “the database being slow”. I asked for more specific details, and I found that it wasn’t the entire database, but rather a reporting subsystem. Since I didn’t have specific session sid or sql_id at hand, I started with analyzing a 30-min AWR report, and since there wasn’t much time to look at everything, I just focused on top timed events:

Event                   Waits          Time (s)  Avg Wait(ms) %Total  Wait Class

CPU time                               16,188               46.2
db file sequential read 18,230,956     11,787     1         33.6     User I/O
db file parallel write  278,480        4,509      16        12.9     System I/O
library cache lock      1,131          3,297      2,915     9.4      Concurrency
log file sync           120,356        1,686      14        4.8      Commit

Continue reading “Library cache locks: a case study” →

Reading SQL plans

SQL tuning is the key to database performance tuning, and yet this seems to be a “blind spot” for many database specialists. I think it’s probably because it falls right on the boundary between DBA’s and developer’s responsibilities: so while a DBA expects database developers take care of performance while developing code, developers often neglect that, relying on a “develop first, let the DBA tune it later” approach. Also, until recently, there were surprisingly few good (and accessible to a newbie) descriptions of how to read a SQL plan.

Now that we have Chritian Antognini’s great book “Troubleshooting Oracle Performance”, the situation has improved dramatically. But still, I think that a blog post on that subject won’t hurt: after all, it’s free and it’s written by someone who still remembers difficulty his first analyzing SQL plans. :)

Continue reading “Reading SQL plans” →