Tuning methods (comparison based on a real-life example)

First principles, Clarice. Simplicity. Read Marcus Aurelius. Of each particular thing ask: what is it in itself? What is its nature?

The Silence of the Lambs

What do you do when you get a call about a query that keeps running without completing, where the query text itself looks very simple:

select col1, col2, ...
from someview
where colA = '12345-A'
and colB in ('B')
order by colN, colM

and the SQL plan monitor report looks like this:

=====================================================================================================================================================================================================================================================
| Id    |                   Operation                   |             Name             |  Rows   | Cost  |   Time    | Start  | Execs |   Rows   | Read  | Read  | Write | Write | Mem | Temp | Activity |        Activity Detail        | Progress |
|       |                                               |                              | (Estim) |       | Active(s) | Active |       | (Actual) | Reqs  | Bytes | Reqs  | Bytes |     |      |   (%)    |          (# samples)          |          |
=====================================================================================================================================================================================================================================================
|     0 | SELECT STATEMENT                              |                              |         |       |           |        |     1 |          |       |       |       |       |     |      |          |                               |          |
|     1 |   FILTER                                      |                              |         |       |           |        |     1 |          |       |       |       |       |     |      |          |                               |          |
|  -> 2 |    SORT GROUP BY                              |                              |       9 |  140K |      3521 |   +105 |     1 |        0 |       |       |       |       |  8M |      |     0.12 | Cpu (4)                       |          |
|  -> 3 |     NESTED LOOPS OUTER                        |                              |     822 |  140K |      3521 |   +105 |     1 |    44414 |       |       |       |       |     |      |          |                               |          |
|  -> 4 |      NESTED LOOPS OUTER                       |                              |     822 |  140K |      3521 |   +105 |     1 |    44415 |       |       |       |       |     |      |          |                               |          |
|  -> 5 |       NESTED LOOPS OUTER                      |                              |     822 |  139K |      3521 |   +105 |     1 |    44415 |       |       |       |       |     |      |          |                               |          |
|  -> 6 |        NESTED LOOPS OUTER                     |                              |     604 |  138K |      3521 |   +105 |     1 |    16778 |       |       |       |       |     |      |          |                               |          |
|  -> 7 |         HASH JOIN RIGHT OUTER                 |                              |     553 |  137K |      3625 |     +1 |     1 |    16778 |       |       |       |       |  1M |      |          |                               |          |
|     8 |          TABLE ACCESS FULL                    | PRODUCTTYPE                  |     970 |    14 |         1 |     +1 |     1 |      970 |    10 | 128KB |       |       |     |      |          |                               |          |
|  -> 9 |          NESTED LOOPS                         |                              |         |       |      3527 |    +99 |     1 |    16778 |       |       |       |       |     |      |     0.06 | Cpu (2)                       |          |
| -> 10 |           NESTED LOOPS                        |                              |     553 |  137K |      3527 |    +99 |     1 |    47079 |       |       |       |       |     |      |          |                               |          |
| -> 11 |            NESTED LOOPS OUTER                 |                              |     548 |  137K |      3527 |    +99 |     1 |     181K |       |       |       |       |     |      |          |                               |          |
| -> 12 |             NESTED LOOPS OUTER                |                              |     548 |  135K |      3527 |    +99 |     1 |     181K |       |       |       |       |     |      |          |                               |          |
| -> 13 |              NESTED LOOPS OUTER               |                              |     548 |  135K |      3527 |    +99 |     1 |     181K |       |       |       |       |     |      |          |                               |          |
| -> 14 |               VIEW                            |                              |     548 |  133K |      3527 |    +99 |     1 |     181K |       |       |       |       |     |      |          |                               |          |
| -> 15 |                NESTED LOOPS OUTER             |                              |     548 |  133K |      3527 |    +99 |     1 |     181K |       |       |       |       |     |      |     0.03 | Cpu (1)                       |          |
| -> 16 |                 NESTED LOOPS OUTER            |                              |     548 |  132K |      3527 |    +99 |     1 |     181K |       |       |       |       |     |      |     0.03 | Cpu (1)                       |          |
| -> 17 |                  NESTED LOOPS OUTER           |                              |     548 |  132K |      3527 |    +99 |     1 |     181K |       |       |       |       |     |      |          |                               |          |
| -> 18 |                   HASH JOIN                   |                              |     548 |  132K |      3625 |     +1 |     1 |     181K |    97 |  24MB |   619 | 152MB |  6M | 164M |     0.35 | Cpu (11)                      |       7% |
|       |                                               |                              |         |       |           |        |       |          |       |       |       |       |     |      |          | direct path read temp (1)     |          |
|    19 |                    TABLE ACCESS FULL          | CONFIRMATION                 |    303K | 84070 |        43 |     +1 |     1 |       1M | 16516 |   1GB |       |       |     |      |     1.06 | Cpu (10)                      |     100% |
|       |                                               |                              |         |       |           |        |       |          |       |       |       |       |     |      |          | direct path read (26)         |          |
|    20 |                    VIEW                       |                              |      4M | 27099 |      2712 |    +99 |     1 |       4M |       |       |       |       |     |      |          |                               |          |
|    21 |                     HASH GROUP BY             |                              |      4M | 27099 |      2768 |    +43 |     1 |       4M |   363 |  89MB |   363 |  89MB |     |      |     0.50 | Cpu (14)                      |          |
|       |                                               |                              |         |       |           |        |       |          |       |       |       |       |     |      |          | direct path read temp (3)     |          |
|    22 |                      INDEX FULL SCAN          | CONFIRMATION_PKI             |      4M | 27099 |        55 |    +43 |     1 |       4M | 22646 |  88MB |       |       |     |      |     1.23 | Cpu (6)                       |          |
|       |                                               |                              |         |       |           |        |       |          |       |       |       |       |     |      |          | db file sequential read (36)  |          |
| -> 23 |                   TABLE ACCESS BY INDEX ROWID | CONFIRMATIONTOSPREADSHEETDOC |       1 |     1 |      3506 |   +120 |  181K |    25374 |  2087 |   8MB |       |       |     |      |     0.18 | Cpu (3)                       |          |
|       |                                               |                              |         |       |           |        |       |          |       |       |       |       |     |      |          | db file sequential read (3)   |          |
| -> 24 |                    INDEX UNIQUE SCAN          | CONFIRMATIONTOSPREADSHDOC_PK |       1 |       |      3506 |   +120 |  181K |    25374 |  2423 |   9MB |       |       |     |      |     0.38 | Cpu (3)                       |          |
|       |                                               |                              |         |       |           |        |       |          |       |       |       |       |     |      |          | db file sequential read (10)  |          |
| -> 25 |                  TABLE ACCESS BY INDEX ROWID  | COUNTERPARTYMAP              |       1 |     1 |      3527 |    +99 |  181K |     162K |  6695 |  26MB |       |       |     |      |     1.32 | Cpu (15)                      |          |
|       |                                               |                              |         |       |           |        |       |          |       |       |       |       |     |      |          | db file sequential read (30)  |          |
| -> 26 |                   INDEX RANGE SCAN            | COUNTERPARTYMAPSSSSCPID_I    |       1 |       |      3527 |    +99 |  181K |     162K |  4028 |  16MB |       |       |     |      |     0.91 | Cpu (10)                      |          |
|       |                                               |                              |         |       |           |        |       |          |       |       |       |       |     |      |          | db file sequential read (21)  |          |
| -> 27 |                 TABLE ACCESS BY INDEX ROWID   | COUNTERPARTY                 |       1 |     1 |      3527 |    +99 |  181K |     162K |  9753 |  38MB |       |       |     |      |     1.85 | Cpu (5)                       |          |
|       |                                               |                              |         |       |           |        |       |          |       |       |       |       |     |      |          | db file sequential read (58)  |          |
| -> 28 |                  INDEX UNIQUE SCAN            | COUNTERPARTY_PKI             |       1 |       |      3527 |    +99 |  181K |     162K |  3021 |  12MB |       |       |     |      |     0.88 | Cpu (8)                       |          |
|       |                                               |                              |         |       |           |        |       |          |       |       |       |       |     |      |          | db file sequential read (22)  |          |
| -> 29 |               VIEW PUSHED PREDICATE           |                              |       1 |     4 |      3527 |    +99 |  181K |     181K |       |       |       |       |     |      |     0.03 | Cpu (1)                       |          |
| -> 30 |                SORT GROUP BY                  |                              |       1 |     4 |      3527 |    +99 |  181K |     181K |       |       |       |       |     |      |     0.47 | Cpu (16)                      |          |
| -> 31 |                 INDEX RANGE SCAN              | CONFLEG_CONFIRM_FKI          |       1 |     4 |      3527 |    +99 |  181K |     308K |  113K | 440MB |       |       |     |      |    17.51 | Cpu (44)                      |          |
|       |                                               |                              |         |       |           |        |       |          |       |       |       |       |     |      |          | db file sequential read (553) |          |
| -> 32 |              TABLE ACCESS BY INDEX ROWID      | CONFIRMATIONCPTYUSI          |       1 |       |      3504 |   +122 |  181K |     1252 |   903 |   4MB |       |       |     |      |     0.26 | Cpu (1)                       |          |
|       |                                               |                              |         |       |           |        |       |          |       |       |       |       |     |      |          | db file sequential read (8)   |          |
| -> 33 |               INDEX RANGE SCAN                | CONFIRMATIONCPTYUSI_CNF_IX   |       1 |       |      3504 |   +122 |  181K |     1252 |  1216 |   5MB |       |       |     |      |     0.15 | Cpu (4)                       |          |
|       |                                               |                              |         |       |           |        |       |          |       |       |       |       |     |      |          | db file sequential read (1)   |          |
| -> 34 |             VIEW PUSHED PREDICATE             |                              |       1 |     3 |      3527 |    +99 |  181K |     152K |       |       |       |       |     |      |     0.09 | Cpu (3)                       |          |
| -> 35 |              SORT GROUP BY                    |                              |       1 |     3 |      3527 |    +99 |  181K |     152K |       |       |       |       |     |      |     0.53 | Cpu (18)                      |          |
| -> 36 |               INDEX RANGE SCAN                | FAXHISTORY_INDEX             |       1 |     3 |      3527 |    +99 |  181K |     492K |  108K | 420MB |       |       |     |      |    10.35 | Cpu (26)                      |          |
|       |                                               |                              |         |       |           |        |       |          |       |       |       |       |     |      |          | db file sequential read (327) |          |
| -> 37 |            INDEX UNIQUE SCAN                  | PRODUCTUNIQUEINDEX           |       1 |       |      3527 |    +99 |  354K |    47079 |  216K | 844MB |       |       |     |      |     2.58 | Cpu (50)                      |          |
|       |                                               |                              |         |       |           |        |       |          |       |       |       |       |     |      |          | db file parallel read (38)    |          |
| -> 38 |           TABLE ACCESS BY INDEX ROWID         | PRODUCT                      |       1 |     1 |      3521 |   +105 | 91553 |    16778 |  2997 |  12MB |       |       |     |      |     0.47 | Cpu (2)                       |          |
|       |                                               |                              |         |       |           |        |       |          |       |       |       |       |     |      |          | db file parallel read (11)    |          |
|       |                                               |                              |         |       |           |        |       |          |       |       |       |       |     |      |          | db file sequential read (3)   |          |
| -> 39 |         TABLE ACCESS BY INDEX ROWID           | COUNTERPARTYCONFIRMATION     |       1 |     1 |      3521 |   +105 | 16778 |    16746 |  4354 |  17MB |       |       |     |      |     1.32 | Cpu (1)                       |          |
|       |                                               |                              |         |       |           |        |       |          |       |       |       |       |     |      |          | db file sequential read (44)  |          |
| -> 40 |          INDEX UNIQUE SCAN                    | COUNTERPARTYCONFIRMATION_PKI |       1 |       |      3521 |   +105 | 16778 |    16746 |  3319 |  13MB |       |       |     |      |     0.94 | Cpu (1)                       |          |
|       |                                               |                              |         |       |           |        |       |          |       |       |       |       |     |      |          | db file sequential read (31)  |          |
| -> 41 |        INDEX RANGE SCAN                       | CONFIRMLEG_PKI               |       1 |     1 |      3521 |   +105 | 16778 |    44415 | 16721 |  65MB |       |       |     |      |     4.72 | Cpu (8)                       |          |
|       |                                               |                              |         |       |           |        |       |          |       |       |       |       |     |      |          | db file sequential read (153) |          |
| -> 42 |       TABLE ACCESS BY INDEX ROWID             | TRADESUMMARY                 |       1 |     1 |      3521 |   +105 | 44415 |    44415 | 26764 | 105MB |       |       |     |      |     8.04 | Cpu (13)                      |          |
|       |                                               |                              |         |       |           |        |       |          |       |       |       |       |     |      |          | db file sequential read (261) |          |
| -> 43 |        INDEX UNIQUE SCAN                      | TRADESUMMARY_PK              |       1 |       |      3521 |   +105 | 44415 |    44415 | 63801 | 249MB |       |       |     |      |    19.80 | Cpu (30)                      |          |
|       |                                               |                              |         |       |           |        |       |          |       |       |       |       |     |      |          | db file sequential read (645) |          |
| -> 44 |      TABLE ACCESS BY GLOBAL INDEX ROWID       | TRADE                        |       1 |     1 |      3521 |   +105 | 44415 |    44414 | 27333 | 107MB |       |       |     |      |     7.69 | Cpu (10)                      |          |
|       |                                               |                              |         |       |           |        |       |          |       |       |       |       |     |      |          | db file sequential read (252) |          |
| -> 45 |       INDEX UNIQUE SCAN                       | TRADE_PK                     |       1 |       |      3521 |   +105 | 44415 |    44415 | 45734 | 179MB |       |       |     |      |    16.13 | Cpu (20)                      |          |
|       |                                               |                              |         |       |           |        |       |          |       |       |       |       |     |      |          | db file sequential read (530) |          |
====================================================================================================================================================================================================================================================="

There are several choices here. The traditional approach from 80-90s (still very popular, especially among developers who specialize in languages other than SQL or PL/SQL) would be to go through the query text (and/or query plan) and identify anything suspicious in it (anything indicative of “bad practices” or “antipatterns”). This is a terribly inefficient approach, because what is a “antipattern” in one context could easily be a valid “pattern” in another, and vice versa. Besides, a list of things that could be wrong with a query is very long most of the time, and going through it one by one takes forever. Even in this particular case, despite the very simple form of the query, we could find dozens of things that are wrong in one way or another from the database conventional wisdom point of view (because when we are done with the query itself, we can go to the definition of the view and continue there).

Nowadays most professional database developers know better than this old-fashioned “checklist” approach. But there are still several choices left. We could attach the problem in the “Method R” way: i.e. look where the query spends most of the time and focus just on this part. This is a very simple, yet extremely efficient method which existed since forever, but was first formulated in simple terms by Cary Milsap in his ground breaking book “Optimizing Oracle Performance”. There is, however, one small problem here: the most time-consuming operation in the plan is only responsible for less than 20% of the elapsed time. Of course, it’s possible that the plan has several independent problems that need to be addressed separately, but it’s a bad scenario because it means more work and lesser chance of success in the end. Fortunately, such scenarios are not that common in practice. What are the other options?

There is another modern approach known as “Cardinality Feedback Tuning”, put forward by Wolfgang Breitling. It too is also very methodical, very scientific, and very efficient approach. It’s based on the analysis of actual and estimated cardinalities, find where the optimizer gets it wrong, and try to fix it. But unfortunately, this is not a very promising approach for this particular case, either. First of all, optimizer estimates don’t look completely crazy. At very least, the optimizer “understands” that it has chosen an expensive plan (cost = 140K), whereas the cardinality feedback approach is most efficient in cases when the cost is underestimated badly (because of underestimated cardinality). Besides, with 13 (!!) joins it’s nearly impossible to get all cardinalities right, because it’s enough to have just a couple 1-to-N relationships between joined tables to get a cardinality estimate several orders of magnitude off.

Since both “scientific” tuning methods were not promising, and since I’m a very lazy person, I decided to try another method, which I would call “philosophical”. I.e. understand the nature of the query, get some background on it. So I had a brief conversation with the person who brought it to my attention and quickly found that:

1) it was a newly developed query (and not e.g. an attempt to simulate a prod issue on a test database)
2) the view the query was using was newly written to, not yet released to production
3) the query was expected to return a very small number of rows because conditions on both colA and colB were very strongly selective (as it turned out, colA had a meaning of an identifier of some sort which was suppose to identify the desired row in the view uniquely or almost so).

Immediately, I knew where to focus my attention. I looked up definitions of both columns in the view and found that one of them was… a concatenation of different columns. Now the root cause of the problem becomes obvious: searching data by a concatenation of two columns is like trying to find a person using the sum of his/hers weight, height and age as the search key — there’s simply no good way to do it. Fortunately, the view contained both concatenated columns in addition to their concatenation, so the problem could be solved by replacing the condition colA = ‘12345-A’ with colA1 = ‘12345’ and colA2 = ‘A’. That way the query completed within a second instead of hanging for hours.

Obviously, in this case it’s not the issue itself that is interesting, but rather the fundamental problem of finding the right methodology for the issue. The “scientific” method isn’t necessarily always the best way to get the answer. In fact, when I got to this point in writing this blog, I realized, that I probably was overly critical of the old-fashioned “checklist” approach, because just “scanning” the text of the query and the plan for “red flags” could have given me some very useful information on the early stage of the investigation. For example, by looking at query text carefully enough, I could have noticed that colA was a concatenation of two columns, or I could have spotted the FILTER operation in the plan which is often a sign of a problem (especially when it’s right below the SELECT step — i.e. keep some unnecessary data and you only through it out at the very last step), and then the information in the predicate section would have told me the reason for this filter operation.

It is also interesting to note here that if the query did complete within reasonable time, then there is yet another (also “scientific”) method that would have given the right answer in the most immediate way: “tuning by throwaway minimization”, which, as I recently found out, was put forward by Martin Berg as long ago in 2000 (I mentioned my own variation of this method in some of my earlier blogs, e.g. here ). In this case we would have looked at dbms_xplan.display_cursor output, and then we’d see that at the last step we have tens of thousands of rows coming in, and only one coming out — obviously, that’s the biggest “throw-away” in the entire plan and that’s one needs to focus his attention.

Summary

“Scientific” (or “systematic” as some people call them) methods of troubleshooting SQL performance give best results most, but not all of the time. Sometimes it helps to look at things more “philosophically” than scientifically. In this particular case, a five minute phone conversation about the nature of the query saved me several hours of hard work doing drill down analysis.

4 thoughts on “Tuning methods (comparison based on a real-life example)”

Kyle Hailey says:

January 15, 2015 at 10:33 pm

Nice write up on the challenges and strategies to deal with query tuning of query that doesn’t even finish. Two of the strategies outlined here are covered in a sql plan query posted at the end of this article

http://datavirtualizer.com/power-of-display_cursor/

this query gives cardinality feedback and buffer to row ratio per line of the execution plan

– Kyle Hailey
http://datavirtualizer.com

Dom Brooks says:

January 16, 2015 at 10:50 am

I was about to leap on your criticism of looking at the query itself … but am pleased that you judged yourself as overly critical.

Looking at the query is not so much a checklist approach.
Fact is that, in my experience, I often find that the SQL which has performance problems – intermittent, flipflopping or constant – is often (often != always) that which is not expressed in the most natural set-based way.

It’s still a most-valid technique particularly when most application SQL seems to be written by people (or generated by tools) who don’t find it easy to find this natural expression easy.

We have to approach such a problem from all sides.
Depends also on short term vs longer term intervention – fix it right now then fix it for the future.
Sometimes we might waste a bit of time depending on which angle we look at first, we might look be looking at runtime metrics and when we eventually get to the sql statement itself, we find the answer there, sometimes vice versa.

1. savvinov says:
  
  January 20, 2015 at 4:03 pm
  
  Hi Dominic,
  
  thanks for your comment. I guess the main reason I’m not very enthusiastic about the “checklist” approach is how frequently and how badly it’s abused. Which, if you think about, is not necessarily always the fault of the method itself — but still, methods which are more quantitative and scientific are somewhat more difficult to abuse. But yes, when used properly, the “checklist” approach can also be helpful. Much depends on whether or not you can be sure that the code is well written.
  
Pingback: Method R and parallelism (another real-life example) | Oracle Diagnostician