SQL tuning is the key to database performance tuning, and yet this seems to be a “blind spot” for many database specialists. I think it’s probably because it falls right on the boundary between DBA’s and developer’s responsibilities: so while a DBA expects database developers take care of performance while developing code, developers often neglect that, relying on a “develop first, let the DBA tune it later” approach. Also, until recently, there were surprisingly few good (and accessible to a newbie) descriptions of how to read a SQL plan.
Now that we have Chritian Antognini’s great book “Troubleshooting Oracle Performance”, the situation has improved dramatically. But still, I think that a blog post on that subject won’t hurt: after all, it’s free and it’s written by someone who still remembers difficulty his first analyzing SQL plans. :)
The most important thing about SQL plans is that they’re not some flat lists of SQL operations. They are hierarchical structures. Operations may have parent-child relationships between each other, and this has a couple of important implications:
1) the cost and timing of an operation is the sum of costs/timings for all its children, plus the possible overhead on combining their inputs (no such simple formula exists for the cardinality — it entirely depends on how the rows are combined together);
2) an operation depends on its children to complete — so the hierarchy determines the order of steps, and order is everything in SQL plans.
So understanding SQL plans boils down to two things: understanding individual plan operations (full table scans, various kinds of index scans, filters etc.) and understanding how they are combined together (using joins, set and FILTER operations). It is not always necessary to understand (i.e. have an integral picture) the entire query — and it will be simply impossible if the query is big enough — but it is important to identify the part of the plan that matters most and understand it.
In the rest of the post I am assuming that the reader is already familiar with access paths (full table scans, index range scans etc.) and joins methods (nested loops, hash joins, merge joins) so I’ll focus solely on putting it all together.
Probably the most confusing thing about SQL plans is that the order of operations in the plan is not directly related to the actual chronological order. Some GUI tools (like Toad) offer various methods of displaying plans which may be easier to visualize, like plan trees, but it’s useful to build the plan tree by hand to get a better grasp of what the query is doing and how.
To find the first operation, simply keep moving from top to bottom until indentation stops growing. Then identify the sibling(s) by looking at operation(s) immediately below with the same level of indentation. Then identify the parent by going just one step up from the child operation identified in the first step. Then repeat this sequence of steps over and over until the entire tree is built.
If the plan isn’t too complex then it’s easier to visualize it drawing it “workflow-style”. Let’s consider an example using the standard HR schema from demo scripts.
In this example, we are selecting employee first and last name together with department name for all employees with salary below 10,000 from departments located in the European region.
select department_name, first_name, last_name, salary from HR.DEPARTMENTS d, hr.locations l, hr.countries c, HR.REGIONS r, hr.employees e where D.DEPARTMENT_ID = E.DEPARTMENT_ID and d.location_id = l.location_id and c.country_id = l.country_id and c.region_id = r.region_id and R.REGION_NAME = 'Europe' and salary < 10000; ------------------------------------------------------------ | Id | Operation | Name | ------------------------------------------------------------ | 0 | SELECT STATEMENT | | | 1 | NESTED LOOPS | | | 2 | NESTED LOOPS | | | 3 | NESTED LOOPS | | | 4 | HASH JOIN | | | 5 | NESTED LOOPS | | | 6 | VIEW | index$_join$_002 | | 7 | HASH JOIN | | | 8 | INDEX FAST FULL SCAN | LOC_COUNTRY_IX | | 9 | INDEX FAST FULL SCAN | LOC_ID_PK | | 10 | INDEX UNIQUE SCAN | COUNTRY_C_ID_PK | | 11 | TABLE ACCESS FULL | REGIONS | | 12 | TABLE ACCESS BY INDEX ROWID| DEPARTMENTS | | 13 | INDEX RANGE SCAN | DEPT_LOCATION_IX | | 14 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | | 15 | TABLE ACCESS BY INDEX ROWID | EMPLOYEES | ------------------------------------------------------------
First, we move down until the indentation level stops increasing — that’s step 8. This is our first operation. Right below it, step 9 has the same indentation level, it’s a sibling. Right above it is the step 7, it’s the parent (indentation level just 1 less). If indentation isn’t clearly seen (big and cluttered plan), then one can use V$SQL_PLAN view to identify parent/children relationships instead: e.g.
select P.ID, P.PARENT_ID from v$sql_plan p where sql_id = 'c03487d4nwbws' and id in (8, 9); --- 8 7 9 7
Step 7 isn’t joined to anything else, it’s the only child of step 6: “VIEW”. Actually, “VIEW” operation doesn’t involve any processing, it’s simply an indication of the fact that the view that corresponds to the child operations, wasn’t merged.
In the same fashion we establish that the parent of step 6 is step 5, which has another child, step 10. Using the same reasoning, we continue to find that:
4 is the parent of 5 and 11
3 is the parent of 4 and 12
2 is the parent of 3 and 14
1 is the parent of 2 and 15.
Of course, the order of children is of key importants: it determines how the rowsource will be accessed by the join. Graphically, the result is seen on the diagram below:
In this form, the diagram shows the flow on time (from top to bottom) same way it’s done on workflow diagrams, which makes it easier to understand the sequence of steps. Note that the fact that the joins on the left are have numbers decreasing by 1 from top to bottom is a mere coincidence.
In my next post I am planning to continue on the same subject and show how cardinality and cost can be used to tune a SQL statement.