Positive feedback loops

Ever since I heard Tanel Poder talk about application servers causing positive feedback loops (performance gets worse -> open more sessions -> make performance even worse) in Riga 2012 LVOUG conference, I was curios to see one myself. And recently I got a chance to see a very interesting variety of such behavior.

positive_feedback_loop_example

On the plot above you can see database CPU usage (red line) versus time, plotted together with the number of sessions running same SQL statement (identically the same, i.e. not even different binds — there are no binds in this statement). There is a clear correlation between the two.   Apparently, when database performance degrades to a certain point, the application server is not getting the answer fast enough, and launches the query again in another session, thus degrading performance even more, etc.

This behavior  is very dangerous and may easily cause prolonged outages (see the plateau towards the right side of the plot — the database server was maxed out in terms of CPU for over 24 hours!).

Moral of the story: don’t spawn more sessions in response to poor performance, unless you want poor performance to quickly become terrible!

Tuning very complex SQL

Troubleshooting relatively short plans is simple, once one gets a little practice, but the mere sight of a plan that has several tens or even hundred operations can be very initimidating. In this post I am describing a few tricks that make this task easier.

First of all, use V$SQL_PLAN/V$SQL_PLAN_STATISTICS instead of (or in addition to) the formatted plan produced by DBMS_XPLAN

For example, in DBMS_XPLAN output for a long plan it is hard to figure out relationships between operations, but it is very straightforward to figure them out by querying V$SQL_PLAN, e.g.:


select *
from v$sql_plan p
where sql_id = :sql_id
and id = :op_id
and P.PARENT_ID = :operation_id;

If the statement is a join of several views, and the optimizer doesn’t merge the views, then it is convenient to check which views are contributing to the performance problem the most, e.g.:

select  object_name, s.last_cr_buffer_gets
from v$sql_plan p,
     v$sql_plan_statistics s
where p.sql_id = s.sql_id
and p.child_number = s.child_number
and P.ID = s.operation_id
and p.sql_id = :sql_id
and p.child_number = :child_no
and p.operation = 'VIEW'
order by S.LAST_CR_BUFFER_GETS desc;

Continue reading “Tuning very complex SQL”