Very often, significant performance benefits can be obtained by using some very basic knowledge of the application, its data and business rules. Sometimes even less than that: even if you are not familiar with the application logic at all, you can still use common sense to make some reasonable guesses that would get you a long way in improving query’s performance. Here is an example (based on an actual query that I had to tune today).
Database query tuning is mostly about getting better plans. Mostly, but not always. Sometimes, the problem has nothing to do with the plan, and you might need to get a bit creative to find a solution. In this recent case a query was showing a decent performance when running from SQL Developer, but it took about 5 times longer to complete when running from R. The plan was the same, so I knew that it was irrelevant. The R session wasn’t showing as active most of the time, so it was fairly clear that the problem was fetching data — i.e. it was fetching too few rows at a time which lead to a large number of roundtrips, and consequently, high waits on “idle” event “SQL*Net message from client”.
On one of the databases I’m looking after (184.108.40.206, Solaris, non-RAC), several different INSERT statements (all into tablespaces with manually managed segments) suffer from occasional hiccups. The symptoms are always the same: in one of the sessions, the INSERT gets stuck doing lots of single-block I/O against one of the indexes on the inserted table, and if other sessions are running similar INSERTs, they hang on enq: TX – index contention. The situation can last just a few seconds, but sometimes it’s much longer than this (several minutes), in which case the impact on the application is quite serious.
Today I’d like to share another tuning example from a recent case at work, which in my opinion is good for illustrating typical steps involved in SQL optimization process.
I was handed a poorly performing query with a relatively verbose text, so I will only give the general structure here (it will also prevent me from accidentally disclosing some sensitive information from that application):
Nested loop join appears like the simplest thing there could be — you go through one table, and as you go, per each row found you probe the second table to see if you find any matching rows. But thanks to a number of optimizations introduced in recent Oracle releases, it has become much more complex than that. Randolf Geist has written a great series of posts about this join mechanism (part 1, part 2 and part 3) where he explores in a great detail how numerous nested loop optimization interact with various logical I/O optimizations for unique and non-unique indexes. Unfortunately, it doesn’t cover the physical I/O aspects, and that seems to me like the most interesting part — after all, that was the primary motivation behind introducing all those additional nested loop join mechanism on the top of the basic classical nested loop. So I conducted a study on my own, and I’m presenting my results in the mini-series that I’m opening with this post.
Chasing cost efficiency, business often cuts back on money spent on UAT boxes used for performance testing. More often than not, this is a bad-decision, because the only thing worse than not having a UAT environment is having a UAT environment that is nothing like production. It gives a false sense of security while exposing your application to all sorts of nasty surprises. In this post I tried to summarize a few typical configuration differences between UAT and production which can affect performance test results in a major way.
Continue reading “Lies, damned lies and non production-like performance testing”
SQL trace file provide the highest level of detail possible about SQL execution. The problem with that information is converting it to a convenient format for further analysis. One very good solution is parsetrc tool by Kyle Hailey written in Perl. It gives high-resolution histograms, I/O transfer rates as a function of time, and other very useful info. Unfortunately, I myself am not a Perl expert, so it’s a bit difficult for me to customize this tool when I need something slightly different from defaults (e.g. change histogram resolution, look at events not hardcoded into the script etc.). Another limitation is that since the tool is external to the database, you can’t join the data anything else (like ASH queries). So I found another solution for raw trace file analysis: external tables + regexp queries.
Nowadays, data in databases is wrapped in may layers of cache: result cache, buffer cache, OS page cache, storage hardware cache… They greatly improve performance, but they also make it less stable and harder to predict. And when I/O performance takes a turn for worse, one has to go through multiple layers of cache trying to understand what went wrong and why. I had such a case not too long ago.
The incident took place on an Oracle 220.127.116.11 database running on a Solaris 10 server. The first symptom was decreased throughput of one of batch processes in the database by about 30 percent. Since the process was spending more than half of its time doing I/O (as ASH indicated), it was easy to establish that the slowdown was linked to increased time of db file sequential reads by plotting average wait times from DBA_HIST_SYSTEM_EVENT (the SQL code for this and other queries in this blog post can be found below, in the Appendix):
As most of database developers who have to work with database performance analysis and tuning, I have my own collection of favorite sqlplus scripts for various occasions. Over years, I have turned many of them into PL/SQL code which allows greater flexibility. Imagine, for examples, trying to call a sqlplus script from another sqlplus script with both in and out parameters, or simulating autonomous transactions in sqlplus — obviously, such things are much easier in PL/SQL. Of course, PL/SQL has certain drawbacks of its own, one of them being the rather peculiar way Oracle handles user security. More specifically, you cannot use privileges received via roles inside PL/SQL code. This can be really annoying, especially for performance benchmarking code, which heavily relies on access to data dictionary views, granted via SELECT_CATALOG_ROLE or similar, e.g.: Continue reading “Accessing data dictionary views from PL/SQL”
When producing SQL execution plans with rowsource statistics, one has a choice whether or not specify the sql_id. It’s simpler to run dbms_stats.display_cursor without specifying sql_id, but it can be tricky, because then the plan for the last statement executed by this session will be displayed, and it can be not the statement you expected, especially when running the statement from a GUI like Toad. But even with sqlplus you can still get unexpected results, e.g. a plan for a hidden call to a dbms_output routine instead of the last statement you ran.
Below I describe a method to bypass this difficulty.
Copy your SQL into a file (let’s call it slow_query.sql). Then in the same directory open a sqlplus session, and in it run the wrapping script as shown below:
set termout off set serverout off set linesize 200 set pagesize 9999 alter session set statistics_level = all; @slow_query.sql; spool slow_query_plan.lst select * from table(dbms_xplan.display_cursor(null, null, 'allstats last')); spool off
You can save it to a script and used it repeatedly with different queries (all that needs to be changed is the name of the script with the query on line 6).
The script produces a text file with nicely formatted plan with rowsource statistics, ready for analysis or posting. Note that since the output is suppressed
(for the reasons explained in the next section), in case of a SQL error you won’t see one — if that happens, just temporarily re-enable the output (set termout on,
set serverout on) to find out the cause of the error.
If you’re not interested in how it’s working, then you can stop reading — just copy-paste the script below, and you’re all set.
For those who are more curios, I’ll describe the purpose of individual instructions in the script:
set termout off — suppress output of query rows on the screen. This is necessary, if the query produces many rows of output: displaying them would slow down the script considerably
set serverout off — I’m not entirely sure how, but this line takes care of wrapped sqlplus calls to dbms_output.disable or dbms_output.get_lines. If you don’t add this line, then you may end up having the plan for one of such call instead of your statement
set linesize 200 — this makes sure that lines aren’t wrapped
set pagesize 9999 — prevents headers of plan columns from appearing on each screen
alter session set statistics_level = all — this is the key to obtaining rowsource statistics (i.e. how many rows were produced at each step and how long the step took), without it the execution plan is almost useless for tuning
The rest of the script should be fairly self-explanatory.