AWR report: load profile

Introduction

The “Load profile” section of the AWR report contains some extremely useful information, and yet it is very often overlooked (often in favor of the instance efficiency percentages, which are easier to read but much more likely to mislead). I decided to put together a short guide to it, describing how the different statistics in it can be used to better understand database performance.

Redo size

Everything that you do in a database is protected by redo. Redo is a collection of so-called “change vectors” that tell Oracle how to repeat an operation on data if necessary. Even though SELECTs can also generate some redo, the main sources of redo are (in roughly descending order): INSERT, UPDATE and DELETE. For INSERTs and UPDATEs, the size of redo is close to the amount of data created or modified. For DELETEs, you only need to know the rowids of the deleted rows to repeat the operation, so if the rows are “fat”, the size of redo may be much smaller than the size of the deleted data.
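If you want to see redo generation first-hand, one simple way (a rough sketch, assuming you have access to V$MYSTAT and V$STATNAME) is to check the session-level “redo size” statistic before and after a DML statement:

    -- Redo generated by the current session so far (in bytes); run this before
    -- and after a DML statement and subtract to see how much redo it produced.
    SELECT n.name, s.value
      FROM v$mystat s
      JOIN v$statname n ON n.statistic# = s.statistic#
     WHERE n.name = 'redo size';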

High redo figures mean that either lots of new data is being saved into the database, or existing data is undergoing lots of changes.

How high is high? Databases are not created equal, so there is no universal standard. However, I find it useful to multiply redo per second by 86,400 (the number of seconds in a day) and compare the result to the size of the database — if the numbers are within the same order of magnitude, that would make me curious. Is the database doubling in size every few days? Is it modifying almost every row on a daily basis? Or maybe there is something going on that I don’t know about?
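For the database-size side of that comparison, here is a minimal sketch (assuming SELECT access to DBA_SEGMENTS; the exact view you use may differ):

    -- Approximate size of the data stored in the database, in GB.
    -- Compare against (redo bytes per second from the load profile) * 86,400.
    SELECT ROUND(SUM(bytes) / 1024 / 1024 / 1024, 1) AS db_size_gb
      FROM dba_segments;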

What do you do if you find that redo generation is too high (and there is no business reason for it)? Not much, really — there is no “SQL ordered by redo” section in the AWR report. Just keep an eye open for any suspicious DML activity. Any unusual statements? Usual statements executed more often than usual, or producing more rows per execution than usual? Also, be sure to take a good look at the segment statistics sections (segments by physical writes, segments by DB block changes, etc.) to see if there are any clues there.
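Outside of the report itself, a similar picture can be sketched from V$SEGMENT_STATISTICS (note that these values are cumulative since instance startup, unlike the interval-based figures in AWR):

    -- Top 10 segments by block changes since instance startup.
    SELECT *
      FROM (SELECT owner, object_name, object_type, value AS db_block_changes
              FROM v$segment_statistics
             WHERE statistic_name = 'db block changes'
             ORDER BY value DESC)
     WHERE ROWNUM <= 10;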

Logical reads, block changes, physical reads/writes

Logical reads is simply the number of blocks read by the database, including physical (i.e. disk) reads — that is, a logical read that cannot be satisfied from the buffer cache also results in a physical read. Block changes is fairly self-descriptive. These statistics tell you the nature of the database activity (read-mostly, write-mostly, a bit of both) and its scale at the time of the report. They also give you an idea of how well data caching works in the database (but you can also see that directly from the buffer cache hit ratio in the “instance efficiencies” section).

If you find those numbers higher than expected (based on the usual numbers for this database, the current application workload, etc.), then you can drill down to the “SQL by logical reads” and “SQL by physical reads” sections to see if you can identify the specific SQL responsible.
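If you prefer querying the database directly to reading the report, here is a quick sketch against V$SQL (again, cumulative figures for the cursors currently in the library cache, not interval-based like AWR):

    -- Top 10 SQL by logical reads; order by disk_reads instead to rank by
    -- physical reads.
    SELECT *
      FROM (SELECT sql_id, buffer_gets, disk_reads, executions,
                   SUBSTR(sql_text, 1, 60) AS sql_text_fragment
              FROM v$sql
             ORDER BY buffer_gets DESC)
     WHERE ROWNUM <= 10;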

User calls

A user call is when a database client asks the server to do something: logon, parse, execute, fetch, etc. This is an extremely useful piece of information, because it sets the scale for other statistics (such as commits, hard parses, etc.).

In particular, when the database is executing many times per user call, this could be an indication of excessive context switching (e.g. a PL/SQL function in a SQL statement called too often because of a bad plan). In such cases, looking at “SQL ordered by executions” will be the logical next step.
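For a cumulative (since instance startup) view of the same ratio, here is a sketch based on V$SYSSTAT:

    -- Executions per user call; compare against what is normal for this database.
    SELECT MAX(DECODE(name, 'execute count', value)) /
           NULLIF(MAX(DECODE(name, 'user calls', value)), 0) AS executes_per_user_call
      FROM v$sysstat
     WHERE name IN ('execute count', 'user calls');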

Parses and hard parses

A parse is the analysis of a query’s text and, optionally, the optimization of a plan. If plan optimization is involved, it’s a hard parse; otherwise, it’s a soft parse.

As we all know, parsing is expensive (performance-wise). Excessive parsing can cause very nasty performance problems (one moment your database seems fine, the next moment it comes to a complete standstill). Another bad thing about excessive parsing is that it makes troubleshooting of poorly performing SQL much more difficult.

How much hard parsing is acceptable? It depends on too many things, such as the number of CPUs, the number of executions, how sensitive plans are to SQL parameters, etc. But as a rule of thumb, anything below 1 hard parse per second is probably okay, and anything above 100 per second suggests a problem (if the database has a very large number of CPUs, say above 100, those numbers should be scaled up accordingly). It also helps to look at the number of hard parses as a percentage of executions (especially if you’re in the grey zone).
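The raw counters behind these ratios can also be pulled from V$SYSSTAT (cumulative since instance startup; for per-second rates, take the delta between two samples or two AWR snapshots):

    -- Cumulative parse and execution counters.
    SELECT name, value
      FROM v$sysstat
     WHERE name IN ('parse count (total)', 'parse count (hard)', 'execute count');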

If you suspect that excessive parsing is hurting your database’s performance:

1) check the “Time model statistics” section (hard parse elapsed time, parse time elapsed, etc.)

2) see if there are any signs of library cache contention among the top 5 timed events

3) see if CPU is an issue.

If that confirms your suspicions, then find the source of excessive parsing (for soft parsing, use “SQL by parse calls”; for hard parsing, use force_matching_signature) and see if you can fix it.
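For the hard-parsing side, one common approach (a sketch, by no means the only way) is to group V$SQL by FORCE_MATCHING_SIGNATURE to spot statements that differ only in literals:

    -- Statements that would collapse into a single cursor under CURSOR_SHARING=FORCE:
    -- many distinct SQL_IDs sharing one FORCE_MATCHING_SIGNATURE usually means
    -- literals are used where bind variables would be appropriate.
    SELECT *
      FROM (SELECT force_matching_signature,
                   COUNT(DISTINCT sql_id) AS similar_statements,
                   MIN(sql_id)            AS sample_sql_id
              FROM v$sql
             WHERE force_matching_signature <> 0
             GROUP BY force_matching_signature
            HAVING COUNT(DISTINCT sql_id) > 10
             ORDER BY COUNT(DISTINCT sql_id) DESC)
     WHERE ROWNUM <= 10;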

Sorts

Sort operations consume resources. Also, expensive sorts may cause your SQL to fail because of running out of TEMP space. So obviously, the less you sort, the better (and when you do sort, you should sort in memory). However, I personally rarely find sort statistics particularly useful: normally, if expensive sorts are hurting your SQL’s performance, you’ll notice it elsewhere first.
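If you do want to look at them, the counters are in V$SYSSTAT (cumulative since instance startup):

    -- Disk sorts spill to TEMP; ideally their share of total sorts stays tiny.
    SELECT name, value
      FROM v$sysstat
     WHERE name IN ('sorts (memory)', 'sorts (disk)', 'sorts (rows)');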

Logons

Establishing a new database connection is also expensive (and even more expensive when auditing or logon triggers are involved). “Logon storms” are known to create very serious performance problems. If you suspect that a high number of logons is degrading your performance, check “connection management call elapsed time” in the “Time model statistics” section.
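Outside of the report, the same figure is available (cumulative, in microseconds) from V$SYS_TIME_MODEL:

    -- Time spent establishing and tearing down connections vs. overall DB time.
    SELECT stat_name, ROUND(value / 1000000, 1) AS seconds
      FROM v$sys_time_model
     WHERE stat_name IN ('connection management call elapsed time', 'DB time');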

Executes

The executes statistic is very important for analyzing performance, but everything I had to say about it has already been said above in the “User calls” and “Parses and hard parses” sections.

Transactions

This is another extremely important statistic, both on the general level (creating context for understanding the rest of the report) and on the specific level (troubleshooting performance problems related to transaction control). The AWR report provides information about transactions and rollbacks, so the number of commits can be calculated as the difference between the two. Rollbacks are expensive operations and can cause performance problems if used improperly (e.g. in tests, to revert the database to its original state after testing); this can be addressed either by reducing the number of rollbacks or by tuning the rollback segments. Rollbacks can also indicate that a branch of code is failing and is therefore forced to roll back its results (this can go unnoticed if the resulting errors are not handled or rethrown properly).

Excessive commits can lead to performance problems via “log file sync” waits.

How many is excessive? Once again, this entirely depends on the database. Obviously, OLTP databases commit more than DWH ones, and among OLTP databases the numbers can vary by several orders of magnitude. For the databases I have worked with, below 10-20 commits per second there was never a problem, and above 100-200 there almost always was (when in doubt, look at the “top timed events”: if there are no “log file sync” waits up there, then you’re probably okay!).
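The underlying counters (cumulative since instance startup) are once again in V$SYSSTAT:

    -- Commit vs. rollback counters; if commits look excessive, check whether
    -- 'log file sync' shows up among the top timed events.
    SELECT name, value
      FROM v$sysstat
     WHERE name IN ('user commits', 'user rollbacks');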

Other posts on AWR

https://savvinov.com/awr/

28 thoughts on “AWR report: load profile”

  1. Hi Savvy, awesome post. I am new to the performance tuning part of databases and I got very useful concepts from your posts.
    Looking forward to the rest of the AWR report parts.

    1. Hi Sandy,

      thanks for your feedback! If you have any AWR reports that you have trouble interpreting, feel free to send them over and I’ll do my best to help!

      Best regards,
      Nikolay

      1. Hey Savvinov, I’m pretty new to administering a database and I’m facing an issue wherein our DB stops working. I have the AWR report but am not able to completely understand the issue. I would highly appreciate it if you could help.

  2. Sure Savvy. Actually I have gone through various AWR-related articles on the net, but to gear up I want some case studies. I know that this needs a lot of experience and currently I don’t have that; I am 2 years experienced in the database field. Therefore I have started taking AWR reports of our database at various times and trying to analyze them. I will surely post some of those where I am facing difficulty. Thanks a lot, Savvy, for your encouraging comments! :)

  3. Hi,
    IMO, it is not right to include physical reads in logical reads as you said.
    Logical is buffer cache access, physical is disk access.

    1. Hi Yasir,

      thanks for your comment.

      it’s very easy to test whether or not logical reads include physical ones. Create a moderately big table without any indexes, flush the buffer cache to be extra sure that no blocks are in the cache, check the value of “session logical reads” and “physical reads” statistics, then run a query against that table, mark the value of those statistics again, and subtract — the two resulting numbers will be very close to each other.

      I just ran a quick test myself, and it worked as expected. If you have any difficulty reproducing this behavior let me know and I’ll post the test case.

      1. Hi Nikolay,
        There are two scenarios:
        1) If the blocks are not found in the cache, Oracle would go to disk and put them into the cache.

        2) If the blocks are found in cache, no disk access is required.

        Therefore, a logical read does not always require a physical read.
        It may require one.

        So, my point is that generally a logical read should be classified as a buffer cache read, not a disk read.

      2. Hi Yasir,

        obviously, physical reads don’t include logical reads. And I never claimed that. If you read carefully, I was stating exactly the opposite: logical reads include physical reads.

        Let me put it this way: any request for a data block results in a logical read. If the request for block cannot be satisfied from cache, then it will also result in a physical read.

        BTW in addition to reading a block from buffer cache, there’s also a possibility of a direct read — in which case the block would go to the workarea directly, without going to the buffer cache first.

        In any case, it’s easier to understand by seeing it with your own eyes. Try to make a test case where you know for sure that all reads are physical, and watch the values of stats “session logical reads” and “physical reads” before and after the test.

  4. Hi Nikolay,
    I didn’t say you said physical reads require logical reads.
    What you stated was “Logical reads is simply the number of blocks read by the database, including physical (i.e. disk) reads”

    This classification of logical reads only holds for the special case when there are no blocks in the buffer cache. Generally speaking, a logical read is a buffer read. It “may” involve a disk operation.
    You can easily verify that with SET AUTOTRACE TRACEONLY STATISTICS.
    Run the query twice and you will see that the second time the physical reads are close to 0.

    1. Yasir,

      if I say that there are 100 cars on a parking lot, and that includes broken cars — it doesn’t mean or imply that all cars on the parking lot are broken. :) But I see where this misunderstanding is coming from, thanks for clarifying that.

      Let me think if I can rephrase it more clearly.

  5. Hi Nikolay,
    Thanks for the link.
    “Logical reads may result in a physical read if the requested block does not reside within the buffer cache”
    The docs clearly say it “may” result in a physical read, i.e. it is not always the case.
    Your statement gives the impression that it is always the case.

  6. Hi Nikolay,

    just want to thank you for your blog!
    I am currently trying to understand an AWR report and the information that you put here is really helpful.

    1. Hi,

      thanks for visiting my blog! If you have trouble understanding an AWR report, you can post it here (or mail it to me) and I will help you.

      Best regards,
      Nikolay

  7. Hello Nikolay,

    Thanks for this wonderful post. I am new to Oracle performance tuning and your post has helped me get a lot of insight.

    Thanks and Regards
    -Vijayendra

  8. Hi Dear, your blog is very much useful for new DBAs and experienced DBAs alike.
    Very soon I am coming with some performance tuning questions…

  9. Hi, I am having trouble understanding the DB CPU time and DB time in the AWR load profile.
    I don’t understand how DB time can be calculated “per second”. The same goes for DB CPU time. For example, if I am a cook, then my cooking time should be equivalent to DB time in Oracle, and my cooking time was 30 minutes. What would my cooking time “per second” be? I think it should always be 1 (if I have only a 1-core CPU).

  10. Hi Nikolay,

    Can you please share some of the things to look at in an AWR report for any database-related performance issue?
    I am not a DB SME, but understanding the AWR report well enough to identify bottlenecks in overall DB performance makes my performance engineering job easier… especially while discussing with developers and other project stakeholders.

    Thanks in advance!

  11. Thanks for the post, Sav; however, I have a question.

    In my 1-hour (busy hour) AWR report, the total redo size is 1,411,472K, which is about 1.4GB. But when I take a look at v$log_history, I find that nearly 78 redo log switches happened during that hour, which is about 40GB (512M per redo log).

    How can this be explained, or am I understanding something wrong?

    1. Hi Ricky,

      I’d trust the AWR figure. The figure you derived based on log switches is much less reliable because things other than a log file becoming full can cause a log switch (the simplest example would be ALTER SYSTEM SWITCH LOGFILE).

      Best regards,
      Nikolay
