Long-running INSERT

On one of the databases I’m looking after (, Solaris, non-RAC), several different INSERT statements (all into tablespaces with manually managed segments) suffer from occasional hiccups. The symptoms are always the same: in one of the sessions, the INSERT gets stuck doing lots of single-block I/O against one of the indexes on the inserted table, and if other sessions are running similar INSERTs, they hang on enq: TX – index contention. The situation can last just a few seconds, but sometimes it’s much longer than this (several minutes), in which case the impact on the application is quite serious.

We did some research on the subject, and found a number of similar bugs, but none of them matched our symptoms closely enough. Most of them were for ASSM tablespaces and had to do with table block being read, and not index ones. Yesterday, Jonathan Lewis published yet another blog article of a problem of a similar kind:

https://jonathanlewis.wordpress.com/2017/01/30/assm-help/ ,

and I’ve asked in comments if he was aware of similar issues on MSSM. He gave a couple of links, and one of them lead me to a blog article by Saibabu Devabhaktuni:

http://sai-oracle.blogspot.co.uk/2009/04/beware-of-index-contention-after-mass.html ,

which looks very much like our problem.

The best part was that Saibabu provided a reproducible test case, which allowed me to do some digging. I also did my own, somewhat simplified version of the script reproducing the issue, and I will give it here with some additional comments to help understand the essence of the issue.

Let me remind you, that the problem is reproducible in (I did it on Linux, although it is unlikely that the issue is platform-dependent). Saibabu raised a bug request with Oracle, and according to it, the issue was fixed in 12c, although I haven’t checked that myself.

First, the setup:

drop tablespace mssm including contents and datafiles;

create tablespace mssm datafile '/data01/orcl11g/data/mssm.dbf' size 500M
 segment space management manual
 extent management local uniform size 10M;

drop table mytest;

create table mytest(x number) tablespace mssm;

create index i$mytest$x on mytest(x) tablespace mssm;

insert into  mytest
select level
from dual
connect by level <= 200126;


delete mytest where x<=199000;


insert into mytest
  with x as 
  (select level i from dual connect by level <= 199000)
  select i from x where mod(i, 250) = 0;


Let me now explain what exactly is being done here and why. First, we create an MSSM tablespace. Then we create a table MYTEST with a single column, and an index on this column.

The table is populated with 200,216 rows (with consecutive values from 1 to 200,216). Why this particular number? I started with 200,000 — a simple number that would give me enough data to produce the desired effect, and yet not too big so that I wouldn’t have to deal with memory issues that CONNECT BY can run into (they can be easily avoided by using a cartesian join, but that means typing more code and I’m lazy that way).

Additional 216 rows are necessary to make sure that the rightmost index leaf block is full — it will later become clear what for.

Then over 99% of the rows are deleted. All blocks except the rightmost two are cleared, and added to the freelist. The next statement is the key. We are inserting data back into the previously cleared blocks, although this time we are inserting much less rows (every 250th from the original data set). This is enough to make sure that all leaf blocks now again contain data, and therefore they must be taken off the freelist. Remember, that unlike regular tables, which are just heaps of unsorted data, indexes contain data in a sorted way. It doesn’t matter where in a heap table you insert a row, but in an index, you don’t have this flexibility anymore, as you cannot insert into a leaf block entries which are not within the proper range.

If at this point you look at segment space usage (either by dumping the segment header block, or by using show_space wrapper to dbms_space written by T.Kyte), you will see that it doesn’t happen. This is where the bug is, and this explains what happens next.

If you flush the buffer cache, and run INSERT INTO MYTEST VALUES (200217) with autotrace on, you will see that Oracle does about 800 logical reads, and about half of them result in physical reads, causing a potential performance issue (remember, not only the original session would hang while doing the I/O, it could also block the other sessions as well).

What happens here is that the last insert causes a so-called 90-10 split, which, unlike the usual 50-50 split, requires the new row to be inserted into a new empty block. In order to save space, Oracle first tries to reuse one of the blocks below the high watermark which it assumes free (as they are on the freelist). If you trace it, you’ll see that Oracle ends up examining (and taking them off the freelist) the entire range of blocks cleared by the delete statement one by one.

I’ve also tried the partial workaround provided by Saibabu: enabling event 45051, and it worked (“partial” because the amount of I/O remains anomalously high, but fortunately it’s mostly repeated reads of the same blocks, so very little physical I/O will result from it). Other possible solution (apart from fixing the physical structure of the index by coalescing or rebuilding it after a large delete, which isn’t always convenient on a live system during business hours) would be to migrate to ASSM, but it’s not clear whether this would be much of improvement as ASSM has several somewhat similar bugs, and I’m not sure how many of them are fixed in recent versions.

I was unable to confirm whether or not we are hitting this exact bug or something else — some scripts have been deployed to obtain additional diagnostic information, and if we manage to get a definitive confirmation, I’ll post an update (but due to the very intermittent nature of the problem I have no idea how long this may take).

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s