“4k bug” is not a bug

A couple of weeks back I received an update on my “4k bug” SR (slow multirow DML performance for small block sizes). As it turns out, the observed behavior is not a bug. It’s actually more interesting than that: it is an undocumented feature that chooses the insert mode based on the estimated row size (the sum of the declared sizes of all columns). If the estimated row size exceeds the block size, Oracle switches to row-by-row processing.

This can be illustrated with the example below (in a 4k tablespace):

select * from v$version;
  
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Solaris: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production  

set autotrace traceonly statistics

create table t4k(x number, padding varchar2(4000));

insert into t4k 
select level, rpad(' ', 100, ' ')
from dual
connect by level <= 10000;

Statistics
----------------------------------------------------------
          1  recursive calls
      11200  db block gets
        313  consistent gets
        109  physical reads
    3733140  redo size
        830  bytes sent via SQL*Net to client
        857  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
      10000  rows processed


create table t4k2(x number, padding varchar2(2000));

insert into t4k2
select level, rpad(' ', 100, ' ')
from dual
connect by level <= 10000;


Statistics
----------------------------------------------------------
          1  recursive calls
       1255  db block gets
        616  consistent gets
         19  physical reads
    1317000  redo size
        838  bytes sent via SQL*Net to client
        858  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
      10000  rows processed

As you can see, it suffices to decrease the column length, so that the maximum row length falls below the block size, to improve performance significantly (for example, redo generation drops by a factor of almost 3!).

This means that specifying a large column size “just in case” can get you into trouble if it pushes the total column size over the block size. Obviously, the smaller the block size, the greater the risk.
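To check a single table, you can sum the declared column lengths yourself and compare the result against the block size of the table's tablespace (a quick sketch against user_tab_columns; the table name is just a placeholder):

```sql
-- Estimated maximum row length for one table (placeholder name T4K).
-- If the result is close to or above the block size, bulk DML on the
-- table will fall back to row-by-row processing.
select sum(data_length) est_row_length
from   user_tab_columns
where  table_name = 'T4K';
```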

The query below identifies the tables that will have slow bulk DML because of a high maximum row length:

  select c.owner, 
         c.table_name, 
         sum(data_length) est_row_length, 
         ts.block_size
  from dba_tab_columns c,
       (select owner, table_name, tablespace_name 
        from dba_tables 
        union select table_owner owner, table_name, tablespace_name 
        from dba_tab_partitions
       ) t,
       dba_tablespaces ts
  where c.owner = t.owner
  and c.table_name = t.table_name
  and ts.tablespace_name = t.tablespace_name
  group by ts.block_size, c.owner, c.table_name
  having sum(data_length)>=ts.block_size
  order by 1, 2;

For the tables returned by this query, bulk DML is internally performed row by row, decreasing performance and increasing redo generation. To change this, you’ll need to either shrink the column sizes, or migrate the table to a tablespace with a larger block size.
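For the t4k table above, either remedy could look like the sketch below. The tablespace name ts8k is an assumption: it would have to be created with an 8k block size beforehand, which in turn requires the corresponding db_8k_cache_size parameter to be set.

```sql
-- Option 1: shrink the column so the estimated row length
-- (22 bytes for the NUMBER plus 2000 for the VARCHAR2) fits in a 4k block.
-- Shrinking a VARCHAR2 is allowed as long as all existing values fit.
alter table t4k modify (padding varchar2(2000));

-- Option 2: move the table to a larger-block tablespace.
-- ts8k is a placeholder for a tablespace created with BLOCKSIZE 8192.
alter table t4k move tablespace ts8k;
```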
