DUMPING ORACLE BLOCKS
Brian Peasland, Raytheon at EROS Data Center
In a few cases, it is desirable to dump the contents of an Oracle data block. These blocks may comprise a table, an index, or even the control file. While we can query the contents of a table, we may want to see what is happening to a particular block of that table. We might want to see not only the data in the block that any query can return, but also see what‟s going on behind the scenes in the block. This paper will show you how to dump the contents of a block in a table, index and control file. This paper will also show you how to interpret some of the results of these dumps.
WHY DUMP BLOCKS?
So why are we doing this? For the most part, it is just idle curiosity. DBAs are inquisitive folks by nature. Oracle Corp. has released just enough information on database internals to tantalize us without giving away all the secrets. And we‟d like to see what‟s going on behind the scenes. So for most cases, we are dumping blocks just for fun. In other cases, we are dumping blocks to actually find out some meaningful information. But in the end, it is up to you.
TRACE FILE INFORMATION
All of the examples in this paper will generate trace files. Those trace files will be present in USER_DUMP_DEST for you to view. To determine the trace file
generated, use a query similar to the following:
ORA9I SQL> select pa.value || '/' || i.instance_name || '_ora_'
2 || pr.spid || '.trc' as trace_file
3 from v$session s, v$process pr, v$parameter pa, v$instance i
4 where s.username = user and s.paddr = pr.addr
5* and pa.name='user_dump_dest';
This query shows the full path and filename of the generated trace file for my session. This is the text file we look in to see the results of our dump.
All trace files contain the same basic information at the beginning of the file.
Oracle9i Enterprise Edition Release 220.127.116.11.0 - Production
With the Partitioning option
JServer Release 18.104.22.168.0 - Production
ORACLE_HOME = /edcsns14/pkg/oracle/9.0.1
System name: SunOS
Node name: edcsns14
Instance name: ora9i
Redo thread mounted by this instance: 1
Oracle process number: 11
Unix process pid: 653, image: oracle@edcsns14 (TNS V1-V3)
*** 2002-03-27 17:06:06.573
*** SESSION ID:(12.4240) 2002-03-27 17:06:06.535
Output similar to above is shown in each trace file. This output shows the database version, some platform specific information such as host name and OS level, the database instance name, the processes identifiers (Oracle and Unix) for the session that generated the trace file, and the date and time the file was generated. We‟ll skip this introductory information in examining our trace files.
DUMPING A TABLE BLOCK
To dump a block belonging to a table, you‟ll need to know the file number and block number of that block. If you already know the file number and block, then you are all set. But for our example, we‟ll want to figure that out.
ORA9I SQL> select file_id,block_id,bytes,blocks
2 from dba_extents
3 where owner='PEASLAND' and segment_name='EMP';
FILE_ID BLOCK_ID BYTES BLOCKS
------- ---------- ---------------- --------
3 9 65,536 8
Here, I‟ve queried the data dictionary to find out which file my EMP table resides
in. The EMP table is in file# 3, starting at block# 9, and is eight blocks long. This query will return one row for each extent of the object. So this object is comprised of only one extent. We can verify which tablespace file# 3 belongs to with the following query:
ORA9I SQL> select tablespace_name,file_name from dba_data_files
2 where relative_fno = 3;
------------------------------ ----------------------------------------- USER_DATA /edcsns14/oradata3/ora9i/user_data01.dbf
As I had expected, my table is in the USER_DATA tablespace.
Now that we know which file and blocks hold our table, let‟s dump a sample block of the table. This is done as follows:
ORA9I SQL> alter system dump datafile 3 block 10;
You can dump a range of blocks with the following command:
ORA9I SQL> alter system dump datafile 3 block min 10 block max 12;
Let‟s now look at the contents of dumping one block.
Start dump data blocks tsn: 3 file#: 3 minblk 10 maxblk 10
buffer tsn: 3 rdba: 0x00c0000a (3/10)
scn: 0x0000.00046911 seq: 0x02 flg: 0x04 tail: 0x69110602
frmt: 0x02 chkval: 0x579d type: 0x06=trans data
Block header dump: 0x00c0000a
Object id on Block? Y
seg/obj: 0x6d9c csc: 0x00.46911 itc: 2 flg: O typ: 1 - DATA
fsl: 0 fnx: 0x0 ver: 0x01
Itl Xid Uba Flag Lck Scn/Fsc 0x01 xid: 0x0005.02f.0000010c uba: 0x00806f10.00ca.28 C--- 0 scn 0x0000.00046900
0x02 xid: 0x0003.01c.00000101 uba: 0x00800033.0099.04 C--- 0 scn 0x0000.00046906
This is the beginning of the data block dump. The first line tells us that we are dumping file#3, starting at block# 10 (minblk), and finishing with block# 10
(maxblk). Had we dumped more than one data block, these values would represent a range. The relative data block address (rdba) is 0x00c0000a. For
more information on the rdba, refer to a later section in this paper. At the end of this line, we can see in parentheses that the rdba corresponds to file# 3, block# 10 (3/10).
The third line describes the SCN of the data block. In our case, the SCN is 0x0000.00046911. The tail of the data block is composed of the last two bytes of the SCN (6911) appended with the type (06) and the sequence (02). If the decomposition of the tail does not match these three values, then the system knows that the block is inconsistent and needs to be recovered. While this tail value shows up at the beginning of the block dump, it is physically stored at the end of the data block.
The block type shows up on the fourth line. Some of the valid types correspond to the following table:
0x02 undo block
0x06 table or index data block
0x0e undo segment header
0x10 data segment header block
0x17 bitmapped data segment header
The “Object id on Block?” line tells us whether or not this object is in SYS.OBJ$.
Since Oracle 6, this should always be “Y”. If you look at the next line, the
seg/obj value tells us the segment‟s object id (in hex). In our example, this is 0x6d9c. Hex „6D9C‟ is „28060‟ in decimal. We can verify that this is our table with the following query:
ORA9I SQL> select owner,object_name from dba_objects
2 where object_id=28060;
As we had hoped, this is our table.
The csc value is the Cleanout System Change number. This value tells us when block cleanout was performed on this block. Hopefully, it matches the SCN of the data block. The itc value is the Interested Transaction List Count. In our case, there are two transactions interested in this block. Those interested transactions appear at the end of our example. We can see the transaction id (Xid) of those
two transactions. Those transaction ids correspond to rollback segments that are used to process our transactions.
The flag (flg) is either “-” or “O”, used to indicate if this block is on a freelist. If the block is on a freelist, the flag will be “0”. If it is not on a freelist, then the flag
will be “-”. Our block in question is on the freelist.
Well, that was quite a lot of information and we haven‟t really looked at too much of the dump. Let‟s look at the next section of the data block dump.
0xe:pti nrow=14 offs=0
The tsiz value shows us the amount of available room in the block for data. Here, we get „1fa0‟ which translates to 8,096 bytes of useable room. The rest of our 8,192 byte block is used for overhead such as the block header.
The ntab value shows us how many tables are stored in this block. Unless this block belongs to a cluster, this value will be „1‟. The nrow value tells us how
many rows of data are stored in this block. Our data block has 14 rows of data.
Starting at address „0xe‟, we get a directory to each row. We can see that the
first row (index entry zero) starts at offset address to the block „0x1c30‟. Each of the blocks rows follows from here. This way, a row can be found really quickly. Remember that a ROWID is basically a pointer to a unique row. In Oracle 8+, the ROWID is of the form O.F.B.R (or objectno,relativefno,blockno,rowno). So when the system quickly points to a particular block in a particular file, the row number points to a slot in this directory. The directory then points to a specific location in the block. This is the start of that row.
Now that we have a roadmap to our data block, let‟s look at the remainder of the trace file to see the actual rows of data in the block.
tab 0, row 0, @0x1c30
tl: 39 fb: --H-FL-- lb: 0x0 cc: 8
col 0: [ 3] c2 4a 46
col 1: [ 5] 53 4d 49 54 48
col 2: [ 5] 43 4c 45 52 4b
col 3: [ 3] c2 50 03
col 4: [ 7] 77 b4 0c 11 01 01 01
col 5: [ 3] c2 09 19
col 6: *NULL*
col 7: [ 2] c1 15
The actual row data starts with the phrase “block_row_dump:”. Then a row of
data is given. I‟ve only shown one row of data here, as the rest is similar. We can see that this row belongs to table „0‟ (tab) of our cluster. Since there is no
cluster in our example, we do not have more than one table so this value will be zero. We can also see that this is row „0‟ and the address of that row is given. This address should correspond to our roadmap noted above.
The „tl‟ value gives us the total number of bytes for this row, including any overhead. We can see that this row occupies 39 bytes. The „cc‟ value gives us a
column count. We have eight columns in this row. This can easily be verified by
doing a DESCRIBE on the table and counting the columns, or by querying USER_TAB_COLUMNS.
The „fb‟ value gives us flags about the row. „H‟ means that we have the head of the row. „F‟ means that we have the first piece of the row. „L‟ means we also have the last piece of the row. Since this is the first and last piece of the row, the row is not chained. Since this is also the head of the row, the row has not been migrated.
The rest of the information for the row is the data for each column. For instance, in column 1, we have the following ASCII character codes, “53 4d 49 54 48”. A
quick look at an ASCII conversion chart will tell us that these characters are “SMITH”. If you are familiar with the sample EMP table, you will know that SMITH
is one of our employees. Notice that column 6 is NULL. Column 4 is the HIREDATE
column. This is a DATE datatype. From this block, you can easily verify that the DATE datatype requires seven bytes of storage. Column 0 contains a number. The three bytes here are the representation of that number.
DUMPING THE EXTENT HEADER
In the previous section, we dump a block in the middle of the table. The first block in the segment contains some interesting information as well. If you recall, our EMP table was in file#3, starting at block# 9. We previously dumped a block in the middle of the table. Let‟s now dump the first block of this table.
ORA9I SQL> alter system dump datafile 3 block 9;
We will now look at the contents of our trace file.
Start dump data blocks tsn: 3 file#: 3 minblk 9 maxblk 9
buffer tsn: 3 rdba: 0x00c00009 (3/9)
scn: 0x0000.000467ee seq: 0x01 flg: 0x04 tail: 0x67ee1001
frmt: 0x02 chkval: 0x3d71 type: 0x10=DATA SEGMENT HEADER - UNLIMITED
Extent Control Header
Extent Header:: spare1: 0 spare2: 0 #extents: 1 #blocks: 7
last map 0x00000000 #maps: 0 offset: 4128
Highwater:: 0x00c0000b ext#: 0 blk#: 1 ext size: 7
#blocks in seg. hdr's freelists: 1
#blocks below: 1
mapblk 0x00000000 offset: 0
Map Header:: next 0x00000000 #extents: 1 obj#: 28060 flag: 0x40000000
0x00c0000a length: 7
We‟ve already seen the first four lines in the previous section. One thing to
notice is that the type is 0x10 for a data segment header. After this, things look a
little different. We are now given information about this particular extent. We can see that this particular segment is comprised of only one extent (#extents) and
that it has seven data blocks (#blocks). If you remember from the previous
section, this segment is really composed of eight blocks. But the first block is for the segment header, so it is not counted here.
From this segment header, we can get information about the High Watermark (HWM). If we look at the line starting with „Highwater::‟, we can see that the thHWM is in rdba 0x00c0000b. This is in the 0 extent of our segment (ext#), in stthe 1 block (blk#). Remember that we count from zero so the HWM is really in the first extent, second block. Additionally, two lines below this one, we can see that there is only 1 block below the HWM (#blocks below). We can also see that
there is only one block on the segment header freelist (#blocks in seg. hdr’s
freelists). Let‟s look at a more complex example from a table that is using more than one block.
Extent Header:: spare1: 0 spare2: 0 #extents: 48 #blocks: 383
last map 0x00000000 #maps: 0 offset: 4128
Highwater:: 0x00c01d8c ext#: 47 blk#: 3 ext size: 8
#blocks in seg. hdr's freelists: 0
#blocks below: 378
Here, we can see a much bigger table. From this information, we can tell that the thtable is made of 48 extents, totaling 383 blocks. The HWM is in the 47 extent in
block 3. There are 378 blocks below the HWM. This means that we have five blocks above the HWM. Before the DBMS_SPACE package became available, this
method of dumping the segment header was the only way to get information on the HWM. Now that the DBMS_SPACE package is available, this method has
become basically obsolete. But it is still an interesting exercise.
DUMPING AN INDEX
For this section of the paper, we are only concerning ourselves with B-tree indexes. This section does not cover dumping blocks of bitmap indexes or IOTs.
The data blocks that make up an index are different than table data blocks. A B-tree index has branch blocks and leaf blocks. We will first perform a tree dump of the entire B-tree index. We will examine various blocks of this tree dump. To start the tree dump, we need to know the OBJECT_ID of the index. This is done
with a simple SQL statement:
ORA9I SQL> select object_id from user_objects
2 where object_name='DB_OBJ_OBJID_IDX';
We now have the OBJECT_ID of our index. Let‟s generate a dump of this index.
ORA9I SQL> alter session set events 'immediate trace name treedump level 28046';
With the ALTER SESSION SET EVENTS command above, we have forced the
system to generate a tree dump of the B-tree index. Note that the level number is the OBJECT_ID of the index in question. We have now generated a trace file that we can read to get detailed information of our B-tree index. Let‟s examine
the trace file a section at a time to understand its contents.
----- begin tree dump
branch: 0xc01d92 12590482 (0: nrow: 9, level: 2)
branch: 0xc01ed7 12590807 (-1: nrow: 323, level: 1)
leaf: 0xc01d93 12590483 (-1: nrow: 42 rrow: 42)
This is the beginning of our tree dump. The first branch block is the root of the tree. In our example, this block is hex address c01d92, which translates to
decimal address 12590482. These addresses help us traverse from block to block. Notice that the level of this root block is level 2. If you start counting at zero, then it should be obvious that this index B-tree has a height of three.
Below the root block is another branch block at address 0xc01ed7. You can verify
that this block is one below the root by looking at the level of this block, in this
case, „1‟. The next line shows us the first leaf block of the B-tree index. Leaf
blocks are always at level 0 so there is no need to put that information in the dump. What follows next is a dump of that leaf block.
Leaf block dump
header address 2349359196=0x8c08605c
KDXCOLEV Flags = - - -
kdxcoopc 0x80: opcode=0: iot flags=--- is converted=Y
Of interest here is the kdxlenxt and kdxleprv values. These denote the
addresses of the next leaf block and previous leaf block respectively. These pointers are important for INDEX RANGE SCANS of data. The next leaf block is at address 0xc01d94 while the previous leaf block is address 0x0. This block is the
first leaf block looking at the tree from left to right. Therefore, there is no previous leaf block. The address 0x0 signifies no previous block. Likewise, the
rightmost leaf block will have no next leaf block and will have this dummy address. Continuing with the leaf block dump is the data in the leaf block.
row#0 flag: -----, lock: 0
col 0; len 2; (2): c1 03
col 1; len 6; (6): 43 5f 4f 42 4a 23
col 2; len 3; (3): 53 59 53
col 3; len 6; (6): 00 c0 1c 4a 00 9b
row#1 flag: -----, lock: 0
col 0; len 2; (2): c1 03
col 1; len 6; (6): 43 5f 4f 42 4a 23
col 2; len 3; (3): 53 59 53
col 3; len 6; (6): 00 c0 1d 07 00 9b
I‟ve only included the first two rows of data in the leaf block. You can see that (counting from zero) these are the first two rows. If you page through the dump file, you‟ll see each and every row in the leaf block. There are many more leaf
blocks, but they look very similar.
Of particular interest are the three columns for each row. This index is a composite index on three columns of the table. The index columns are listed in order. You can see the hexadecimal values of the ASCII codes for the data in those columns. This is where you can see the actual data stored in the leaf rows. Unfortunately, you‟ll have to convert these ASCII codes by hand.
After this leaf block, the rest of the leaf block dumps off that first branch block are shown. With many, many leaf blocks per branch block, it can generate quite a lengthy dump file. If you page forward far enough, you‟ll see the end of the leaf block dump and the start of the next branch block.
----- end of leaf block dump -----
branch: 0xc01fef 12591087 (0: nrow: 279, level: 1)
leaf: 0xc01ed6 12590806 (-1: nrow: 25 rrow: 25)
Here is the start of the second branch block off the root node. I know that this branch block is one off the root block since its level is one less. How do I know that this branch block is the second branch block from left to right? Look at the first number inside the parentheses. For this block, that number is „0‟. The leftmost branch block is „-1‟. You can go back in this paper to verify this is true
for the first branch block. Then start counting forward. So „0‟ is really the second branch block.
Why start counting at „-1‟ when normally we start counting at „0‟? It all has to do with how keys are stored in the branch blocks. The first key is not stored because it is assumed to be less than the second key. Since the second key is started at „0‟ this means the first branch is one less than zero.
Let‟s look at one more piece of information. The NROW indicator shows the
number of rows in that block. For leaf blocks, this is the number of table rows pointed to by that leaf block. For branch blocks, this is the number of blocks
underneath that branch block in the tree. Leaf blocks also contain the RROW
indicator. RROW is the number of rows after all current transactions have been committed. In our examples, these two indicators are the same since there are no active transactions against that data.