The Oracle Sponge Blog -- An Oracle Data Warehousing Resource

Thursday, May 11, 2006

Getting the (Sub)Partition Name for a Row

Adapted from a response to a question posted on Oracle-l, here are three methods for finding out what partition or subpartition a table row is stored in, or which partition or subpartition a row will be stored in.

Logical Inference

For range-based or list-based partitioning or for range-list composite partitioning it is often feasible to infer the partition or subpartition name from the values of the partition and subpartition key columns, as long as a sensible partition naming convention has been chosen.

For example, for a range partitioned table defined in part by ...

Create table T (col1 date ...)
Partition By Range (col1)
(
Partition Y2006_M01 values less than (date '2006-02-01'),
Partition Y2006_M02 values less than (date '2006-03-01'),
Partition Y2006_M03 values less than (date '2006-04-01'),
...
)

...we can use a SQL function to provide the partition name:

To_Char(col1,'"Y"YYYY"_M"MM')

Note the use of double-quotes in the above formula to identify literal strings in the date format.

This is a fast and simple method, although an historically varying granularity on the partitions or a complex partitioning scheme could make it difficult to maintain.

DBMS_MView.PMarker

In cases where it is difficult or impossible to do perform the logical inference (hash partitioning, for example, or with complex multicolumn range/list partitioning) there are a couple of other techniques that you can use.

In the DBMS_MVIEW package there is a PMARKER function that returns the data_object_id for the object in which the row resides, and you can join to user/dba/all_objects using that.

Here's a funky example script using variable numbers of hash subpartitions ...

drop table t
/

create table t (col1 , col2 , col3 )
partition by range (col1)
subpartition by hash (col2)
(
partition p1 values less than (2) subpartitions 2,
partition p2 values less than (3) subpartitions 4,
partition p3 values less than (4) subpartitions 8
)
as
select mod(rownum,3)+1,
      floor(dbms_random.value(1,256)),
      floor(dbms_random.value(1,10))
from   all_objects
where rownum < 101
/

with   obj as
      (select --+ materialize
              data_object_id,
              subobject_name
       from   user_objects
       where  object_name = 'T' and
              object_type = 'TABLE SUBPARTITION')
select subobject_name,
      col3
from   T,
      obj
where  data_object_id = DBMS_MView.PMarker(t.rowid)
order by 1
/

My unsubstantiated guess is that this uses the file#/block# of the rowid to perform a look-up on the extent that contains the row. It therefore also ought to be possible to "hand-roll" a similar method based on extracting the file# and block# from the rowid's and joining to the dba/user/all_extents view (or a materialized subset of it, for performance reasons) to get the segment properties. I don't know if I'd care to go that route myself.

TBL$OR$IDX$PART$NUM()

The TBL$OR$IDX$PART$NUM() function gives you the appropriate partition number for a table value or set of values. It's an undocumented (except through metalink, if that counts) function with a couple of magic numbers in it, but the general format to use is ...

TBL$OR$IDX$PART$NUM("PARTITIONED_TABLE_NAME", 0, d#, p#, "COLUMN_NAME")

The easiest way to get the appropriate format and magic numbers to use for this function is to run a trace on a "with validation" partition exchange against the table of interest, and you'll pretty much get the complete SQL that you need.

So you might end up with something similar to ...

with   utp as
      (select --+ materialize
              partition_position,
              partition_name
       from   user_tab_partitions
       where  table_name = 'MY_TABLE')
select utp.partition_name,
      last_name,
      first_name
from   my_table,
      utp
where  utp.partition_position = TBL$OR$IDX$PART$NUM("MY_TABLE", 0, 0,
65535, "PART_COL")
/

This method has the advantage that you can just supply it with arbitrary values for the partition key column(s) and you will get the appropriate partition number/name. Therefore you can use it to answer the question "Which partitions will these rows go into?". If you use partition exchanges "without validation" then you can also use it to run periodic checks on whether your ETL process has been putting rows in the wrong partitions.

The disadvantage is that it is undocumented and has those pesky magic numbers, or course.

Summary

My preference is to use the method of logical inference, if possible.

The other two are evidently based on very different methodologies -- one being based on the physical location of the row and the other on values of the partition key column(s), so they can be applied to different situations. The TBL$OR$IDX$PART$NUM method is rather more amenable to performance tuning by pre-aggregating the table data based on distinct partition keys prior to applying the function call.

10 Comments:

At 11:52 PM, Anonymous said...: The dbms_mview.pmarker function probably doesn't have to do any work with the file#/block# to get the object number, as the object number (in fact the data_object_id, rather than the object_id) will be in the block header of the block. Here's the relevant extract from a data block dump:

Block header dump: 0x0340080a
Object id on Block? Y
seg/obj: 0x10e3f
At 4:34 AM, mathewbutler said...: Good to see you posting again. More please :o).
At 9:02 AM, David Aldridge said...: Thanks for that Jonathan,

I guess that the inner workings could be verified in some cunning way -- unless your "probably" is you being super-cautious and your 99.999% certain that this is the way it works ;)

If it is using the block header then that would make the use of the file#/block# to lookup the object# via the extent map a different logical method altogether then. If I have a quiet moment then I might have a crack at doing that just to see if it's feasible. I doubt it, because these joins to system views don't usually go well, hence the materializing of the SQF clause. Having to join to dba_extents and dba_objects is probably asking too much.

I ought to have mentioned that the DBMS_MView.PMarker method works for non-partitioned objects also, and therefore can tell you what table a row is in. I suppose that might be of value in distinguishing what set of a union-all query a row comes from, or something like that.

My word verification of "oxbnd" just gave me a flashback to O-level geography. The horror!
At 9:05 AM, David Aldridge said...: Mathew,

I noticed Melanie C's comment on TK's blog about some techie blogs being social clubs with little technical content, and I got all paranoid. I thought I'd best get my act together.

I'm having trouble at the moment pruning my articles down to bloggable size -- it's the "And Another Thing ..." syndrome.
At 9:31 AM, mathewbutler said...: I'd thought you were just taking a break. Nice to know you have a backlog of stuff to post.

Cheers,

Mat.
At 3:09 AM, Anonymous said...: David,
The 'probably' is because I can't think of anything that would be less work for Oracle to do at that point - and it's the "obvious" choice. Occam's Razor applies until further notice.

I'm not sure whether you could prove that my guess was true from any activity that could be monitored - you could only show that there was nothing to contradict the hupothesis.

Jonathan Lewis
At 8:50 AM, David Aldridge said...: Is the "read data object id from block header" functionality available to SQL or PL/SQL engine in any way, Jonathan?
At 3:37 AM, Anonymous said...: Getting the data_object_id from the object header - I don't know of an explicit call but you can get the data_object_id from a rowid with dbms_rowid.rowid_object:

select
dbms_rowid.rowid_object('AAARYAAANAAAASKAAA')
from dual;

DBMS_ROWID.ROWID_OBJECT('AAARYAAANAAAASKAAA')
---------------------------------------------
71168

1 row selected.

SQL> select object_id, data_object_id from user_objects where object_name = 'CHILD';

OBJECT_ID DATA_OBJECT_ID
---------- --------------
71147 71168
At 7:30 PM, David Aldridge said...: I must learn to RTFM ... :(

I suppose that's actually pretty explicit -- it doesn't seem to require that the rowid exists, or that the block is below the HWM or anything ...

SQL> create table t (col1 number);

Table created.

SQL> insert into t values (1);

1 row created.

SQL> commit;

Commit complete.

SQL> select rowid from t;

ROWID
------------------
AABDZJAAWAAE2cKAAA

SQL> select dbms_rowid.rowid_object('AABDZJAAWAAE2cKAAB') from dual;

DBMS_ROWID.ROWID_OBJECT('AABDZJAAWAAE2CKAAB')
---------------------------------------------
276041

SQL> select dbms_rowid.rowid_object('AABDZJAAWAAE2cLAAA') from dual;

DBMS_ROWID.ROWID_OBJECT('AABDZJAAWAAE2CLAAA')
---------------------------------------------
276041

In fact it seems that the data object id is just being extracted directly from the extended rowid (the 'AABDZJ' in the above).

SQL> select dbms_rowid.rowid_object('AABDZJZZZZZZZZZZZZ') from dual;

DBMS_ROWID.ROWID_OBJECT('AABDZJZZZZZZZZZZZZ')
---------------------------------------------
276041

... and I'd infer that this is a very efficient way of getting the object id as it's just performing a check on the format and a base conversion on the first six bytes.
At 7:38 PM, David Aldridge said...: Tut.

And I kick myself for not realizing immediately that DBMS_MVIEW.PMARKER is probably just a wrapper for DBMS_ROWID.ROWID_OBJECT.

Also, for using a "z" in "realising" instinctively. These eight years have taken their toll.

The DBMS_ROWID call consistently seems to be about 10% faster though.

1 declare
2 l_num number;
3 begin
4 for x in 1 .. 100000 loop
5 select dbms_mview.pmarker('AABDZJZZZZZZZZZZZZ')
6 into l_num
7 from dual;
8 end loop;
9* end;
SQL> /

PL/SQL procedure successfully completed.

Elapsed: 00:00:07.54
SQL> edit
Wrote file afiedt.buf

1 declare
2 l_num number;
3 begin
4 for x in 1 .. 100000 loop
5 select dbms_rowid.rowid_object('AABDZJZZZZZZZZZZZZ')
6 into l_num
7 from dual;
8 end loop;
9* end;
SQL> /

PL/SQL procedure successfully completed.

Elapsed: 00:00:06.59

Oracle Sponge -- Now Moved To Wordpress

Thursday, May 11, 2006

Getting the (Sub)Partition Name for a Row

10 Comments:

About Me

Previous Posts