.comment-link {margin-left:.6em;}

Oracle Sponge -- Now Moved To Wordpress

Please use http://oraclesponge.wordpress.com

Wednesday, March 15, 2006

Predicate Pushing And Analytic Functions

I'm sure that there must be a fair number of Oracle professionals who carry around in their heads a little score card of some of their best tuning results ever ... hopefully, at least, 'cos otherwise I'm a weirdo. Anyway, today I improved the performance of a set of decision support queries and achieved my best result ever - improving query speed by a factor of 180,000, from an original run time of one hour down to a new time of 0.02 seconds.

The wise monkeys in the audience will immediately be thinking "partition pruning!" or "materialized view!", and in fact if you thought the former then you'd be right. Here's how it worked.

I had defined a view against a large fact table so that it included several analytic functions. The details really don't matter, but the intention was to allow a column to contribute to a metric value only once per transaction_id, so the metric definition was something like:

Decode(Row_Number() Over (Partition By Transaction_ID Order By 1),1,Qty,0)

Queries that access this view had predicates on a number of columns, including some very selective indexed columns (Item_ID) and a partition key column (Fiscal_Month). Unfortunately there is nothing in the database to tell the optimizer that each unique value of Transaction_ID had but a single value Item_ID and Fiscal_Month, so logically the predicates could not be applied until after the analytic function had been calculated. Hence there was no predicate pushing on the fiscal_month and item_id, and neither partition pruning nor index access was considered. The query was actually scanning about one quarter of the table (it looks like a combination of subpartition pruning and partition pruning was taking place, but this table is multicolumn range + list composite partitioned, and pruning at the partition level was only taking place on the leading column of the partition key).

However, we included the two predicate columns in the analytic functions' partition clause like so:

Decode(Row_Number() Over (Partition By Transaction_ID, Item_Id, Fiscal_Month Order By 1),1,Qty,0)

Now this doesn't change the result because the Item_Id and Fiscal_Month entries are actually logically redundant, but it did allow both 9i and 10g to push the predicates and give full partition and subpartition pruning and index-based access.

Quite a nice result.

Here's a script that I used to demonstrate that it would work.

drop table t1;
create table t1
(
txid ,
month ,
item_cd,
qty
)
as
select
floor(rownum/5),
floor(rownum/20),
floor(rownum/10),
floor(rownum/5)
from
dual
connect by
level < 100
/

explain plan for
select txid,
qty
from (
select txid,
item_cd,
month,
qty,
decode(row_number()
Over (Partition By txid Order By 1),1,qty,0)
qty_fix
from t1
) v
where item_cd = 0
/
select * from table(dbms_xplan.display)
/

explain plan for
select txid,
qty_fix
from (
select txid,
item_cd,
month,
qty,
decode(row_number()
Over (Partition By txid Order By 1),1,qty,0)
qty_fix
from t1
) v
where item_cd = 0
/
select * from table(dbms_xplan.display)
/

explain plan for
select txid,
qty_fix
from (
select txid,
item_cd,
month,
qty,
decode(row_number()
Over (Partition By txid,item_cd Order By 1),1,qty,0)
qty_fix
from t1
) v
where item_cd = 0
/
select * from table(dbms_xplan.display)
/

In 10g the explain plan with predicate pushing was very straightforward:

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------

Plan hash value: 2273146475

----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9 | 252 | 3 (34)| 00:00:01 |
| 1 | VIEW | | 9 | 252 | 3 (34)| 00:00:01 |
| 2 | WINDOW SORT | | 9 | 351 | 3 (34)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| T1 | 9 | 351 | 2 (0)| 00:00:01 |
----------------------------------------------------------------------------


PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - filter("ITEM_CD"=0)

Note the filter being applied to line 3. Where the predicate was not pushed it was applied to line 1.

The 9i explain plan was rather ... funky. But it worked.

7 Comments:

At 12:06 AM, Blogger Doug Burns said...

Have you been spending time with Jonathan Lewis lately?

;-)

 
At 3:39 AM, Anonymous Anonymous said...

Nothing to do with me.

Just because I was sitting next to him ( http://oracledoug.blogspot.com/2006/03/hotsos-day-3-wrap-up.html ) doesn't mean I was talking to him. It wasn't even his copy of the book I was signing.

Jonathan Lewis

 
At 6:05 AM, Blogger Doug Burns said...

Ah, but he was at your training day!

(Which I am *still* waiting for one of the bloggers who were in attendance to review. David gave me an informal review via email but it included a swear-word of sorts so he may need to edit it)

 
At 6:27 AM, Blogger David Aldridge said...

I'll tell you what's unbelievable -- that Shrek wouldn't let me autograph Jonathan's book. Cheek!

I'll tell you what else though, there is a thread at AskTom which starts off with the assertion that you can't predicate push into a view with analytic functions because it changes the result set, and that made me cry a modern equivalent of "Poppycock!" when I read it, but later on Tom cunningly refines his advice into "you can do it if the optimizer knows it won't change the result set" (which is rather different from being able to do it if it won't change the result set, of course). I was still suprised that the optimizer did it, more so about 9i than 10g.

I thought I had him there, but alas ...

Anyway, for the small-slice-of-the-public record the training day was an excellent experience. It was a good job Jonathan skipped out early 'cos I was going to ask him a series of unfeasible questions, such as "All of what you've said today, does it apply with parallel query/partitioning/MTS as well?" Again, alas ...

 
At 3:23 PM, Blogger Don Burleson said...

Hi David,

Hmm, and 18k improvement with a single change? Not too shabby.

I'd call it a Silver Bullet, but I know better . . . .

 
At 7:09 AM, Blogger David Aldridge said...

"Silver bullet" ... heh heh heh.

I guess if there is a wider lesson to be drawn from this very specific situation, it is to reinforce that reducing work by filtering as soon as possible is a Very Important Thing. When it also involves partition pruning then the results can be particularly memorable.

Now that is something that Jonathan addressed in the training day -- except for the bit about partitioning ;)

 
At 8:16 AM, Blogger Mike said...

It would be interesting to see if a defined dimension (realizing they were meant for materialized views, but should work in this situation) would accomplish the same thing...that is...telling the optimizer about the relationship...since that is what dimensions are supposed to do, tell the optimizer about hierarchies in flattened table structures.

Mike

 

Post a Comment

Links to this post:

Create a Link

<< Home