
Oracle Sponge -- Now Moved To Wordpress

Please use http://oraclesponge.wordpress.com

Wednesday, November 23, 2005

Optimizing Materialized Views Part III: Manual Refresh Mechanisms

So, back on the subject of fast-refreshing multiple materialized views.

Reviewing my previous comments on this, I think it's time for some clarification. I wrote before that I prefer to consider fast refresh to be based on the materialized view log rather than on the master table itself, but as I also wrote in a followup this is not actually the case when you load a master table using direct path inserts. In that circumstance the materialized view log is not touched at all, and the logging instead takes place through the ALL_SUMDELTA system view. So from the viewpoint of the refresh process itself you can consider the materialized view log to be redundant. This doesn't mean that you can do away with it, however: there is no mechanism in Oracle for declaring that a materialized view will be fast refreshed only after direct path inserts and that the MV log will never be used.
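Incidentally, this is easy to verify for yourself. Here is a sketch, assuming the MASTER table, the test data table, and the materialized view log used later in this post, contrasting where each insert mode logs its changes:

```sql
-- Conventional path insert: changes are recorded in the MV log.
insert into master select * from master_test_data;
select count(*) from mlog$_master;    -- rows appear here
rollback;

-- Direct path insert: the MV log is untouched, and the load is
-- instead recorded against the segment via ALL_SUMDELTA.
insert /*+ append */ into master select * from master_test_data;
commit;
select count(*) from mlog$_master;    -- still empty
select * from all_sumdelta;           -- the loaded rowid ranges appear here
```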

In fact you might want to go so far as to ensure that it really is redundant by preventing inserts into it, since any such insert would indicate that a conventional path operation was being used. A couple of mechanisms for this spring to mind, such as placing the log in a read-only tablespace, or placing an error-raising insert trigger on it. Either would effectively prevent anything other than direct path inserts against the master table, since conventional path inserts, updates and deletes would all fail on their change logging. Considering the performance disparity between conventional and direct path change logging and the subsequent fast refresh, I would be inclined to go ahead with such a plan.
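For example, a minimal version of the trigger-based approach might look like this (the trigger name and error number are of course arbitrary):

```sql
-- Statement-level trigger: any conventional path DML against MASTER
-- attempts to write to the log and fails with our error instead.
create or replace trigger master_mvlog_no_insert
before insert on mlog$_master
begin
  raise_application_error(
    -20001,
    'Conventional path DML on MASTER is not allowed: use direct path inserts');
end;
/
```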

Refresh Mechanism Following Direct-Path Insert

So given that you have inserted data to the master table in direct path mode, what is the refresh mechanism for the materialized views? Looking at a SQL trace file tells us that a MERGE statement is used. Here is a prettified example derived from the script used here.

Merge Into MASTER_MV2 SNA$
Using (
Select /*+ Opt_Estimate(Query_Block Max=1000) */
DLT$0.LOCATION_CD GB0,
Trunc(DLT$0.TX_TIMESTAMP,'MM') GB1,
Sum(1) D0,
Nvl(Sum(1*(DLT$0.TX_CST)), 0) D1,
Sum(1* Decode((DLT$0.TX_CST), Null, 0, 1)) H1,
Nvl(Sum(1*(DLT$0.TX_QTY)), 0) D2,
Sum(1* Decode((DLT$0.TX_QTY), Null, 0, 1)) H2
From (
Select /*+ Cardinality(16800) No_Merge No_Merge(LL$) Rowid(MAS$)
Ordered Use_Nl(MAS$) No_Index(MAS$) PQ_Distribute(MAS$,Random,None) */
MAS$.*
From SYS.SUMDELTA$ LL$, MASTER MAS$
Where LL$.TABLEOBJ# = 256646 And
MAS$.Rowid Between LL$.LOWROWID And LL$.HIGHROWID
) As Of Snapshot (:2) DLT$0
Group By DLT$0.LOCATION_CD, Trunc(DLT$0.TX_TIMESTAMP,'MM')
) AV$
On (
Sys_Op_Map_NonNull(SNA$.LOCATION_CD) = Sys_Op_Map_NonNull(AV$.GB0) And
Sys_Op_Map_NonNull(SNA$.TX_MONTH) = Sys_Op_Map_NonNull(AV$.GB1)
)
When Matched Then
Update Set SNA$.C_STAR = SNA$.C_STAR+AV$.D0,
SNA$.S_TX_CST = Decode(SNA$.S_TX_CST,
Null, Decode(AV$.H1, 0, Null, AV$.D1),
(SNA$.S_TX_CST + AV$.D1)),
SNA$.S_TX_QTY = Decode(SNA$.S_TX_QTY,
Null, Decode(AV$.H2, 0, Null, AV$.D2),
(SNA$.S_TX_QTY + AV$.D2))
When Not Matched Then
Insert (SNA$.LOCATION_CD, SNA$.TX_MONTH, SNA$.C_STAR, SNA$.S_TX_CST, SNA$.S_TX_QTY)
Values (AV$.GB0, AV$.GB1, AV$.D0,
Decode(AV$.H1, 0, Null, AV$.D1),
Decode(AV$.H2, 0, Null, AV$.D2))
One of the characteristics to note is the absence of an APPEND hint in the merge statement, which to me implies the following:
  1. New rows will be inserted through the conventional path. No direct path for you!
  2. Data segment compression on the MV base table will not be leveraged.
  3. If a fast refreshable MV references MV1 as its master table then the change logging will be performed in the MV1 materialized view log, not through the ALL_SUMDELTA mechanism.

Optimizing MV Refresh By Avoiding MERGE

Another issue to note is simply that a single merge statement is used instead of an insert plus an update. It is not inconceivable that you have a situation in which the new data added to a fact table does not require modification of existing rows in one or more materialized views. If the materialized view groups by "Column A", and each load cycle on the fact table contains values of "Column A" that never appear in any other load cycle, then this would be the case, and this uniqueness might be either implicit or explicit in your data loading application.

The distinction here between implicit and explicit refers to whether the uniqueness of "Column A" to a particular load cycle is a deliberate design feature or more a matter of happenstance or business requirements. An example of an explicit situation would be where you use a unique "load cycle code" as part of your ETL procedure (not a bad idea in itself), assigning a new value to each load and pushing this into one or more MV's. An example of an implicit situation might be where each load cycle operates on a granularity of fact table data that is repeated in the materialized view, such as where you load a complete day of transactions represented by a contiguous midnight-to-midnight range of "TRANSACTION_TIME" and the materialized view groups by an expression such as TRUNC(TRANSACTION_TIME).

In such a circumstance it is conceivable that it would be more efficient to manage the materialized view aggregation ourselves than to leave it to the RDBMS, because using an insert allows us the following potential advantages:
  1. We avoid a potentially expensive join to the MV.
  2. A direct path write allows us to minimize logging.
  3. A direct path write allows us to use data segment compression.
  4. We can enforce a sort order in the materialized view data, potentially improving report query performance.
I predict that this last issue will become more important in 10gR2 as the new hash-based aggregation algorithm replaces the current sort-based method, and the likelihood that GROUP BY returns a value-clustered result set is reduced.
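To illustrate that fourth point: the manual aggregation can simply carry an ORDER BY, something the internal merge-based refresh gives us no control over. A sketch, using the tables from the test case later in this post:

```sql
insert /*+ append */ into master_mv1_exch
select   location_cd,
         trunc(tx_timestamp,'MM') tx_month,
         sum(tx_qty)              s_tx_qty,
         sum(tx_cst)              s_tx_cst,
         count(*)                 c_star
from     master
group by location_cd,
         trunc(tx_timestamp,'MM')
order by location_cd     -- cluster rows by the most commonly predicated column
/
```

The ordering also tends to improve segment compression, since similar values land in the same blocks.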

However the tricky part of such an operation is in modifying the materialized view's data, which is not permitted through ordinary DML:
SQL> insert into master_mv1
2 select
3 location_cd,
4 trunc(tx_timestamp,'MM') tx_month,
5 sum(tx_qty) s_tx_qty,
6 sum(tx_cst) s_tx_cst,
7 count(*) c_star
8 from
9 master
10 group by
11 location_cd,
12 trunc(tx_timestamp,'MM')
13 /
insert into master_mv1
ERROR at line 1:
ORA-01732: data manipulation operation not legal on this view
For the experienced data warehouse developer an obvious method for bypassing such a limitation is to use a partition exchange operation, so let us construct a test case.

In this script we load a fact table upon which are based two materialized views, one refreshable on demand and the other defined as never refresh. The latter will be manually refreshed using a direct path insert into a non-partitioned table followed by a partition exchange without validation. To prevent either refresh process benefiting from buffer caching by the other, the buffer cache is flushed between them.
drop table master;
drop table master_test_data;
drop table master_mv1_exch;
drop materialized view master_mv1;
drop materialized view master_mv2;

create table master
(
location_cd  number not null,
tx_timestamp date   not null,
product_cd   number not null,
tx_qty       number not null,
tx_cst       number not null
)
pctfree 0
/

create materialized view log on master
with rowid
(location_cd, tx_timestamp, tx_qty, tx_cst)  -- log columns inferred from the MV definitions
including new values
/

create materialized view master_mv1
compress nologging
Partition By List (TX_MONTH)
(
Partition Y2005M01 Values (to_date('01-jan-2005'))
)
Using No Index
Never Refresh
Enable Query Rewrite
As
select
location_cd,
trunc(tx_timestamp,'MM') tx_month,
sum(tx_qty) s_tx_qty,
sum(tx_cst) s_tx_cst,
count(*) c_star
from
master
group by
location_cd,
trunc(tx_timestamp,'MM')
/

create table master_mv1_exch
compress nologging
as
select *
from master_mv1
where 1=0
/

create materialized view master_mv2
compress nologging
Partition By List (TX_MONTH)
(
Partition Y2005M01 Values (to_date('01-jan-2005'))
)
Using No Index
Refresh Fast On Demand
Enable Query Rewrite
As
select
location_cd,
trunc(tx_timestamp,'MM') tx_month,
sum(tx_qty) s_tx_qty,
sum(tx_cst) s_tx_cst,
count(*) c_star
from
master
group by
location_cd,
trunc(tx_timestamp,'MM')
/

create table master_test_data
(
location_cd  not null,
tx_timestamp not null,
product_cd   not null,
tx_qty       not null,
tx_cst       not null
)
pctfree 0
as
select                                              -- value distributions are
trunc(dbms_random.value(1,10000))  location_cd,     -- reconstructed; only the
to_date('01-jan-2005','dd-mon-yyyy')                -- timestamp expression and
  + dbms_random.value(0,30)        tx_timestamp,    -- the row count survive
trunc(dbms_random.value(1,1000))   product_cd,      -- from the original script
trunc(dbms_random.value(1,100))    tx_qty,
round(dbms_random.value(0,100),2)  tx_cst
from
dual
connect by
1=1 and
level <= 1000000
/

insert /*+ append */ into master
select * from master_test_data
/
commit;

alter system flush buffer_cache
/

-- manual refresh of master_mv1: direct path insert, exchange, consider fresh
insert /*+ append */ into master_mv1_exch
select
location_cd,
trunc(tx_timestamp,'MM') tx_month,
sum(tx_qty) s_tx_qty,
sum(tx_cst) s_tx_cst,
count(*) c_star
from
master
group by
location_cd,
trunc(tx_timestamp,'MM')
/
commit;

alter table master_mv1
exchange partition y2005m01 with table master_mv1_exch
without validation
/

alter materialized view master_mv1 consider fresh
/

alter system flush buffer_cache
/

-- conventional fast refresh of master_mv2
begin
  dbms_mview.refresh(list   => 'MASTER_MV2',
                     method => 'F');
end;
/

From the wall-clock timings the manual refresh is superior, its three stages of insert, partition exchange and "consider fresh" completing in 5.08 seconds (4.75 + 0.15 + 0.18), or 66% of the merge-based MV refresh time of 7.65 seconds.

Tkprof extracted the following details from the trace file:


Insert /*+ append */ Into master_mv1_exch
Select
location_cd,
trunc(tx_timestamp,'MM') tx_month,
sum(tx_qty) s_tx_qty,
sum(tx_cst) s_tx_cst,
count(*) c_star
From
master
Group By
location_cd,
trunc(tx_timestamp,'MM')

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.03 0.05 2 344 3 0
Execute 1 2.89 3.78 1927 3404 110 9999
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 2.92 3.83 1929 3748 113 9999

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 40

Rows Row Source Operation
------- ---------------------------------------------------
      1  LOAD AS SELECT  (cr=3404 pr=1927 pw=27 time=3785583 us)
   9999   SORT GROUP BY (cr=3365 pr=1914 pw=0 time=3652049 us)
1000000    TABLE ACCESS FULL MASTER (cr=3365 pr=1914 pw=0 time=4008780 us)

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
db file sequential read 11 0.01 0.05
control file sequential read 4 0.00 0.00
db file scattered read 70 0.18 0.73
direct path write 5 0.00 0.00
SQL*Net message to client 1 0.00 0.00
SQL*Net message from client 1 0.00 0.00


MERGE INTO "MASTER_MV2" "SNA$" USING ( SELECT ... NVL(SUM(1*
("DLT$0"."TX_CST")), 0) "D1", SUM(1* DECODE(("DLT$0"."TX_CST"), NULL, 0, 1))
"H1", NVL(SUM(1* ("DLT$0"."TX_QTY")), 0) "D2", SUM(1*
DECODE(("DLT$0"."TX_QTY"), NULL, 0, 1)) "H2" FROM ... ) "AV$" ON ( ... )
WHEN MATCHED THEN UPDATE SET ... "SNA$"."S_TX_CST" = DECODE("SNA$"."S_TX_CST",
NULL, DECODE("AV$"."H1", 0, NULL, "AV$"."D1"), ("SNA$"."S_TX_CST" + "AV$"."D1")),
"SNA$"."S_TX_QTY" = DECODE("SNA$"."S_TX_QTY", NULL, DECODE("AV$"."H2", 0, NULL,
"AV$"."D2"), ("SNA$"."S_TX_QTY" + "AV$"."D2")) WHEN NOT MATCHED THEN
INSERT ( ... ) VALUES ( "AV$"."GB0", "AV$"."GB1", "AV$"."D0", DECODE
("AV$"."H1", 0, NULL, "AV$"."D1"), DECODE ("AV$"."H2", 0, NULL, "AV$"."D2"))

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.01 0 0 4 0
Execute 1 6.04 6.66 1922 3760 10640 9999
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 6.06 6.68 1922 3760 10644 9999

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 40 (recursive depth: 1)

Rows Row Source Operation
------- ---------------------------------------------------
      2  MERGE  (cr=3811 pr=1925 pw=0 time=6662264 us)
   9999   VIEW  (cr=3682 pr=1914 pw=0 time=6556192 us)
   9999    HASH JOIN RIGHT OUTER (cr=3682 pr=1914 pw=0 time=6516189 us)
      0     PARTITION LIST SINGLE PARTITION: 1 1 (cr=3 pr=0 pw=0 time=92 us)
      0      MAT_VIEW ACCESS FULL MASTER_MV2 PARTITION: 1 1 (cr=3 pr=0 pw=0 time=73 us)
   9999     VIEW  (cr=3679 pr=1914 pw=0 time=6415840 us)
   9999      SORT GROUP BY (cr=3679 pr=1914 pw=0 time=6365838 us)
1000000       VIEW  (cr=3679 pr=1914 pw=0 time=14006210 us)
1000000        NESTED LOOPS  (cr=3679 pr=1914 pw=0 time=8006206 us)
     35         VIEW  (cr=76 pr=0 pw=0 time=4235 us)
     35          FILTER  (cr=76 pr=0 pw=0 time=4055 us)
     35           NESTED LOOPS  (cr=76 pr=0 pw=0 time=3802 us)
     35            NESTED LOOPS  (cr=4 pr=0 pw=0 time=755 us)
      1             TABLE ACCESS BY INDEX ROWID OBJ$ (cr=3 pr=0 pw=0 time=75 us)
      1              INDEX UNIQUE SCAN I_OBJ1 (cr=2 pr=0 pw=0 time=32 us)(object id 36)
     35             INDEX RANGE SCAN I_SUMDELTA$ (cr=1 pr=0 pw=0 time=472 us)(object id 161)
     35            TABLE ACCESS CLUSTER USER$ (cr=72 pr=0 pw=0 time=2556 us)
     35             INDEX UNIQUE SCAN I_USER# (cr=2 pr=0 pw=0 time=798 us)(object id 11)
      0           NESTED LOOPS  (cr=0 pr=0 pw=0 time=0 us)
      0            INDEX RANGE SCAN I_OBJAUTH1 (cr=0 pr=0 pw=0 time=0 us)(object id 107)
      0            FIXED TABLE FULL X$KZSRO (cr=0 pr=0 pw=0 time=0 us)
      0           FIXED TABLE FULL X$KZSPR (cr=0 pr=0 pw=0 time=0 us)
1000000         TABLE ACCESS FULL MASTER (cr=3603 pr=1914 pw=0 time=3222833 us)

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
db file sequential read 10 0.00 0.03
db file scattered read 70 0.02 0.48
log buffer space 13 0.00 0.03

The difference in elapsed time between the two methods is almost entirely accounted for in CPU time, probably due to the joins required by the merge statement. You might also notice that I took the opportunity to make the tables NOLOGGING and COMPRESS.

The methodology presented here for the manual refresh of the materialized view through a partition exchange is not the only one available, of course. Another choice might be to define the materialized view upon a prebuilt table, which allows the MV to be temporarily dropped so as to allow DML on the underlying table. The recreation of the MV definition is then a trivial issue in terms of database work required, and the only effect would be the temporary loss of the ability to rewrite queries against that MV. It would have the further advantage that indexes (or partitions thereof) on the materialized view table could be temporarily rendered unusable during the insert process so as to rebuild them in a nologging mode, without users being inconvenienced by "unusable index" errors.
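A sketch of the prebuilt table variation might run as follows, bearing in mind that the MV must share its name with the prebuilt table (the name here is illustrative):

```sql
-- The table exists independently of the MV definition.
create table master_mv1_pb
compress nologging
as
select   location_cd,
         trunc(tx_timestamp,'MM') tx_month,
         sum(tx_qty) s_tx_qty,
         sum(tx_cst) s_tx_cst,
         count(*)    c_star
from     master
group by location_cd,
         trunc(tx_timestamp,'MM')
/

create materialized view master_mv1_pb
on prebuilt table
never refresh
enable query rewrite
as
select   location_cd,
         trunc(tx_timestamp,'MM') tx_month,
         sum(tx_qty) s_tx_qty,
         sum(tx_cst) s_tx_cst,
         count(*)    c_star
from     master
group by location_cd,
         trunc(tx_timestamp,'MM')
/

-- At refresh time: drop the MV definition (the prebuilt table survives),
-- perform direct path DML and index maintenance on the table, then
-- re-issue the CREATE MATERIALIZED VIEW statement above.
drop materialized view master_mv1_pb
/
```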

It would also be possible to split the manual refresh into more stages to reduce the period for which query rewrite is unavailable:
  1. Aggregate master table to temporary table
  2. Drop materialized view
  3. Insert into materialized view table from temp table (no aggregation required)
  4. Recreate materialized view
However this would sacrifice many of the performance benefits of the single stage manual insert.


If there is an alternative mechanism for avoiding the MERGE then I'd be very glad to hear it, but this method may stand on its own if the alternatives do not allow direct path operations on the MV table or the sorting of the table data.

There are also some clear benefits to this mechanism when you consider the refresh of multiple materialized views, but they will wait until after thanksgiving (if I escape both the long term and immediate threats to my health of deep-frying a turkey tomorrow).

Last Minute Addendum

It occurred to me just after posting this that another benefit of using a manual refresh is that you can avoid the overhead of calculating the extra columns required by the fast refresh mechanism, in this case the COUNT(*). I'll do a test on the benefits of that also.

Monday, November 14, 2005

Choosing Partitioning Keys: For ETL or Reporting?

This is one of the recursive thought processes that has been spun off from considering ways of optimizing the fast refresh of multiple materialized views: I must now be at a recursive depth well into double figures because I cannot recall how I got here at all. Anyway ...

I was just browsing the 10gR2 documentation, and noticed the following comments here:

"The partitioning scheme of the data warehouse is often crucial in determining the efficiency of refresh operations in the data warehouse load process. In fact, the load process is often the primary consideration in choosing the partitioning scheme of data warehouse tables and indexes.

The partitioning scheme of the largest data warehouse tables (for example, the fact table in a star schema) should be based upon the loading paradigm of the data warehouse."

I have no argument with "The partitioning scheme ... is often crucial in determining the efficiency of refresh operations ...", nor with "... the load process is often the primary consideration in choosing the partitioning scheme ...": the former statement is self-evidently true and I'm sure that the latter is also true. However I do have a problem with, "The partitioning scheme ... should be based upon the loading paradigm ...". This strikes me as being not a very well thought-out piece of advice that depends more on an appeal to historical precedent or coincidence than it does to a rational design process. And it neglects the critical point that the partitioning scheme is always crucial in determining the efficiency of reporting operations.

I recently wrote something that touched on this subject in a roundabout way by mentioning that I was migrating a fact table from a partitioning scheme that aligned with the ETL process to a new scheme that aligns with the reporting requirements. Throughout the migration I keep thinking of events Quite Some Time Ago in which the administrators of an OLTP database objected to the addition of new indexes on a couple of columns. Their rationale for this was superficially sound: that it would have slowed down both the insert of new rows and the subsequent update of those columns. While they were not discounting the benefits of the indexes for select operations, they were treating the pros and the cons as if they were on a level footing. In fact at the time that a row was inserted the pain was already half over - a few days after the rows were inserted the newly indexed column's value would change once, and from there on there were only benefits to be gained from the new index. Substantial benefits they were too: two orders of magnitude improvement in the performance of some key reporting queries.

Now that isn't the be all and end all of the considerations. To be completely fair we should also consider the sensitivity of the system (or the business) to the effects of the changes at the time that they are experienced. Taking the example above, if the loading of new data or the subsequent change to the indexed value was a critical process on which there were particular time restrictions, while the select statements that benefited were part of a batch operation with no current performance problems (in terms of whether the business was impacted by it running for longer than it could have with the index in place) then the technical benefits of the additional indexes might have been outweighed by operational issues.

So to appreciate the impact of a design feature on a system you not only have to consider whether those features represent potential benefits and detriments, but you have to weight them by a couple of other factors:

  1. How often the benefit or detriment occurs
  2. How sensitive the system's operations are to the benefits and detriments occurring when they do.

So back to the matter of the partitioning.

Hopefully the parallels between the situations are pretty clear. We have design choices (to index or not to index ... to partition according to the convenience of either the ETL procees or the reporting queries) that impose costs and benefits on different operational stages in the data handling.

Now let us look at the costs and benefits of taking different decisions in the partitioning, and we will start here by stating the obvious: that there is only a difference between the two partitioning schemes when the ETL and the reporting processes use different logical bases for identifying record sets of interest. If your data arrives in sets that correspond to the reporting requirements then this is not an issue.

I'll not rehash the situation that caused the migration from an ETL-based partitioning scheme to a report-based partitioning scheme, but in brief the users are interested in reports based on the date of a financial transaction, and financial transactions often arrive very "late". For example, transactions for March 2005 could be transmitted in September 2005.

Let us suppose then that we follow the advice given in the documentation. We partition our fact table by some natural or "synthetic" date (synthetic in the sense that it does not represent a business object) that represents the arrival period of the data, say "DAY_OF_FILE_ARRIVAL". This is a great convenience to the ETL programmer, who can transform the new data set into a table having the same structure as the fact table (without partitioning, or partitioned if the fact table is composite partitioned), create or rebuild indexes on it, analyze it, handle new or changed dimensional values, then perform a partition exchange with the fact table to make the new data available to the user community.

However the users are then querying by "DATE_OF_TRANSACTION". For example they specify a range-based predicate of DATE_OF_TRANSACTION between '01-jan-2005' and '31-Jan-2005'. As this represents a small fraction of the data in the fact table, and as the rows are fairly well clustered, an index-based access method is used to retrieve the rows. In fact until the number of fact table blocks to be touched starts to exceed some threshold like 10% of the total, an index-based access method can be expected, with the undesirably high number of logical reads and small physical read size that you'd expect.

So in this scenario partition pruning for the reporting queries has been lost and with it has gone arguably the single most effective means of enhancing query performance against a large fact table, short of slapping materialized views all over it or using parallel query (which I take rather for granted in a data warehousing environment).

If we adopted the alternative partitioning scheme based on DATE_OF_TRANSACTION then partition pruning sanity would return to user reports, but at what cost to the ETL process?

First, let's consider what alternatives there might be to a simple partition exchange-based loading method. In practice there is just one -- the direct path insert, either serial or parallel. While a full quantitative comparison of partition exchange and direct path inserts is not going to be presented right here and now (although maybe I'll add it to my list), I feel that direct path inserts are not actually such a bad method for fact table loading. I'd consider the major disadvantage to be that they generate more redo, because you are modifying enabled indexes for which the nologging attribute is not respected. In terms of manageability the pros and cons are mixed. On the one hand, you no longer need to worry about identifying which partition the data has to be exchanged with (if you ever did worry about that).

As far as statistics are concerned you either need to allow monitoring and DBMS_Stats to handle the identification and collection of modified table statistics, or you need to start worrying about partitions again and implement your own mechanism for refreshing partition-level statistics. I have a feeling that while the latter approach may be more trouble by several orders of magnitude, it might still be preferable to the monitoring option. As the documentation here states, "If a monitored table has been modified more than 10%, then these statistics are considered stale and gathered again." While I am no fanatic about making sure that statistics are right up to date, you would certainly want to think about whether this threshold is low enough, particularly in regard to histograms - it doesn't take a 10% change in the number of rows for a histogram to be significantly altered, particularly if business conditions and data distributions change with time.
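For the do-it-yourself partition-level approach, the sketch is a call along these lines after each load (owner, table, and partition names illustrative):

```sql
begin
  dbms_stats.gather_table_stats(
    ownname     => user,
    tabname     => 'MASTER',
    partname    => 'Y2005M01',     -- the partition(s) just loaded
    granularity => 'PARTITION',    -- partition-level statistics only
    cascade     => true);          -- include the local index partitions
end;
/
```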

I can't believe that I just set myself up for another blog topic. * sigh * Another recursive layer -- I need a vacation.

Another consideration is how well the batches of new records correlate with the partitioning scheme. Maybe your batch of new records is scattered all over a bunch of partitions, or maybe nearly all of the data is going to a single partition with a few rows scattered over a handful of others. In fact I prefer to think about this the other way around -- instead of thinking about this from the viewpoint of how the new data is going to be split, I'd rather consider what is going to happen to the fact table partitions. Here are a few different scenarios:

  • One empty partition to be populated, one other partition to gain around 1,000 rows (0.01% of current total)
  • One empty partition to be populated, six others to gain between one and three million rows (1-3% of total), five others to gain between 10 and 1000 (0.001-0.1% of current total)
  • All partitions to gain around one million rows (1% of total)
How you treat these different scenarios, and how comfortable you are with them, depends on where the breakeven point lies for you in your tolerance of ETL complexity and performance, and probably therefore on your ETL tool also.

When a partition starts out with data we have a couple of options.
  1. Copy the data into a new/empty table in nologging mode, insert the new data, rebuild indexes, then partition exchange with the original.
  2. Direct path load into the original partition, again with the option of disabling indexes where possible.
Which one of these approaches works best for you depends on a few factors. The former approach is certainly more challenging to program, while the latter will incur higher overhead on logging (although the overall i/o subsystem load may well be smaller).
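The first option might be sketched as follows, again with illustrative table and partition names:

```sql
-- 1. Copy the partition's current contents without logging.
create table fact_p_work
nologging
as
select * from fact_table partition (p200501)
/

-- 2. Add the new batch in direct path mode.
insert /*+ append */ into fact_p_work
select * from new_batch_rows
/
commit;

-- 3. Build indexes on fact_p_work to match the fact table's local
--    indexes, nologging, before the swap.

-- 4. Swap the work table into place, skipping validation since the
--    rows are known to belong to the partition.
alter table fact_table
exchange partition p200501 with table fact_p_work
including indexes
without validation
/
```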

So there you have it. I don't think that there is a clear cut case for partitioning according to ETL requirements, and the benefits of partitioning by reporting requirements are certainly enough to compel a close examination of the alternatives to loading exclusively by partition exchange. Does this seem like an issue worthy of being raised as a documentation bug? I'm not sure, but if people think it important then I'd be willing to raise one.

Saturday, November 12, 2005

Oracle Patent on Load-Dependent Query Optimizer Costing

As if we don't have enough factors to consider when scratching our heads over execution plans, we'll soon have to consider server load as a variable if Oracle implements the technology described in Patent 6,957,211.

When will the madness stop? "When", I ask you?

Thursday, November 10, 2005

Choosing an Ordering Method for Best Table Compression

Some comments here: http://www.phpbbserver.com/phpbb/viewtopic.php?t=614&mforum=dizwellforum

Friday, November 04, 2005

Oracle XE Registration

Has anyone successfully registered this yet? I've submitted my registration three times so far without getting an acknowledgement. E-Mail details are correct and everything.

Anyone got any clues on this?