Using Histograms on Evenly Distributed Values
A quick demo ... here's a script ...
drop table demo_hist
/
create table demo_hist
as
select floor(rownum/5) rn,do.* from dba_objects do
/
begin
dbms_stats.gather_table_stats
(
ownname => user,
tabname => 'DEMO_HIST',
estimate_percent => 100,
method_opt => 'for columns OWNER size 254'
);
end;
/
select distinct column_name from user_tab_histograms
where table_name = 'DEMO_HIST'
/
Select min(rn),
max(rn),
count(distinct rn),
count(*)
from demo_hist
/
set autotrace on
select count(*) from demo_hist
where rn = 5
/
select count(*) from demo_hist
where rn between 1 and 100
/
begin
dbms_stats.gather_table_stats
(
ownname => user,
tabname => 'DEMO_HIST',
estimate_percent => 100,
method_opt => 'for columns RN size 254'
);
end;
/
select count(*) from demo_hist
where rn = 5
/
select count(*) from demo_hist
where rn between 1 and 100
/
When I run this I get a demonstration that with a histogram on the RN column the optimizer gets a much better cardinality estimate for the two Count(*)'s on RN = 5 and RN between 1 and 100.
But I'm not going to show you by how much because you need to go and run this script yourself ;)
Whether the results are significant in terms of getting a good execution plan, well this test doesn't demonstrate that. But I believe that it does. More of that damned instinct.
OK, if someone begs me then I'll post my results. But it'll have to be a real grovelling.
1 Comments:
Hi David, I'm fairly new to your site, and like it.
I just read your example, and am curious as to why you set the estimate percent to 100 rather than doing a compute, by setting it to null (on a side note, wasn't the default null in 9i instead of the param?).
Also, what is the point of doing the histogram on the owner column? Just to show that it works?
I did see how the histogram, even on an evenly distributed column, caused the cardinality in the plans to more accurately reflect the actual data.
I've been suggesting to people to always gather histograms, because the more info for the CBO the better, and now I have a cool example to point them to. Thanks!
(also, for those who tell me to only get histos on columns with skewed data distributions, how is one to know in most cases the distribution without actually calculating it? If you are going to go to the trouble of calculating it yourself, might as well let oracle do it for you and gather the histo).
Post a Comment
<< Home