
Oracle Sponge -- Now Moved To Wordpress

Please use http://oraclesponge.wordpress.com

Friday, July 29, 2005

A Lesson Learned?

A sad story catches my eye over at the forums of Joel on Software. Sounds like a lesson learned the hard way.

Wednesday, July 27, 2005

Eyeballs: Fried but not Battered

It's a little over a week since my eyes went on their excellent adventure, and I can finally stop squirting them with steroids and antibiotics.

All seems to be well ... Leftie is at least as good without correction as it used to be with correction, although Rightie is lagging behind a bit. Maybe a later followup will be needed there, but if it gets me more of that Valium and a nice lie down in the afternoon then I'm ready for it right now.

No great lifestyle changes yet, mind. The sun here is so bright that a pair of cool shades are required all the time anyway. My field of vision seems a lot better, particularly when out on the iron horsey where keen peripheral vision is what keeps me clear of these Colorado drivers.

So, thumbs-up so far.

Monday, July 25, 2005

Time Slicing of Disk I/O

It is a well-known phenomenon that when a single process makes multiple multiblock read requests for physically contiguous data from a disk, the performance far exceeds that available when multiple requests are made concurrently for different data. Right? I leverage this on my FSK development machine, where a serial full table scan can be arranged to exceed 40MB/sec on a single fast (10,000rpm SATA) disk. Incidentally, it is not only fast but also very quiet, thanks to the lack of excessive and inefficient head movement.

Now I don't propose to write a long essay on the mechanical details of why this is so, and I think that it suffices to say that time spent moving the drive heads from one part of the disk to another is time wasted.

So consider how this affects a data warehouse, where large scans of contiguous sets of data are two-a-penny. When you have multiple queries accessing the same disks, some of the layers between the database and the disks may perform some optimisation of the access pattern by rescheduling disparate requests for contiguous data into a single request, but they consider only very small time windows when doing so. There are presumably two objectives here -- to optimize and speed up disk access, but also to avoid delaying any single request by more than some small amount of time (measured in milliseconds or tens of milliseconds, perhaps).

Let us suppose that we have a request for a 100MB set of data to be read from disk, in chunks of 512KB. If the data were contiguous on the disk(s) then it could be read with no latency due to unnecessary head movement. Just grabbing some raw numbers from http://www.storagereview.com/articles/200411/20041116ST3146754LW_2.html this could be accomplished in around 1.2 seconds (at a transfer rate averaged by eyeball).

However if we introduce another similar query, concurrent with this one, of the same size but for different data, then the heads start flitting around like flies in a jam jar -- the transfer of the 200MB now required takes far in excess of 2 x 1.2 seconds because there are now something like 400 head movements to include. At, say, 3.7msec per head movement we've added on around 1.5 seconds ... so each query now has to wait 3.9 seconds for the complete return of its result set. How much better would it be if one query's disk access were deferred until the other's was completed, so that the preferred query's data was returned in 1.2 seconds and the delayed query's data in 2.4 seconds? That average time of 1.8 seconds seems to me to be far preferable to the average of 3.9 seconds, and even the disadvantaged query benefits.
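
Just to show my working, here is the back-of-envelope arithmetic as a quick SQL*Plus sanity check (the ~83MB/sec transfer rate is my eyeballed figure from the StorageReview numbers above, and 3.7msec is the per-head-movement cost assumed there):

-- 200MB transferred, 512KB chunks, ~83MB/sec, 3.7msec per head movement
select round(200/83, 1)                      transfer_secs,
       round((200/0.5) * 0.0037, 1)          seek_secs,
       round(200/83 + (200/0.5)*0.0037, 1)   concurrent_wait_secs,
       round(((100/83) + (200/83)) / 2, 1)   serialized_avg_secs
from dual
/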

Well, the Aldridge Theory Of The Evils Of Concurrent I/O Access is only partly thought out so far, and there is probably some much more sophisticated processing required:
  • temporarily deferring access for a 10GB read to allow a couple of 100MB reads to get their feet in the door
  • completing the scanning of indexes before starting any table access, and possibly storing indexes on different devices
  • a heavy reliance upon the physical clustering of rows likely to be requested for the same query
... but I wonder, is there any really fatal flaw in such a scheme?

Sunday, July 24, 2005

Higher Dimensional Values In Fact Tables

update: 24-Jul-2005

Jonathan Lewis writes to educate me on function-based indexes and statistics thereon ... it is indeed possible to gather histogram data on them through the mechanism of the DBMS_Stats method_opt => 'for all hidden columns' -- who'd a thunk it? It's documented here. I hope it doesn't sound like sour grapes if I wish that this were also documented here.
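
For reference, the mechanism that Jonathan describes would look something like this against the demonstration table used later in this article, once the function-based index exists (a sketch only -- I haven't re-run the test with it yet):

begin
dbms_stats.gather_table_stats
(
OWNNAME => user,
TABNAME => 'DAVE_FCT',
METHOD_OPT => 'FOR ALL HIDDEN COLUMNS SIZE 254'
);
end;
/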

In this article I dismissed the use of function-based indexes (for the purposes outlined) based on the inability of Oracle to generate histograms on them -- whether this new-to-me information changes that judgement will take some further thought and analysis. I may move it to another blog entry once I've reassessed the issue, as an analysis of the costs and benefits of function-based indexes vs. "materializing" the function result in the table rather appeals to me.

Thanks again, Jonathan.


introduction


In getting performance out of an Oracle database, and in data warehousing (DWh) in particular, we live or die by making effective use of the cost-based optimizer (CBO). As I wrote before, "The CBO is based on statistics -- statistics about tables, statistics about indexes, statistics about the distribution of values within a column". So we owe it to ourselves to go the extra mile in our search for statistical accuracy, and sometimes we might have to bend a few rules to do so.

This thought was prompted by the rebuilding of a major fact table, primarily in order to change its partitioning scheme, but the opportunity for some performance tweaking is not to be sniffed at. Accordingly, I have a plan ... a cunning plan if I may say so. A plan that may make database purists squirt coffee from their noses, and I consider that to be fair warning so let's press on.

the scenario

The fact table in question has a number of dimensional values for which the parent values are deterministically based on the lowest value itself. For example, DATE_OF_DAY has parent values of DATE_OF_MONTH, FISCAL_MONTH_CD, FISCAL_QUARTER_CD, FISCAL_YEAR_CD. Given a particular value of DATE_OF_DAY the calculation of the other four values is a simple matter, for example:

FISCAL_MONTH_CD =
To_Char(Add_Months(DATE_OF_DAY, 3), 'YYYY" M"MM')


So for 04-Dec-2004 the fiscal month is "2005 M03"

(Did you know that you can insert arbitrary strings of characters in a date-to-char conversion using double-quotes? A lot of people are surprised by that. Surprisingly.)
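
A quick check from SQL*Plus:

select to_char(add_months(to_date('04-DEC-2004','DD-MON-YYYY'), 3),
               'YYYY" M"MM') fiscal_month_cd
from dual
/

-- returns: 2005 M03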

These parent values are stored in a dimension table, possibly snowflaked where required by the presence of summary tables at the DATE_OF_MONTH level, for example.

Now here's the problem that I'm trying to solve ... when a report query is submitted to the database and includes a predicate such as FISCAL_MONTH_CD = '2005 M04', and the query is directed to the fact table at the DATE_OF_DAY level, Oracle can only take the most generic estimation of the cardinality of the result set -- histograms on FISCAL_MONTH_CD in the dimension table and on DATE_OF_DAY in the fact table are not effective in this situation.
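
To make that concrete, the kind of report query I have in mind has this shape (the fact and dimension names here are invented purely for illustration -- the predicate pattern is the real one):

select sum(f.txn_amount)
from day_level_fct f,
     day_dim d
where f.date_of_day = d.date_of_day
and d.fiscal_month_cd = '2005 M04'
/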

a demonstration script

set feedback off heading off

drop table dave_fct;

create table dave_fct
pctfree 0 nologging
as
select * from dba_objects;

begin
dbms_stats.gather_table_stats
(
OWNNAME => user,
TABNAME => 'DAVE_FCT',
METHOD_OPT => 'FOR COLUMNS OWNER SIZE 254'
);
end;
/

drop table dave_dim
/
create table dave_dim -- lovely name!
(owner primary key,
owner_2char not null)
as
select distinct
owner,
substr(owner,1,2)
from dave_fct
/

begin
dbms_stats.gather_table_stats
(
OWNNAME => user,
TABNAME => 'DAVE_DIM',
METHOD_OPT => 'FOR COLUMNS OWNER SIZE 254,
OWNER_2CHAR SIZE 254'
);
end;
/

set autotrace on

-- QUERY 1
select count(*)
from dave_fct
where owner = 'SYS'
/

-- QUERY 2
select count(*)
from dave_fct df,
dave_dim dd
where df.owner = dd.owner
and dd.owner = 'SYS'
/

-- QUERY 3
select count(*)
from dave_fct df,
dave_dim dd
where df.owner = dd.owner
and dd.owner_2char = 'SY'
/

results, and their implications

My results show that the statistics for queries 1 and 2 are pretty much spot on ... the actual number of rows and the estimated cardinality were within about 3% -- for query 3 the optimizer overestimated the cardinality by 324%. So what does this mean? Potentially it could prompt the optimizer to perform a full table scan instead of an index scan (standard disclaimer: not that I'm saying that full scans are necessarily a bad thing).

some possible solutions

So how do you get the optimizer better statistics?

One method would be to use a query such as:
select count(*)
from dave_fct df,
dave_dim dd
where df.owner = dd.owner
and dd.owner like 'SY%'
/

... which gives a very acceptable estimation in this case.

But this is a special case with its own advantages and disadvantages:
  • The "like" predicate is compatible with the use of histograms
  • This is a difficult technique to use for an In List query
So maybe ...
select count(*)
from dave_fct df,
dave_dim dd
where df.owner = dd.owner
and substr(dd.owner,1,2) = 'SY'
/

Nope ... the estimated cardinality is still over 300% too high.

How about creating a function-based index on the table?
create bitmap index dave_fct_ind on dave_fct (substr(owner,1,2))
compute statistics
/

Fast access on the query:
select count(owner)
from dave_fct
where substr(owner,1,2) = 'SY'
/

... but again no accurate statistics, because there is no histogram on the index values, only on the table values. (Maybe an enhancement request ... hmmm ...) Note: see the update above identifying this as incorrect information!

a proposed solution

One method that I'm going to try is to push the higher level dimensional values down to the fact table. In our example, by placing the owner_2char attribute in the fact table and linking it to a snowflaked dimension table with owner_2char as the primary key we can use a histogram directly on that value.

So the demonstration script becomes:

drop table dave_fct;

create table dave_fct pctfree 0 compress nologging
as
select substr(do.owner,1,2) OWNER_2CHAR,
do.* from dba_objects do;

begin
dbms_stats.gather_table_stats
(
OWNNAME => user,
TABNAME => 'DAVE_FCT',
METHOD_OPT => 'FOR COLUMNS OWNER SIZE 254,
OWNER_2CHAR SIZE 254'
);
end;
/

drop table dave_dim
/

create table dave_dim
(owner primary key,
owner_2char not null)
as
select distinct
owner,
owner_2char
from dave_fct
/

create table dave_dim_p
(owner_2char primary key)
as
select distinct
owner_2char
from dave_fct
/

begin
dbms_stats.gather_table_stats
(
OWNNAME => user,
TABNAME => 'DAVE_DIM',
METHOD_OPT => 'FOR COLUMNS OWNER SIZE 254,
OWNER_2CHAR SIZE 254'
);
end;
/

begin
dbms_stats.gather_table_stats
(
OWNNAME => user,
TABNAME => 'DAVE_DIM_P',
METHOD_OPT => 'FOR COLUMNS OWNER_2CHAR SIZE 254'
);
end;
/

set autotrace on

select count(*) from dave_fct df, dave_dim_p dd
where df.owner_2char = dd.owner_2char
and dd.owner_2char = 'SY'
/


results of the proposal

Now, maybe because there are fewer distinct values of owner_2char than owner (17 instead of 25), or maybe because of blind luck, the estimation is actually better with predicates based on the former than it is on the latter ... in my case an error of 0.54% instead of 3%.

Another advantage is the ability to miss out the join to the dimension table, of course. In my example case I got a result in 0.12 seconds instead of 0.23 by eliminating the dimension table from the query altogether -- it would bear testing on a larger example of course, as dealing with wall clock times as low as that is not enormously significant.
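
In other words the report query collapses to something like:

select count(*)
from dave_fct df
where df.owner_2char = 'SY'
/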

So, what are the downsides to this approach? Well, as the higher values are deterministic upon the lower values there is little ETL overhead -- no lookups to be performed, for example -- so it doesn't seem to make ETL more complex. It does increase the row length by some amount, although that could be offset with data segment compression.

a common case for improvement

Now I particularly like the idea of this approach for date dimensions. As the dates in a table progress forwards in time there is varying skew on the higher levels of the dimension. For example if you have loaded data from 2001 up to the middle of January 2005 then two queries that filter on "Year = 2004" and "Year = 2005" can expect to retrieve significantly different numbers of rows from the fact table, and without this technique (or without placing the predicate on the level represented in the fact table) the optimizer is going to come up with something very funky -- probably the same incorrect estimation for both of those predicates.

some final words

I am convinced enough by this method to give it a go. The implementation seems simple, the downside seems to be manageable, and the benefits seem to be sufficient.

I have about three dimensions that I'm going to try it on. Date is the obvious one. Another is related to the identifier for the fact table's source system and type of record, which is enormously skewed at higher levels. In fact the records are broken down into regular transactions and journal vouchers, in a ratio of something like 500:1, so a report on journal vouchers is looking for an index-based access method but a report on transactions may do better with table-based access.

Now, if you can see any reason why this is A Really Bad Idea, then please let me know by the end of the week when I potentially start implementing this plan. Words of reassurance also acceptable ;)

Wednesday, July 20, 2005

Informatica Finally Does Direct Path Inserts

Well the title really says it all -- for a long time (well, since Informatica was born, really) you had to use its External Loader interface to work with SQL*Loader in order to get direct path inserts. The "bulk" switch was just bulk-binding.

So I heard from the excellent Informatica forum at Empowered Holdings that from version 7.1.2 direct path Oracle loads were native, and invoked by the "bulk" switch on the target. As it happens I got hold of 7.1.3 this afternoon. I tested it by reading from a copy of DBA_OBJECTS and inserting in both bulk and conventional mode into a table with the COMPRESS attribute set -- "Bulk" gives compression, the other does not, therefore "Bulk" is generating direct path inserts.
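
If you want to repeat the check, comparing segment sizes after the two loads makes the difference obvious -- something like this, with made-up target table names (only direct path inserts produce compressed blocks in a COMPRESS table, so the "Bulk" target should come out markedly smaller):

select segment_name,
       round(bytes/1024/1024, 1) mb
from user_segments
where segment_name in ('TARGET_BULK', 'TARGET_CONV')
/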

I'm so happy! Well, sort of.

Unfortunately we had another use for SQL*Loader, which was to load data in zoned decimal format, and unless I've lost my powers of documentation searching this is still an alien concept to Informatica.

So there's some good news and some bad, but on the whole, this is a step forward.

Tuesday, July 19, 2005

Round-robin Segment Allocation Issues

As part of a discussion on tablespaces Noons mentions a problem that he found when creating partitions in a uniform extent size tablespace -- long story short, the partitions are all created in the first datafile rather than in a round-robin fashion.

However multiple extents within each partition are still allocated to data files in a round-robin fashion. So it seems to be appropriate to advise that where you are relying on the undocumented round-robin extent allocation technique for balancing i/o across multiple data files it is best to either have a great many extents per partition, or to make the number of extents equal to an integer multiple of the number of data files. Also, to avoid future problems with i/o imbalance when adding new data files later on, you might err on the side of caution by creating a larger number of smaller data files initially, so that you have plenty of room to expand their sizes to whatever upper limit you and your o/s are comfortable with.
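
For what it's worth, the sort of layout I have in mind is a locally managed, uniform extent size tablespace spread across several modestly sized data files -- file names and sizes below are purely illustrative:

create tablespace dwh_data
datafile '/u01/oradata/dwh/dwh_data01.dbf' size 2000m,
         '/u02/oradata/dwh/dwh_data02.dbf' size 2000m,
         '/u03/oradata/dwh/dwh_data03.dbf' size 2000m,
         '/u04/oradata/dwh/dwh_data04.dbf' size 2000m
extent management local uniform size 16m
/

With 16MB uniform extents and four files, sizing each partition as a multiple of 64MB keeps the extent count at an integer multiple of the number of data files.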

Thanks for the heads-up on that one Noons.

Partition Change Tracking-based Refresh Without Materialized View Log

I see that this feature made it into 10gR2 ... it bothered me for quite a while that in order to get partition change tracking you had to have a materialized view log, and a couple of years ago Oracle Support couldn't explain to me why it was required.

Well, it's required no longer. Hopefully someone will now be able to tell me how changes to partitions are actually tracked -- it used to be a mystery, apparently.

Monday, July 18, 2005

Eyeballs: "Frying tonight!"

So, tomorrow I go under the blade to get the old eyes sorted out. After too many years of wearing glasses I've had enough of them, and I suppose I might as well do it while I'm in a place with some scenery to appreciate, and to give me a few years before the reading glasses make their inevitable appearance.

My wife had PRK at the Air Force Academy a couple of months ago, and ended up being off work for nearly two weeks, so I'm not going down that route -- it's bladeless "wavefront" Lasik for me. I had a pre-op exam today, immediately followed by a drive across town to take Number Two son for his first swimming lesson. I was feeling bad about my searing eyeball pain and nauseous disorientation until I saw what the swimming instructor of seven three-to-four-year-olds has to go through.

Sunday, July 17, 2005

Managing for Creativity

From Slashdot I picked up a reference to an article titled Managing for Creativity in the Harvard Business Review Online. It's a refreshing read, and it boggles my mind that there are so many companies out there that just don't "get it".

Friday, July 15, 2005

A Dash Of Theory

Fabian Pascal's latest column in Database Magazine titled "What a Database And a DBMS Are" provides a wholesome broth of theory as an antidote to sloppy practice. We all need a good slapping sometimes, possibly us in the data warehousing world more than most ;) so I'll also just reemphasise my liking for his Database Debunkings site.

I wonder what FP would make of the Wikipedia entries on Relational Database, the Relational Model, and Database Management Systems? I think I'll email him to ask ... wish me luck boys, I'm going in!

Wednesday, July 13, 2005

Designs That Balance ETL and Report Performance

It seems to me that there is sometimes a very difficult balance to strike, between optimizing the structure of a database for fast and convenient ETL loading and fast and convenient report querying.

Normally the choice of how to partition a fact table is pretty straightforward, because the data being loaded arrives in an order compatible with querying -- ie. all of Tuesday's transactions are available to the data warehouse ETL process early on Wednesday morning, and during the day the users will be running reports with such filters as "Give me Tuesday's data". Thus you can partition on a daily basis and accommodate both a daily ETL process of loading via partition exchange, and a daily reporting process for which the optimizer can invoke partition pruning.
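
The exchange itself is the familiar one-liner -- fact table, partition and load table names here are hypothetical:

alter table sales_fct
exchange partition p_20050726
with table sales_load_20050726
including indexes
without validation
/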

So far so good -- if the unit in which the data becomes available is also a unit by which the end users want to report, then the decision on partitioning is simple.

Now consider a case where all the data relating to a particular time period is not available immediately. An example here might help. Suppose you run an auto parts store, and as part of your business you sell replacements for broken-but-repairable items, such as a starter motor. When Joe Lunchpail comes into the store on Saturday afternoon to buy a new starter motor you tell him, "That part costs $100, but if you can bring in the old broken part then we'll give you a $20 credit for it" -- ie. there is a core charge of $20 on starter motors. Being a trusting organization you give him that $20 credit on the spot, as long as he uses a credit card for the purchase -- if he doesn't bring in the part within a week then you'll charge him that extra $20.

By the following Saturday evening Joe hasn't come back, so you charge him that $20. On Sunday morning he comes in, hands over the part, and the system refunds him that $20. Three transactions in all, each one on a different date, but all of them relating to a customer interaction that occurred on the first Saturday on which the starter motor was bought, and all of them linked back to it. In fact on the majority of the financial reports users are interested in "Original Transaction Date" (the first Saturday), not "Actual Transaction Date" (the three different dates on which the transactions really took place).

So imagine this being loaded into the company's data warehouse. For query reporting purposes it would be desirable to partition by "Original Transaction Date", but for ETL loading purposes it would be desirable to partition by "Actual Transaction Date". Which is the more appropriate method to choose?

Well, here's some pro's and con's.

Partition by Original Transaction Date
  • User queries run fast due to partition pruning, and partition-level statistics give the optimizer better histogram information*.
  • The ETL process cannot use a simple exchange of a populated table with an empty partition; a more complex method is required.
Partition by Actual Transaction Date
  • ETL process uses simple partition exchange of populated table with empty partition.
  • User query performance compromised by lack of partition pruning and use of global statistics.
* Just a quick note on this: for a partitioned table Oracle can store multiple sets of statistics -- one set at the global (ie. all data in table) level, then one set for each partition, and one set for each subpartition. If a query accesses multiple partitions then global statistics are used by the optimizer in lieu of partition-level statistics, and this can be a real problem where histograms vary between different partitions. For example the partitions holding retail data for the period immediately prior to Father's Day will probably have very different histograms from those immediately prior to Mother's Day, and will also show a different variation in the number of transactions per day. (I'm basing this on my own ability to leave shopping to the last minute in comparison to my wife's.)
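
As an aside, gathering both global and partition-level statistics in one pass looks something like this (hypothetical table name; GRANULARITY => 'ALL' also covers subpartitions where they exist):

begin
dbms_stats.gather_table_stats
(
OWNNAME => user,
TABNAME => 'SALES_FCT',
GRANULARITY => 'ALL',
METHOD_OPT => 'FOR ALL COLUMNS SIZE 254'
);
end;
/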

Moving along ...

There is a very clear implication in all this -- if your partitioning scheme is not compatible with the report queries then you are probably wasting your time collecting partition-level statistics (unless you then use those to build global statistics, such as "Number of Rows"). And that probably means that you are seriously compromising the ability of the optimizer to define efficient execution plans.

You see where I'm leading with this ... the most important element of the system is report performance, and the partitioning scheme ought to be complementary to that rather than to the ETL process. Hey, nobody said that ETL is easy -- conventionally it is supposed to represent 70-80% of the effort of a data warehousing project, I believe, which sounds about right.

So, what are the alternatives to partition exchanging as a load mechanism for fact tables? Bearing in mind that the fact table may be using compression, and is almost certainly using bitmap indexes, the most feasible approach seems to be a direct path insert of data into the fact table itself. Now if your business model guaranteed that you would never receive records that are more than a few days old, then it might be feasible to either perform multiple partition exchanges to refresh old partitions, or to use a hybrid solution in which a partition exchange is performed for the newest day of data (presumably the bulk of the new feed) and direct path inserts for the remaining few older records.
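
By "direct path insert of data into the fact table itself" I mean the usual APPEND-hinted insert from a staging table -- a sketch, with hypothetical table names:

insert /*+ append */ into sales_fct
select *
from sales_stage
/
commit
/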

However I have a feeling that being liberated from partition exchanges may sometimes be beneficial. What it implies to me is that you are then much more able to define a more complex partitioning scheme to further complement the report queries, because the ETL process is not made any more complex by increased complexity of partitioning. For example, where previously you might have used a partitioning scheme based on Range("Actual Transaction Date") for ETL compatibility, you can now potentially improve report performance by using a composite range-list partitioning scheme such as Range("Actual Transaction Date","Retail or Wholesale")-List("Region Code"). As long as the internal workings of Oracle don't hiccup on some rarely used partitioning scheme (ie. ORA-00600, admittedly a big factor) the details are immaterial to the ETL process.
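
A rough sketch of the sort of composite scheme I mean (column, region and partition names are invented, and the retail/wholesale flag could also be folded into the range key as described above):

create table sales_fct
(actual_txn_date date not null,
original_txn_date date not null,
region_cd varchar2(3) not null,
txn_amount number)
partition by range (actual_txn_date)
subpartition by list (region_cd)
subpartition template
(subpartition sp_east values ('E'),
subpartition sp_west values ('W'),
subpartition sp_other values (default))
(partition p_2005_07 values less than (to_date('01-AUG-2005','DD-MON-YYYY')),
partition p_2005_08 values less than (to_date('01-SEP-2005','DD-MON-YYYY')))
/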

Now another side-effect may be in the application of Materialized Views (MV's) - in particular, refreshing them. I'll be honest here -- I've thrown my heart and soul into MV's, but I've nearly always received a cruel rejection for my efforts. Getting the correct astrological alignment that will permit me to use fast refresh (on commit or on demand), partition change tracking, query rewrite, and partition exchanges in a production environment has so far eluded me, but maybe by eliminating partition exchanges (which were the source of my last MV-related debacle) I will finally find happiness. Who knows?

Tuesday, July 12, 2005

TimesTen

My imagination has been grabbed by the idea of the TimesTen in-memory database technology recently brought in house by Oracle, and how it might fit into a data warehousing architecture.

At the moment we have repositories for Business Objects and Informatica in the same database as our data warehouse. This has long struck me as a suboptimal architecture, as rapid application interface response can be seriously harmed by the simultaneous execution of reporting or ETL queries. Isolating the two types of database access in some way seems like a beneficial move for application usability.

So given that TimesTen allows in-memory caching of a subset of Oracle database tables, I'm now wondering how practical it would be to cache a subset of the repositories, and also to cache a subset of the data warehouse dimension tables to allow more rapid retrieval of lists of values (either for display through the Web Intelligence interface or for use by the ETL process). Much depends on how transparent the technology is to the client. I do not think that it would be a trivial matter to re-point the List Of Values queries to a different DSN, although with Informatica's lookups it could be done pretty easily.

Hmmm. Well, since the software is available for download I'll have a play around with it and see how it looks.

Tuesday, July 05, 2005

I Achieve A State Of Networking Nirvana

All I know about networking, I learned at home. From humble modem-based beginnings I have now got a nice range of little boxes scattered around the house, each with its own set of blinking, flickering or steady-state LEDs.

Networking still seems to be a hit-or-miss business -- the cure for any kind of problem is always to randomly reboot equipment, starting at the cable modem and working along to each connected device, until everything works again. If this technique fails then try fiddling with settings on the router. So it's always a bit of a thrill when you add something new to the network and get it working. Such was the case with my NSLU2, the purchase of which was inspired by having a spare 120GB USB drive hanging around (and which has performed flawlessly ever since). By the way, I was pretty glad to see that Cisco bought Linksys -- I can almost plausibly claim that my house is "a Cisco shop" now and sound a little more professional than being a Linksys one ;)

The ultimate challenge was accepted last week when I attempted to integrate my new TiVo box into the network using a Linksys WUSB11 antenna which was not on the compatibility list. Sure enough I found that the TiVo required a restart before it would connect to the internet. I also found an upgrade to my Linksys BEFW11S4 wireless router that fixed a problem of communicating between the wired and the wireless sides of the network.

The end result is that the TiVo no longer needs the phone line, and I can download programs from its 80GB internal hard drive to the 400GB Hitachi drive in my desktop. I can also use the TiVo to view photos and play music stored on the drive attached to the NSLU2 through my PC ... nice! I remember mentally pooh-poohing the idea of 802.11g when it first came out, but now I'm thinking "upgrade!" because waiting 38 minutes to transfer a 30 minute episode of Cook's Tour just seems somehow primitive. On the other hand, TiVo don't list any Linksys 802.11g adapters as "recommended" so that plan will be on the back burner for a while.

By the way, I also see that the router that I think I paid close to $100 for a few years ago is now selling for $19.99 on Amazon (after the hated mail-in rebate, of course) while the WSB24 Wireless Signal Booster costs $60 -- what's up with that?

Blog Spam

I noticed a sudden rash of blogspot sites that I'd never heard of in the "Came From" section of my Statcounter report today ... a little investigative clicking showed that they are all spam blogs, just linking each post directly to some external site.

I think that Blogger needs to be using one of those "character recognition" anti-bot tools.

Is anyone else getting this?

The Fourth of July -- My Favo(u)rite Holiday

Getting used to the public holidays in a new country is a tricky thing. Each one has its own atmosphere and traditions, and in America they seem to be more distinct than they were in England.

You might imagine that Independence Day would be a tricky one -- with so many people fixing Chinese-made plastic Stars and Stripes to the windows of their Toyota SUV's, the nation is awash in patriotic symbols as a reminder of the military incompetence of perfidious Albion. I like to take the opportunity to tell my friends that this is the time when Americans everywhere give thanks to their buddies the French for their help in securing their independence, then drop in a mention of the Battle of Trafalgar as being a rather more important event around that time. By the way, that also comes in handy when someone says, "We're the reason you guys don't speak German now!" -- the Battle of Trafalgar lets me reply, "We (the English) are the reason you fellows don't speak French now!"

So that leads to lively debate over a few beers.

But anyway, I like the 4th of July. It's more relaxed than Christmas and Thanksgiving, not having so many family obligations and all. In fact once the preparation for the food was mostly out of the way I went over to help a friend build a fence and gate, and drink beer. That went on longer than expected and we ended up eating late and putting the littl'uns to bed at around 10:30pm.

And the other reason why I like this holiday is for the fireworks. The tent that I bought mine from had prominent notices stating that "the possession or use of fireworks is illegal in the City of Colorado Springs", which presumably means that the cops could slap a ticket on everyone driving out of the parking lot. Nevertheless the streets reverberated to the sound of explosions for a long time last night. I wonder where my neighbours got their rockets? They're supposed to be illegal in the whole state of Colorado -- so much for "Land of the Free"!

Oh well, Thanksgiving next. Last year we deep-fried a turkey, which can also lead to fireworks. My arteries are just regaining their flexibility from that, so in another few months I should be good to go for another one.

Happy holidays, everyone.

Discussion on Keys at Database Debunkings

Just a short note to point out an exchange at Fabian Pascal's Database Debunkings that touches on the subject of synthetic and natural keys.

"To Laugh Or Cry" at Database Debunkings

Fabian Pascal has put a new page on his "Database Debunkings" site: "To Laugh or Cry". Whether your interest is in relational database theory or just a good old-fashioned argument, it looks like it'll keep you entertained.