
Oracle Sponge -- Now Moved To Wordpress

Please use http://oraclesponge.wordpress.com

Tuesday, April 12, 2005

Response Times: What The User Thinks And Why It Matters

A while ago I started working full time on a system that I hadn't been associated with for a year or two -- a decision support system with a Microstrategy front end. One of the pleasing features about Microstrategy was the logging of report executions, and we extracted historical report performance numbers from the metadata. We found that over the prior months the average report duration had increased from around 25 seconds up to around 100 seconds.

Quelle horreur! (sp? Where's DaPi when you need him?)

There were two main causes for this. Most obviously, at some point in the past few months the automatic collection of table and index statistics had ceased, so there were partitions of fact tables with hundreds of thousands of rows in them that were marked as having zero, and other such anomalies.

Less obviously there were some suboptimal design elements that were causing problems for the optimizer, such as not allowing partition pruning.

Update Apr 14: I just remembered that there was a third reason, and this was a big one. Star transformations were turned off quite a while before due to an optimizer bug that caused reports to fail. Despite upgrading out of the problem, they had not been turned back on.

Sleeves rolled up, we worked through the issues in the odd gaps between "real" work and after three months the average report time was down to 16 seconds. A nice improvement that the client project staff were very pleased with. But while project staff opinions count for a lot, what do the users think?

We informally polled a few high-profile users, and were able to show that for their particular frequently run reports (which were on average larger and more work-intensive than the average user's) the improvement was even more dramatic. A factor of 5-10x faster was not uncommon. So what did the users think? Here's the point of this article. They had not noticed a thing! Even when we were able to say, "You know, that monthly sales report that you like to run used to take 250 seconds to execute just before Christmas, and now it's all done in 25 seconds. How about that?", the response was muted. "Hmmm, yeah maybe you're right. Hadn't really noticed myself, but thanks".

Dispiriting? Maybe. But in retrospect I'm barely surprised, and here is my theory to explain why.

First of all, try staring at a blank spot on your screen, and see how long it is before your attention naturally starts to wander. Remember, this isn't a staring competition with the hardware, just be natural about it.

I may be some kind of easily distracted gadfly, but I reckon that I'm thinking about wallpaper paste, how ball bearings are made, or how much more I'd be enjoying myself out on my bike this afternoon within something like ten seconds. Maybe a shade less if it's just before lunch. So I would be shouting over the cubicle wall or brushing specks of dust off the screen or checking my voice mail pretty damned quickly, and you know then it doesn't matter much how long that report takes to run unless the boss is looking over my shoulder waiting for the result to come back.

So here's my theory.

"In general, users do not distinguish between a report that takes a short time and one that takes a long time, unless the result comes back within their casual attention span of a few seconds"

As a corollary of that let me also propose the following.

"If you improve report performance from a minute or so to anything more than about fifteen seconds, then the users will not notice"

Pretty cynical stuff, you may think, but I'd disagree there. I think that it's a general observation that may well hold true for most systems and users. Other opinions are always welcome and will always be credited, even if I disagree.
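As a rough illustration of that rule of thumb, here is a minimal Python sketch. The function name `user_notices` and the default 10-second attention span are assumptions made for the example (the post only says "a few seconds" to "about fifteen seconds"), so treat the threshold as a knob to adjust for your own users rather than a measured value.

```python
def user_notices(before_s: float, after_s: float,
                 attention_span_s: float = 10.0) -> bool:
    """Guess whether a user would notice a report getting faster.

    attention_span_s is an assumed threshold, not a measured one:
    the theory is that a speed-up only registers when the new
    run time falls inside the user's casual attention span.
    """
    if after_s >= before_s:
        # Not actually faster, so there is no improvement to notice.
        return False
    return after_s <= attention_span_s


# The figures quoted in the post, plus one hypothetical case.
for before_s, after_s in [(100, 16), (250, 25), (60, 5)]:
    speedup = before_s / after_s
    print(f"{before_s}s -> {after_s}s ({speedup:.1f}x faster): "
          f"noticed = {user_notices(before_s, after_s)}")
```

Under these assumptions, both improvements described in the post (100 to 16 seconds, and 250 to 25 seconds) come out as "not noticed", despite being 6x to 10x faster, while a hypothetical drop from 60 to 5 seconds would register.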

I recall reading quite a while ago of some scheduling systems for buses and elevators that used research on how long people were happy to wait before they got irritated by a delay. I recall something like one minute for an elevator and seven minutes for a bus. Don't quote me on that though. The important issue is that people do have these fairly sudden cut-off points at which their attention span or patience will take a turn for the worse. I don't know what drives this behaviour but I bet that it varies by population and by prior experience. Maybe Russians are better at waiting than Americans, or something. I'm sure that stock brokers are less patient than librarians when it comes to report run times, so these numbers may be variable, i.e. your mileage will vary.

Here's another corollary, before anyone thinks that by all this I mean "do not bother tuning your system".

"If you do reduce reporting time as above then it is of value only for the purposes of system resource conservation and of making you look good".

I have some more thoughts on what this actually means for the tuning process itself, and I'll publish that separately because your attention span is now probably exhausted.


At 12:33 PM, Blogger Bill S. said...

I used to work for a reasonably-sized bank. Running software by EDS/Newtrend on a Unisys A-series, no test system. High OLTP, especially at month-end. M/E processing used to take 3 days (not a typo - 3 DAYS!) because of the number of "reports" that needed to be run at the end. Enter our hero (:-D) who is now in charge of datacenter operations. Recommendation is to first and foremost GET A TEST BOX! After 1.5 years, got one - but it is SMALL in comparison to the A-Series, really just an over-powered server. OK, so once we get our DB established on the test box (full representation of production code and data, how we managed I don't know). LSS, I recommend that at the close of the actual M/E processing we do a full b/u, restore to test, and run all the reports on the test box. Users immediately start SCREAMING because "our reports will be late!", until I remind them that they currently wait 2 DAYS for their reports, and this will make it run FASTER (no contention with production processing). Sometimes, they do notice it, they just don't realize the IMPROVEMENT.

At 2:40 PM, Anonymous Pete_S said...

That's the problem with DWH users - they have no sense of time. Unless 1) they now don't have time to get a cup of coffee whilst that old, slow YTD report runs or 2) they leave out a time predicate and then whinge about the report taking hours and having 5000 pages... you (and me) just can't win

At 8:10 AM, Blogger David Aldridge said...

Thanks for sharing guys. It's good to know that my problems aren't unique!

At 2:15 AM, Blogger DaPi said...

"Quelle horreur! (sp? Where's DaPi when you need him?)"

Ich war in Zürich! sp looks OK to me.

Besides the professional satisfaction of speeding things up:

- your faster report is probably consuming fewer resources and so may have less impact on anything else that may be running.

- I've seen it said: "no point in speeding up a 3-hour report that runs over-night". That's only true until it crashes and has to be rerun at 09:00 for a 10:00 meeting.

At 7:08 AM, Blogger David Aldridge said...

DaPi, well I had to think very hard before I committed to it.

I recently heard that our 2Gb controllers were configured to only 1Gb -- seems like an interesting way to waste money. So if that gets fixed things might get a tad better again.

Incidentally, although I didn't mention it in the article, there was a third major reason why everything was slow -- star transformations were turned off. * glurk * Maybe I'll update it appropriately.

Thanks DaPi

At 9:16 AM, Blogger Pete_S said...

Well if you do change it don't forget to test it first ;-)

Cheers Pete
(oh don't read the blog - I haven't written anything yet... too busy working)

At 5:49 AM, Blogger Connor McDonald said...

"I recently heard that our 2Gb controllers were configured to only 1Gb"

In terms of wasting resources, we're running 32bit Oracle on a Sun box with 32G of RAM... the unused 28G I assume is being kept "for future use"


