v$filestat And Multiblock Read Statistics
More Updates: 19 May 2005, about 45 minutes later
- Jonathan Lewis writes to point out that X$KCFIO does include multiblock statistic columns, but that the numbers are not consistent. So maybe later on I'll think about a methodology for determining what the cause of the inconsistency might be. But not now, because ...
- I'm in the middle of an upgrade of my development instance to ASM. W00t, as the youngsters say. Otherwise I'd also be checking to see if my own math is consistent with the numbers in X$KCFIO. Actually, I don't see how they can be but it'll provide a diversion.
- Another cause of the smaller block counts occurs to me -- that individual blocks are already in the buffer cache, hence they are not requested from disk and hence the number of blocks requested is below DFMRC. This is unlikely to be an issue on my (or any) data warehouse where parallel query is in use because direct reads are used (hence the buffer cache is not inspected), and the blocks are also less likely to be in the buffer cache due to the static nature of the data. I think this is highly unlikely to be the case with the Peoplesoft instance.
- Happily though, the validation of the DFMRC was just a side effect -- what i was really looking for here was the average wait time. If the numbers in the X$KCFIO are not completely consistent then that may not be a problem. As long as some major class of read is not completely omitted, or the numbers on which the average wait are not all updated consistently, then the methodology may be viable. Can't test it right now though, of course.
- I think I'm going to change the subtitle of my blog.
- Note to self: get JL to check my tax returns next year. He's a sharp-eyed devil-for-detail. Thanks in advance, JL.
Update 19 May 2005 (Original posting below)
Jeff & Jan, thaks for the feedback there.
I'm thinking that some of the issues there may be due to extent sizing -- multiblock reads obviously won't fetch the full DFMRC on blocks if the extent is not that large, so I'm thinking of a query to breakout files according to extent size (that ain't going to be "performant", I'd guess). But because what we see from v$filestat is an average then numbers based on any file with lots of small extents (which may be more likely to get FTS and hence multiblock read requests) would be skewed by this.
The method might have limited application because of this -- in a data warehouse there's probably many tablespaces with only fact tables, therefore with only large extents. Not a problem, because those are the ones where the multiblock reads would be prevalent, of course.
Original Article 17 May 2005
So, the v$filestat view gives us some statistics on the number of disk reads and their duration, documented for 9i here. In 10g we have the same structure, with the enhancement that historical snapshots of v$filestat are preserved in DBA_HIST_FILESTATXS. It would be a pretty trivial task of course, to automate this collection in prior versions through DBMS_JOB. The v$filestat table is mostly based on X$KCFIO, by the way, but there do not appear to be any additional columns in X$KCFIO of interest here.
The view breaks out read statistics (and write, but I'm just interested in read right now) as follows:
- PHYRDS: Number of physical reads done
- PHYBLKRD: Number of physical blocks read
- SINGLEBLKRDS: Number of single block reads
- READTIM: Time (in hundredths of a second) spent doing reads
- SINGLEBLKRDTIM: Cumulative single block read time (in hundredths of a second)
So, we might define the following:
- MULTIBLKRDS: The number of physical reads in multiblock mode = (PHYRDS-SINGLEBLKRDS)
- MULTIBLKBLKRD: The number of blocks physically read in multiblock mode = (PHYBLKRD-SINGLEBLKRDS) ... because a single block read by definition is a read of a single block.
- BLKSPERMULTIBLKRD: The average number of blocks read for each multiblock read = MULTIBLKBLKRD/MULTIBLKRDS = (PHYBLKRD-SINGLEBLKRDS)/(PHYRDS-SINGLEBLKRDS)
- MULTIBLKRDTIM: The total time spent in multiblock reads = (READTIM-SINGLEBLKRDTIM)
- AVGMULTIBLKRDTIM: The average time taken per multiblock read = MULTIBLKRDS/MULTIBLKRDTIM = (PHYRDS-SINGLEBLKRDS)/(READTIM-SINGLEBLKRDTIM)
Here's the final query I came up with:
Perhaps you could hive it a try and see if it works for you. Feedback and comments are very welcome.