Wait Events - v$system_event



Every time an Oracle process cannot do some work, it reports the reason in an internal tables. The values in this table are called wait events. The summary of these waits can be found in the table

v$system_event
Examples of wait events:
  • I/O - waiting for a read from disk
  • LOCKS - waiting for a enqueue on a table
  • DBWR - waiting for a change to be written to disk
  • NET - waiting for data to come in over the network
  • IDLE - simply waiting for work to do, ie the users is getting a coffee
  • LGWR - waiting for a commit to be written out to redo
  • REDO LOGS - waiting log file switch completion
  • PAGING - can serious affect latches
  • SQL PARSING - latch waits on shared pool and library cache
    references


    Questions or comments ? Send a note to Kyle Hailey
    
    
    
    
    
    This is an great table. In tells you instantly if and where there is a problem. This table is more and more widely used, but I am absolutly amaized at how slow people were to catch on to this table and still how suprisingly often people don't recommend using it for situations where it would point out rapidly the problem.

    Wait events were started in version 7, many years ago (10?!!). For a long time after version 7 came out tuning was still dominated by version 6 white papers and presentations which should have been more or less thrown out the window. It was the version 6 tuning methods that put emphasis (the had to, that was all there was) on system statistics. They came up with formulas and ratios based on statistics and probablity. Well with the wait events that was no longer needed. The classic case is the cache buffer hit ratio. The ratio is no longer needed. Wait events give much more information on how much we are waiting for io than some ratio based on what Oracle thinks are disk reads verses logical reads.

    Examples:

    1) Low cache hit ration, good response time: Reduce the Oracle buffer Cache and increase the UNIX buffer cache (using data files on cooked file system). The data will be cached in the UNIX buffer cache, an not in the Oracle buffer cache because it is too small. Oracle will report high physical reads, thus a low cache hit ratio, but the services times will be extremely rapid because the data is cached in memory, the UNIX buffer cache, and not from disk (if of course it is already cached in the UBC).

    2) High cache hit ratio, bad response time: Take a fairly large sga, cache a large table in memory (data warehouse kind of thing). Do an update on a column of the large table but joining with some smaller table say 20 times smaller but who isn't cached. Your cache hit ratio could be around 95% (say by the end of the opperation you need to read all of the big table and all of the little table(or index) so that 20 out of every 21 blocks are in memory). Now say accessing that 1 in 20 blocks is 1/10th a second (bad disk, flood it with write or whatever). Say the big table has 360000 blocks and the little table(or index) has 1/20th, 18000 blocks. So then at 1/10th sec access for 18000 blocks adds up to half an hour of wait time with a buffer cache hit ratio of 95%. Your bad disk problem won't show up with the cache hit ratio but shows up very distinctly in the wait events.

    THe buffer cache hit ratio - is concept of little value in v7 and v8 unless your hit rate is 100%. Not that cache hit ratio is worthless, but why use try to figure something out in the dark when you can turn the lights on, ie why use cache buffer hit ratio which can be a bit ambiguous to determine inefficiency when wait events will give a clear detailed picture. What is important is how long we waited to get the data not how many times we had to wait. I've seen client sights where the average wait time for a block is 1/10,000ths a second to 1/10th a second. The same buffer cache ratio at the two situations have two very different performance impacts. The buffer cache hit ratio is a carry over from v6 where there were no wait events and the only way get an idea how much we waited for disk reads was the buffer cache hit ratio. In v7/v8 there are wait events "db file sequential read" and "db file scattered read". If you have timed statistics you can see how long you waited for the data.

    The examples were illustrations that the cache hit ratio is generally of little help debuging a tuning problem when compared to the detail given by the wait events. They are totally academic examples. In no way did I mean to be a propent of large UNIX buffer caches.