Oracle I/O Tuning Methods (English Version)
Tuning Oracle I/O Usage
In general, different kinds of database workload require different kinds of tuning. OLTP databases need quick response times for generally small transactions, whereas batch-processing databases require high throughput for large transactions. A number of factors are important when tuning I/O with Oracle.
- Distribute I/O evenly.
- The number of disks, the I/O bandwidth and the seek/wait time are all important.
- A high rate of I/O requests increases wait time, and larger I/O requests increase service time (service time can be reduced by striping datafiles across different disks).
- Mirroring of disks causes an I/O increase but can be essential for rapid recoverability.
Block Size
Large block size is beneficial for sequential reads and multimedia storage. Note that the most efficient method of multimedia storage is to store only a reference in the database and to store the multimedia objects themselves outside the database. Large text objects are a different matter: they are not well suited to storage outside the database, since extensive searching facilities are usually required. Small block size is beneficial for small transactional random reads and writes, since only what is required is read at once. A large block size coupled with small transactions wastes I/O by reading blocks in which only a small part of each block contains the required data. Large blocks also carry a higher chance of block contention, since a larger block potentially contains more rows and it is more likely that different requests will be satisfied by the same block. 8K is probably the most effective all-round block size for Oracle. Note that in Oracle 2K and 4K block sizes do not allow reading of more than a single block at once, i.e. with DB_FILE_MULTIBLOCK_READ_COUNT and SORT_MULTIBLOCK_READ_COUNT. Large block size requires less overhead but is totally inappropriate for index-accessed data in an OLTP environment.
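The wasted I/O described above can be put into rough numbers. The following is a minimal sketch, not a measurement: the 200-byte row is a hypothetical figure, block header overhead is ignored, and the function name is illustrative only.

```python
def useful_fraction(row_bytes: int, block_bytes: int) -> float:
    """Fraction of a single block read that is the row actually wanted.

    Assumes one random read fetches one block to return one row and
    ignores block header overhead (a simplification).
    """
    return min(row_bytes, block_bytes) / block_bytes


# Hypothetical 200-byte row fetched by a small OLTP transaction.
for block_kb in (2, 8, 32):
    fraction = useful_fraction(200, block_kb * 1024)
    print(f"{block_kb}K block: {fraction:.1%} of the bytes read are useful")
```

The larger the block, the smaller the useful share of each random read, which is the reason given above for keeping blocks small in an OLTP environment.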
Problem Detection
Problems can occur at both the operating system level and the Oracle level. It is best to separate files with heavy disk usage from Oracle database files; this includes even the Oracle installation. Assuming an external RAID array, it may be best to place the operating system plus the Oracle installation software on a disk internal to the server and then place the Oracle database files on the RAID array.
Operating System
On NT use the Performance Monitor; on UNIX use either sar -d or the iostat command.
Oracle
Oracle I/O statistics can be found as shown below.
- v$filestat - database files.
- v$sysstat, v$system_event and v$session_event - logs.
- v$system_event and v$session_event - archives.
- v$system_event and v$session_event - control files.
V$FILESTAT
select substr(v$datafile.name,1,35) "Name"
      ,v$filestat.phyrds "Reads"
      ,v$filestat.phywrts "Writes"
      ,v$filestat.phyblkrd "Blocks Read"
      ,v$filestat.phyblkwrt "Blocks Written"
      ,v$filestat.readtim "Read Time"
      ,v$filestat.writetim "Write Time"
from v$datafile, v$filestat
where v$datafile.file#=v$filestat.file#
order by 1;

Name                                    Reads    Writes Blocks Read Blocks Written Read Time
----------------------------------- --------- --------- ----------- -------------- ---------
C:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF      1446         6        2644              6         0
C:\ORACLE\ORADATA\ORCL\USERS01.DBF          0         0           0              0         0
D:\ORACLE\ORADATA\ORCL\INDX01.DBF           0         0           0              0         0
D:\ORACLE\ORADATA\ORCL\RBS01.DBF           13         2          13              2         0
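One way to read the V$FILESTAT figures is to ask how evenly the I/O is spread across files. The sketch below is an illustration under assumptions: the rows are copied from the sample output above, the function names are hypothetical, and since the view holds cumulative counts, comparing two snapshots would give a better picture of a busy period.

```python
# (file name, physical reads, physical writes, blocks read) from the sample output
rows = [
    ("SYSTEM01.DBF", 1446, 6, 2644),
    ("USERS01.DBF", 0, 0, 0),
    ("INDX01.DBF", 0, 0, 0),
    ("RBS01.DBF", 13, 2, 13),
]


def io_share(file_rows):
    """Map each file to its fraction of all physical reads plus writes."""
    total = sum(reads + writes for _, reads, writes, _ in file_rows)
    return {
        name: (reads + writes) / total if total else 0.0
        for name, reads, writes, _ in file_rows
    }


def blocks_per_read(file_rows):
    """Average blocks fetched per read request; values well above 1 suggest multiblock reads."""
    return {
        name: (blocks / reads if reads else 0.0)
        for name, reads, _, blocks in file_rows
    }


for name, share in io_share(rows).items():
    print(f"{name:<14} {share:6.1%} of requests, "
          f"{blocks_per_read(rows)[name]:.2f} blocks per read")
```

In this sample almost all requests hit the SYSTEM datafile, which is expected on an idle, freshly started instance but would point to uneven distribution on a loaded one.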
V$SYSTEM_EVENT
select substr(event,1,28) "Event"
      ,total_waits "Waits"
      ,total_timeouts "Timeouts"
      ,time_waited "Wait Time"
      ,average_wait "Avg Wait"
from v$system_event
order by 1;

Event                            Waits  Timeouts Wait Time  Avg Wait
---------------------------- --------- --------- --------- ---------
Null event                           1         1         0         0
SQL*Net break/reset to clien        24         0         0         0
SQL*Net message from client        190         0         0         0
SQL*Net message to client          191         0         0         0
SQL*Net more data to client          7         0         0         0
control file parallel write        956         0         0         0
control file sequential read       199         0         0         0
db file parallel write               1         0         0         0
db file scattered read             116         0         0         0
db file sequential read           1359         0         0         0
db file single write                 4         0         0         0
direct path write                    8         0         0         0
file identify                       20         0         0         0
file open                           53         0         0         0
instance state change                1         0         0         0
latch free                           4         4         0         0
log file parallel write             24         0         0         0
log file sequential read             5         0         0         0
log file single write                5         0         0         0
log file sync                        2         0         0         0
pmon timer                      274422    274421         0         0
process startup                      7         1         0         0
rdbms ipc message                 2907      2850         0         0
rdbms ipc reply                      7         0         0         0
refresh controlfile command         24         1         0         0
reliable message                     1         0         0         0
smon timer                          15        10         0         0
sort segment request                 1         1         0         0
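The largest wait counts in V$SYSTEM_EVENT usually belong to background timers rather than to real I/O. As a rough sketch, the events can be ranked after setting those aside. The rows are a subset of the sample output above, the set of events treated as idle is an assumption of this sketch, and ranking by wait count is used only because the timing columns in the sample are all zero.

```python
# Events treated as idle in this sketch (an assumption, not a complete list).
IDLE_EVENTS = {
    "SQL*Net message from client",
    "pmon timer",
    "smon timer",
    "rdbms ipc message",
}

# (event, total waits): a subset of the sample output above.
sample = [
    ("SQL*Net message from client", 190),
    ("control file parallel write", 956),
    ("control file sequential read", 199),
    ("db file scattered read", 116),
    ("db file sequential read", 1359),
    ("log file parallel write", 24),
    ("log file sync", 2),
    ("pmon timer", 274422),
    ("rdbms ipc message", 2907),
    ("smon timer", 15),
]


def busiest_events(rows, top=3):
    """Return the non-idle events with the highest wait counts."""
    active = [(event, waits) for event, waits in rows if event not in IDLE_EVENTS]
    return sorted(active, key=lambda row: row[1], reverse=True)[:top]


for event, waits in busiest_events(sample):
    print(f"{event:<30} {waits:>8}")
```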
V$SESSION_EVENT
select substr(v$session.username,1,10) "Username"
      ,substr(v$session_event.event,1,29) "Event"
      ,v$session_event.total_waits "# of Waits"
      ,v$session_event.total_timeouts "Timeouts"
      ,v$session_event.time_waited "Time Waited"
      ,v$session_event.average_wait "Avg Wait"
      ,v$session_event.max_wait "Max Wait"
from v$session,v$session_event
where v$session.sid=v$session_event.sid
order by 1,2;

Username   Event                         # of Waits  Timeouts Time Waited  Avg Wait  Max Wait
---------- ----------------------------- ---------- --------- ----------- --------- ---------
SYSTEM     SQL*Net break/reset to client         24         0           0         0         0
SYSTEM     SQL*Net message from client          161         0           0         0         0
SYSTEM     SQL*Net message to client            162         0           0         0         0
SYSTEM     SQL*Net more data to client            7         0           0         0         0
SYSTEM     control file sequential read          87         0           0         0         0
SYSTEM     db file sequential read              356         0           0         0         0
SYSTEM     direct path write                      8         0           0         0         0
SYSTEM     file open                              5         0           0         0         0
SYSTEM     latch free                             1         1           0         0         0
SYSTEM     log file sync                          1         0           0         0         0
SYSTEM     refresh controlfile command           23         1           0         0         0
SYSTEM     sort segment request                   1         1           0         0         0
           control file parallel write            9         0           0         0         0
           control file parallel write          955         0           0         0         0
           control file sequential read          23         0           0         0         0
           control file sequential read          52         0           0         0         0
           control file sequential read          21         0           0         0         0
           db file parallel write                 1         0           0         0         0
           db file scattered read               110         0           0         0         0
           db file sequential read                6         0           0         0         0
           db file sequential read                6         0           0         0         0
           db file sequential read                4         0           0         0         0
           db file sequential read              680         0           0         0         0
           db file single write                   4         0           0         0         0
           file identify                          6         0           0         0         0
           file identify                          6         0           0         0         0
           file identify                          5         0           0         0         0
           file open                             10         0           0         0         0
           file open                              1         0           0         0         0
           file open                              8         0           0         0         0
           file open                              6         0           0         0         0
           file open                             11         0           0         0         0
           latch free                             2         2           0         0         0
           log file parallel write               24         0           0         0         0
           log file sequential read               5         0           0         0         0
           log file single write                  5         0           0         0         0
           pmon timer                        274432    274431           0         0         0
           rdbms ipc message                    965       959           0         0         0
           rdbms ipc message                    985       959           0         0         0
           rdbms ipc message                      5         4           0         0         0
           rdbms ipc message                    983       959           0         0         0
           smon timer                            15        10           0         0         0
How to Tune I/O
There are a number of approaches to tuning I/O. These involve striping files across multiple disks and distributing I/O, preventing dynamic space allocation, tuning sorts, checkpointing efficiently, and tuning the database writer and log writer processes.
Striping and I/O Distribution
In general, CPU activity ends up waiting for I/O operations to complete. Striping files across multiple disks, at either the operating system or the Oracle level, can sometimes help to improve performance. Striping here means placing files onto different disks, where the files to separate are those that see I/O activity at exactly the same time; a lot of overhead can be generated by disk head movement when different files on one disk are operated on at the same time. As with any type of tuning, the types of operations largely determine how a system can best be tuned. With Oracle, the transaction size, the transaction sizes seen during specific periods of time, and the spread of transaction sizes over time can all determine the correct course for tuning. Small and large transactions use resources at differing rates, so different approaches to tuning are required.
Typically it is sensible to separate datafiles from redo log files, since both are active at the same time for any database change operation. When multiplexing redo logs it is also advisable to place each member of a redo log group on a separate disk. Note that redo log multiplexing is very efficient, since the members of a group are written in parallel. Placing archive logs on a disk separate from all redo logs is another possibility, preventing I/O contention between redo logs and archives while redo logs are copied to archive logs. It can also be advisable to place rollback segment datafiles on another separate disk, since rollback is generated during database change operations as well, even though rollback will not see as much activity as the redo logs. In the past it was also recommended to separate datafiles containing tables from those containing their indexes, but generally an index is hit just before its table is, rather than at the same time. However, complex joins may negate this and make it advantageous to place datafiles containing tables and datafiles containing indexes on separate disks. Note that with the advent of RAID arrays and efficient operating system level striping, separating Oracle datafiles and log files onto different disks becomes less important.
It is important to place all files unrelated to Oracle on disks separate from Oracle and its datafiles. It is probably also a good idea to place the Oracle installation software itself on a separate disk, probably the primary disk or partition.
Another possible form of striping is at the Oracle level rather than the operating system level. This can involve striping large tables across multiple datafiles by placing the different datafiles onto separate disks. One form of this type of striping is simply the creation of multiple datafiles in a single tablespace, where the different datafiles reside on different disks; this is beneficial for large amounts of random access retrieving small amounts of data. A second form is the partitioning of tables across multiple tablespaces, where the tablespace datafiles can be placed onto multiple disks. Striping within a tablespace can be achieved by creating multiple datafiles of the same size and then, in the table storage clause, setting the INITIAL parameter to the size of a datafile and MINEXTENTS to the number of datafiles, so that the space allocated at creation is INITIAL multiplied by the number of datafiles. This will probably only work well with static data, because once data is added probably only the first datafile declared for the tablespace will extend automatically, or whichever datafile is found first that can be extended automatically. In short, this method is a little ridiculous when a RAID array can do the same thing, with mirroring, and probably a lot more efficiently. If you can afford Oracle you can afford a RAID array.
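The storage arithmetic for the equal-sized-datafile approach can be sketched as follows. This is one reading of the description above, with one extent per datafile; the function name, the `overhead_bytes` allowance for file header space, and the use of NEXT equal to INITIAL are assumptions of the sketch rather than settings taken from the text.

```python
def stripe_storage(datafile_bytes: int, datafile_count: int, overhead_bytes: int = 0):
    """Suggest storage-clause values that place one extent in each datafile.

    overhead_bytes is a hypothetical allowance for space in each datafile
    that is not available to extents.
    """
    extent_bytes = datafile_bytes - overhead_bytes
    if extent_bytes <= 0 or datafile_count < 1:
        raise ValueError("datafile size must exceed overhead and count must be >= 1")
    return {
        "initial": extent_bytes,          # size of the first extent
        "next": extent_bytes,             # later extents kept the same size (assumption)
        "minextents": datafile_count,     # one extent per datafile
        "total_allocated": extent_bytes * datafile_count,
    }


# Four hypothetical 100 MB datafiles, each on its own disk.
plan = stripe_storage(100 * 1024 * 1024, 4)
print(plan)
```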
Dynamic Space Allocation
When a segment such as a table or rollback segment runs out of space, Oracle will dynamically add more space to that segment. This is called dynamic extension: Oracle adds new extents to the segment. It can adversely affect performance, because recursive calls are used not only to execute the SQL statements required but also to create the new extents. There is a statistic in the V$SYSSTAT view called recursive calls. Other things also cause recursive calls, i.e. data cache misses (search the disk), trigger firings (go off and do something else before proceeding), DDL, SQL statements in any type of block, and primary to foreign key referential integrity constraint checking. None of these five things can be avoided; SQL executed in blocks, for instance, has serious performance advantages.
select name,value from v$sysstat where name='recursive calls';

NAME                                                                 VALUE
---------------------------------------------------------------- ---------
recursive calls                                                       8313
The obvious way to reduce automatic allocation of extents is to increase the size of the extents for an object or for all tablespaces. High database insertion activity is more likely to cause frequent automatic extent allocation. Larger extents can store more blocks, so many blocks read sequentially will lie in fewer extents and less I/O is required. The down-side to larger extents is that they may not be physically stored contiguously. Rollback segments tend to do a lot of dynamic extension, since it is best to shrink them back to their optimal size after use. It is simply dangerous to let rollback segments extend automatically without limit; run-away transactions can use up disk space completely and crash the database. One solution for rollback segment dynamic extension is to use multiple rollback segments of different sizes and to allocate each transaction a specific rollback segment using the SET TRANSACTION command. However, any unused, specialised rollback segments should either be created privately for specific transactions or be held offline unless in use. Obviously SET TRANSACTION can cause contention, because multiple users executing the same transaction will be using the same rollback segment (CRUNCH !!!). In general, create rollback segments for the average transaction size, and perhaps some specialised offline rollback segments for specific, infrequent, single-user uses.
Tuning Sorts
Sorting is faster when performed in memory. When memory is insufficient, sorting is performed on disk, either in a temporary tablespace or in whatever tablespace is designated as that user's temporary tablespace. Temporary tablespaces are specifically designed for disk sorts and should be used.
select name,value from v$sysstat where name in ('sorts (memory)','sorts (disk)');

NAME                                                                 VALUE
---------------------------------------------------------------- ---------
sorts (memory)                                                         124
sorts (disk)                                                             0
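The two sort statistics are easiest to judge as a ratio. A minimal sketch, using the figures from the output above; the function name is hypothetical, and no particular threshold is implied by the text.

```python
def disk_sort_ratio(memory_sorts: int, disk_sorts: int) -> float:
    """Fraction of all sorts that spilled to disk (0.0 when no sorts have run)."""
    total = memory_sorts + disk_sorts
    return disk_sorts / total if total else 0.0


# Values from the sample output: 124 memory sorts, 0 disk sorts.
print(f"{disk_sort_ratio(124, 0):.1%} of sorts went to disk")
```

A ratio that climbs over time suggests that the sort area is too small for the sorts being run.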
SORT_AREA_SIZE determines the amount of memory allocated to in-memory sorting. Most texts recommend setting SORT_AREA_RETAINED_SIZE equal to SORT_AREA_SIZE. This means that all of the memory will be retained for sorting after query completion. Doing this is really only appropriate for large batch operations and not for OLTP databases. If small transactions are frequent and large transactions infrequent, it is better to sort to disk for the large transactions. If the two parameters are kept the same, small transactions will have far too much memory to use and large transactions will simply push all small transactions to disk when they occur. Generally set SORT_AREA_RETAINED_SIZE to about 10% of SORT_AREA_SIZE. For large sorts set the SORT_MULTIBLOCK_READ_COUNT parameter higher so that more is transferred between memory and disk at once, i.e. sorting is done in a few larger chunks rather than many smaller chunks, which means less I/O. An index can be created with the NOSORT option when the table rows are already physically stored in the order of the index; the sort during index creation is then skipped.
Checkpoints
A checkpoint forces the writing of dirty buffers (changed data) from the data buffer cache to disk. The more frequently checkpoints are performed, the less recovery is required after instance failure. However, frequent checkpointing can affect performance by requiring more I/O. Minimum checkpointing can be implemented by setting LOG_CHECKPOINT_INTERVAL larger than the size of a single redo log (in blocks) and setting LOG_CHECKPOINT_TIMEOUT = 0, so that checkpoints occur only at log switches. Do not have differing redo log sizes: unless you are manually controlling when logs are used by forcing switches, make all redo logs the same size. The larger a redo log is, the longer it takes to write to the archives; however, small redo logs are not effective for a very active system, since they will be filled, archived and switched too rapidly. FAST_START_IO_TARGET limits the number of I/O operations required for recovery by writing dirty buffers to disk whenever the FAST_START_IO_TARGET limit is exceeded. FAST_START_IO_TARGET therefore aids recoverability, not performance.
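The minimum-checkpointing setting above comes down to expressing the redo log size in blocks and choosing a larger number. The sketch below assumes that LOG_CHECKPOINT_INTERVAL is counted in operating system blocks and that such a block is 512 bytes; both should be checked for the platform in question, and the function name is illustrative.

```python
def min_checkpoint_interval(redo_log_bytes: int, os_block_bytes: int = 512) -> int:
    """Smallest interval, in OS blocks, that is larger than one redo log.

    os_block_bytes defaults to 512, an assumption that varies by platform.
    """
    blocks_in_log = -(-redo_log_bytes // os_block_bytes)  # ceiling division
    return blocks_in_log + 1


# Hypothetical 10 MB redo logs, all the same size as recommended above.
interval = min_checkpoint_interval(10 * 1024 * 1024)
print(f"LOG_CHECKPOINT_INTERVAL = {interval}")
print("LOG_CHECKPOINT_TIMEOUT = 0")
```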
The DBWn and LGWR Processes
The LGWR process will begin writing the log buffer to disk when the log buffer is 1/3 full or when a commit command is issued. A small log buffer causes too many I/Os and a large log buffer can delay writing to disk. Set CHECKPOINT_PROCESS to TRUE to enable the CKPT process, thereby freeing the LGWR process of its checkpointing responsibilities.
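The two LGWR triggers named above can be written as a simple predicate. This is only a model of the rule as stated in the text, with hypothetical names; it does not cover any other conditions under which LGWR may write.

```python
def lgwr_should_write(used_bytes: int, buffer_bytes: int, commit_issued: bool) -> bool:
    """True when the log buffer is at least one third full or a commit was issued."""
    return commit_issued or used_bytes * 3 >= buffer_bytes


# Hypothetical 512 KB log buffer.
buffer_bytes = 512 * 1024
for used in (64 * 1024, 200 * 1024):
    print(f"{used // 1024}K used, no commit -> write: "
          f"{lgwr_should_write(used, buffer_bytes, False)}")
```

With a small buffer the one-third mark is reached after very little redo, which is why a small log buffer produces many small writes.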
Create multiple database writer processes (which write dirty database buffer cache entries to disk) by setting the DB_WRITER_PROCESSES parameter to a value between 1 and 10, producing processes DBW0 through DBW9. Note that DBWR_IO_SLAVES cannot be used together with multiple DBWn processes. With a single buffer pool, the database buffer cache is divided up among the DBWn processes by LRU latches, one latch per LRU list. Load should be spread amongst all DBWn processes by setting DB_BLOCK_LRU_LATCHES equal to the number of CPUs or a multiple of it. Thus a 4 CPU system with 2 DBWn processes could have 4 LRU latches. In the case of multiple buffer pools (DEFAULT, KEEP and RECYCLE), set the number of LRU latches equal to the number of DBWn processes or a multiple of it. To spread the load, place an equal number of latches in each pool, out of a total that is a multiple of the number of DBWn processes.
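The latch counts described above can be worked out mechanically. The sketch below follows the rules as stated in the text; the function names are hypothetical, and choosing the smallest total that satisfies both conditions in the multi-pool case is a choice made for this illustration.

```python
from math import gcd


def latches_single_pool(cpus: int, multiple: int = 1) -> int:
    """DB_BLOCK_LRU_LATCHES for one buffer pool: the CPU count or a multiple of it."""
    return cpus * multiple


def latches_multi_pool(dbwn_processes: int, pools=("DEFAULT", "KEEP", "RECYCLE")):
    """Smallest latch total that is a multiple of the DBWn count and splits evenly by pool."""
    pool_count = len(pools)
    total = dbwn_processes * pool_count // gcd(dbwn_processes, pool_count)  # least common multiple
    per_pool = total // pool_count
    return total, {pool: per_pool for pool in pools}


# The example from the text: 4 CPUs, 2 DBWn processes, single pool.
print(latches_single_pool(4))

# The same 2 DBWn processes with three buffer pools.
total, per_pool = latches_multi_pool(2)
print(total, per_pool)
```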