欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页  >  数据库

Direct-PathINSERT(直接路径插入)

程序员文章站 2022-04-16 11:25:32
...

下面我们来看一下到底什么是直接路径以及使用直接路径的优势和限制。文章截自“ Oracle? Database Administrator's Guide11 g Release 2 (11.2) ”(当然中文注解不是官方文档中的) Improving INSERT Performance with Direct-Path INSERT When loading large

下面我们来看一下到底什么是直接路径以及使用直接路径的优势和限制。文章截自“Oracle? Database Administrator's Guide11g Release 2 (11.2)”(当然中文注解不是官方文档中的)

Improving INSERT Performance with Direct-Path INSERT

When loading large amounts of data, you can improve load performance by using direct-pathINSERT.

##当我们加载大量数据的时候,可以使用direct-path INSERT来提高处理性能

This section contains:

About Direct-Path INSERT

How Direct-Path INSERT Works

Loading Data with Direct-Path INSERT

Specifying the Logging Mode for Direct-Path INSERT

Additional Considerations for Direct-Path INSERT

About Direct-Path INSERT

Oracle Database inserts data into a table in one of two ways:

##Oracle数据库向表中插入数据有如下两种方式(传统路径和直接路径):

During conventional INSERT operations, the database reuses free space in the table, interleaving newly inserted data with existing data. During such operations, the database also maintains referential integrity constraints.##使用传统路径方式插入数据,数据库会利用表中已有的空闲空间,新老数据是交叉在一起的,同时在插入的过程中会维护引用完整性约束

During direct-path INSERT operations, the database appends the inserted data after existing data in the table. Data is written directly into data files, bypassing the buffer cache. Free space in the table is not reused, and referential integrity constraints are ignored. Direct-path INSERT can perform significantly better than conventional insert.##使用直接路径方式插入数据,数据库在表中已有数据之后追加数据(即直接使用高水位线以上的新块,不会像传统路径一样去扫描高水位线以下的空闲块使用)。数据绕过buffer cache直接写进数据文件。高水位线以下的空闲空间不会被使用,表的完整性约束会被忽略。相对于传统路径插入,直接路径插入效率提高很显著。

The database can insert data either in serial mode, where one process executes the statement, or in parallel mode, where multiple processes work together simultaneously to run a single SQL statement. The latter is referred to as parallel execution.

##数据库能够以串行模式插入数据,也能够以并行模式插入数据,并行模式也就是并行执行

The following are benefits of direct-path INSERT:

##下面列出了direct-path INSERT的优势:

During direct-path INSERT, you can disable the logging of redo and undo entries to reduce load time. Conventional insert operations, in contrast, must always log such entries, because those operations reuse free space and maintain referential integrity.##使用直接路径加载数据时你可以禁止产生redo和undo的日志,以此来缩短加载时间。相比之下传统路径加载总是会产生这些日志条目。

Direct-path INSERT operations ensure atomicity of the transaction, even when run in parallel mode. Atomicity cannot be guaranteed during parallel direct-path loads (using SQL*Loader).##Direct-pathINSERT操作能够确保事物的原子性,即使使用并行模式。但是direct-path loads(using SQL*Loader)不能保证事物的原子性。

When performing parallel direct-path loads, one notable difference between SQL*Loader andINSERT statements is the following: If errors occur during parallel direct-path loads with SQL*Loader, the load completes, but some indexes could be markedUNUSABLE at the end of the load. Parallel direct-path INSERT, in contrast, rolls back the statement if errors occur during index update.##当使用parallel direct-path loads时,需要注意的是,和direct-path INSERT不一样,如果在使用SQL*Loader进行parallel direct-path loads数据导入时出错,那么数据导入完成,但是索引会被标记为失效。相比之下Parallel direct-path INSERT如果在更新索引的时候出错,那么事物会回滚。

Note:

A conventional INSERT operation checks for violations of NOTNULL constraints during the insert. Therefore, if a NOTNULL constraint is violated for a conventional INSERT operation, then the error is returned during the insert. A direct-pathINSERT operation checks for violations of NOT NULL constraints before the insert. Therefore, if aNOT NULL constraint is violated for a direct-path INSERT operation, then the error is returned before the insert. ##传统路径插入是在插入过程中检查是否违反非空约束,因此,如果插入的数据违反了非空约束,那么会在插入过程中报错。直接路径插入在插入之前检查数据是否违反非空约束,因此,如果违反非空约束,那么会在插入之前报错。

How Direct-Path INSERT Works

You can use direct-path INSERT on both partitioned and nonpartitioned tables.

##可以对分区表和非分区表使用direct-path INSERT

Serial Direct-Path INSERT into Partitioned or Nonpartitioned Tables

The single process inserts data beyond the current high water mark of the table segment or of each partition segment. (Thehigh-water mark is the level at which blocks have never been formatted to receive data.) When aCOMMIT runs, the high-water mark is updated to the new value, making the data visible to users.

##使用高水位线之上的数据块进行插入,当执行commit提交以后,高水位线即被更新为新的值,使新插入的数据对用户变为可见(direct-path INSERT没有提交之前在同一个事物中,被插入的表是不能被dml,也不能被query的,否则会报ORA-12838错误。其他的会话可以查询该表,但是只能查到插入之前的数据)

Parallel Direct-Path INSERT into Partitioned Tables

This situation is analogous to serial direct-path INSERT. Each parallel execution server is assigned one or more partitions, with no more than one process working on a single partition. Each parallel execution server inserts data beyond the current high-water mark of its assigned partition segment(s). When a COMMIT runs, the high-water mark of each partition segment is updated to its new value, making the data visible to users.

##并行模式类似于串行模式。每一个并行进程被分配给一个或多个分区,但是不会出现多个程序处理一个分区的情况。

Parallel Direct-Path INSERT into Nonpartitioned Tables

Each parallel execution server allocates a new temporary segment and inserts data into that temporary segment. When aCOMMIT runs, the parallel execution coordinator merges the new temporary segments into the primary table segment, where it is visible to users.

##每一个并行进程被分配一个新的临时段并向临时段中插入数据。但执行commit时,并行执行的调度进程把这些临时段并入表所在的段中,这时数据对用户就是可见的了。

Loading Data with Direct-Path INSERT

You can load data with direct-path INSERT by using direct-path INSERT SQL statements, inserting data in parallel mode, or by using the Oracle SQL*Loader utility in direct-path mode. A direct-pathINSERT can be done in either serial or parallel mode.

##

Serial Mode Inserts with SQL Statements

You can activate direct-path INSERT in serial mode with SQL in the following ways:

##你可以使用如下方法激活直接路径加载

If you are performing an INSERT with a subquery, specify the APPEND hint in each INSERT statement, either immediately after theINSERT keyword, or immediately after the SELECT keyword in the subquery of theINSERT statement.##如果你是使用子查询的方式插入,那么在insert后加append提示

If you are performing an INSERT with the VALUES clause, specify theAPPEND_VALUES hint in each INSERT statement immediately after theINSERT keyword. Direct-path INSERT with the VALUES clause is best used when there are hundreds of thousands or millions of rows to load. The typical usage scenario is for array inserts using OCI. Another usage scenario might be inserts in a FORALL statement in PL/SQL.##如果你是使用带values从句的方式插入,那么在insert后加APPEND_VALUES提示

If you specify the APPEND hint (as opposed to the APPEND_VALUES hint) in anINSERT statement with a VALUES clause, the APPEND hint is ignored and a conventional insert is performed.

##如果你在带values从句的插入中使用的是append提示,而不是APPEND_VALUES提示,那么append提示会被忽略,执行的会是传统路径插入。

The following is an example of using the APPEND hint to perform a direct-pathINSERT:

INSERT /*+ APPEND */ INTO sales_hist SELECT * FROM sales WHERE cust_id=8890;

The following PL/SQL code fragment is an example of using the APPEND_VALUES hint:

FORALL i IN 1..numrecords
  INSERT /*+ APPEND_VALUES */ INTO orderdata 
  VALUES(ordernum(i), custid(i), orderdate(i),shipmode(i), paymentid(i));
COMMIT;
Parallel Mode Inserts with SQL Statements

When you are inserting in parallel mode, direct-path INSERT is the default. However, you can insert in parallel mode using conventionalINSERT by using the NOAPPEND PARALLEL hint.

##如果你使用并行模式插入,那么默认就是直接路径。然而,你也可以通过使用NOAPPEND PARALLEL提示来强制使用传统路径进行数据插入。

To run in parallel DML mode, the following requirements must be met:

##为了运行并行模式的DML,必须满足下面的条件:

You must have Oracle Enterprise Edition installed.##你必须安装的是oracle企业版

You must enable parallel DML in your session. To do this, submit the following statement:##你必须启用会话级并行DML

ALTER SESSION { ENABLE | FORCE } PARALLEL DML;

You must meet at least one of the following requirements:##同时你必须满足下面条件中的一个

Specify the parallel attribute for the target table, either at create time or subsequently##为目标表指定并行属性(建表时指定或建表后指定)

Specify the PARALLEL hint for each insert operation##在insert插入时加aprallel提示

Set the database initialization parameter PARALLEL_DEGREE_POLICY toAUTO##把PARALLEL_DEGREE_POLICY参数设为AUTO

To disable direct-path INSERT, specify the NOAPPEND hint in eachINSERT statement. Doing so overrides parallel DML mode.

##我们可以通过使用NOAPPEND提示来禁用直接路径插入。

Note:

You cannot query or modify data inserted using direct-path INSERT immediately after the insert is complete. If you attempt to do so, an ORA-12838 error is generated. You must first issue aCOMMIT statement before attempting to read or modify the newly-inserted data. ##在direct-path INSERT完成之后,你应该首先执行commit,然后再去查询和修改表中数据。如果你在还没有提交时就去查询或者修改数据,那么会报ORA-12838错误(注意,这里说的情况是在同一个会话下,如果你在另一个会话中还是可以对表进行查询的,当前你查到数据时插入之前的数据。但是就算是另一个会话你也不能完成dml操作,因为直接路径加载会给表加排他锁)

See Also:

"Using Conventional Inserts to Load Tables"

Oracle Database Performance Tuning Guide for more information on using hints

Oracle Database SQL Language Reference for more information on the subquery syntax ofINSERT statements and for additional restrictions on using direct-pathINSERT

Specifying the Logging Mode for Direct-Path INSERT

Direct-path INSERT lets you choose whether to log redo and undo information during the insert operation.

##Direct-path INSERT允许你选择是否记录redo和undo的日志信息(这里的意思应该是指是否写redo和undo)

You can specify logging mode for a table, partition, index, or LOB storage at create time (in aCREATE statement) or subsequently (in an ALTER statement).

If you do not specify either LOGGING or NOLOGGING at these times:

The logging attribute of a partition defaults to the logging attribute of its table.##如果明确给分区指定日志属性,那么分区会继承表的日志属性

The logging attribute of a table or index defaults to the logging attribute of the tablespace in which it resides.##如果表或索引没有指定日志属性,那么会继承所在表空间的日志属性

The logging attribute of LOB storage defaults to LOGGING if you specifyCACHE for LOB storage. If you do not specify CACHE, then the logging attributes defaults to that of the tablespace in which theLOB values resides.##LOB存储要看CACHE模式,如果是CACHE模式,则默认为LOGGING属性,如果为NOCACHE,则要看其所属表空间的LOGGING属性

You set the logging attribute of a tablespace in a CREATE TABLESPACE orALTER TABLESPACE statements.

Note:

If the database or tablespace is in FORCE LOGGING mode, then direct pathINSERT always logs, regardless of the logging setting.##如果数据库或者表空间处于FORCELOGGING模式,那么无论其他的日志属性怎么设置,直接路径加载都会记录日志。
Direct-Path INSERT with Logging

In this mode, Oracle Database performs full redo logging for instance and media recovery. If the database is inARCHIVELOG mode, then you can archive redo logs to tape. If the database is inNOARCHIVELOG mode, then you can recover instance crashes but not disk failures.

##在这种模式下,数据库记录用于实例和介质恢复所需的所有日志。如果数据库处于归档模式下,你可以把日志归档到磁带上。如果数据库处于非归档模式,那么你能够进行实例恢复,但是不能够进行介质恢复。

Direct-Path INSERT without Logging

In this mode, Oracle Database inserts data without redo or undo logging. Instead, the database logs a small number of block range invalidation redo records and periodically updates the control file with information about the most recent direct write.

##在这种模式下,数据库不记录数据插入时的redo和undo日志,但在新区标记invalid状态和修改数据字典的时候会产生少量日志,并且定期的在控制文件中更新当前的direct write信息

Direct-path INSERT without logging improves performance. However, if you subsequently must perform media recovery, the invalidation redo records mark a range of blocks as logically corrupt, because no redo data was logged for them. Therefore, it is important that you back up the data after such an insert operation.

##Direct-path INSERT without logging能够提高性能。然后如果你随后就进行介质恢复,那些被标记为invalid状态的块会报逻辑错误,因为没有为这些块记录日志。因此在直接路径之后进行数据备份是很有必要的。

Beginning with release 11.2.0.2 of Oracle Database, you can significantly improve the performance of unrecoverable direct path inserts by disabling the periodic update of the control files. You do so by setting the initialization parameterDB_UNRECOVERABLE_SCN_TRACKING to FALSE. However, if you perform an unrecoverable direct path insert with these control file updates disabled, you will no longer be able to accurately query the database to determine if any data files are currently unrecoverable.

##oracle11.2.0.2开始,使用Direct-path INSERT时,如果你通过把DB_UNRECOVERABLE_SCN_TRACKING 设为FALSE来禁用定期更新控制文件机制,那么数据插入性能会得到极大的提升。但是如果这这样做了,那么你就无法通过数据库查询那些数据文件时无法恢复的了。

See Also:

Oracle Database Backup and Recovery User's Guide for more information about unrecoverable data files

The section "Determining If a Backup Is Required After Unrecoverable Operations" inOracle Data Guard Concepts and Administration

Additional Considerations for Direct-Path INSERT

The following are some additional considerations when using direct-path INSERT.

##下面列出了使用direct-path INSERT的其他注意事项:

Compressed Tables

If a table is created with the basic compression, then you must use direct-pathINSERT to compress table data as it is loaded. If a table is created with OLTP, warehouse, or archive compression, then best compression ratios are achieved with direct-pathINSERT.

See "Consider Using Table Compression" for more information.

##如果你的表是basic压缩表,那么想要插入的数据会被压缩,必须使用direct-path INSERT(此时传统路径插入的数据是不会被压缩的)。如果你的表是oltp,warehouse,archive模式的压缩表,那么direct-pathINSERT方式插入的数据压缩效率是最高的(oltp等模式的压缩表,对普通方式插入的数据也会进行压缩,但不是在插入时即进行压缩)

Index Maintenance with Direct-Path INSERT

Oracle Database performs index maintenance at the end of direct-path INSERT operations on tables (partitioned or nonpartitioned) that have indexes. This index maintenance is performed by the parallel execution servers for parallel direct-pathINSERT or by the single process for serial direct-path INSERT. You can avoid the performance impact of index maintenance by making the index unusable before theINSERT operation and then rebuilding it afterward.

See Also:

"Making an Index Unusable" ##使用直接路径方式加载数据,索引的维护会被放在操作的最后执行。如果你使用的并行直接路径加载,那么维护索引时也会使用并行模式,如果你使用的是串行直接路径加载,维护索引时也是串行。为了避免数据加载过程中维护索引对性能产生影响,我们可以在加载数据之前使所有失效,加载数据之后再手工重建索引。
Space Considerations with Direct-Path INSERT

Direct-path INSERT requires more space than conventional-path INSERT.

##与传统路径插入相比直接路径插入需要更多的磁盘空间

All serial direct-path INSERT operations, as well as parallel direct-pathINSERT into partitioned tables, insert data above the high-water mark of the affected segment. This requires some additional space.

##不论串行或并行直接路径插入都是直接利用高水位线之上的数据块,这样就需要更多的空间(与传统路径插入相比)

Parallel direct-path INSERT into nonpartitioned tables requires even more space, because it creates a temporary segment for each degree of parallelism. If the nonpartitioned table is not in a locally managed tablespace in automatic segment-space management mode, you can modify the values of the NEXT and PCTINCREASE storage parameter andMINIMUM EXTENT tablespace parameter to provide sufficient (but not excess) storage for the temporary segments. Choose values for these parameters so that:

##

The size of each extent is not too small (no less than 1 MB). This setting affects the total number of extents in the object.

The size of each extent is not so large that the parallel INSERT results in wasted space on segments that are larger than necessary.

After the direct-path INSERT operation is complete, you can reset these parameters to settings more appropriate for serial operations.

Locking Considerations with Direct-Path INSERT

During direct-path INSERT, the database obtains exclusive locks on the table (or on all partitions of a partitioned table). As a result, users cannot perform any concurrent insert, update, or delete operations on the table, and concurrent index creation and build operations are not permitted. Concurrent queries, however, are supported, but the query will return only the information before the insert operation.

##direct-path INSERT操作会对表或者分区表的所有分区加上排他锁。因此在direct-path INSERT过程中不能对目标表进行dml操作,查询是可以的(非direct-pathINSERT当前会话),但查询返回的是direct-path INSERT之前的数据。