欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页  >  数据库

rac的parallel_instance_group和instance_groups参数和并行的关

程序员文章站 2024-01-12 15:10:10
...

INSTANCE_GROUPS PropertyDescription Parameter typeString SyntaxINSTANCE_GROUPS = group_name [, group_name ] ... Default valueThere is no default value. ModifiableNo Range of valuesOne or more instance group names, separated by commas Basic

INSTANCE_GROUPS
Property Description
Parameter type String
Syntax INSTANCE_GROUPS = group_name [, group_name ] ...
Default value There is no default value.
Modifiable No
Range of values One or more instance group names, separated by commas
Basic No
Real Application Clusters Multiple instances can have different values.

INSTANCE_GROUPS is a Real Application Clusters parameter that you can specify only in parallel mode. Used in conjunction with the PARALLEL_INSTANCE_GROUPparameter, it lets you restrict parallel query operations to a limited number of instances.

This parameter specifies one or more instance groups and assigns the current instance to those groups. If one of the specified groups is also specified in thePARALLEL_INSTANCE_GROUP parameter, then Oracle allocates query processes for a parallel operation from this instance.

官档描述instance_groups参数是指定的并行模式实例组,联合parallel_instance_group使用,可以指定一个乃至多个实例组。

PARALLEL_INSTANCE_GROUP
Property Description
Parameter type String
Syntax PARALLEL_INSTANCE_GROUP = group_name
Default value A group consisting of all instances currently active
Modifiable ALTER SESSION, ALTER SYSTEM
Range of values Any group name specified in the INSTANCE_GROUPS parameter of any active instance
Real Application Clusters Multiple instances can have different values.

PARALLEL_INSTANCE_GROUP is a Real Application Clusters parameter that you can specify in parallel mode only. Used in conjunction with the INSTANCE_GROUPSparameter, it lets you restrict parallel query operations to a limited number of instances.

This parameter identifies the parallel instance group Oracle will use for spawning parallel execution processes. Parallel operations will spawn parallel execution processes only on instances that specify a matching group in their INSTANCE_GROUPS parameter.

If the value assigned to PARALLEL_INSTANCE_GROUP is the name of a group that does not exist, then the operation runs serially. No parallelism is used.

同样parallel_instance_group参数也是并行模式参数,联合instance_groups使用,oracle只分配给指定实例并行执行,这个实例必须和instance_groups参数里面记录的实例一致。
而如果parallel_instance_group指定的参数在实例组instance_groups中不存在,oracle不能启动并行。
小鱼的英文水平不高,翻译确实存在一些疑义,简单概要下就是instance_groups参数记录的是所有的可供选择并行执行的实例组,而parallel_instance_group则是实际指定的并行执行的实例,下面我们手动来测试下:

默认instance_groups和parallel_instance_group参数都没有设置
SQL> show parameter instance_group;
NAME TYPE VALUE
------------------------------------ ---------------------- ------------------------------
instance_groups string
parallel_instance_group string

此时加上parallel hint对一个数据量较大的视图进行查询
SQL> select /*+parallel(a 2)*/ count(*) from t01_view a;
[root@rac01 ~]# ps -ef|grep p0|grep -v psp0
oracle 10898 1 0 02:10 ? 00:00:00 ora_p000_xiaoyu1
oracle 10900 1 0 02:10 ? 00:00:00 ora_p001_xiaoyu1
root 12081 7995 0 02:11 pts/2 00:00:00 grep p0
[root@rac02 ~]# ps -ef|grep p0|grep -v psp0
oracle 3445 1 0 02:10 ? 00:00:00 ora_p000_xiaoyu2
oracle 3447 1 0 02:10 ? 00:00:00 ora_p001_xiaoyu2
root 4700 9157 0 02:11 pts/2 00:00:00 grep p0
此时这个并行进程多个实例上都拥有,这里说明在默认不设置instance_group和parallel_instance_group参数情况下,并行查询是可以跨越实例的。

而如果我们要控制并行查询在指定的实际执行,也就是不要跨越实例
实例xiaoyu1的参数:
SQL> show parameter instance_group;
NAME TYPE VALUE
------------------------------------ ---------------------- ------------------------------
instance_groups string xiaoyu, xiaoyu1
parallel_instance_group string xiaoyu1
实例xiaoyu2的参数:
SQL> show parameter instance_group;
NAME TYPE VALUE
------------------------------------ ---------------------- ------------------------------
instance_groups string xiaoyu, xiaoyu2
parallel_instance_group string
实例1发起查询
SQL> select /*+parallel(a 2)*/ count(*) from t01_view a;
[root@rac01 ~]# ps -ef|grep p0|grep -v psp0
oracle 22975 1 1 03:00 ? 00:00:00 ora_p000_xiaoyu1
oracle 22977 1 1 03:00 ? 00:00:00 ora_p001_xiaoyu1
oracle 22979 1 0 03:00 ? 00:00:00 ora_p002_xiaoyu1
oracle 22981 1 0 03:00 ? 00:00:00 ora_p003_xiaoyu1
root 23414 7995 0 03:01 pts/2 00:00:00 grep p0
[root@rac02 ~]# ps -ef|grep p0|grep -v psp0
root 14032 9157 0 03:01 pts/2 00:00:00 grep p0
这里看出并行查询只能在instance_groups的实例组中,由于实例xiaoyu2的instance_groups没有设置xiaoyu1,此时并行进程只能在instance_group=’xiaoyu’,’xiaoyu1‘的实例xiaoyu1上运行。

实例1的参数:
SQL> show parameter instance_group;
NAME TYPE VALUE
------------------------------------ ---------------------- ------------------------------
instance_groups string xiaoyu, xiaoyu1
parallel_instance_group string xiaoyu
实例2的参数:
SQL> show parameter instance_group;
NAME TYPE VALUE
------------------------------------ ---------------------- ------------------------------
instance_groups string xiaoyu, xiaoyu2
parallel_instance_group string
实例1发起查询:
SQL> select /*+parallel(a 2)*/ count(*) from t01_view a;
[root@rac01 ~]# ps -ef|grep p0|grep -v psp0
oracle 5431 1 2 03:10 ? 00:00:00 ora_p000_xiaoyu1
oracle 5437 1 5 03:10 ? 00:00:00 ora_p001_xiaoyu1
root 5510 7995 0 03:10 pts/2 00:00:00 grep p0
[root@rac02 ~]# ps -ef|grep p0|grep -v psp0
oracle 28047 1 4 03:10 ? 00:00:00 ora_p000_xiaoyu2
oracle 28049 1 4 03:10 ? 00:00:00 ora_p001_xiaoyu2
root 28177 9157 0 03:10 pts/2 00:00:00 grep p0

实例1发起查询,而实例1和实例2的instance_group中都有xiaoyu这个组,所以这里并行查询可以在实例间交叉运行。

实例1的参数:
SQL> show parameter instance_group;
NAME TYPE VALUE
------------------------------------ ---------------------- ------------------------------
instance_groups string xiaoyu, xiaoyu1
parallel_instance_group string xiaoyu2
实例2的参数:
SQL> show parameter instance_group;
NAME TYPE VALUE
------------------------------------ ---------------------- ------------------------------
instance_groups string xiaoyu, xiaoyu2
parallel_instance_group string
[root@rac01 ~]# ps -ef|grep p0|grep -v psp0
root 22437 7995 0 03:21 pts/2 00:00:00 grep p0
[root@rac02 ~]# ps -ef|grep p0|grep -v psp0
oracle 12003 1 1 03:21 ? 00:00:00 ora_p000_xiaoyu2
oracle 12005 1 1 03:21 ? 00:00:00 ora_p001_xiaoyu2
oracle 12007 1 2 03:21 ? 00:00:00 ora_p002_xiaoyu2
oracle 12009 1 2 03:21 ? 00:00:00 ora_p003_xiaoyu2
root 12183 9157 0 03:21 pts/2 00:00:00 grep p0
此时我们发起查询的会话在实例xiaoyu1,但是并行进程确在实例xiaoyu2。

小鱼在测试过程中还发现了一个需要注意的地方,即使我们设置的parallel_instance_group和实际所有的实例的instance_groups不一致
实例1的参数:
SQL> show parameter instance_group;
NAME TYPE VALUE
------------------------------------ ---------------------- ------------------------------
instance_groups string xiaoyu, xiaoyu1
parallel_instance_group string test
实例2的参数:
SQL> show parameter instance_group;
NAME TYPE VALUE
------------------------------------ ---------------------- ------------------------------
instance_groups string xiaoyu, xiaoyu2
parallel_instance_group string
实例1发起查询
SQL> select /*+parallel(a 2)*/ count(*) from t01_view a;
[root@rac01 ~]# ps -ef|grep p0|grep -v psp0
root 25375 7995 0 03:45 pts/2 00:00:00 grep p0
[root@rac02 ~]# ps -ef|grep p0|grep -v psp0
root 15135 9157 0 03:45 pts/2 00:00:00 grep p0

这里虽然看不到并行进程,但是在10046 trace文件中却看见了并行的执行计划:
select /*+parallel(a 2)*/ count(*)
from
t01_view a

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 5.23 12.76 48196 87280 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 5.24 12.77 48196 87280 0 1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 55

Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE (cr=87280 pr=48196 pw=0 time=12766605 us)
1 PX COORDINATOR (cr=87280 pr=48196 pw=0 time=12766580 us)
1 PX SEND QC (RANDOM) :TQ10001 (cr=87280 pr=48196 pw=0 time=12765838 us)
1 SORT AGGREGATE (cr=87280 pr=48196 pw=0 time=12765825 us)
14487552 HASH JOIN (cr=87280 pr=48196 pw=0 time=396515809 us)
234304 PX RECEIVE (cr=43640 pr=24382 pw=0 time=13355332 us)
234304 PX SEND BROADCAST :TQ10000 (cr=43640 pr=24382 pw=0 time=13121027 us)
234304 PX BLOCK ITERATOR (cr=43640 pr=24382 pw=0 time=12652420 us)
234304 TABLE ACCESS FULL T01 (cr=43640 pr=24382 pw=0 time=12418113 us)
3186304 PX BLOCK ITERATOR (cr=43640 pr=23814 pw=0 time=12745290 us)
3186304 TABLE ACCESS FULL T01 (cr=43640 pr=23814 pw=0 time=6372668 us)

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 2 0.00 0.00
db file scattered read 3116 0.33 7.99
db file sequential read 5 0.00 0.00
SQL*Net message from client 2 0.00 0.00
关于这个现象比较难以解释,官档提示中确实是提到了instance_groups如果没有parallel_instance_group,那么并行执行不会生效,但是这里10046 trace却看见了并行执行,小鱼又跟踪了并行执行:
SQL> alter session set "_px_trace"="all";
Session altered.
SQL> select /*+parallel(a 2)*/ count(*) from t01_view a;

COUNT(*)
----------
14487552
SQL> alter session set "_px_trace"="none";
Session altered.

我们查看下trace文件看看上面这个sql是否真的有过并行执行:
kkfdapdml
pgadep:2 pdml mode:0 PQ allowed DML allowed not autonomous => not allowe
d
kkfdapdml
pgadep:0 pdml mode:0 PQ allowed DML allowed not autonomous => not allowe
d
kxfplist
Getting instance info for open group
kxfralo
serial - instance group has no open members
qerpxStart
rpa:0x0x6d923dd0 rpm:0x(nil)
Current SQL statement for this session:
select /*+parallel(a 2)*/ count(*) from t01_view a
START no parallel resources
qbas:4971
pgakid:2 pgadep:0
qertqiStart
dfo: 0 passthru
qertqiFetch
tqhdl=0x(nil) dfo=0 rows=32767 passthru
kkfdapdml
pgadep:0 pdml mode:0 PQ allowed DML allowed not autonomous => not allowe
d

这里看出Current SQL statement for this session:
select /*+parallel(a 2)*/ count(*) from t01_view a
START no parallel resources
表示并行查询并没有生效,看来10046 trace得出的执行计划和跟踪并行的trace文件是不符合的,而这个问题确实比较疑惑,这里小鱼的crs和database版本都是10.2.0.1的,暂时没有找到合适的资料来进行验证是否是个bug存在,关于并行执行的trace文件分析可以参考圈内的一位大师的一篇文章http://www.hellodba.com/reader.php?ID=74&lang=CN
一般而言在rac环境还是尽量避免实例间交叉并行执行,主要是为了减少cache fusion带来的问题,特别如果sql的性能不好,而寄希望于实例间的parallel,这样可能会适得其反,如果单台服务器的cpu能力还算凑合,小鱼个人是一般喜欢把并行执行控制在本地实例即可。