[20190401]那个更快的疑问.txt
[20190401]那个更快的疑问.txt
--//前一阵子,做了11g于10g下,单表单条记录唯一索引扫描的测试,摘要如下:
--//参考链接:
http://blog.itpub.net/267265/viewspace-2636321/
http://blog.itpub.net/267265/viewspace-2636342/
1.环境:
--//当时的测试,在11g下测试结果如下:
scott@book> select method,count(*),round(avg(time_ela),0),sum(time_ela) from job_times group by method order by 3 ;
method count(*) round(avg(time_ela),0) sum(time_ela)
-------------------- ---------- ---------------------- -------------
result_cache 50 8611 430536
id=1_unique_index 50 9494 474714
null 50 10664 533197
id=1_index 50 28160 1407987
notnull 50 29279 1463928
--//在10g下测试结果如下:
scott@test> select method,count(*),round(avg(time_ela),0),sum(time_ela) from job_times group by method order by 3 ;
method count(*) round(avg(time_ela),0) sum(time_ela)
-------------------- ---------- ---------------------- -------------
id=1_unique_index 50 4864 243192
notnull 50 34134 1706713
id=1_index 50 34703 1735173
null 50 37234 1861717
--//我的测试环境服务器硬件相同,os版本一样,对比可以发现id=1_unique_index的情况下,10g比11g快了1倍(指id=1_unique_index的情况).
--//而其他方式下11g都明显快于10g,而10g下除了id=1_unique_index下其他执行方式都可以看到大量cbc latch等待事件.
--//而11g仅仅在id=1_index,notnull下看到大量cbc latch等待时间,null方式下(全表扫描)的情况下反而看不到cbc
--//latch等待事件.
--//我一直再想,我是不是测试方法存在什么问题,或者11g做了什么改进?重复测试唯一索引的情况看看:
1.环境:
scott@book> @ ver1
port_string version banner
------------------------------ -------------- --------------------------------------------------------------------------------
x86_64/linux 2.4.xx 11.2.0.4.0 oracle database 11g enterprise edition release 11.2.0.4.0 - 64bit production
create table t as select rownum id from dual ;
create unique index pk_t on t(id);
create table job_times (sid number, time_ela number,method varchar2(20));
--//分析表略.
$ cat m1.txt
set verify off
host sleep $(echo &&3/50 | bc -l )
insert into job_times values ( sys_context ('userenv', 'sid') ,dbms_utility.get_time ,'&&2') ;
commit ;
declare
v_id number;
v_d date;
begin
for i in 1 .. &&1 loop
select /*+ &&3 */ count (*) into v_id from t where id=1;
--//select /*+ &&3 */ count (*) into v_id from t ;
--//select /*+ &&3 */ 1 into v_id from dual ;
--//select /*+ &&3 */ sysdate into v_d from dual ;
end loop;
end ;
/
update job_times set time_ela = dbms_utility.get_time - time_ela where sid=sys_context ('userenv', 'sid') and method='&&2';
commit;
quit
$ sqlplus -s -l scott/book @m1.txt 1e6 id=1_unique_index 0 >/dev/null
scott@book> select method,count(*),round(avg(time_ela),0),sum(time_ela) from job_times group by method order by 3 ;
method count(*) round(avg(time_ela),0) sum(time_ela)
-------------------- ---------- ---------------------- -------------
id=1_unique_index 1 2615 2615
--//在10g环境下重复上面的步骤略.环境:
scott@test> @ &r/ver1
port_string version banner
------------------------------ -------------- ----------------------------------------------------------------
x86_64/linux 2.4.xx 10.2.0.4.0 oracle database 10g enterprise edition release 10.2.0.4.0 - 64bi
$ sqlplus -s -l scott/btbtms @m1.txt 1e6 id=1_unique_index 0 >/dev/null
scott@test> select method,count(*),round(avg(time_ela),0),sum(time_ela) from job_times group by method order by 3 ;
method count(*) round(avg(time_ela),0) sum(time_ela)
-------------------- ---------- ---------------------- -------------
id=1_unique_index 1 1535 1535
--//可以看出即使我单个用户执行相似的sql语句情况下,唯一索引查询10g下明显快于11g.
2.使用strace跟踪看看:
--//执行1e6次有点慢,改成1e5看看.
--//11g的测试:
$ strace -f -c sqlplus -s -l scott/book @m1.txt 1e5 id=1_unique_index 0 >/dev/null
...
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
72.78 0.043919 0 700708 getrusage
19.80 0.011947 0 200179 times
6.67 0.004024 671 6 3 wait4
0.24 0.000147 29 5 clone
0.18 0.000106 0 375 2 read
0.16 0.000096 0 264 106 open
...
------ ----------- ----------- --------- --------- ----------------
100.00 0.060341 902967 206 total
--//10g的测试:
$ strace -f -c sqlplus -s -l scott/btbtms @m1.txt 1e5 id=1_unique_index 0 >/dev/null
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
72.00 0.056425 0 700486 getrusage
24.98 0.019573 2796 7 2 wait4
0.91 0.000714 3 236 read
0.74 0.000582 97 6 clone
0.73 0.000572 286 2 shmdt
0.23 0.000177 1 302 189 open
0.16 0.000122 1 145 108 stat
0.05 0.000042 1 65 write
..
0.03 0.000025 0 116 times
------ ----------- ----------- --------- --------- ----------------
100.00 0.097352 702291 347 total
--//差异在于10g很少做times的系统调用上.10g下仅仅116次.而11g高达200179.
--//如果再次执行跟踪如下,11g:
$ strace -f -p 57003
times({tms_utime=1948, tms_stime=328, tms_cutime=0, tms_cstime=0}) = 10865304314
times({tms_utime=1948, tms_stime=328, tms_cutime=0, tms_cstime=0}) = 10865304314
getrusage(rusage_self, {ru_utime={19, 483038}, ru_stime={3, 285500}, ...}) = 0
getrusage(rusage_self, {ru_utime={19, 483038}, ru_stime={3, 285500}, ...}) = 0
getrusage(rusage_self, {ru_utime={19, 483038}, ru_stime={3, 285500}, ...}) = 0
getrusage(rusage_self, {ru_utime={19, 483038}, ru_stime={3, 285500}, ...}) = 0
getrusage(rusage_self, {ru_utime={19, 483038}, ru_stime={3, 285500}, ...}) = 0
getrusage(rusage_self, {ru_utime={19, 483038}, ru_stime={3, 285500}, ...}) = 0
getrusage(rusage_self, {ru_utime={19, 483038}, ru_stime={3, 285500}, ...}) = 0
times({tms_utime=1948, tms_stime=328, tms_cutime=0, tms_cstime=0}) = 10865304314
times({tms_utime=1948, tms_stime=328, tms_cutime=0, tms_cstime=0}) = 10865304314
getrusage(rusage_self, {ru_utime={19, 483038}, ru_stime={3, 285500}, ...}) = 0
getrusage(rusage_self, {ru_utime={19, 483038}, ru_stime={3, 285500}, ...}) = 0
getrusage(rusage_self, {ru_utime={19, 483038}, ru_stime={3, 285500}, ...}) = 0
getrusage(rusage_self, {ru_utime={19, 483038}, ru_stime={3, 285500}, ...}) = 0
getrusage(rusage_self, {ru_utime={19, 483038}, ru_stime={3, 285500}, ...}) = 0
getrusage(rusage_self, {ru_utime={19, 483038}, ru_stime={3, 285500}, ...}) = 0
getrusage(rusage_self, {ru_utime={19, 483038}, ru_stime={3, 285500}, ...}) = 0
times({tms_utime=1948, tms_stime=328, tms_cutime=0, tms_cstime=0}) = 10865304314
times({tms_utime=1948, tms_stime=328, tms_cutime=0, tms_cstime=0}) = 10865304314
--//调用7次getrusage,调用2次times.
--//对比前面的调用比例也可以看出getrusage调用700708,times调用times200179.非常接近7:2
--//而10g下仅仅看到:
getrusage(rusage_self, {ru_utime={13, 203992}, ru_stime={1, 700741}, ...}) = 0
getrusage(rusage_self, {ru_utime={13, 203992}, ru_stime={1, 700741}, ...}) = 0
getrusage(rusage_self, {ru_utime={13, 203992}, ru_stime={1, 700741}, ...}) = 0
getrusage(rusage_self, {ru_utime={13, 203992}, ru_stime={1, 700741}, ...}) = 0
getrusage(rusage_self, {ru_utime={13, 203992}, ru_stime={1, 700741}, ...}) = 0
getrusage(rusage_self, {ru_utime={13, 203992}, ru_stime={1, 700741}, ...}) = 0
getrusage(rusage_self, {ru_utime={13, 203992}, ru_stime={1, 700741}, ...}) = 0
getrusage(rusage_self, {ru_utime={13, 203992}, ru_stime={1, 700741}, ...}) = 0
getrusage(rusage_self, {ru_utime={13, 203992}, ru_stime={1, 700741}, ...}) = 0
getrusage(rusage_self, {ru_utime={13, 203992}, ru_stime={1, 700741}, ...}) = 0
getrusage(rusage_self, {ru_utime={13, 203992}, ru_stime={1, 700741}, ...}) = 0
getrusage(rusage_self, {ru_utime={13, 203992}, ru_stime={1, 700741}, ...}) = 0
--//正是这样的差异导致10g下明显快于11g.
--//顺便看看times输出表示什么?
$ man 2 times
name
times - get process times
synopsis
#include <sys/times.h>
clock_t times(struct tms *buf);
description
times() stores the current process times in the struct tms that buf points to. the struct tms is as defined in <sys/times.h>:
struct tms {
clock_t tms_utime; /* user time */
clock_t tms_stime; /* system time */
clock_t tms_cutime; /* user time of children */
clock_t tms_cstime; /* system time of children */
};
the tms_utime field contains the cpu time spent executing instructions of the calling process. the tms_stime
field contains the cpu time spent in the system while executing tasks on behalf of the calling process. the
tms_cutime field contains the sum of the tms_utime and tms_cutime values for all waited-for terminated children.
the tms_cstime field contains the sum of the tms_stime and tms_cstime values for all waited-for terminated
children.
times for terminated children (and their descendants) is added in at the moment wait(2) or waitpid(2) returns
their process id. in particular, times of grandchildren that the children did not wait for are never seen.
all times reported are in clock ticks.
return value
times() returns the number of clock ticks that have elapsed since an arbitrary point in the past. for linux 2.4
and earlier this point is the moment the system was booted. since linux 2.6, this point is (2^32/hz) - 300
(i.e., about 429 million) seconds before system boot time. the return value may overflow the possible range of
type clock_t. on error, (clock_t) -1 is returned, and errno is set appropriately.
$ cat /proc/uptime ;uptime
104375436.65 104342102.07
10:12:33 up 1208 days, 1:10, 2 users, load average: 0.01, 0.01, 0.07
--//我的计算:
2^32 = 4294967296
4294967296-300 = 4294966996 ,文档提到 429 million seconds,不对明显相差10被.难道hz的单位1/10秒吗?不懂.
4294967296/60-300 = 429496429.6
10437543665+429496429.6 = 10867040094.6
--//这样与上面的结果比较接近了,再次验证看看.
$ strace -f -p 57447 ; cat /proc/uptime
...
getrusage(rusage_self, {ru_utime={22, 765539}, ru_stime={3, 829417}, ...}) = 0
times({tms_utime=2276, tms_stime=382, tms_cutime=0, tms_cstime=0}) = 10865578855
times({tms_utime=2276, tms_stime=382, tms_cutime=0, tms_cstime=0}) = 10865578855
getrusage(rusage_self, {ru_utime={22, 765539}, ru_stime={3, 829417}, ...}) = 0
getrusage(rusage_self, {ru_utime={22, 765539}, ru_stime={3, 829417}, ...}) = 0
getrusage(rusage_self, {ru_utime={22, 765539}, ru_stime={3, 829417}, ...}) = 0
getrusage(rusage_self, {ru_utime={22, 765539}, ru_stime={3, 829417}, ...}) = 0
getrusage(rusage_self, {ru_utime={22, 765539}, ru_stime={3, 829417}, ...}) = 0
getrusage(rusage_self, {ru_utime={22, 765539}, ru_stime={3, 829417}, ...}) = 0
getrusage(rusage_self, {ru_utime={22, 765539}, ru_stime={3, 829417}, ...}) = 0
^cprocess 57447 detached
104377639.45 2484149083.69
4294967296/2593.795/1^6 -300 = 1655562.27747374021462760164
104377639.45+1655562.27747374021462760164
10437763945+(4294967296/10-300) = 10867260374.6
--//存在很大的误差放弃!!