关于expdp任务异常的处理案例详析
前言
本文主要介绍了关于expdp任务异常处理的相关内容,分享出来供大家参考学习,下面话不多说了,来一起看看详细的介绍吧
环境:aix 6.1 + oracle 10.2.0.4
现象:在xtts迁移测试阶段,遇到执行几个expdp的导出任务,迟迟没有返回任何信息,对应日志无任何输出,查看任务状态:
sql> set lines 300 col owner_name for a10 col operation for a15 col job_mode for a20 col state for a15 select * from dba_datapump_jobs; owner_name job_name operation job_mode state degree attached_sessions datapump_sessions ---------- ------------------------------ --------------- -------------------- --------------- ---------- ----------------- ----------------- sys sys_export_transportable_01 export transportable defining 1 0 1 sys sys_export_transportable_02 export transportable defining 1 1 2 sys sys_export_transportable_03 export transportable defining 1 1 2 sys sys_export_schema_01 export schema defining 1 1 2 sys sys_export_transportable_04 export transportable defining 1 1 2 sys sys_export_schema_02 export schema defining 1 1 2 6 rows selected.
可以看到所有的expdp导出任务的state都停留在defining状态。
1.牛刀小试清异常
先强制杀掉后台执行的所有expdp任务:
ps -ef|grep expdp|grep -v grep|awk '{print $2}'|xargs kill -9
然后尝试删除这些表(其实应该在not running状态下删除)
select 'drop table '||owner_name||'.'||job_name||' purge;' from dba_datapump_jobs where state='not running'; drop table sys.sys_export_transportable_01 purge; ..
可这样是没有作用的,查询结果不变。
甚至尝试正常shutdown immediate停止数据库,也无法成功,告警日志看到有活动调用:
thu nov 1 15:14:24 2018 active call for process 4522064 user 'oracle' program 'oracle@localhost (dm00)' active call for process 4456536 user 'oracle' program 'oracle@localhost (dm01)' active call for process 10027180 user 'oracle' program 'oracle@localhost (dm02)' active call for process 7340140 user 'oracle' program 'oracle@localhost (dm03)' active call for process 6291888 user 'oracle' program 'oracle@localhost (dm04)' active call for process 8126596 user 'oracle' program 'oracle@localhost (dm05)' shutdown: waiting for active calls to complete.
发现这些进程的id都对应了ora_dm的进程:
$ ps -ef|grep ora_dm oracle 4456536 1 0 17:00:09 - 0:00 ora_dm01_xxxxdb oracle 4522064 1 0 16:50:57 - 0:00 ora_dm00_xxxxdb oracle 7340140 1 0 14:06:07 - 0:00 ora_dm03_xxxxdb oracle 8126596 1 0 14:35:03 - 0:00 ora_dm05_xxxxdb oracle 10027180 1 0 13:55:08 - 0:00 ora_dm02_xxxxdb oracle 6291888 1 0 14:31:17 - 0:00 ora_dm04_xxxxdb oracle 7340432 8388786 0 15:22:59 pts/4 0:00 grep ora_dm
实际上,这就是expdp任务的相关进程,强制杀掉这些进程:
ps -ef|grep ora_dm|grep -v grep|awk '{print $2}'|xargs kill -9
之后数据库关闭成功:
thu nov 1 15:24:37 2018 all dispatchers and shared servers shutdown thu nov 1 15:24:37 2018 alter database close normal
启动数据库后,再次查询发现已经成功清理:
sql> set lines 300 col owner_name for a10 col operation for a15 col job_mode for a20 col state for a15 select * from dba_datapump_jobs; no rows selected
小结:数据泵任务与ora_dm进程相关;如果数据泵任务发生异常,但任务并没有退出的情况,需要同时杀掉这类进程(杀掉后状态就会变为not running)。关库不是必须的,只是演示此时正常关闭被阻塞的场景。这也能说明为什么要保证在not running状态下才可以清理。
2.追本溯源查mos
上面的步骤只是清理了异常的数据泵任务,但没有解决问题,再次后台执行备份任务依然会重现故障:
nohup sh expdp_xtts.sh &
$ ps -ef|grep expdp oracle 6684914 8061208 0 15:30:07 pts/2 0:00 grep expdp oracle 7143482 8061208 0 15:30:03 pts/2 0:00 sh expdp_xtts.sh oracle 6685096 7143482 0 15:30:03 pts/2 0:00 expdp '/ as sysdba' parfile=expdp_xtts.par $ ps -ef|grep ora_dm oracle 7602308 8061208 0 15:30:10 pts/2 0:00 grep ora_dm oracle 3997964 1 1 15:30:05 - 0:00 ora_dm00_xxxxdb $
此时查询dba_datapump_jobs,state依然一直是defining状态:
owner_name job_name operation job_mode state degree attached_sessions datapump_sessions ---------- ------------------------------ --------------- ------------------------------ ------------------------------ ---------- ----------------- ----------------- sys sys_export_transportable_01 export transportable defining 1 1 2
其他的导出任务都一样,不再赘述。
为了方便测试,写一个简单的单表expdp导出,现象也一样。
expdp \'/ as sysdba\' directory=xtts tables=query.test dumpfile=query_test.dmp logfile=query_test.log
根据故障现象,用如下关键字在mos中搜索: expdp state defining,匹配到文档:
datapump export/import hangs with "defining" status when using a directory on nfs filesystem (文档 id 2262196.1)
正好这次测试是在nfs文件系统上,mos建议移动到本地文件系统导出。
这次再将expdp进程全部杀掉:
ps -ef|grep ora_dm|grep -v grep|awk '{print $2}'|xargs kill -9 ps -ef|grep expdp|grep -v grep|awk '{print $2}'|xargs kill -9
此时查询dba_datapump_jobs:
owner_name job_name operation job_mode state degree attached_sessions datapump_sessions ---------- ------------------------------ --------------- ------------------------------ --------------- ---------- ----------------- ----------------- sys sys_export_table_04 export table not running 0 0 0 sys sys_export_schema_01 export schema not running 0 0 0 sys sys_export_table_02 export table not running 0 0 0 sys sys_export_table_05 export table not running 0 0 0 sys sys_export_table_03 export table not running 0 0 0 sys sys_export_table_01 export table not running 0 0 0 sys sys_export_transportable_01 export transportable not running 0 0 0 7 rows selected.
清理not running的master表:
select 'drop table '||owner_name||'.'||job_name||' purge;' from dba_datapump_jobs where state='not running'; --执行结果用来执行,再次查看结果为空: sql> select * from dba_datapump_jobs; no rows selected
按mos建议,将导出任务移动到本地文件系统:
aix源端导出xtts源数据至源端/hxbak/xtts_exp目录中,而后copy至nfs共享存储/xtts/dmp中:
mkdir /hxbak/xtts_exp chown oracle:dba /hxbak/xtts_exp ls -ld /hxbak/xtts_exp select * from dba_directories; create or replace directory xtts as '/hxbak/xtts_exp';
此时测试expdp任务可正常运行:
$ expdp \'/ as sysdba\' directory=xtts tables=query.test dumpfile=query_test.dmp logfile=query_test.log export: release 10.2.0.4.0 - 64bit production on thursday, 01 november, 2018 16:03:21 copyright (c) 2003, 2007, oracle. all rights reserved. connected to: oracle database 10g enterprise edition release 10.2.0.4.0 - 64bit production with the partitioning, olap, data mining and real application testing options starting "sys"."sys_export_table_01": '/******** as sysdba' directory=xtts tables=query.test dumpfile=query_test.dmp logfile=query_test.log estimate in progress using blocks method... processing object type table_export/table/table_data total estimation using blocks method: 8 mb processing object type table_export/table/table processing object type table_export/table/statistics/table_statistics . . exported "query"."test" 6.743 mb 72593 rows master table "sys"."sys_export_table_01" successfully loaded/unloaded ****************************************************************************** dump file set for sys.sys_export_table_01 is: /hxbak/xtts_exp/query_test.dmp job "sys"."sys_export_table_01" successfully completed at 16:03:57 sql> select * from dba_datapump_jobs; owner_name job_name operation job_mode state degree attached_sessions datapump_sessions ---------- ------------------------------ --------------- ------------------------------ --------------- ---------- ----------------- ----------------- sys sys_export_table_01 export table executing 1 1 3
再次导出其他元数据:
#expdp_xtts.sh (about 5min) nohup sh expdp_xtts.sh & #expdp_xtts_other.sh(about 5min) nohup sh expdp_xtts_other.sh & #expdp_tmp_table nohup sh expdp_tmp_table01.sh & nohup sh expdp_tmp_table02.sh & nohup sh expdp_tmp_table03.sh & nohup sh expdp_tmp_table04.sh &
最后将这些导出文件再移动到/xtts/dmp/下,供后续xtts测试目标端导入使用:
$ pwd /hxbak/xtts_exp $ cp -rp * /xtts/dmp/
目标端导入时只需要有读这些文件的权限,即可,实际测试恢复ok。
小结:
在自己的linux环境测试过是可以直接expdp到nfs文件系统的,aix看来有区别,mos的建议也只是一个workaround,但也能满足需求,毕竟元数据导出文件没多大。
总结
以上就是这篇文章的全部内容了,希望本文的内容对大家的学习或者工作具有一定的参考学习价值,如果有疑问大家可以留言交流,谢谢大家对的支持。
推荐阅读