SQL Server一个特殊的阻塞案例分析2
最近发现一个非常奇怪的阻塞问题,如下截图所示(来自监控工具dpa),会话583被会话1036阻塞,而且阻塞发生在tempdb,被阻塞的sql如下截图所示,会话等待类型为lck_m_s
|
|
因为dpa工具不好截取全部信息,使用下面sql语句获取了阻塞会话的详细信息如下,来自microsoft sql server management studio - transact-sql intellisense的的sql被阻塞了
select s.session_id ,
r.status ,
r.blocking_session_id 'blk by' ,
r.wait_type ,
wait_resource ,
r.wait_time / ( 1000 * 60 ) 'wait m' ,
r.cpu_time ,
r.logical_reads ,
r.reads ,
r.writes ,
r.total_elapsed_time / ( 1000 * 60 ) 'elaps m' ,
substring(st.text, ( r.statement_start_offset / 2 ) + 1,
( ( case r.statement_end_offset
when -1 then datalength(st.text)
else r.statement_end_offset
end - r.statement_start_offset ) / 2 ) + 1) as statement_text ,
coalesce(quotename(db_name(st.dbid)) + n'.'
+ quotename(object_schema_name(st.objectid, st.dbid)) + n'.'
+ quotename(object_name(st.objectid, st.dbid)), '') as command_text ,
r.command ,
s.login_name ,
s.host_name ,
s.program_name ,
s.last_request_end_time ,
s.login_time ,
r.open_transaction_count
from sys.dm_exec_sessions as s
join sys.dm_exec_requests as r on r.session_id = s.session_id
cross apply sys.dm_exec_sql_text(r.sql_handle) as st
where r.session_id =583
order by r.cpu_time desc;
而会话1036处于睡眠(sleeping)状态,有一个open的事务。通过wait_resource的值“key: 2:1125899909070848 (79c68d10aa42)”,我们定位到是一个系统表sysschobj.它是sys.triggers这个系统视图的基表。如下所示:
set quoted_identifier on
set ansi_nulls on
go
create view sys.triggers as
select o.name,
object_id = o.id,
parent_class = o.pclass,
parent_class_desc = pc.name,
parent_id = o.pid,
type = o.type,
type_desc = n.name,
create_date = o.created,
modify_date = o.modified,
is_ms_shipped = sysconv(bit, o.status & 1), -- objall_msshipped
is_disabled = sysconv(bit, o.status & 256), -- objtrg_disabled
is_not_for_replication = sysconv(bit, o.status & 512), -- objtrg_notforrepl
is_instead_of_trigger = sysconv(bit, o.status & 1024) -- objtrg_insteadof
from sys.sysschobjs o
left join sys.syspalnames n on n.class = 'obty' and n.value = o.type
left join sys.syspalvalues pc on pc.class = 'uncl' and pc.value = o.pclass
where o.type in ('ta','tr') and o.pclass <> 100
and has_access('tr', o.id, o.pid, o.nsclass) = 1
go
使用下面脚本可以查到对应的锁信息(其实,正确的做法应该是加上条件过滤spid),从而可以判断,会话1036可能因为事务中,间接涉及对基表sysschobj的dml操作(有可能是衍生的系统视图),而由于事务没有提交也没有释放,导致在基表上的锁没有释放,从而导致出现这样一个阻塞。个人也很好奇是什么样的sql会导致这样一个阻塞出现,不过这种阻塞源头是sleeping状态的会话,已经无法定位阻塞源头sql,只能通过profile去跟踪捕获。但是这种阻塞出现时机和场景不清楚,很难一下子捕获到。
use tempdb
go
select request_session_id as spid ,
db_name(resource_database_id) as dbname ,
case when resource_type = 'object'
then object_name(resource_associated_entity_id)
when resource_associated_entity_id = 0 then 'n/a'
else object_name(p.object_id)
end as entity_name ,
index_id ,
resource_type as resource ,
resource_description as description ,
request_mode as mode ,
request_status as status
from sys.dm_tran_locks t
left join sys.partitions p
on p.partition_id = t.resource_associated_entity_id
where resource_database_id = db_id()
今天早上有发现同样的阻塞,猜测是sql里面有创建临时表或对临时表做ddl或dml操作的迹象。不过依然没有找到源头sql,只能静候下次机会。找出“罪魁祸首”。不过让我头痛的是,对于这种特殊阻塞情况,即使是使用profile跟踪也是相当麻烦,因为不知道它什么时间点出现、出现在那个会话。
select tl.resource_type as [resource type] ,
db_name(tl.resource_database_id) as [db name] ,
case tl.resource_type
when 'object'
then object_name(tl.resource_associated_entity_id,
tl.resource_database_id)
when 'database' then 'db'
else case when tl.resource_database_id = db_id()
then ( select object_name(object_id,
tl.resource_database_id)
from sys.partitions
where hobt_id = tl.resource_associated_entity_id
)
else '(run under db context)'
end
end as [object] ,
tl.resource_description as [resource] ,
tl.request_session_id as [session] ,
tl.request_mode as [mode] ,
tl.request_status as [status] ,
wt.wait_duration_ms as [wait (ms)] ,
qi.sql ,
qi.query_plan
from sys.dm_tran_locks tl with ( nolock )
left outer join sys.dm_os_waiting_tasks wt with ( nolock )
on tl.lock_owner_address = wt.resource_address
and tl.request_status = 'wait'
outer apply ( select substring(s.text,
( er.statement_start_offset / 2 ) + 1,
( ( case er.statement_end_offset
when -1 then datalength(s.text)
else er.statement_end_offset
end - er.statement_start_offset ) / 2 )
+ 1) as sql ,
qp.query_plan
from sys.dm_exec_requests er with ( nolock )
cross apply sys.dm_exec_sql_text(er.sql_handle) s
outer apply sys.dm_exec_query_plan(er.plan_handle) qp
where tl.request_session_id = er.session_id
) qi
where tl.request_session_id =134
链接资料