close
文章出處

記一次公司倉庫數據庫服務器死鎖過程

 

倉庫揀貨卡死,排查了數據庫的很多地方,都沒有頭緒,最后到SQL Server 錯誤日志里查看,終于發現了蛛絲馬跡

EXEC xp_readerrorlog 0,1,NULL,NULL,'2015-09-21','2015-10-10','DESC'
     waiter id=process5c30e08 mode=U requestType=wait
    waiter-list
     owner id=process5c26988 mode=X
    owner-list
   keylock hobtid=72057597785604096 dbid=33 objectname=stoxxx.dbo.Orderxxx indexname=IX_PricingExpressProductCode_State id=lock17fa96980 mode=X associatedObjectId=72057597785604096
     waiter id=process5c26988 mode=U requestType=wait
    waiter-list
     owner id=process5c30e08 mode=X
    owner-list
   keylock hobtid=72057597785604096 dbid=33 objectname=stoxxx.dbo.Orderxxx indexname=IX_PricingExpressProductCode_State id=lock87d69e780 mode=X associatedObjectId=72057597785604096
  resource-list
(@OperateState money,@HandledByNewWms bit,@State int,@OrderOut int)
UPDATE [Orderxx] SET [OperateState] = @OperateState,[HandledByNewWms] = @HandledByNewWms WHERE (([Orderxxx].[State] = @State) And ([Orderxxx].[OrderOut] = @OrderOut) And ([Orderxxx].[PricingExpressProductCode] IN ('UKNIR')))    
    inputbuf
unknown     
     frame procname=unknown line=1 sqlhandle=0x000000000000000000000000000000000000000000000000
UPDATE [Orderxxx] SET [OperateState] = @OperateState,[HandledByNewWms] = @HandledByNewWms WHERE (([Orderxxx].[State] = @State) And ([Orderxxx].[OrderOut] = @OrderOut) And ([Orderxxx].[PricingExpressProductCode] IN ('UKNIR')))     
     frame procname=adhoc line=1 stmtstart=134 sqlhandle=0x020000009d376d18a17e7ea51289d8caa2fb4de65c976389
    executionStack
   process id=process5c30e08 taskpriority=0 logused=10320 waitresource=KEY: 33:72057597785604096 (112399c2054a) waittime=4813 ownerId=31578743038 transactionname=user_transaction lasttranstarted=2015-09-24T10:22:58.410 XDES=0x372e95950 lockMode=U schedulerid=17 kpid=8496 status=suspended spid=153 sbid=0 ecid=0 priority=0 trancount=2 lastbatchstarted=2015-09-24T10:22:58.540 lastbatchcompleted=2015-09-24T10:22:58.540 clientapp=.Net SqlClient Data Provider hostname=CK1-WIN-WEB02 hostpid=37992 loginname=ck1.biz isolationlevel=read committed (2) xactid=31578743038 currentdb=33 lockTimeout=4294967295 clientoption1=671088672 clientoption2=128056
(@OperateState money,@HandledByNewWms bit,@State int,@OrderOut int)UPDATE [Orderxxx] SET [OperateState] = @OperateState,[HandledByNewWms] = @HandledByNewWms WHERE (([Orderxxx].[State] = @State) And ([Orderxxx].[OrderOut] = @OrderOut) And ([Orderxxx].[PricingExpressProductCode] IN ('UKNIR')))    
    inputbuf
unknown     
     frame procname=unknown line=1 sqlhandle=0x000000000000000000000000000000000000000000000000
UPDATE [Orderxxx] SET [OperateState] = @OperateState,[HandledByNewWms] = @HandledByNewWms WHERE (([Orderxxx].[State] = @State) And ([Orderxxx].[OrderOut] = @OrderOut) And ([Orderxxx].[PricingExpressProductCode] IN ('UKNIR')))     
     frame procname=adhoc line=1 stmtstart=134 sqlhandle=0x020000009d376d18a17e7ea51289d8caa2fb4de65c976389
    executionStack
   process id=process5c26988 taskpriority=0 logused=9892 waitresource=KEY: 33:72057597785604096 (70f5b089bb2b) waittime=4813 ownerId=31579268946 transactionname=user_transaction lasttranstarted=2015-09-24T10:27:01.357 XDES=0x98312f950 lockMode=U schedulerid=16 kpid=9184 status=suspended spid=454 sbid=0 ecid=0 priority=0 trancount=2 lastbatchstarted=2015-09-24T10:27:01.490 lastbatchcompleted=2015-09-24T10:27:01.487 clientapp=.Net SqlClient Data Provider hostname=CK1-WIN-WEB02 hostpid=37992 loginname=ck1.biz isolationlevel=read committed (2) xactid=31579268946 currentdb=33 lockTimeout=4294967295 clientoption1=671088672 clientoption2=128056
  process-list
 deadlock victim=process5c26988
deadlock-list

 

咋一看上面的錯誤信息,可以發現兩條相同的語句造成的死鎖,但是這么短的語句不可能持有排他鎖太久

 

 

再仔細分析一下錯誤日志,發現都死鎖在同一個非聚集索引上,再問了一下開發,開發那邊說,這條語句是在一個大事務里面,這個事務會做7、8件事

 

索引屬性

 

 

還有索引里面的數據,發現很多重復值

 

 

SQL語句是這樣的

(@OperateState money,@HandledByNewWms bit,@State int,@OrderOut int)
@HandledByNewWms=(1)  @OperateState=($1.0000)  @OrderOut=(4055484)  @State=(3) 

UPDATE [Orderxxx] SET [OperateState] = $1.0000,[HandledByNewWms] = 1
WHERE (([Orderxxx].[State] = 3) And ([Orderxxx].[OrderOut] = 4055484) And ([Orderxxx].[PricingExpressProductCode] IN ('UKRRM','UKRLE')))

 

 

下圖為語句生成的執行計劃

 

 

當時的情況是大量SQL語句被阻塞,而阻塞的語句正是下面這條語句

UPDATE [Orderxxx] SET [OperateState] = $1.0000,[HandledByNewWms] = 1
WHERE (([Orderxxx].[State] = 3) And ([Orderxxx].[OrderOut] = 4055484) And ([Orderxxx].[PricingExpressProductCode] IN ('UKRRM','UKRLE')))

 

 

解決方法

上面得出幾個癥狀

1、update語句是在一個大事務里面,事務太大導致其他session等待排他鎖的時間變長

2、大家都在使用同一個非聚集索引,并掃描PricingExpressProductCode字段

3、索引里的重復值很多

 

從上面的癥狀基本可以判斷,這個非聚集索引無啥用,可以禁用之

ALTER INDEX [IX_PricingExpressProductCode_State] ON [dbo].[Orderxxx] DISABLE

禁用之后,死鎖消失,問題解決,倉庫的怨氣也隨之消失

 

這一次排查過程時間有點長,但是很好定位,SQL Server錯誤日志給出了足夠的信息定位死鎖問題,所以遇到問題的時候一定要分析清楚日志

 

 

如有不對的地方,歡迎大家拍磚o(∩_∩)o 


文章列表


不含病毒。www.avast.com
arrow
arrow
    全站熱搜
    創作者介紹
    創作者 AutoPoster 的頭像
    AutoPoster

    互聯網 - 大數據

    AutoPoster 發表在 痞客邦 留言(0) 人氣()