A database may go into suspect mode for various reason most of
them are related to corruption of disk or data but SQL server may also put a
database in a suspect mode if it’s not able to recover the database properly on
restart
We had one such scenario recently, On a restart of planned
outage one of the database went in to Suspect mode and we saw the following
error in the SQL error log,
2014-08-01 10:37:31.760
spid5s Attempting to recover
in-doubt distributed transactions involving Microsoft Distributed Transaction
Coordinator (MS DTC). This is an informational message only. No user action is required.
2014-08-01 10:39:13.480
spid5s SQL Server detected a DTC/KTM
in-doubt transaction with UOW
{14EBFD4C-0955-4C25-91B1-C418EE182E64}.Please resolve it following the
guideline for Troubleshooting DTC Transactions.
2014-08-01 10:39:13.480
spid5s Error: 3437, Severity: 21,
State: 3.
2014-08-01 10:39:13.480 spid5s An error occurred while recovering
database 'DBName'. Unable to connect to Microsoft Distributed Transaction
Coordinator (MS DTC) to check the completion status of transaction
(0:75299367). Fix MS DTC, and run recovery again.
2014-08-01 10:39:13.480
spid5s Error: 3414, Severity: 21,
State: 2.
2014-08-01 10:39:13.480
spid5s An error occurred during
recovery, preventing the database 'DBNAME' (6:0) from restarting. Diagnose the
recovery errors and fix them, or restore from a known good backup. If errors
are not corrected or expected, contact Technical Support.
We already know the application using the database uses
MSDTC for various sources and any unexpected restart put an In-doubt
transaction on DTC. Basically In-Doubt transactions are transaction which the
DTC services cannot decide to abort or resolve by itself but since the UOW for
the transaction exists
in SQL, SQLOS does not differentiate it aa transaction in
In-doubt status and kill it and the transaction will be marked as active till
is resolved or aborted from DTC. This is similar to the topic which I wrote
earlier in the blog regarding “-2 SPID Blocking”.
when the database tries to recover a DB with UOW of in-doubt
transaction on restart it could not rollback or roll forward the transaction so
SQL puts database eventually in suspect mode failing to recover completely. You
go Component Services->Computers->My Computer->Distributed transaction
Coordinator->Local DTC->Transaction List for standalone servers or Component Services->Computers->My
Computer->Distributed transaction Coordinator->Clustered
DTC->Transaction List for a cluster instance, Right Click on all transaction
marked In-Doubt either commit or abort the transaction depending on your
requirements. And restart the SQL Services which should recover the database
clean.
But the clean way of doing a restart of a SQL with DTC
transactions is to switch of the application before the SQL restart which will avoid
this issue in the first place. Hope this article helps!
Thank you a LOT! You just saved me from full backup recovering a 4TB+ DB! NICE work!
ReplyDeletevery informative! thanks for sharing. It is very impotent to look in to error log before anything can be done. it is very tough to get a suspect db back. The SQL Recovery tool is much helpful to recover SQL database from suspect mode.
ReplyDeleteindeed very good...
ReplyDeleteThanks... I am facing the issue. It is still relevant even after 4 years.
ReplyDeleteYeah, looks like fixing such issues is not so fast.
DeleteNice review this is helpful. I am looking for a script that will always start DTC Service first on start up then once running start SQL Server. can this be forced in a script or at the registry level?
ReplyDelete