SQL Server 2005 x64: Intermittent "Error: 18456, Severity: 14, State: 10."
Hi,
We are experiencing intermittent authentication errors "Error: 18456, Severity: 14, State: 10" on a customer's production server. This is a new server that has just been rolled out in the past several months. Rebooting the server appears to make the problem go away. We are using SQL authentication from a separate server that is running IIS. The application always uses the same username and password to connect.
Server info:
select@@version
Microsoft SQL Server 2005 - 9.00.2047.00 (X64)
Apr 14 2006 01:11:53
Copyright (c) 1988-2005 Microsoft Corporation
Standard Edition (64-bit) on Windows NT 5.2 (Build 3790: Service Pack 1)
select@@version
SP1
Sample error from SQL Server errorlog:
Date 7/7/2006 10:53:24 AM
Log SQL Server (Archive #2 - 7/7/2006 12:20:00 PM)
Source Logon
Message
Error: 18456, Severity: 14, State: 10.
MSDN (http://msdn2.microsoft.com/en-us/library/ms366351.aspx) lists various states for this error message but 10 is not included, and it says "Other error states exist and signify an unexpected internal processing error."
Note: This was not a case of a transient error that occurs only when SQL Server is starting up - these errors occurred at a variety of times, two months after the server was last rebooted / SQL Server last restarted.
Thanks for any help you can provide.
Regards,
-Frank.
[1800 byte] By [
fdc2005] at [2007-12-24]
Hi Sung,
I had considered that possibility, but it seems unlikely to be the problem. The server, SQL Server, and the database referenced in the connection had all been online for several weeks at least...
Thanks,
Frank.
Hi Frank,
Can you also try what Laurentiu suggests from the other thread? Quoted below:
NOTE: Please use a FAKE password when creating a dump.
"You can start SQL Server with "-y 18456", to produce a dump when the error occurs. As the dump will happen during authentication, the login password might appear in it, so you should use a bogus test password for the purpose of the creating this dump, just in case. Once you have the dump, please open a report at https://connect.microsoft.com/feedback/default.aspx?SiteID=68, attach the dump, and mention my name in the report, so it will get assigned to me.
If you can repro this issue with a smaller application that attempts the same connection on startup, you could attach that application as well, so we can attempt to repro the issue in our labs."
Thanks,
Sung
Hi Sung,
Yes, I will ask the client if they are willing to do this. As this is a production server, there are some questions they are sure to ask me, so I will ask you first:
1) Is there any risk of corruption to databases from using the "-y" startup parameter?
2) Is there any performance downside to using the "-y" startup parameter? E.g. when the error occurs and a dump is written, is there a long delay when this occurs?
3) We will not be able to use a fake password when creating the dump, because we have only seen this error on the production server. Can we edit the dump to remove sensitive information before sending it to Microsoft?
4) I am assuming both you and Laurentiu are Microsoft employees, is that correct?
5) The "-y" startup parameter is undocumented (at least I do not see it listed in the "Using the SQL Server Service Startup Options." books online topic). Is that correct?
6) Is this the proper procedure for setting the "-y" startup parameter?
6a) Programs | SQL Server 2005 | Configuration Tools | SQL Server Configuration Manager
6b) SQL Server 2005 services
6c) SQL Server (MSSQLSERVER) | Properties
6d) Advanced
6e) Startup Parameters
6f) Append ";-y 18456" (should there be a space between "-y" and "18456"? For the default parameters there does not seem to be a space...
7) This server is running SQL Server 2005 x64... has the "-y" startup parameter been tested and approved for use on customer site production servers for the 64-bit version?
8) Has anyone else reported this problem (where the error occurs well after all databases have been started up)?
Thanks,
Frank.
Hi Frank,
Yes, both Laurentiu and I are on the SQL Server Security team. We recommend you attempt this repro on a test server instead of a production server but you should always edit out all sensitive info before passing on any data.
I have asked Laurentiu to jump in on this thread so that he can provide you with more details.
Thanks,
Sung
Hi Frank,
Let me answer your questions on using -y.
There is no risk of corruption from using -y. Using this option only causes the function that reports the error to also generate a dump file.
There will be a performance impact, although it should be very small. The server will be paused while the dump is taken, in your case this should be a mini-dump less than 4MB in size, so the impact should be quite reasonable providing you aren't hitting this error multiple times in succession.
You could edit the mdmp file, but that might make it unusable. Opening a case with product support would be the best path to take to safeguard your mini-dump.
-y is not included in books online at this time, we will correct that.
The procedure you outlined for using -y is sufficient.
-y works on all editions of sql server.
Hi Sung, Hi Jerome,
Thanks for the quick replies. I will ask our customer to stop and restart SQL Server configured with the "-y" option.
It took about a week for the problem to recur after the last reboot. Assuming we can get a dump then I'll ask them to open the case.
Some followup questions:
1) Searching the web, I have not seen anyone else with this particular problem, do you know if it has been reported by anyone else? (Other than the thread Sung referred to, where it happened most likely due to the database not being online yet).
2) Is this the proper location to open the case?
https://connect.microsoft.com/feedback/default.aspx?SiteID=68
3) From the code, is it possible to tell what unexpected situation State=10 implies? Perhaps we can develop a workaround for the short run if we know what State=10 means. Every time this happens it is one or more unhappy users, so I'd like to alleviate it if possible while we figure out the ultimate solution.
Thanks,
Frank.
Hi Frank,
This 18456 error with state 10 might be caused by a password policy API failure. We did not encounter this error before.
The link you posted is the right place to open a case.
You can attempt to prevent this issue from occurring by turning off the policy checks for the login for which you get the error (set CHECK_POLICY to off).
The "-y" option is listed if you execute sqlservr.exe with the "-?" parameter. I have opened a request to also describe this option in BOL. The dump should not affect the state of the server, but it will affect its performance for the time it writes the dump (it took a couple of seconds on my machine). The password should not appear in the dump, but to double check, you can search the contents of the dump. If the password appears in the dump file, you might be able to overwrite it with a hex editor without corrupting the dump file, but I've never tried this before.
You would definitely not want to leave the "-y" option enabled on a production server because a dump for each login failure may lead to running out of space if your server experiences lots of connection failures. Unfortunately, the dump would occur for ANY 18456 error; it cannot be restricted by error state.
Thanks
Laurentiu
Hi Laurentiu,
Thanks for the quick response. I'll recommend to the client that we first try "-y" but not touch CHECK_POLICY, to see if we can get a dump to help with troubleshooting, then we'll turn off CHECK_POLICY for the affected logins, and see if that helps.
We don't have anything but code-generated connections to this server, so we shouldn't get any other dumps... we will monitor the disk space closely when using this option just in case.
Thanks for requesting the books online update, having it appear in the documentation helps as a comfort factor when dealing with production servers.
Thanks,
Frank.
Hi Laurentiu,
We received this error several times today, 2 days after restarting the SQL Server service with the "-y" flag. I have opened a case (actually, two copies of it show up, I do not know why - IDs 172339 and 172338... they appear to be identical).
I took all SQLDump* files from the log folder and zipped them up and attached them to the case. If there is a problem with the attachment or if I did not include enough files please let me know.
After grabbing the dump files, I disabled CHECK_POLICY for the logins that are failing. I will let you know if the problem returns with CHECK_POLICY disabled.
Thanks for any help you can provide!
Frank.
Thank you for opening the cases. I will look into them and I will let you know if we find out the cause of the failure.
Thanks
Laurentiu
Frank, I've looked at both bugs opened for these reports - the bug numbers are 446508 and 446509 - and I closed the latter as a duplicate of the first. However, none of these had a dump attached to it. Can you double check to see if the dumps were indeed attached to the feedback report.
Thanks
Laurentiu
Hi Laurentiu,
I've uploaded the file again (SQLDump.zip, 1,488,351 bytes). Now it is listed twice for this case (172338)... not sure why you wouldn't be seeing it there...... here is the listing from the bottom of the case:
Primary Feedback Item None.
Duplicates 172339
File Attachments SQLDump.zip
SQLDump.zip
Thanks,
Frank.
I got the attachment now.
Have you encountered the issue after turning of the password policy checking for that login?
I'll investigate the dumps.
Thank you
Laurentiu