Let me set the stage, we have two SQL Servers, SQLA and SQLB. SQLA is in DomainA and SQLB is in DomainB. I would like to access resources on DomainB.SQLB using login DomainA\UserA via SQL Server Management Studio.
There are many articles online that talk about doing this kind of work using impersonation and such. However all those cases are for untrusted domain or non-domain. However in my case I am working on Two-Way Restricted Forest-Level trust. Even with this configured the SQL Servers could not communicate with each other.
I am not Active Directory guys (probably figured that out with my trouble over the years); anyhow, so I will not discuss how to set up two-way trust. I leave that to you or your Active Directory Gurus.
Given you have two-way trust configured, everything work smoothly right? No!
Far from it, this has been one of the most frustrating issues to troubleshoot. As there is almost no documentation I could find on the issue. So what happened?
- I can log on to DomainA using DomainB\UserB, no problems.
- I can log on to DomainB using DomainA\UserA, no problems.
- I can access File Resources on DomainB.SQLB\C$ from DomainA using DomainA\UserA no issues.
- I can access File Resources on DomainA.SQLA\C$ from DomainB using DomainB\UserB no issues.
- I can log in to SQL Server on DomainB.SQLB using DomainA\UserA from DomainB.
- I can log in to SQL Server on DomainA.SQLA using DomainB\UserB from DomainA.
So all that works, what gives? Well, when I tried logging into DomainB.SQLB using DomainA\UserA from DomainA. Things broke down.
I started getting “Cannot Generate SSPI Context” error messages. And if you have been as fortunate as me with Kerberos this is a fairly difficult message to troubleshoot. But, don’t lose hope! We will continue troubleshooting.
So naturally when the issue happened, i.e. failed login, I decided to check all the logs I can get my hands on.
- SQL Server Error Logs
- Event Viewer Security Logs on SQLA, SQLB
However none of the logs had anything. So this meant two things 1) since I did not see anything in SQL Server Errorlog, the login was not being decided from the SQL engine. 2) Since I did not see any thing in Event viewer logs, the login was not being denied from Windows layer.
Okay! I am officially confused? Right? Who else is involved in authentication process? Aha! Domain Controller I though, but at last there is nothing on its logs. I was running out of options fast, but with all my experience of troubleshooting SSPI issues. I knew it had do something with SPNs. I verified all the SPN, all the security, everything correct. But it failed!
Then, what is going on? So this is new one for me, I … umm … decided to look at the .NET Stack Trace data that is provided with error log. No I am not .NET Developer, so I might be showing my limited knowledge in that field. However … Lets look …
at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection)
at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj)
at System.Data.SqlClient.TdsParser.SSPIError(String error, String procedure)
at System.Data.SqlClient.TdsParser.SNISSPIData(Byte receivedBuff, UInt32 receivedLength, Byte sendBuff, UInt32& sendLength)
at System.Data.SqlClient.TdsParser.TdsLogin(SqlLogin rec)
at System.Data.SqlClient.SqlInternalConnectionTds.Login(Int64 timerExpire, String newPassword)
at System.Data.SqlClient.SqlInternalConnectionTds.AttemptOneLogin(ServerInfo serverInfo, String newPassword, Boolean ignoreSniOpenTimeout, Int64 timerExpire, SqlConnection owningObject)
at System.Data.SqlClient.SqlInternalConnectionTds.LoginNoFailover(String host, String newPassword, Boolean redirectedUserInstance, SqlConnection owningObject, SqlConnectionString connectionOptions, Int64 timerStart)
at System.Data.SqlClient.SqlInternalConnectionTds.OpenLoginEnlist(SqlConnection owningObject, SqlConnectionString connectionOptions, String newPassword, Boolean redirectedUserInstance)
at System.Data.SqlClient.SqlInternalConnectionTds..ctor(DbConnectionPoolIdentity identity, SqlConnectionString connectionOptions, Object providerInfo, String newPassword, SqlConnection owningObject, Boolean redirectedUserInstance)
at System.Data.SqlClient.SqlConnectionFactory.CreateConnection(DbConnectionOptions options, Object poolGroupProviderInfo, DbConnectionPool pool, DbConnection owningConnection)
at System.Data.ProviderBase.DbConnectionFactory.CreateNonPooledConnection(DbConnection owningConnection, DbConnectionPoolGroup poolGroup)
at System.Data.ProviderBase.DbConnectionFactory.GetConnection(DbConnection owningConnection)
at System.Data.ProviderBase.DbConnectionClosed.OpenConnection(DbConnection outerConnection, DbConnectionFactory connectionFactory)
at Microsoft.SqlServer.Management.UI.VSIntegration.ObjectExplorer.ObjectExplorer.ValidateConnection(UIConnectionInfo ci, IServerType server)
So reading it from bottom up, all seems pretty foreign langauge to me. Until I get to SSPIError, aha! Found what fires the error. So one step below that must be what caused the error? It was TdsPraser.SNISSPIData. What the hell is SNISSPIData? What data could it be reading?
Searching about (Bing’ing It!) seems it is a fairly common function name relating to “SSPI Context”. So again it seems I am missing some kind of SPN maybe?
While working on the problem with a colleague, he decided to try using Force SPN with in ODBC. Surprisingly it worked! What the? So I am missing an SPN? BUT, all the required SPN are there, at least … all the required ones within single domain.
Since Forced SPN worked, we decided to create some new SPNs in each domain.
- MSSQLSvc/SQLA.FQDN.DomainA DomainB\SQLServiceAccount
- MSSQLSvc/SQLA.FQDN.DomainA:Port DomainB\SQLServiceAccount
- MSSQLSvc/SQLB.FQDN.DomainB DomainA\SQLServiceAccount
- MSSQLSvc/SQLB.FQDN.DomainB:Port DomainA\SQLServiceAccount
Trying connection after this everything works! So I am calling these Cross-Domain Service Principal Names (CDSPNs).
Very interesting problem, very simple solution, just painful hard to figure out. I am not Kerberos expert, nor do I know what all SNISSPIData does, however it seems that it was trying to verify the SPN and failing. I wish the error message had just told me that, but ohh well.
And you know it never ends, this solution absolutely works. However in troubleshooting this issue learned some new things about Kerberos. Kerberos relies on DNS to be configured correctly. I know it might be obvious to AD/Network/etc. guru out there, it was new to me. Because 2-days after I wrote most of the post above I found cross-domain authentication using windows authentication working fine without the CDSPNs. It turns out I scenario about because I could not resolve DomainB.SQLB from DomainA or DomainA.SQLA from DomainB. The authentication kept failing, because Kerberos couldn’t find the FQDN for the server. When I created the SPN, it no longer had to do the look (it seems!).
So short of it (and why I am leaving the above post still), Kerberos is not easy to configure. Rather the issues are not easy to troubleshoot. The original solution I found worked, but it was not the root cause of the problem. Root cause was the DNS not configured correctly. So if you are having issues here a rough checklist before you start pulling your hairs out.
1. Check you can ping each server by its name (i.e. Ping ServerName)
It should return ServerName.FQDN (IP), if it doesn’t we got any issue.
2. Check you can do reverse look up by its IP (i.e. PING -a xxx.xxx.xxx.xxx)
It should return ServerName.FQDN(IP), note the “-a” HAS to come become the IP to do reverse look up.
3. Have correct Service Principal Names for SQL Server Engine (two-three SPNS, 1) ServerName.FQDN Domain\ServiceAcct 2) ServerName.FQDN:Port Domain\ServiceAcct, and 3) if it’s named instance ServerName.FQDN:InstanceName Domain\ServiceAcct)
Check SPN using SETSPN -L Domain\ServerAccount, if missing add using SETSPN -A MSSQLSvc/ServerName.FQDN Domain\ServiceAcct, etc.
4. Make sure the Service Account for SQL Server is still active in active directory (i.e. password is correct, its not locked, it hasn’t expired, it still exists, etc.)
5. Make sure the Computer Account is still in AD, it has not been tombstones or deleted by AD Explorers (a.k.a. Me!).
Good Luck… if I find another items to add I will later on :).