Posts Tagged ‘Kerberos’

Service Principal Names for SQL Server

June 16, 2014

Service Principal Names (SPNs) are used for Kerberos authentication.  SQL Server requires that every instance have its SPNs configured; if they are missing, Kerberos authentication fails.  By default, starting with Windows 2008, all communication between clients and SQL Server is first attempted with Kerberos.  If that fails, it falls back to NTLM.  Often, if you have Kerberos issues, you will also see SSPI error messages in your SQL Server errorlog.

So the question can be asked: if it falls back to NTLM and I can get my work done, why do I care?  The simple answer is delegation.  With incorrectly configured Kerberos, delegation cannot be completed and logins fail.  For an example of delegation, read my post on Login failed for user ‘NT AUTHORITY\ANONYMOUS LOGON’ with SQL Server Reporting Services.

So what SPNs do we need?  The simple answer: two SPNs per SQL Server instance.

  • MSSQLSvc/SQLServerName[:Port] Domain\SQLServerServiceAccount
  • MSSQLSvc/SQLServerName.FQDN[:Port] Domain\SQLServerServiceAccount

Depending on your configuration, SPNs can look different.  We can manage SPNs two ways: 1) using the SETSPN utility, or 2) using the Active Directory tool called ADSI Edit.  Most people rely on SETSPN, as the other tool is for Active Directory administrators.  And they (AD Admins) don’t like it when we mess around with their stuff; like we don’t like it when they mess around with our stuff ;-).  So let’s talk about the various forms of SPNs you can create.

Example #1: Default Instance, Default Port (1433), Using Domain Service Account

  • MSSQLSvc/SQLServerName Domain\SQLServerServiceAccount
  • MSSQLSvc/SQLServerName.FQDN Domain\SQLServerServiceAccount

Example #2: Default Instance, Non-Default Port, Using Domain Service Account
Example #3: Named Instance, Static Non-Default Port, Using Domain Service Account

  • MSSQLSvc/SQLServerName:Port Domain\SQLServerServiceAccount
  • MSSQLSvc/SQLServerName.FQDN:Port Domain\SQLServerServiceAccount

Example #4: Default Instance, Default Port (1433), Using Computer Account *

  • MSSQLSvc/SQLServerName Domain\ServerName$
  • MSSQLSvc/SQLServerName.FQDN Domain\ServerName$

Example #5: Default Instance, Non-Default Port, Using Computer Account *
Example #6: Named Instance, Static Non-Default Port, Using Computer Account *

  • MSSQLSvc/SQLServerName:Port Domain\ServerName$
  • MSSQLSvc/SQLServerName.FQDN:Port Domain\ServerName$

Example #7: Named Instance, Dynamic Non-Default Port, Using Domain Service Account **

  • MSSQLSvc/SQLServerName:InstanceName Domain\SQLServerServiceAccount
  • MSSQLSvc/SQLServerName.FQDN:InstanceName Domain\SQLServerServiceAccount

Example #8: Named Instance, Dynamic Non-Default Port, Using Computer Account *

  • MSSQLSvc/SQLServerName:InstanceName Domain\ServerName$
  • MSSQLSvc/SQLServerName.FQDN:InstanceName Domain\ServerName$

* An instance is considered to be using the computer account when it is running under a built-in account, such as Network Service.
** Generally we do not recommend dynamic ports, as managing SPNs manually can be difficult.  Also, for us to use InstanceName, the SQL Server Browser service must be running.
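
The eight examples above all follow one pattern, so here is a minimal sketch that builds the SPN pair for any of them and prints the matching SETSPN commands.  The function names, the server `SQL01`, the DNS suffix `corp.example.com` and the account `CORP\svc_sql` are all made up for illustration; nothing is executed against Active Directory.  I use `-A` as in the rest of this post; newer versions of SETSPN also offer `-S`, which checks for duplicates before adding.

```python
def build_spns(host, dns_suffix, port=None, instance=None):
    """Return the short-name and FQDN SPNs for one SQL Server instance.

    Pass `port` for a static non-default port, `instance` for a named
    instance on a dynamic port, or neither for a default instance on 1433.
    """
    if port is not None and instance is not None:
        raise ValueError("use either port or instance, not both")
    suffix = ""
    if port is not None:
        suffix = f":{port}"
    elif instance is not None:
        suffix = f":{instance}"
    return [
        f"MSSQLSvc/{host}{suffix}",
        f"MSSQLSvc/{host}.{dns_suffix}{suffix}",
    ]


def setspn_commands(spns, account):
    """Build one SETSPN -A command line per SPN (only text, nothing is run)."""
    return [f"setspn -A {spn} {account}" for spn in spns]


# Example #3: named instance on a static port, domain service account.
# All names here are hypothetical.
for command in setspn_commands(
    build_spns("SQL01", "corp.example.com", port=50000), r"CORP\svc_sql"
):
    print(command)
```

For a computer account (Examples #4 to #6 and #8), pass something like `CORP\SQL01$` as the account instead.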

So after reading this you are probably thinking, that’s TOO MUCH TO REMEMBER!

I agree; that is why SQL Server can create and destroy these SPNs by itself without user interaction.  However, in order to do this, we must grant the SQL Server Service Account or Computer Account permission to Read/Write the SPN property on itself (Reference).

Now there is one scenario where we do not want to grant this permission and should manage the SPN manually.  Please reference KB2443457.  I have copied the issue from the KB for completeness.

  1. The Sqlcluster instance is active on Node A and registered the SQL SPN in domain controller A during start up.
  2. The Sqlcluster instance fails over to Node B when Node A is shutdown normally.
  3. The Sqlcluster instance deregistered its SPN from domain controller A during the shutdown process on Node A.
  4. The SPN is removed from domain controller A but the change has not yet been replicated to domain controller B.
  5. When starting up on Node B, the Sqlcluster instance tries to register the SQL SPN with domain controller B. Since the SPN still exists, Node B does not register the SPN.
  6. After some time, domain controller A replicates the deletion of the SPN (from step 3) to domain controller B as part of Active Directory replication. The end result is that no valid SPN exists for the SQL instance in the domain and hence you see connection issues to the Sqlcluster instance.
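
The six steps above are easier to follow as a toy simulation.  This is only a sketch under big assumptions: each domain controller is a plain Python set, replication is a single manual step, and the SPN name is made up.  Real Active Directory replication is far more involved.

```python
SPN = "MSSQLSvc/sqlcluster.corp.example.com"  # hypothetical SPN


def simulate_failover():
    """Replay the KB scenario; return (registered_on_node_b, dc_a, dc_b)."""
    dc_a, dc_b = set(), set()

    # Step 1: the instance starts on Node A and registers its SPN with DC A.
    dc_a.add(SPN)
    dc_b |= dc_a  # replication catches up, so DC B knows the SPN too

    # Steps 2-3: Node A shuts down and the instance deregisters from DC A.
    dc_a.discard(SPN)

    # Step 4: the deletion has NOT been replicated yet, DC B still has the SPN.

    # Step 5: the instance starts on Node B and talks to DC B.
    # The SPN appears to exist already, so registration is skipped.
    registered_on_node_b = False
    if SPN not in dc_b:
        dc_b.add(SPN)
        registered_on_node_b = True

    # Step 6: replication finally delivers the deletion to DC B.
    dc_b.discard(SPN)

    return registered_on_node_b, dc_a, dc_b


registered, dc_a, dc_b = simulate_failover()
print("Registered on Node B:", registered)
print("SPN present anywhere:", SPN in dc_a or SPN in dc_b)
```

Both lines print `False`: Node B never registered, and after replication neither domain controller holds the SPN.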

Blog post cross posted on …

  • SQLCAN (WordPress), Link.
  • SQLCAN (MSDN), Link.

Cross-Domain Authentication for SQL Server

July 29, 2013

Let me set the stage: we have two SQL Servers, SQLA and SQLB.  SQLA is in DomainA and SQLB is in DomainB.  I would like to access resources on DomainB.SQLB using the login DomainA\UserA via SQL Server Management Studio.

There are many articles online that talk about doing this kind of work using impersonation and such.  However, all those cases are for untrusted domains or non-domain machines.  In my case I am working with a Two-Way Restricted Forest-Level trust.  Even with this configured, the SQL Servers could not communicate with each other.

I am not an Active Directory guy (you have probably figured that out from my troubles over the years), so I will not discuss how to set up a two-way trust.  I leave that to you or your Active Directory gurus.

Given you have a two-way trust configured, everything works smoothly, right? No!

Far from it; this has been one of the most frustrating issues to troubleshoot, as there is almost no documentation I could find on the issue.  So what happened?

  • I can log on to DomainA using DomainB\UserB, no problems.
  • I can log on to DomainB using DomainA\UserA, no problems.
  • I can access File Resources on DomainB.SQLB\C$ from DomainA using DomainA\UserA no issues.
  • I can access File Resources on DomainA.SQLA\C$ from DomainB using DomainB\UserB no issues.
  • I can log in to SQL Server on DomainB.SQLB using DomainA\UserA from DomainB.
  • I can log in to SQL Server on DomainA.SQLA using DomainB\UserB from DomainA.

So all that works; what gives?  Well, when I tried logging into DomainB.SQLB using DomainA\UserA from DomainA, things broke down.

I started getting “Cannot Generate SSPI Context” error messages.  And if you have been as fortunate as me with Kerberos, you know this is a fairly difficult message to troubleshoot.  But, don’t lose hope!  We will continue troubleshooting.

So naturally when the issue happened, i.e. a failed login, I decided to check all the logs I could get my hands on.

  • SQL Server Error Logs
  • Event Viewer Security Logs on SQLA, SQLB

However, none of the logs had anything.  So this meant two things: 1) since I did not see anything in the SQL Server errorlog, the login was not being denied by the SQL engine; 2) since I did not see anything in the Event Viewer logs, the login was not being denied at the Windows layer.

Okay!  I am officially confused.  Right?  Who else is involved in the authentication process?  Aha!  The Domain Controller, I thought, but alas there was nothing in its logs.  I was running out of options fast, but with all my experience of troubleshooting SSPI issues, I knew it had to do something with SPNs.  I verified all the SPNs, all the security, everything was correct.  But it failed!

Then, what is going on?  So this is a new one for me: I … umm … decided to look at the .NET stack trace data that is provided with the error log.  No, I am not a .NET developer, so I might be showing my limited knowledge in that field.  However … let’s look …

Procedure: GenClientContext
Program Location:
at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection)
at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj)
at System.Data.SqlClient.TdsParser.SSPIError(String error, String procedure)
at System.Data.SqlClient.TdsParser.SNISSPIData(Byte[] receivedBuff, UInt32 receivedLength, Byte[] sendBuff, UInt32& sendLength)
at System.Data.SqlClient.TdsParser.TdsLogin(SqlLogin rec)
at System.Data.SqlClient.SqlInternalConnectionTds.Login(Int64 timerExpire, String newPassword)
at System.Data.SqlClient.SqlInternalConnectionTds.AttemptOneLogin(ServerInfo serverInfo, String newPassword, Boolean ignoreSniOpenTimeout, Int64 timerExpire, SqlConnection owningObject)
at System.Data.SqlClient.SqlInternalConnectionTds.LoginNoFailover(String host, String newPassword, Boolean redirectedUserInstance, SqlConnection owningObject, SqlConnectionString connectionOptions, Int64 timerStart)
at System.Data.SqlClient.SqlInternalConnectionTds.OpenLoginEnlist(SqlConnection owningObject, SqlConnectionString connectionOptions, String newPassword, Boolean redirectedUserInstance)
at System.Data.SqlClient.SqlInternalConnectionTds..ctor(DbConnectionPoolIdentity identity, SqlConnectionString connectionOptions, Object providerInfo, String newPassword, SqlConnection owningObject, Boolean redirectedUserInstance)
at System.Data.SqlClient.SqlConnectionFactory.CreateConnection(DbConnectionOptions options, Object poolGroupProviderInfo, DbConnectionPool pool, DbConnection owningConnection)
at System.Data.ProviderBase.DbConnectionFactory.CreateNonPooledConnection(DbConnection owningConnection, DbConnectionPoolGroup poolGroup)
at System.Data.ProviderBase.DbConnectionFactory.GetConnection(DbConnection owningConnection)
at System.Data.ProviderBase.DbConnectionClosed.OpenConnection(DbConnection outerConnection, DbConnectionFactory connectionFactory)
at System.Data.SqlClient.SqlConnection.Open()
at Microsoft.SqlServer.Management.UI.VSIntegration.ObjectExplorer.ObjectExplorer.ValidateConnection(UIConnectionInfo ci, IServerType server)
at Microsoft.SqlServer.Management.UI.ConnectionDlg.Connector.ConnectionThreadUser()

So reading it from the bottom up, it all seems a pretty foreign language to me, until I get to SSPIError.  Aha!  Found what fires the error.  So one step below that must be what caused the error?  It was TdsParser.SNISSPIData.  What the hell is SNISSPIData?  What data could it be reading?

Searching about (Bing’ing it!), it seems to be a fairly common function name relating to “SSPI Context”.  So again it seems I am missing some kind of SPN maybe?

While working on the problem with a colleague, he decided to try using Force SPN within ODBC.  Surprisingly it worked!  What the?  So I am missing an SPN?  BUT, all the required SPNs are there, at least … all the required ones within a single domain.
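
If you would rather not eyeball the trace, the same "find the frame right below SSPIError" step can be done with a few lines of script.  This is just a sketch: `caller_of` is a helper name I made up, and the sample is a four-line excerpt of the trace above.  In a .NET stack trace the caller is listed on the line after the method it called.

```python
import re

# Captures the fully qualified method name from a line such as
# "at Namespace.Type.Method(ArgType arg)".
FRAME = re.compile(r"^\s*at\s+([\w.]+)\(")

TRACE = """\
at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj)
at System.Data.SqlClient.TdsParser.SSPIError(String error, String procedure)
at System.Data.SqlClient.TdsParser.SNISSPIData(Byte[] receivedBuff, UInt32 receivedLength, Byte[] sendBuff, UInt32& sendLength)
at System.Data.SqlClient.TdsParser.TdsLogin(SqlLogin rec)
"""


def caller_of(trace, method_name):
    """Return the frame listed directly after `method_name`, or None."""
    frames = []
    for line in trace.splitlines():
        match = FRAME.match(line)
        if match:
            frames.append(match.group(1))
    for index, frame in enumerate(frames):
        if frame.endswith("." + method_name):
            return frames[index + 1] if index + 1 < len(frames) else None
    return None


print(caller_of(TRACE, "SSPIError"))
# prints: System.Data.SqlClient.TdsParser.SNISSPIData
```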

Since Forced SPN worked, we decided to create some new SPNs in each domain.

DomainA

  • MSSQLSvc/SQLA.FQDN.DomainA DomainB\SQLServiceAccount
  • MSSQLSvc/SQLA.FQDN.DomainA:Port DomainB\SQLServiceAccount

DomainB

  • MSSQLSvc/SQLB.FQDN.DomainB DomainA\SQLServiceAccount
  • MSSQLSvc/SQLB.FQDN.DomainB:Port DomainA\SQLServiceAccount

Trying the connection after this, everything works!  So I am calling these Cross-Domain Service Principal Names (CDSPNs).

Very interesting problem, very simple solution, just painfully hard to figure out.  I am not a Kerberos expert, nor do I know everything SNISSPIData does; however, it seems that it was trying to verify the SPN and failing.  I wish the error message had just told me that, but oh well.

Learning Experience!

And you know it never ends.  This solution absolutely works; however, in troubleshooting this issue I learned some new things about Kerberos.  Kerberos relies on DNS being configured correctly.  I know it might be obvious to the AD/Network/etc. gurus out there, but it was new to me.  Two days after I wrote most of the post above, I found cross-domain Windows authentication working fine without the CDSPNs.  It turns out the scenario above came about because I could not resolve DomainB.SQLB from DomainA or DomainA.SQLA from DomainB.  The authentication kept failing because Kerberos couldn’t find the FQDN for the server.  When I created the SPN, it no longer had to do the lookup (it seems!).

So the short of it (and why I am still leaving the above post up): Kerberos is not easy to configure, or rather, its issues are not easy to troubleshoot.  The original solution I found worked, but it did not address the root cause of the problem.  The root cause was DNS not being configured correctly.  So if you are having issues, here is a rough checklist before you start pulling your hair out.

1. Check you can ping each server by its name (i.e. Ping ServerName)

It should return ServerName.FQDN (IP); if it doesn’t, we have an issue.

2. Check you can do reverse look up by its IP (i.e. PING -a xxx.xxx.xxx.xxx)

It should return ServerName.FQDN (IP); note the “-a” HAS to come before the IP to do the reverse look up.

3. Have the correct Service Principal Names for the SQL Server engine.  That is two or three SPNs: 1) MSSQLSvc/ServerName.FQDN Domain\ServiceAcct, 2) MSSQLSvc/ServerName.FQDN:Port Domain\ServiceAcct, and 3) if it’s a named instance, MSSQLSvc/ServerName.FQDN:InstanceName Domain\ServiceAcct.

Check the SPNs using SETSPN -L Domain\ServiceAcct; if any are missing, add them using SETSPN -A MSSQLSvc/ServerName.FQDN Domain\ServiceAcct, etc.

4. Make sure the Service Account for SQL Server is still active in Active Directory (i.e. the password is correct, it’s not locked, it hasn’t expired, it still exists, etc.)

5. Make sure the Computer Account is still in AD, and that it has not been tombstoned or deleted by AD Explorers (a.k.a. Me!).
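
Steps 1 and 2 of the checklist can also be scripted.  Here is a rough sketch using Python's standard `socket` module in place of PING; `check_dns` is a name I made up, and the two resolver parameters exist only so the function can be tried with fake lookups.  It checks that the name resolves to a FQDN and that each address reverse-resolves back to the same name.

```python
import socket


def check_dns(server, forward=socket.gethostbyname_ex, reverse=socket.gethostbyaddr):
    """Forward- then reverse-resolve `server`; return a list of problems found."""
    problems = []
    try:
        # Both resolvers return (canonical_name, alias_list, ip_address_list).
        canonical, _aliases, addresses = forward(server)
    except OSError as exc:
        return [f"forward lookup of {server} failed: {exc}"]

    if "." not in canonical:
        problems.append(f"{server} resolved to {canonical!r}, which is not a FQDN")

    for ip in addresses:
        try:
            reverse_name, _rev_aliases, _rev_ips = reverse(ip)
        except OSError as exc:
            problems.append(f"reverse lookup of {ip} failed: {exc}")
            continue
        if reverse_name.lower() != canonical.lower():
            problems.append(
                f"{ip} reverse-resolves to {reverse_name!r}, expected {canonical!r}"
            )
    return problems
```

Calling `check_dns("SQLB")` from a machine in DomainA (and the reverse from DomainB) returns an empty list when both lookups agree; anything else in the list is worth fixing before touching SPNs.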

Good luck… if I find other items to add, I will later on :).
