
Active Directory and DFS-R Auto-Recovery

I appreciate this is an old subject, but it is one that I’ve come across a couple of times recently, so I wanted to share it and highlight its importance. This will be one of a few upcoming posts on slightly older, but nonetheless important, topics that need to be addressed.

How Does DFS-R Affect Active Directory

In Windows Server 2008, Microsoft made a big change to Active Directory Domain Services (AD DS) by allowing us to use DFS-R as the underlying replication technology for the Active Directory SYSVOL, replacing the File Replication Service (FRS) that has been with us since the birth of Active Directory. DFS-R is a massive improvement on FRS, and you can read about the changes and benefits it brings at http://technet.microsoft.com/en-us/library/cc794837(v=WS.10).aspx. If you have upgraded your domains from Windows Server 2003 to Windows Server 2008 or Windows Server 2008 R2 and you haven’t completed the FRS to DFS-R migration, I’d really recommend you look at it. It’s easily overlooked: the migration is a manual step on top of upgrading or replacing your domain controllers with Windows Server 2008 servers, and there are no prompts or reminders to do it. There is a guide available on TechNet at http://technet.microsoft.com/en-us/library/dd640019(v=WS.10).aspx to help you through the process.

Back in January 2012, Microsoft released KB2663685, which changes the default behaviour of DFS-R replication, and this affects Active Directory. Prior to the hotfix, when a DFS-R replication group member suffered a dirty shutdown, the member would perform an automatic recovery when it came back online. After the hotfix, this is no longer the case: a DFS-R replication group member halts replication after a dirty shutdown and waits for manual intervention. Your intervention choices range from manually activating the recovery task to decommissioning the server and replacing it, depending on the nature of the dirty shutdown. What we need to understand, however, is that a dirty shutdown can happen more often than you think, so it’s important to be aware of this.

Identifying Dirty DFS-R Shutdown Events

Dirty shutdown events are logged to the DFS Replication event log with event ID 2213, as shown in the first screenshot below, which advises you that replication has been halted. If you have virtual domain controllers and you shut down a domain controller using the Shutdown Guest Operating System options in vSphere or Hyper-V, this will actually trigger a dirty shutdown state. Similarly, if you have an HA cluster of hypervisors and a host failure causes the VM to restart on another host, yep, you guessed it, that’s another dirty shutdown. The lesson here, first and foremost, is to always shut down domain controllers from within the guest operating system to ensure that it is done cleanly and not forcefully via a machine agent.

Event ID 2213 is quite helpful in that it gives us the exact command to recover replication, so a simple copy and paste into an elevated command prompt will recover the server. No need to edit to taste. Once you’ve entered the command, another event is logged with event ID 2214 to indicate that replication has recovered, as shown in the second screenshot.

[Screenshots: AD DS DFS-R Dirty Shutdown, event ID 2213; AD DS DFS-R Dirty Shutdown, event ID 2214]
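If you would rather not rely on spotting these events by eye, the pairing of 2213 and 2214 is easy to check in a script. The following is a minimal Python sketch, not a finished monitoring tool. It assumes you have already exported the DFS Replication log into simple records that each carry the affected volume; the `DfsrEvent` class, its fields and the `volumes_awaiting_recovery` function are hypothetical names of my own.

```python
from dataclasses import dataclass

DIRTY_SHUTDOWN = 2213  # replication halted after a dirty shutdown
RECOVERED = 2214       # replication resumed on the volume


@dataclass(frozen=True)
class DfsrEvent:
    """Hypothetical, simplified view of one DFS Replication log entry."""
    record_number: int  # position in the log; higher means newer
    event_id: int
    volume: str         # volume named in the event, e.g. "C:"


def volumes_awaiting_recovery(events):
    """Return the volumes whose latest 2213 has no later 2214."""
    halted = {}
    # Walk the log oldest to newest so later events override earlier ones.
    for event in sorted(events, key=lambda e: e.record_number):
        if event.event_id == DIRTY_SHUTDOWN:
            halted[event.volume] = event.record_number
        elif event.event_id == RECOVERED:
            halted.pop(event.volume, None)
    return sorted(halted)


# Example: C: was recovered once, then halted again and left unattended.
log = [
    DfsrEvent(101, 2213, "C:"),
    DfsrEvent(102, 2214, "C:"),
    DfsrEvent(250, 2213, "C:"),
]
print(volumes_awaiting_recovery(log))  # ['C:']
```

Any volume this returns still needs the recovery command from its event ID 2213 to be run.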

Changing DFS-R Auto-Recovery Behaviour

So now that we understand the behaviour change and the event IDs that let us track this issue, how can we get back to the previous behaviour so that DFS-R can automatically recover itself? Before you do this, you need to realise that there is a risk to this change: if you allow automatic recovery of DFS-R replication groups and the server that is coming back online is indeed dirty, it could have an impact on the integrity of your Active Directory Domain Services SYSVOL directory.

Unless you have a very large organisation, or you are making continuous changes to your Group Policy Objects or other files stored in SYSVOL, this shouldn’t really be a problem, and I believe that the risk is outweighed by the advantages. If a domain controller restarts and you don’t pick up on the event ID 2213, you have a domain controller which is out of sync with the rest of the domain controllers. The danger here is that domain members and users who use this domain controller will receive out-of-date versions of Group Policy Objects, because the domain controller remains active and servicing clients whilst its DFS-R replication group is in an unhealthy state.
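For reference, the switch that controls this behaviour is a registry value on each DFS-R member. To the best of my knowledge from KB2663685, it is the `StopReplicationOnAutoRecovery` DWORD under the DFSR service's `Parameters` key, where 0 restores automatic recovery and 1 halts replication and waits for you; please confirm this against the KB article before relying on it. The Python sketch below only builds the `reg add` command line so that you can review it before running it on a server. The function name is my own.

```python
# Key and value name as I understand them from KB2663685 (verify before use).
DFSR_PARAMETERS_KEY = r"HKLM\SYSTEM\CurrentControlSet\Services\DFSR\Parameters"
VALUE_NAME = "StopReplicationOnAutoRecovery"


def reg_add_command(auto_recover):
    """Build the reg.exe arguments that set the auto-recovery behaviour.

    auto_recover=True  -> value 0, DFS-R recovers by itself after a dirty shutdown
    auto_recover=False -> value 1, DFS-R halts and logs event ID 2213
    """
    data = "0" if auto_recover else "1"
    return ["reg", "add", DFSR_PARAMETERS_KEY,
            "/v", VALUE_NAME, "/t", "REG_DWORD", "/d", data, "/f"]


# Print the command for review; nothing is changed on the machine.
print(" ".join(reg_add_command(auto_recover=True)))
```

Generating the command rather than writing to the registry directly keeps the change visible and easy to put through whatever change control you use for domain controllers.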

Effects Beyond Active Directory

DFS-R is a technology originally designed for replicating file server data. This change to DFS-R auto-recovery impacts not only Active Directory, which is the scope of this post, but also file services. If you are using DFS-R to replicate your file servers then you may want to consider this change for those servers too. Whilst having an out-of-date SYSVOL can be an inconvenience, having an out-of-date file server can be a major problem: users will be working with out-of-date copies of documents, or may not be able to find a document at all if it is new and hasn’t yet been replicated to their target server.

My take on this, though, would be to carefully consider the change for a file server. Whilst a corrupt Group Policy Object can fairly easily be fixed, recovered from a GPO backup, or re-created if the policy wasn’t too complex, asking a user to re-create their work because you allowed a corrupt copy of it to be brought into the environment might not go down quite so well.

Active Directory and the Case of the Failed BitLocker Recovery Key Archive

This is an issue I came across this evening at home (yes, just to reiterate, home). However, it applies equally to my workplace, as we encounter the same issue there.

One of the laptops in my house incorporates a TPM, which I take advantage of to BitLocker encrypt the hard disk using the TPM and a PIN. This gives me peace of mind as it’s the laptop used by my wife who, although she doesn’t currently, will likely start to take her device out on the road when studying at university.

Historically, I have used the Save to File method of storing the recovery key, keeping a copy both on our home server and on my SkyDrive account for protection. With our new Windows Server 2012 Essentials environment, however, I wanted to take advantage of Active Directory and configure the clients to automatically archive the keys there.

The key to beginning this process is to download an .exe file from Microsoft (http://www.microsoft.com/en-us/download/details.aspx?id=13432). I’m not going to explain here how to extend the AD Schema or modify the domain ACL for this all to work as that is all explained in the Microsoft document.

Following the instructions, I created a GPO which applied both the Trusted Platform Module Services Computer Configuration setting, Turn on TPM Backup to Active Directory Domain Services, and the BitLocker Drive Encryption Computer Configuration setting, Store BitLocker Recovery Information in Active Directory Domain Services.

After allowing the machine to pick up the GPO, and a restart to be sure, I enabled BitLocker and realised, after checking in AD, that nothing was being backed up. Strange, I thought, as this matches a problem in the office at work. There, we had attributed the problem to a potential issue with our AD security ACEs, but at home this is a brand new Windows Server 2012 with previously untouched ACEs out of the OOBE.

After scratching my head a little and a bit more poking around in Group Policy, I clocked it. The settings defined in the documentation are for Windows Vista. Windows 7 and Windows 8 clients rely on a different set of Group Policy Computer Configuration settings.

These new settings give you far more granular control of BitLocker than the Windows Vista settings did, so much so that Microsoft elected that the Windows Vista settings would simply not apply to Windows 7 or 8 and that the new settings needed to be used.

You can find the new settings in Computer Configuration > Administrative Templates > Windows Components > BitLocker Drive Encryption. The settings in the root of this GPO hive are the existing Vista settings. The new Windows 7 and Windows 8 settings live in the three child portions: Fixed Data Drives, Operating System Drives and Removable Data Drives.

Each area gives you specific, granular control over how BitLocker affects these volumes, including whether to store the key in AD DS and whether to allow a user to configure a PIN or just use the TPM. Probably the best option, second only to enabling AD DS archiving in my opinion, is whether to let the user choose, or to mandate, that the entire drive or only the used space is encrypted. The Operating System Drives portion gives you the most options and will likely be the one people want to configure most, as this is ultimately what determines the behaviour when booting your computer.

I’m sure you’ll agree that there are a lot of new settings here over Vista and that this gives you much greater flexibility and control, but with great power comes great responsibility. Make sure you read the effects and impact of each setting carefully and that you test your configuration. If possible, back up any data on the machines you are testing BitLocker GPOs against, in case the key isn’t archived to AD DS and you end up in a situation where you need that recovery key but don’t have it available.
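When testing, it helps to check AD DS directly rather than trusting that the archive happened. As far as I’m aware, archived recovery keys are stored as `msFVE-RecoveryInformation` objects directly beneath the computer object. The Python sketch below only builds the parameters for such an LDAP search and leaves the query itself to whichever LDAP tool or library you prefer. The function name, the dictionary layout and the example distinguished name are all hypothetical.

```python
# Object class used for archived BitLocker recovery information in AD DS.
RECOVERY_CLASS = "msFVE-RecoveryInformation"


def recovery_info_search(computer_dn):
    """Describe an LDAP search for recovery objects beneath one computer."""
    if not computer_dn.strip():
        raise ValueError("computer_dn must not be empty")
    return {
        "base": computer_dn,   # search starts at the computer object
        "scope": "onelevel",   # recovery objects are direct children
        "filter": "(objectClass={})".format(RECOVERY_CLASS),
        "attributes": ["msFVE-RecoveryPassword"],
    }


# Hypothetical computer account, for illustration only.
params = recovery_info_search("CN=LAPTOP01,CN=Computers,DC=example,DC=local")
print(params["filter"])  # (objectClass=msFVE-RecoveryInformation)
```

If a search built this way returns nothing for a machine you have just encrypted, the key has not been archived and the GPO settings are worth revisiting before you rely on that machine.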