dedupe

Deduplication in Windows Server 2012 Essentials

Yesterday, I posted with a quasi-rant about Windows Server 2012 Essentials Storage Pools and the inability to remove a disk in a sensible non-destructive manner. At the end of that post, I eluded to the lack of the Primary Data Deduplication feature in Windows Server 2012 Essentials which got me thinking about it more, so I went of on an internet duck hunt to find the solution.

Firstly, I found this thread (http://social.technet.microsoft.com/Forums/en-US/winserveressentials/thread/4288f259-cf87-4bd6-bf9f-babfe26b5a69) on the TechNet forums in which an MVP highlights a bug which was filed on Microsoft Connect during the beta stages over the lack of deduplication. The bug was closed by Microsoft with a status of ‘Postponed’ and a message that it was a business decision to remove the feature.

Sad, but true when the people being targeted with Essentials are the people potentially wanting and needing it most, but I guess the reason probably lies in the realms of supportability and a degree of knowledge gap in the home and small business sectors to understand the feature.

Luckily for me, in another search, I found this article (http://forums.mydigitallife.info/archive/index.php/t-34417.html) at My Digital Life where some nefarious user has managed to extract the .cab files from a Windows Server 2012 Standard installation required to allow DISM to install the feature. While the post is targeted at Windows 8 64-bit users to use dedup on their desktop machines, the process works equally well for Windows Server 2012 Essentials, if not better as you can also use the GUI to drive the configuration.

I don’t want to be the one in breach of copyright infringement or breach of terms of service with Microsoft, so I’m not going to link to the .7z file provided on My Digital Life, so download it from them, sorry.

Download the file and extract it to a location on the server. Once extracted, open an elevated command prompt, change the directory context of the prompt to your extracted .7z folder and enter the following command:
dism /Online /Add-Package /PackagePath:Microsoft-Windows-VdsInterop-Package~31bf3856ad364e35~amd64~~6.2.9200.16384.cab /PackagePath:Microsoft-Windows-VdsInterop-Package~31bf3856ad364e35~amd64~en-US~6.2.9200.16384.cab /packagepath:Microsoft-Windows-FileServer-Package~31bf3856ad364e35~amd64~~6.2.9200.16384.cab /PackagePath:Microsoft-Windows-FileServer-Package~31bf3856ad364e35~amd64~en-US~6.2.9200.16384.cab /packagepath:Microsoft-Windows-Dedup-Package~31bf3856ad364e35~amd64~~6.2.9200.16384.cab /PackagePath:Microsoft-Windows-Dedup-Package~31bf3856ad364e35~amd64~en-US~6.2.9200.16384.cab

If DISM fails or gives you any errors, then the most likely cause is that you didn’t use an elevated command prompt. The next likely cause is that you aren’t in the correct working directory so check that too.

Once all of the packages are imported okay, enter the second command:
dism /Online /Enable-Feature /FeatureName:Dedup-Core /All

No restart is required for the import of the packages or the enabling of the feature, so everything can be done online.

Once the feature is enabled, head over to Server Manager to get things started. Server Manager isn’t pinned to the server Start Screen by default, so from the Start Screen type Server Manager and it will appear in the in-line search results.

From Server Manager, select File and Storage Services from the left pane, and then select Volumes from the sub-options.

As you will see in the screenshot, I’ve already enabled dedup on the volume on this test Windows Server 2012 Essentials VM of mine and I’ve saved space  by virtue of the fact that I’ve created two data folders with identical data in each folder.

For you to configure your volumes, right click the volume you want to setup and select the Configure Data Deduplication option. On the options screen, first, tick the box to enable the feature. Once selected, you have options for age of files to include in Deduplication and types of file to exclude. For my usage at home, I am setting the age to 0 days which includes all files regardless of age, and I am choosing to not exclude any file types as I want maximum savings.

The final step is at the bottom of the dialog, Set Deduplication Schedule. This allows you to configure when optimization tasks occur and whether to use background optimization during idle periods of disk access. I chose to enable both of these and I have left the default time of 0145hrs in place.

Once you click OK and then OK again on the initial dialog, you have just enabled dedup on that volume. Repeat the process for any volumes you are interested in and job done for you. After this, the server has the hard task of calculating all the savings and the process of actually creating the metadata links to physical blocks on the disk and marking the space occupied by duplicate blocks on the disk as free space. This process is very CPU and memory heavy and depending on the size of your dataset can and will take a long time to run.

I am just about to kick off a manual task on my live Essentials server at home, so once the results are in, I will be posting here to report my savings and also the time taken, but I’m not expecting this to come in anytime within the next day or so.