Decommissioning Specific SAN Datastores En Masse

One of my customers recently purchased a new SAN, with the goal of decommissioning the old one.  They used Storage vMotion to migrate all of their VMs over to the new SAN and adjusted all of the ESXi hosts to put their scratch space on a new LUN, and were ready to proceed.

Many people, at this point, would just turn off the old SAN... and they might be OK.  Maybe.  But when that happens, the ESXi hosts are going to seriously freak out, because they just encountered an unexpected SAN failure... and we've all seen that ESXi sometimes doesn't respond well to losing datastores unexpectedly.

So, the more cautious people would right-click on each datastore and unmount it, then turn off the old SAN.  While that's not as bad as just turning off the SAN, the ESXi hosts still expect those LUNs to be there (even if they're no longer mounted as datastores) and can still run into issues.

Miss Manners insists that people follow the procedure detailed in KB 2004605.  That article includes a lot of important preparatory steps (did you take the datastores out of their datastore clusters?) as well as details for how to cleanly remove the LUNs from the ESXi hosts.  As the KB article describes, after you unmount the datastores, you need to detach them from each ESXi host.  Yes, each ESXi host.  If you need to remove 10 datastores from 16 ESXi hosts, you get to run that detach operation 160 times.
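To make that multiplication concrete, here is a minimal Python sketch that enumerates every host/datastore pair for the 10-datastore, 16-host example above.  The host and datastore names are made up for illustration; nothing here talks to vCenter.

```python
from itertools import product

# Hypothetical names, for illustration only.
datastores = [f"old-san-ds{n:02d}" for n in range(1, 11)]  # 10 datastores
hosts = [f"esxi{n:02d}" for n in range(1, 17)]             # 16 ESXi hosts

# The detach has to be run once per (host, datastore) pair.
detach_operations = list(product(hosts, datastores))

print(len(detach_operations))  # 160
```

The count grows with the product of the two lists, which is why doing this by hand stops being practical very quickly.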

Fortunately, there's a script!  Back in 2012, Alan Renouf (yes, that Alan Renouf) put together a PowerCLI module called DatastoreFunctions, which dives into some serious PowerCLI arcana to automate these operations!  That module contains two important functions: Unmount-Datastore and Detach-Datastore.  So, all we had to do was take our list of decommissioned SAN datastores and pipe it into those two functions!

Well, yes and no.  That absolutely worked, but it was slow.  This customer's environment was significantly larger than 10 LUNs and 16 ESXi hosts.  It's so large that, after doing a trial run on a single datastore, we calculated that the entire decommission process would involve about 33 hours of nonstop script execution.

Time to go back to the drawing board.  Since the scripts were so old, I spent some time reviewing them and testing bits to ensure that they'd work in today's PowerCLI environment.  While I was reading through them, I noticed a few places where I could possibly improve their execution speed.  Since none of us were interested in babysitting a script for 30+ hours, I spent a bit of time trying to optimize the scripts to run faster in large environments.

In the end, I came up with the scripts below.  For speed testing, I used the Get-DatastoreMountInfo function.  The original version took 67 minutes to run in this environment.  My modified version takes 1.5 minutes to run.  The other functions don't show such a dramatic improvement in execution speed (after all, they still have to do the same number of Sets as before, and Sets take a lot longer than Gets), but even they are somewhat improved (going from about 22 seconds per datastore per host to 14 seconds per datastore per host).
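To put those numbers in perspective, here is a rough Python sketch of the arithmetic.  It assumes the per-pair timings scale linearly and applies them to the 10-datastore, 16-host example from earlier in the post, not to the customer's actual (much larger) environment; the function names are my own.

```python
def total_minutes(datastores: int, hosts: int, seconds_per_pair: float) -> float:
    """Estimated run time, assuming one operation per datastore per host."""
    return datastores * hosts * seconds_per_pair / 60


def speedup(before: float, after: float) -> float:
    """How many times faster the 'after' timing is than the 'before' timing."""
    return before / after


# Illustrative 10 x 16 environment with the per-pair timings quoted above.
original_estimate = total_minutes(10, 16, 22)
modified_estimate = total_minutes(10, 16, 14)

print(f"original: {original_estimate:.1f} min, modified: {modified_estimate:.1f} min")
print(f"mount info query speedup: {speedup(67, 1.5):.1f}x")
print(f"per-pair speedup: {speedup(22, 14):.2f}x")
```

Even in that small example the per-pair savings add up to roughly 20 minutes, and the gap widens as the number of hosts and datastores grows.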

The script lives in the PowerCLI Example Scripts repository on GitHub, but you can see my modified version in the gist below!  As always, this script is provided as is with no guarantees.  Just because it worked for me in my situation does not mean that it'll work for you in yours, so test thoroughly before executing it.  Especially this one, as it is literally removing datastores from your environment - if something goes wrong or you point it at the wrong datastores, it could wreak some serious havoc!

