Failed Azure Web App Auto Restart Runbook

Let me start by painting a picture. You are using Azure. You have an App Service configured with a Web App that is hosting a website; this website for example. The website could be single-instanced or it could be multi-instanced using Azure Load Balancer, Azure Traffic Manager, Azure Application Gateway, or any other number of load balancing and traffic distribution technologies. One day, your web application fails to respond and you get a dreaded HTTP 500 or another error code. As a dedicated Azure consumer, you use Azure Application Insights to monitor your website. Application Insights not only gives you user metrics akin to Google Analytics but also gives you performance and availability metrics.

The picture I painted just then explains my scenario. I use Azure App Service with an Azure Web App to host this blog. I use Azure Application Insights to provide me with all of the metrics and data I need to understand the site. The availability monitoring feature is quite excellent. It allows me to monitor the website availability from up to five locations around the world with performance data for each region so I can see how the site performs for each geography. If the site goes down for any reason, I get an email notification to warn me.

Read the Full Post

The Case of the Failed Azure Automation Runbook

In a post I will be publishing shortly after this one, I wrote an Azure Automation Runbook to automatically restart an Azure Web App when Azure Application Insights reports the site as being offline. The solution is not foolproof, but it offers a good first line of defence against issues that bring the site down. I originally wrote the runbook some time ago, however, with pressure elsewhere, it has been a while since I have been able to re-visit it and complete it.

Whilst testing the workflow this morning, I found that it was generating an error at the Login-AzureRmAccount stage; the stage where the workflow should be logging into Azure using a service principal to obtain permissions on the relevant Resource Groups. A screenshot of the error log from the automation job is shown below.

The error had been puzzled as I know this had previously worked and I have not made changes to the Azure credential nor to the runbook since. A quick Google of the error message brought me to the answer at https://social.msdn.microsoft.com/Forums/en-US/c38e01df-dac8-4095-9658-7b1d981fe8e6/azure-automation-error-run-loginazurermaccount-to-login?forum=azureautomation. The problem lay in the fact that my Azure Automation account was referencing old versions of the Azure PowerShell Module. The old version of the module generated a failure to use the Login-AzureRmAccount command.

Updating the Azure PowerShell Module in Azure Automation is painless and can be performed from the Modules blade in the Azure Automation account.

After a short wait, the modules are updated to the latest version. Re-running my workflow in Azure Automation completed successfully proving the issue as being an out-of-date module version.

An interesting point is that there is currently a banner message in Azure Automation warning that Azure PowerShell modules will be automatically updated in Azure after the 17th July 2017. The screenshot below illustrates the message in Azure Automation. I think this is a very good move by Microsoft. As an author of automation, my workflow and runbook should not be beholden to the version of the module. If a new module is required to allow my code to continue to function, do the update automatically. If features are being deprecated in the Azure PowerShell modules, I hope that Microsoft will notify us in advance. This will give us all time to revise our code to work on any deprecated commands.