
Large Cloud Systems like Azure are No Guarantee of Safety

We've just emerged from a week of hell in which Azure and Microsoft have completely lost my trust. It's raised a lot of questions about Azure and Marketplace and came very close to making front-page news. 

There are obviously certain details that I can't talk about, but I'll say this: the upper echelons at Microsoft were made fully aware of the damage they were causing and the impact that our five-day outage was having on several very large players and also on hundreds of individuals. They were completely ineffectual and did nothing to resolve the problem.


A Word About Billing

I don't think I've talked about how bad Microsoft's billing systems are, so it's worth spending a little time here. I've dealt with billing from hundreds of companies over the years but nothing has ever approached the complete obscurity of their billing. 

It's not just the big things that are obscure either. Even when you obtain a small pay-as-you-go service, such as a cloud PC, they refer to it as a 3-year reservation, even though you've not signed up for three years. When you get a product from the Azure Marketplace, it's just billed as "Azure Marketplace" with no mention of which product it is.

Microsoft billing makes it more or less impossible to be sure what you're paying. 

How it all Started

Some years back, we had a manager who liked to trial services without documenting them. He'd often forget to cancel these services and they'd eventually start charging us. There are lots of managers like this and, while it's not the way I like to do things, it's not necessarily a major character flaw. If Microsoft's billing were clearer, it wouldn't be a problem at all.

When he left the company, we were left with Microsoft bills coming from all directions with no useful descriptions on them simply because Microsoft doesn't label their invoices properly. 

We began a multi-year project to discover and close down the unused services. We engaged a few contractors and several people at Microsoft, most of whom were unable to make sense of their own bills. Little by little we got those services removed (we still have at least two that Microsoft can't identify). 

One of the more effective clean-ups occurred a little over a year ago, when we spent months on Teams calls with Microsoft and involved several of their people. We got a lot of services removed then, though as it turned out, removed and cancelled are two very different things.

It's not unusual for us to have services with delayed payments at Microsoft because we're always having to get them to explain their charges, something that could be avoided with clearly written invoices.

The Cause

Last Friday, we detected the shutdown of an external service, SendGrid, which had been enabled through Azure Marketplace. I can't say more on this, other than that the absence of this service had the potential to affect payrolls around the country.

We raised calls with both Microsoft and Twilio SendGrid. Each blamed the other, but both also highlighted billing as a cause. We spent Friday and Saturday going through all of our unpaid Microsoft invoices and paying everything by credit card. Microsoft's payment systems are a little problematic and we ended up paying for a few things twice.

Nevertheless, we quickly ended up with a payments panel full of green-tick icons. 

You can imagine our surprise when, despite our efforts and Microsoft's assurances, the service remained suspended on Monday.

We spent the remainder of Monday and half of Tuesday in a blame game with Microsoft. They kept blaming SendGrid and SendGrid kept blaming them. They asked us to provide all kinds of files to see if we had a technical glitch on our side (despite us telling them that it was clearly a billing issue and that they were wasting valuable time). 

We were also able to send them very clear screenshots and files which showed that the problem was on Microsoft's end. We had a billing person assigned to our call and they were reluctant to involve any technical people, so we ended up going in circles even though we escalated to their bosses.

By late Monday, we had a Priority 1 case but they still told us that their technicians were too busy to be engaged. We sent them several more screenshots but it wasn't until late Tuesday that they agreed to a Teams session to actually look at the problem.

The Problem but not the Fix

It turned out that when Microsoft had done a review of our bills many months back, they'd somehow turned off visibility of a bill. That bill, for a paltry $18, was outstanding but not visible to us. We couldn't pay what we couldn't see. Microsoft spent some time trying to make the bill visible but ultimately couldn't. In fact, they couldn't even produce the invoice for us and I have doubts that the bill actually existed.

We were obviously willing to pay but Microsoft waived that fee. After all, the combined 'lost wages' of everyone engaged on the problem, plus the productivity and reputational losses we were suffering were considerably higher. 

All good, right? Wrong. The service remained stubbornly suspended.

Not Quite a Fix

By Wednesday, with the organisation in panic mode and the problem having been pushed higher with Microsoft via Twitter and LinkedIn, we were able to talk to more senior people, but they were uninterested in our plight.

It had been determined that Microsoft's billing issue had resulted in a service being disconnected and that the service in question (SendGrid) could not be reconnected. The option was there but it was greyed out. We could see a simple file change that would resolve the problem but Microsoft was still unwilling to engage the right technical people - despite us having paid support. 

We'd also found out from SendGrid that they couldn't reactivate our existing service. We could create another, but they had no facilities to migrate the 100+ templates from one system to another. There's essentially no backup and transfer for that service. (Shame on you, SendGrid.)

We have some pretty capable people in our team and they spent Wednesday rebuilding the service on SendGrid (without any connection to Marketplace) and manually moving templates by copying and pasting HTML. There was no further contact with Microsoft - it was obviously too hard for them. 

We got up and running again by the end of Wednesday but our faith in Microsoft is gone and I can't see us ever using their Marketplace again.

Even today, our original service remains suspended. (On the plus side, I pointed out to Microsoft that their error message spelled "subscription" wrong and they at least fixed that.)

The Past is the Future

We moved this particular service from Domino to Azure in 2018. In the 17 years it ran on Domino, it racked up a total outage of 5 hours. In one week on Azure, it racked up 40 hours (and there have been several other Azure outages prior to this).

It doesn't make sense to move the application back to Domino, as the life expectancy of the application is drawing to a close, but what we can do is move the external portions, such as SendGrid, back to Domino because it provides a reliable service with proven DR capabilities.
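
To put those two figures side by side, here's a rough sketch in Python that turns them into availability percentages. It assumes the outage hours are measured against calendar hours (if the 40 hours are business hours, the Azure week looks even worse), and the function name is only there for illustration.

```python
HOURS_PER_YEAR = 365.25 * 24  # average calendar year, including leap days


def availability(outage_hours, period_hours):
    """Return the percentage of the period during which the service was up."""
    return 100.0 * (1.0 - outage_hours / period_hours)


# Figures from the post: 5 hours of outage across 17 years on Domino,
# 40 hours of outage in a single week on Azure.
# Assumption: both are compared against calendar hours, not business hours.
domino = availability(5, 17 * HOURS_PER_YEAR)
azure_week = availability(40, 7 * 24)

print(f"Domino, 17 years: {domino:.4f}% available")
print(f"Azure, one week:  {azure_week:.2f}% available")
```

On those assumptions, Domino comes out a little above 99.99% over its 17 years, while the week on Azure comes out at roughly 76%.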

It also makes it harder to recommend Azure as a platform in the future. 

My advice: think hard before you trust bigger, cloud-based services. Don't trust their ability to engage in a crisis, don't trust their customer care or their billing, and certainly don't trust their DR capabilities.
