
Managing Microsoft Azure accounts

Microsoft Azure's popularity is growing, especially in the small and mid-sized markets. Pay-as-you-go pricing and zero upfront investment make moving to the cloud attractive for these companies. Microsoft guarantees an SLA of 99.9 % or more (depending on the type of service).

The ease of building infrastructure in Azure introduces a problem that did not exist when IT was run inside the organisation: budget management. Back then you had those endless, boring meetings about budgets and spending with the internal accountants, and every dime was turned over twice, if not three times, before a go was given.

In Azure you can order a server capable of performing at 2000 DTU, and it will cost you € 31.370,76 per month. Oops! No kidding!


"Sorry, I was just looking for something powerful" the developer smiled nervously in the meeting after the reception of the Azure invoice.

But everybody will really start screaming if the Azure account has a monthly spending limit: suddenly services die and are no longer accessible. Azure SQL Server databases are no longer accessible either.

So the company's websites and services go down, data is no longer processed (if it is stored at all, because storing data also costs money), and data transfer over the Azure outbound VPN connections stops working.

Oh yes, you get an automated e-mail telling you that everything has been stopped.


It is a "little problem" for Azure but it might be a big problem for your company.

The Azure team sometimes goes so far as to delete big-spending services for you, and once there are credits again you get a notice by mail. Everything is all right again, but you need to redeploy some services.

From Microsoft's point of view this is quite reasonable: the client does not want to spend more than X, that limit has been reached, so we prevent the client's services from exceeding it.

These things are bound to happen and are particularly painful when the power you paid for was not used during the period. Most of the time it is not really the fault of the developer or the person who created the resource that exhausted the company's budget.

Lots of expensive resources in Microsoft Azure have no button to PAUSE or STOP the resource. Some resources (e.g. an HDInsight cluster) you have to delete to stop them. That of course scares users who do not want to lose their data, so they let it run overnight.

I made that mistake myself: I used the Cortana Intelligence Gallery to create the Songs example.
Really nice stuff in there: SQL Data Warehouse and a Spark HDInsight cluster. Cool! The whole environment was created in 15 minutes, but when I checked it, it would not run because it contained an error. Great, but not really. I wanted to debug it in the Azure Data Factory Editor in Firefox, but Firefox is not yet supported and shows an empty screen. The solution did not work, I could not find any STOP or PAUSE button (Microsoft did mention that as a positive point of the Azure Data Warehouse service), and it did not seem to be spending much money, so I decided to look at the problem on Monday.

How foolish I was! Over the weekend the solution seemed to come alive and ate up my Azure budget for the month. On Monday morning I got the e-mail notice that my account was exhausted and all services had been stopped.

The Data Warehouse used almost 80 €.

And the HDInsight Spark cluster with two data nodes used 92 €.

Both doing nothing. The Azure pricing forecast scared me even more: if I continued to use the application without the spending limit, I would spend 342 € by the end of the week, and according to the forecast the costs would rise exponentially.

For a large company this is of course peanuts and will be brushed under the carpet. Ordering a 2000 DTU Azure Stretch Database service at € 31.370,76 per month and then forgetting about it is a bit harder to brush under the carpet, even for a large company.
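
To put the two amounts side by side, here is a rough back-of-the-envelope sketch. It assumes the monthly price is spread evenly over a 31-day month (744 hours); that is my assumption for illustration, not something taken from the invoice, and the "almost 80 €" is rounded up to 80.

```python
# Rough comparison of the amounts quoted in this post.
# Assumption (not from the invoice): the monthly price is spread evenly
# over a 31-day month, i.e. 744 billable hours.

MONTHLY_PRICE_2000_DTU_EUR = 31370.76  # quoted in this post for 2000 DTU
DATA_WAREHOUSE_EUR = 80.0              # "almost 80 EUR", rounded up
HDINSIGHT_EUR = 92.0                   # Spark HDInsight cluster

DAYS_IN_MONTH = 31                     # assumption
HOURS_IN_MONTH = DAYS_IN_MONTH * 24


def per_day(monthly_eur: float) -> float:
    """Average cost per day for a flat monthly price."""
    return monthly_eur / DAYS_IN_MONTH


def per_hour(monthly_eur: float) -> float:
    """Average cost per hour for a flat monthly price."""
    return monthly_eur / HOURS_IN_MONTH


weekend_total = DATA_WAREHOUSE_EUR + HDINSIGHT_EUR
daily = per_day(MONTHLY_PRICE_2000_DTU_EUR)
hourly = per_hour(MONTHLY_PRICE_2000_DTU_EUR)

# How many hours of the 2000 DTU service cost as much as my whole weekend?
hours_to_match_weekend = weekend_total / hourly

print(f"Weekend surprise:        {weekend_total:.2f} EUR")
print(f"2000 DTU per day:        {daily:.2f} EUR")
print(f"2000 DTU per hour:       {hourly:.2f} EUR")
print(f"Hours to match weekend:  {hours_to_match_weekend:.1f}")
```

Under these assumptions, a forgotten 2000 DTU service burns through the equivalent of my whole weekend bill in roughly four hours.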

In my opinion Microsoft Azure does not yet provide sufficient tools for accountants to manage company budgets. But I am sure they will come.

Sure, as an Azure account owner you can see what is going on and take action. But if you are an accountant who owns and manages the Azure account, you don't want to delete a Spark cluster just like that, because you have no idea whether it is important. If it is, and you deleted it to save money, and the whole production environment crashes, then it is the guy behind the ledger and the glasses who did it, and the IT guys will be coming for you.

What the Azure team needs to implement is a way to STOP or PAUSE all big-spending services, like Azure Data Warehouse or an HDInsight cluster.

For large companies it is essential to separate accounts and account rights in a clearly defined way. For example, a global accountant can see spending services and propose to stop or pause them, with an IT manager's consent.
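
The separation of rights described above could look something like the following minimal sketch. This is not an Azure feature or API; the user names, role names and functions are all hypothetical and only illustrate the "accountant proposes, IT manager decides" idea.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PROPOSED = "proposed"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class PauseProposal:
    """A request to pause one resource, waiting for an IT manager's decision."""
    resource: str
    proposed_by: str
    reason: str
    status: Status = Status.PROPOSED
    decided_by: str = ""


# Hypothetical users and roles, for illustration only.
ROLES = {"alice": "accountant", "bob": "it_manager"}


def propose_pause(user: str, resource: str, reason: str) -> PauseProposal:
    """Only an accountant may propose pausing a resource."""
    if ROLES.get(user) != "accountant":
        raise PermissionError(f"{user} may not propose a pause")
    return PauseProposal(resource=resource, proposed_by=user, reason=reason)


def decide(user: str, proposal: PauseProposal, approve: bool) -> PauseProposal:
    """Only an IT manager may approve or reject a proposal."""
    if ROLES.get(user) != "it_manager":
        raise PermissionError(f"{user} may not decide on a pause")
    proposal.status = Status.APPROVED if approve else Status.REJECTED
    proposal.decided_by = user
    return proposal


proposal = propose_pause("alice", "spark-cluster", "idle over the weekend")
decide("bob", proposal, approve=True)
print(proposal.resource, proposal.status.value, "by", proposal.decided_by)
```

The point of the design is that the accountant never stops anything directly: the cluster is only paused once someone who knows whether it matters has agreed.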

I advise using separate Azure accounts per project to clearly identify budgets and spending. Each project needs separate Azure accounts for Production, Development and, if required, Pre-Production/Test.

Having a spending limit on Production is dangerous because it can stop all services when the account is exhausted. Resources need to be managed continuously to prevent overspending. With separate accounts per project and per environment, a spending problem in Development for one project stops only that project's Development resources, not those of all projects or even Production, which could be disastrous for a company.
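
A small simulation shows why this separation helps. It is only a sketch: the project names and the 150 € limit are made up, and the `Account` class is a stand-in for a real Azure account, not a model of how Azure bills. The two charges are the weekend amounts from my own mishap.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Account:
    """One hypothetical account per project and environment."""
    project: str
    environment: str
    spending_limit: Optional[float]  # None means no spending limit
    spent: float = 0.0
    running: bool = True

    def charge(self, amount: float) -> None:
        """Add a cost; stop the account once the limit is reached."""
        if not self.running:
            return
        self.spent += amount
        if self.spending_limit is not None and self.spent >= self.spending_limit:
            self.running = False


# Hypothetical projects; Production has no limit, Development does.
accounts = {
    ("songs", "production"): Account("songs", "production", None),
    ("songs", "development"): Account("songs", "development", 150.0),
    ("billing", "development"): Account("billing", "development", 150.0),
}

# The weekend costs from this post land on one Development account only.
accounts[("songs", "development")].charge(80.0)  # Data Warehouse
accounts[("songs", "development")].charge(92.0)  # HDInsight Spark cluster

stopped = [key for key, account in accounts.items() if not account.running]
print("Stopped:", stopped)
```

Only the Development account of the one project is stopped; Production and the other project keep running. Production still has to be watched continuously, since without a limit nothing stops the spending there.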

PS: What is very annoying about managing an Azure account is that the portal is so incredibly slow. It certainly needs a refresh to speed up its performance.
