Introducing the Azure SQL Database DTU Calculator

The following is the first in a series of posts about the Azure SQL Database DTU Calculator.  This post describes the need for the calculator and the process of creating it.

As part of my job at Microsoft, I get the opportunity to work closely with companies across the US.  My particular area of focus is Azure so I spend most of my days talking about a company’s business model, architecture, IT costs, and other technical issues to help them identify areas where Azure can help as well as provide them with a roadmap for migration.  During one such meeting about 6 months ago, I was talking to a company about Azure SQL Database as a potential option for migrating their RackSpace SQL Server to Azure.  If you’re familiar with Azure SQL Database, then you’re probably familiar with the term database throughput units (DTUs).  If not, a DTU is something the Azure SQL team created to describe the relative capacity of a performance level across a blended measure of CPU, memory, reads, and writes.  The article, Azure SQL Database Service Tiers and Performance Levels, contains a more thorough explanation.  Anyway, during that discussion one of the company’s architects asked, “how do I know how many DTUs my database needs.” 

Seems like a reasonable question and at the time I didn’t have a great answer.  I’d read the guidance, Azure SQL Database introduces new near real-time performance metrics, so I was familiar with the process of starting with feature capability, picking the lowest performance level, migrating the database, and then adjusting based on utilization.  However, I could see the concern this caused for the engineers.  Migrating a database from location A to location B isn’t a trivial task and doing so with any level of ambiguity isn’t something any engineer would recommend.  It was at that time I reached out to a friend on the Azure SQL Database team for help.  The conversation went something like this (names have been changed to protect the innocent):

Justin: 

Hey John.  Quick question… Do you have anything to do with or know anything about SQL Database DTUs?

John: 

You bet.

Justin: 

Cool.  In your opinion, would it be possible to write a DTU Calculator which a developer could run against their on-premise database to see which tier of SQL Database they would map to?

John: 

We’ve discussed doing such a thing, and I think it’s a good idea, just haven’t had the time to do it yet.  Feel free to send me email, and I’ll get the right conversation going.

A calculator is born…

One of my main objectives when developing the calculator was to keep it simple – something engineers at any level of expertise could easily use.  With that in mind, I spent the last several months working with the Azure SQL team to understand all the complexities of converting SQL Server performance to database throughput units and developing a website to wrap those complexities in a simple, intuitive interface which can analyze your production database workload and accurately calculate the service tier and performance level needed in Azure SQL Database – all before migration.  I’m happy to announce the Azure SQL Database DTU Calculator does just that!  The calculator is available to use but I’d love to hear your feedback and results.  As I mentioned earlier, this post is just an introduction.  Look for my next post to go into more detail about using the calculator and interpreting the results.

Advertisements

144 thoughts on “Introducing the Azure SQL Database DTU Calculator

  1. Hi Justin,

    Thanks for this tool! I’ve been trying to upload my .csv file, but when I do I am greeted with the ‘Error! An error occurred while processing your request.’ message after clicking ‘Calculate’

    I’ve tried using both the standard and elastic calculators, and get the same error from both section.

    Can you provide any guidance for this issue?

  2. hey Justin, seem to be having the same issue currently have several DB’s sitting on a standalone SQL box running w2012 server and looking to move up to the cloud

  3. Hi Justin,

    Thanks for this tool… Would love your thoughts on my scenario. Currently I’ve about 40 SQL Service instances on a single server with 1 core, which I’m considering moving to an elastic pool.

    For a single “average” DB from those 40, I ran your tool and got
    CPU : 22% basic; 78% standard
    IOPS: 84% basic, 12% standard, 4% premium (recommends 439 total)
    Log: 99% basic

    What I’m trying to work out, is how those numbers multiple up for an elastic pool. Would you be able to advise?

    Thanks,
    Chris

    1. CPU and IOPS are captured at server level – not DB – so they should map directly to an Elastic Pool. Log is the only metric that’s DB specific.

      Did you run the calculator using the Elastic Pools calculation or the Single Database? They both perform the same calculation but the Elastic Pools calculation will recommend the number of eDTUs needed. Based on your comment, “recommends 439 total” it looks like you ran the Elastic calculation. The result of which indicates you need a Standard Elastic Pool with somewhere between 400 and 800 (439) eDTUs.

  4. I’m having problems running the PowerShell script on a clustered environment – i.e. running on the active node of a failover cluster. I’m getting an error that the \SqlServer:Databases(_Total)\:Log Bytes Flushed/sec counter isn’t available.

    I’m running the script as Administrator, and also have tried changing SqlServer to my instance name – however no joy. Is there a way I can get this counter in this environment?

      1. Hi Justin,
        Unfortunately the command line utility won’t run due to .NET framework version problems.

        The error message from the PowerShell script is pretty good – the counter isn’t available. If I run Get-Counter from the command line, this counter isn’t displayed. I think I just need to reference the SQL Server in the correct way to get at the counter – but that’s what is eluding me!

      2. Hi Justin,
        Digging around to see which counters there are available, running:
        Get-Counter -ListSet “*Database*”
        Shows that we have
        \Cluster Database\Flushes
        and
        \Cluster Database\Flushes Delta

        Both of which are close, but not – I think – what we want.

  5. Hello Justin

    I am having issues running the calculator. It is the last counter that is causing an issue for me, the one that interrogates the Databases. When I run the PS1 script with the counter name I get from perform I get the below.

    Collecting counters…
    Press Ctrl+C to exit.
    The specified object was not found on the computer.
    At C:\Users\username\Desktop\sql-perfmon.ps1:43 char:12
    + Get-Counter <<<< -Counter $counters -SampleInterval 1 -MaxSamples 3600 |
    + CategoryInfo : InvalidResult: (:) [Get-Counter], Exception
    + FullyQualifiedErrorId : CounterApiError,Microsoft.PowerShell.Commands.GetCounterCommand

    Within the script I have replaced SQLServer with our counter name MSSQL$Instance1, it does show in capitals in perform but I have used the case as shown.

    I have tried variations of the SQL server replacement. I have tried the actual server name\Instance 1, I have tried just the server name, I have tried it without changing anything but nothing works. I have been in touch with Microsoft Azure support and they have been helpful but only to advise that the script works.

    I ran the CLI copy and that outputs info for the other counters but for the database I get this message – SQLServer:Databases doesn't exist. Try running perfmon.exe to identify the correct SQLServer:Databases category.

    Which I would expect but I don't know how to edit the exe and the code within the change it to out instances/SQL server name.

    can you help? It would be greatly appreciated

      1. Hello Justin

        Just to let you know I’ve figured out what was going wrong.

        There were two issues.

        The first issue was that the counter that looked at log bytes flushed needed my specific counter details in. So what you get in the script file is \SQLServer:Databases(_Total)\Log Bytes Flushed/sec but what I had to change it to was MSSQL$Instance1:Databases(_Total)\Log Bytes Flushed/sec. I found the exact counter name in the perfmon tool.

        The second issue was to do with the quotation marks used on the script. Rather than the ” between each of the counters I had to change it to ‘.

        Original counter part of script
        $counters = @(“\Processor(_Total)\% Processor Time”,
        “\LogicalDisk(_Total)\Disk Reads/sec”,
        “\LogicalDisk(_Total)\Disk Writes/sec”,
        “\SQLServer:Databases(_Total)\Log Bytes Flushed/sec”)

        Changed counter part of script

        $counters = @(‘\Processor(_Total)\% Processor Time’,
        ‘\LogicalDisk(_Total)\Disk Reads/sec’,
        ‘\LogicalDisk(_Total)\Disk Writes/sec’,
        ‘\MSSQL$Instance1:Databases(_Total)\Log Bytes Flushed/sec’)

        I am now collecting the counter information fine now.

        Thanks for the help and the speedy response.

  6. Justin,

    I am looking at this from the perspective that i run a SQL Server currently with about 30 DB’s for various sites, but am looking at moving just one DB for one site on to azure.

    Currently from a server point of view its recommending an S3/4 depending on when i collect the data using the powershell script.

    If i look at the performance stats on SQL server, i can see this is not the busiest database on that given box its about 7th out of 30 in terms of the resources that particular SQL server.

    Is there a tool that would gather stats based on a single database on a given server?

  7. justin – what’s the issue with calculator giving the “Error! An error occurred while processing your request.” whenever i try to load the csv file created by the tool?

    1. Is it possible that you are using Regional Settings other than en-US (datetime format differences) or that you have made several runs with the collector (the collector does not delete the old file, it only keeps appending and then you get a header row in the middle of the file)?

      To solve the Regional Settings problem, I have made a pull request for hardcoding the csv output to en-US on

      https://github.com/jhenriks79/Dtu-Calculator-Perf-Collector/pull/1

      To avoid getting multiple header rows, you need to manually delete the old csv file before making another run with the collector.

  8. The website crashes in all manner of ways if i try to upload my files. To be fair, I expected that, as they’re quite big (for all servers – 1.24 gb). To be able to calculate correctly I had to monitor an entire month, as we have slow periods and very active periods.

    Do you have any suggestion on how to actually be able to get my results? Maybe provide a command line tool to also process the results and upload the processed results for the estimation? Or a tool to simplify the data in a correct way?

  9. Hi, Justin. I’m trying to use the tools, however if I try to load the file, generated by the utility (I’ve fixed headers there) – I get a “Error! An error occurred while processing your request” message. Could you please advise – what am I doing wrong?

    1. P.S. I’ve verified the file structure – there’re no duplicates. It has US locale, I just keep receiving this error

  10. Hi Justin,

    I’m trying to run the script available on http://dtucalculator.azurewebsites.net/ but everytime I get this execution error:

    Collecting counters…
    Press Ctrl+C to exit.
    Get-Counter : Não é possível localizar um parâmetro posicional que aceite o argumento ‘1’.
    No C:\sql-perfmon.ps1:43 caractere:1
    + Get-Counter $counters 1 -MaxSamples 3600 |
    + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo : InvalidArgument: (:) [Get-Counter], ParentContainsErrorRecordException
    + FullyQualifiedErrorId : PositionalParameterNotFound,Microsoft.PowerShell.Commands.GetCounterCommand

    My server language is pt_br I don’t know if that can be causing this fail.

    Can you help me?

    Thanks a lot.
    Luiz

    1. It looks like your OS language is Portuguese. Without modifying the PowerShell script or the command-line utility, it won’t find the counters. Can you change the counters in the script to be localized?

  11. Hi Justin,

    Thanks for providing this tool. It’s fantastic!! I’ve to use this tool against 500+ environments over a longish sample time. I could use a couple of pointer please if you have time.

    I’m sampling the counter values every 10 second for 7 weeks to get a good reflection of instance load to account for weekly jobs/reports etc. These will be stored locally so I can do mean, median and max rollups of disk reads, disk writes, log bytes flushed and processor time. These will form 3 seperate JSON requests to the API. Is the rollup idea a good approach or is there an easier/better way to interact with the API with all sample values collected?

    I’m also running into a funny issue with the JSON response I get from the following powershell code :

    $MaxDiskReads = 6
    $MaxDiskWrites = 11
    $MaxLogBytesFlushed = 207
    $MaxProcessorTime = 4

    $DTURequest = @{}

    $DTURequest.Add(‘DiskReads’, $MaxDiskReads)
    $DTURequest.Add(‘DiskWrites’, $MaxDiskWrites)
    $DTURequest.Add(‘LogBytesFlushed’, $MaxLogBytesFlushed)
    $DTURequest.Add(‘ProcessorTime’, $MaxProcessorTime)

    $Body = $DTURequest | ConvertTo-Json

    $Response = Invoke-WebRequest -uri ‘http://dtucalculator.azurewebsites.net/api/calculate?cores=2’ -Method POST -Body $Body

    write-host $Response

    The response is as follows, essentially empty/zeroed out ::

    “{“Recommendations”:[{“Metric”:”CPU”,”ServiceTier”:””,”ServiceTierCoveragePct”:0.0},{“Metric”:”Iops”,”ServiceTier”:””,”ServiceTierCoveragePct”:0.0}”

    I’ve tried generating the JSON request in a couple of different ways, from the hash table and as a long string with the values put in by hand, and I always get the same result even though the request is the same when printed out to the host as what I input into the tester app. Do you have any suggestions please?

    Apologies for the long post 🙂

    Chris

    1. Hey Chris,
      Regarding the approach, using max for aggregation would be the best approach. With the DTU Calculator, you’re essentially looking for what’s the maximum DTU (Service Tier/Performance Level) needed.

      Regarding PowerShell, you need to convert to Json like this:
      $Body = ConvertTo-Json @($DTURequest)

      And add the -ContentType parameter to the request:
      $Response = Invoke-WebRequest -uri ‘http://dtucalculator.azurewebsites.net/api/calculate?cores=2’ -Method POST -Body $Body -ContentType ‘application/json’

      1. Hey Justin,

        Thanks for the suggestions here. I’ve implemented the powershell sugesstion and it works perfectly.

        I’ve gathered the sample data and taken the max values. From 34k counter samples taken over 7 days the max values seem to be complete outliers. For example, in the case of DiskWrites, the second largest counter value is 40% of the largest…for LogBytesFlushed the third largest value is 50% of the largest. I don’t think this would represent a realistic higher workload. I think I would have the Premium P6 suggested for every instance. There’s 2 ways I can think of…

        1. Gathering the mean and going 2 standard deviations from the mean for each counter and using that value. From the current set this would only exclude the top 1 percentile (approx 300 values).

        2. Simply ignore the first 10 values from a set that size. This would exclude 100 seconds over the course of 1 week and most likely get rid of the more “crazy” values.

        Which is a better approach?

        Thanks again 🙂

        Chris

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s