Due to fine-grained resource provisioning, which gives customers ultimate pricing flexibility, identifying a ‘flavor’ of cloud can be tough. Much like a coffee shop or rental car agency, the choices are mind-boggling. Let’s consider renting a vehicle versus buying one. When you rent a car you don’t concern yourself with warranty terms and conditions, maintenance, replacing worn parts, or cleaning. All of these things are the rental car company’s responsibility. You are simply paying to use the car for its intended purpose, and the price you pay covers the management of these issues. When we buy a car we not only take on all of these issues, but we also have a tendency to overestimate our needs, because we’ll probably be driving this car for several years and have to guesstimate what our future needs might be. We also tend to drive the car well within its limits, because we don’t want to be constantly having it serviced. Unfortunately, if our needs change, having a child for example, a rental car can simply be swapped out for a different model, but when we own the car it’s a far more time-consuming and expensive undertaking.
The same concept applies to cloud computing. We are renting computing capacity. Our goal should be to drive CPU and memory utilization as high as we can, because it’s the cloud provider’s responsibility to manage the underlying physical resources. If a cloud provider’s servers are running hot, that’s their problem to manage, not yours. If storage is failing or memory needs replacing, once again, that’s their problem, not yours.
When we look at cloud computing through a car rental lens, we start to understand why performance means everything. If you’re renting a sports car you want to go fast; if you’re renting a minivan you want to cram as many people into it as possible; and if you’re renting a convertible, then obviously you want the top down. Renting gives us this flexibility, not to mention putting the right resource in the right location. (I live in Seattle, what am I going to do with a convertible?)
Infrastructure as a Service (IaaS) is the renting of resources (CPU, memory, network, storage) to service computing needs. While this sounds straightforward, complexities appear almost immediately. For example, how many resources does a given instance require? It might appear that a rudimentary assessment of an existing configuration would suffice, e.g. 2 vCPUs & 4 GB of RAM = A2 (a.k.a. the economy car), but this approach doesn’t factor in requirements like throughput and IOPS (Premium Storage), which really means F2 (the sports car), or a need for higher temporary disk performance, which means D2 (the minivan).
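To make the sizing problem concrete, here is a minimal sketch of profile selection. The profile catalog, capability flags, and figures below are illustrative assumptions, not authoritative Azure specifications; the point is that configuration alone (vCPUs and RAM) lands you in one bucket, while a single extra requirement such as Premium Storage pushes you into a different, pricier one.

```python
# Hypothetical profile catalog: name, vCPUs, RAM (GB), premium storage
# support, fast temporary disk. Values are illustrative, not real limits.
PROFILES = [
    ("A1", 1, 1.75, False, False),  # the economy car
    ("A2", 2, 4.0,  False, False),
    ("D2", 2, 7.0,  False, True),   # the minivan
    ("F2", 2, 4.0,  True,  True),   # the sports car
]

def pick_profile(vcpus, ram_gb, needs_premium=False, needs_fast_temp=False):
    """Return the first profile (catalog is cheapest-first) that satisfies
    every stated requirement, or None if nothing fits."""
    for name, p_cpu, p_ram, premium, fast_temp in PROFILES:
        if (p_cpu >= vcpus and p_ram >= ram_gb
                and (premium or not needs_premium)
                and (fast_temp or not needs_fast_temp)):
            return name
    return None

print(pick_profile(2, 4))                       # configuration alone: "A2"
print(pick_profile(2, 4, needs_premium=True))   # add an IOPS requirement: "F2"
print(pick_profile(2, 4, needs_fast_temp=True)) # add temp-disk needs: "D2"
```

Each additional requirement eliminates cheaper rows from the catalog, which is exactly how a throughput or IOPS need quietly turns an economy-car assessment into a sports-car bill.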
Unfortunately, without Actual Resource Consumption data, all we can do is guesstimate which vehicle we need, resulting in an overestimation of resource needs that unnecessarily inflates cost and potentially leaves us with the wrong set of features.
Why does this typically result in an overestimation of needs?
Everything in life is a trade-off: higher performance means higher expense; lower risk means lower returns. As CPU utilization increases, so too does the risk of latency, forcing a trade-off between lower resource utilization, which drives cost up, and a higher risk of latency, which drives performance down. When building on-prem, host, network, and storage costs are often viewed as sunk costs. If these resources aren’t assigned, they cannot be used; but if they are assigned, they can be accessed if the need ever arises. This over-provisioning then remains in place, and instead of unused resources being recovered as additional VMs are created, newer hardware is acquired and the cycle of waste begins all over again.
What’s driving this complexity?
Cloud providers already have a laser-like focus on optimizing infrastructure utilization. The more VMs they can run on a physical host – or cars they can jam into the parking lot – the greater their Return on Investment (ROI), but this must be balanced against their obligation to provide an agreed-upon minimum level of computing performance, i.e. having the right type of car in the right location at the right time. These performance levels are offered as profiles (A, D, F, G, etc.), which are then further split into resource reservation levels (A1, A2, A3, etc.). By leveraging fine-grained resource provisioning, a.k.a. resource throttling, a cloud provider can run multiple profiles and resource reservation levels on a single host to maximize both the use of their physical resources and their ROI, while continuing to satisfy their computing resource obligations. It’s basically a formula-driven way of figuring out where the resource utilization and latency curves intersect, e.g. how can I park as many different types of cars as efficiently as possible and still be able to drive them?
What is often overlooked, however, is that this functionality is often already being applied to on-prem infrastructures, but under a different name and for completely different reasons. The name many administrators use is ‘resource prioritization’. For example, using resource pool shares, administrators can specify the relative importance of a VM. A production VM may be given a high priority while a development or testing VM may be given a low priority. As you can see, the goals or objectives may differ, but the impact is the same: the resources granted to a VM do not directly correspond to the performance levels it can produce. This throttling/prioritization effectively impacts us on both ends. Does the on-prem environment prioritize some workloads over others, and if so, how does this correlate to the profiles and reservation levels offered in the cloud?
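The shares mechanic is easy to see with a toy calculation. The sketch below assumes a simple proportional-share model of the kind hypervisors use: under full contention, each VM is entitled to its fraction of the total shares. The VM names, share values, and host capacity are made up for illustration.

```python
def effective_mhz(shares, all_shares, host_capacity_mhz):
    """Under full contention, a VM's CPU entitlement is its proportion of
    the total shares assigned on the host (a simplified shares model)."""
    return host_capacity_mhz * shares / sum(all_shares)

# Hypothetical host: 12,000 MHz of CPU, three VMs with admin-assigned shares.
vm_shares = {"prod-db": 2000, "dev-web": 500, "test-ci": 500}
for vm, s in vm_shares.items():
    print(vm, effective_mhz(s, vm_shares.values(), 12000), "MHz")
```

All three VMs may show identical vCPU counts in their configuration, yet under load the production VM is entitled to four times the CPU of either low-priority VM. This is why configuration data alone can’t tell you how much performance a VM actually gets.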
How do we know over provisioning is an issue?
We have 36 billion Actual Resource Consumption data points telling us so. If customers were implementing these types of features on-prem to drive down costs without impacting performance, then at a minimum we would expect average CPU and memory consumption to be in the 40-50 percent range. At present, the average CPU consumption across all Movere customers is 4.62% while memory is tracking at 34.19%, sort of like using a minivan for one person. Planning a cloud migration strategy based upon nothing more than an assessment of the existing on-prem configuration simply won’t work from a cost perspective, as this lack of performance tuning equates to millions of dollars of spending on underutilized hardware and software, and we now run the risk of repeating this in the cloud.
I’ll believe it when I see it…
Using a real-world example: Movere scanned a Domain Controller (DC) and sized it as an F2 profile (the sports car). Why is that? Movere was using the single-threaded compute capacity of the chip it was running on, then factoring in the resources that had been granted to it. Without Actual Resource Consumption data, Movere could only assume the VM had 100% access to the underlying resources, i.e. Movere had no awareness of any fine-grained resource provisioning. The same concept applies to using basic configuration data from an existing on-prem environment: with no awareness of how each instance has been throttled or prioritized, we cannot accurately assess how much performance is truly required. Movere then captured Actual Resource Consumption data from the DC and correctly calculated it as an A1 profile (the economy car). Movere measured the VM’s actual performance and no longer needed to assume the VM had unrestricted access to the resources granted to it by the host.
The CPU, memory, network, and storage utilization data collected by Movere is vitally important if the goal is to help customers move to the cloud in the most cost-effective way possible while still satisfying performance obligations, because this is exactly what cloud providers are already doing.
I’m too risk averse, I’d rather overspend than under-deliver…
Movere presents cloud sizings based on 2 and 3 standard deviations from the mean, giving customers confidence that the sizings Movere recommends will satisfy 95% and 99% of their peak workloads, because they are based on Actual Resource Consumption. So, no problem: we can help you identify which bells and whistles you need, and you can add red paint, just because you can.
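For the curious, here is a minimal sketch of how a mean-plus-standard-deviation sizing target can be computed from raw consumption samples. The sample values are hypothetical, and this is a simplified illustration of the general statistical idea, not Movere’s actual algorithm: for roughly normal data, mean + 2σ covers about 95% of observations and mean + 3σ about 99%.

```python
import statistics

def sizing_target(samples, k):
    """Return the utilization level k standard deviations above the mean,
    i.e. the capacity needed to absorb most observed peaks."""
    return statistics.mean(samples) + k * statistics.pstdev(samples)

# Hypothetical hourly CPU utilization samples (%), mostly idle with one spike.
cpu_pct = [3, 4, 5, 4, 6, 5, 4, 30, 4, 5]

print(round(sizing_target(cpu_pct, 2), 1))  # target covering ~95% of peaks
print(round(sizing_target(cpu_pct, 3), 1))  # target covering ~99% of peaks
```

Note how far both targets sit below the 100% a configuration-only assessment would reserve: sizing to measured peaks, rather than to provisioned capacity, is where the savings come from.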
Movere, powered by Unified Logic, is a SaaS platform that provides a customizable, guided exploration of your IT world. We love talking to you, so please reach out firstname.lastname@example.org !