6 Takeaways from Thought Leaders at Amazon, CNCF and Dell

The cloud gives us a plethora of advantages compared to the bare metal age. But when listening closely to what Adrian Cockcroft (Amazon), Cheryl Hung (CNCF) and Eric Mulartrick (Dell Boomi) shared during this year’s Cloud Waste Panel, it appears organizations across industries are not leveraging the flexibility of the cloud to its full potential.

Fortunately, our panelists shared helpful pointers, stories and unique insights on how to reduce cloud costs and optimize resource utilization. In short: how to cut cloud waste and run cloud native applications more sustainably.

In the following I will try to distill the main takeaways, but please consider watching the full Cloud Waste Panel 2021 recording.

1. Leverage the “on demand” in on demand

The first takeaway might sound obvious, even a no-brainer, but as Adrian Cockcroft pointed out, the flexibility offered by modern cloud providers such as AWS is not used to its full extent. Why is this? For decades, companies had to plan and estimate the computing resources an application would need years ahead, and some of this thinking might still be around.

Today, computing resources are billed by the minute, or even by the millisecond in the case of AWS Lambda functions. Yet microservices and whole applications are still over-provisioned, and infrastructure that is not needed during certain periods simply keeps running. Do you really need your development or testing environment after people have left the (home) office?
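
To put a rough number on that question, here is a minimal Python sketch. It assumes an environment that is billed per hour and is only needed during working hours; the hourly cost and the working schedule are hypothetical values, not figures from the panel.

```python
HOURS_PER_WEEK = 7 * 24  # 168 hours in a week


def idle_fraction(hours_per_day: float, days_per_week: int) -> float:
    """Share of the week an always-on environment runs without being used."""
    used_hours = hours_per_day * days_per_week
    return 1 - used_hours / HOURS_PER_WEEK


def weekly_saving(hourly_cost: float, hours_per_day: float, days_per_week: int) -> float:
    """Cost avoided per week by stopping the environment outside working hours."""
    return hourly_cost * HOURS_PER_WEEK * idle_fraction(hours_per_day, days_per_week)


# Hypothetical example: a test environment that costs 2.00 per hour
# and is needed 10 hours a day, 5 days a week.
print(f"idle share of the week: {idle_fraction(10, 5):.0%}")
print(f"avoidable cost per week: {weekly_saving(2.00, 10, 5):.2f}")
```

Under these assumptions the environment sits idle for about 70 percent of the week, which is the share of its bill that a simple off-hours shutdown could avoid.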

2. See. Analyze. Act.

As the “old” saying (in pop-cultural terms) from HBO’s The Newsroom goes: “The first step in solving a problem is recognizing there is one.” All panelists agreed that the very first step in reducing cloud waste is to get a clear picture of how resources are provisioned and how much of them is actually used. This step is not about chargeback; it is an exploratory way to identify inefficiencies. Eric Mulartrick, FinOps Lead at Dell Boomi and member of the FinOps Foundation, mentioned that actual utilization is often around 30%, and that raising this number to 60-70% can be an interesting, cross-functional challenge.

It becomes much easier to optimize something if you have a metric to optimize against. Every application is different. For example, an ecommerce application might optimize against the average resource usage per purchase, while Airbnb might optimize against the average resource usage per tenant.
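
The arithmetic behind these two ideas is simple enough to sketch in Python. The function names and all numbers below are hypothetical illustrations; only the 30% and 60% utilization figures echo the range mentioned by the panel.

```python
def utilization(used: float, provisioned: float) -> float:
    """Fraction of provisioned capacity that is actually used."""
    return used / provisioned


def capacity_reduction(current_util: float, target_util: float) -> float:
    """Fraction of provisioned capacity that could be released if the same
    workload were served at the target utilization."""
    return 1 - current_util / target_util


def usage_per_unit(resource_hours: float, business_units: int) -> float:
    """Unit metric: resource usage per business event (e.g. per purchase)."""
    return resource_hours / business_units


# Hypothetical cluster: 30 of 100 provisioned vCPUs are busy on average.
current = utilization(30, 100)
print(f"current utilization: {current:.0%}")
print(f"capacity freed at 60% utilization: {capacity_reduction(current, 0.60):.0%}")

# Hypothetical shop: 1200 vCPU-hours consumed while serving 4000 purchases.
print(f"vCPU-hours per purchase: {usage_per_unit(1200, 4000):.2f}")
```

Moving from 30% to 60% utilization means the same workload fits on half the provisioned capacity, and the per-purchase figure gives teams a number they can track from release to release.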

3. Let’s not waste a Tesla

When you leave your house, do you turn off the lights, the heater and the TV? It helps to think about your cloud resources the same way and to visualize the waste. Do you really need the same resources for your application overnight and on weekends?

The cloud native way opens up many possibilities to automate and optimize the resource utilization of your applications. Once you have aligned provisioning, utilization and the cost incurred, it might be helpful to put the resulting number into perspective. Could you save a Tesla per month, or even two? Looking at the savings potential from this perspective would definitely motivate me more to optimize than looking at a bare dollar figure.

4. Your intuition is probably wrong

You might have guessed it, but the headline above is a direct quote from Adrian Cockcroft (VP of Sustainability Architecture at Amazon). He was pointing out something he has observed throughout his career and in numerous discussions with customers: the huge gap between the cloud bill and the ability to attribute those costs directly to an application and its features. What is the attribution rate of your cloud bill? Do you know which part of your application produces which percentage of it?

The first step one AWS customer took to reduce their waste was to increase their attribution rate from 30% to 90%. Again, it is not about chargeback; it is about making things visible. You might then see that you can shut off your QA system three quarters of the time, or that somebody forgot to turn off a 100-node Hadoop cluster. Oops.
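
As a rough illustration of what an attribution rate measures, the Python sketch below computes it from a list of billing line items. The data shape (a cost plus an optional owning service) and the sample bill are assumptions made for this example, not the export format of any particular cloud provider.

```python
from collections import defaultdict
from typing import Optional

# Hypothetical billing line item: (cost, owning service or None if untagged).
LineItem = tuple[float, Optional[str]]


def attribution_rate(items: list[LineItem]) -> float:
    """Share of the total bill that can be traced to a named service."""
    total = sum(cost for cost, _ in items)
    attributed = sum(cost for cost, service in items if service is not None)
    return attributed / total if total else 0.0


def cost_share_by_service(items: list[LineItem]) -> dict[str, float]:
    """Percentage of the bill per service; untagged spend is grouped together."""
    total = sum(cost for cost, _ in items)
    if not total:
        return {}
    shares: dict[str, float] = defaultdict(float)
    for cost, service in items:
        shares[service or "unattributed"] += cost / total
    return dict(shares)


# Hypothetical monthly bill with one untagged line item.
bill = [(500.0, "checkout"), (300.0, "search"), (200.0, None)]
print(f"attribution rate: {attribution_rate(bill):.0%}")
for service, share in cost_share_by_service(bill).items():
    print(f"{service}: {share:.0%}")
```

Whatever lands in the "unattributed" bucket is the part of the bill nobody can yet explain, and it is usually the first place to look for forgotten resources.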

5. Be ahead of regulatory pressure

This takeaway is more for the managers among us. Regulatory pressure to improve on Environmental, Social, and corporate Governance (ESG) is increasing. The European Union has already implemented many regulations, and plans to extend them, to hold organizations accountable and enforce transparency across industries. Similar standards are planned for the United States. As a result, institutional investors are making ESG standards an integral part of their decision process, both before investing and when evaluating existing investments.

From a corporate standpoint, sustainability thinking is not necessarily brought into the boardroom by employees; rather, it flows out of it. Of course, cloud efficiency is not the only factor that improves an organization’s ESG numbers, but the more your company relies on the cloud, the more impact your efficiency efforts will have.

Adrian Cockcroft made an excellent point during the panel discussion and I will just quote him here:

“The real benefit, I think, of cloud is it helps you transform your product line more quickly to a more sustainable product line. In the same way that we use cloud to speed up digital transformation, we’re using cloud to speed up sustainability transformation.”

Adrian Cockcroft
VP of Sustainability Architecture
Amazon

6. The cool thing about FinOps

At least for me, the term FinOps was a new one. In case you are also not familiar with it, here is a quick definition taken from the FinOps Foundation:

“FinOps is shorthand for Cloud Financial Management. It is the practice of bringing financial accountability to the variable spend model of cloud, enabling distributed teams to make business trade-offs between speed, cost, and quality.”

Bringing visibility and accountability to cloud spend is a crucial part of the journey to cloud efficiency and reducing cloud waste.

Achieving a high attribution rate from cloud spend to individual services or features is the job of a FinOps engineer. It allows organizations to coordinate the cross-functional effort needed to run applications at peak efficiency in different scenarios.

Several tools specialized for cloud native ecosystems already exist and help teams gain insights such as cloud cost per service. CNCF’s sandbox project Cloud Custodian is just one example, backed by 3.5k stargazers and around 300 contributors.

From a developer or DevOps perspective, it can be motivating to first realize that the infrastructure cost of providing a certain feature may limit its profitability, and then to use a tool like StormForge Optimize to turn this ratio around, changing a net-loss feature into a value-adding, sustainable one.

Adrian, Cheryl, and Eric shared many more inspiring thoughts for your journey to cloud efficiency and reducing waste. Feel free to watch the full panel discussion.

I hope this brief summary has helped you navigate the main thoughts discussed during this year’s panel.