When it comes to Kubernetes and cloud-native applications, day two operational pain is real. It’s what happens when the idea of developing applications for the cloud meets the reality of production. It’s where the rubber meets the road, so to speak, although for today’s developers it’s more like where the unpredictable meets the potentially unwelcome surprise.
That sure doesn’t sound like the best way to build and deliver capabilities that meet high-performance goals, does it? So why are so many engineering and development teams still forced to operate in wait-and-see mode even as performance demands have become vital to business?
Unfortunately, it’s because these teams haven’t had a way to accurately, easily, and cost-effectively predict in a non-prod environment how an application will perform when it’s in the hands of actual users. Eventually, something will happen in production that you didn’t or couldn’t account for, and that sends teams chasing ways to manually tune applications and the environment for optimal performance.
Too often, however, that optimization process means different things to different engineers. And that’s why each team member charged with finding the right answer will attempt to optimize based on what they think will make a difference. Typically, that means trying whatever they are most comfortable changing: some engineers will first increase CPU or memory, while others will reach for a different tweak that makes sense to them. They may make educated guesses based on years of experience or build on recent learnings, but they are best guesses nonetheless.
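To make that concrete, here is a minimal sketch of what one of those intuition-driven tweaks often looks like in practice, using the official Kubernetes Python client to bump a Deployment’s CPU and memory. The deployment name, namespace, container name, and resource values are all hypothetical placeholders, not anything prescribed by StormForge:

    from kubernetes import client, config

    # Load kubeconfig and talk to the cluster (assumes local kubectl access).
    config.load_kube_config()
    apps = client.AppsV1Api()

    # A typical "best guess" tweak: raise CPU/memory and hope it helps.
    # "my-app", "default", and "api" are hypothetical placeholders.
    patch = {
        "spec": {
            "template": {
                "spec": {
                    "containers": [{
                        "name": "api",
                        "resources": {
                            "requests": {"cpu": "500m", "memory": "512Mi"},
                            "limits": {"cpu": "1", "memory": "1Gi"},
                        },
                    }]
                }
            }
        }
    }

    apps.patch_namespaced_deployment(name="my-app", namespace="default", body=patch)

Each engineer runs some variation of this, watches the dashboards, and moves on to the next guess.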
While some tweaks may have a positive impact, there’s also the likelihood that they will work against one another – and with thousands of potential changes to experiment with, there’s no way engineers can possibly try all of them to discover what truly optimizes performance and cost. Why? Because manual tuning is resource-intensive and inefficient, and because, once again, you don’t know what you don’t know.
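A little back-of-the-envelope arithmetic shows why trying everything is hopeless. Even with deliberately modest, hypothetical numbers for a handful of CPU, memory, and replica options across a few interacting services, the space runs into the trillions:

    # Illustrative arithmetic only; all counts below are hypothetical.
    cpu_options = 10       # e.g., 100m to 2000m in coarse steps
    memory_options = 10    # e.g., 128Mi to 4Gi
    replica_options = 5    # 1 to 5 replicas
    services = 5           # microservices that interact with one another

    per_service = cpu_options * memory_options * replica_options
    total = per_service ** services  # settings interact, so combinations multiply

    print(f"{per_service} configurations per service")      # 500
    print(f"{total:.1e} configurations for the whole app")  # 3.1e+13

And that is before counting application-level knobs like JVM heap sizes, connection pools, or garbage-collection settings.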
Here’s where an optimization platform driven by machine learning can help. By automating experimentation-based optimization in pre-production and observation-based optimization in production, organizations can run experiments and tweak applications to get as close to their target as possible prior to launch. They can then observe and validate applications in production for continuous verification and improvement.
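To illustrate the experimentation half of that idea – and only to illustrate it, since this is a toy random search rather than StormForge’s actual machine learning – here is a sketch of an automated tuning loop. The run_load_test function is a hypothetical stand-in for a real load-test harness running against a non-prod environment:

    import random

    # The knobs to explore; values are hypothetical examples.
    SEARCH_SPACE = {
        "cpu_millicores": [250, 500, 1000, 2000],
        "memory_mib": [256, 512, 1024, 2048],
        "replicas": [1, 2, 3, 4, 5],
    }

    def sample_config():
        return {k: random.choice(v) for k, v in SEARCH_SPACE.items()}

    def run_load_test(cfg):
        # Hypothetical stand-in: deploy cfg to a non-prod cluster, drive
        # load, and return (p99 latency in ms, cost per hour in dollars).
        # Faked here so the sketch runs end to end.
        p99 = 400.0 * 250 / cfg["cpu_millicores"] / cfg["replicas"] + random.uniform(0, 20)
        cost = (cfg["cpu_millicores"] * 0.00004 + cfg["memory_mib"] * 0.000005) * cfg["replicas"]
        return p99, cost

    def objective(p99_ms, cost_per_hour, slo_ms=200):
        # Heavily penalize SLO violations; otherwise minimize cost.
        return cost_per_hour + 10 * max(0.0, p99_ms - slo_ms)

    best_score, best_cfg = float("inf"), None
    for _ in range(20):  # 20 structured experiments instead of blind guessing
        cfg = sample_config()
        p99, cost = run_load_test(cfg)
        score = objective(p99, cost)
        if score < best_score:
            best_score, best_cfg = score, cfg

    print("best configuration found:", best_cfg)

A real optimization platform replaces the random sampling with machine learning that learns from each experiment, and pairs it with production observation to keep validating the result after launch.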
When you can monitor, observe, and experiment by leveraging machine learning, you can automate and optimize intelligently at scale – because you know what you know. And when it comes to optimizing apps and environments, that’s a much better place to be.
Get Started with StormForge
Try StormForge for FREE, and start optimizing your Kubernetes environment now.
About the Author
Patrick Bergstrom
Patrick Bergstrom is Chief Technology Officer at StormForge, where he is responsible for product strategy and for delivering innovation to StormForge customers.
Bergstrom was most recently Vice President, Site Reliability Engineering & Software Engineering, Enterprise Operations at UnitedHealth Group, where he created a globally distributed organization responsible for the processes and tools that support distributed applications using modern DevOps techniques and best practices. He also led the creation of the Site Reliability Engineering category in BestBuy.com’s Web Operations group and introduced modern strategies around data collection, application monitoring, alerting, and incident management and response. Bergstrom started his career in the Army National Guard as an Avionic Systems Technician, where he gained his first insight into system reliability at scale while working on U.S. Army airframes during his 12 years of service.
He is based in Ames, Iowa, where he lives with his wife and enjoys woodworking. He also serves as Board Chair of the Economic Vitality Committee as part of Downtown Ames & Ames Chamber of Commerce, and he holds a Bachelor’s Degree from Iowa State University.