Hope Is Not A Strategy

Automating Efficient Resource Utilization for SREs

Air Date: June 3, 2021

Rich: Alright, welcome everyone! We’re going to go ahead and get started. Thank you all for attending our webinar today. We’re here to talk about automating efficient resource utilization, specifically for those of you in an SRE role. We titled the webinar “Hope is Not a Strategy” because it’s really about becoming more systematic: building efficiency and optimization into your processes, automating that as much as possible, and getting out of a more reactive approach.

So a couple of housekeeping things before we dive into the content today. If you have questions, please enter them using the Q&A button within the Zoom interface. We are doing a giveaway following the session, and just by attending today’s webinar you have an entry into the drawing, which is for an Amazon gift card. For every question you ask, we’ll give you another entry, because we want to encourage you to ask questions and keep this as engaging as we can, even given the virtual format of today’s webinar. So please, anytime during the webinar, enter questions in the interface. By way of introduction, my name is Rich Bentley and I’m the Senior Director of Product Marketing at StormForge. I’ve been around the industry for quite a while, starting out as a developer but more recently working on the vendor side of things in a product marketing role. I’m based in Ann Arbor, Michigan. With me today is Erwin Daria, who is a Principal Sales Engineer for StormForge. Erwin has also been around the industry for quite a while in a number of different roles, both within enterprise IT organizations and working for vendors. He is based in the Bay Area of California.

So we’re really happy to be with you today to talk about this, so we’re going to start out with a simple question here. What does SRE stand for? I’ve heard a number of different answers to this. I wanted to share some of the more common ones I’ve heard. So is it Simply Restart Everything, or on a related note I’ve also heard Senior Rebooting Engineer, Sleep Rarely Ever is a common one I’ve heard as well. Software Ruins Evening, or probably the worst one of all Seriously Regretting Everything. 

Hopefully that doesn’t describe any of your roles as an SRE, but we all know that working as an SRE is a difficult job. You’re under a lot of pressure, and the role of making sure that systems and applications perform reliably has changed. It used to be very much just keeping the lights on: reacting to issues as they occurred and trying to fix them, which alone was pressure enough. But now you still have to do that, you still have to keep the lights on and make sure things are working well, but at the same time you’ve also got to work on engineering a better light bulb that doesn’t go out, right? So you have both of those parts of your role.

So it’s a really difficult, challenging role, and there’s a lot at stake given digital transformation and its importance to your organization. Hopefully what we’ll share today will give you some guidance on doing it more efficiently. So when we talk about efficiency, what do we actually mean? There are three aspects we think are really important, and it’s really about the trade-offs between them. When people think about efficiency, a lot of times they think just about cost: reducing costs and utilizing resources better. That certainly is an important aspect, but along with that there’s also application performance, right? You can’t reduce costs in a way that negatively impacts application performance. The third leg of the triangle is time and effort: how much time do you have, and how much effort do you need to put in to make sure systems perform well and run at the lowest cost possible? A change to any one of those things can affect the other two. If you want better performance, you’re either going to have to put in more time and effort or spend more money. So when we talk about achieving peak efficiency, we’re really talking about improving the performance of your applications, so that the SLIs you’re looking at, which could be throughput, availability, latency, error rates, and so on, improve, which helps you consistently, or more consistently, meet or exceed your service level objectives. But you want to do that at the lowest possible cost and resource utilization, and with the least amount of effort. That’s really the holy grail of what we’re trying to achieve with peak efficiency.
So if you think about those three things, right, application performance, cost, and then the time and effort, there’s a significant impact by not having that level of efficiency that you’d like to be at. 

So the first is application performance: what is the impact on you and your business? On the business side, depending on what type of application you’re talking about, you’re potentially looking at lost revenue, lower conversion rates, higher abandonment, lower customer satisfaction, damage to brand reputation, all of those things. On the personal side, for those of you working in an SRE role, you’re talking about war rooms where everybody is trying to point the finger at everybody else, slowed transformation, lost productivity. All of those things can happen if you’re not running at peak efficiency. Then when we think about the cost side of things, there’s a huge impact there as well. We did a study, and there have been other studies on this too, where we asked folks how much of their cloud spend they thought was wasted on idle or unused resources. The average estimate that came back was almost half: 48% of cloud spend is wasted, or at least that’s the perception. That’s obviously a huge number depending on how big your cloud bill is; if half of it is wasted, that’s significant to your organization’s bottom line. Last but not least is the time and effort you put into trying to make things efficient. A couple of stats here from a study that D2IQ did, which I thought were really compelling: the first is that 38% of developers and architects say the work they’re doing leaves them feeling burnt out. Maybe even more shocking, over half of developers and architects say that moving to cloud native and building cloud native applications makes them want to find a new job.
So that’s really a huge number, and that’s developers and architects, but I think you could extend it to SREs and others in roles that are under that much pressure and have that much responsibility within their organization. So those are the three challenges and their impact, but let’s get into why it is so difficult to achieve peak efficiency. It sounds easy, right? Reduce cost, make performance better. But when it comes to cloud native applications running on Kubernetes, you hear a lot about the complexity of these applications. The reality is that when you deploy an app on Kubernetes, there are a lot of things you need to configure. You may be looking at defaults that come from software vendors, but at the end of the day it’s your responsibility to configure these applications when you deploy them. The things you have to configure are on the Kubernetes side, CPU and memory requests and limits, and replicas, multiplied by the number of containers that make up your application, but there are also application-specific settings. Depending on what type of application you’re running, there are things like JVM heap size and garbage collection settings, and every one of those choices has an impact on the cost of running the application, the way it performs, and its availability as well. So if you think about the fact that maybe there are 15 or 20 parameters you need to set, the number of different combinations is almost infinite. It becomes a really complicated problem to solve, and trying to do it manually, based just on your own knowledge and experience, is really hard. I would say it’s impossible to do in a way that achieves the most efficient outcome possible.
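To make that concrete, here is a minimal sketch of the kind of per-container configuration being described; the service name, image, and all values are purely illustrative, not recommendations:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-api              # hypothetical service
spec:
  replicas: 3                    # one tunable: how many copies run
  selector:
    matchLabels:
      app: example-api
  template:
    metadata:
      labels:
        app: example-api
    spec:
      containers:
      - name: api
        image: example/api:1.0
        resources:
          requests:              # what the scheduler reserves
            cpu: 500m
            memory: 512Mi
          limits:                # hard ceilings: CPU gets throttled,
            cpu: "1"             # memory overage gets the pod OOM-killed
            memory: 1Gi
        env:
        - name: JAVA_TOOL_OPTIONS          # application-level tunables
          value: "-Xmx768m -XX:+UseG1GC"   # heap size, GC algorithm
```

Even considering only four candidate values for each of ten parameters like these gives 4^10, over a million, possible combinations per container, which is why exhaustive manual search is impractical.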

You’re probably using a number of different tools, techniques, and processes today to try to manage this challenge. Each of these has its place and its value, but they also have drawbacks. So trial and error is really common, right? When you’re deploying an application, you try what you think is the best configuration. Maybe you start with the defaults and see what happens. You deploy it into production, you find out things aren’t working quite as well as you thought, and so you go back and tweak it and change it. But given the number of different parameters and the complexity of these applications, it’s not a great approach, other than for wasting a lot of time and effort.

Performance and load testing tools are really important. They’re great tools for putting load onto a system in pre-production and seeing how it will respond. It’s a great way to get a feel for how things will work under load, but it doesn’t tell you what to do about it, right? It doesn’t tell you how to fix or address any issues that come up. So it’s a good start, but it’s really only a step in the right direction.

Then there are capabilities provided within Kubernetes, like the Horizontal Pod Autoscaler, that help you scale the number of pods up and down, which again is really helpful for managing scale, but even these tools require optimization. They require tuning. Otherwise you still won’t run at the level of efficiency you expect.
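As a sketch of what “even the autoscaler needs tuning” means, here is a minimal Horizontal Pod Autoscaler manifest; the names and numbers are hypothetical, and each highlighted field is itself a choice that trades off cost against performance:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-api
  minReplicas: 2                   # too low risks slow reaction to spikes
  maxReplicas: 10                  # too high risks runaway cost
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70     # the trigger threshold itself needs tuning
```

Note that CPU utilization here is measured relative to each pod’s CPU request, so a poorly sized request skews the autoscaler’s behavior as well; the tunables interact.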

Last but not least, you’ve got monitoring and observability tools, which are really important for monitoring your production environment. The challenge there is that by the time something goes wrong, it’s too late, right? Your users have already experienced the impact. So it puts you into a reactive mode of addressing challenges. Again, all of these are great techniques, but I think there’s a better way we can address this. That’s what we’re going to get into next as I hand it off to Erwin to talk about what we can do about these challenges.

Erwin: Thanks, Rich. So how do we address these challenges, right? What Rich has described is kind of this Shangri-La: the promise of the cloud, the promise of all the tools in the tool chain, and the desired state of the applications and the people that run them. But we’ve already discussed how there are all these challenges, and how those challenges exert all kinds of pressure up and down the stack, on the technology itself, the people that manage everything, etcetera.

We call this the Kubernetes efficiency gap, and this gap has to be crossed in some way. Many people registered on this call, and the title of this webinar itself, are really about the SRE role, right? But many of the processes we still have in place are from the old guard. We know we want to go from the traditional sysadmin approach, where we’re watching alerts and getting escalations, to something that’s more proactive, more programmatic, a lot more automatable. We want to go from manual to more automation. We want to get to proactive: to see problems out in front of us before something tips over in production and impacts our customers. We want to go from dev teams, or DevOps teams, feeling like they’re on an island to empowering them, with collaborative processes that allow us to share information more readily and effectively, and to communicate and provide services back and forth in ways that allow us to integrate better. And we want to know what the risks are before we push these changes, these implementations and applications, back out to prod.

So just recently, and I think this is pretty universal with regard to the SRE mindset, we’ve been told many, many times by many, many organizations that the SRE role is not only meant to spend more than half of its time on automation, getting out of the toil we assigned to the old sysadmin job roles and responsibilities, but that we really want to empower folks like yourselves to automate yourself into your next role, right? We want to provide tools that allow you to do that.

But there’s a weird mix here, right? What you see on the screen is a traditional representation of a CI/CD pipeline. All this conflict, all this friction we’ve been talking about really operates on the right-hand side of it. We push applications, we’ve got a release management mechanism, we deploy them using Kubernetes, we operate them, we install all the telemetry and observability tools on this side, and then we rely on a very archaic cycle of escalations and alerts. Many of you might recognize PagerDuty, not to call out specific brands, but I would imagine many of you, just like me, have a visceral reaction to getting those alerts, right? You get the alert, maybe you’re the domain expert for a given technology, the escalation comes to you, it’s after hours. Here we kind of playfully refer to SRE as Software Ruins Evening, but we’ve all been there, right? At home, trying to spend time with our families, getting the escalation, and then having to drop everything to figure out what’s going on, because we’ve got SLIs and SLOs to meet for our customers.

We log into our observability tools. Here we’ve got an APM looking at the Kubernetes infrastructure, and maybe there’s a runbook, maybe there’s not, but this is just indicative of the complexity we have to deal with every day. We think there’s a better way of handling not only the complexity of the applications, to Rich’s point earlier, but of testing them, and we call it shifting optimization to the left. That means integrating within the CI/CD pipeline, so that testing and optimization happen before we release new code to prod. That’s really what we’re talking about.

If we don’t do this, right? In my leadership roles in the past, it really boiled down to risk mitigation. We want to serve our end-user customers, serve the business, and run all the applications that make up our business, while still mitigating risk. A lot of those risk factors change over time. It used to be just service-level uptime; now it’s also application performance, consistency, etc., and on top of all that, to Rich’s point earlier, we’re adding cost and operational overhead to the mix, right? We believe that by shifting a lot of these optimization tasks to the left, doing them before we release code, not only does a better product hit production, we also mitigate risk, not only for our customers and ourselves as organizations and businesses, but for our staff and the people we rely on to be on call and be those subject matter experts. We want to mitigate the breadth and scope of those escalations.

So let’s get to brass tacks, right? We’re going to talk about the StormForge platform. There are two main components within the platform. The first, which we’ll touch on lightly here, is our performance testing as a service. Many organizations already have some kind of QA function that includes performance testing, but some either don’t have a robust enough QA environment or have problems with performance testing, and we can provide that as a service.

So that’s also part of the StormForge platform. Performance testing as a service is built on Amazon. It allows you to very quickly create testing profiles and then send load to any ingress interface from any of Amazon’s regions. Again, it’s delivered as a service: we spin up the environments, so you don’t have to carve out infrastructure to run your performance tests internally. We can also integrate with a testing suite you already have. So if you have a well-developed testing practice within your organization, we can still use those tests and trigger them from the application optimization framework, which is what I’ll spend most of my time talking about today.

So again, the right-hand side here is where we’re going to spend most of our time talking about the value that it brings: application optimization by StormForge. This is a machine learning-powered rapid experimentation engine, and we run it in a downstream environment. This is super important. We load test an application in a downstream environment, watch the behavior of that application under load, record the metrics, the behavior, and the results, and then feed that to a machine learning engine so it can make recommendations about what settings to change and what parameters to tune. Then we do that iteratively, over and over again, in a closed-loop fashion. I’ll have diagrams to show you right after this.

The intent is to autonomously figure out the optimal configurations for your Kubernetes applications, and then not only flag the high-risk configs, the things you shouldn’t implement because we know that under load they’re going to fall over, but more importantly find the ones that not only surface the trade-offs we talked about earlier, but represent the optimal configuration for the intent that your business needs.

Alright, so super high level, this is a quick diagram. I’m going to work top to bottom, left to right, and describe the StormForge process. On the left-hand side, I mentioned we start with performance load testing. The box on the left-hand side that says “test case” could be StormForge performance testing, or it could be a performance test that you provide. The next box is your app, deployed using your manifest, whatever method that is, Helm charts, config maps, and so on, in your downstream environment. So we’re going to impose load on your application running in your environment. StormForge Optimization is all the stuff on the right-hand side in the gray box. We deploy a controller in your Kubernetes cluster, and that controller will query your APM to understand the response of the application under the load we’ve created. We create a feedback loop that continuously measures the performance of the application, and we can impose specific metrics. We can design an experiment that prioritizes multiple dimensions, so we can say, hey, we care about cost and performance, or we care about latency and throughput. The rapid experimentation engine will continually make recommendations about which changes to make to which parameters, we’ll test them, measure the results, and do that over and over again, then present the results of the experiment to you in the GUI, which I’ll go through when we get to the demo.
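The closed loop just described can be sketched in a few lines. This is not StormForge’s actual API, only the shape of a propose-measure-repeat loop: random sampling stands in for the machine learning engine, and a toy cost/throughput model stands in for a real load test and APM query.

```python
import random

def run_trial(config):
    """Stand-in for deploying the app with `config`, applying load,
    and querying an APM for results (entirely hypothetical numbers)."""
    cost = config["cpu"] * 0.04 + config["memory_mb"] * 0.0001
    throughput = min(config["cpu"] * 120, config["memory_mb"] * 0.5)
    return {"cost": cost, "throughput": throughput}

def optimize(n_trials=50, seed=0):
    """Closed-loop search: propose a config, measure it, keep every result."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        # A real engine proposes configs with machine learning;
        # random sampling stands in for that here.
        config = {"cpu": rng.uniform(0.5, 4.0),
                  "memory_mb": rng.choice([256, 512, 1024, 2048])}
        trials.append((config, run_trial(config)))
    # Toy objective: best throughput per unit cost.
    return max(trials, key=lambda t: t[1]["throughput"] / t[1]["cost"])

best_config, best_metrics = optimize()
print(best_config, best_metrics)
```

The point of the sketch is the structure: each iteration produces a (configuration, metrics) pair, and the full set of pairs, not just the single winner, is what gets visualized and compared afterward.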

Human beings get to choose what the optimal configuration is, because I think it’s really important to understand all the nuances of the trade-offs we have to make when we do this type of testing and optimization. We’ll go through that. That optimal configuration can then be used, and I think this is relevant to this particular webinar, not only as a gate for a secondary test on the next push of a given application, but as a baseline or a threshold. Once we find an optimal configuration that hits specific thresholds, let’s say cost and performance thresholds, we can make sure those keep being met continuously, in that continuous optimization, continuous delivery model.

Alright, with that I’m going to switch over to our demo.

Okay, so I’m going to borrow a quote from my esteemed colleague Brad Ascar here at StormForge: I’m going to show you the cake after it’s been made. What you see on the screen here is our web UI. When you deploy StormForge, there are a number of components. Very simply, there’s a controller that gets deployed in your Kubernetes cluster, and that controller has outbound access to what you see here, basically our machine learning API in the cloud. Primarily you’re going to interface with both the controller in the Kubernetes cluster and this GUI here.

You’ll see, top to bottom, a number of different experiments that we’ve run on various environments in our lab. I’m going to focus on maybe the most common one we see in terms of microservices architectures: the Voting Web App. This is a pretty popular Docker example of what a microservices architecture might look like, and we have a version of it on our GitHub, so those of you that have been working with Docker might recognize it. We’ve got a number of different containers, all running different software, that together make up a Voting Web App. There’s a web UI where you get to vote, cats versus dogs, that information goes to an in-memory database, a worker writes those results to a Postgres database, and then there’s a results page, written in Node.js, that shows you the results of the voting app.

Now when we run our optimization against this app, the machine learning will run a number of trials given the metrics and objectives that we care about. What you see here in this gray grid is what we call the parameter space. It’s really just a 2D representation of the results of the trials we run our application through, along these two axes. In this particular optimization, we’ve prioritized throughput and cost as the two axes. The box in blue is our baseline. When you design an experiment with StormForge Optimize, you get to define all of these things: the parameters to be tuned, which essentially equate to the knobs and buttons the machine learning will tweak and submit to the controller to run another series of tests, as well as the metrics, the things we care about that we want the machine learning to achieve. If we look at the parameters that are part of this particular application manifest, you can see that we’re addressing not just a single layer, but an array of application tunables. We’ve got DB CPU, DB Memory, Redis CPU, Redis Memory, and so on and so forth.

We’ve got about 10 parameters here. Now the system has gone through and made a bunch of changes, and you can see the results of each of those iterations or trials plotted out here; these are all the dots in the graph. I can then filter all those dots down to the optimal configurations. The optimal configurations are essentially the signal from the noise: we’ve used machine learning to figure out the various effects of all the parameter tuning we’ve done, and these are the ones that most tightly align with both throughput and cost. Before I click on one, I should note that the box in orange is the exact middle point of the optimization. This is as far as you can go in one direction without giving up the other dimension. In this particular case, you can see what we’ve been able to do by tuning the parameters, with the net result of the tuning shown down below: achieve a throughput that’s about 12 percent less than we were getting from baseline, but at a 72 percent savings in cost.

Now something to note, and I think this is really important as we begin to use machine learning in a lot of these operations: I do believe it’s important to have a human in the loop here, because if you’re just optimizing toward the most simplistic target, just saving cost or just maximizing performance, you don’t understand the landscape of trade-offs you could potentially be making. One of the things we really want to communicate here at StormForge is that this visualization, this graph, allows you not only to see what the machine learning believes is the optimal configuration; you can click on other configurations and figure out where those trade-offs actually come from. So for instance, maybe I do want more throughput. There are several configurations here where we can increase throughput while still staying below that baseline cost, right? So if I’m looking at this plane here, I can pick trial number 63: I was able to increase throughput by two percent and still maintain a 60 percent cost savings.
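Conceptually, the “signal from the noise” filter amounts to computing a Pareto frontier: the trials where you cannot gain on one axis without giving something up on the other. A small sketch, with made-up trial numbers loosely echoing the percentages mentioned here:

```python
def pareto_frontier(trials):
    """Keep only non-dominated trials: a trial is dominated when another
    trial has throughput at least as high AND cost at least as low,
    and is strictly better on at least one of the two axes."""
    return [
        t for t in trials
        if not any(
            o["throughput"] >= t["throughput"] and o["cost"] <= t["cost"]
            and (o["throughput"] > t["throughput"] or o["cost"] < t["cost"])
            for o in trials
        )
    ]

# Hypothetical trial results (throughput in requests/s, cost in $/hr).
trials = [
    {"id": 1,  "throughput": 900, "cost": 10.0},  # baseline
    {"id": 2,  "throughput": 792, "cost": 2.8},   # ~12% less throughput, ~72% cheaper
    {"id": 63, "throughput": 918, "cost": 4.0},   # +2% throughput, ~60% cheaper
    {"id": 4,  "throughput": 700, "cost": 9.0},   # worse on both axes
]
print([t["id"] for t in pareto_frontier(trials)])  # → [2, 63]
```

Note that in this toy data the baseline itself is dominated (trial 63 is better on both axes), which mirrors the point of the demo: the configuration you started with is rarely on the frontier.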

Now again, the metrics that we care about can be anything that can be exposed via an APM, and so if there are specific targets for your organization, your application, your business unit, we can certainly use those as targets for the machine learning to optimize.

Alright, so having this, and understanding that this is the ideal configuration, maybe I don’t want to go with the one the machine has decided for us; I want to choose trial number 63. I can do a number of things from here. I can export this config and get a very simple copy and paste; that’s if you wanted to use our GUI to do it. You can certainly do it through the CLI, which would be on-prem. Then you would export this configuration, with all these parameter settings, to some other version control system you might have. Maybe you’re implementing some form of GitOps, and so you would export all of these with the appropriate settings so they can go through another round of gating functions before they get applied and pushed out to production.

So in short, this is really the power of using machine learning at this particular stage. Doing this in pre-prod allows us to identify problem configurations without impacting our customers, really understand what the trade-offs are, and then find the configurations that, once we push them to prod, lower the likelihood of escalation and the likelihood of impact on those subject matter experts after hours, when they’re hoping and praying to spend time with their families instead of being engaged with the technology. I think that’s super important. Again, the title of this webinar is “Hope is Not a Strategy,” and we believe this is one of the key ways we can help mitigate those risks.

So with that, I’m going to go back to our deck here. 

All of this would be for naught if we couldn’t give you a real-world example of the benefits of our machine learning and this shift-left optimization model, right?

So unfortunately I can’t give you the actual company name, but this is a very large travel website. You can imagine that for a company like that, their main app is their customer-facing app; it’s where they generate all of their revenue. That particular application, before StormForge, had a team of nine focused on observability, understanding all the various impacts when they were testing and tuning. This was a full-time job for nine people. They had very minimal dev and test environment capacity, running a lot of this on-prem in a private cloud, and you can imagine that if the applications are very inefficient even in pre-prod, how inefficient they are in production.

They had been very good at finding all of the various bottlenecks from a performance standpoint, but using our machine learning they were able to find an additional 50% resource utilization improvement without impacting their performance. This is very impactful for organizations, not only because it means the application lands essentially smaller in prod, but because in this particular case it freed up those dev and test resources, so they could do more QA and more pre-prod testing and efficiency optimization across a wider array of applications and use cases.

So with that, it’s been a pleasure presenting.

Rich: Thanks, Erwin. Great demo and a good overview of the solution here.

So before we dive into a few questions, I just wanted to talk about some additional things you can do to follow up if you want more information. One is that I mentioned one of the statistics from our Cloud Waste Survey; you can download the entire results of that survey. I think it’s really interesting, especially in your role as an SRE, to see what the market is saying about their efficiency and how they’re looking at the problem of cloud waste, so it’s definitely a good report to check out. Then there’s also scheduling a demo, or actually signing up to use the product; both of those are great next steps. We’re happy to get on a call with you and give you a more personalized demo, talking through things specifically related to your needs and what your applications are all about. If you’d like to sign up, you can use the optimization solution for free, and you can find that at the link there. The QR code here will take you to a page with all of these links and a few more things you might want to take a look at.

Alright, so we’ve got a few good questions here. We’ll get to as many as we can. The first one, Erwin, is really about how much effort and knowledge is required to use StormForge machine learning to optimize your apps. Let me talk to that one a little bit here, and then hopefully we’ll get Erwin’s audio back.

But yeah, I think when it comes to the effort and knowledge required to use StormForge: Erwin showed you, as he mentioned, the cake after it was baked, and what’s required to actually set up an experiment is not too difficult. We’re trying to make it as easy as possible. We automatically scan your cluster for the resources that can be tuned, and then you can go step by step through our command line interface to set up and run an experiment. If you want to get deep into the details and edit the YAML files to customize your experiment, you can do that to your heart’s content, but we want to make it as easy as possible to get started and actually create and execute these experiments. So definitely not too much Kubernetes expertise is required.

Alright, let’s try another one and see if we’ve got the audio back again. The next question is from Erwan, and it’s about metrics: is StormForge able to get application metrics on its own, or is it processing logs or some kind of external metrics?

Erwin: Okay, so the question is: is StormForge able to get application metrics on its own, or is it processing logs or any kind of external metrics? The answer is both, right? If you don’t have an APM that we can query for those metrics, we can deploy a fairly simple Prometheus implementation as part of our installation to collect them. But a lot of our customers already have an APM, and out of the box we’ve got query examples for Prometheus and, I believe, for Datadog. Again, any APM that we can query for those metrics, we can use.
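For reference, the kind of metric query involved here is just a standard APM query. As a hypothetical example, a 95th-percentile latency objective pulled from Prometheus might look like the following PromQL; the metric name is illustrative and depends on what your application actually exports.

```promql
# 95th-percentile HTTP request latency over a 5-minute window.
# "http_request_duration_seconds_bucket" is an illustrative metric name;
# substitute whatever histogram your app exposes.
histogram_quantile(0.95,
  sum(rate(http_request_duration_seconds_bucket[5m])) by (le))
```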

Rich: Alright, we’re passing the mic back and forth here. So you might be wondering how we’re able to do that since Erwin is based in California and I’m based in Michigan, but we happen to be together here in Atlanta today.

Erwin: I was gonna go with wormhole.

Rich: Alright, so okay. Let’s do a couple more questions here if we can make it work. This is a really good question: what is the advantage of shifting left using machine learning, versus using machine learning in a production environment with AI ops tools and things like that? Let’s try one more time with your mic, and if it doesn’t work I’ll pass it over to you, alright?

Erwin: Okay, so the question was: what is the advantage of shifting left, doing that optimization before pushing out to prod, versus the right-hand model where we put stuff out in prod and then use AI ops to infer certain conditions and automate some kind of resolution. I think there are a number of different advantages to shifting left.

One, and I think the obvious one, is: do you trust an AI to make changes to your prod environment in real time? I mean, as real time as you can get, given the number of data points you need to collect to build an inference model, all those kinds of things. What are the security implications of that? What are the operational risks? I think shifting left makes a lot of sense in that dimension alone, right? We want to be able to test without impacting our customers and without causing escalations to the people who are SREs.

The other one is the breadth and scope of what we can tune. Having our platform do these things in pre-prod environments allows us to experiment in ways that not only give us visibility into how the applications respond under load, but also let us use the StormForge web UI to understand the impact of all those parameters. Maybe we have the wrong parameters. I didn’t get a chance to show this during the demo, but maybe we see an optimization where something like the replica configuration is not optimal. It doesn’t matter whether you have one replica or three, so maybe we don’t need that as a tunable, maybe we don’t need to add it as a parameter, and we can focus our time on something else. Shifting left and doing all of this in pre-prod not only gives us that feedback loop, it buys us a lot more risk mitigation. It gives us a lot more air to breathe, so that we’re not worried about breaking things in production.

I’ll stop there. There are a lot of others, but I don’t want to get on the soapbox, so those are maybe a couple of the places I would stick to.

Rich: Thanks, Erwin. Let’s do one more question here before we run out of time. The last question is about StormForge’s integration with other cloud vendors: how does StormForge integrate with cloud vendors?

Erwin: That’s a great question. We integrate with Kubernetes, and we integrate in all the places that Kubernetes lives, so there’s no dependency on a specific cloud. If you’ve got a Kubernetes control plane, wherever it is, in any of the managed offerings or in environments you manage yourself, we can install our controller. So I’ll leave it there, but we’re very tightly integrated with Kubernetes, with no dependency on any specific flavor of Kubernetes.

Rich: Great. Thank you, Erwin, and thanks everybody for attending today’s webinar. Apologies for the audio issues at the end there, but hopefully we worked around them okay. If you still have questions after the webinar, reach out! Feel free to contact us and we’d be happy to jump on a call. So thank you everyone, and that ends today’s webinar.