Version 1.8 of the Red Sky Ops controller is now generally available. This release adds capabilities that improve the efficiency and accuracy of experiments, give you more visibility into and control over metrics, and make it easier to see improvements over your baseline configuration.

This post describes the highlights of version 1.8. For complete release notes, see our release page.

Metric constraints

Currently, when defining your experiment file, you can set limits on the minimum and maximum parameter values you want the machine learning to explore. With version 1.8, we’ve expanded constraints to apply to your metrics as well.

Let’s say you’re optimizing for latency and there is a threshold you don’t want to exceed. You can set this threshold in your experiment file. When a trial’s latency exceeds the threshold, the trial is marked as failed, and the machine learning learns to avoid that region of the parameter space, so the results it returns meet your performance or other metric criteria. For example, we might edit our Horizontal Pod Autoscaler (HPA) example ‘recipe’, which tunes the HPA for the voting web app, to have a latency threshold of 1000 ms. We can do so by adding the max field to the metric, as in the metrics snippet below.

By extending constraints to cover both parameters and metrics, the Red Sky Ops machine learning algorithm becomes more efficient and accurate: it spends less time testing irrelevant configurations and homes in more quickly on the optimal configuration for your goals.

metrics:
- name: latency
  minimize: true
  max: 1000   
  type: jsonpath
  query: '{.stats[0].avg_response_time}'
  path: '/stats/requests'
  port: 8089
  selector:
    matchLabels:
      component: locust

Non-optimized metrics

You may want to optimize for a set of metrics but have additional metric thresholds that are “deal breakers” for production use. You can now track “other” metrics that you are not optimizing by using the optimize field in the experiment file. You can even set constraints on these metrics so that a trial fails if a metric is above a certain threshold, which keeps your experiment efficient by returning only results that meet all of your performance standards. Staying with our HPA example, we could simply add an optimize field and set it to false (the default behavior is to optimize).

metrics:
- name: latency
  minimize: true   
  optimize: false   
  max: 1000   
  type: jsonpath
  query: '{.stats[0].avg_response_time}'
  path: '/stats/requests'
  port: 8089
  selector:
    matchLabels:
      component: locust
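
To see how an optimized metric and a tracked-only metric might sit side by side, here is a rough sketch rather than a tested configuration. The throughput metric is hypothetical and not part of the original recipe: it assumes the Locust stats payload exposes a current_rps field and that minimize: false tells the optimizer to maximize the value. The latency entry reuses the fields shown above.

```yaml
metrics:
# Optimized metric (hypothetical, for illustration only).
# Assumes Locust's stats payload includes current_rps and that
# minimize: false means "maximize this value".
- name: throughput
  minimize: false
  type: jsonpath
  query: '{.stats[0].current_rps}'
  path: '/stats/requests'
  port: 8089
  selector:
    matchLabels:
      component: locust
# Tracked-only metric: not optimized, but a trial whose latency
# is above 1000 ms is marked as failed.
- name: latency
  minimize: true
  optimize: false
  max: 1000
  type: jsonpath
  query: '{.stats[0].avg_response_time}'
  path: '/stats/requests'
  port: 8089
  selector:
    matchLabels:
      component: locust
```

With a layout like this, the experiment would search for higher throughput while the latency limit acts purely as a pass/fail gate on each trial.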

Setting your baseline in your experiment file

Finally, we made it easier to compare RSO-optimized configurations to your current configuration: simply add to each parameter the value you are currently using as its baseline. When all of your parameters have a baseline set, the first trial of your experiment runs with this configuration and collects its metrics. The baseline then appears in your results, so you can quickly compare it visually to the rest of the trials.

parameters:
  - name: voting_cpu
    min: 200
    max: 2000    
    baseline: 1000
  - name: min_replicas
    min: 1
    max: 8    
    baseline: 1
  - name: max_replicas
    min: 1
    max: 8    
    baseline: 8  
  - name: avg_utilization
    min: 10
    max: 80    
    baseline: 80

Try it now

For current customers, these new capabilities and more are available in version 1.8 of the controller, which you can download here. If you are not currently using Red Sky Ops, you can optimize your first application for free; sign up here.