Update README for "PercentageRunnersBusy" HRA metric type (#237)

* adding readme for new hpa scheme
* callum's comments

Co-authored-by: Zachary Benamram <zacharybenamram@blend.com>

Autoscaling: Percentage runners busy - remove magic number used for round up (#235)
11 changed files with 216 additions and 109 deletions
--- a/README.md
+++ b/README.md
@@ -267,6 +267,28 @@ spec:
    - summerwind/actions-runner-controller
 ```

+If you do not want to manage an explicit list of repositories to scale, you can use the `PercentageRunnersBusy` autoscaling scheme instead. When the metric type is set to `PercentageRunnersBusy`, the HorizontalRunnerAutoscaler queries GitHub for the busy runners that live in the RunnerDeployment's namespace, and computes the desired number of pods by applying a scale-up or scale-down factor when the fraction of busy runners crosses a threshold. The scale-up and scale-down thresholds are the fractions of busy runners (for example, `'0.75'` means 75%) at which the desired number of runners is re-evaluated. The scale-up and scale-down factors are the multipliers applied to the current number of runners to calculate the desired number of runners. This scheme is also especially useful if you want multiple controllers in various clusters, each responsible for scaling its own runner pods per namespace.
+
+```yaml
+---
+apiVersion: actions.summerwind.dev/v1alpha1
+kind: HorizontalRunnerAutoscaler
+metadata:
+  name: example-runner-deployment-autoscaler
+spec:
+  scaleTargetRef:
+    name: example-runner-deployment
+  minReplicas: 1
+  maxReplicas: 3
+  scaleDownDelaySecondsAfterScaleOut: 60
+  metrics:
+  - type: PercentageRunnersBusy
+    scaleUpThreshold: '0.75'
+    scaleDownThreshold: '0.3'
+    scaleUpFactor: '1.4'
+    scaleDownFactor: '0.7'
+```
+
 ## Runner with DinD

 When using default runner, runner pod starts up 2 containers: runner and DinD (Docker-in-Docker). This might create issues if there's `LimitRange` set to namespace.
@@ -321,7 +343,7 @@ spec:
        requests:
          cpu: "2.0"
          memory: "4Gi"
-      # If set to false, there are no privileged container and you cannot use docker. 
+      # If set to false, there are no privileged container and you cannot use docker.
      dockerEnabled: false
      # If set to true, runner pod container only 1 container that's expected to be able to run docker, too.
      # image summerwind/actions-runner-dind or custom one should be used with true -value
--- a/api/v1alpha1/horizontalrunnerautoscaler_types.go
+++ b/api/v1alpha1/horizontalrunnerautoscaler_types.go
@@ -56,6 +56,26 @@ type MetricSpec struct {
 	// For example, a repository name is the REPO part of `github.com/USER/REPO`.
 	// +optional
 	RepositoryNames []string `json:"repositoryNames,omitempty"`
+
+	// ScaleUpThreshold is the percentage of busy runners greater than which will
+	// trigger the hpa to scale runners up.
+	// +optional
+	ScaleUpThreshold string `json:"scaleUpThreshold,omitempty"`
+
+	// ScaleDownThreshold is the percentage of busy runners less than which will
+	// trigger the hpa to scale the runners down.
+	// +optional
+	ScaleDownThreshold string `json:"scaleDownThreshold,omitempty"`
+
+	// ScaleUpFactor is the multiplicative factor applied to the current number of runners used
+	// to determine how many pods should be added.
+	// +optional
+	ScaleUpFactor string `json:"scaleUpFactor,omitempty"`
+
+	// ScaleDownFactor is the multiplicative factor applied to the current number of runners used
+	// to determine how many pods should be removed.
+	// +optional
+	ScaleDownFactor string `json:"scaleDownFactor,omitempty"`
 }

 type HorizontalRunnerAutoscalerStatus struct {
--- a/api/v1alpha1/runnerdeployment_types.go
+++ b/api/v1alpha1/runnerdeployment_types.go
@@ -22,6 +22,7 @@ import (

 const (
 	AutoscalingMetricTypeTotalNumberOfQueuedAndInProgressWorkflowRuns = "TotalNumberOfQueuedAndInProgressWorkflowRuns"
+	AutoscalingMetricTypePercentageRunnersBusy                        = "PercentageRunnersBusy"
 )

 // RunnerReplicaSetSpec defines the desired state of RunnerDeployment
--- a/charts/actions-runner-controller/crds/actions.summerwind.dev_horizontalrunnerautoscalers.yaml
+++ b/charts/actions-runner-controller/crds/actions.summerwind.dev_horizontalrunnerautoscalers.yaml
@@ -64,6 +64,24 @@ spec:
                    items:
                      type: string
                    type: array
+                  scaleDownFactor:
+                    description: ScaleDownFactor is the multiplicative factor applied
+                      to the current number of runners used to determine how many
+                      pods should be removed.
+                    type: string
+                  scaleDownThreshold:
+                    description: ScaleDownThreshold is the percentage of busy runners
+                      less than which will trigger the hpa to scale the runners down.
+                    type: string
+                  scaleUpFactor:
+                    description: ScaleUpFactor is the multiplicative factor applied
+                      to the current number of runners used to determine how many
+                      pods should be added.
+                    type: string
+                  scaleUpThreshold:
+                    description: ScaleUpThreshold is the percentage of busy runners
+                      greater than which will trigger the hpa to scale runners up.
+                    type: string
                  type:
                    description: Type is the type of metric to be used for autoscaling.
                      The only supported Type is TotalNumberOfQueuedAndInProgressWorkflowRuns
--- a/charts/actions-runner-controller/templates/_helpers.tpl
+++ b/charts/actions-runner-controller/templates/_helpers.tpl
@@ -40,6 +40,9 @@ helm.sh/chart: {{ include "actions-runner-controller.chart" . }}
 app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
 {{- end }}
 app.kubernetes.io/managed-by: {{ .Release.Service }}
+{{- range $k, $v := .Values.labels }}
+{{ $k }}: {{ $v }}
+{{- end }}
 {{- end }}

 {{/*
--- a/charts/actions-runner-controller/templates/deployment.yaml
+++ b/charts/actions-runner-controller/templates/deployment.yaml
@@ -33,6 +33,7 @@ spec:
        - "--metrics-addr=127.0.0.1:8080"
        - "--enable-leader-election"
        - "--sync-period={{ .Values.syncPeriod }}"
+        - "--docker-image={{ .Values.image.dindSidecarRepositoryAndTag }}"
        command:
        - "/manager"
        env:
--- a/charts/actions-runner-controller/values.yaml
+++ b/charts/actions-runner-controller/values.yaml
@@ -2,15 +2,18 @@
 # This is a YAML-formatted file.
 # Declare variables to be passed into your templates.

+labels: {}
+
 replicaCount: 1

 syncPeriod: 10m

 image:
  repository: summerwind/actions-runner-controller
+  # Overrides the manager image tag whose default is the chart appVersion if the tag key is commented out
+  tag: "latest"
+  dindSidecarRepositoryAndTag: "docker:dind"
  pullPolicy: IfNotPresent
-  # Overrides the image tag whose default is the chart appVersion.
-  tag: ""

 imagePullSecrets: []
 nameOverride: ""
--- a/config/crd/bases/actions.summerwind.dev_horizontalrunnerautoscalers.yaml
+++ b/config/crd/bases/actions.summerwind.dev_horizontalrunnerautoscalers.yaml
@@ -64,6 +64,24 @@ spec:
                    items:
                      type: string
                    type: array
+                  scaleDownFactor:
+                    description: ScaleDownFactor is the multiplicative factor applied
+                      to the current number of runners used to determine how many
+                      pods should be removed.
+                    type: string
+                  scaleDownThreshold:
+                    description: ScaleDownThreshold is the percentage of busy runners
+                      less than which will trigger the hpa to scale the runners down.
+                    type: string
+                  scaleUpFactor:
+                    description: ScaleUpFactor is the multiplicative factor applied
+                      to the current number of runners used to determine how many
+                      pods should be added.
+                    type: string
+                  scaleUpThreshold:
+                    description: ScaleUpThreshold is the percentage of busy runners
+                      greater than which will trigger the hpa to scale runners up.
+                    type: string
                  type:
                    description: Type is the type of metric to be used for autoscaling.
                      The only supported Type is TotalNumberOfQueuedAndInProgressWorkflowRuns
--- a/controllers/autoscaling.go
+++ b/controllers/autoscaling.go
@@ -4,9 +4,19 @@ import (
 	"context"
 	"errors"
 	"fmt"
+	"math"
+	"strconv"
 	"strings"

 	"github.com/summerwind/actions-runner-controller/api/v1alpha1"
+	"sigs.k8s.io/controller-runtime/pkg/client"
+)
+
+const (
+	defaultScaleUpThreshold   = 0.8
+	defaultScaleDownThreshold = 0.3
+	defaultScaleUpFactor      = 1.3
+	defaultScaleDownFactor    = 0.7
 )

 func (r *HorizontalRunnerAutoscalerReconciler) determineDesiredReplicas(rd v1alpha1.RunnerDeployment, hra v1alpha1.HorizontalRunnerAutoscaler) (*int, error) {
@@ -16,8 +26,20 @@ func (r *HorizontalRunnerAutoscalerReconciler) determineDesiredReplicas(rd v1alp
 		return nil, fmt.Errorf("horizontalrunnerautoscaler %s/%s is missing maxReplicas", hra.Namespace, hra.Name)
 	}

-	var repos [][]string
+	metrics := hra.Spec.Metrics
+	if len(metrics) == 0 || metrics[0].Type == v1alpha1.AutoscalingMetricTypeTotalNumberOfQueuedAndInProgressWorkflowRuns {
+		return r.calculateReplicasByQueuedAndInProgressWorkflowRuns(rd, hra)
+	} else if metrics[0].Type == v1alpha1.AutoscalingMetricTypePercentageRunnersBusy {
+		return r.calculateReplicasByPercentageRunnersBusy(rd, hra)
+	} else {
+		return nil, fmt.Errorf("validating autoscaling metrics: unsupported metric type %q", metrics[0].Type)
+	}
+}

+func (r *HorizontalRunnerAutoscalerReconciler) calculateReplicasByQueuedAndInProgressWorkflowRuns(rd v1alpha1.RunnerDeployment, hra v1alpha1.HorizontalRunnerAutoscaler) (*int, error) {
+
+	var repos [][]string
+	metrics := hra.Spec.Metrics
 	repoID := rd.Spec.Template.Spec.Repository
 	if repoID == "" {
 		orgName := rd.Spec.Template.Spec.Organization
@@ -25,13 +47,7 @@ func (r *HorizontalRunnerAutoscalerReconciler) determineDesiredReplicas(rd v1alp
 			return nil, fmt.Errorf("asserting runner deployment spec to detect bug: spec.template.organization should not be empty on this code path")
 		}

-		metrics := hra.Spec.Metrics
-
-		if len(metrics) == 0 {
-			return nil, fmt.Errorf("validating autoscaling metrics: one or more metrics is required")
-		} else if tpe := metrics[0].Type; tpe != v1alpha1.AutoscalingMetricTypeTotalNumberOfQueuedAndInProgressWorkflowRuns {
-			return nil, fmt.Errorf("validting autoscaling metrics: unsupported metric type %q: only supported value is %s", tpe, v1alpha1.AutoscalingMetricTypeTotalNumberOfQueuedAndInProgressWorkflowRuns)
-		} else if len(metrics[0].RepositoryNames) == 0 {
+		if len(metrics[0].RepositoryNames) == 0 {
 			return nil, errors.New("validating autoscaling metrics: spec.autoscaling.metrics[].repositoryNames is required and must have one more more entries for organizational runner deployment")
 		}

@@ -135,3 +151,103 @@ func (r *HorizontalRunnerAutoscalerReconciler) determineDesiredReplicas(rd v1alp

 	return &replicas, nil
 }
+
+func (r *HorizontalRunnerAutoscalerReconciler) calculateReplicasByPercentageRunnersBusy(rd v1alpha1.RunnerDeployment, hra v1alpha1.HorizontalRunnerAutoscaler) (*int, error) {
+	ctx := context.Background()
+	orgName := rd.Spec.Template.Spec.Organization
+	minReplicas := *hra.Spec.MinReplicas
+	maxReplicas := *hra.Spec.MaxReplicas
+	metrics := hra.Spec.Metrics[0]
+	scaleUpThreshold := defaultScaleUpThreshold
+	scaleDownThreshold := defaultScaleDownThreshold
+	scaleUpFactor := defaultScaleUpFactor
+	scaleDownFactor := defaultScaleDownFactor
+
+	if metrics.ScaleUpThreshold != "" {
+		sut, err := strconv.ParseFloat(metrics.ScaleUpThreshold, 64)
+		if err != nil {
+			return nil, errors.New("validating autoscaling metrics: spec.autoscaling.metrics[].scaleUpThreshold cannot be parsed into a float64")
+		}
+		scaleUpThreshold = sut
+	}
+	if metrics.ScaleDownThreshold != "" {
+		sdt, err := strconv.ParseFloat(metrics.ScaleDownThreshold, 64)
+		if err != nil {
+			return nil, errors.New("validating autoscaling metrics: spec.autoscaling.metrics[].scaleDownThreshold cannot be parsed into a float64")
+		}
+
+		scaleDownThreshold = sdt
+	}
+	if metrics.ScaleUpFactor != "" {
+		suf, err := strconv.ParseFloat(metrics.ScaleUpFactor, 64)
+		if err != nil {
+			return nil, errors.New("validating autoscaling metrics: spec.autoscaling.metrics[].scaleUpFactor cannot be parsed into a float64")
+		}
+		scaleUpFactor = suf
+	}
+	if metrics.ScaleDownFactor != "" {
+		sdf, err := strconv.ParseFloat(metrics.ScaleDownFactor, 64)
+		if err != nil {
+			return nil, errors.New("validating autoscaling metrics: spec.autoscaling.metrics[].scaleDownFactor cannot be parsed into a float64")
+		}
+		scaleDownFactor = sdf
+	}
+
+	// return the list of runners in namespace. Horizontal Runner Autoscaler should only be responsible for scaling resources in its own ns.
+	var runnerList v1alpha1.RunnerList
+	if err := r.List(ctx, &runnerList, client.InNamespace(rd.Namespace)); err != nil {
+		return nil, err
+	}
+	runnerMap := make(map[string]struct{})
+	for _, items := range runnerList.Items {
+		runnerMap[items.Name] = struct{}{}
+	}
+
+	// ListRunners will return all runners managed by GitHub - not restricted to ns
+	runners, err := r.GitHubClient.ListRunners(ctx, orgName, "")
+	if err != nil {
+		return nil, err
+	}
+	numRunners := len(runnerList.Items)
+	numRunnersBusy := 0
+	for _, runner := range runners {
+		if _, ok := runnerMap[runner.GetName()]; ok && runner.GetBusy() {
+			numRunnersBusy++
+		}
+	}
+
+	var desiredReplicas int
+	fractionBusy := float64(numRunnersBusy) / float64(numRunners)
+	if fractionBusy >= scaleUpThreshold {
+		scaleUpReplicas := int(math.Ceil(float64(numRunners) * scaleUpFactor))
+		if scaleUpReplicas > maxReplicas {
+			desiredReplicas = maxReplicas
+		} else {
+			desiredReplicas = scaleUpReplicas
+		}
+	} else if fractionBusy < scaleDownThreshold {
+		scaleDownReplicas := int(float64(numRunners) * scaleDownFactor)
+		if scaleDownReplicas < minReplicas {
+			desiredReplicas = minReplicas
+		} else {
+			desiredReplicas = scaleDownReplicas
+		}
+	} else {
+		desiredReplicas = *rd.Spec.Replicas
+	}
+
+	r.Log.V(1).Info(
+		"Calculated desired replicas",
+		"computed_replicas_desired", desiredReplicas,
+		"spec_replicas_min", minReplicas,
+		"spec_replicas_max", maxReplicas,
+		"current_replicas", rd.Spec.Replicas,
+		"num_runners", numRunners,
+		"num_runners_busy", numRunnersBusy,
+	)
+
+	rd.Status.Replicas = &desiredReplicas
+	replicas := desiredReplicas
+
+	return &replicas, nil
+}
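Review note on `calculateReplicasByPercentageRunnersBusy`: the four "parse if set, otherwise use the default" blocks follow one pattern, so they could be folded into a single helper. The sketch below is one possible shape, not part of this change. `parseFloatOrDefault` is a hypothetical name, and wrapping the underlying `strconv` error with `%w` is a choice made here rather than something the diff does.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseFloatOrDefault returns def when raw is empty; otherwise it parses raw
// as a float64. field is used only to build the error message.
// Hypothetical helper for illustration.
func parseFloatOrDefault(field, raw string, def float64) (float64, error) {
	if raw == "" {
		return def, nil
	}
	v, err := strconv.ParseFloat(raw, 64)
	if err != nil {
		return 0, fmt.Errorf("validating autoscaling metrics: spec.autoscaling.metrics[].%s cannot be parsed into a float64: %w", field, err)
	}
	return v, nil
}

func main() {
	// Empty value: the default from the diff (defaultScaleUpThreshold = 0.8) is used.
	threshold, err := parseFloatOrDefault("scaleUpThreshold", "", 0.8)
	fmt.Println(threshold, err)

	// Explicit value, as in the README example.
	factor, err := parseFloatOrDefault("scaleUpFactor", "1.4", 1.3)
	fmt.Println(factor, err)

	// Invalid value: an error naming the offending field is returned.
	_, err = parseFloatOrDefault("scaleDownFactor", "abc", 0.7)
	fmt.Println(err != nil)
}
```

With such a helper, each of the four settings would reduce to one call plus an error check, and the error would keep the original parse failure available to callers through `errors.Unwrap`.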
--- a/github/github.go
+++ b/github/github.go
@@ -182,7 +182,7 @@ func (c *Client) createRegistrationToken(ctx context.Context, owner, repo string
 		return c.Client.Actions.CreateRegistrationToken(ctx, owner, repo)
 	}

-	return CreateOrganizationRegistrationToken(ctx, c, owner)
+	return c.Client.Actions.CreateOrganizationRegistrationToken(ctx, owner)
 }

 func (c *Client) removeRunner(ctx context.Context, owner, repo string, runnerID int64) (*github.Response, error) {
@@ -190,7 +190,7 @@ func (c *Client) removeRunner(ctx context.Context, owner, repo string, runnerID
 		return c.Client.Actions.RemoveRunner(ctx, owner, repo, runnerID)
 	}

-	return RemoveOrganizationRunner(ctx, c, owner, runnerID)
+	return c.Client.Actions.RemoveOrganizationRunner(ctx, owner, runnerID)
 }

 func (c *Client) listRunners(ctx context.Context, owner, repo string, opts *github.ListOptions) (*github.Runners, *github.Response, error) {
@@ -198,7 +198,7 @@ func (c *Client) listRunners(ctx context.Context, owner, repo string, opts *gith
 		return c.Client.Actions.ListRunners(ctx, owner, repo, opts)
 	}

-	return ListOrganizationRunners(ctx, c, owner, opts)
+	return c.Client.Actions.ListOrganizationRunners(ctx, owner, opts)
 }

 // Validates owner and repo arguments. Both are optional, but at least one should be specified
--- a/github/github_beta.go
+++ b/github/github_beta.go
@@ -1,95 +0,0 @@
-package github
-
-// this contains BETA API clients, that are currently not (yet) in go-github
-// once these functions have been added there, they can be removed from here
-// code was reused from https://github.com/google/go-github
-
-import (
-	"context"
-	"fmt"
-	"net/url"
-	"reflect"
-
-	"github.com/google/go-github/v33/github"
-	"github.com/google/go-querystring/query"
-)
-
-// CreateOrganizationRegistrationToken creates a token that can be used to add a self-hosted runner on an organization.
-//
-// GitHub API docs: https://developer.github.com/v3/actions/self-hosted-runners/#create-a-registration-token-for-an-organization
-func CreateOrganizationRegistrationToken(ctx context.Context, client *Client, owner string) (*github.RegistrationToken, *github.Response, error) {
-	u := fmt.Sprintf("orgs/%v/actions/runners/registration-token", owner)
-
-	req, err := client.NewRequest("POST", u, nil)
-	if err != nil {
-		return nil, nil, err
-	}
-
-	registrationToken := new(github.RegistrationToken)
-	resp, err := client.Do(ctx, req, registrationToken)
-	if err != nil {
-		return nil, resp, err
-	}
-
-	return registrationToken, resp, nil
-}
-
-// ListOrganizationRunners lists all the self-hosted runners for an organization.
-//
-// GitHub API docs: https://developer.github.com/v3/actions/self-hosted-runners/#list-self-hosted-runners-for-an-organization
-func ListOrganizationRunners(ctx context.Context, client *Client, owner string, opts *github.ListOptions) (*github.Runners, *github.Response, error) {
-	u := fmt.Sprintf("orgs/%v/actions/runners", owner)
-	u, err := addOptions(u, opts)
-	if err != nil {
-		return nil, nil, err
-	}
-
-	req, err := client.NewRequest("GET", u, nil)
-	if err != nil {
-		return nil, nil, err
-	}
-
-	runners := &github.Runners{}
-	resp, err := client.Do(ctx, req, &runners)
-	if err != nil {
-		return nil, resp, err
-	}
-
-	return runners, resp, nil
-}
-
-// RemoveOrganizationRunner forces the removal of a self-hosted runner in a repository using the runner id.
-//
-// GitHub API docs: https://developer.github.com/v3/actions/self_hosted_runners/#remove-a-self-hosted-runner
-func RemoveOrganizationRunner(ctx context.Context, client *Client, owner string, runnerID int64) (*github.Response, error) {
-	u := fmt.Sprintf("orgs/%v/actions/runners/%v", owner, runnerID)
-
-	req, err := client.NewRequest("DELETE", u, nil)
-	if err != nil {
-		return nil, err
-	}
-
-	return client.Do(ctx, req, nil)
-}
-
-// addOptions adds the parameters in opt as URL query parameters to s. opt
-// must be a struct whose fields may contain "url" tags.
-func addOptions(s string, opts interface{}) (string, error) {
-	v := reflect.ValueOf(opts)
-	if v.Kind() == reflect.Ptr && v.IsNil() {
-		return s, nil
-	}
-
-	u, err := url.Parse(s)
-	if err != nil {
-		return s, err
-	}
-
-	qs, err := query.Values(opts)
-	if err != nil {
-		return s, err
-	}
-
-	u.RawQuery = qs.Encode()
-	return u.String(), nil
-}
Author	SHA1	Message	Date
ZacharyBenamram	0dadddfc7d	Update README for "PercentageRunnersBusy" HRA metric type (#237 ) * adding readme for new hpa scheme * callum's comments Co-authored-by: Zachary Benamram <zacharybenamram@blend.com>	2020-12-17 10:21:27 +09:00
ZacharyBenamram	48923fec56	Autoscaling: Percentage runners busy - remove magic number used for round up (#235 ) * remove magic number for autoscaling Co-authored-by: Zachary Benamram <zacharybenamram@blend.com>	2020-12-15 14:38:01 +09:00
ZacharyBenamram	466b30728d	Add "PercentageRunnersBusy" horizontal runner autoscaler metric type (#223 ) * hpa scheme based off busy runners * running make manifests Co-authored-by: Zachary Benamram <zacharybenamram@blend.com>	2020-12-13 08:48:19 +09:00
callum-tait-pbx	c13704d7e2	feat: custom labels (#231 ) Co-authored-by: Callum Tait <callum.tait@PBXUK-HH-05772.photobox.priv>	2020-12-13 08:33:04 +09:00
callum-tait-pbx	fb49bbda75	feat: adding helm config for dind sidecar (#232 ) Co-authored-by: Callum Tait <callum.tait@PBXUK-HH-05772.photobox.priv>	2020-12-13 08:31:24 +09:00
Reinier Timmer	8d6f77e07c	Remove beta GitHub client implementations (#228 )	2020-12-10 09:08:51 +09:00