scale{Up,Down}Adjustment to add/remove constant number of replicas on scaling (#315)

* `scale{Up,Down}Adjustment` to add/remove constant number of replicas on scaling

  Ref #305

* Bump chart version

Fix enterprise runners misusing cached token (#314)
13 changed files with 198 additions and 27 deletions
--- a/api/v1alpha1/horizontalrunnerautoscaler_types.go
+++ b/api/v1alpha1/horizontalrunnerautoscaler_types.go
@@ -126,6 +126,16 @@ type MetricSpec struct {
 	// to determine how many pods should be removed.
 	// +optional
 	ScaleDownFactor string `json:"scaleDownFactor,omitempty"`
+
+	// ScaleUpAdjustment is the number of runners added on scale-up.
+	// You can only specify either ScaleUpFactor or ScaleUpAdjustment.
+	// +optional
+	ScaleUpAdjustment int `json:"scaleUpAdjustment,omitempty"`
+
+	// ScaleDownAdjustment is the number of runners removed on scale-down.
+	// You can only specify either ScaleDownFactor or ScaleDownAdjustment.
+	// +optional
+	ScaleDownAdjustment int `json:"scaleDownAdjustment,omitempty"`
 }

 type HorizontalRunnerAutoscalerStatus struct {
--- a/charts/actions-runner-controller/Chart.yaml
+++ b/charts/actions-runner-controller/Chart.yaml
@@ -15,7 +15,7 @@ type: application
 # This is the chart version. This version number should be incremented each time you make changes
 # to the chart and its templates, including the app version.
 # Versions are expected to follow Semantic Versioning (https://semver.org/)
-version: 0.4.0
+version: 0.5.0

 home: https://github.com/summerwind/actions-runner-controller

--- a/charts/actions-runner-controller/crds/actions.summerwind.dev_horizontalrunnerautoscalers.yaml
+++ b/charts/actions-runner-controller/crds/actions.summerwind.dev_horizontalrunnerautoscalers.yaml
@@ -78,6 +78,11 @@ spec:
                    items:
                      type: string
                    type: array
+                  scaleDownAdjustment:
+                    description: ScaleDownAdjustment is the number of runners removed
+                      on scale-down. You can only specify either ScaleDownFactor or
+                      ScaleDownAdjustment.
+                    type: integer
                  scaleDownFactor:
                    description: ScaleDownFactor is the multiplicative factor applied
                      to the current number of runners used to determine how many
@@ -87,6 +92,10 @@ spec:
                    description: ScaleDownThreshold is the percentage of busy runners
                      less than which will trigger the hpa to scale the runners down.
                    type: string
+                  scaleUpAdjustment:
+                    description: ScaleUpAdjustment is the number of runners added
+                      on scale-up. You can only specify either ScaleUpFactor or ScaleUpAdjustment.
+                    type: integer
                  scaleUpFactor:
                    description: ScaleUpFactor is the multiplicative factor applied
                      to the current number of runners used to determine how many
--- a/charts/actions-runner-controller/values.yaml
+++ b/charts/actions-runner-controller/values.yaml
@@ -28,8 +28,8 @@ image:

 kube_rbac_proxy:
  image:
-    repository: gcr.io/kubebuilder/kube-rbac-proxy
-    tag: v0.4.1
+    repository: quay.io/brancz/kube-rbac-proxy
+    tag: v0.8.0

 imagePullSecrets: []
 nameOverride: ""
--- a/config/crd/bases/actions.summerwind.dev_horizontalrunnerautoscalers.yaml
+++ b/config/crd/bases/actions.summerwind.dev_horizontalrunnerautoscalers.yaml
@@ -78,6 +78,11 @@ spec:
                    items:
                      type: string
                    type: array
+                  scaleDownAdjustment:
+                    description: ScaleDownAdjustment is the number of runners removed
+                      on scale-down. You can only specify either ScaleDownFactor or
+                      ScaleDownAdjustment.
+                    type: integer
                  scaleDownFactor:
                    description: ScaleDownFactor is the multiplicative factor applied
                      to the current number of runners used to determine how many
@@ -87,6 +92,10 @@ spec:
                    description: ScaleDownThreshold is the percentage of busy runners
                      less than which will trigger the hpa to scale the runners down.
                    type: string
+                  scaleUpAdjustment:
+                    description: ScaleUpAdjustment is the number of runners added
+                      on scale-up. You can only specify either ScaleUpFactor or ScaleUpAdjustment.
+                    type: integer
                  scaleUpFactor:
                    description: ScaleUpFactor is the multiplicative factor applied
                      to the current number of runners used to determine how many
--- a/config/default/manager_auth_proxy_patch.yaml
+++ b/config/default/manager_auth_proxy_patch.yaml
@@ -10,7 +10,7 @@ spec:
    spec:
      containers:
      - name: kube-rbac-proxy
-        image: gcr.io/kubebuilder/kube-rbac-proxy:v0.4.1
+        image: quay.io/brancz/kube-rbac-proxy:v0.8.0
        args:
        - "--secure-listen-address=0.0.0.0:8443"
        - "--upstream=http://127.0.0.1:8080/"
--- a/controllers/autoscaling.go
+++ b/controllers/autoscaling.go
@@ -189,6 +189,9 @@ func (r *HorizontalRunnerAutoscalerReconciler) calculateReplicasByQueuedAndInPro
 		"workflow_runs_in_progress", inProgress,
 		"workflow_runs_queued", queued,
 		"workflow_runs_unknown", unknown,
+		"namespace", hra.Namespace,
+		"runner_deployment", rd.Name,
+		"horizontal_runner_autoscaler", hra.Name,
 	)

 	return &replicas, nil
@@ -196,7 +199,6 @@ func (r *HorizontalRunnerAutoscalerReconciler) calculateReplicasByQueuedAndInPro

 func (r *HorizontalRunnerAutoscalerReconciler) calculateReplicasByPercentageRunnersBusy(rd v1alpha1.RunnerDeployment, hra v1alpha1.HorizontalRunnerAutoscaler) (*int, error) {
 	ctx := context.Background()
-	orgName := rd.Spec.Template.Spec.Organization
 	minReplicas := *hra.Spec.MinReplicas
 	maxReplicas := *hra.Spec.MaxReplicas
 	metrics := hra.Spec.Metrics[0]
@@ -220,14 +222,34 @@ func (r *HorizontalRunnerAutoscalerReconciler) calculateReplicasByPercentageRunn

 		scaleDownThreshold = sdt
 	}
-	if metrics.ScaleUpFactor != "" {
+
+	scaleUpAdjustment := metrics.ScaleUpAdjustment
+	if scaleUpAdjustment != 0 {
+		if metrics.ScaleUpAdjustment < 0 {
+			return nil, errors.New("validating autoscaling metrics: spec.autoscaling.metrics[].scaleUpAdjustment cannot be lower than 0")
+		}
+
+		if metrics.ScaleUpFactor != "" {
+			return nil, errors.New("validating autoscaling metrics: spec.autoscaling.metrics[]: scaleUpAdjustment and scaleUpFactor cannot be specified together")
+		}
+	} else if metrics.ScaleUpFactor != "" {
 		suf, err := strconv.ParseFloat(metrics.ScaleUpFactor, 64)
 		if err != nil {
 			return nil, errors.New("validating autoscaling metrics: spec.autoscaling.metrics[].scaleUpFactor cannot be parsed into a float64")
 		}
 		scaleUpFactor = suf
 	}
-	if metrics.ScaleDownFactor != "" {
+
+	scaleDownAdjustment := metrics.ScaleDownAdjustment
+	if scaleDownAdjustment != 0 {
+		if metrics.ScaleDownAdjustment < 0 {
+			return nil, errors.New("validating autoscaling metrics: spec.autoscaling.metrics[].scaleDownAdjustment cannot be lower than 0")
+		}
+
+		if metrics.ScaleDownFactor != "" {
+			return nil, errors.New("validating autoscaling metrics: spec.autoscaling.metrics[]: scaleDownAdjustment and scaleDownFactor cannot be specified together")
+		}
+	} else if metrics.ScaleDownFactor != "" {
 		sdf, err := strconv.ParseFloat(metrics.ScaleDownFactor, 64)
 		if err != nil {
 			return nil, errors.New("validating autoscaling metrics: spec.autoscaling.metrics[].scaleDownFactor cannot be parsed into a float64")
@@ -245,8 +267,18 @@ func (r *HorizontalRunnerAutoscalerReconciler) calculateReplicasByPercentageRunn
 		runnerMap[items.Name] = struct{}{}
 	}

+	var (
+		enterprise   = rd.Spec.Template.Spec.Enterprise
+		organization = rd.Spec.Template.Spec.Organization
+		repository   = rd.Spec.Template.Spec.Repository
+	)
+
 	// ListRunners will return all runners managed by GitHub - not restricted to ns
-	runners, err := r.GitHubClient.ListRunners(ctx, "", orgName, "")
+	runners, err := r.GitHubClient.ListRunners(
+		ctx,
+		enterprise,
+		organization,
+		repository)
 	if err != nil {
 		return nil, err
 	}
@@ -261,9 +293,17 @@ func (r *HorizontalRunnerAutoscalerReconciler) calculateReplicasByPercentageRunn
 	var desiredReplicas int
 	fractionBusy := float64(numRunnersBusy) / float64(numRunners)
 	if fractionBusy >= scaleUpThreshold {
-		desiredReplicas = int(math.Ceil(float64(numRunners) * scaleUpFactor))
+		if scaleUpAdjustment > 0 {
+			desiredReplicas = numRunners + scaleUpAdjustment
+		} else {
+			desiredReplicas = int(math.Ceil(float64(numRunners) * scaleUpFactor))
+		}
 	} else if fractionBusy < scaleDownThreshold {
-		desiredReplicas = int(float64(numRunners) * scaleDownFactor)
+		if scaleDownAdjustment > 0 {
+			desiredReplicas = numRunners - scaleDownAdjustment
+		} else {
+			desiredReplicas = int(float64(numRunners) * scaleDownFactor)
+		}
 	} else {
 		desiredReplicas = *rd.Spec.Replicas
 	}
@@ -282,6 +322,12 @@ func (r *HorizontalRunnerAutoscalerReconciler) calculateReplicasByPercentageRunn
 		"current_replicas", rd.Spec.Replicas,
 		"num_runners", numRunners,
 		"num_runners_busy", numRunnersBusy,
+		"namespace", hra.Namespace,
+		"runner_deployment", rd.Name,
+		"horizontal_runner_autoscaler", hra.Name,
+		"enterprise", enterprise,
+		"organization", organization,
+		"repository", repository,
 	)

 	rd.Status.Replicas = &desiredReplicas
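
Review note: the branch logic added to `calculateReplicasByPercentageRunnersBusy` can be exercised in isolation. The following is a minimal sketch, not the controller's code: the `scaling` struct, `validate`, and `desired` are hypothetical names, the factors are plain floats rather than parsed strings, default factors are not modeled, and any clamping to `minReplicas`/`maxReplicas` lies outside the hunks shown and is left out. The `numRunners == 0` guard is an addition of the sketch.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// scaling is a hypothetical, flattened stand-in for the MetricSpec fields
// touched by the hunks above. A zero value means "not set".
type scaling struct {
	UpThreshold, DownThreshold   float64
	UpFactor, DownFactor         float64
	UpAdjustment, DownAdjustment int
}

// validate mirrors the two checks added in the diff: adjustments must not be
// negative, and an adjustment excludes the factor for the same direction.
func (s scaling) validate() error {
	if s.UpAdjustment < 0 || s.DownAdjustment < 0 {
		return errors.New("adjustments cannot be lower than 0")
	}
	if s.UpAdjustment != 0 && s.UpFactor != 0 {
		return errors.New("scaleUpAdjustment and scaleUpFactor cannot be specified together")
	}
	if s.DownAdjustment != 0 && s.DownFactor != 0 {
		return errors.New("scaleDownAdjustment and scaleDownFactor cannot be specified together")
	}
	return nil
}

// desired returns the replica count before any min/max clamping.
// An adjustment, when set, takes precedence over the multiplicative factor.
func desired(numRunners, numBusy, current int, s scaling) int {
	if numRunners == 0 {
		// Guard added for the sketch only: avoids dividing by zero.
		return current
	}
	fractionBusy := float64(numBusy) / float64(numRunners)
	switch {
	case fractionBusy >= s.UpThreshold:
		if s.UpAdjustment > 0 {
			return numRunners + s.UpAdjustment
		}
		return int(math.Ceil(float64(numRunners) * s.UpFactor))
	case fractionBusy < s.DownThreshold:
		if s.DownAdjustment > 0 {
			return numRunners - s.DownAdjustment
		}
		return int(float64(numRunners) * s.DownFactor)
	default:
		return current
	}
}

func main() {
	s := scaling{UpThreshold: 0.75, DownThreshold: 0.3, UpAdjustment: 2, DownFactor: 0.5}
	if err := s.validate(); err != nil {
		fmt.Println("invalid:", err)
		return
	}
	fmt.Println(desired(4, 4, 4, s))   // all runners busy: add a constant 2
	fmt.Println(desired(10, 1, 10, s)) // mostly idle: multiply by the factor
}
```

Note that with a constant scale-down adjustment the unclamped result can drop below zero (for example, one runner and an adjustment of three), so the sketch relies on the caller to bound the value.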
--- a/controllers/runner_controller.go
+++ b/controllers/runner_controller.go
@@ -250,7 +250,7 @@ func (r *RunnerReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
 		if err != nil {
 			var e *github.RunnerNotFound
 			if errors.As(err, &e) {
-				log.Error(err, "Failed to check if runner is busy. Probably this runner has never been successfully registered to GitHub.")
+				log.V(1).Info("Failed to check if runner is busy. Either this runner has never been successfully registered to GitHub or it still needs more time.", "runnerName", runner.Name)

 				notRegistered = true
 			} else {
--- a/controllers/runnerreplicaset_controller.go
+++ b/controllers/runnerreplicaset_controller.go
@@ -111,7 +111,7 @@ func (r *RunnerReplicaSetReconciler) Reconcile(req ctrl.Request) (ctrl.Result, e

 				var e *github.RunnerNotFound
 				if errors.As(err, &e) {
-					log.Error(err, "Failed to check if runner is busy. Probably this runner has never been successfully registered to GitHub, and therefore we prioritize it for deletion", "runnerName", runner.Name)
+					log.V(1).Info("Failed to check if runner is busy. Either this runner has never been successfully registered to GitHub or has not managed yet to, and therefore we prioritize it for deletion", "runnerName", runner.Name)
 					notRegistered = true
 				} else {
 					var e *gogithub.RateLimitError
--- a/github/fake/fake.go
+++ b/github/fake/fake.go
@@ -102,6 +102,18 @@ func NewServer(opts ...Option) *httptest.Server {
 			Status: http.StatusBadRequest,
 			Body:   "",
 		},
+		"/enterprises/test/actions/runners/registration-token": &Handler{
+			Status: http.StatusCreated,
+			Body:   fmt.Sprintf("{\"token\": \"%s\", \"expires_at\": \"%s\"}", RegistrationToken, time.Now().Add(time.Hour*1).Format(time.RFC3339)),
+		},
+		"/enterprises/invalid/actions/runners/registration-token": &Handler{
+			Status: http.StatusOK,
+			Body:   fmt.Sprintf("{\"token\": \"%s\", \"expires_at\": \"%s\"}", RegistrationToken, time.Now().Add(time.Hour*1).Format(time.RFC3339)),
+		},
+		"/enterprises/error/actions/runners/registration-token": &Handler{
+			Status: http.StatusBadRequest,
+			Body:   "",
+		},

 		// For ListRunners
 		"/repos/test/valid/actions/runners": config.FixedResponses.ListRunners,
@@ -125,6 +137,18 @@ func NewServer(opts ...Option) *httptest.Server {
 			Status: http.StatusBadRequest,
 			Body:   "",
 		},
+		"/enterprises/test/actions/runners": &Handler{
+			Status: http.StatusOK,
+			Body:   RunnersListBody,
+		},
+		"/enterprises/invalid/actions/runners": &Handler{
+			Status: http.StatusNoContent,
+			Body:   "",
+		},
+		"/enterprises/error/actions/runners": &Handler{
+			Status: http.StatusBadRequest,
+			Body:   "",
+		},

 		// For RemoveRunner
 		"/repos/test/valid/actions/runners/1": &Handler{
@@ -151,6 +175,18 @@ func NewServer(opts ...Option) *httptest.Server {
 			Status: http.StatusBadRequest,
 			Body:   "",
 		},
+		"/enterprises/test/actions/runners/1": &Handler{
+			Status: http.StatusNoContent,
+			Body:   "",
+		},
+		"/enterprises/invalid/actions/runners/1": &Handler{
+			Status: http.StatusOK,
+			Body:   "",
+		},
+		"/enterprises/error/actions/runners/1": &Handler{
+			Status: http.StatusBadRequest,
+			Body:   "",
+		},

 		// For auto-scaling based on the number of queued(pending) workflow runs
 		"/repos/test/valid/actions/runs": config.FixedResponses.ListRepositoryWorkflowRuns,
--- a/github/github.go
+++ b/github/github.go
@@ -11,6 +11,7 @@ import (

 	"github.com/bradleyfalzon/ghinstallation"
 	"github.com/google/go-github/v33/github"
+	"github.com/summerwind/actions-runner-controller/github/metrics"
 	"golang.org/x/oauth2"
 )

@@ -34,15 +35,9 @@ type Client struct {

 // NewClient creates a Github Client
 func (c *Config) NewClient() (*Client, error) {
-	var (
-		httpClient *http.Client
-		client     *github.Client
-	)
-	githubBaseURL := "https://github.com/"
+	var transport http.RoundTripper
 	if len(c.Token) > 0 {
-		httpClient = oauth2.NewClient(context.Background(), oauth2.StaticTokenSource(
-			&oauth2.Token{AccessToken: c.Token},
-		))
+		transport = oauth2.NewClient(context.Background(), oauth2.StaticTokenSource(&oauth2.Token{AccessToken: c.Token})).Transport
 	} else {
 		tr, err := ghinstallation.NewKeyFromFile(http.DefaultTransport, c.AppID, c.AppInstallationID, c.AppPrivateKey)
 		if err != nil {
@@ -55,9 +50,13 @@ func (c *Config) NewClient() (*Client, error) {
 			}
 			tr.BaseURL = githubAPIURL
 		}
-		httpClient = &http.Client{Transport: tr}
+		transport = tr
 	}
+	transport = metrics.Transport{Transport: transport}
+	httpClient := &http.Client{Transport: transport}

+	var client *github.Client
+	var githubBaseURL string
 	if len(c.EnterpriseURL) > 0 {
 		var err error
 		client, err = github.NewEnterpriseClient(c.EnterpriseURL, c.EnterpriseURL, httpClient)
@@ -67,6 +66,7 @@ func (c *Config) NewClient() (*Client, error) {
 		githubBaseURL = fmt.Sprintf("%s://%s%s", client.BaseURL.Scheme, client.BaseURL.Host, strings.TrimSuffix(client.BaseURL.Path, "api/v3/"))
 	} else {
 		client = github.NewClient(httpClient)
+		githubBaseURL = "https://github.com/"
 	}

 	return &Client{
@@ -82,7 +82,7 @@ func (c *Client) GetRegistrationToken(ctx context.Context, enterprise, org, repo
 	c.mu.Lock()
 	defer c.mu.Unlock()

-	key := getRegistrationKey(org, repo)
+	key := getRegistrationKey(org, repo, enterprise)
 	rt, ok := c.regTokens[key]

 	if ok && rt.GetExpiresAt().After(time.Now()) {
@@ -250,11 +250,8 @@ func getEnterpriseOrganisationAndRepo(enterprise, org, repo string) (string, str
 	return "", "", "", fmt.Errorf("enterprise, organization and repository are all empty")
 }

-func getRegistrationKey(org, repo string) string {
-	if len(org) > 0 {
-		return org
-	}
-	return repo
+func getRegistrationKey(org, repo, enterprise string) string {
+	return fmt.Sprintf("org=%s,repo=%s,enterprise=%s", org, repo, enterprise)
 }

 func splitOwnerAndRepo(repo string) (string, string, error) {
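
Review note: the cache-key change can be checked with a few lines of Go. This is a sketch under assumptions: `oldKey` and `newKey` are hypothetical names that copy the bodies of the removed and added `getRegistrationKey`, and the enterprise names are made up. It suggests why enterprise-level runners, which set neither organization nor repository, would have shared one cache entry before the fix.

```go
package main

import "fmt"

// oldKey copies the removed implementation: the enterprise is ignored.
func oldKey(org, repo string) string {
	if len(org) > 0 {
		return org
	}
	return repo
}

// newKey copies the added implementation: all three scopes are in the key.
func newKey(org, repo, enterprise string) string {
	return fmt.Sprintf("org=%s,repo=%s,enterprise=%s", org, repo, enterprise)
}

func main() {
	// Two enterprise-level runners with no organization or repository set.
	// "ent-a" and "ent-b" are illustrative names only.
	fmt.Printf("old: %q and %q\n", oldKey("", ""), oldKey("", ""))
	fmt.Printf("new: %q and %q\n", newKey("", "", "ent-a"), newKey("", "", "ent-b"))
}
```

With the old function both runners map to the empty string, so a token cached for one scope could be returned for another. The new key keeps them apart.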
--- a/github/metrics/transport.go
+++ b/github/metrics/transport.go
@@ -0,0 +1,63 @@
+// Package metrics provides monitoring of the GitHub related metrics.
+//
+// This depends on the metrics exporter of kubebuilder.
+// See https://book.kubebuilder.io/reference/metrics.html for details.
+package metrics
+
+import (
+	"net/http"
+	"strconv"
+
+	"github.com/prometheus/client_golang/prometheus"
+	"sigs.k8s.io/controller-runtime/pkg/metrics"
+)
+
+func init() {
+	metrics.Registry.MustRegister(metricRateLimit, metricRateLimitRemaining)
+}
+
+var (
+	// https://docs.github.com/en/rest/overview/resources-in-the-rest-api#rate-limiting
+	metricRateLimit = prometheus.NewGauge(
+		prometheus.GaugeOpts{
+			Name: "github_rate_limit",
+			Help: "The maximum number of requests you're permitted to make per hour",
+		},
+	)
+	metricRateLimitRemaining = prometheus.NewGauge(
+		prometheus.GaugeOpts{
+			Name: "github_rate_limit_remaining",
+			Help: "The number of requests remaining in the current rate limit window",
+		},
+	)
+)
+
+const (
+	// https://docs.github.com/en/rest/overview/resources-in-the-rest-api#rate-limiting
+	headerRateLimit          = "X-RateLimit-Limit"
+	headerRateLimitRemaining = "X-RateLimit-Remaining"
+)
+
+// Transport wraps a transport with metrics monitoring
+type Transport struct {
+	Transport http.RoundTripper
+}
+
+func (t Transport) RoundTrip(req *http.Request) (*http.Response, error) {
+	resp, err := t.Transport.RoundTrip(req)
+	if resp != nil {
+		parseResponse(resp)
+	}
+	return resp, err
+}
+
+func parseResponse(resp *http.Response) {
+	rateLimit, err := strconv.Atoi(resp.Header.Get(headerRateLimit))
+	if err == nil {
+		metricRateLimit.Set(float64(rateLimit))
+	}
+	rateLimitRemaining, err := strconv.Atoi(resp.Header.Get(headerRateLimitRemaining))
+	if err == nil {
+		metricRateLimitRemaining.Set(float64(rateLimitRemaining))
+	}
+}
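
Review note: the wrapping-transport pattern used by `metrics.Transport` can be tried without Prometheus or GitHub. The sketch below is an illustration under assumptions: `rateLimitRecorder`, `observingTransport`, and `fetchOnce` are hypothetical names, plain integer fields stand in for the two gauges, and a local `httptest` server with made-up header values replaces the GitHub API.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strconv"
	"sync"
)

// rateLimitRecorder is a hypothetical stand-in for the two Prometheus gauges.
type rateLimitRecorder struct {
	mu               sync.Mutex
	limit, remaining int
}

// observe stores whichever rate-limit headers parse as integers and
// silently skips missing or malformed ones, as parseResponse does above.
func (r *rateLimitRecorder) observe(h http.Header) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if v, err := strconv.Atoi(h.Get("X-RateLimit-Limit")); err == nil {
		r.limit = v
	}
	if v, err := strconv.Atoi(h.Get("X-RateLimit-Remaining")); err == nil {
		r.remaining = v
	}
}

// observingTransport wraps another RoundTripper and inspects each response.
type observingTransport struct {
	next http.RoundTripper
	rec  *rateLimitRecorder
}

func (t observingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	resp, err := t.next.RoundTrip(req)
	if resp != nil {
		t.rec.observe(resp.Header)
	}
	return resp, err
}

// fetchOnce sends one request through the wrapper to a throwaway local
// server and returns the recorded values. The header values are made up.
func fetchOnce() (limit, remaining int, err error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.Header().Set("X-RateLimit-Limit", "5000")
		w.Header().Set("X-RateLimit-Remaining", "4999")
	}))
	defer srv.Close()

	rec := &rateLimitRecorder{}
	client := &http.Client{Transport: observingTransport{next: http.DefaultTransport, rec: rec}}
	resp, err := client.Get(srv.URL)
	if err != nil {
		return 0, 0, err
	}
	resp.Body.Close()
	return rec.limit, rec.remaining, nil
}

func main() {
	limit, remaining, err := fetchOnce()
	fmt.Println(limit, remaining, err)
}
```

Because the wrapper sits at the `http.RoundTripper` level, it sees every response regardless of which authentication transport is underneath, which matches how the diff layers it over both the token and the GitHub App transports.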
--- a/go.mod
+++ b/go.mod
@@ -11,6 +11,7 @@ require (
 	github.com/kelseyhightower/envconfig v1.4.0
 	github.com/onsi/ginkgo v1.8.0
 	github.com/onsi/gomega v1.5.0
+	github.com/prometheus/client_golang v0.9.2
 	github.com/stretchr/testify v1.4.0 // indirect
 	golang.org/x/oauth2 v0.0.0-20190604053449-0f29369cfe45
 	k8s.io/api v0.0.0-20190918155943-95b840bb6a1f
Author	SHA1	Message	Date
Yusuke Kuoka	434823bcb3	`scale{Up,Down}Adjustment` to add/remove constant number of replicas on scaling (#315) * `scale{Up,Down}Adjustment` to add/remove constant number of replicas on scaling Ref #305 * Bump chart version	2021-02-16 17:16:26 +09:00
Yusuke Kuoka	35d047db01	Fix enterprise runners misusing cached token (#314) Follow-up for #290	2021-02-16 12:56:52 +09:00
Yusuke Kuoka	f1db6af1c5	Add repository runners support for PercentageRunnersBusy-based autoscaling (#313) Resolves #258	2021-02-16 12:44:51 +09:00
Hidetake Iwata	4f3f2fb60d	Add metrics for GitHub API rate limit (#312)	2021-02-16 09:58:09 +09:00
Johannes Nicolai	2623140c9a	Make log message less scary (#311) * the reconciliation loop is often much faster than the runner startup, so changing runner not found related messages to debug and also add the possibility that the runner just needs more time	2021-02-16 09:55:55 +09:00
Johannes Nicolai	1db9d9d574	Use ARM64 compatible kube-rbac-proxy from upstream (#310) * as pointed out in #281 the currently used image for the kube-rbac-proxy - gcr.io/kubebuilder/kube-rbac-proxy:v0.4.1 - does not have an ARM64 image * hence, trying to use the standard deployment manifest / helm chart will fail on ARM64 systems * replaced image with quay.io/brancz/kube-rbac-proxy:v0.8.0 which is the latest version from the upstream maintainer (https://github.com/brancz/kube-rbac-proxy/blob/master/Makefile#L13) * successfully tested on both AMD64 and ARM64 clusters * fixes #281	2021-02-16 09:55:03 +09:00