Merge pull request #398 from summerwind/fix-status-last-reg-check-time-type-err

Fix `status.lastRegistrationCheckTime in body must be of type string: "null"` error
Patch the runner status instead of updating it, to reduce conflicts and avoid future bugs
2025-12-10 11:41:27 +00:00 · 2021-03-18 10:36:44 +09:00 · 2021-03-18 10:31:17 +09:00 · 2021-03-18 10:26:21 +09:00 · 2021-03-18 10:20:49 +09:00 · 2021-03-18 07:36:22 +09:00
25 changed files with 1036 additions and 98 deletions
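
The validation error quoted in the commit message can be reproduced in miniature with the standard library alone. This is an illustrative sketch, not the controller's code: `*time.Time` stands in for `*metav1.Time` to keep the example dependency-free, and the struct names `statusWithoutOmitEmpty` and `statusWithOmitEmpty` and the helper `encode` are hypothetical. A nil pointer field tagged without `omitempty` is serialized as `null`, which a CRD schema declaring `type: string` without `nullable: true` would be expected to reject. A full-object update sends that `null`; a patch that carries only the changed fields need not, which is presumably part of the motivation for patching.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// statusWithoutOmitEmpty mirrors the shape of a status field whose JSON tag
// has no omitempty, so a nil pointer is emitted as an explicit null.
type statusWithoutOmitEmpty struct {
	LastRegistrationCheckTime *time.Time `json:"lastRegistrationCheckTime"`
}

// statusWithOmitEmpty is the same field with omitempty, so a nil pointer
// is dropped from the output entirely.
type statusWithOmitEmpty struct {
	LastRegistrationCheckTime *time.Time `json:"lastRegistrationCheckTime,omitempty"`
}

// encode marshals v to a JSON string. It panics on error because the
// types in this sketch always marshal successfully.
func encode(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(encode(statusWithoutOmitEmpty{})) // {"lastRegistrationCheckTime":null}
	fmt.Println(encode(statusWithOmitEmpty{}))    // {}
}
```
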
--- a/README.md
+++ b/README.md
@@ -292,7 +292,7 @@ A `RunnerDeployment` can scale the number of runners between `minReplicas` and `

 **TotalNumberOfQueuedAndInProgressWorkflowRuns**

-In the below example, `actions-runner` will pole GitHub for all pending workflows with the pole period defined by the sync period configuration. It will then scale to e.g. 3 if there're 3 pending jobs at sync time.
+In the below example, `actions-runner` will poll GitHub for all pending workflows with the poll period defined by the sync period configuration. It will then scale to e.g. 3 if there're 3 pending jobs at sync time.
 With this scaling metric we are required to define a list of repositories within our metric.

 The scale out performance is controlled via the manager containers startup `--sync-period` argument. The default value is set to 10 minutes to prevent default deployments rate limiting themselves from the GitHub API.
@@ -302,12 +302,12 @@ The scale out performance is controlled via the manager containers startup `--sy

 **Benefits of this metric**
 1. Supports named repositories allowing you to restrict the runner to a specified set of repositories server side.
-2. Scales quickly (within the bounds of the syncPeriod) as it will spin up the number of runners based on the depth of the workflow queue
+2. Scales the runner count based on the actual queue depth of the jobs meaning a more 1:1 scaling of runners to queued jobs.
 3. Like all scaling metrics, you can manage workflow allocation to the RunnerDeployment through the use of [Github labels](#runner-labels).

 **Drawbacks of this metric**
 1. Repositories must be named within the scaling metric, maintaining a list of repositories may not be viable in larger environments or self-serve environments.
-2. May not scale quick enough for some users needs
+2. May not scale quick enough for some users needs. This metric is pull based and so the queue depth is polled as configured by the sync period, as a result scaling performance is bound by this sync period meaning there is a lag to scaling activity.
 3. Relatively large amounts of API requests required to maintain this metric, you may run in API rate limiting issues depending on the size of your environment and how aggressive your sync period configuration is


@@ -349,20 +349,20 @@ spec:

 **PercentageRunnersBusy**

-The `HorizontalRunnerAutoscaler` will pole GitHub based on the configuration sync period for the number of busy runners which live in the RunnerDeployment's namespace and scale based on the settings
+The `HorizontalRunnerAutoscaler` will poll GitHub based on the configuration sync period for the number of busy runners which live in the RunnerDeployment's namespace and scale based on the settings

 **Kustomize Config :** The period can be customised in the `config/default/manager_auth_proxy_patch.yaml` patch<br />
 **Helm Config :** `syncPeriod`

 **Benefits of this metric**
-1. Allows for multiple controllers to be deployed as each controller deployed is responsible for scaling their own runner pods on a per namespace basis.
-2. Supports named repositories server side the same as the `TotalNumberOfQueuedAndInProgressWorkflowRuns` metric [#313](https://github.com/summerwind/actions-runner-controller/pull/313)
-3. Supports github organisation wide scaling without maintaining an explicit list of repositories, this is especially useful for those that are working at a larger scale. [#223](https://github.com/summerwind/actions-runner-controller/pull/223)
-4. Like all scaling metrics, you can manage workflow allocation to the RunnerDeployment through the use of [Github labels](#runner-labels)
-5. Supports scaling runner count on both a percentage increase / descrease basis as well as on a fixed runner count basis [#223](https://github.com/summerwind/actions-runner-controller/pull/223) [#315](https://github.com/summerwind/actions-runner-controller/pull/315)
+1. Supports named repositories server side the same as the `TotalNumberOfQueuedAndInProgressWorkflowRuns` metric [#313](https://github.com/summerwind/actions-runner-controller/pull/313)
+2. Supports GitHub organisation wide scaling without maintaining an explicit list of repositories, this is especially useful for those that are working at a larger scale. [#223](https://github.com/summerwind/actions-runner-controller/pull/223)
+3. Like all scaling metrics, you can manage workflow allocation to the RunnerDeployment through the use of [Github labels](#runner-labels)
+4. Supports scaling desired runner count on both a percentage increase / decrease basis as well as on a fixed increase / decrease count basis [#223](https://github.com/summerwind/actions-runner-controller/pull/223) [#315](https://github.com/summerwind/actions-runner-controller/pull/315)

 **Drawbacks of this metric**
-1. May not scale quick enough for some users needs as we are scaling up and down based on indicative information rather than a direct count of the workflow queue depth
+1. May not scale quick enough for some users needs. This metric is pull based and so the number of busy runners are polled as configured by the sync period, as a result scaling performance is bound by this sync period meaning there is a lag to scaling activity.
+2. We are scaling up and down based on indicative information rather than a count of the actual number of queued jobs and so the desired runner count is likely to under provision new runners or overprovision them relative to actual job queue depth, this may or may not be a problem for you.


 Examples of each scaling type implemented with a `RunnerDeployment` backed by a `HorizontalRunnerAutoscaler`:
--- a/api/v1alpha1/horizontalrunnerautoscaler_types.go
+++ b/api/v1alpha1/horizontalrunnerautoscaler_types.go
@@ -72,6 +72,12 @@ type GitHubEventScaleUpTriggerSpec struct {
 type CheckRunSpec struct {
 	Types  []string `json:"types,omitempty"`
 	Status string   `json:"status,omitempty"`
+
+	// Names is a list of GitHub Actions glob patterns.
+	// Any check_run event whose name matches one of patterns in the list can trigger autoscaling.
+	// Note that check_run name seem to equal to the job name you've defined in your actions workflow yaml file.
+	// So it is very likely that you can utilize this to trigger depending on the job.
+	Names []string `json:"names,omitempty"`
 }

 // https://docs.github.com/en/actions/reference/events-that-trigger-workflows#pull_request
@@ -150,6 +156,7 @@ type HorizontalRunnerAutoscalerStatus struct {
 	DesiredReplicas *int `json:"desiredReplicas,omitempty"`

 	// +optional
+	// +nullable
 	LastSuccessfulScaleOutTime *metav1.Time `json:"lastSuccessfulScaleOutTime,omitempty"`

 	// +optional
--- a/api/v1alpha1/runner_types.go
+++ b/api/v1alpha1/runner_types.go
@@ -92,6 +92,8 @@ type RunnerSpec struct {
 	DockerdWithinRunnerContainer *bool `json:"dockerdWithinRunnerContainer,omitempty"`
 	// +optional
 	DockerEnabled *bool `json:"dockerEnabled,omitempty"`
+	// +optional
+	DockerMTU *int64 `json:"dockerMTU,omitempty"`
 }

 // ValidateRepository validates repository field.
@@ -123,6 +125,9 @@ type RunnerStatus struct {
 	Phase        string                   `json:"phase"`
 	Reason       string                   `json:"reason"`
 	Message      string                   `json:"message"`
+
+	//+optional
+	LastRegistrationCheckTime *metav1.Time `json:"lastRegistrationCheckTime"`
 }

 // RunnerStatusRegistration contains runner registration status
--- a/api/v1alpha1/zz_generated.deepcopy.go
+++ b/api/v1alpha1/zz_generated.deepcopy.go
@@ -66,6 +66,11 @@ func (in *CheckRunSpec) DeepCopyInto(out *CheckRunSpec) {
 		*out = make([]string, len(*in))
 		copy(*out, *in)
 	}
+	if in.Names != nil {
+		in, out := &in.Names, &out.Names
+		*out = make([]string, len(*in))
+		copy(*out, *in)
+	}
 }

 // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new CheckRunSpec.
@@ -689,6 +694,11 @@ func (in *RunnerSpec) DeepCopyInto(out *RunnerSpec) {
 		*out = new(bool)
 		**out = **in
 	}
+	if in.DockerMTU != nil {
+		in, out := &in.DockerMTU, &out.DockerMTU
+		*out = new(int64)
+		**out = **in
+	}
 }

 // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new RunnerSpec.
@@ -705,6 +715,10 @@ func (in *RunnerSpec) DeepCopy() *RunnerSpec {
 func (in *RunnerStatus) DeepCopyInto(out *RunnerStatus) {
 	*out = *in
 	in.Registration.DeepCopyInto(&out.Registration)
+	if in.LastRegistrationCheckTime != nil {
+		in, out := &in.LastRegistrationCheckTime, &out.LastRegistrationCheckTime
+		*out = (*in).DeepCopy()
+	}
 }

 // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new RunnerStatus.
--- a/charts/actions-runner-controller/Chart.yaml
+++ b/charts/actions-runner-controller/Chart.yaml
@@ -15,7 +15,7 @@ type: application
 # This is the chart version. This version number should be incremented each time you make changes
 # to the chart and its templates, including the app version.
 # Versions are expected to follow Semantic Versioning (https://semver.org/)
-version: 0.8.0
+version: 0.10.2

 home: https://github.com/summerwind/actions-runner-controller

--- a/charts/actions-runner-controller/crds/actions.summerwind.dev_horizontalrunnerautoscalers.yaml
+++ b/charts/actions-runner-controller/crds/actions.summerwind.dev_horizontalrunnerautoscalers.yaml
@@ -148,6 +148,17 @@ spec:
                      checkRun:
                        description: https://docs.github.com/en/actions/reference/events-that-trigger-workflows#check_run
                        properties:
+                          names:
+                            description: Names is a list of GitHub Actions glob patterns.
+                              Any check_run event whose name matches one of patterns
+                              in the list can trigger autoscaling. Note that check_run
+                              name seem to equal to the job name you've defined in
+                              your actions workflow yaml file. So it is very likely
+                              that you can utilize this to trigger depending on the
+                              job.
+                            items:
+                              type: string
+                            type: array
                          status:
                            type: string
                          types:
@@ -196,6 +207,7 @@ spec:
              type: integer
            lastSuccessfulScaleOutTime:
              format: date-time
+              nullable: true
              type: string
            observedGeneration:
              description: ObservedGeneration is the most recent generation observed
--- a/charts/actions-runner-controller/crds/actions.summerwind.dev_runnerdeployments.yaml
+++ b/charts/actions-runner-controller/crds/actions.summerwind.dev_runnerdeployments.yaml
@@ -433,6 +433,9 @@ spec:
                      type: array
                    dockerEnabled:
                      type: boolean
+                    dockerMTU:
+                      format: int64
+                      type: integer
                    dockerdContainerResources:
                      description: ResourceRequirements describes the compute resource requirements.
                      properties:
--- a/charts/actions-runner-controller/crds/actions.summerwind.dev_runnerreplicasets.yaml
+++ b/charts/actions-runner-controller/crds/actions.summerwind.dev_runnerreplicasets.yaml
@@ -433,6 +433,9 @@ spec:
                      type: array
                    dockerEnabled:
                      type: boolean
+                    dockerMTU:
+                      format: int64
+                      type: integer
                    dockerdContainerResources:
                      description: ResourceRequirements describes the compute resource requirements.
                      properties:
--- a/charts/actions-runner-controller/crds/actions.summerwind.dev_runners.yaml
+++ b/charts/actions-runner-controller/crds/actions.summerwind.dev_runners.yaml
@@ -398,6 +398,9 @@ spec:
              type: array
            dockerEnabled:
              type: boolean
+            dockerMTU:
+              format: int64
+              type: integer
            dockerdContainerResources:
              description: ResourceRequirements describes the compute resource requirements.
              properties:
@@ -1538,6 +1541,9 @@ spec:
        status:
          description: RunnerStatus defines the observed state of Runner
          properties:
+            lastRegistrationCheckTime:
+              format: date-time
+              type: string
            message:
              type: string
            phase:
--- a/charts/actions-runner-controller/templates/certificate.yaml
+++ b/charts/actions-runner-controller/templates/certificate.yaml
@@ -5,7 +5,7 @@ apiVersion: cert-manager.io/v1
 kind: Issuer
 metadata:
  name: {{ include "actions-runner-controller.selfsignedIssuerName" . }}
-  namespace: {{ .Namespace }}
+  namespace: {{ .Release.Namespace }}
 spec:
  selfSigned: {}
 ---
@@ -13,7 +13,7 @@ apiVersion: cert-manager.io/v1
 kind: Certificate
 metadata:
  name: {{ include "actions-runner-controller.servingCertName" . }}
-  namespace: {{ .Namespace }}
+  namespace: {{ .Release.Namespace }}
 spec:
  dnsNames:
  - {{ include "actions-runner-controller.webhookServiceName" . }}.{{ .Release.Namespace }}.svc
--- a/config/crd/bases/actions.summerwind.dev_horizontalrunnerautoscalers.yaml
+++ b/config/crd/bases/actions.summerwind.dev_horizontalrunnerautoscalers.yaml
@@ -148,6 +148,17 @@ spec:
                      checkRun:
                        description: https://docs.github.com/en/actions/reference/events-that-trigger-workflows#check_run
                        properties:
+                          names:
+                            description: Names is a list of GitHub Actions glob patterns.
+                              Any check_run event whose name matches one of patterns
+                              in the list can trigger autoscaling. Note that check_run
+                              name seem to equal to the job name you've defined in
+                              your actions workflow yaml file. So it is very likely
+                              that you can utilize this to trigger depending on the
+                              job.
+                            items:
+                              type: string
+                            type: array
                          status:
                            type: string
                          types:
@@ -196,6 +207,7 @@ spec:
              type: integer
            lastSuccessfulScaleOutTime:
              format: date-time
+              nullable: true
              type: string
            observedGeneration:
              description: ObservedGeneration is the most recent generation observed
--- a/config/crd/bases/actions.summerwind.dev_runnerdeployments.yaml
+++ b/config/crd/bases/actions.summerwind.dev_runnerdeployments.yaml
@@ -433,6 +433,9 @@ spec:
                      type: array
                    dockerEnabled:
                      type: boolean
+                    dockerMTU:
+                      format: int64
+                      type: integer
                    dockerdContainerResources:
                      description: ResourceRequirements describes the compute resource requirements.
                      properties:
--- a/config/crd/bases/actions.summerwind.dev_runnerreplicasets.yaml
+++ b/config/crd/bases/actions.summerwind.dev_runnerreplicasets.yaml
@@ -433,6 +433,9 @@ spec:
                      type: array
                    dockerEnabled:
                      type: boolean
+                    dockerMTU:
+                      format: int64
+                      type: integer
                    dockerdContainerResources:
                      description: ResourceRequirements describes the compute resource requirements.
                      properties:
--- a/config/crd/bases/actions.summerwind.dev_runners.yaml
+++ b/config/crd/bases/actions.summerwind.dev_runners.yaml
@@ -398,6 +398,9 @@ spec:
              type: array
            dockerEnabled:
              type: boolean
+            dockerMTU:
+              format: int64
+              type: integer
            dockerdContainerResources:
              description: ResourceRequirements describes the compute resource requirements.
              properties:
@@ -1538,6 +1541,9 @@ spec:
        status:
          description: RunnerStatus defines the observed state of Runner
          properties:
+            lastRegistrationCheckTime:
+              format: date-time
+              type: string
            message:
              type: string
            phase:
--- a/controllers/autoscaling.go
+++ b/controllers/autoscaling.go
@@ -71,7 +71,13 @@ func (r *HorizontalRunnerAutoscalerReconciler) determineDesiredReplicas(rd v1alp
 	}

 	metrics := hra.Spec.Metrics
-	if len(metrics) == 0 || metrics[0].Type == v1alpha1.AutoscalingMetricTypeTotalNumberOfQueuedAndInProgressWorkflowRuns {
+	if len(metrics) == 0 {
+		if len(hra.Spec.ScaleUpTriggers) == 0 {
+			return r.calculateReplicasByQueuedAndInProgressWorkflowRuns(rd, hra)
+		}
+
+		return hra.Spec.MinReplicas, nil
+	} else if metrics[0].Type == v1alpha1.AutoscalingMetricTypeTotalNumberOfQueuedAndInProgressWorkflowRuns {
 		return r.calculateReplicasByQueuedAndInProgressWorkflowRuns(rd, hra)
 	} else if metrics[0].Type == v1alpha1.AutoscalingMetricTypePercentageRunnersBusy {
 		return r.calculateReplicasByPercentageRunnersBusy(rd, hra)
@@ -91,6 +97,13 @@ func (r *HorizontalRunnerAutoscalerReconciler) calculateReplicasByQueuedAndInPro
 			return nil, fmt.Errorf("asserting runner deployment spec to detect bug: spec.template.organization should not be empty on this code path")
 		}

+		// In case it's an organizational runners deployment without any scaling metrics defined,
+		// we assume that the desired replicas should always be `minReplicas + capacityReservedThroughWebhook`.
+		// See https://github.com/summerwind/actions-runner-controller/issues/377#issuecomment-793372693
+		if len(metrics) == 0 {
+			return hra.Spec.MinReplicas, nil
+		}
+
 		if len(metrics[0].RepositoryNames) == 0 {
 			return nil, errors.New("validating autoscaling metrics: spec.autoscaling.metrics[].repositoryNames is required and must have one more more entries for organizational runner deployment")
 		}
@@ -259,17 +272,26 @@ func (r *HorizontalRunnerAutoscalerReconciler) calculateReplicasByPercentageRunn
 		scaleDownFactor = sdf
 	}

-	selector, err := metav1.LabelSelectorAsSelector(rd.Spec.Selector)
+	// return the list of runners in namespace. Horizontal Runner Autoscaler should only be responsible for scaling resources in its own ns.
+	var runnerList v1alpha1.RunnerList
+
+	var opts []client.ListOption
+
+	opts = append(opts, client.InNamespace(rd.Namespace))
+
+	selector, err := metav1.LabelSelectorAsSelector(getSelector(&rd))
 	if err != nil {
 		return nil, err
 	}
-	// return the list of runners in namespace. Horizontal Runner Autoscaler should only be responsible for scaling resources in its own ns.
-	var runnerList v1alpha1.RunnerList
+
+	opts = append(opts, client.MatchingLabelsSelector{Selector: selector})
+
+	r.Log.V(2).Info("Finding runners with selector", "ns", rd.Namespace)
+
 	if err := r.List(
 		ctx,
 		&runnerList,
-		client.InNamespace(rd.Namespace),
-		client.MatchingLabelsSelector{Selector: selector},
+		opts...,
 	); err != nil {
 		if !kerrors.IsNotFound(err) {
 			return nil, err
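
The branching added to `determineDesiredReplicas` in `controllers/autoscaling.go` above can be summarised as a small decision function. The following is a simplified sketch under stated assumptions, not the controller's implementation: the `strategy` type, its constants, and `chooseStrategy` are hypothetical names, the metric type strings are assumed to match the names used in the README, and the `strategyUnsupported` fallback is an assumption because the diff does not show the final `else` branch. Per the code comment in the diff, capacity reserved through webhooks is added on top of `minReplicas` elsewhere.

```go
package main

import "fmt"

// strategy is a hypothetical label for which calculation would be used.
type strategy string

const (
	strategyQueuedRuns  strategy = "QueuedAndInProgressWorkflowRuns"
	strategyPercentBusy strategy = "PercentageRunnersBusy"
	strategyMinReplicas strategy = "MinReplicasOnly"
	strategyUnsupported strategy = "Unsupported" // assumed fallback, not shown in the diff
)

// chooseStrategy mirrors the branching in the diff: with no metrics, fall
// back to the queued-runs calculation only when no scale-up triggers are
// configured; otherwise rely on minReplicas and let webhooks add capacity.
func chooseStrategy(metricTypes []string, scaleUpTriggerCount int) strategy {
	if len(metricTypes) == 0 {
		if scaleUpTriggerCount == 0 {
			return strategyQueuedRuns
		}
		return strategyMinReplicas
	}

	// Only the first metric is consulted, as in the diff.
	switch metricTypes[0] {
	case "TotalNumberOfQueuedAndInProgressWorkflowRuns":
		return strategyQueuedRuns
	case "PercentageRunnersBusy":
		return strategyPercentBusy
	default:
		return strategyUnsupported
	}
}

func main() {
	fmt.Println(chooseStrategy(nil, 0))
	fmt.Println(chooseStrategy(nil, 1))
	fmt.Println(chooseStrategy([]string{"PercentageRunnersBusy"}, 1))
}
```

The second call is the case the new integration tests exercise: an autoscaler with `ScaleUpTriggers` but no `Metrics` stays at `minReplicas` until a webhook event arrives.
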
--- a/controllers/horizontal_runner_autoscaler_webhook.go
+++ b/controllers/horizontal_runner_autoscaler_webhook.go
@@ -159,6 +159,13 @@ func (autoscaler *HorizontalRunnerAutoscalerGitHubWebhook) Handle(w http.Respons
 			e.Repo.Owner.GetType(),
 			autoscaler.MatchPullRequestEvent(e),
 		)
+
+		if pullRequest := e.PullRequest; pullRequest != nil {
+			log = log.WithValues(
+				"pullRequest.base.ref", e.PullRequest.Base.GetRef(),
+				"action", e.GetAction(),
+			)
+		}
 	case *gogithub.CheckRunEvent:
 		target, err = autoscaler.getScaleUpTarget(
 			context.TODO(),
@@ -168,6 +175,13 @@ func (autoscaler *HorizontalRunnerAutoscalerGitHubWebhook) Handle(w http.Respons
 			e.Repo.Owner.GetType(),
 			autoscaler.MatchCheckRunEvent(e),
 		)
+
+		if checkRun := e.GetCheckRun(); checkRun != nil {
+			log = log.WithValues(
+				"checkRun.status", checkRun.GetStatus(),
+				"action", e.GetAction(),
+			)
+		}
 	case *gogithub.PingEvent:
 		ok = true

@@ -195,9 +209,11 @@ func (autoscaler *HorizontalRunnerAutoscalerGitHubWebhook) Handle(w http.Respons
 	}

 	if target == nil {
-		msg := "no horizontalrunnerautoscaler to scale for this github event"
+		log.Info(
+			"Scale target not found. If this is unexpected, ensure that there is exactly one repository-wide or organizational runner deployment that matches this webhook event",
+		)

-		log.Info(msg, "eventType", webhookType)
+		msg := "no horizontalrunnerautoscaler to scale for this github event"

 		ok = true

@@ -365,10 +381,6 @@ func (autoscaler *HorizontalRunnerAutoscalerGitHubWebhook) getScaleUpTarget(ctx
 		return target, nil
 	}

-	log.Info(
-		"Scale target not found. If this is unexpected, ensure that there is exactly one repository-wide or organizational runner deployment that matches this webhook event",
-	)
-
 	return nil, nil
 }

--- a/controllers/horizontal_runner_autoscaler_webhook_on_check_run.go
+++ b/controllers/horizontal_runner_autoscaler_webhook_on_check_run.go
@@ -3,6 +3,7 @@ package controllers
 import (
 	"github.com/google/go-github/v33/github"
 	"github.com/summerwind/actions-runner-controller/api/v1alpha1"
+	"github.com/summerwind/actions-runner-controller/pkg/actionsglob"
 )

 func (autoscaler *HorizontalRunnerAutoscalerGitHubWebhook) MatchCheckRunEvent(event *github.CheckRunEvent) func(scaleUpTrigger v1alpha1.ScaleUpTrigger) bool {
@@ -27,6 +28,16 @@ func (autoscaler *HorizontalRunnerAutoscalerGitHubWebhook) MatchCheckRunEvent(ev
 			return false
 		}

+		if checkRun := event.CheckRun; checkRun != nil && len(cr.Names) > 0 {
+			for _, pat := range cr.Names {
+				if r := actionsglob.Match(pat, checkRun.GetName()); r {
+					return true
+				}
+			}
+
+			return false
+		}
+
 		return true
 	}
 }
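
The `Names` filter added to `MatchCheckRunEvent` above follows an "any pattern matches" rule: an empty list places no restriction, otherwise at least one pattern must match the check run name. The sketch below illustrates only that rule. It deliberately swaps in the standard library's `path.Match` for the repository's `actionsglob.Match`, whose exact glob semantics are not shown in this diff and may differ; `matchesNames` is a hypothetical helper, and the nil check on `event.CheckRun` from the original is omitted for brevity.

```go
package main

import (
	"fmt"
	"path"
)

// matchesNames reports whether checkRunName passes the names filter.
// An empty pattern list means "no restriction", as in the diff.
// NOTE: path.Match is a stand-in for actionsglob.Match; semantics may differ.
func matchesNames(patterns []string, checkRunName string) bool {
	if len(patterns) == 0 {
		return true
	}

	for _, pat := range patterns {
		// path.Match returns an error only for malformed patterns;
		// this sketch treats a malformed pattern as a non-match.
		if ok, err := path.Match(pat, checkRunName); err == nil && ok {
			return true
		}
	}

	return false
}

func main() {
	fmt.Println(matchesNames(nil, "build-linux"))
	fmt.Println(matchesNames([]string{"build*"}, "build-linux"))
	fmt.Println(matchesNames([]string{"test*", "lint"}, "build-linux"))
}
```
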
--- a/controllers/integration_test.go
+++ b/controllers/integration_test.go
@@ -174,6 +174,93 @@ var _ = Context("INTEGRATION: Inside of a new namespace", func() {

 	Describe("when no existing resources exist", func() {

+		It("should create and scale organizational runners without any scaling metrics on pull_request event", func() {
+			name := "example-runnerdeploy"
+
+			{
+				rd := &actionsv1alpha1.RunnerDeployment{
+					ObjectMeta: metav1.ObjectMeta{
+						Name:      name,
+						Namespace: ns.Name,
+					},
+					Spec: actionsv1alpha1.RunnerDeploymentSpec{
+						Replicas: intPtr(1),
+						Selector: &metav1.LabelSelector{
+							MatchLabels: map[string]string{
+								"foo": "bar",
+							},
+						},
+						Template: actionsv1alpha1.RunnerTemplate{
+							ObjectMeta: metav1.ObjectMeta{
+								Labels: map[string]string{
+									"foo": "bar",
+								},
+							},
+							Spec: actionsv1alpha1.RunnerSpec{
+								Organization: "test",
+								Image:        "bar",
+								Group:        "baz",
+								Env: []corev1.EnvVar{
+									{Name: "FOO", Value: "FOOVALUE"},
+								},
+							},
+						},
+					},
+				}
+
+				ExpectCreate(ctx, rd, "test RunnerDeployment")
+				ExpectRunnerSetsCountEventuallyEquals(ctx, ns.Name, 1)
+				ExpectRunnerSetsManagedReplicasCountEventuallyEquals(ctx, ns.Name, 1)
+			}
+
+			// Scale-up to 2 replicas
+			{
+				hra := &actionsv1alpha1.HorizontalRunnerAutoscaler{
+					ObjectMeta: metav1.ObjectMeta{
+						Name:      name,
+						Namespace: ns.Name,
+					},
+					Spec: actionsv1alpha1.HorizontalRunnerAutoscalerSpec{
+						ScaleTargetRef: actionsv1alpha1.ScaleTargetRef{
+							Name: name,
+						},
+						MinReplicas:                       intPtr(2),
+						MaxReplicas:                       intPtr(5),
+						ScaleDownDelaySecondsAfterScaleUp: intPtr(1),
+						Metrics:                           nil,
+						ScaleUpTriggers: []actionsv1alpha1.ScaleUpTrigger{
+							{
+								GitHubEvent: &actionsv1alpha1.GitHubEventScaleUpTriggerSpec{
+									PullRequest: &actionsv1alpha1.PullRequestSpec{
+										Types:    []string{"created"},
+										Branches: []string{"main"},
+									},
+								},
+								Amount:   1,
+								Duration: metav1.Duration{Duration: time.Minute},
+							},
+						},
+					},
+				}
+
+				ExpectCreate(ctx, hra, "test HorizontalRunnerAutoscaler")
+
+				ExpectRunnerSetsCountEventuallyEquals(ctx, ns.Name, 1)
+				ExpectRunnerSetsManagedReplicasCountEventuallyEquals(ctx, ns.Name, 2)
+				ExpectHRAStatusCacheEntryLengthEventuallyEquals(ctx, ns.Name, name, 1)
+			}
+
+			{
+				env.ExpectRegisteredNumberCountEventuallyEquals(2, "count of fake runners after HRA creation")
+			}
+
+			// Scale-up to 3 replicas on second pull_request create webhook event
+			{
+				env.SendOrgPullRequestEvent("test", "valid", "main", "created")
+				ExpectRunnerSetsManagedReplicasCountEventuallyEquals(ctx, ns.Name, 3, "runners after second webhook event")
+			}
+		})
+
 		It("should create and scale organization's repository runners on pull_request event", func() {
 			name := "example-runnerdeploy"

@@ -235,7 +322,11 @@ var _ = Context("INTEGRATION: Inside of a new namespace", func() {
 						MinReplicas:                       intPtr(1),
 						MaxReplicas:                       intPtr(3),
 						ScaleDownDelaySecondsAfterScaleUp: intPtr(1),
-						Metrics:                           nil,
+						Metrics: []actionsv1alpha1.MetricSpec{
+							{
+								Type: actionsv1alpha1.AutoscalingMetricTypeTotalNumberOfQueuedAndInProgressWorkflowRuns,
+							},
+						},
 						ScaleUpTriggers: []actionsv1alpha1.ScaleUpTrigger{
 							{
 								GitHubEvent: &actionsv1alpha1.GitHubEventScaleUpTriggerSpec{
@@ -360,7 +451,11 @@ var _ = Context("INTEGRATION: Inside of a new namespace", func() {
 						MinReplicas:                       intPtr(1),
 						MaxReplicas:                       intPtr(5),
 						ScaleDownDelaySecondsAfterScaleUp: intPtr(1),
-						Metrics:                           nil,
+						Metrics: []actionsv1alpha1.MetricSpec{
+							{
+								Type: actionsv1alpha1.AutoscalingMetricTypeTotalNumberOfQueuedAndInProgressWorkflowRuns,
+							},
+						},
 						ScaleUpTriggers: []actionsv1alpha1.ScaleUpTrigger{
 							{
 								GitHubEvent: &actionsv1alpha1.GitHubEventScaleUpTriggerSpec{
@@ -406,6 +501,110 @@ var _ = Context("INTEGRATION: Inside of a new namespace", func() {
 			env.ExpectRegisteredNumberCountEventuallyEquals(5, "count of fake list runners")
 		})

+		It("should create and scale organization's repository runners only on check_run event", func() {
+			name := "example-runnerdeploy"
+
+			{
+				rd := &actionsv1alpha1.RunnerDeployment{
+					ObjectMeta: metav1.ObjectMeta{
+						Name:      name,
+						Namespace: ns.Name,
+					},
+					Spec: actionsv1alpha1.RunnerDeploymentSpec{
+						Replicas: intPtr(1),
+						Selector: &metav1.LabelSelector{
+							MatchLabels: map[string]string{
+								"foo": "bar",
+							},
+						},
+						Template: actionsv1alpha1.RunnerTemplate{
+							ObjectMeta: metav1.ObjectMeta{
+								Labels: map[string]string{
+									"foo": "bar",
+								},
+							},
+							Spec: actionsv1alpha1.RunnerSpec{
+								Repository: "test/valid",
+								Image:      "bar",
+								Group:      "baz",
+								Env: []corev1.EnvVar{
+									{Name: "FOO", Value: "FOOVALUE"},
+								},
+							},
+						},
+					},
+				}
+
+				ExpectCreate(ctx, rd, "test RunnerDeployment")
+				ExpectRunnerSetsCountEventuallyEquals(ctx, ns.Name, 1)
+				ExpectRunnerSetsManagedReplicasCountEventuallyEquals(ctx, ns.Name, 1)
+			}
+
+			{
+				env.ExpectRegisteredNumberCountEventuallyEquals(1, "count of fake list runners")
+			}
+
+			// Scale-up to 3 replicas by the default TotalNumberOfQueuedAndInProgressWorkflowRuns-based scaling
+			// See workflowRunsFor3Replicas_queued and workflowRunsFor3Replicas_in_progress for GitHub List-Runners API responses
+			// used while testing.
+			{
+				hra := &actionsv1alpha1.HorizontalRunnerAutoscaler{
+					ObjectMeta: metav1.ObjectMeta{
+						Name:      name,
+						Namespace: ns.Name,
+					},
+					Spec: actionsv1alpha1.HorizontalRunnerAutoscalerSpec{
+						ScaleTargetRef: actionsv1alpha1.ScaleTargetRef{
+							Name: name,
+						},
+						MinReplicas:                       intPtr(1),
+						MaxReplicas:                       intPtr(5),
+						ScaleDownDelaySecondsAfterScaleUp: intPtr(1),
+						ScaleUpTriggers: []actionsv1alpha1.ScaleUpTrigger{
+							{
+								GitHubEvent: &actionsv1alpha1.GitHubEventScaleUpTriggerSpec{
+									CheckRun: &actionsv1alpha1.CheckRunSpec{
+										Types:  []string{"created"},
+										Status: "pending",
+									},
+								},
+								Amount:   1,
+								Duration: metav1.Duration{Duration: time.Minute},
+							},
+						},
+					},
+				}
+
+				ExpectCreate(ctx, hra, "test HorizontalRunnerAutoscaler")
+
+				ExpectRunnerSetsCountEventuallyEquals(ctx, ns.Name, 1)
+				ExpectRunnerSetsManagedReplicasCountEventuallyEquals(ctx, ns.Name, 1)
+			}
+
+			{
+				env.ExpectRegisteredNumberCountEventuallyEquals(1, "count of fake list runners")
+			}
+
+			// Scale-up to 2 replicas on first check_run create webhook event
+			{
+				env.SendOrgCheckRunEvent("test", "valid", "pending", "created")
+				ExpectRunnerSetsCountEventuallyEquals(ctx, ns.Name, 1, "runner sets after webhook")
+				ExpectRunnerSetsManagedReplicasCountEventuallyEquals(ctx, ns.Name, 2, "runners after first webhook event")
+			}
+
+			{
+				env.ExpectRegisteredNumberCountEventuallyEquals(2, "count of fake list runners")
+			}
+
+			// Scale-up to 3 replicas on second check_run create webhook event
+			{
+				env.SendOrgCheckRunEvent("test", "valid", "pending", "created")
+				ExpectRunnerSetsManagedReplicasCountEventuallyEquals(ctx, ns.Name, 3, "runners after second webhook event")
+			}
+
+			env.ExpectRegisteredNumberCountEventuallyEquals(3, "count of fake list runners")
+		})
+
 		It("should create and scale user's repository runners on pull_request event", func() {
 			name := "example-runnerdeploy"

@@ -467,7 +666,11 @@ var _ = Context("INTEGRATION: Inside of a new namespace", func() {
 						MinReplicas:                       intPtr(1),
 						MaxReplicas:                       intPtr(3),
 						ScaleDownDelaySecondsAfterScaleUp: intPtr(1),
-						Metrics:                           nil,
+						Metrics: []actionsv1alpha1.MetricSpec{
+							{
+								Type: actionsv1alpha1.AutoscalingMetricTypeTotalNumberOfQueuedAndInProgressWorkflowRuns,
+							},
+						},
 						ScaleUpTriggers: []actionsv1alpha1.ScaleUpTrigger{
 							{
 								GitHubEvent: &actionsv1alpha1.GitHubEventScaleUpTriggerSpec{
@@ -535,6 +738,99 @@ var _ = Context("INTEGRATION: Inside of a new namespace", func() {
 			}
 		})

+		It("should create and scale user's repository runners only on pull_request event", func() {
+			name := "example-runnerdeploy"
+
+			{
+				rd := &actionsv1alpha1.RunnerDeployment{
+					ObjectMeta: metav1.ObjectMeta{
+						Name:      name,
+						Namespace: ns.Name,
+					},
+					Spec: actionsv1alpha1.RunnerDeploymentSpec{
+						Replicas: intPtr(1),
+						Selector: &metav1.LabelSelector{
+							MatchLabels: map[string]string{
+								"foo": "bar",
+							},
+						},
+						Template: actionsv1alpha1.RunnerTemplate{
+							ObjectMeta: metav1.ObjectMeta{
+								Labels: map[string]string{
+									"foo": "bar",
+								},
+							},
+							Spec: actionsv1alpha1.RunnerSpec{
+								Repository: "test/valid",
+								Image:      "bar",
+								Group:      "baz",
+								Env: []corev1.EnvVar{
+									{Name: "FOO", Value: "FOOVALUE"},
+								},
+							},
+						},
+					},
+				}
+
+				ExpectCreate(ctx, rd, "test RunnerDeployment")
+				ExpectRunnerSetsCountEventuallyEquals(ctx, ns.Name, 1)
+				ExpectRunnerSetsManagedReplicasCountEventuallyEquals(ctx, ns.Name, 1)
+			}
+
+			{
+				hra := &actionsv1alpha1.HorizontalRunnerAutoscaler{
+					ObjectMeta: metav1.ObjectMeta{
+						Name:      name,
+						Namespace: ns.Name,
+					},
+					Spec: actionsv1alpha1.HorizontalRunnerAutoscalerSpec{
+						ScaleTargetRef: actionsv1alpha1.ScaleTargetRef{
+							Name: name,
+						},
+						MinReplicas:                       intPtr(1),
+						MaxReplicas:                       intPtr(3),
+						ScaleDownDelaySecondsAfterScaleUp: intPtr(1),
+						ScaleUpTriggers: []actionsv1alpha1.ScaleUpTrigger{
+							{
+								GitHubEvent: &actionsv1alpha1.GitHubEventScaleUpTriggerSpec{
+									PullRequest: &actionsv1alpha1.PullRequestSpec{
+										Types:    []string{"created"},
+										Branches: []string{"main"},
+									},
+								},
+								Amount:   1,
+								Duration: metav1.Duration{Duration: time.Minute},
+							},
+						},
+					},
+				}
+
+				ExpectCreate(ctx, hra, "test HorizontalRunnerAutoscaler")
+
+				ExpectRunnerSetsCountEventuallyEquals(ctx, ns.Name, 1)
+				ExpectRunnerSetsManagedReplicasCountEventuallyEquals(ctx, ns.Name, 1)
+			}
+
+			{
+				env.ExpectRegisteredNumberCountEventuallyEquals(1, "count of fake runners after HRA creation")
+			}
+
+			// Scale-up to 2 replicas on first pull_request create webhook event
+			{
+				env.SendUserPullRequestEvent("test", "valid", "main", "created")
+				ExpectRunnerSetsCountEventuallyEquals(ctx, ns.Name, 1, "runner sets after webhook")
+				ExpectRunnerSetsManagedReplicasCountEventuallyEquals(ctx, ns.Name, 2, "runners after first webhook event")
+				ExpectHRADesiredReplicasEquals(ctx, ns.Name, name, 2, "runner deployment desired replicas")
+			}
+
+			// Scale-up to 3 replicas on second pull_request create webhook event
+			{
+				env.SendUserPullRequestEvent("test", "valid", "main", "created")
+				ExpectRunnerSetsManagedReplicasCountEventuallyEquals(ctx, ns.Name, 3, "runners after second webhook event")
+				ExpectHRADesiredReplicasEquals(ctx, ns.Name, name, 3, "runner deployment desired replicas")
+			}
+		})
+
 		It("should create and scale user's repository runners on check_run event", func() {
 			name := "example-runnerdeploy"

@@ -594,7 +890,11 @@ var _ = Context("INTEGRATION: Inside of a new namespace", func() {
 						MinReplicas:                       intPtr(1),
 						MaxReplicas:                       intPtr(5),
 						ScaleDownDelaySecondsAfterScaleUp: intPtr(1),
-						Metrics:                           nil,
+						Metrics: []actionsv1alpha1.MetricSpec{
+							{
+								Type: actionsv1alpha1.AutoscalingMetricTypeTotalNumberOfQueuedAndInProgressWorkflowRuns,
+							},
+						},
 						ScaleUpTriggers: []actionsv1alpha1.ScaleUpTrigger{
 							{
 								GitHubEvent: &actionsv1alpha1.GitHubEventScaleUpTriggerSpec{
@@ -640,6 +940,110 @@ var _ = Context("INTEGRATION: Inside of a new namespace", func() {
 			env.ExpectRegisteredNumberCountEventuallyEquals(5, "count of fake list runners")
 		})

+		It("should create and scale user's repository runners only on check_run event", func() {
+			name := "example-runnerdeploy"
+
+			{
+				rd := &actionsv1alpha1.RunnerDeployment{
+					ObjectMeta: metav1.ObjectMeta{
+						Name:      name,
+						Namespace: ns.Name,
+					},
+					Spec: actionsv1alpha1.RunnerDeploymentSpec{
+						Replicas: intPtr(1),
+						Selector: &metav1.LabelSelector{
+							MatchLabels: map[string]string{
+								"foo": "bar",
+							},
+						},
+						Template: actionsv1alpha1.RunnerTemplate{
+							ObjectMeta: metav1.ObjectMeta{
+								Labels: map[string]string{
+									"foo": "bar",
+								},
+							},
+							Spec: actionsv1alpha1.RunnerSpec{
+								Repository: "test/valid",
+								Image:      "bar",
+								Group:      "baz",
+								Env: []corev1.EnvVar{
+									{Name: "FOO", Value: "FOOVALUE"},
+								},
+							},
+						},
+					},
+				}
+
+				ExpectCreate(ctx, rd, "test RunnerDeployment")
+				ExpectRunnerSetsCountEventuallyEquals(ctx, ns.Name, 1)
+				ExpectRunnerSetsManagedReplicasCountEventuallyEquals(ctx, ns.Name, 1)
+			}
+
+			{
+				env.ExpectRegisteredNumberCountEventuallyEquals(1, "count of fake list runners")
+			}
+
+			// Only ScaleUpTriggers are defined here (no Metrics), so metrics-based scaling is disabled
+			// and the replica count is expected to stay at 1 right after the HRA is created.
+			{
+				hra := &actionsv1alpha1.HorizontalRunnerAutoscaler{
+					ObjectMeta: metav1.ObjectMeta{
+						Name:      name,
+						Namespace: ns.Name,
+					},
+					Spec: actionsv1alpha1.HorizontalRunnerAutoscalerSpec{
+						ScaleTargetRef: actionsv1alpha1.ScaleTargetRef{
+							Name: name,
+						},
+						MinReplicas:                       intPtr(1),
+						MaxReplicas:                       intPtr(5),
+						ScaleDownDelaySecondsAfterScaleUp: intPtr(1),
+						ScaleUpTriggers: []actionsv1alpha1.ScaleUpTrigger{
+							{
+								GitHubEvent: &actionsv1alpha1.GitHubEventScaleUpTriggerSpec{
+									CheckRun: &actionsv1alpha1.CheckRunSpec{
+										Types:  []string{"created"},
+										Status: "pending",
+									},
+								},
+								Amount:   1,
+								Duration: metav1.Duration{Duration: time.Minute},
+							},
+						},
+					},
+				}
+
+				ExpectCreate(ctx, hra, "test HorizontalRunnerAutoscaler")
+
+				ExpectRunnerSetsCountEventuallyEquals(ctx, ns.Name, 1)
+				ExpectRunnerSetsManagedReplicasCountEventuallyEquals(ctx, ns.Name, 1)
+			}
+
+			{
+				env.ExpectRegisteredNumberCountEventuallyEquals(1, "count of fake list runners")
+			}
+
+			// Scale-up to 2 replicas on first check_run create webhook event
+			{
+				env.SendUserCheckRunEvent("test", "valid", "pending", "created")
+				ExpectRunnerSetsCountEventuallyEquals(ctx, ns.Name, 1, "runner sets after webhook")
+				ExpectRunnerSetsManagedReplicasCountEventuallyEquals(ctx, ns.Name, 2, "runners after first webhook event")
+			}
+
+			{
+				env.ExpectRegisteredNumberCountEventuallyEquals(2, "count of fake list runners")
+			}
+
+			// Scale-up to 3 replicas on second check_run create webhook event
+			{
+				env.SendUserCheckRunEvent("test", "valid", "pending", "created")
+				ExpectRunnerSetsManagedReplicasCountEventuallyEquals(ctx, ns.Name, 3, "runners after second webhook event")
+			}
+
+			env.ExpectRegisteredNumberCountEventuallyEquals(3, "count of fake list runners")
+		})
+
 	})
 })

--- a/controllers/runner_controller.go
+++ b/controllers/runner_controller.go
@@ -22,6 +22,7 @@ import (
 	"fmt"
 	gogithub "github.com/google/go-github/v33/github"
 	"github.com/summerwind/actions-runner-controller/hash"
+	"k8s.io/apimachinery/pkg/util/wait"
 	"strings"
 	"time"

@@ -129,8 +130,8 @@ func (r *RunnerReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
 			newRunner := runner.DeepCopy()
 			newRunner.ObjectMeta.Finalizers = finalizers

-			if err := r.Update(ctx, newRunner); err != nil {
-				log.Error(err, "Failed to update runner")
+			if err := r.Patch(ctx, newRunner, client.MergeFrom(&runner)); err != nil {
+				log.Error(err, "Failed to update runner for finalizer removal")
 				return ctrl.Result{}, err
 			}

@@ -159,31 +160,23 @@ func (r *RunnerReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
 		}

 		if err := r.Create(ctx, &newPod); err != nil {
+			if kerrors.IsAlreadyExists(err) {
+				// Gracefully handle pod-already-exists errors due to informer cache delay.
+				// Without this we got a few errors like the below on new runner pod:
+				// 2021-03-16T00:23:10.116Z        ERROR   controller-runtime.controller   Reconciler error      {"controller": "runner-controller", "request": "default/example-runnerdeploy-b2g2g-j4mcp", "error": "pods \"example-runnerdeploy-b2g2g-j4mcp\" already exists"}
+				log.Info("Runner pod already exists. Probably this pod was already created in a previous reconciliation but the new pod is not yet cached.")
+
+				return ctrl.Result{RequeueAfter: 10 * time.Second}, nil
+			}
+
 			log.Error(err, "Failed to create pod resource")
+
 			return ctrl.Result{}, err
 		}

 		r.Recorder.Event(&runner, corev1.EventTypeNormal, "PodCreated", fmt.Sprintf("Created pod '%s'", newPod.Name))
 		log.Info("Created runner pod", "repository", runner.Spec.Repository)
 	} else {
-		// If pod has ended up succeeded we need to restart it
-		// Happens e.g. when dind is in runner and run completes
-		restart := pod.Status.Phase == corev1.PodSucceeded
-
-		if !restart && runner.Status.Phase != string(pod.Status.Phase) {
-			updated := runner.DeepCopy()
-			updated.Status.Phase = string(pod.Status.Phase)
-			updated.Status.Reason = pod.Status.Reason
-			updated.Status.Message = pod.Status.Message
-
-			if err := r.Status().Update(ctx, updated); err != nil {
-				log.Error(err, "Failed to update runner status")
-				return ctrl.Result{}, err
-			}
-
-			return ctrl.Result{}, nil
-		}
-
 		if !pod.ObjectMeta.DeletionTimestamp.IsZero() {
 			deletionTimeout := 1 * time.Minute
 			currentTime := time.Now()
@@ -220,6 +213,10 @@ func (r *RunnerReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
 			}
 		}

+		// If pod has ended up succeeded we need to restart it
+		// Happens e.g. when dind is in runner and run completes
+		restart := pod.Status.Phase == corev1.PodSucceeded
+
 		if pod.Status.Phase == corev1.PodRunning {
 			for _, status := range pod.Status.ContainerStatuses {
 				if status.Name != containerName {
@@ -244,24 +241,45 @@ func (r *RunnerReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
 			return ctrl.Result{}, err
 		}

+		var registrationRecheckDelay time.Duration
+
 		// all checks done below only decide whether a restart is needed
 		// if a restart was already decided before, there is no need for the checks
-		// saving API calls and scary log messages
+		// saving API calls and scary log messages
 		if !restart {
+			registrationCheckInterval := time.Minute

-			notRegistered := false
+			// We want to call ListRunners GitHub Actions API only once per runner per minute.
+			// This if block, in conjunction with:
+			//   return ctrl.Result{RequeueAfter: registrationRecheckDelay}, nil
+			// achieves that.
+			if lastCheckTime := runner.Status.LastRegistrationCheckTime; lastCheckTime != nil {
+				nextCheckTime := lastCheckTime.Add(registrationCheckInterval)
+				if nextCheckTime.After(time.Now()) {
+					log.Info(
+						fmt.Sprintf("Skipping registration check because it's deferred until %s", nextCheckTime),
+					)
+
+					// Note that we don't need to explicitly requeue on this reconciliation because
+					// the requeue should have been already scheduled previously
+					// (with `return ctrl.Result{RequeueAfter: registrationRecheckDelay}, nil` as noted above and coded below)
+					return ctrl.Result{}, nil
+				}
+			}
+
+			notFound := false
 			offline := false

 			runnerBusy, err := r.GitHubClient.IsRunnerBusy(ctx, runner.Spec.Enterprise, runner.Spec.Organization, runner.Spec.Repository, runner.Name)
+
+			currentTime := time.Now()
+
 			if err != nil {
 				var notFoundException *github.RunnerNotFound
 				var offlineException *github.RunnerOffline
 				if errors.As(err, &notFoundException) {
-					log.V(1).Info("Failed to check if runner is busy. Either this runner has never been successfully registered to GitHub or it still needs more time.", "runnerName", runner.Name)
-
-					notRegistered = true
+					notFound = true
 				} else if errors.As(err, &offlineException) {
-					log.V(1).Info("GitHub runner appears to be offline, waiting for runner to get online ...", "runnerName", runner.Name)
 					offline = true
 				} else {
 					var e *gogithub.RateLimitError
@@ -293,40 +311,91 @@ func (r *RunnerReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
 			}

 			registrationTimeout := 10 * time.Minute
-			currentTime := time.Now()
-			registrationDidTimeout := currentTime.Sub(pod.CreationTimestamp.Add(registrationTimeout)) > 0
+			durationAfterRegistrationTimeout := currentTime.Sub(pod.CreationTimestamp.Add(registrationTimeout))
+			registrationDidTimeout := durationAfterRegistrationTimeout > 0

-			if notRegistered && registrationDidTimeout {
-				log.Info(
-					"Runner failed to register itself to GitHub in timely manner. "+
-						"Recreating the pod to see if it resolves the issue. "+
-						"CAUTION: If you see this a lot, you should investigate the root cause. "+
-						"See https://github.com/summerwind/actions-runner-controller/issues/288",
-					"podCreationTimestamp", pod.CreationTimestamp,
-					"currentTime", currentTime,
-					"configuredRegistrationTimeout", registrationTimeout,
-				)
+			if notFound {
+				if registrationDidTimeout {
+					log.Info(
+						"Runner failed to register itself to GitHub in timely manner. "+
+							"Recreating the pod to see if it resolves the issue. "+
+							"CAUTION: If you see this a lot, you should investigate the root cause. "+
+							"See https://github.com/summerwind/actions-runner-controller/issues/288",
+						"podCreationTimestamp", pod.CreationTimestamp,
+						"currentTime", currentTime,
+						"configuredRegistrationTimeout", registrationTimeout,
+					)

-				restart = true
+					restart = true
+				} else {
+					log.V(1).Info(
+						"Runner pod exists but we failed to check if runner is busy. Apparently it still needs more time.",
+						"runnerName", runner.Name,
+					)
+				}
+			} else if offline {
+				if registrationDidTimeout {
+					log.Info(
+						"Already existing GitHub runner still appears offline . "+
+							"Recreating the pod to see if it resolves the issue. "+
+							"CAUTION: If you see this a lot, you should investigate the root cause. ",
+						"podCreationTimestamp", pod.CreationTimestamp,
+						"currentTime", currentTime,
+						"configuredRegistrationTimeout", registrationTimeout,
+					)
+
+					restart = true
+				} else {
+					log.V(1).Info(
+						"Runner pod exists but the GitHub runner appears to be still offline. Waiting for runner to get online ...",
+						"runnerName", runner.Name,
+					)
+				}
 			}

-			if offline && registrationDidTimeout {
-				log.Info(
-					"Already existing GitHub runner still appears offline . "+
-						"Recreating the pod to see if it resolves the issue. "+
-						"CAUTION: If you see this a lot, you should investigate the root cause. ",
-					"podCreationTimestamp", pod.CreationTimestamp,
-					"currentTime", currentTime,
-					"configuredRegistrationTimeout", registrationTimeout,
-				)
-
-				restart = true
+			if (notFound || offline) && !registrationDidTimeout {
+				registrationRecheckDelay = registrationCheckInterval + wait.Jitter(10*time.Second, 0.1)
 			}
-
 		}

 		// Don't do anything if there's no need to restart the runner
 		if !restart {
+			// This guard enables us to update runner.Status.Phase to `Running` only after
+			// the runner is registered to GitHub.
+			if registrationRecheckDelay > 0 {
+				log.V(1).Info(fmt.Sprintf("Rechecking the runner registration in %s", registrationRecheckDelay))
+
+				updated := runner.DeepCopy()
+				updated.Status.LastRegistrationCheckTime = &metav1.Time{Time: time.Now()}
+
+				if err := r.Status().Patch(ctx, updated, client.MergeFrom(&runner)); err != nil {
+					log.Error(err, "Failed to update runner status for LastRegistrationCheckTime")
+					return ctrl.Result{}, err
+				}
+
+				return ctrl.Result{RequeueAfter: registrationRecheckDelay}, nil
+			}
+
+			if runner.Status.Phase != string(pod.Status.Phase) {
+				if pod.Status.Phase == corev1.PodRunning {
+					// Seeing this message, you can expect the runner to become `Running` soon.
+					log.Info(
+						"Runner appears to have registered and running.",
+						"podCreationTimestamp", pod.CreationTimestamp,
+					)
+				}
+
+				updated := runner.DeepCopy()
+				updated.Status.Phase = string(pod.Status.Phase)
+				updated.Status.Reason = pod.Status.Reason
+				updated.Status.Message = pod.Status.Message
+
+				if err := r.Status().Patch(ctx, updated, client.MergeFrom(&runner)); err != nil {
+					log.Error(err, "Failed to update runner status for Phase/Reason/Message")
+					return ctrl.Result{}, err
+				}
+			}
+
 			return ctrl.Result{}, nil
 		}

@@ -394,8 +463,8 @@ func (r *RunnerReconciler) updateRegistrationToken(ctx context.Context, runner v
 		ExpiresAt:    metav1.NewTime(rt.GetExpiresAt().Time),
 	}

-	if err := r.Status().Update(ctx, updated); err != nil {
-		log.Error(err, "Failed to update runner status")
+	if err := r.Status().Patch(ctx, updated, client.MergeFrom(&runner)); err != nil {
+		log.Error(err, "Failed to update runner status for Registration")
 		return false, err
 	}

@@ -530,6 +599,15 @@ func (r *RunnerReconciler) newPod(runner v1alpha1.Runner) (corev1.Pod, error) {
 		},
 	}

+	if mtu := runner.Spec.DockerMTU; mtu != nil && dockerdInRunner {
+		pod.Spec.Containers[0].Env = append(pod.Spec.Containers[0].Env, []corev1.EnvVar{
+			{
+				Name:  "MTU",
+				Value: fmt.Sprintf("%d", *runner.Spec.DockerMTU),
+			},
+		}...)
+	}
+
 	if !dockerdInRunner && dockerEnabled {
 		runnerVolumeName := "runner"
 		runnerVolumeMountPath := "/runner"
@@ -612,6 +690,15 @@ func (r *RunnerReconciler) newPod(runner v1alpha1.Runner) (corev1.Pod, error) {
 			Resources: runner.Spec.DockerdContainerResources,
 		})

+		if mtu := runner.Spec.DockerMTU; mtu != nil {
+			pod.Spec.Containers[1].Env = append(pod.Spec.Containers[1].Env, []corev1.EnvVar{
+				{
+					Name:  "DOCKERD_ROOTLESS_ROOTLESSKIT_MTU",
+					Value: fmt.Sprintf("%d", *runner.Spec.DockerMTU),
+				},
+			}...)
+		}
+
 	}

 	if len(runner.Spec.Containers) != 0 {
--- a/controllers/runnerdeployment_controller.go
+++ b/controllers/runnerdeployment_controller.go
@@ -41,7 +41,8 @@ import (
 )

 const (
-	LabelKeyRunnerTemplateHash = "runner-template-hash"
+	LabelKeyRunnerTemplateHash   = "runner-template-hash"
+	LabelKeyRunnerDeploymentName = "runner-deployment-name"

 	runnerSetOwnerKey = ".metadata.controller"
 )
@@ -193,7 +194,10 @@ func (r *RunnerDeploymentReconciler) Reconcile(req ctrl.Request) (ctrl.Result, e
 				Namespace: newestSet.Namespace,
 				Name:      newestSet.Name,
 			}).
-				Info("Waiting until the newest runner replica set to be 100% available")
+				Info("Waiting until the newest runner replica set to be 100% available",
+					"ready", readyReplicas,
+					"desired", currentDesiredReplicas,
+				)

 			return ctrl.Result{RequeueAfter: 10 * time.Second}, nil
 		}
@@ -326,23 +330,32 @@ func (r *RunnerDeploymentReconciler) newRunnerReplicaSet(rd v1alpha1.RunnerDeplo
 	return newRunnerReplicaSet(&rd, r.CommonRunnerLabels, r.Scheme)
 }

+func getSelector(rd *v1alpha1.RunnerDeployment) *metav1.LabelSelector {
+	selector := rd.Spec.Selector
+	if selector == nil {
+		selector = &metav1.LabelSelector{MatchLabels: map[string]string{LabelKeyRunnerDeploymentName: rd.Name}}
+	}
+
+	return selector
+}
+
 func newRunnerReplicaSet(rd *v1alpha1.RunnerDeployment, commonRunnerLabels []string, scheme *runtime.Scheme) (*v1alpha1.RunnerReplicaSet, error) {
 	newRSTemplate := *rd.Spec.Template.DeepCopy()

-	templateHash := ComputeHash(&newRSTemplate)
-	// Add template hash label to selector.
-	labels := CloneAndAddLabel(rd.Spec.Template.Labels, LabelKeyRunnerTemplateHash, templateHash)
-
 	for _, l := range commonRunnerLabels {
 		newRSTemplate.Spec.Labels = append(newRSTemplate.Spec.Labels, l)
 	}

-	newRSTemplate.Labels = labels
+	templateHash := ComputeHash(&newRSTemplate)
+
+	// Add template hash label to selector.
+	newRSTemplate.ObjectMeta.Labels = CloneAndAddLabel(newRSTemplate.ObjectMeta.Labels, LabelKeyRunnerTemplateHash, templateHash)
+
+	// This label selector is used by default when rd.Spec.Selector is empty.
+	newRSTemplate.ObjectMeta.Labels = CloneAndAddLabel(newRSTemplate.ObjectMeta.Labels, LabelKeyRunnerDeploymentName, rd.Name)
+
+	selector := getSelector(rd)

-	selector := rd.Spec.Selector
-	if selector == nil {
-		selector = &metav1.LabelSelector{MatchLabels: labels}
-	}
 	newRSSelector := CloneSelectorAndAddLabel(selector, LabelKeyRunnerTemplateHash, templateHash)

 	rs := v1alpha1.RunnerReplicaSet{
@@ -350,7 +363,7 @@ func newRunnerReplicaSet(rd *v1alpha1.RunnerDeployment, commonRunnerLabels []str
 		ObjectMeta: metav1.ObjectMeta{
 			GenerateName: rd.ObjectMeta.Name + "-",
 			Namespace:    rd.ObjectMeta.Namespace,
-			Labels:       labels,
+			Labels:       newRSTemplate.ObjectMeta.Labels,
 		},
 		Spec: v1alpha1.RunnerReplicaSetSpec{
 			Replicas: rd.Spec.Replicas,
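The default-selector fallback added by `getSelector` can be illustrated without the Kubernetes types. This is a rough sketch under the assumption that plain `map[string]string` values stand in for `metav1.LabelSelector.MatchLabels`; `selectorOrDefault` is a hypothetical name used only here.

```go
package main

import "fmt"

// Same label key as LabelKeyRunnerDeploymentName in the diff above.
const labelKeyRunnerDeploymentName = "runner-deployment-name"

// selectorOrDefault is a hypothetical, simplified stand-in for getSelector.
// It returns the explicit match labels when present, and otherwise falls back
// to a selector keyed on the deployment name so the selector is never empty.
func selectorOrDefault(explicit map[string]string, deploymentName string) map[string]string {
	if explicit != nil {
		return explicit
	}
	return map[string]string{labelKeyRunnerDeploymentName: deploymentName}
}

func main() {
	fmt.Println(selectorOrDefault(nil, "example-runnerdeploy"))
	fmt.Println(selectorOrDefault(map[string]string{"foo": "bar"}, "example-runnerdeploy"))
}
```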
--- a/github/github.go
+++ b/github/github.go
@@ -5,6 +5,7 @@ import (
 	"fmt"
 	"net/http"
 	"net/url"
+	"os"
 	"strings"
 	"sync"
 	"time"
@@ -39,10 +40,20 @@ func (c *Config) NewClient() (*Client, error) {
 	if len(c.Token) > 0 {
 		transport = oauth2.NewClient(context.Background(), oauth2.StaticTokenSource(&oauth2.Token{AccessToken: c.Token})).Transport
 	} else {
-		tr, err := ghinstallation.NewKeyFromFile(http.DefaultTransport, c.AppID, c.AppInstallationID, c.AppPrivateKey)
-		if err != nil {
-			return nil, fmt.Errorf("authentication failed: %v", err)
+		var tr *ghinstallation.Transport
+
+		if _, err := os.Stat(c.AppPrivateKey); err == nil {
+			tr, err = ghinstallation.NewKeyFromFile(http.DefaultTransport, c.AppID, c.AppInstallationID, c.AppPrivateKey)
+			if err != nil {
+				return nil, fmt.Errorf("authentication failed: using private key at %s: %v", c.AppPrivateKey, err)
+			}
+		} else {
+			tr, err = ghinstallation.New(http.DefaultTransport, c.AppID, c.AppInstallationID, []byte(c.AppPrivateKey))
+			if err != nil {
+				return nil, fmt.Errorf("authentication failed: using private key of size %d (%s...): %v", len(c.AppPrivateKey), strings.Split(c.AppPrivateKey, "\n")[0], err)
+			}
 		}
+
 		if len(c.EnterpriseURL) > 0 {
 			githubAPIURL, err := getEnterpriseApiUrl(c.EnterpriseURL)
 			if err != nil {
@@ -147,7 +158,7 @@ func (c *Client) ListRunners(ctx context.Context, enterprise, org, repo string)

 	var runners []*github.Runner

-	opts := github.ListOptions{PerPage: 10}
+	opts := github.ListOptions{PerPage: 100}
 	for {
 		list, res, err := c.listRunners(ctx, enterprise, owner, repo, &opts)

--- a/pkg/actionsglob/README.md
+++ b/pkg/actionsglob/README.md
@@ -0,0 +1,8 @@
+This package is an implementation of glob that is intended to simulate the behaviour of 
+https://github.com/actions/toolkit/tree/master/packages/glob in many cases.
+
+This isn't a complete reimplementation of the referenced nodejs package.
+
+Differences:
+
+- This package doesn't implement `**`
--- a/pkg/actionsglob/actionsglob.go
+++ b/pkg/actionsglob/actionsglob.go
@@ -0,0 +1,78 @@
+package actionsglob
+
+import (
+	"fmt"
+	"strings"
+)
+
+func Match(pat string, s string) bool {
+	if len(pat) == 0 {
+		panic(fmt.Sprintf("unexpected length of pattern: %d", len(pat)))
+	}
+
+	var inverse bool
+
+	if pat[0] == '!' {
+		pat = pat[1:]
+		inverse = true
+	}
+
+	tokens := strings.SplitAfter(pat, "*")
+
+	var wildcardInHead bool
+
+	for i := 0; i < len(tokens); i++ {
+		p := tokens[i]
+
+		if p == "" {
+			s = ""
+			break
+		}
+
+		if p == "*" {
+			if i == len(tokens)-1 {
+				s = ""
+				break
+			}
+
+			wildcardInHead = true
+
+			continue
+		}
+
+		wildcardInTail := p[len(p)-1] == '*'
+		if wildcardInTail {
+			p = p[:len(p)-1]
+		}
+
+		subs := strings.SplitN(s, p, 2)
+
+		if len(subs) == 0 {
+			break
+		}
+
+		if subs[0] != "" {
+			if !wildcardInHead {
+				break
+			}
+		}
+
+		if subs[1] != "" {
+			if !wildcardInTail {
+				break
+			}
+		}
+
+		s = subs[1]
+
+		wildcardInHead = wildcardInTail
+	}
+
+	r := s == ""
+
+	if inverse {
+		r = !r
+	}
+
+	return r
+}
--- a/pkg/actionsglob/match_test.go
+++ b/pkg/actionsglob/match_test.go
@@ -0,0 +1,214 @@
+package actionsglob
+
+import (
+	"testing"
+)
+
+func TestMatch(t *testing.T) {
+	type testcase struct {
+		Pattern, Target string
+		Want            bool
+	}
+
+	run := func(t *testing.T, tc testcase) {
+		t.Helper()
+
+		got := Match(tc.Pattern, tc.Target)
+
+		if got != tc.Want {
+			t.Errorf("%s against %s: want %v, got %v", tc.Pattern, tc.Target, tc.Want, got)
+		}
+	}
+
+	t.Run("foo == foo", func(t *testing.T) {
+		run(t, testcase{
+			Pattern: "foo",
+			Target:  "foo",
+			Want:    true,
+		})
+	})
+
+	t.Run("!foo == foo", func(t *testing.T) {
+		run(t, testcase{
+			Pattern: "!foo",
+			Target:  "foo",
+			Want:    false,
+		})
+	})
+
+	t.Run("foo == foo1", func(t *testing.T) {
+		run(t, testcase{
+			Pattern: "foo",
+			Target:  "foo1",
+			Want:    false,
+		})
+	})
+
+	t.Run("!foo == foo1", func(t *testing.T) {
+		run(t, testcase{
+			Pattern: "!foo",
+			Target:  "foo1",
+			Want:    true,
+		})
+	})
+
+	t.Run("*foo == foo", func(t *testing.T) {
+		run(t, testcase{
+			Pattern: "*foo",
+			Target:  "foo",
+			Want:    true,
+		})
+	})
+
+	t.Run("!*foo == foo", func(t *testing.T) {
+		run(t, testcase{
+			Pattern: "!*foo",
+			Target:  "foo",
+			Want:    false,
+		})
+	})
+
+	t.Run("*foo == 1foo", func(t *testing.T) {
+		run(t, testcase{
+			Pattern: "*foo",
+			Target:  "1foo",
+			Want:    true,
+		})
+	})
+
+	t.Run("!*foo == 1foo", func(t *testing.T) {
+		run(t, testcase{
+			Pattern: "!*foo",
+			Target:  "1foo",
+			Want:    false,
+		})
+	})
+
+	t.Run("*foo == foo1", func(t *testing.T) {
+		run(t, testcase{
+			Pattern: "*foo",
+			Target:  "foo1",
+			Want:    false,
+		})
+	})
+
+	t.Run("!*foo == foo1", func(t *testing.T) {
+		run(t, testcase{
+			Pattern: "!*foo",
+			Target:  "foo1",
+			Want:    true,
+		})
+	})
+
+	t.Run("*foo* == foo1", func(t *testing.T) {
+		run(t, testcase{
+			Pattern: "*foo*",
+			Target:  "foo1",
+			Want:    true,
+		})
+	})
+
+	t.Run("!*foo* == foo1", func(t *testing.T) {
+		run(t, testcase{
+			Pattern: "!*foo*",
+			Target:  "foo1",
+			Want:    false,
+		})
+	})
+
+	t.Run("*foo == foobar", func(t *testing.T) {
+		run(t, testcase{
+			Pattern: "*foo",
+			Target:  "foobar",
+			Want:    false,
+		})
+	})
+
+	t.Run("!*foo == foobar", func(t *testing.T) {
+		run(t, testcase{
+			Pattern: "!*foo",
+			Target:  "foobar",
+			Want:    true,
+		})
+	})
+
+	t.Run("*foo* == foobar", func(t *testing.T) {
+		run(t, testcase{
+			Pattern: "*foo*",
+			Target:  "foobar",
+			Want:    true,
+		})
+	})
+
+	t.Run("!*foo* == foobar", func(t *testing.T) {
+		run(t, testcase{
+			Pattern: "!*foo*",
+			Target:  "foobar",
+			Want:    false,
+		})
+	})
+
+	t.Run("foo* == foo", func(t *testing.T) {
+		run(t, testcase{
+			Pattern: "foo*",
+			Target:  "foo",
+			Want:    true,
+		})
+	})
+
+	t.Run("!foo* == foo", func(t *testing.T) {
+		run(t, testcase{
+			Pattern: "!foo*",
+			Target:  "foo",
+			Want:    false,
+		})
+	})
+
+	t.Run("foo* == foobar", func(t *testing.T) {
+		run(t, testcase{
+			Pattern: "foo*",
+			Target:  "foobar",
+			Want:    true,
+		})
+	})
+
+	t.Run("!foo* == foobar", func(t *testing.T) {
+		run(t, testcase{
+			Pattern: "!foo*",
+			Target:  "foobar",
+			Want:    false,
+		})
+	})
+
+	t.Run("foo (* == foo ( 1 / 2 )", func(t *testing.T) {
+		run(t, testcase{
+			Pattern: "foo (*",
+			Target:  "foo ( 1 / 2 )",
+			Want:    true,
+		})
+	})
+
+	t.Run("!foo (* == foo ( 1 / 2 )", func(t *testing.T) {
+		run(t, testcase{
+			Pattern: "!foo (*",
+			Target:  "foo ( 1 / 2 )",
+			Want:    false,
+		})
+	})
+
+	t.Run("actions-*-metrics == actions-workflow-metrics", func(t *testing.T) {
+		run(t, testcase{
+			Pattern: "actions-*-metrics",
+			Target:  "actions-workflow-metrics",
+			Want:    true,
+		})
+	})
+
+	t.Run("!actions-*-metrics == actions-workflow-metrics", func(t *testing.T) {
+		run(t, testcase{
+			Pattern: "!actions-*-metrics",
+			Target:  "actions-workflow-metrics",
+			Want:    false,
+		})
+	})
+}
--- a/runner/startup.sh
+++ b/runner/startup.sh
@@ -33,5 +33,9 @@ for process in "${processes[@]}"; do
    fi
 done

+if [ -n "${MTU}" ]; then
+  ifconfig docker0 mtu "${MTU}" up
+fi
+
 # Wait processes to be running
 entrypoint.sh
Author	SHA1	Message	Date
Yusuke Kuoka	2929a739e3	Merge pull request #398 from summerwind/fix-status-last-reg-check-time-type-err Fix `status.lastRegistrationCheckTime in body must be of type string: \"null\"` error	2021-03-18 10:36:44 +09:00
Yusuke Kuoka	3cccca8d09	Do patch runner status instead of update to reduce conflicts and avoid future bugs Ref https://github.com/summerwind/actions-runner-controller/pull/398#issuecomment-801548375	2021-03-18 10:31:17 +09:00
Yusuke Kuoka	7a7086e7aa	Make error logs more helpful	2021-03-18 10:26:21 +09:00
Yusuke Kuoka	565b14a148	Fix `status.lastRegistrationCheckTime in body must be of type string: \"null\"` error Follow-up for #392	2021-03-18 10:20:49 +09:00
Yusuke Kuoka	ecc441de3f	Bump chart version	2021-03-18 07:36:22 +09:00
Manabu Sakai	25335bb3c3	Fix typo in certificate.yaml (#396)	2021-03-18 07:33:34 +09:00
Yusuke Kuoka	9b871567b1	Fix wildcard in middle of actionsglob/scaleUpTrigger.githubEvent.checkRun.names not working (#395) actionsglob patterns like `foo-*-bar` was not correctly working. Tests and the implementation was enhanced to correctly support it.	2021-03-17 06:46:48 +09:00
Balazs Gyurak	264cf494e3	Fix "pole" typo in README (#394) I think these should be "poll".	2021-03-17 06:34:01 +09:00
Yusuke Kuoka	3f23501b8e	Reduce "No runner matching the specified labels was found" errors while runner replacement (#392) We occasionally encountered those errors while the underlying RunnerReplicaSet is being recreated/replaced on RunnerDeployment.Spec.Template update. It turned out to be due to that the RunnerDeployment controller was waiting for the runner pod becomes `Running`, intead of the new replacement runner to have registered to GitHub. This fixes that, by trying to Runner.Status.Phase to `Running` only after the runner in the runner pod appears to be registered. A side-effect of this change is that runner controller would call more "ListRunners" GitHub Actions API. I've reviewed and improved the runner controller code and Runner CRD to make make the number of calls minimum. In most cases, ListRunners should be called only twice for each runner creation.	2021-03-16 10:52:30 +09:00
Yusuke Kuoka	5530030c67	Disable metrics-based autoscaling by default when scaleUpTriggers are enabled (#391) Relates to https://github.com/summerwind/actions-runner-controller/pull/379#discussion_r592813661 Relates to https://github.com/summerwind/actions-runner-controller/issues/377#issuecomment-793266609 When you defined HRA.Spec.ScaleUpTriggers[] but HRA.Spec.Metrics[], the HRA controller will now enable ScaleUpTriggers alone and insteaed of automatically enabling TotalNumberOfQueuedAndInProgressWorkflowRuns. This allows you to use ScaleUpTriggers alone, so that the autoscaling is done without calling GitHub API at all, which should grealy decrease the change of GitHub API calls get rate-limited.	2021-03-14 11:03:00 +09:00
Yusuke Kuoka	8d3a83b07a	Add CheckRun.Names scale-up trigger configuration (#390) This allows you to trigger autoscaling depending on check_run names(i.e. actions job names). If you are willing to differentiate scale amount only for a specific job, or want to scale only on a specific job, try this.	2021-03-14 10:21:42 +09:00
callum-tait-pbx	a6270b44d5	docs: fix typos and add PR link (#379) * docs: fix typos and add PR link * docs: changes based on feedback * docs: fixing numbers in list * docs: grammer * docs: better wording	2021-03-12 08:52:34 +09:00
Brandon Kimbrough	2273b198a1	Add ability to set the MTU size of the docker in docker container (#385) * adding abilitiy to set docker in docker MTU size * safeguards to only set MTU env var if it is set	2021-03-12 08:44:49 +09:00
Yusuke Kuoka	3d62e73f8c	Fix PercentageRunnersBusy scaling not working (#386) PercentageRunnerBusy seems to have regressed since #355 due to that RunnerDeployment.Spec.Selector is empty by default and the HRA controller was using that empty selector to query runners, which somehow returned 0 runners. This fixes that by using the newly added automatic `runner-deployment-name` label for the default runner label and the selector, which avoids querying with empty selector. Ref https://github.com/summerwind/actions-runner-controller/issues/377#issuecomment-795200205	2021-03-11 20:16:36 +09:00
Yusuke Kuoka	f5c639ae28	Make webhook-based autoscaler github event logs more operator-friendly (#384) Adds fields like `pullRequest.base.ref` and `checkRun.status` that are useful for verifying the autoscaling behaviour without browsing GitHub. Ref https://github.com/summerwind/actions-runner-controller/issues/377#issuecomment-794175312	2021-03-10 09:40:44 +09:00
Yusuke Kuoka	81016154c0	GITHUB_APP_PRIVATE_KEY can now be the content of the key (#383) Resolves #382	2021-03-10 09:37:15 +09:00
Yusuke Kuoka	728829be7b	Fix panic on scaling organizational runners (#381) Ref https://github.com/summerwind/actions-runner-controller/issues/377#issuecomment-793287133	2021-03-09 15:03:47 +09:00