Compare commits

..

278 Commits

Author SHA1 Message Date
John Stewart
8566a4f453 Don't set default caBundle for webhooks (#617)
* Don't set default caBundle for webhooks

Fixes #614

* bump chart version
2021-06-10 08:30:37 +09:00
toast-gear
3366dc9a63 docs: adding in the caveat to upgrade docs 2021-06-09 10:15:09 +01:00
toast-gear
fa94799ec8 chore/bump-helm-chart (#615)
* chore: bumping chart version

* chore: updating chart details
2021-06-08 19:24:50 +01:00
toast-gear
c424d1afee ci: ignore .md file changes everywhere 2021-06-08 18:32:08 +01:00
toast-gear
99f83a9bf0 ci: ignore any .md file changes anywhere 2021-06-08 18:29:17 +01:00
toast-gear
aa7d4c5ecc docs: adding docs for the chart values (#608)
* docs: adding docs for the chart values

* docs: updating the main docs

* docs: grammar fixes

* docs: updating proxy default

Co-authored-by: Callum James Tait <callum.tait@photobox.com>
2021-06-08 18:17:49 +01:00
Carus Kyle
552ee28072 chore: bump kube-rbac-proxy version (#609) 2021-06-08 18:16:30 +01:00
toast-gear
fa77facacd ci: adding negative paths for publish 2021-06-07 09:34:44 +01:00
callum-tait-pbx
5b28f3d964 ci: correcting negative paths (#606) 2021-06-07 09:31:55 +01:00
Yusuke Kuoka
c36748b8bc chart: Enhance the upgrade process to not require uninstalling (#605) 2021-06-07 09:00:40 +01:00
toast-gear
f16f5b0aa4 ci: ignore doc changes (#604) 2021-06-07 08:59:28 +09:00
toast-gear
c889b92f45 docs: adding in link to HIP (#603)
* docs: adding in link to HIP

* docs: improving wording
2021-06-07 08:59:05 +09:00
Rob Bos
46be20976a Fixing typos in documentation (#602) 2021-06-04 18:52:10 +01:00
Jonah Back
8c42f99d0b feat: avoid setting privileged flag if seLinuxOptions is not null (#599)
Sets the privileged flag to false if SELinuxOptions are present/defined. This is needed because containerd treats SELinux and Privileged controls as mutually exclusive. Also see https://github.com/containerd/cri/blob/aa2d5a97c/pkg/server/container_create.go#L164.

This allows users who use SELinux for managing privileged processes to use GH Actions - otherwise, based on the SELinux policy, the Docker in Docker container might not be privileged enough. 

Signed-off-by: Jonah Back <jonah@jonahback.com>
Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2021-06-04 08:59:11 +09:00
Tim Birkett
a93fd21f21 feat: add STARTUP_DELAY to entrypoint.sh (#592)
Ref #591 

Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2021-06-04 08:57:59 +09:00
Ameer Ghani
7523ea44f1 feat: allow specifying runtime class in runner spec (#580)
This allows using the `runtimeClassName` directive in the runner's spec.

One of the use-cases for this is Kata Containers, which use `runtimeClassName` in a pod spec as an indicator that the pod should run inside a Kata container. This allows us a greater degree of pod isolation.
2021-06-04 08:56:43 +09:00
Vladyslav Miletskyi
30ab0c0b71 Fix actions-runner-dind not to fail setting up MTU (#589)
Fixes #588
2021-06-04 08:54:46 +09:00
Pierre DEMAGNY
a72f190ef6 docs: add an annotation example in Additional Tweaks (#600) 2021-06-04 08:38:56 +09:00
toast-gear
cb60c1ec3b docs: add explicit permission list (#593)
Fixes https://github.com/actions-runner-controller/actions-runner-controller/issues/543

Co-authored-by: Callum James Tait <callum.tait@photobox.com>
2021-06-02 08:52:14 +09:00
Christian Dobinsky
e108e04dda chart: add podLabels to helm chart (#583)
* Add pod labels to helm chart

* fix: make podLabels consistent to podAnnotations

* Update charts/actions-runner-controller/Chart.yaml

Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2021-06-01 09:21:32 +09:00
toast-gear
2e083bca28 fix: fixing mising pip PATH (#585)
* fix: fixing mising pip PATH

* chore: removing User Site Directory

Co-authored-by: Callum James Tait <callum.tait@photobox.com>
2021-06-01 09:21:14 +09:00
toast-gear
198b13324d ci: only run latest tag job on push / release (#586)
* ci: only run latest tag job on merge

* ci: update job conditional
2021-06-01 09:18:50 +09:00
toast-gear
605dae3995 docs: add docs for upgrading the project when using Helm (#582)
* docs: adding upgrade notes for Helm

* chore: adding new ignore

* docs: add in cmd to check for stuck runners

* docs: better format

* docs: removing superfluous steps

* docs: moved location of docs

Co-authored-by: Callum James Tait <callum.tait@photobox.com>
2021-05-29 10:37:07 +09:00
toast-gear
d2b0920454 chore: removing dead chart parameters (#577)
* chore: removing autoscale parameters

* chore: removing dead parameter

* chore: removing dead parameters
2021-05-28 08:57:25 +09:00
Yair Fried
2cbeca0e7c chart: Add service monitor and remove kube_rbac_proxy leftovers (#527)
* remove all authProxy refs

* Add serviceMonitor

* fix metrics port

* fix newline

* fix newline

* bump chart version

* fix indentation typo

* Rename metrics.proxy

* Make metrics.portNumber configurable

* fix metrics port

* revert: chart version change

Co-authored-by: toast-gear <15716903+toast-gear@users.noreply.github.com>
2021-05-26 12:10:25 +01:00
Callum James Tait
859e04a680 chore: moving python to alphabetical order 2021-05-26 09:32:01 +09:00
Callum James Tait
c0821d4ede chore: correcting lists removal path 2021-05-26 09:32:01 +09:00
Callum James Tait
c3a6e45920 chore: aligning package order 2021-05-26 09:32:01 +09:00
Callum James Tait
818dfd6515 chore: whitespace alignment 2021-05-26 09:32:01 +09:00
Callum James Tait
726b39aedd feat: adding pip to base image 2021-05-26 09:32:01 +09:00
toast-gear
7638c21e92 docs: adding caveat to scaling metric (#570)
* docs: adding caveat to scaling metric

* docs: better wording

Fixes #338
2021-05-25 10:23:32 +09:00
Viktor Anderling
c09d6075c6 Add topologySpreadConstraints to helm chart (#569)
This commit adds the ability to use topologySpreadConstraints in the
helm chart by populating either one or both of topologySpreadConstraints
and githubWebhookServer.topologySpreadConstraints values.

See the official docs:
https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/

Resolves #567
2021-05-25 10:23:08 +09:00
callum-tait-pbx
39d37a7d28 docs: removing git version (#572)
The version of git bundled isn't pinned
2021-05-24 21:47:33 +01:00
toast-gear
de0315380d docs: better formating (#571) 2021-05-24 21:25:27 +01:00
toast-gear
906ddacbc6 chore: lowering daysUntilStale config (#568) 2021-05-24 09:41:24 +01:00
toast-gear
c388446668 docs: adding comment on permissions being included (#565)
* docs: adding comment on permissions being included

* docs: aligning text across readme
2021-05-22 20:05:19 +09:00
Yusuke Kuoka
d56971ca7c Fix typo (sucessfully -> successfully (#563)
Follow-up for #556
2021-05-22 08:36:18 +09:00
Yusuke Kuoka
cb14d7530b Add HRA printer column "SCHEDULE" (#561)
Adds a column to help the operator see if they configured HRA.Spec.ScheduledOverrides correctly, in a form of "next override schedule recognized by the controller":

```
$ k get horizontalrunnerautoscaler
NAME                            MIN   MAX   DESIRED   SCHEDULE
actions-runner-aos-autoscaler   0     5     0
org                             0     5     0         min=0 time=2021-05-21 15:00:00 +0000 UTC
```

Ref https://github.com/actions-runner-controller/actions-runner-controller/issues/484
2021-05-22 08:29:53 +09:00
Yusuke Kuoka
fbb24c8c0a chore: update issue templates (#559)
* Update bug_report.md

* chore: removing default label for enhancement

Co-authored-by: toast-gear <15716903+toast-gear@users.noreply.github.com>
2021-05-21 16:51:07 +01:00
Yusuke Kuoka
0b88b246d3 Fix additionalPrinterColumns (#556)
This fixes human-readable output of `kubectl get` on `runnerdeployment`, `runnerreplicaset`, and `runner`.

Most notably, CURRENT and READY of runner replicasets are now computed and printed correctly. Runner deployments now have UP-TO-DATE and AVAILABLE instead of READY so that it is consistent with columns of K8s deployments.

A few fixes has been also made to runner deployment and runner replicaset controllers so that those numbers stored in Status objects are reliably updated and in-sync with actual values.

Finally, `AGE` columns are added to runnerdeployment, runnerreplicaset, runnner to make that more visible to users.

`kubectl get` outputs should now look like the below examples:

```
# Immediately after runnerdeployment updated/created
$ k get runnerdeployment
NAME                   DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
example-runnerdeploy   0         0         0            0           8d
org-runnerdeploy       5         5         5            0           8d

# A few dozens of seconds after update/create all the runners are registered that "available" numbers increase
$ k get runnerdeployment
NAME                   DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
example-runnerdeploy   0         0         0            0           8d
org-runnerdeploy       5         5         5            5           8d
```

```
$ k get runnerreplicaset
NAME                         DESIRED   CURRENT   READY   AGE
example-runnerdeploy-wnpf6   0         0         0       61m
org-runnerdeploy-fsnmr       2         2         0       8m41s
```

```
$ k get runner
NAME                                           ENTERPRISE   ORGANIZATION                REPOSITORY                                       LABELS                      STATUS    AGE
example-runnerdeploy-wnpf6-registration-only                                            actions-runner-controller/mumoshu-actions-test                               Running   61m
org-runnerdeploy-fsnmr-n8kkx                                actions-runner-controller                                                    ["mylabel 1","mylabel 2"]             21s
org-runnerdeploy-fsnmr-sq6m8                                actions-runner-controller                                                    ["mylabel 1","mylabel 2"]             21s
```

Fixes #490
2021-05-21 09:10:47 +09:00
Yusuke Kuoka
a4631f345b Update issue templates (#552) 2021-05-18 18:15:00 +09:00
Yusuke Kuoka
7be31ce3e5 kubectl-diff / dry-run support (#549)
Resolves #266
2021-05-17 09:36:13 +09:00
toast-gear
57a7b8076f docs: correcting shell command (#548)
Fixes #546
2021-05-16 09:08:41 +09:00
ToMe25
5309b1c02c Fix acceptance test not working due to missing SYNC_PERIOD (#542)
Fixes #533
2021-05-11 20:30:34 +09:00
Yusuke Kuoka
ae09e6ebb7 Make log level configurable (#541)
Resolves #425
2021-05-11 20:23:06 +09:00
Yusuke Kuoka
3cd124dce3 chore: Add debug logs for scheduledOverrides (#540)
Follow-up for #515
Ref #484
2021-05-11 17:30:22 +09:00
Yusuke Kuoka
25f5817a5e Improve debug log in webhook-based autoscaling
Adds some helpful debug log messages I have used while verifying #534
2021-05-11 15:49:03 +09:00
Yusuke Kuoka
0510f19607 chore: Enhance acceptance test to cover webhook-based autoscaling for repo and org runners
Adds what I used while verifying #534
2021-05-11 15:36:02 +09:00
Yusuke Kuoka
9d961c58ff Log used settings on startup 2021-05-11 11:46:35 +09:00
Yusuke Kuoka
ab25907050 chart: Add githubAPICacheDuration
Ref #502
2021-05-11 11:46:35 +09:00
Yusuke Kuoka
6cbba80df1 Add --github-api-cache-duration
Resolves #502
2021-05-11 11:46:35 +09:00
Liam Gibson
082245c5db Fix typos in README.md (#528) 2021-05-08 21:29:11 +09:00
Yusuke Kuoka
a82e020daa Add notes for unreleased features (#526) 2021-05-05 14:59:36 +09:00
Yusuke Kuoka
c8c2d44a5c Add documentation for ScheduledOverrides (#525)
Ref #484
2021-05-05 14:54:50 +09:00
Yusuke Kuoka
4e7b8b57c0 edge: Enable scaling from zero with PercentageRunnersBusy (#524)
`PercentageRunnersBusy`, in combination with a secondary `TotalInProgressAndQueuedWorkflowRuns` metric, enables scale-from-zero for PercentageRunnersBusy.

Please see the new `Autoscaling to/from 0` section in the updated documentation about how it works.

Resolves #522
2021-05-05 14:27:17 +09:00
Yusuke Kuoka
e7020c7c0f Fix scale-from-zero to retain the reg-only runner until other pods come up (#523)
Fixes #516
2021-05-05 12:13:51 +09:00
Yair Fried
cb54864387 chart: Allow to disabling kube-rbac-proxy and expose metrics (#511)
Fixes #454
2021-05-03 23:36:01 +09:00
Yusuke Kuoka
0e0f385f72 Experimental support for ScheduledOverrides (#515)
This adds the initial version of ScheduledOverrides to HorizontalRunnerAutoscaler.
`MinReplicas` overriding should just work.
When there are two or more ScheduledOverrides, the earliest one that matched is activated. Each ScheduledOverride can be recurring or one-time. If you have two or more ScheduledOverrides, only one of them should be one-time. And the one-time override should be the earliest item in the list to make sense.

Tests will be added in another commit. Logging improvements and additional observability in HRA.Status will also be added in yet another commits.

Ref #484
2021-05-03 23:31:17 +09:00
Yusuke Kuoka
b3cae25741 Enhance HorizontalRunnerAutoscaler API for ScheduledOverrides (#514)
This adds types and CRD changes related to HorizontalRunnerAutoscaler for the upcoming ScheduledOverrides feature.

Ref #484
2021-05-03 22:31:54 +09:00
Yusuke Kuoka
469b117a09 Foundation for ScheduledOverrides (#513)
Adds two types `RecurrenceRule` and `Period` and one function `MatchSchedule` as the foundation for building the upcoming ScheduledOverrides feature.

Ref #484
2021-05-03 22:03:49 +09:00
Yusuke Kuoka
5f59734078 Fix docker-login failing since move to GitHub organization (#510)
Fixes #509
2021-05-03 14:56:58 +09:00
Yusuke Kuoka
e00b3b9714 Make development cycle faster (#508)
Improves Makefile, acceptance/deploy.sh, acceptance/testdata/runnerdeploy.yaml, and the documentation to help developers and contributors.
2021-05-03 13:03:17 +09:00
Thejas N
588872a316 feat: allow ephemeral runner to be optional (#498)
- Adds `ephemeral` option to `runner.spec` 
    
    ```
      ....
      template:
         spec:
             ephemeral: false
             repository: mumoshu/actions-runner-controller-ci
      ....
    ```
- `ephemeral` defaults to `true`
- `entrypoint.sh` in runner/Dockerfile modified to read `RUNNER_EPHEMERAL` flag
- Runner images are backward-compatible. `--once` is omitted only when the new envvar `RUNNER_EPHEMERAL` is explicitly set to `false`.

Resolves #457
2021-05-02 19:04:14 +09:00
Yusuke Kuoka
a0feee257f Add .dockerignore for controller to accelerate image rebuild in local dev env (#504)
Previously any non-go changes resulted in `make docker-build` rerunning time-consufming `go build`. This fixes that by adding clearly unnecessary files .dockerignore
2021-05-02 16:47:07 +09:00
Christoph Brand
a18ac330bb feature(controller): allow autoscaler to scale down to 0 (#447) 2021-05-02 16:46:51 +09:00
Yusuke Kuoka
0901456320 Update README with more detailed test instructions (#503)
- You can now use `make acceptance/run` to run only a specific acceptance test case
- Add note about Ubuntu 20.04 users / snap-provided docker
- Add instruction to run Ginkgo tests
- Extract acceptance/load from acceptance/kind
- Make `acceptance/pull` not depend on `docker-build`, so that you can do `make docker-build acceptance/load` for faster image reload
2021-05-02 16:31:07 +09:00
Yusuke Kuoka
dbd7b486d2 feat: Support for scaling from/to zero (#465)
This is an attempt to support scaling from/to zero.

The basic idea is that we create a one-off "registration-only" runner pod on RunnerReplicaSet being scaled to zero, so that there is one "offline" runner, which enables GitHub Actions to queue jobs instead of discarding those.

GitHub Actions seems to immediately throw away the new job when there are no runners at all. Generally, having runners of any status, `busy`, `idle`, or `offline` would prevent GitHub actions from failing jobs. But retaining `busy` or `idle` runners means that we need to keep runner pods running, which conflicts with our desired to scale to/from zero, hence we retain `offline` runners.

In this change, I enhanced the runnerreplicaset controller to create a registration-only runner on very beginning of its reconciliation logic, only when a runnerreplicaset is scaled to zero. The runner controller creates the registration-only runner pod, waits for it to become "offline", and then removes the runner pod. The runner on GitHub stays `offline`, until the runner resource on K8s is deleted. As we remove the registration-only runner pod as soon as it registers, this doesn't block cluster-autoscaler.

Related to #447
2021-05-02 16:11:36 +09:00
callum-tait-pbx
7e766282aa ci: updating paths-ignore (#496)
* chore: updating paths-ignore

* chore: adding more path-ignores
2021-05-01 21:36:45 +09:00
ToMe25
ba175148c8 Locally build runner image instead of pulling it (#473)
* Fix acceptance helm test not using newly built controller image

* Locally build runner image instead of pulling it

* Revert runner controller image pull policy to always

and add a line to the test deployment to use IfNotPresent

* Change runner repository from summerwind/action-runner to the owner of actions-runner-controller.

Also fix some Makefile formatting.

* Undo renaming acceptance/pull to docker-pull

* Some env var cleanup

Rename USERNAME to DOCKER_USER(is still used for github too tho)
Add RUNNER_NAME var(defaults to $DOCKER_USER/actions-runner)
Add TEST_REPO(defaults to $DOCKER_USER/actions-runner-controller)
2021-05-01 15:10:57 +09:00
callum-tait-pbx
358146ee54 docs: adding note on cloud tooling (#492)
* docs: adding note on cloud tooling

* docs: better grammar
2021-04-30 10:20:01 +09:00
callum-tait-pbx
e9dd16b023 chore: adding stale config (#487)
* chore: adding stale config

* chore: adding more labels

* chore: adding more exempt labels
2021-04-30 10:14:13 +09:00
callum-tait-pbx
1ba4098648 docs: updating to reflect new ownership (#491) 2021-04-30 10:11:58 +09:00
callum-tait-pbx
05fb8569b3 docs: updating helm install command (#485) 2021-04-27 09:12:30 +09:00
callum-tait-pbx
db45a375d0 chore: bump runner (#486)
* chore: bump runner

* chore: bumper runner in ci
2021-04-27 08:38:40 +09:00
Rolf Ahrenberg
81dd47a893 Document dockerMTU and dockerRegistryMirror (#482) 2021-04-26 09:52:09 +09:00
Rolf Ahrenberg
6b77a2a5a8 feat: Docker registry mirror (#478)
Changes:

- Switched to use `jq` in startup.sh
- Enable docker registry mirror configuration which is useful when e.g. avoiding the Docker Hub rate-limiting

Check #478 for how this feature is tested and supposed to be used.
2021-04-25 14:04:01 +09:00
callum-tait-pbx
dc4cf3f57b docs: better enterprise runner documentation (#477)
* docs: better Enterprise runner documentation

* docs: adding more detail
2021-04-25 13:33:47 +09:00
Yusuke Kuoka
d810b579a5 Update RELEASE_NOTE_TEMPLATE.md 2021-04-25 13:02:15 +09:00
Yusuke Kuoka
47c8de9dc3 Rename RELEASE_NOTE_TEMPLATE to RELEASE_NOTE_TEMPLATE.md 2021-04-25 13:01:20 +09:00
Yusuke Kuoka
74a53bde5e Add release note template (#481)
So that everyone can contribute enhancements and fixes to the release notes structure :)
2021-04-25 13:00:25 +09:00
callum-tait-pbx
aad2615487 docs: improved details on authentication (#472) 2021-04-23 09:42:29 +09:00
callum-tait-pbx
03d9b6a09f docs: slightly better wording about support (#471) 2021-04-23 09:41:08 +09:00
callum-tait-pbx
5d280cc8c8 docs: adding scaling configuration detail (#469) 2021-04-23 09:40:23 +09:00
callum-tait-pbx
133c4fb21e docs: clean up Enterprise and fsGroup docs (#460)
* docs: cleaning up Enterprise docs

* docs: better wording in various areas

* docs: improving enterprise runner docs

* docs: using American English

* docs: removing superfluous paragraph

* docs: improving grammar

* docs: better grammar

* docs: better wording

* docs: updated to reflect comments

* docs: spelling correction
2021-04-20 10:26:10 +09:00
callum-tait-pbx
3b2d2c052e chore: adding Helm app version back (#412)
* chore: adding Helm app version back

* chore: removing redundant values entry

* chore: bumping to newer version

* chore: bumping app version to latest

Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2021-04-18 13:58:54 +09:00
Manuel Jurado
37c2a62fa8 Allow to configure runner volume size limit (#436)
Enable the user to set a limit size on the volume of the runner to avoid some runner pod affecting other resources of the same cluster

Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2021-04-18 13:56:59 +09:00
callum-tait-pbx
2eeb56d1c8 docs: removing superfluous title reference (#459) 2021-04-18 09:45:28 +09:00
ToMe25
a612b38f9b Cache docker images in acceptance test (#463)
* Cache docker images locally

Cache dind, runner, and kube-rbac-proxy docker image on the host and copy onto the kind node instead of downloading it to the node directly.

* Also cache certmanager docker images
2021-04-18 09:44:59 +09:00
callum-tait-pbx
1c67ea65d9 ci: fix latest tag push logic (#462)
* ci: fix latest tag push logic

* ci: use better job names
2021-04-18 09:41:22 +09:00
ToMe25
c26fb5ad5f Make acceptance use local docker image (#448)
load the local docker image to the kind cluster instead of pushing it to dockerhub and pulling it from there
2021-04-17 17:13:47 +09:00
callum-tait-pbx
325c2cc385 docs: correct and simplify example (#450)
* docs: correct and simplify example

* docs: removing alternatives

Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2021-04-17 17:08:57 +09:00
Agoney Garcia-Deniz
2e551c9d0a Add hostAliases to the runner spec (#456) 2021-04-17 17:04:52 +09:00
asoldino
7b44454d01 Add documentation of dockerVolumeMount (#453) 2021-04-17 17:04:38 +09:00
callum-tait-pbx
f2680b2f2d Bumping runner to Ubuntu 20.04 (#438)
Images for `actions-runner:v${VERSION}` and `actions-runner:latest` tags are upgraded to Ubuntu 20.04.

If you would like not to upgrade Ubuntu in the runner image in the future, migrate to new tags suffixed with `-ubuntu-20.04` like`actions-runner:v${VERSION}-ubuntu-20.04`.

We also keep publishing the existing Ubuntu 18.04 images with new `actions-runner:v${VERSION}-ubuntu-18.04` tags. Please use it when it turned out that you had workflows dependent on Ubuntu 18.04.

Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2021-04-17 17:02:03 +09:00
asoldino
b42b8406a2 Add dockerVolumeMounts (#439)
Resolves #435
2021-04-06 10:10:10 +09:00
Javi Polo
3c125e2191 Fix helm webhook ingress error: spec.rules[0].http.paths[0].backend: Required value: port name or number is required (#437) 2021-04-02 06:34:45 +09:00
Christoph Brand
9ed245c85e feature(controller): remove dockerd executable (#432) 2021-04-01 08:50:48 +09:00
Florian Braun
5b7807d54b Quote vars in entrypoint.sh to prevent unwanted argument split (#420)
Prevents arguments from being split when e.g. the RUNNER_GROUP variable contains spaces (which is legit. One can create such groups in GitHub).

I've seen that all workers with group names that contain no spaces can register successfully, while all workers with groups that contain spaces will not register.

Furthermore, I suppose also other chars can be used here to inject arbitrary commands in an unsupported way via e.g. pipe symbol.

Quoting the vars correctly should prevent that and allow for e.g. group names and runner labels with spaces and other bash reserved characters.
2021-03-31 10:09:08 +09:00
Yusuke Kuoka
156e2c1987 Fix MTU configuration for dockerd (#421)
Resolves #393
2021-03-31 09:29:21 +09:00
Yusuke Kuoka
da4dfb3fdf Add make target test-with-deps to ease setting up dependent binaries (#426) 2021-03-31 09:23:16 +09:00
Gabriel Dantas Gomes
0783ffe989 some readme typos (#423) 2021-03-29 10:08:21 +09:00
Yusuke Kuoka
374105c1f3 Fix dindWithinRunnerContainer not to crash-loop runner pods (#419)
Apparently #253 broke dindWithinRunnerContainer completely due to the difference in how /runner volume is set up.
2021-03-25 10:23:36 +09:00
Yusuke Kuoka
bc6e499e4f Make logging more concise (#410)
This makes logging more concise by changing logger names to something like `controllers.Runner` to `actions-runner-controller.runner` after the standard `controller-rutime.controller` and reducing redundant logs by removing unnecessary requeues. I have also tweaked log messages so that their style is more consistent, which will also help readability. Also, runnerreplicaset-controller lacked useful logs so I have enhanced it.
2021-03-20 07:34:25 +09:00
Yusuke Kuoka
07f822bb08 Do include Runner controller in integration test (#409)
So that we could catch bugs in runner controller like seen in #398, #404, and #407.

Ref #400
2021-03-19 16:14:15 +09:00
Hidetake Iwata
3a0332dfdc Add metrics of RunnerDeployment and HRA (#408)
* Add metrics of RunnerDeployment and HRA

* Use kube-state-metrics-style label names
2021-03-19 16:14:02 +09:00
Yusuke Kuoka
f6ab66c55b Do not delay min/maxReplicas propagation from HRA to RD due to caching (#406)
As part of #282, I have introduced some caching mechanism to avoid excessive GitHub API calls due to the autoscaling calculation involving GitHub API calls is executed on each Webhook event.

Apparently, it was saving the wrong value in the cache- The value was one after applying `HRA.Spec.{Max,Min}Replicas` so manual changes to {Max,Min}Replicas doesn't affect RunnerDeployment.Spec.Replicas until the cache expires. This isn't what I had wanted.

This patch fixes that, by changing the value being cached to one before applying {Min,Max}Replicas.

Additionally, I've also updated logging so that you observe which number was fetched from cache, and what number was suggested by either TotalNumberOfQueuedAndInProgressWorkflowRuns or PercentageRunnersBusy, and what was the final number used as the desired-replicas(after applying {Min,Max}Replicas).

Follow-up for #282
2021-03-19 12:58:02 +09:00
Yusuke Kuoka
d874a5cfda Fix status.lastRegistrationCheckTime in body must be of type string: \"null\" errors (#407)
Follow-up for #398 and #404
2021-03-19 11:15:35 +09:00
Yusuke Kuoka
c424215044 Do recheck runner registration timely (#405)
Since #392, the runner controller could have taken unexpectedly long time until it finally notices that the runner has been registered to GitHub. This patch fixes the issue, so that the controller will notice the successful registration in approximately 1 minute(hard-coded).

More concretely, let's say you had configured a long sync-period of like 10m, the runner controller could have taken approx 10m to notice the successful registration. The original expectation was 1m, because it was intended to recheck every 1m as implemented in #392. It wasn't working as such due to my misunderstanding in how requeueing work.
2021-03-19 11:02:47 +09:00
Yusuke Kuoka
c5fdfd63db Merge pull request #404 from summerwind/fix-reg-update-runner-status-err
Fix `Failed to update runner status for Registration` errors
2021-03-19 08:56:20 +09:00
Yusuke Kuoka
23a45eaf87 Bump chart version 2021-03-19 08:37:17 +09:00
Yusuke Kuoka
dee997b44e Fix Failed to update runner status for Registration errors
Fixes #400
2021-03-19 07:02:00 +09:00
Yusuke Kuoka
2929a739e3 Merge pull request #398 from summerwind/fix-status-last-reg-check-time-type-err
Fix `status.lastRegistrationCheckTime in body must be of type string: \"null\"` error
2021-03-18 10:36:44 +09:00
Yusuke Kuoka
3cccca8d09 Do patch runner status instead of update to reduce conflicts and avoid future bugs
Ref https://github.com/summerwind/actions-runner-controller/pull/398#issuecomment-801548375
2021-03-18 10:31:17 +09:00
Yusuke Kuoka
7a7086e7aa Make error logs more helpful 2021-03-18 10:26:21 +09:00
Yusuke Kuoka
565b14a148 Fix status.lastRegistrationCheckTime in body must be of type string: \"null\" error
Follow-up for #392
2021-03-18 10:20:49 +09:00
Yusuke Kuoka
ecc441de3f Bump chart version 2021-03-18 07:36:22 +09:00
Manabu Sakai
25335bb3c3 Fix typo in certificate.yaml (#396) 2021-03-18 07:33:34 +09:00
Yusuke Kuoka
9b871567b1 Fix wildcard in middle of actionsglob/scaleUpTrigger.githubEvent.checkRun.names not working (#395)
actionsglob patterns like `foo-*-bar` was not correctly working. Tests and the implementation was enhanced to correctly support it.
2021-03-17 06:46:48 +09:00
Balazs Gyurak
264cf494e3 Fix "pole" typo in README (#394)
I think these should be "poll".
2021-03-17 06:34:01 +09:00
Yusuke Kuoka
3f23501b8e Reduce "No runner matching the specified labels was found" errors while runner replacement (#392)
We occasionally encountered those errors while the underlying RunnerReplicaSet is being recreated/replaced on RunnerDeployment.Spec.Template update. It turned out to be due to that the RunnerDeployment controller was waiting for the runner pod becomes `Running`, intead of the new replacement runner to have registered to GitHub. This fixes that, by trying to Runner.Status.Phase to `Running` only after the runner in the runner pod appears to be registered.

A side-effect of this change is that runner controller would call more "ListRunners" GitHub Actions API. I've reviewed and improved the runner controller code and Runner CRD to make make the number of calls minimum. In most cases, ListRunners should be called only twice for each runner creation.
2021-03-16 10:52:30 +09:00
Yusuke Kuoka
5530030c67 Disable metrics-based autoscaling by default when scaleUpTriggers are enabled (#391)
Relates to https://github.com/summerwind/actions-runner-controller/pull/379#discussion_r592813661
Relates to https://github.com/summerwind/actions-runner-controller/issues/377#issuecomment-793266609

When you defined HRA.Spec.ScaleUpTriggers[] but HRA.Spec.Metrics[], the HRA controller will now enable ScaleUpTriggers alone and insteaed of automatically enabling TotalNumberOfQueuedAndInProgressWorkflowRuns. This allows you to use ScaleUpTriggers alone, so that the autoscaling is done without calling GitHub API at all, which should grealy decrease the change of GitHub API calls get rate-limited.
2021-03-14 11:03:00 +09:00
Yusuke Kuoka
8d3a83b07a Add CheckRun.Names scale-up trigger configuration (#390)
This allows you to trigger autoscaling depending on check_run names(i.e. actions job names). If you are willing to differentiate scale amount only for a specific job, or want to scale only on a specific job, try this.
2021-03-14 10:21:42 +09:00
callum-tait-pbx
a6270b44d5 docs: fix typos and add PR link (#379)
* docs: fix typos and add PR link

* docs: changes based on feedback

* docs: fixing numbers in list

* docs: grammer

* docs: better wording
2021-03-12 08:52:34 +09:00
Brandon Kimbrough
2273b198a1 Add ability to set the MTU size of the docker in docker container (#385)
* adding abilitiy to set docker in docker MTU size

* safeguards to only set MTU env var if it is set
2021-03-12 08:44:49 +09:00
Yusuke Kuoka
3d62e73f8c Fix PercentageRunnersBusy scaling not working (#386)
PercentageRunnerBusy seems to have regressed since #355 due to that RunnerDeployment.Spec.Selector is empty by default and the HRA controller was using that empty selector to query runners, which somehow returned 0 runners. This fixes that by using the newly added automatic `runner-deployment-name` label for the default runner label and the selector, which avoids querying with empty selector.

Ref https://github.com/summerwind/actions-runner-controller/issues/377#issuecomment-795200205
2021-03-11 20:16:36 +09:00
Yusuke Kuoka
f5c639ae28 Make webhook-based autoscaler github event logs more operator-friendly (#384)
Adds fields like `pullRequest.base.ref` and `checkRun.status` that are useful for verifying the autoscaling behaviour without browsing GitHub.
Ref https://github.com/summerwind/actions-runner-controller/issues/377#issuecomment-794175312
2021-03-10 09:40:44 +09:00
Yusuke Kuoka
81016154c0 GITHUB_APP_PRIVATE_KEY can now be the content of the key (#383)
Resolves #382
2021-03-10 09:37:15 +09:00
Yusuke Kuoka
728829be7b Fix panic on scaling organizational runners (#381)
Ref https://github.com/summerwind/actions-runner-controller/issues/377#issuecomment-793287133
2021-03-09 15:03:47 +09:00
Yusuke Kuoka
c0b8f9d483 Merge pull request #380 from summerwind/ns-flag
Use --watch-namespace flag to restrict the namespace to watch
2021-03-09 15:03:32 +09:00
Yusuke Kuoka
ced1c2321a Fix chart-testing failing due to conflict between authSecret and dummySecret 2021-03-09 14:54:55 +09:00
Yusuke Kuoka
1b8a656051 Use --watch-namespace flag to restrict the namespace to watch
Ref https://github.com/summerwind/actions-runner-controller/issues/377#issuecomment-793172995
2021-03-09 09:46:21 +09:00
Rob Whitby
1753fa3530 handle GET requests in webhook hra (#378) 2021-03-09 08:46:27 +09:00
Johannes Nicolai
8c0f3dfc79 Set runner group for runners with enterprise scope (#376)
* so far, runner group parameter is only set for runners with org scope
* now set group for enterprise runners as well
* removed null check for org scope as either org or enterprise will be set
2021-03-08 09:18:23 +09:00
Rob Whitby
dbda292f54 fix typo in examples (#373) 2021-03-08 09:18:10 +09:00
callum-tait-pbx
550a864198 chore: bumping helm chart (#372)
PR 355 made changes to the CRDs but didn't bump the version
2021-03-05 20:27:52 +09:00
Yusuke Kuoka
4fa5315311 Fix possible flapping autoscale on runner update (#371)
Addresses https://github.com/summerwind/actions-runner-controller/pull/355#discussion_r587199428
2021-03-05 10:21:20 +09:00
Hiroshi Muraoka
11e58fcc41 Manage runner with label (#355)
* Update RunnerDeploymentSpec to have Selector field

Signed-off-by: Hiroshi Muraoka <h.muraoka714@gmail.com>

* Update RunnerReplicaSetSpec to have Selector field

Signed-off-by: Hiroshi Muraoka <h.muraoka714@gmail.com>

* Add CloneSelectorAndAddLabel to add Selector field

Signed-off-by: Hiroshi Muraoka <h.muraoka714@gmail.com>

* Fix tests

Signed-off-by: Hiroshi Muraoka <h.muraoka714@gmail.com>

* Use label to find RunnerReplicaSet/Runner

Signed-off-by: binoue <banji-inoue@cybozu.co.jp>

* Update controller-gen versions in CRD

Signed-off-by: Hiroshi Muraoka <h.muraoka714@gmail.com>

* Update autoscaler to list Pods with labels

Signed-off-by: Hiroshi Muraoka <h.muraoka714@gmail.com>

* Add debug log

Signed-off-by: Hiroshi Muraoka <h.muraoka714@gmail.com>

* Modify RunnerDeployment tests

Signed-off-by: binoue <banji-inoue@cybozu.co.jp>

* Modify RunnerReplicaset test

Signed-off-by: binoue <banji-inoue@cybozu.co.jp>

* Modify integration test

Signed-off-by: Hiroshi Muraoka <h.muraoka714@gmail.com>

* Use RunnerDeployment Template Labels as the default selector for backward compatibility

* Fix labeling

Signed-off-by: Hiroshi Muraoka <h.muraoka714@gmail.com>

* Update func in Eventually to return (int, error)

Signed-off-by: Hiroshi Muraoka <h.muraoka714@gmail.com>

* Update RunnerDeployment controller not to use label selector

Signed-off-by: Hiroshi Muraoka <h.muraoka714@gmail.com>

* Fix potential replicaset controller breakage on replicaset created before v0.17.0

* Fix errors on existing runner replica sets

* Ensure RunnerReplicaSet Spec Selector addition does not break controller

* Ensure RunnerDeployment Template.Spec.Labels change does result in template hash change

* Fix comment

Co-authored-by: binoue <banji-inoue@cybozu.co.jp>
Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2021-03-05 10:15:39 +09:00
Mike Perry
f220fefe92 Update README.md (#370) 2021-03-05 09:17:32 +09:00
Hidetake Iwata
56b4598d1d Fix helm template error when webhook server is enabled (#365)
* Fix include block in githubwebhook.deployment.yaml

* Fix include block in githubwebhook.secrets.yaml
2021-03-03 09:21:58 +09:00
Taehyun Kim
8f977dbe48 Fix various bugs in helm chart (#364)
* Fix wrong trim

* add missing MutatingWeghookConfiguration.webhooks[*].sideEffects

* fix missing admissionReviewVersions

* admissionregistration.k8s.io/v1 for kustomization manifests

* revert webhook config
2021-03-03 09:21:20 +09:00
Yusuke Kuoka
9ae3551744 Remove unnecessary GitHub API calls (#363)
The controller had the 2 extra and redundant calls to List Workflow Runs API.

Ref #362
2021-03-02 10:55:30 +09:00
Rolf Ahrenberg
05ad3f5469 Set default python (#361) 2021-03-01 09:45:13 +09:00
callum-tait-pbx
9c7372a8e0 docs: styling fixes (#359)
* docs: styling fixes

* docs: grammer fixes
2021-03-01 09:44:35 +09:00
Yusuke Kuoka
584590e97c Use patch instead of update to alleviate HRA conflict on webhook (#358)
We sometimes see that integration test fails due to runner replicas not meeting the expected number in a timely manner. After investigating a bit, this turned out to be due to that HRA updates on webhook-based autoscaler and HRA controller are conflicting. This changes the controllers to use Patch instead of Update to make conflicts less likely to happen.

I have also updated the hra controller to use Patch when updating RunnerDeployment, too.

Overall, these changes should make the webhook-based autoscaling more reliable due to less conflicts.
2021-02-26 10:17:09 +09:00
Yusuke Kuoka
d18884a0b9 Fix HRA expired cache entries not cleaned up (#357)
Fixes #356
2021-02-26 09:54:24 +09:00
callum-tait-pbx
f987571b64 Improve docs (#303) 2021-02-26 09:32:18 +09:00
Taehyun Kim
450e384c4c Update helm chart (#343)
* add replicaCount

* Add authSecret.existingSecret

* set image.tag null by default

* implement ingress for githubwebhook server

* fix deprecated and secretName template

* backward compat .authSecret.enabled

* existingSecret for github webhook secret

* use secretName template

* set default secret names

* do not use app version based image tag

* create and name variable for secrets
2021-02-26 09:26:51 +09:00
Yusuke Kuoka
e9eef04993 Fix old HRA capacity reservations not cleaned up (#354)
Similar to #348 for #346, but for HRA.Spec.CapacityReservations usually modified by the webhook-based autoscaler controller.
This patch tries to fix that by improving the webhook-based autoscaler controller to omit expired reservations on updating HRA spec.
2021-02-25 11:08:00 +09:00
Yusuke Kuoka
598dd1d9fe Fix incorrect DESIRED on `kubectl get hra (#353)
`kubectl get horizontalrunnerautoscalers.actions.summerwind.dev` shows HRA.status.desiredReplicas as the DESIRED count. However the value had been not taking capacityReservations into account, which resulted in showing incorrect count when you used webhook-based autoscaler, or capacityReservations API directly. This fixes that.
2021-02-25 10:32:09 +09:00
Yusuke Kuoka
9890a90e69 Improve webhook-based autoscaler log (#352)
The controller had been writing confusing messages like the below on missing scale target:

```
Found too many scale targets: It must be exactly one to avoid ambiguity. Either set WatchNamespace for the webhook-based autoscaler to let it only find HRAs in the namespace, or update Repository or Organization fields in your RunnerDeployment resources to fix the ambiguity.{"scaleTargets": ""}
```

This fixes that, while improving many kinds of messages written while reconcilation, so that the error message is more actionable.
2021-02-25 10:07:41 +09:00
Yusuke Kuoka
9da123ae5e Fix integration test flakiness (#351)
Ref https://github.com/summerwind/actions-runner-controller/pull/345#issuecomment-785015406
2021-02-25 09:30:32 +09:00
Johannes Nicolai
4d4137aa28 Avoid zombie runners that missed token expiration by a bit (#345)
* if a new runner pod was just scheduled to start up right before a 
registration expired, it will not get a new registration token and go in 
an infinite update loop (until #341) kicks in
* if registzration tokens got updated a little bit before they actually 
expired, just starting up pods will way more likely get a working token
2021-02-25 09:07:49 +09:00
Yusuke Kuoka
022007078e Compact excessive error message on runnerreplicaset status update conflict (#350)
We occasionally see logs like the below:

```
2021-02-24T02:48:26.769ZERRORFailed to update runner status{"runnerreplicaset": "testns-244ol/example-runnerdeploy-j5wzf", "error": "Operation cannot be fulfilled on runnerreplicasets.actions.summerwind.dev \"example-runnerdeploy-j5wzf\": the object has been modified; please apply your changes to the latest version and try again"}
github.com/go-logr/zapr.(*zapLogger).Error
/home/runner/go/pkg/mod/github.com/go-logr/zapr@v0.1.0/zapr.go:128
github.com/summerwind/actions-runner-controller/controllers.(*RunnerReplicaSetReconciler).Reconcile
/home/runner/work/actions-runner-controller/actions-runner-controller/controllers/runnerreplicaset_controller.go:207
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:256
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:232
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker
/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:211
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1
/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190913080033-27d36303b655/pkg/util/wait/wait.go:152
k8s.io/apimachinery/pkg/util/wait.JitterUntil
/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190913080033-27d36303b655/pkg/util/wait/wait.go:153
k8s.io/apimachinery/pkg/util/wait.Until
/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190913080033-27d36303b655/pkg/util/wait/wait.go:88
2021-02-24T02:48:26.769ZERRORcontroller-runtime.controllerReconciler error{"controller": "testns-244olrunnerreplicaset", "request": "testns-244ol/example-runnerdeploy-j5wzf", "error": "Operation cannot be fulfilled on runnerreplicasets.actions.summerwind.dev \"example-runnerdeploy-j5wzf\": the object has been modified; please apply your changes to the latest version and try again"}
github.com/go-logr/zapr.(*zapLogger).Error
/home/runner/go/pkg/mod/github.com/go-logr/zapr@v0.1.0/zapr.go:128
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:258
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:232
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker
/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:211
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1
/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190913080033-27d36303b655/pkg/util/wait/wait.go:152
k8s.io/apimachinery/pkg/util/wait.JitterUntil
/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190913080033-27d36303b655/pkg/util/wait/wait.go:153
k8s.io/apimachinery/pkg/util/wait.Until
/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190913080033-27d36303b655/pkg/util/wait/wait.go:88
```

which can be compacted into one-liner, without the useless stack trace, without double-logging the same error from the logger and the controller.
2021-02-25 09:01:02 +09:00
Johannes Nicolai
31e5e61155 Log correct runner that was deleted (#349) 2021-02-25 08:38:55 +09:00
Aditya Purandare
1d1453c5f2 Fix user used for dind runner group permissions (#337) 2021-02-24 19:06:52 +09:00
Yusuke Kuoka
e44e53b88e Fix failure while saving HRA status after running controller for a while (#348)
Fixes #346
2021-02-24 11:20:21 +09:00
Yusuke Kuoka
398791241e Fix runner release workflow to do docker-push (#347)
Apparently I have mistakenly removed `push` option from the workflow in #323 which resulted in new runner build #323 not being pushed. This fixes that.
2021-02-24 11:08:33 +09:00
Yusuke Kuoka
991535e567 Fix panic on webhook for user-owned repository (#344)
* Fix panic on webhook for user-owned repository

Follow-up for #282 and #334
2021-02-23 08:05:25 +09:00
Johannes Nicolai
2d7fbbfb68 Handle offline runners gracefully (#341)
* if a runner pod starts up with an invalid token, it will go in an 
infinite retry loop, appearing as RUNNING from the outside
* normally, this error situation is detected because no corresponding 
runner objects exists in GitHub and the pod will get removed after 
registration timeout
* if the GitHub runner object already existed before - e.g. because a 
finalizer was not properly run as part of a partial Kubernetes crash, 
the runner will always stay in a running mode, even updating the 
registration token will not kill the problematic pod
* introducing RunnerOffline exception that can be handled in runner 
controller and replicaset controller
* as runners are offline when a pod is completed and marked for restart, 
only do additional restart checks if no restart was already decided, 
making code a bit cleaner and saving GitHub API calls after each job 
completion
2021-02-22 10:08:04 +09:00
Yusuke Kuoka
dd0b9f3e95 Merge pull request #340 from int128/integration-test-check-run
Fix index key to find HRA in GitHub webhook handler
2021-02-22 09:49:54 +09:00
Yusuke Kuoka
7cb2bc84c8 Merge pull request #334 from summerwind/integration-test-check-run
Add integration test for autoscaling on check_run webhook event
2021-02-22 09:38:07 +09:00
Hidetake Iwata
b0e74bebab Fix index key to find HRA in GitHub webhook handler 2021-02-20 21:25:23 +09:00
Hidetake Iwata
dfbe53dcca Fix webhook payload in integration test 2021-02-20 21:08:23 +09:00
Yusuke Kuoka
ebc3970b84 Add integration test for autoscaling on check_run webhook event 2021-02-19 10:33:04 +09:00
Hidetake Iwata
1ddcf6946a Fix nil pointer error on received check_run event (#331)
* Reproduce nil pointer error on received check_run event

* Fix nil pointer error on received check_run event
2021-02-18 20:22:36 +09:00
Yusuke Kuoka
cfbaad38c8 Merge pull request #328 from int128/fix-port-name-length
Changes:

1. Fix length of github-webhook-server port name
2. Add a cluster role binding for github-webhook-server
3. Remove --enable-leader-election from github-webhook-server
2021-02-18 20:20:39 +09:00
Yusuke Kuoka
67f6de010b feat: Common runner labels configurable per controller (#327)
* feat: Common runner labels configurable per controller

Ref #321
2021-02-18 20:19:08 +09:00
Hidetake Iwata
2db608879a Remove --enable-leader-election from github-webhook-server 2021-02-18 16:51:47 +09:00
Hidetake Iwata
2c4a6ca90b Add cluster role binding for github-webhook-server 2021-02-18 16:49:24 +09:00
Hidetake Iwata
829bf20449 Fix length of github-webhook-server port name 2021-02-18 16:42:15 +09:00
Reinier Timmer
be13322816 Update runner to 2.277.1 (#322)
* Update runner to 2.277.1

* Update build-and-release-runners.yml

* integration test condition

Don't run integration tests when only updating the runner image

* fixup! integration test condition

Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2021-02-18 09:29:53 +09:00
Johannes Nicolai
7f4a76a39b Also log into DockerHub for release event (#326)
* so far, only push events would trigger the DockerHub login step
* hence, attempts to release would fail because of a permission problem (tested locally)
* adding OR condition to also login in case a release got published
2021-02-18 08:54:44 +09:00
callum-tait-pbx
0fce761686 fix: add trunate to ensure service kinds have valid names (#325)
* fix: adding truncate for service kinds

* chore : bumping chart version
2021-02-18 08:43:48 +09:00
Yusuke Kuoka
c88ff44518 Fix wip.yml workflow for building controller canary tags (#323)
In #306 we seem to have accidentally updated a wrong workflow, which was for runner builds. This updates the one for the controller.

Resolves #302
2021-02-18 08:42:24 +09:00
Yusuke Kuoka
2fdf35ac9d Refactor integration test to use helpers (#320)
This should make the test code a bit more DRY and readable.
2021-02-17 10:23:35 +09:00
Johannes Nicolai
6cce3fefc5 Add project to awesome-runners list (#319) 2021-02-17 09:14:42 +09:00
Yusuke Kuoka
eb2eaf8130 Fix TotalNumberOfQueuedAndInProgressWorkflowRuns to work with a lot of remaining completed jobs (#316)
I have heard from some user that they have hundred thousands of `status=completed` workflow runs in their repository which effectively blocked TotalNumberOfQueuedAndInProgressWorkflowRuns from working because of GitHub API rate limit due to excessive paginated requests.

This fixes that by separating list-workflow-runs calls to two - one for `queued` and one for `in_progress`, which can make the minimum API call from 1 to 2, but allows it to work regardless of number of remaining `completed` workflow runs.
2021-02-16 18:55:55 +09:00
callum-tait-pbx
7bf712d0d4 fix: duplicate name attribute (#318) 2021-02-16 18:52:08 +09:00
Yusuke Kuoka
7d024a6c05 Fix "duplicate metrics collector registration attempted" errors at startup (#317)
I have seen this error a lot in our integration test. It turned out due to https://github.com/kubernetes-sigs/controller-runtime/issues/484 and is being fixed with this change.
2021-02-16 18:51:33 +09:00
Yusuke Kuoka
434823bcb3 scale{Up,Down}Adjustment to add/remove constant number of replicas on scaling (#315)
* `scale{Up,Down}Adjustment` to add/remove constant number of replicas on scaling

Ref #305

* Bump chart version
2021-02-16 17:16:26 +09:00
Yusuke Kuoka
35d047db01 Fix enterprise runners misusing cached token (#314)
Follow-up for #290
2021-02-16 12:56:52 +09:00
Yusuke Kuoka
f1db6af1c5 Add repository runners support for PercentageRunnersBusy-based autoscaling (#313)
Resolves #258
2021-02-16 12:44:51 +09:00
Hidetake Iwata
4f3f2fb60d Add metrics for GitHub API rate limit (#312) 2021-02-16 09:58:09 +09:00
Johannes Nicolai
2623140c9a Make log message less scary (#311)
* the reconciliation loop is often much faster than the runner startup, 
so changing runner not found related messages to debug and also add the 
possibility that the runner just needs more time
2021-02-16 09:55:55 +09:00
Johannes Nicolai
1db9d9d574 Use ARM64 compatible kube-rbac-proxy from upstream (#310)
* as pointed out in #281 the currently used image for the 
kube-rbac-proxy - gcr.io/kubebuilder/kube-rbac-proxy:v0.4.1" - does not 
have an ARM64 image
* hence, trying to use the standard deployment manifest / helm char will 
fail on ARM64 systems
* replaced image with quay.io/brancz/kube-rbac-proxy:v0.8.0 which is the 
latest version from the upstream maintainer 
(https://github.com/brancz/kube-rbac-proxy/blob/master/Makefile#L13)
* successfully tested on both AMD64 and ARM64 clusters
* fixes #281
2021-02-16 09:55:03 +09:00
callum-tait-pbx
d046350240 chore: bumping helm chart sematically (#296)
* chore: bumping helm chart sematically

* chore: removing the app version config
2021-02-16 09:45:56 +09:00
callum-tait-pbx
cca4d249e9 feat: create workflow for runner releases (#306) 2021-02-16 09:42:28 +09:00
Johannes Nicolai
bc8bc70f69 Fix rate limit and runner registration logic (#309)
* errors.Is compares all members of a struct to return true which never 
happened
* switched to type check instead of exact value check
* notRegistered was using double negation in if statement which lead to 
unregistering runners after the registration timeout
2021-02-15 09:36:49 +09:00
Johannes Nicolai
34c6c3d9cd Pod eviction policy examples (crashed nodes) (#308)
* ... otherwise it will take 40 seconds (until a node is detected as unreachable) + 5 minutes (until pods are evicted from unreachable/crashed nodes)
* pods stuck in "Terminating" status on unreachable nodes will only be freed once #307 gets merged
2021-02-15 09:33:01 +09:00
Johannes Nicolai
9c8d7305f1 Introduce pod deletion timeout and forcefully delete stuck pods (#307)
* if a k8s node becomes unresponsive, the kube controller will soft
delete all pods after the eviction time (default 5 mins)
* as long as the node stays unresponsive, the pod will never leave the
last status and hence the runner controller will assume that everything
is fine with the pod and will not try to create new pods
* this can result in a situation where a horizontal autoscaler thinks
that none of its runners are currently busy and will not schedule any
further runners / pods, resulting in a broken  runner deployment until the
runnerreplicaset is deleted or the node comes back online
* introducing a pod deletion timeout (1 minute) after which the runner
controller will try to reboot the runner and create a pod on a working
node
* use forceful deletion and requeue for pods that have been stuck for
more than one minute in terminating state
* gracefully handling race conditions if pod gets finally forcefully deleted within
2021-02-15 09:32:28 +09:00
Yusuke Kuoka
addcbfa7ee Fix runner registration timeout (#301)
Fixes #300
2021-02-12 10:00:20 +09:00
Yusuke Kuoka
bbb036e732 feat: Prevent blocking on transient runner registration failure (#297)
This enhances the controller to recreate the runner pod if the corresponding runner has failed to register itself to GitHub within 10 minutes(currently hard-coded).

It should alleviate #288 in case the root cause is some kind of transient failures(network unreliability, GitHub down, temporarly compute resource shortage, etc).

Formerly you had to manually detect and delete such pods or even force-delete corresponding runners to unblock the controller.

Since this enhancement, the controller does the pod deletion automatically after 10 minutes after pod creation, which result in the controller create another pod that might work.

Ref #288
2021-02-09 10:17:52 +09:00
Yusuke Kuoka
9301409aec fix: Paginate ListRepositoryWorkflowRuns (#295)
When we used `QueuedAndInProgressWorkflowRuns`-based autoscaling, it only fetched and considered only the first 30 workflow runs at the reconcilation time. This may have resulted in unreliable scaling behaviour, like scale-in/out not happening when it was expected.
2021-02-09 10:13:53 +09:00
Yusuke Kuoka
ab1c39de57 feat: HorizontalRunnerAutoscaler Webhook server (#282)
* feat: HorizontalRunnerAutoscaler Webhook server

This introduces a Webhook server that responds GitHub `check_run`, `pull_request`, and `push` events by scaling up matched HorizontalRunnerAutoscaler by 1 replica. This allows you to immediately add "resource slack" for future GitHub Actions job runs, without waiting next sync period to add insufficient runners.

This feature is highly inspired by https://github.com/philips-labs/terraform-aws-github-runner. terraform-aws-github-runner can manage one set of runners per deployment, where actions-runner-controller with this feature can manage as many sets of runners as you declare with HorizontalRunnerAutoscaler and RunnerDeployment pairs.

On each GitHub event received, the webhook server queries repository-wide and organizational runners from the cluster and searches for the single target to scale up. The webhook server tries to match HorizontalRunnerAutoscaler.Spec.ScaleUpTriggers[].GitHubEvent.[CheckRun|Push|PullRequest] against the event and if it finds only one HRA, it is the scale target. If none or two or more targets are found for repository-wide runners, it does the same on organizational runners.

Changes:

* Fix integration test
* Update manifests
* chart: Add support for github webhook server
* dockerfile: Include github-webhook-server binary
* Do not import unversioned go-github
* Update README
2021-02-07 17:37:27 +09:00
alex-mozejko
a4350d0fc2 bug-fix: patched dir owned by runner (#284)
* bug-fix: patched dir owned by runner

* always build with latest runner version

* Revert "always build with latest runner version"

This reverts commit e719724ae9fe92a12d4a087185cf2a2ff543a0dd.

* Also patch dindrunner.Dockerfile

Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2021-02-07 17:21:10 +09:00
callum-tait-pbx
2146c62c9e chore: bumping Helm chart version (#294)
* chore: adding mac rubbish to gitignore

* chore: bumping chart version
2021-02-07 16:46:19 +09:00
Jesse Haka
28e80a2d28 Add support for enterprise runners (#290)
* Add support for enterprise runners

* update docs
2021-02-05 09:31:06 +09:00
Tom Bamford
831db9ee2a Added github.sha to DockerHub push (#286)
* Added GITHUB.RUN_NUMBER to DockerHub push

* switch run_number to sha on docker tag

* re-add mutable tags for backwards compatability

* truncate to short SHA (7 chars)

* behaviour workaround

* use ENV to define sha_short

* use ::set-output to define sha_short

* bump action
2021-02-04 09:29:32 +09:00
Donovan Muller
4d69e0806e Update GitHub runner version (#280) 2021-02-02 14:06:08 +09:00
Donovan Muller
d37cd69e9b feat/helm: Bump appVersion to 0.6.1 release (#272)
* feat/helm: Bump appVersion to 0.6.1 release

* Also bump chart version to trigger a new chart release

Co-authored-by: Yusuke Kuoka <c-ykuoka@zlab.co.jp>
2021-01-29 09:29:43 +09:00
Yusuke Kuoka
a2690aa5cb Update README.md
Follow-up for #275
2021-01-29 09:29:26 +09:00
Clément
da020df0fd docs: fix install installation method (#275) 2021-01-29 09:28:34 +09:00
Jonas Lergell
6c64ae6a01 Actually use 'dockerdContainerResources' to set resources on the dind container (#273) 2021-01-29 09:18:28 +09:00
Yusuke Kuoka
42c7d0489d chart: Bump to 0.2.0 2021-01-25 09:14:49 +09:00
Donovan Muller
b3bef6404c Add support for additional environment variables (#271) 2021-01-25 09:00:03 +09:00
David Young
1127c447c4 Add GitHub Actions to publish helm chart (#257)
* Add chart workflows (#1)

* Add chart workflows

* Fix publishing step in CI

Signed-off-by: David Young <davidy@funkypenguin.co.nz>

* Update CI on push-to-master (#3)

* Put helm installation step in the correct CI job

Signed-off-by: David Young <davidy@funkypenguin.co.nz>

* Put helm installation step in the correct CI job (#4)

* Update on-push-master-publish-chart.yml

* Remove references to certmanager dependency

Signed-off-by: David Young <davidy@funkypenguin.co.nz>

* Add ability to customize kube-rbac-proxy image

Signed-off-by: David Young <davidy@funkypenguin.co.nz>

* Only install cert-manager if we're going to spin up KinD

Signed-off-by: David Young <davidy@funkypenguin.co.nz>
2021-01-24 15:37:01 +09:00
Yusuke Kuoka
ace95d72ab Fix self-update failuers due to /runner/externals mount (#253)
* Fix self-update failuers due to /runner/externals mount

Fixes #252

* Tested Self-update Fixes (#269)

Adding fixes to #253 as confirmed and tested in https://github.com/summerwind/actions-runner-controller/issues/264#issuecomment-764549833 by @jolestar, @achedeuzot and @hfuss 🙇 🍻

Co-authored-by: Hayden Fuss <wifu1234@gmail.com>
2021-01-24 10:58:35 +09:00
Johannes Nicolai
42493d5e01 Adding --name-space parameter in example (#259)
* when setting a GitHub Enterprise server URL without a namespace, an error occurs: "error: the server doesn't have a resource type "controller-manager"
* setting default namespace "actions-runner-system" makes the example work out of the box
2021-01-22 10:12:04 +09:00
Johannes Nicolai
94e8c6ffbf minReplicas <= desiredReplicas <= maxReplicas (#267)
* ensure that minReplicas <= desiredReplicas <= maxReplicas no matter what
* before this change, if the number of runners was much larger than the max number, the applied scale down factor might still result in a desired value > maxReplicas
* if for resource constraints in the cluster, runners would be permanently restarted, the number of runners could go up more than the reverse scale down factor until the next reconciliation round, resulting in a situation where the number of runners climbs up even though it should actually go down
* by checking whether the desiredReplicas is always <= maxReplicas, infinite scaling up loops can be prevented
2021-01-22 10:11:21 +09:00
callum-tait-pbx
563c79c1b9 feat/helm: add manager secret to Helm chart (#254)
* feat: adding maanger secret to Helm

* fix: correcting secret data format

* feat: adding in common labels

* fix: updating default values to have config

The auth config needs to be commented out by default as we don't want to deploy both configs empty. This may break stuff, so we want the user to actively uncomment the auth method they want instead

* chore: updating default format of cert

* chore: wording
2021-01-22 10:03:25 +09:00
Johannes Nicolai
cbb41cbd18 Updating custom container example (#260)
* use latest instead of outdated version
* use sudo for package install (required)
* use sudo for package meta data removal (required)
2021-01-22 09:57:42 +09:00
Johannes Nicolai
64a1a58acf GitHub runner groups have to be created first (#261)
* in contrast to runner labels, GitHub runner groups are not automatically created
2021-01-22 09:52:35 +09:00
Reinier Timmer
524cf1b379 Update runner to v2.275.1 (#239) 2020-12-18 08:38:39 +09:00
ZacharyBenamram
0dadddfc7d Update README for "PercentageRunnersBusy" HRA metric type (#237)
* adding readme for new hpa scheme

* callum's comments

Co-authored-by: Zachary Benamram <zacharybenamram@blend.com>
2020-12-17 10:21:27 +09:00
ZacharyBenamram
48923fec56 Autoscaling: Percentage runners busy - remove magic number used for round up (#235)
* remove magic number for autoscaling

Co-authored-by: Zachary Benamram <zacharybenamram@blend.com>
2020-12-15 14:38:01 +09:00
ZacharyBenamram
466b30728d Add "PercentageRunnersBusy" horizontal runner autoscaler metric type (#223)
* hpa scheme based off busy runners

* running make manifests

Co-authored-by: Zachary Benamram <zacharybenamram@blend.com>
2020-12-13 08:48:19 +09:00
callum-tait-pbx
c13704d7e2 feat: custom labels (#231)
Co-authored-by: Callum Tait <callum.tait@PBXUK-HH-05772.photobox.priv>
2020-12-13 08:33:04 +09:00
callum-tait-pbx
fb49bbda75 feat: adding helm config for dind sidecar (#232)
Co-authored-by: Callum Tait <callum.tait@PBXUK-HH-05772.photobox.priv>
2020-12-13 08:31:24 +09:00
Reinier Timmer
8d6f77e07c Remove beta GitHub client implementations (#228) 2020-12-10 09:08:51 +09:00
Yusuke Kuoka
dfffd3fb62 feat: EKS IAM Roles for Service Accounts for Runner Pods (#226)
One of the pod recreation conditions has been modified to use hash of runner spec, so that the controller does not keep restarting pods mutated by admission webhooks. This naturally allows us, for example, to use IRSA for EKS that requires its admission webhook to mutate the runner pod to have additional, IRSA-related volumes, volume mounts and env.

Resolves #200
2020-12-08 17:56:06 +09:00
Juho Saarinen
f710a54110 Don't compare runner connetion token at restart need check (#227)
Fixes #143
2020-12-08 08:48:35 +09:00
Yusuke Kuoka
85c29a95f5 runner: Add support for ruby/setup-ruby (#224)
It turned out previous versions of runner images were unable to run actions that require `AGENT_TOOLSDIRECTORY` or `libyaml` to exist in the runner environment. One of notable examples of such actions is [`ruby/setup-ruby`](https://github.com/ruby/setup-ruby).

This change adds the support for those actions, by setting up AGENT_TOOLSDIRECTORY and installing libyaml-dev within runner images.
2020-12-06 11:53:38 +09:00
Erik Nobel
a2b335ad6a Github pkg: Bump github package to version 33 (#222) 2020-12-06 10:01:47 +09:00
Tom Bamford
56c57cbf71 ci: Replace deprecated crazy-max buildx action to use alternative docker actions (#197)
Deprecated action `crazy-max/setup-buildx-action@v1` has been replaced with:
  `docker/setup-qemu-action@v1`
  `docker/setup-buildx-action@v1`
  `docker/login-action@v1`
  `docker/build-push-action@v2`

See: https://github.com/crazy-max/ghaction-docker-buildx
2020-12-06 10:00:10 +09:00
Ahmad Hamade
837563c976 Adding priorityClassName to helm chart (#215)
* Adding priorityClassName to helm chart and README file

* removed README and revert chart version
2020-11-30 09:04:25 +09:00
ZacharyBenamram
df99f394b4 Remove 10 minute buffer to token expiration (#214)
Co-authored-by: Zachary Benamram <zacharybenamram@blend.com>
2020-11-30 09:03:27 +09:00
Shinnosuke Sawada
be25715e1e Use TLS for secure docker connection (#192) 2020-11-30 08:57:33 +09:00
Yusuke Kuoka
4ca825eef0 Publish runner images for v2.274.2
Ref #212
2020-11-27 08:49:58 +09:00
Yusuke Kuoka
e5101554b3 Fix release workflow to not use add-path
Fixes #208
2020-11-26 08:39:03 +09:00
Reinier Timmer
ee8fb5a388 parametrized working directory (#185)
* parametrized working directory

* manifests v3.0
2020-11-25 08:55:26 +09:00
Erik Nobel
4e93879b8f [BUG?]: Create mountpoint for /externals/ (#203)
* runner/controller: Add externals directory mount point

* Runner: Create hack for moving content of /runner/externals/ dir

* Externals dir Mount: mount examples for '__e/node12/bin/node' not found error
2020-11-25 08:53:47 +09:00
Shinnosuke Sawada
6ce6737f61 add dockerEnabled document (#193)
Follow-up for #191
2020-11-17 09:31:34 +09:00
Shinnosuke Sawada
4371de9733 add dockerEnabled option (#191)
Add dockerEnabled option for users who does not need docker and want not to run privileged container.
if `dockerEnabled == false`, dind container not run, and there are no privileged container.

Do the same as closed #96
2020-11-16 09:41:12 +09:00
Yusuke Kuoka
1fd752fca2 Use tcp DOCKER_HOST instead of sharing docker.sock (#177)
docker:dind container creates `/var/run/docker.sock` with root user and root group.
so, docker command in runner container needs root privileges to use docker.sock and docker action fails because lack of permission.

Use tcp connection between runner and docker container, so runner container doesn't need root privileges to run docker, and can run docker action.

Fixes #174
2020-11-16 09:32:29 +09:00
Yusuke Kuoka
ccd86dce69 chart: Update CRDs to make Runnner{Deployment,ReplicaSet} replicas optional (#189)
Ref #186
2020-11-14 22:09:06 +09:00
Yusuke Kuoka
3d531ffcdd refactor/sync-period (#188)
* refactor: adding explicit sync-period flag

* docs: adding mroe detail for sync period config

* docs: spelling 🙃

Co-authored-by: Callum Tait <callum.tait@PBXUK-HH-05772.photobox.priv>
2020-11-14 22:07:22 +09:00
Yusuke Kuoka
1658f51fcb Make Runner{Deployment,ReplicaSet} replicas actually optional (#186)
If omitted, it properly defaults to 1.

Fixes #64
2020-11-14 22:06:33 +09:00
Yusuke Kuoka
9a22bb5086 Initial version of Helm chart (#187)
Acceptance tests are passing with the chart. In addition to standard chart values, syncPeriod is supported.
Please use it as a foundation for further collaboration.

Ref #184
Inspired by #91
Related #61
2020-11-14 22:06:14 +09:00
Yusuke Kuoka
71436f0466 Final fixes before merging 2020-11-14 22:00:41 +09:00
Yusuke Kuoka
b63879f59f Ensure the chart is passing acceptance tests 2020-11-14 21:58:16 +09:00
Callum Tait
c623ce0765 docs: spelling 🙃 2020-11-14 12:32:14 +00:00
Yusuke Kuoka
01117041b8 chart: controller-manager should not be autoscaled 2020-11-14 20:37:40 +09:00
Yusuke Kuoka
42a272051d chart: Use the correct image 2020-11-14 20:37:22 +09:00
Yusuke Kuoka
6a4c29d30e Set ACCEPTANCE_TEST_DEPLOYMENT_TOOL=helm to run acceptance tests with chart 2020-11-14 20:31:37 +09:00
Yusuke Kuoka
ce48dc58e6 Add crds to the chart 2020-11-14 20:31:21 +09:00
Callum Tait
04dde518f0 docs: adding mroe detail for sync period config 2020-11-14 11:22:09 +00:00
Callum Tait
1b98c8f811 refactor: adding explicit sync-period flag 2020-11-14 11:19:16 +00:00
Yusuke Kuoka
14b34efa77 Start collaborating to develop a Helm chart
Ref #184
2020-11-14 20:15:42 +09:00
Yusuke Kuoka
bbfe03f02b Add acceptance test (#168)
To ease verifying the controller to work before submitting/merging PRs and releasing a new version of the controller.
2020-11-14 20:07:14 +09:00
Yusuke Kuoka
8ccf64080c Merge pull request #182 from callum-tait-pbx/docs/improvements
docs/improvements
2020-11-14 12:02:24 +09:00
Callum Tait
492ea4b583 docs: adding a common errors section 2020-11-13 08:38:25 +00:00
Callum Tait
3446d4d761 docs: reordering to be more logical 2020-11-13 08:27:18 +00:00
Javier de Pedro López
1f82032fe3 Improvements in the documentation for permissions and replication (#175) 2020-11-13 09:36:22 +09:00
Elliot Maincourt
5b2272d80a Add read-only permissions to actions for being able to autoscale (#180) 2020-11-13 09:27:01 +09:00
Reinier Timmer
0870250f9b Enable matrix build for runner image (#179) 2020-11-13 09:24:12 +09:00
Shinnosuke Sawada
a4061d0625 gofmt ed 2020-11-12 09:20:06 +09:00
Juho Saarinen
7846f26199 Increase memory limit (#173)
We saw controller quite often OOMing, so first help to increase limit a bit.
2020-11-12 09:14:33 +09:00
Reinier Timmer
8217ebcaac Updated Runner to 2.274.1 (#176)
* Updated Runner to 2.274.1

* Update image tag in workflow
2020-11-12 09:14:06 +09:00
Shinnosuke Sawada
83857ba7e0 use tcp DOCKER_HOST instead of sharing docker.sock 2020-11-12 08:07:52 +09:00
Yusuke Kuoka
e613219a89 Fix token registration broken since v0.11.0 (#167)
Fixes #166
2020-11-11 09:38:05 +09:00
Reinier Timmer
bc35bdfa85 fixed label argument in entrypoint (#162)
* fixed label argument in entrypoint

* Removed quotes from RUNNER_GROUP_ARG
2020-11-11 08:44:51 +09:00
Yusuke Kuoka
ece8fd8fe4 Bump Go to 1.15 (#160)
Closes #104
2020-11-10 17:16:04 +09:00
Dan Webb
dcf8524b5c Adds RUNNER_GROUP argument to the runner registration (#157)
* Adds RUNNER_GROUP argument to the runner registration

Adds the ability to register a runner to a predefined runner_group

Resolves #137

* Update README with runner group example

- Updates the README with instructions of how to add the runner to a
  group
- Fix code fencing for shell and yaml blocks in the README
- Use consistent bullet points (dash not asterisk)
2020-11-10 17:15:54 +09:00
Yusuke Kuoka
4eb45d3c7f Fix build error 2020-11-10 17:09:16 +09:00
Juho Saarinen
1c30bdf35b Add GHE URL to transport (#152)
Fixes #149
2020-11-10 17:05:09 +09:00
Yusuke Kuoka
3f335ca628 Fix panic on startup when misconfigured (#154)
Fixes #153
2020-11-10 17:03:33 +09:00
Juho Saarinen
f2a2ab7ede Check token validity only when creating new pod (#159)
Fixes #143
2020-11-10 17:02:30 +09:00
Juho Saarinen
40c5050978 Added support for other than public GitHub URL (#146)
Refactoring a bit
2020-10-28 22:15:53 +09:00
Juho Saarinen
99a53a6e79 Releasing latest controller from master push (#147)
Fixes #135
2020-10-28 22:13:35 +09:00
Yusuke Kuoka
6d78fb07b3 Fix permission error with the default setup since v0.9.4 (#142)
Fixes #138
2020-10-25 11:25:48 +09:00
Yusuke Kuoka
faaca10fba Rename Runner.Spec.dockerWithinRunnerContainer to docker"d"WithinRunnerContainer (#134)
* Rename Runner.Spec.dockerWithinRunnerContainer to dockerdWithinRunnerContainer

Ref https://github.com/summerwind/actions-runner-controller/pull/126#issuecomment-712501790
2020-10-21 21:32:40 +09:00
Juho Saarinen
d16dfac0f8 Restart if pod ends up succeeded (#136)
Fixes #132
2020-10-21 21:32:26 +09:00
Juho Saarinen
af483d83da Possibility to define resources for DIND container (#125)
Ref #119
2020-10-21 10:26:19 +09:00
Juho Saarinen
92920926fe Configurable "runner and DinD in a single container" (#126) 2020-10-20 08:48:28 +09:00
Brendan Galloway
7d0bfb77e3 Inject Env Vars into Runner defined Container Spec (#127)
The runner token is now injected into the `runner` container defined within Runner.Spec.Containers[]
2020-10-20 08:43:53 +09:00
Brendan Galloway
c4074130e8 Patch additional protocol instances during manifest generation (#129)
Fixes #128
2020-10-19 09:57:53 +09:00
Yusuke Kuoka
be2e61f209 Add "alternatives" section to README (#124)
Grow together :)
2020-10-15 08:40:02 +09:00
Juho Saarinen
da818a898a Manifests valid for K8s 1.18 and 1.19 (#123)
Fixes #113
Fixes #116
2020-10-15 08:39:46 +09:00
135 changed files with 18840 additions and 18026 deletions

12
.dockerignore Normal file
View File

@@ -0,0 +1,12 @@
Makefile
acceptance
runner
hack
test-assets
config
charts
.github
.envrc
*.md
*.txt
*.sh

36
.github/ISSUE_TEMPLATE/bug_report.md vendored Normal file
View File

@@ -0,0 +1,36 @@
---
name: Bug report
about: Create a report to help us improve
title: ''
assignees: ''
---
**Describe the bug**
A clear and concise description of what the bug is.
**Checks**
- [ ] My actions-runner-controller version (v0.x.y) does support the feature
- [ ] I'm using an unreleased version of the controller I built from HEAD of the default branch
**To Reproduce**
Steps to reproduce the behavior:
1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error
**Expected behavior**
A clear and concise description of what you expected to happen.
**Screenshots**
If applicable, add screenshots to help explain your problem.
**Environment (please complete the following information):**
- Controller Version [e.g. 0.18.2]
- Deployment Method [e.g. Helm and Kustomize ]
- Helm Chart Version [e.g. 0.11.0, if applicable]
**Additional context**
Add any other context about the problem here.

View File

@@ -0,0 +1,19 @@
---
name: Feature request
about: Suggest an idea for this project
title: ''
assignees: ''
---
**Is your feature request related to a problem? Please describe.**
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
**Describe the solution you'd like**
A clear and concise description of what you want to happen.
**Describe alternatives you've considered**
A clear and concise description of any alternative solutions or features you've considered.
**Additional context**
Add any other context or screenshots about the feature request here.

34
.github/RELEASE_NOTE_TEMPLATE.md vendored Normal file
View File

@@ -0,0 +1,34 @@
# Release Note Template
This is the template of actions-runner-controller's release notes.
Whenever a new release is made, I start by manually copy-pasting this template onto the GitHub UI for creating the release.
I then walk-through all the changes, take sometime to think abount best one-sentence explanations to tell the users about changes, write it all,
and click the publish button.
If you think you can improve future release notes in any way, please do submit a pull request to change the template below.
Note that even though it looks like a Go template, I don't use any templating to generate the changelog.
It's just that I'm used to reading and intepreting Go template by myself, not a computer program :)
**Title**:
```
v{{ .Version }}: {{ .TitlesOfImportantChanges }}
```
**Body**:
```
**CAUTION:** If you're using the Helm chart, beware to review changes to CRDs and do manually upgrade CRDs! Helm installs CRDs only on installing a chart. It doesn't automatically upgrade CRDs. Otherwise you end up with troubles like #427, #467, and #468. Please refer to the [UPGRADING](charts/actions-runner-controller/docs/UPGRADING.md) docs for the latest process.
This release includes the following changes from contributors. Thank you!
- @{{ .GitHubUser }} fixed {{ .Feature }} to not break when ... (#{{ .PullRequestNumber }})
- @{{ .GitHubUser }} enhanced {{ .Feature }} to ... (#{{ .PullRequestNumber }})
- @{{ .GitHubUser }} added {{ .Feature }} for ... (#{{ .PullRequestNumber }})
- @{{ .GitHubUser }} fixed {{ .Topic }} in the documentation so that ... (#{{ .PullRequestNumber }})
- @{{ .GitHubUser }} added {{ .Topic }} to the documentation (#{{ .PullRequestNumber }})
- @{{ .GitHubUser }} improved the documentation about {{ .Topic }} to also cover ... (#{{ .PullRequestNumber }})
```

66
.github/stale.yml vendored Normal file
View File

@@ -0,0 +1,66 @@
# Configuration for probot-stale - https://github.com/probot/stale
# Number of days of inactivity before an Issue or Pull Request becomes stale
daysUntilStale: 30
# Number of days of inactivity before an Issue or Pull Request with the stale label is closed.
# Set to false to disable. If disabled, issues still need to be closed manually, but will remain marked as stale.
daysUntilClose: 14
# Only issues or pull requests with all of these labels are check if stale. Defaults to `[]` (disabled)
onlyLabels: []
# Issues or Pull Requests with these labels will never be considered stale. Set to `[]` to disable
exemptLabels:
- pinned
- security
- enhancement
- refactor
- documentation
- chore
- needs-investigation
- bug
# Set to true to ignore issues in a project (defaults to false)
exemptProjects: false
# Set to true to ignore issues in a milestone (defaults to false)
exemptMilestones: false
# Set to true to ignore issues with an assignee (defaults to false)
exemptAssignees: false
# Label to use when marking as stale
staleLabel: stale
# Comment to post when marking as stale. Set to `false` to disable
markComment: >
This issue has been automatically marked as stale because it has not had
recent activity. It will be closed if no further activity occurs. Thank you
for your contributions.
# Comment to post when removing the stale label.
# unmarkComment: >
# Your comment here.
# Comment to post when closing a stale Issue or Pull Request.
# closeComment: >
# Your comment here.
# Limit the number of actions per hour, from 1-30. Default is 30
limitPerRun: 30
# Limit to only `issues` or `pulls`
# only: issues
# Optionally, specify configuration settings that are specific to just 'issues' or 'pulls':
# pulls:
# daysUntilStale: 30
# markComment: >
# This pull request has been automatically marked as stale because it has not had
# recent activity. It will be closed if no further activity occurs. Thank you
# for your contributions.
# issues:
# exemptLabels:
# - confirmed

View File

@@ -0,0 +1,123 @@
name: Build and Release Runners
on:
pull_request:
branches:
- '**'
paths:
- 'runner/**'
- .github/workflows/build-and-release-runners.yml
push:
branches:
- master
paths:
- runner/patched/*
- runner/Dockerfile
- runner/Dockerfile.ubuntu.1804
- runner/Dockerfile.dindrunner
- runner/entrypoint.sh
- .github/workflows/build-and-release-runners.yml
jobs:
build:
runs-on: ubuntu-latest
name: Build ${{ matrix.name }}-ubuntu-${{ matrix.os-version }}
strategy:
matrix:
include:
- name: actions-runner
os-version: 20.04
dockerfile: Dockerfile
- name: actions-runner
os-version: 18.04
dockerfile: Dockerfile.ubuntu.1804
- name: actions-runner-dind
os-version: 20.04
dockerfile: Dockerfile.dindrunner
env:
RUNNER_VERSION: 2.278.0
DOCKER_VERSION: 19.03.12
DOCKERHUB_USERNAME: ${{ secrets.DOCKER_USER }}
steps:
- name: Set outputs
id: vars
run: echo ::set-output name=sha_short::${GITHUB_SHA::7}
- name: Checkout
uses: actions/checkout@v2
- name: Set up QEMU
uses: docker/setup-qemu-action@v1
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v1
with:
version: latest
- name: Login to DockerHub
uses: docker/login-action@v1
if: ${{ github.event_name == 'push' || github.event_name == 'release' }}
with:
username: ${{ secrets.DOCKER_USER }}
password: ${{ secrets.DOCKER_ACCESS_TOKEN }}
- name: Build and Push Versioned Tags
uses: docker/build-push-action@v2
with:
context: ./runner
file: ./runner/${{ matrix.dockerfile }}
platforms: linux/amd64,linux/arm64
push: ${{ github.event_name != 'pull_request' }}
build-args: |
RUNNER_VERSION=${{ env.RUNNER_VERSION }}
DOCKER_VERSION=${{ env.DOCKER_VERSION }}
tags: |
${{ env.DOCKERHUB_USERNAME }}/${{ matrix.name }}:v${{ env.RUNNER_VERSION }}-ubuntu-${{ matrix.os-version }}
${{ env.DOCKERHUB_USERNAME }}/${{ matrix.name }}:v${{ env.RUNNER_VERSION }}-ubuntu-${{ matrix.os-version }}-${{ steps.vars.outputs.sha_short }}
latest-tags:
if: ${{ github.event_name == 'push' || github.event_name == 'release' }}
runs-on: ubuntu-latest
name: Build ${{ matrix.name }}-latest
strategy:
matrix:
include:
- name: actions-runner
dockerfile: Dockerfile
- name: actions-runner-dind
dockerfile: Dockerfile.dindrunner
env:
RUNNER_VERSION: 2.277.1
DOCKER_VERSION: 19.03.12
DOCKERHUB_USERNAME: ${{ secrets.DOCKER_USER }}
steps:
- name: Checkout
uses: actions/checkout@v2
- name: Set up QEMU
uses: docker/setup-qemu-action@v1
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v1
with:
version: latest
- name: Login to DockerHub
uses: docker/login-action@v1
with:
username: ${{ secrets.DOCKER_USER }}
password: ${{ secrets.DOCKER_ACCESS_TOKEN }}
- name: Build and Push Latest Tag
uses: docker/build-push-action@v2
with:
context: ./runner
file: ./runner/${{ matrix.dockerfile }}
platforms: linux/amd64,linux/arm64
push: true
build-args: |
RUNNER_VERSION=${{ env.RUNNER_VERSION }}
DOCKER_VERSION=${{ env.DOCKER_VERSION }}
tags: |
${{ env.DOCKERHUB_USERNAME }}/${{ matrix.name }}:latest

View File

@@ -1,64 +0,0 @@
on:
pull_request:
branches:
- '**'
paths:
- 'runner/**'
- .github/workflows/build-runner.yml
push:
branches:
- master
paths:
- runner/patched/*
- runner/Dockerfile
- runner/entrypoint.sh
- .github/workflows/build-runner.yml
name: Runner
jobs:
build:
runs-on: ubuntu-latest
name: Build
env:
RUNNER_VERSION: 2.273.5
DOCKER_VERSION: 19.03.12
DOCKERHUB_USERNAME: ${{ github.repository_owner }}
steps:
- name: Checkout
uses: actions/checkout@v2
- name: Set up Docker Buildx
id: buildx
uses: crazy-max/ghaction-docker-buildx@v1
with:
buildx-version: latest
- name: Build Container Image
working-directory: runner
if: ${{ github.event_name == 'pull_request' }}
run: |
docker buildx build \
--build-arg RUNNER_VERSION=${RUNNER_VERSION} \
--build-arg DOCKER_VERSION=${DOCKER_VERSION} \
--platform linux/amd64,linux/arm64 \
--tag ${DOCKERHUB_USERNAME}/actions-runner:v${RUNNER_VERSION} \
--tag ${DOCKERHUB_USERNAME}/actions-runner:latest \
-f Dockerfile .
- name: Login to GitHub Docker Registry
run: echo "${DOCKERHUB_PASSWORD}" | docker login -u "${DOCKERHUB_USERNAME}" --password-stdin
if: ${{ github.event_name == 'push' }}
env:
DOCKERHUB_USERNAME: ${{ github.repository_owner }}
DOCKERHUB_PASSWORD: ${{ secrets.DOCKER_ACCESS_TOKEN }}
- name: Build and Push Container Image
working-directory: runner
if: ${{ github.event_name == 'push' }}
run: |
docker buildx build \
--build-arg RUNNER_VERSION=${RUNNER_VERSION} \
--build-arg DOCKER_VERSION=${DOCKER_VERSION} \
--platform linux/amd64,linux/arm64 \
--tag ${DOCKERHUB_USERNAME}/actions-runner:v${RUNNER_VERSION} \
--tag ${DOCKERHUB_USERNAME}/actions-runner:latest \
-f Dockerfile . --push

View File

@@ -0,0 +1,77 @@
name: Lint and Test Charts
on:
push:
paths:
- 'charts/**'
- '!charts/actions-runner-controller/docs/**'
- '!charts/actions-runner-controller/*.md'
- '.github/**'
- '!.github/*.md'
workflow_dispatch:
env:
KUBE_SCORE_VERSION: 1.10.0
HELM_VERSION: v3.4.1
jobs:
lint-test:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v2
with:
fetch-depth: 0
- name: Set up Helm
uses: azure/setup-helm@v1
with:
version: ${{ env.HELM_VERSION }}
- name: Set up kube-score
run: |
wget https://github.com/zegl/kube-score/releases/download/v${{ env.KUBE_SCORE_VERSION }}/kube-score_${{ env.KUBE_SCORE_VERSION }}_linux_amd64 -O kube-score
chmod 755 kube-score
- name: Kube-score generated manifests
run: helm template --values charts/.ci/values-kube-score.yaml charts/* | ./kube-score score -
--ignore-test pod-networkpolicy
--ignore-test deployment-has-poddisruptionbudget
--ignore-test deployment-has-host-podantiaffinity
--ignore-test container-security-context
--ignore-test pod-probes
--ignore-test container-image-tag
--enable-optional-test container-security-context-privileged
--enable-optional-test container-security-context-readonlyrootfilesystem
# python is a requirement for the chart-testing action below (supports yamllint among other tests)
- uses: actions/setup-python@v2
with:
python-version: 3.7
- name: Set up chart-testing
uses: helm/chart-testing-action@v2.0.1
- name: Run chart-testing (list-changed)
id: list-changed
run: |
changed=$(ct list-changed --config charts/.ci/ct-config.yaml)
if [[ -n "$changed" ]]; then
echo "::set-output name=changed::true"
fi
- name: Run chart-testing (lint)
run: ct lint --config charts/.ci/ct-config.yaml
- name: Create kind cluster
uses: helm/kind-action@v1.0.0
if: steps.list-changed.outputs.changed == 'true'
# We need cert-manager already installed in the cluster because we assume the CRDs exist
- name: Install cert-manager
run: |
helm repo add jetstack https://charts.jetstack.io --force-update
helm install cert-manager jetstack/cert-manager --set installCRDs=true --wait
if: steps.list-changed.outputs.changed == 'true'
- name: Run chart-testing (install)
run: ct install --config charts/.ci/ct-config.yaml

View File

@@ -0,0 +1,103 @@
name: Publish helm chart
on:
push:
branches:
- master
- main # assume that the branch name may change in future
paths:
- 'charts/**'
- '!charts/actions-runner-controller/docs/**'
- '.github/**'
- '!**.md'
workflow_dispatch:
env:
KUBE_SCORE_VERSION: 1.10.0
HELM_VERSION: v3.4.1
jobs:
lint-chart:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v2
with:
fetch-depth: 0
- name: Set up Helm
uses: azure/setup-helm@v1
with:
version: ${{ env.HELM_VERSION }}
- name: Set up kube-score
run: |
wget https://github.com/zegl/kube-score/releases/download/v${{ env.KUBE_SCORE_VERSION }}/kube-score_${{ env.KUBE_SCORE_VERSION }}_linux_amd64 -O kube-score
chmod 755 kube-score
- name: Kube-score generated manifests
run: helm template --values charts/.ci/values-kube-score.yaml charts/* | ./kube-score score -
--ignore-test pod-networkpolicy
--ignore-test deployment-has-poddisruptionbudget
--ignore-test deployment-has-host-podantiaffinity
--ignore-test container-security-context
--ignore-test pod-probes
--ignore-test container-image-tag
--enable-optional-test container-security-context-privileged
--enable-optional-test container-security-context-readonlyrootfilesystem
# python is a requirement for the chart-testing action below (supports yamllint among other tests)
- uses: actions/setup-python@v2
with:
python-version: 3.7
- name: Set up chart-testing
uses: helm/chart-testing-action@v2.0.1
- name: Run chart-testing (list-changed)
id: list-changed
run: |
changed=$(ct list-changed --config charts/.ci/ct-config.yaml)
if [[ -n "$changed" ]]; then
echo "::set-output name=changed::true"
fi
- name: Run chart-testing (lint)
run: ct lint --config charts/.ci/ct-config.yaml
- name: Create kind cluster
uses: helm/kind-action@v1.0.0
if: steps.list-changed.outputs.changed == 'true'
# We need cert-manager already installed in the cluster because we assume the CRDs exist
- name: Install cert-manager
run: |
helm repo add jetstack https://charts.jetstack.io --force-update
helm install cert-manager jetstack/cert-manager --set installCRDs=true --wait
if: steps.list-changed.outputs.changed == 'true'
- name: Run chart-testing (install)
run: ct install --config charts/.ci/ct-config.yaml
if: steps.list-changed.outputs.changed == 'true'
publish-chart:
runs-on: ubuntu-latest
needs: lint-chart
steps:
- name: Checkout
uses: actions/checkout@v2
with:
fetch-depth: 0
- name: Configure Git
run: |
git config user.name "$GITHUB_ACTOR"
git config user.email "$GITHUB_ACTOR@users.noreply.github.com"
- name: Run chart-releaser
uses: helm/chart-releaser-action@v1.1.0
env:
CR_TOKEN: "${{ secrets.GITHUB_TOKEN }}"

View File

@@ -6,7 +6,13 @@ jobs:
build: build:
runs-on: ubuntu-latest runs-on: ubuntu-latest
name: Release name: Release
env:
DOCKERHUB_USERNAME: ${{ secrets.DOCKER_USER }}
steps: steps:
- name: Set outputs
id: vars
run: echo ::set-output name=sha_short::${GITHUB_SHA::7}
- name: Checkout - name: Checkout
uses: actions/checkout@v2 uses: actions/checkout@v2
@@ -22,30 +28,36 @@ jobs:
sudo mv ghr_v0.13.0_linux_amd64/ghr /usr/local/bin sudo mv ghr_v0.13.0_linux_amd64/ghr /usr/local/bin
- name: Set version - name: Set version
run: echo "::set-env name=VERSION::$(cat ${GITHUB_EVENT_PATH} | jq -r '.release.tag_name')" run: echo "VERSION=$(cat ${GITHUB_EVENT_PATH} | jq -r '.release.tag_name')" >> $GITHUB_ENV
- name: Upload artifacts - name: Upload artifacts
env: env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: make github-release run: make github-release
- name: Set up QEMU
uses: docker/setup-qemu-action@v1
- name: Set up Docker Buildx - name: Set up Docker Buildx
id: buildx id: buildx
uses: crazy-max/ghaction-docker-buildx@v1 uses: docker/setup-buildx-action@v1
with: with:
buildx-version: latest version: latest
- name: Login to GitHub Docker Registry - name: Login to DockerHub
run: echo "${DOCKERHUB_PASSWORD}" | docker login -u "${DOCKERHUB_USERNAME}" --password-stdin uses: docker/login-action@v1
env: with:
DOCKERHUB_USERNAME: ${{ github.repository_owner }} username: ${{ secrets.DOCKER_USER }}
DOCKERHUB_PASSWORD: ${{ secrets.DOCKER_ACCESS_TOKEN }} password: ${{ secrets.DOCKER_ACCESS_TOKEN }}
- name: Build and Push
uses: docker/build-push-action@v2
with:
file: Dockerfile
platforms: linux/amd64,linux/arm64
push: true
tags: |
${{ env.DOCKERHUB_USERNAME }}/actions-runner-controller:latest
${{ env.DOCKERHUB_USERNAME }}/actions-runner-controller:${{ env.VERSION }}
${{ env.DOCKERHUB_USERNAME }}/actions-runner-controller:${{ env.VERSION }}-${{ steps.vars.outputs.sha_short }}
- name: Build Container Image
env:
DOCKERHUB_USERNAME: ${{ github.repository_owner }}
run: |
docker buildx build \
--platform linux/amd64,linux/arm64 \
--tag ${DOCKERHUB_USERNAME}/actions-runner-controller:${{ env.VERSION }} \
-f Dockerfile . --push

View File

@@ -6,6 +6,9 @@ on:
- master - master
paths-ignore: paths-ignore:
- 'runner/**' - 'runner/**'
- .github/workflows/build-and-release-runners.yml
- '*.md'
- '.gitignore'
jobs: jobs:
test: test:

44
.github/workflows/wip.yml vendored Normal file
View File

@@ -0,0 +1,44 @@
on:
push:
branches:
- master
paths-ignore:
- "runner/**"
- "**.md"
- ".gitignore"
jobs:
build:
runs-on: ubuntu-latest
name: release-latest
env:
DOCKERHUB_USERNAME: ${{ secrets.DOCKER_USER }}
steps:
- name: Checkout
uses: actions/checkout@v2
- name: Set up QEMU
uses: docker/setup-qemu-action@v1
- name: Set up Docker Buildx
id: buildx
uses: docker/setup-buildx-action@v1
with:
version: latest
- name: Login to DockerHub
uses: docker/login-action@v1
with:
username: ${{ secrets.DOCKER_USER }}
password: ${{ secrets.DOCKER_ACCESS_TOKEN }}
# Considered unstable builds
# See Issue #285, PR #286, and PR #323 for more information
- name: Build and Push
uses: docker/build-push-action@v2
with:
file: Dockerfile
platforms: linux/amd64,linux/arm64
push: true
tags: |
${{ env.DOCKERHUB_USERNAME }}/actions-runner-controller:canary

6
.gitignore vendored
View File

@@ -23,3 +23,9 @@ bin
*.swp *.swp
*.swo *.swo
*~ *~
.envrc
*.pem
# OS
.DS_STORE

8
CONTRIBUTING.md Normal file
View File

@@ -0,0 +1,8 @@
# Contributing
### Helm Version Bumps
**Chart Version :** When bumping the chart version follow semantic versioning https://semver.org/<br />
**App Version :** When bumping the app version you will also need to bump the chart version too. Again, follow semantic versioning when bumping the chart.
To determine if you need to bump the MAJOR, MINOR or PATCH versions you will need to review the changes between the previous app version and the new app version and / or ask for a maintainer to advise.

View File

@@ -1,5 +1,5 @@
# Build the manager binary # Build the manager binary
FROM golang:1.13 as builder FROM golang:1.15 as builder
ARG TARGETPLATFORM ARG TARGETPLATFORM
@@ -22,7 +22,8 @@ COPY . .
RUN export GOOS=$(echo ${TARGETPLATFORM} | cut -d / -f1) && \ RUN export GOOS=$(echo ${TARGETPLATFORM} | cut -d / -f1) && \
export GOARCH=$(echo ${TARGETPLATFORM} | cut -d / -f2) && \ export GOARCH=$(echo ${TARGETPLATFORM} | cut -d / -f2) && \
GOARM=$(echo ${TARGETPLATFORM} | cut -d / -f3 | cut -c2-) && \ GOARM=$(echo ${TARGETPLATFORM} | cut -d / -f3 | cut -c2-) && \
go build -a -o manager main.go go build -a -o manager main.go && \
go build -a -o github-webhook-server ./cmd/githubwebhookserver
# Use distroless as minimal base image to package the manager binary # Use distroless as minimal base image to package the manager binary
# Refer to https://github.com/GoogleContainerTools/distroless for more details # Refer to https://github.com/GoogleContainerTools/distroless for more details
@@ -31,6 +32,7 @@ FROM gcr.io/distroless/static:nonroot
WORKDIR / WORKDIR /
COPY --from=builder /workspace/manager . COPY --from=builder /workspace/manager .
COPY --from=builder /workspace/github-webhook-server .
USER nonroot:nonroot USER nonroot:nonroot

215
Makefile
View File

@@ -1,5 +1,20 @@
NAME ?= summerwind/actions-runner-controller ifdef DOCKER_USER
NAME ?= ${DOCKER_USER}/actions-runner-controller
else
NAME ?= summerwind/actions-runner-controller
endif
DOCKER_USER ?= $(shell echo ${NAME} | cut -d / -f1)
VERSION ?= latest VERSION ?= latest
RUNNER_NAME ?= ${DOCKER_USER}/actions-runner
RUNNER_TAG ?= ${VERSION}
TEST_REPO ?= ${DOCKER_USER}/actions-runner-controller
TEST_ORG ?=
TEST_ORG_REPO ?=
SYNC_PERIOD ?= 5m
# From https://github.com/VictoriaMetrics/operator/pull/44
YAML_DROP=$(YQ) delete --inplace
YAML_DROP_PREFIX=spec.validation.openAPIV3Schema.properties.spec.properties
# Produce CRDs that work back to Kubernetes 1.11 (no version conversion) # Produce CRDs that work back to Kubernetes 1.11 (no version conversion)
CRD_OPTIONS ?= "crd:trivialVersions=true" CRD_OPTIONS ?= "crd:trivialVersions=true"
@@ -11,6 +26,8 @@ else
GOBIN=$(shell go env GOBIN) GOBIN=$(shell go env GOBIN)
endif endif
TEST_ASSETS=$(PWD)/test-assets
# default list of platforms for which multiarch image is built # default list of platforms for which multiarch image is built
ifeq (${PLATFORMS}, ) ifeq (${PLATFORMS}, )
export PLATFORMS="linux/amd64,linux/arm64" export PLATFORMS="linux/amd64,linux/arm64"
@@ -19,8 +36,8 @@ endif
# if IMG_RESULT is unspecified, by default the image will be pushed to registry # if IMG_RESULT is unspecified, by default the image will be pushed to registry
ifeq (${IMG_RESULT}, load) ifeq (${IMG_RESULT}, load)
export PUSH_ARG="--load" export PUSH_ARG="--load"
# if load is specified, image will be built only for the build machine architecture. # if load is specified, image will be built only for the build machine architecture.
export PLATFORMS="local" export PLATFORMS="local"
else ifeq (${IMG_RESULT}, cache) else ifeq (${IMG_RESULT}, cache)
# if cache is specified, image will only be available in the build cache, it won't be pushed or loaded # if cache is specified, image will only be available in the build cache, it won't be pushed or loaded
# therefore no PUSH_ARG will be specified # therefore no PUSH_ARG will be specified
@@ -34,6 +51,13 @@ all: manager
test: generate fmt vet manifests test: generate fmt vet manifests
go test ./... -coverprofile cover.out go test ./... -coverprofile cover.out
test-with-deps: kube-apiserver etcd kubectl
# See https://pkg.go.dev/sigs.k8s.io/controller-runtime/pkg/envtest#pkg-constants
TEST_ASSET_KUBE_APISERVER=$(KUBE_APISERVER_BIN) \
TEST_ASSET_ETCD=$(ETCD_BIN) \
TEST_ASSET_KUBECTL=$(KUBECTL_BIN) \
make test
# Build manager binary # Build manager binary
manager: generate fmt vet manager: generate fmt vet
go build -o bin/manager main.go go build -o bin/manager main.go
@@ -56,9 +80,14 @@ deploy: manifests
kustomize build config/default | kubectl apply -f - kustomize build config/default | kubectl apply -f -
# Generate manifests e.g. CRD, RBAC etc. # Generate manifests e.g. CRD, RBAC etc.
manifests: controller-gen manifests: manifests-118 fix118 chart-crds
manifests-118: controller-gen
$(CONTROLLER_GEN) $(CRD_OPTIONS) rbac:roleName=manager-role webhook paths="./..." output:crd:artifacts:config=config/crd/bases $(CONTROLLER_GEN) $(CRD_OPTIONS) rbac:roleName=manager-role webhook paths="./..." output:crd:artifacts:config=config/crd/bases
chart-crds:
cp config/crd/bases/*.yaml charts/actions-runner-controller/crds/
# Run go fmt against code # Run go fmt against code
fmt: fmt:
go fmt ./... go fmt ./...
@@ -67,17 +96,30 @@ fmt:
vet: vet:
go vet ./... go vet ./...
# workaround for CRD issue with k8s 1.18 & controller-gen
# ref: https://github.com/kubernetes/kubernetes/issues/91395
fix118: yq
$(YAML_DROP) config/crd/bases/actions.summerwind.dev_runnerreplicasets.yaml $(YAML_DROP_PREFIX).template.properties.spec.properties.containers.items.properties
$(YAML_DROP) config/crd/bases/actions.summerwind.dev_runnerreplicasets.yaml $(YAML_DROP_PREFIX).template.properties.spec.properties.initContainers.items.properties
$(YAML_DROP) config/crd/bases/actions.summerwind.dev_runnerreplicasets.yaml $(YAML_DROP_PREFIX).template.properties.spec.properties.sidecarContainers.items.properties
$(YAML_DROP) config/crd/bases/actions.summerwind.dev_runnerreplicasets.yaml $(YAML_DROP_PREFIX).template.properties.spec.properties.ephemeralContainers.items.properties
$(YAML_DROP) config/crd/bases/actions.summerwind.dev_runnerdeployments.yaml $(YAML_DROP_PREFIX).template.properties.spec.properties.containers.items.properties
$(YAML_DROP) config/crd/bases/actions.summerwind.dev_runnerdeployments.yaml $(YAML_DROP_PREFIX).template.properties.spec.properties.initContainers.items.properties
$(YAML_DROP) config/crd/bases/actions.summerwind.dev_runnerdeployments.yaml $(YAML_DROP_PREFIX).template.properties.spec.properties.sidecarContainers.items.properties
$(YAML_DROP) config/crd/bases/actions.summerwind.dev_runnerdeployments.yaml $(YAML_DROP_PREFIX).template.properties.spec.properties.ephemeralContainers.items.properties
$(YAML_DROP) config/crd/bases/actions.summerwind.dev_runners.yaml $(YAML_DROP_PREFIX).containers.items.properties
$(YAML_DROP) config/crd/bases/actions.summerwind.dev_runners.yaml $(YAML_DROP_PREFIX).initContainers.items.properties
$(YAML_DROP) config/crd/bases/actions.summerwind.dev_runners.yaml $(YAML_DROP_PREFIX).sidecarContainers.items.properties
$(YAML_DROP) config/crd/bases/actions.summerwind.dev_runners.yaml $(YAML_DROP_PREFIX).ephemeralContainers.items.properties
# Generate code # Generate code
generate: controller-gen generate: controller-gen
$(CONTROLLER_GEN) object:headerFile=./hack/boilerplate.go.txt paths="./..." $(CONTROLLER_GEN) object:headerFile=./hack/boilerplate.go.txt paths="./..."
# Build the docker image # Build the docker image
docker-build: test docker-build:
docker build . -t ${NAME}:${VERSION} docker build . -t ${NAME}:${VERSION}
docker build runner -t ${RUNNER_NAME}:${RUNNER_TAG} --build-arg TARGETPLATFORM=$(shell arch)
# Push the docker image
docker-push:
docker push ${NAME}:${VERSION}
docker-buildx: docker-buildx:
export DOCKER_CLI_EXPERIMENTAL=enabled export DOCKER_CLI_EXPERIMENTAL=enabled
@@ -91,12 +133,74 @@ docker-buildx:
-f Dockerfile \ -f Dockerfile \
. ${PUSH_ARG} . ${PUSH_ARG}
# Push the docker image
docker-push:
docker push ${NAME}:${VERSION}
docker push ${RUNNER_NAME}:${RUNNER_TAG}
# Generate the release manifest file # Generate the release manifest file
release: manifests release: manifests
cd config/manager && kustomize edit set image controller=${NAME}:${VERSION} cd config/manager && kustomize edit set image controller=${NAME}:${VERSION}
mkdir -p release mkdir -p release
kustomize build config/default > release/actions-runner-controller.yaml kustomize build config/default > release/actions-runner-controller.yaml
.PHONY: release/clean
release/clean:
rm -rf release
.PHONY: acceptance
acceptance: release/clean acceptance/pull docker-build release
ACCEPTANCE_TEST_SECRET_TYPE=token make acceptance/run
ACCEPTANCE_TEST_SECRET_TYPE=app make acceptance/run
ACCEPTANCE_TEST_DEPLOYMENT_TOOL=helm ACCEPTANCE_TEST_SECRET_TYPE=token make acceptance/run
ACCEPTANCE_TEST_DEPLOYMENT_TOOL=helm ACCEPTANCE_TEST_SECRET_TYPE=app make acceptance/run
acceptance/run: acceptance/kind acceptance/load acceptance/setup acceptance/deploy acceptance/tests acceptance/teardown
acceptance/kind:
kind create cluster --name acceptance --config acceptance/kind.yaml
# Set TMPDIR to somewhere under $HOME when you use docker installed with Ubuntu snap
# Otherwise `load docker-image` fail while running `docker save`.
# See https://kind.sigs.k8s.io/docs/user/known-issues/#docker-installed-with-snap
acceptance/load:
kind load docker-image ${NAME}:${VERSION} --name acceptance
kind load docker-image quay.io/brancz/kube-rbac-proxy:v0.10.0 --name acceptance
kind load docker-image ${RUNNER_NAME}:${RUNNER_TAG} --name acceptance
kind load docker-image docker:dind --name acceptance
kind load docker-image quay.io/jetstack/cert-manager-controller:v1.0.4 --name acceptance
kind load docker-image quay.io/jetstack/cert-manager-cainjector:v1.0.4 --name acceptance
kind load docker-image quay.io/jetstack/cert-manager-webhook:v1.0.4 --name acceptance
kubectl cluster-info --context kind-acceptance
# Pull the docker images for acceptance
acceptance/pull:
docker pull quay.io/brancz/kube-rbac-proxy:v0.10.0
docker pull docker:dind
docker pull quay.io/jetstack/cert-manager-controller:v1.0.4
docker pull quay.io/jetstack/cert-manager-cainjector:v1.0.4
docker pull quay.io/jetstack/cert-manager-webhook:v1.0.4
acceptance/setup:
kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v1.0.4/cert-manager.yaml #kubectl create namespace actions-runner-system
kubectl -n cert-manager wait deploy/cert-manager-cainjector --for condition=available --timeout 90s
kubectl -n cert-manager wait deploy/cert-manager-webhook --for condition=available --timeout 60s
kubectl -n cert-manager wait deploy/cert-manager --for condition=available --timeout 60s
kubectl create namespace actions-runner-system || true
# Adhocly wait for some time until cert-manager's admission webhook gets ready
sleep 5
acceptance/teardown:
kind delete cluster --name acceptance
acceptance/deploy:
NAME=${NAME} DOCKER_USER=${DOCKER_USER} VERSION=${VERSION} RUNNER_NAME=${RUNNER_NAME} RUNNER_TAG=${RUNNER_TAG} TEST_REPO=${TEST_REPO} \
TEST_ORG=${TEST_ORG} TEST_ORG_REPO=${TEST_ORG_REPO} SYNC_PERIOD=${SYNC_PERIOD} \
acceptance/deploy.sh
acceptance/tests:
acceptance/checks.sh
# Upload release file to GitHub. # Upload release file to GitHub.
github-release: release github-release: release
ghr ${VERSION} release/ ghr ${VERSION} release/
@@ -105,6 +209,7 @@ github-release: release
# download controller-gen if necessary # download controller-gen if necessary
controller-gen: controller-gen:
ifeq (, $(shell which controller-gen)) ifeq (, $(shell which controller-gen))
ifeq (, $(wildcard $(GOBIN)/controller-gen))
@{ \ @{ \
set -e ;\ set -e ;\
CONTROLLER_GEN_TMP_DIR=$$(mktemp -d) ;\ CONTROLLER_GEN_TMP_DIR=$$(mktemp -d) ;\
@@ -113,7 +218,99 @@ ifeq (, $(shell which controller-gen))
go get sigs.k8s.io/controller-tools/cmd/controller-gen@v0.3.0 ;\ go get sigs.k8s.io/controller-tools/cmd/controller-gen@v0.3.0 ;\
rm -rf $$CONTROLLER_GEN_TMP_DIR ;\ rm -rf $$CONTROLLER_GEN_TMP_DIR ;\
} }
endif
CONTROLLER_GEN=$(GOBIN)/controller-gen CONTROLLER_GEN=$(GOBIN)/controller-gen
else else
CONTROLLER_GEN=$(shell which controller-gen) CONTROLLER_GEN=$(shell which controller-gen)
endif endif
# find or download yq
# download yq if necessary
# Use always go-version to get consistent line wraps etc.
yq:
ifeq (, $(wildcard $(GOBIN)/yq))
echo "Downloading yq"
@{ \
set -e ;\
YQ_TMP_DIR=$$(mktemp -d) ;\
cd $$YQ_TMP_DIR ;\
go mod init tmp ;\
go get github.com/mikefarah/yq/v3@3.4.0 ;\
rm -rf $$YQ_TMP_DIR ;\
}
endif
YQ=$(GOBIN)/yq
OS_NAME := $(shell uname -s | tr A-Z a-z)
# find or download etcd
etcd:
ifeq (, $(shell which etcd))
ifeq (, $(wildcard $(TEST_ASSETS)/etcd))
@{ \
set -xe ;\
INSTALL_TMP_DIR=$$(mktemp -d) ;\
cd $$INSTALL_TMP_DIR ;\
wget https://github.com/kubernetes-sigs/kubebuilder/releases/download/v2.3.2/kubebuilder_2.3.2_$(OS_NAME)_amd64.tar.gz ;\
mkdir -p $(TEST_ASSETS) ;\
tar zxvf kubebuilder_2.3.2_$(OS_NAME)_amd64.tar.gz ;\
mv kubebuilder_2.3.2_$(OS_NAME)_amd64/bin/etcd $(TEST_ASSETS)/etcd ;\
mv kubebuilder_2.3.2_$(OS_NAME)_amd64/bin/kube-apiserver $(TEST_ASSETS)/kube-apiserver ;\
mv kubebuilder_2.3.2_$(OS_NAME)_amd64/bin/kubectl $(TEST_ASSETS)/kubectl ;\
rm -rf $$INSTALL_TMP_DIR ;\
}
ETCD_BIN=$(TEST_ASSETS)/etcd
else
ETCD_BIN=$(TEST_ASSETS)/etcd
endif
else
ETCD_BIN=$(shell which etcd)
endif
# find or download kube-apiserver
kube-apiserver:
ifeq (, $(shell which kube-apiserver))
ifeq (, $(wildcard $(TEST_ASSETS)/kube-apiserver))
@{ \
set -xe ;\
INSTALL_TMP_DIR=$$(mktemp -d) ;\
cd $$INSTALL_TMP_DIR ;\
wget https://github.com/kubernetes-sigs/kubebuilder/releases/download/v2.3.2/kubebuilder_2.3.2_$(OS_NAME)_amd64.tar.gz ;\
mkdir -p $(TEST_ASSETS) ;\
tar zxvf kubebuilder_2.3.2_$(OS_NAME)_amd64.tar.gz ;\
mv kubebuilder_2.3.2_$(OS_NAME)_amd64/bin/etcd $(TEST_ASSETS)/etcd ;\
mv kubebuilder_2.3.2_$(OS_NAME)_amd64/bin/kube-apiserver $(TEST_ASSETS)/kube-apiserver ;\
mv kubebuilder_2.3.2_$(OS_NAME)_amd64/bin/kubectl $(TEST_ASSETS)/kubectl ;\
rm -rf $$INSTALL_TMP_DIR ;\
}
KUBE_APISERVER_BIN=$(TEST_ASSETS)/kube-apiserver
else
KUBE_APISERVER_BIN=$(TEST_ASSETS)/kube-apiserver
endif
else
KUBE_APISERVER_BIN=$(shell which kube-apiserver)
endif
# find or download kubectl
kubectl:
ifeq (, $(shell which kubectl))
ifeq (, $(wildcard $(TEST_ASSETS)/kubectl))
@{ \
set -xe ;\
INSTALL_TMP_DIR=$$(mktemp -d) ;\
cd $$INSTALL_TMP_DIR ;\
wget https://github.com/kubernetes-sigs/kubebuilder/releases/download/v2.3.2/kubebuilder_2.3.2_$(OS_NAME)_amd64.tar.gz ;\
mkdir -p $(TEST_ASSETS) ;\
tar zxvf kubebuilder_2.3.2_$(OS_NAME)_amd64.tar.gz ;\
mv kubebuilder_2.3.2_$(OS_NAME)_amd64/bin/etcd $(TEST_ASSETS)/etcd ;\
mv kubebuilder_2.3.2_$(OS_NAME)_amd64/bin/kube-apiserver $(TEST_ASSETS)/kube-apiserver ;\
mv kubebuilder_2.3.2_$(OS_NAME)_amd64/bin/kubectl $(TEST_ASSETS)/kubectl ;\
rm -rf $$INSTALL_TMP_DIR ;\
}
KUBECTL_BIN=$(TEST_ASSETS)/kubectl
else
KUBECTL_BIN=$(TEST_ASSETS)/kubectl
endif
else
KUBECTL_BIN=$(shell which kubectl)
endif

942
README.md

File diff suppressed because it is too large Load Diff

32
acceptance/checks.sh Executable file
View File

@@ -0,0 +1,32 @@
#!/usr/bin/env bash
set -e
runner_name=
while [ -z "${runner_name}" ]; do
echo Finding the runner... 1>&2
sleep 1
runner_name=$(kubectl get runner --output=jsonpath="{.items[*].metadata.name}")
done
echo Found runner ${runner_name}.
# Wait a bit to make sure the runner pod is created before looking for it.
sleep 2
pod_name=
while [ -z "${pod_name}" ]; do
echo Finding the runner pod... 1>&2
sleep 1
pod_name=$(kubectl get pod --output=jsonpath="{.items[*].metadata.name}" | grep ${runner_name})
done
echo Found pod ${pod_name}.
echo Waiting for pod ${runner_name} to become ready... 1>&2
kubectl wait pod/${runner_name} --for condition=ready --timeout 270s
echo All tests passed. 1>&2

66
acceptance/deploy.sh Executable file
View File

@@ -0,0 +1,66 @@
#!/usr/bin/env bash
set -e
tpe=${ACCEPTANCE_TEST_SECRET_TYPE}
VALUES_FILE=${VALUES_FILE:-$(dirname $0)/values.yaml}
if [ "${tpe}" == "token" ]; then
if ! kubectl get secret controller-manager -n actions-runner-system >/dev/null; then
kubectl create secret generic controller-manager \
-n actions-runner-system \
--from-literal=github_token=${GITHUB_TOKEN:?GITHUB_TOKEN must not be empty}
fi
elif [ "${tpe}" == "app" ]; then
kubectl create secret generic controller-manager \
-n actions-runner-system \
--from-literal=github_app_id=${APP_ID:?must not be empty} \
--from-literal=github_app_installation_id=${INSTALLATION_ID:?must not be empty} \
--from-file=github_app_private_key=${PRIVATE_KEY_FILE_PATH:?must not be empty}
else
echo "ACCEPTANCE_TEST_SECRET_TYPE must be set to either \"token\" or \"app\"" 1>&2
exit 1
fi
tool=${ACCEPTANCE_TEST_DEPLOYMENT_TOOL}
if [ "${tool}" == "helm" ]; then
helm upgrade --install actions-runner-controller \
charts/actions-runner-controller \
-n actions-runner-system \
--create-namespace \
--set syncPeriod=${SYNC_PERIOD} \
--set authSecret.create=false \
--set image.repository=${NAME} \
--set image.tag=${VERSION} \
-f ${VALUES_FILE}
kubectl -n actions-runner-system wait deploy/actions-runner-controller --for condition=available --timeout 60s
else
kubectl apply \
-n actions-runner-system \
-f release/actions-runner-controller.yaml
kubectl -n actions-runner-system wait deploy/controller-manager --for condition=available --timeout 120s
fi
# Adhocly wait for some time until actions-runner-controller's admission webhook gets ready
sleep 20
if [ -n "${TEST_REPO}" ]; then
cat acceptance/testdata/runnerdeploy.yaml | envsubst | kubectl apply -f -
cat acceptance/testdata/hra.yaml | envsubst | kubectl apply -f -
else
echo 'Skipped deploying runnerdeployment and hra. Set TEST_REPO to "yourorg/yourrepo" to deploy.'
fi
if [ -n "${TEST_ORG}" ]; then
cat acceptance/testdata/org.runnerdeploy.yaml | envsubst | kubectl apply -f -
if [ -n "${TEST_ORG_REPO}" ]; then
cat acceptance/testdata/org.hra.yaml | envsubst | kubectl apply -f -
else
echo 'Skipped deploying organizational hra. Set TEST_ORG_REPO to "yourorg/yourrepo" to deploy.'
fi
else
echo 'Skipped deploying organizational runnerdeployment. Set TEST_ORG to deploy.'
fi

10
acceptance/kind.yaml Normal file
View File

@@ -0,0 +1,10 @@
apiVersion: kind.x-k8s.io/v1alpha4
kind: Cluster
nodes:
- role: control-plane
extraPortMappings:
- containerPort: 31000
hostPort: 31000
listenAddress: "0.0.0.0"
protocol: tcp
#- role: worker

View File

@@ -0,0 +1,36 @@
name: EKS Integration Tests
on:
workflow_dispatch:
env:
IRSA_ROLE_ARN:
ASSUME_ROLE_ARN:
AWS_REGION:
jobs:
assume-role-in-runner-test:
runs-on: ['self-hosted', 'Linux']
steps:
- name: Test aws-actions/configure-aws-credentials Action
uses: aws-actions/configure-aws-credentials@v1
with:
aws-region: ${{ env.AWS_REGION }}
role-to-assume: ${{ env.ASSUME_ROLE_ARN }}
role-duration-seconds: 900
assume-role-in-container-test:
runs-on: ['self-hosted', 'Linux']
container:
image: amazon/aws-cli
env:
AWS_WEB_IDENTITY_TOKEN_FILE: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
AWS_ROLE_ARN: ${{ env.IRSA_ROLE_ARN }}
volumes:
- /var/run/secrets/eks.amazonaws.com/serviceaccount/token:/var/run/secrets/eks.amazonaws.com/serviceaccount/token
steps:
- name: Test aws-actions/configure-aws-credentials Action in container
uses: aws-actions/configure-aws-credentials@v1
with:
aws-region: ${{ env.AWS_REGION }}
role-to-assume: ${{ env.ASSUME_ROLE_ARN }}
role-duration-seconds: 900

View File

@@ -0,0 +1,83 @@
name: Runner Integration Tests
on:
workflow_dispatch:
env:
ImageOS: ubuntu18 # Used by ruby/setup-ruby action | Update me for the runner OS version you are testing against
jobs:
run-step-in-container-test:
runs-on: ['self-hosted', 'Linux']
container:
image: alpine
steps:
- name: Test we are working in the container
run: |
if [[ $(sed -n '2p' < /etc/os-release | cut -d "=" -f2) != "alpine" ]]; then
echo "::error ::Failed OS detection test, could not match /etc/os-release with alpine. Are we really running in the container?"
echo "/etc/os-release below:"
cat /etc/os-release
exit 1
fi
setup-python-test:
runs-on: ['self-hosted', 'Linux']
steps:
- name: Print native Python environment
run: |
which python
python --version
- uses: actions/setup-python@v2
with:
python-version: 3.9
- name: Test actions/setup-python works
run: |
VERSION=$(python --version 2>&1 | cut -d ' ' -f2 | cut -d '.' -f1-2)
if [[ $VERSION != '3.9' ]]; then
echo "Python version detected : $(python --version 2>&1)"
echo "::error ::Detected python failed setup version test, could not match version with version specified in the setup action"
exit 1
else
echo "Python version detected : $(python --version 2>&1)"
fi
setup-node-test:
runs-on: ['self-hosted', 'Linux']
steps:
- uses: actions/setup-node@v2
with:
node-version: '12'
- name: Test actions/setup-node works
run: |
VERSION=$(node --version | cut -c 2- | cut -d '.' -f1)
if [[ $VERSION != '12' ]]; then
echo "Node version detected : $(node --version 2>&1)"
echo "::error ::Detected node failed setup version test, could not match version with version specified in the setup action"
exit 1
else
echo "Node version detected : $(node --version 2>&1)"
fi
setup-ruby-test:
runs-on: ['self-hosted', 'Linux']
steps:
- uses: ruby/setup-ruby@v1
with:
ruby-version: 3.0
bundler-cache: true
- name: Test ruby/setup-ruby works
run: |
VERSION=$(ruby --version | cut -d ' ' -f2 | cut -d '.' -f1-2)
if [[ $VERSION != '3.0' ]]; then
echo "Ruby version detected : $(ruby --version 2>&1)"
echo "::error ::Detected ruby failed setup version test, could not match version with version specified in the setup action"
exit 1
else
echo "Ruby version detected : $(ruby --version 2>&1)"
fi
python-shell-test:
runs-on: ['self-hosted', 'Linux']
steps:
- name: Test Python shell works
run: |
import os
print(os.environ['PATH'])
shell: python

25
acceptance/testdata/hra.yaml vendored Normal file
View File

@@ -0,0 +1,25 @@
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
name: actions-runner-aos-autoscaler
spec:
scaleTargetRef:
name: example-runnerdeploy
scaleUpTriggers:
- githubEvent:
checkRun:
types: ["created"]
status: "queued"
amount: 1
duration: "1m"
minReplicas: 0
maxReplicas: 5
metrics:
- type: PercentageRunnersBusy
scaleUpThreshold: '0.75'
scaleDownThreshold: '0.3'
scaleUpFactor: '2'
scaleDownFactor: '0.5'
- type: TotalNumberOfQueuedAndInProgressWorkflowRuns
repositoryNames:
- ${TEST_REPO}

35
acceptance/testdata/org.hra.yaml vendored Normal file
View File

@@ -0,0 +1,35 @@
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
name: org
spec:
scaleTargetRef:
name: org-runnerdeploy
scaleUpTriggers:
- githubEvent:
checkRun:
types: ["created"]
status: "queued"
amount: 1
duration: "1m"
scheduledOverrides:
- startTime: "2021-05-11T16:05:00+09:00"
endTime: "2021-05-11T16:40:00+09:00"
minReplicas: 2
- startTime: "2021-05-01T00:00:00+09:00"
endTime: "2021-05-03T00:00:00+09:00"
recurrenceRule:
frequency: Weekly
untilTime: "2022-05-01T00:00:00+09:00"
minReplicas: 0
minReplicas: 0
maxReplicas: 5
metrics:
- type: PercentageRunnersBusy
scaleUpThreshold: '0.75'
scaleDownThreshold: '0.3'
scaleUpFactor: '2'
scaleDownFactor: '0.5'
- type: TotalNumberOfQueuedAndInProgressWorkflowRuns
repositoryNames:
- ${TEST_ORG_REPO}

View File

@@ -0,0 +1,37 @@
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
name: org-runnerdeploy
spec:
# replicas: 1
template:
spec:
organization: ${TEST_ORG}
#
# Custom runner image
#
image: ${RUNNER_NAME}:${RUNNER_TAG}
imagePullPolicy: IfNotPresent
#
# dockerd within runner container
#
## Replace `mumoshu/actions-runner-dind:dev` with your dind image
#dockerdWithinRunnerContainer: true
#image: mumoshu/actions-runner-dind:dev
#
# Set the MTU used by dockerd-managed network interfaces (including docker-build-ubuntu)
#
#dockerMTU: 1450
#Runner group
# labels:
# - "mylabel 1"
# - "mylabel 2"
#
# Non-standard working directory
#
# workDir: "/"

37
acceptance/testdata/runnerdeploy.yaml vendored Normal file
View File

@@ -0,0 +1,37 @@
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
name: example-runnerdeploy
spec:
# replicas: 1
template:
spec:
repository: ${TEST_REPO}
#
# Custom runner image
#
image: ${RUNNER_NAME}:${RUNNER_TAG}
imagePullPolicy: IfNotPresent
#
# dockerd within runner container
#
## Replace `mumoshu/actions-runner-dind:dev` with your dind image
#dockerdWithinRunnerContainer: true
#image: mumoshu/actions-runner-dind:dev
#
# Set the MTU used by dockerd-managed network interfaces (including docker-build-ubuntu)
#
#dockerMTU: 1450
#Runner group
# labels:
# - "mylabel 1"
# - "mylabel 2"
#
# Non-standard working directory
#
# workDir: "/"

20
acceptance/values.yaml Normal file
View File

@@ -0,0 +1,20 @@
# Set actions-runner-controller settings for testing
githubAPICacheDuration: 10s
githubWebhookServer:
enabled: true
labels: {}
replicaCount: 1
syncPeriod: 10m
secret:
create: true
name: "github-webhook-server"
### GitHub Webhook Configuration
#github_webhook_secret_token: ""
service:
type: NodePort
ports:
- port: 80
targetPort: http
protocol: TCP
name: http
nodePort: 31000

View File

@@ -41,6 +41,68 @@ type HorizontalRunnerAutoscalerSpec struct {
// Metrics is the collection of various metric targets to calculate desired number of runners // Metrics is the collection of various metric targets to calculate desired number of runners
// +optional // +optional
Metrics []MetricSpec `json:"metrics,omitempty"` Metrics []MetricSpec `json:"metrics,omitempty"`
// ScaleUpTriggers is an experimental feature to increase the desired replicas by 1
// on each webhook requested received by the webhookBasedAutoscaler.
//
// This feature requires you to also enable and deploy the webhookBasedAutoscaler onto your cluster.
//
// Note that the added runners remain until the next sync period at least,
// and they may or may not be used by GitHub Actions depending on the timing.
// They are intended to be used to gain "resource slack" immediately after you
// receive a webhook from GitHub, so that you can loosely expect MinReplicas runners to be always available.
ScaleUpTriggers []ScaleUpTrigger `json:"scaleUpTriggers,omitempty"`
CapacityReservations []CapacityReservation `json:"capacityReservations,omitempty" patchStrategy:"merge" patchMergeKey:"name"`
// ScheduledOverrides is the list of ScheduledOverride.
// It can be used to override a few fields of HorizontalRunnerAutoscalerSpec on schedule.
// The earlier a scheduled override is, the higher it is prioritized.
// +optional
ScheduledOverrides []ScheduledOverride `json:"scheduledOverrides,omitempty"`
}
type ScaleUpTrigger struct {
GitHubEvent *GitHubEventScaleUpTriggerSpec `json:"githubEvent,omitempty"`
Amount int `json:"amount,omitempty"`
Duration metav1.Duration `json:"duration,omitempty"`
}
type GitHubEventScaleUpTriggerSpec struct {
CheckRun *CheckRunSpec `json:"checkRun,omitempty"`
PullRequest *PullRequestSpec `json:"pullRequest,omitempty"`
Push *PushSpec `json:"push,omitempty"`
}
// https://docs.github.com/en/actions/reference/events-that-trigger-workflows#check_run
type CheckRunSpec struct {
Types []string `json:"types,omitempty"`
Status string `json:"status,omitempty"`
// Names is a list of GitHub Actions glob patterns.
// Any check_run event whose name matches one of patterns in the list can trigger autoscaling.
// Note that check_run name seem to equal to the job name you've defined in your actions workflow yaml file.
// So it is very likely that you can utilize this to trigger depending on the job.
Names []string `json:"names,omitempty"`
}
// https://docs.github.com/en/actions/reference/events-that-trigger-workflows#pull_request
type PullRequestSpec struct {
Types []string `json:"types,omitempty"`
Branches []string `json:"branches,omitempty"`
}
// PushSpec is the condition for triggering scale-up on push event
// Also see https://docs.github.com/en/actions/reference/events-that-trigger-workflows#push
type PushSpec struct {
}
// CapacityReservation specifies the number of replicas temporarily added
// to the scale target until ExpirationTime.
type CapacityReservation struct {
Name string `json:"name,omitempty"`
ExpirationTime metav1.Time `json:"expirationTime,omitempty"`
Replicas int `json:"replicas,omitempty"`
} }
type ScaleTargetRef struct { type ScaleTargetRef struct {
@@ -56,6 +118,70 @@ type MetricSpec struct {
// For example, a repository name is the REPO part of `github.com/USER/REPO`. // For example, a repository name is the REPO part of `github.com/USER/REPO`.
// +optional // +optional
RepositoryNames []string `json:"repositoryNames,omitempty"` RepositoryNames []string `json:"repositoryNames,omitempty"`
// ScaleUpThreshold is the percentage of busy runners greater than which will
// trigger the hpa to scale runners up.
// +optional
ScaleUpThreshold string `json:"scaleUpThreshold,omitempty"`
// ScaleDownThreshold is the percentage of busy runners less than which will
// trigger the hpa to scale the runners down.
// +optional
ScaleDownThreshold string `json:"scaleDownThreshold,omitempty"`
// ScaleUpFactor is the multiplicative factor applied to the current number of runners used
// to determine how many pods should be added.
// +optional
ScaleUpFactor string `json:"scaleUpFactor,omitempty"`
// ScaleDownFactor is the multiplicative factor applied to the current number of runners used
// to determine how many pods should be removed.
// +optional
ScaleDownFactor string `json:"scaleDownFactor,omitempty"`
// ScaleUpAdjustment is the number of runners added on scale-up.
// You can only specify either ScaleUpFactor or ScaleUpAdjustment.
// +optional
ScaleUpAdjustment int `json:"scaleUpAdjustment,omitempty"`
// ScaleDownAdjustment is the number of runners removed on scale-down.
// You can only specify either ScaleDownFactor or ScaleDownAdjustment.
// +optional
ScaleDownAdjustment int `json:"scaleDownAdjustment,omitempty"`
}
// ScheduledOverride can be used to override a few fields of HorizontalRunnerAutoscalerSpec on schedule.
// A schedule can optionally be recurring, so that the correspoding override happens every day, week, month, or year.
type ScheduledOverride struct {
// StartTime is the time at which the first override starts.
StartTime metav1.Time `json:"startTime"`
// EndTime is the time at which the first override ends.
EndTime metav1.Time `json:"endTime"`
// MinReplicas is the number of runners while overriding.
// If omitted, it doesn't override minReplicas.
// +optional
// +nullable
// +kubebuilder:validation:Minimum=0
MinReplicas *int `json:"minReplicas,omitempty"`
// +optional
RecurrenceRule RecurrenceRule `json:"recurrenceRule,omitempty"`
}
type RecurrenceRule struct {
// Frequency is the name of a predefined interval of each recurrence.
// The valid values are "Daily", "Weekly", "Monthly", and "Yearly".
// If empty, the corresponding override happens only once.
// +optional
// +kubebuilder:validation:Enum=Daily;Weekly;Monthly;Yearly
Frequency string `json:"frequency,omitempty"`
// UntilTime is the time of the final recurrence.
// If empty, the schedule recurs forever.
// +optional
UntilTime metav1.Time `json:"untilTime,omitempty"`
} }
type HorizontalRunnerAutoscalerStatus struct { type HorizontalRunnerAutoscalerStatus struct {
@@ -70,7 +196,24 @@ type HorizontalRunnerAutoscalerStatus struct {
DesiredReplicas *int `json:"desiredReplicas,omitempty"` DesiredReplicas *int `json:"desiredReplicas,omitempty"`
// +optional // +optional
// +nullable
LastSuccessfulScaleOutTime *metav1.Time `json:"lastSuccessfulScaleOutTime,omitempty"` LastSuccessfulScaleOutTime *metav1.Time `json:"lastSuccessfulScaleOutTime,omitempty"`
// +optional
CacheEntries []CacheEntry `json:"cacheEntries,omitempty"`
// ScheduledOverridesSummary is the summary of active and upcoming scheduled overrides to be shown in e.g. a column of a `kubectl get hra` output
// for observability.
// +optional
ScheduledOverridesSummary *string `json:"scheduledOverridesSummary,omitempty"`
}
const CacheEntryKeyDesiredReplicas = "desiredReplicas"
type CacheEntry struct {
Key string `json:"key,omitempty"`
Value int `json:"value,omitempty"`
ExpirationTime metav1.Time `json:"expirationTime,omitempty"`
} }
// +kubebuilder:object:root=true // +kubebuilder:object:root=true
@@ -78,6 +221,7 @@ type HorizontalRunnerAutoscalerStatus struct {
// +kubebuilder:printcolumn:JSONPath=".spec.minReplicas",name=Min,type=number // +kubebuilder:printcolumn:JSONPath=".spec.minReplicas",name=Min,type=number
// +kubebuilder:printcolumn:JSONPath=".spec.maxReplicas",name=Max,type=number // +kubebuilder:printcolumn:JSONPath=".spec.maxReplicas",name=Max,type=number
// +kubebuilder:printcolumn:JSONPath=".status.desiredReplicas",name=Desired,type=number // +kubebuilder:printcolumn:JSONPath=".status.desiredReplicas",name=Desired,type=number
// +kubebuilder:printcolumn:JSONPath=".status.scheduledOverridesSummary",name=Schedule,type=string
// HorizontalRunnerAutoscaler is the Schema for the horizontalrunnerautoscaler API // HorizontalRunnerAutoscaler is the Schema for the horizontalrunnerautoscaler API
type HorizontalRunnerAutoscaler struct { type HorizontalRunnerAutoscaler struct {

View File

@@ -19,12 +19,18 @@ package v1alpha1
import ( import (
"errors" "errors"
"k8s.io/apimachinery/pkg/api/resource"
corev1 "k8s.io/api/core/v1" corev1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
) )
// RunnerSpec defines the desired state of Runner // RunnerSpec defines the desired state of Runner
type RunnerSpec struct { type RunnerSpec struct {
// +optional
// +kubebuilder:validation:Pattern=`^[^/]+$`
Enterprise string `json:"enterprise,omitempty"`
// +optional // +optional
// +kubebuilder:validation:Pattern=`^[^/]+$` // +kubebuilder:validation:Pattern=`^[^/]+$`
Organization string `json:"organization,omitempty"` Organization string `json:"organization,omitempty"`
@@ -36,9 +42,19 @@ type RunnerSpec struct {
// +optional // +optional
Labels []string `json:"labels,omitempty"` Labels []string `json:"labels,omitempty"`
// +optional
Group string `json:"group,omitempty"`
// +optional
Ephemeral *bool `json:"ephemeral,omitempty"`
// +optional // +optional
Containers []corev1.Container `json:"containers,omitempty"` Containers []corev1.Container `json:"containers,omitempty"`
// +optional // +optional
DockerdContainerResources corev1.ResourceRequirements `json:"dockerdContainerResources,omitempty"`
// +optional
DockerVolumeMounts []corev1.VolumeMount `json:"dockerVolumeMounts,omitempty"`
// +optional
Resources corev1.ResourceRequirements `json:"resources,omitempty"` Resources corev1.ResourceRequirements `json:"resources,omitempty"`
// +optional // +optional
VolumeMounts []corev1.VolumeMount `json:"volumeMounts,omitempty"` VolumeMounts []corev1.VolumeMount `json:"volumeMounts,omitempty"`
@@ -54,6 +70,8 @@ type RunnerSpec struct {
// +optional // +optional
Volumes []corev1.Volume `json:"volumes,omitempty"` Volumes []corev1.Volume `json:"volumes,omitempty"`
// +optional
WorkDir string `json:"workDir,omitempty"`
// +optional // +optional
InitContainers []corev1.Container `json:"initContainers,omitempty"` InitContainers []corev1.Container `json:"initContainers,omitempty"`
@@ -77,16 +95,43 @@ type RunnerSpec struct {
EphemeralContainers []corev1.EphemeralContainer `json:"ephemeralContainers,omitempty"` EphemeralContainers []corev1.EphemeralContainer `json:"ephemeralContainers,omitempty"`
// +optional // +optional
TerminationGracePeriodSeconds *int64 `json:"terminationGracePeriodSeconds,omitempty"` TerminationGracePeriodSeconds *int64 `json:"terminationGracePeriodSeconds,omitempty"`
// +optional
DockerdWithinRunnerContainer *bool `json:"dockerdWithinRunnerContainer,omitempty"`
// +optional
DockerEnabled *bool `json:"dockerEnabled,omitempty"`
// +optional
DockerMTU *int64 `json:"dockerMTU,omitempty"`
// +optional
DockerRegistryMirror *string `json:"dockerRegistryMirror,omitempty"`
// +optional
HostAliases []corev1.HostAlias `json:"hostAliases,omitempty"`
// +optional
VolumeSizeLimit *resource.Quantity `json:"volumeSizeLimit,omitempty"`
// RuntimeClassName is the container runtime configuration that containers should run under.
// More info: https://kubernetes.io/docs/concepts/containers/runtime-class
// +optional
RuntimeClassName *string `json:"runtimeClassName,omitempty"`
} }
// ValidateRepository validates repository field. // ValidateRepository validates repository field.
func (rs *RunnerSpec) ValidateRepository() error { func (rs *RunnerSpec) ValidateRepository() error {
// Organization and repository are both exclusive. // Enterprise, Organization and repository are both exclusive.
if len(rs.Organization) == 0 && len(rs.Repository) == 0 { foundCount := 0
return errors.New("Spec needs organization or repository") if len(rs.Organization) > 0 {
foundCount += 1
} }
if len(rs.Organization) > 0 && len(rs.Repository) > 0 { if len(rs.Repository) > 0 {
return errors.New("Spec cannot have both organization and repository") foundCount += 1
}
if len(rs.Enterprise) > 0 {
foundCount += 1
}
if foundCount == 0 {
return errors.New("Spec needs enterprise, organization or repository")
}
if foundCount > 1 {
return errors.New("Spec cannot have many fields defined enterprise, organization and repository")
} }
return nil return nil
@@ -94,14 +139,22 @@ func (rs *RunnerSpec) ValidateRepository() error {
// RunnerStatus defines the observed state of Runner // RunnerStatus defines the observed state of Runner
type RunnerStatus struct { type RunnerStatus struct {
// +optional
Registration RunnerStatusRegistration `json:"registration"` Registration RunnerStatusRegistration `json:"registration"`
Phase string `json:"phase"` // +optional
Reason string `json:"reason"` Phase string `json:"phase,omitempty"`
Message string `json:"message"` // +optional
Reason string `json:"reason,omitempty"`
// +optional
Message string `json:"message,omitempty"`
// +optional
// +nullable
LastRegistrationCheckTime *metav1.Time `json:"lastRegistrationCheckTime,omitempty"`
} }
// RunnerStatusRegistration contains runner registration status // RunnerStatusRegistration contains runner registration status
type RunnerStatusRegistration struct { type RunnerStatusRegistration struct {
Enterprise string `json:"enterprise,omitempty"`
Organization string `json:"organization,omitempty"` Organization string `json:"organization,omitempty"`
Repository string `json:"repository,omitempty"` Repository string `json:"repository,omitempty"`
Labels []string `json:"labels,omitempty"` Labels []string `json:"labels,omitempty"`
@@ -111,10 +164,12 @@ type RunnerStatusRegistration struct {
// +kubebuilder:object:root=true // +kubebuilder:object:root=true
// +kubebuilder:subresource:status // +kubebuilder:subresource:status
// +kubebuilder:printcolumn:JSONPath=".spec.enterprise",name=Enterprise,type=string
// +kubebuilder:printcolumn:JSONPath=".spec.organization",name=Organization,type=string // +kubebuilder:printcolumn:JSONPath=".spec.organization",name=Organization,type=string
// +kubebuilder:printcolumn:JSONPath=".spec.repository",name=Repository,type=string // +kubebuilder:printcolumn:JSONPath=".spec.repository",name=Repository,type=string
// +kubebuilder:printcolumn:JSONPath=".spec.labels",name=Labels,type=string // +kubebuilder:printcolumn:JSONPath=".spec.labels",name=Labels,type=string
// +kubebuilder:printcolumn:JSONPath=".status.phase",name=Status,type=string // +kubebuilder:printcolumn:JSONPath=".status.phase",name=Status,type=string
// +kubebuilder:printcolumn:name="Age",type="date",JSONPath=".metadata.creationTimestamp"
// Runner is the Schema for the runners API // Runner is the Schema for the runners API
type Runner struct { type Runner struct {

View File

@@ -34,7 +34,7 @@ func (r *Runner) SetupWebhookWithManager(mgr ctrl.Manager) error {
Complete() Complete()
} }
// +kubebuilder:webhook:path=/mutate-actions-summerwind-dev-v1alpha1-runner,verbs=create;update,mutating=true,failurePolicy=fail,groups=actions.summerwind.dev,resources=runners,versions=v1alpha1,name=mutate.runner.actions.summerwind.dev // +kubebuilder:webhook:path=/mutate-actions-summerwind-dev-v1alpha1-runner,verbs=create;update,mutating=true,failurePolicy=fail,groups=actions.summerwind.dev,resources=runners,versions=v1alpha1,name=mutate.runner.actions.summerwind.dev,sideEffects=None
var _ webhook.Defaulter = &Runner{} var _ webhook.Defaulter = &Runner{}
@@ -43,7 +43,7 @@ func (r *Runner) Default() {
// Nothing to do. // Nothing to do.
} }
// +kubebuilder:webhook:path=/validate-actions-summerwind-dev-v1alpha1-runner,verbs=create;update,mutating=false,failurePolicy=fail,groups=actions.summerwind.dev,resources=runners,versions=v1alpha1,name=validate.runner.actions.summerwind.dev // +kubebuilder:webhook:path=/validate-actions-summerwind-dev-v1alpha1-runner,verbs=create;update,mutating=false,failurePolicy=fail,groups=actions.summerwind.dev,resources=runners,versions=v1alpha1,name=validate.runner.actions.summerwind.dev,sideEffects=None
var _ webhook.Validator = &Runner{} var _ webhook.Validator = &Runner{}

View File

@@ -22,31 +22,57 @@ import (
const ( const (
AutoscalingMetricTypeTotalNumberOfQueuedAndInProgressWorkflowRuns = "TotalNumberOfQueuedAndInProgressWorkflowRuns" AutoscalingMetricTypeTotalNumberOfQueuedAndInProgressWorkflowRuns = "TotalNumberOfQueuedAndInProgressWorkflowRuns"
AutoscalingMetricTypePercentageRunnersBusy = "PercentageRunnersBusy"
) )
// RunnerReplicaSetSpec defines the desired state of RunnerDeployment // RunnerDeploymentSpec defines the desired state of RunnerDeployment
type RunnerDeploymentSpec struct { type RunnerDeploymentSpec struct {
// +optional // +optional
// +nullable
Replicas *int `json:"replicas,omitempty"` Replicas *int `json:"replicas,omitempty"`
Template RunnerTemplate `json:"template"` // +optional
// +nullable
Selector *metav1.LabelSelector `json:"selector"`
Template RunnerTemplate `json:"template"`
} }
type RunnerDeploymentStatus struct { type RunnerDeploymentStatus struct {
AvailableReplicas int `json:"availableReplicas"` // See K8s deployment controller code for reference
ReadyReplicas int `json:"readyReplicas"` // https://github.com/kubernetes/kubernetes/blob/ea0764452222146c47ec826977f49d7001b0ea8c/pkg/controller/deployment/sync.go#L487-L505
// Replicas is the total number of desired, non-terminated and latest pods to be set for the primary RunnerSet // AvailableReplicas is the total number of available runners which have been successfully registered to GitHub and still running.
// This corresponds to the sum of status.availableReplicas of all the runner replica sets.
// +optional
AvailableReplicas *int `json:"availableReplicas"`
// ReadyReplicas is the total number of available runners which have been successfully registered to GitHub and still running.
// This corresponds to the sum of status.readyReplicas of all the runner replica sets.
// +optional
ReadyReplicas *int `json:"readyReplicas"`
// ReadyReplicas is the total number of available runners which have been successfully registered to GitHub and still running.
// This corresponds to status.replicas of the runner replica set that has the desired template hash.
// +optional
UpdatedReplicas *int `json:"updatedReplicas"`
// DesiredReplicas is the total number of desired, non-terminated and latest pods to be set for the primary RunnerSet
// This doesn't include outdated pods while upgrading the deployment and replacing the runnerset. // This doesn't include outdated pods while upgrading the deployment and replacing the runnerset.
// +optional // +optional
Replicas *int `json:"desiredReplicas,omitempty"` DesiredReplicas *int `json:"desiredReplicas"`
// Replicas is the total number of replicas
// +optional
Replicas *int `json:"replicas"`
} }
// +kubebuilder:object:root=true // +kubebuilder:object:root=true
// +kubebuilder:subresource:status // +kubebuilder:subresource:status
// +kubebuilder:printcolumn:JSONPath=".spec.replicas",name=Desired,type=number // +kubebuilder:printcolumn:JSONPath=".spec.replicas",name=Desired,type=number
// +kubebuilder:printcolumn:JSONPath=".status.availableReplicas",name=Current,type=number // +kubebuilder:printcolumn:JSONPath=".status.replicas",name=Current,type=number
// +kubebuilder:printcolumn:JSONPath=".status.readyReplicas",name=Ready,type=number // +kubebuilder:printcolumn:JSONPath=".status.updatedReplicas",name=Up-To-Date,type=number
// +kubebuilder:printcolumn:JSONPath=".status.availableReplicas",name=Available,type=number
// +kubebuilder:printcolumn:name="Age",type="date",JSONPath=".metadata.creationTimestamp"
// RunnerDeployment is the Schema for the runnerdeployments API // RunnerDeployment is the Schema for the runnerdeployments API
type RunnerDeployment struct { type RunnerDeployment struct {

View File

@@ -22,14 +22,30 @@ import (
// RunnerReplicaSetSpec defines the desired state of RunnerReplicaSet // RunnerReplicaSetSpec defines the desired state of RunnerReplicaSet
type RunnerReplicaSetSpec struct { type RunnerReplicaSetSpec struct {
Replicas *int `json:"replicas"` // +optional
// +nullable
Replicas *int `json:"replicas,omitempty"`
Template RunnerTemplate `json:"template"` // +optional
// +nullable
Selector *metav1.LabelSelector `json:"selector"`
Template RunnerTemplate `json:"template"`
} }
type RunnerReplicaSetStatus struct { type RunnerReplicaSetStatus struct {
AvailableReplicas int `json:"availableReplicas"` // See K8s replicaset controller code for reference
ReadyReplicas int `json:"readyReplicas"` // https://github.com/kubernetes/kubernetes/blob/ea0764452222146c47ec826977f49d7001b0ea8c/pkg/controller/replicaset/replica_set_utils.go#L101-L106
// Replicas is the number of runners that are created and still being managed by this runner replica set.
// +optional
Replicas *int `json:"replicas"`
// ReadyReplicas is the number of runners that are created and Runnning.
ReadyReplicas *int `json:"readyReplicas"`
// AvailableReplicas is the number of runners that are created and Runnning.
// This is currently same as ReadyReplicas but perserved for future use.
AvailableReplicas *int `json:"availableReplicas"`
} }
type RunnerTemplate struct { type RunnerTemplate struct {
@@ -41,8 +57,9 @@ type RunnerTemplate struct {
// +kubebuilder:object:root=true // +kubebuilder:object:root=true
// +kubebuilder:subresource:status // +kubebuilder:subresource:status
// +kubebuilder:printcolumn:JSONPath=".spec.replicas",name=Desired,type=number // +kubebuilder:printcolumn:JSONPath=".spec.replicas",name=Desired,type=number
// +kubebuilder:printcolumn:JSONPath=".status.availableReplicas",name=Current,type=number // +kubebuilder:printcolumn:JSONPath=".status.replicas",name=Current,type=number
// +kubebuilder:printcolumn:JSONPath=".status.readyReplicas",name=Ready,type=number // +kubebuilder:printcolumn:JSONPath=".status.readyReplicas",name=Ready,type=number
// +kubebuilder:printcolumn:name="Age",type="date",JSONPath=".metadata.creationTimestamp"
// RunnerReplicaSet is the Schema for the runnerreplicasets API // RunnerReplicaSet is the Schema for the runnerreplicasets API
type RunnerReplicaSet struct { type RunnerReplicaSet struct {

View File

@@ -34,7 +34,7 @@ func (r *RunnerReplicaSet) SetupWebhookWithManager(mgr ctrl.Manager) error {
Complete() Complete()
} }
// +kubebuilder:webhook:path=/mutate-actions-summerwind-dev-v1alpha1-runnerreplicaset,verbs=create;update,mutating=true,failurePolicy=fail,groups=actions.summerwind.dev,resources=runnerreplicasets,versions=v1alpha1,name=mutate.runnerreplicaset.actions.summerwind.dev // +kubebuilder:webhook:path=/mutate-actions-summerwind-dev-v1alpha1-runnerreplicaset,verbs=create;update,mutating=true,failurePolicy=fail,groups=actions.summerwind.dev,resources=runnerreplicasets,versions=v1alpha1,name=mutate.runnerreplicaset.actions.summerwind.dev,sideEffects=None
var _ webhook.Defaulter = &RunnerReplicaSet{} var _ webhook.Defaulter = &RunnerReplicaSet{}
@@ -43,7 +43,7 @@ func (r *RunnerReplicaSet) Default() {
// Nothing to do. // Nothing to do.
} }
// +kubebuilder:webhook:path=/validate-actions-summerwind-dev-v1alpha1-runnerreplicaset,verbs=create;update,mutating=false,failurePolicy=fail,groups=actions.summerwind.dev,resources=runnerreplicasets,versions=v1alpha1,name=validate.runnerreplicaset.actions.summerwind.dev // +kubebuilder:webhook:path=/validate-actions-summerwind-dev-v1alpha1-runnerreplicaset,verbs=create;update,mutating=false,failurePolicy=fail,groups=actions.summerwind.dev,resources=runnerreplicasets,versions=v1alpha1,name=validate.runnerreplicaset.actions.summerwind.dev,sideEffects=None
var _ webhook.Validator = &RunnerReplicaSet{} var _ webhook.Validator = &RunnerReplicaSet{}

View File

@@ -22,9 +22,97 @@ package v1alpha1
import ( import (
"k8s.io/api/core/v1" "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/runtime" "k8s.io/apimachinery/pkg/runtime"
) )
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *CacheEntry) DeepCopyInto(out *CacheEntry) {
*out = *in
in.ExpirationTime.DeepCopyInto(&out.ExpirationTime)
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new CacheEntry.
func (in *CacheEntry) DeepCopy() *CacheEntry {
if in == nil {
return nil
}
out := new(CacheEntry)
in.DeepCopyInto(out)
return out
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *CapacityReservation) DeepCopyInto(out *CapacityReservation) {
*out = *in
in.ExpirationTime.DeepCopyInto(&out.ExpirationTime)
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new CapacityReservation.
func (in *CapacityReservation) DeepCopy() *CapacityReservation {
if in == nil {
return nil
}
out := new(CapacityReservation)
in.DeepCopyInto(out)
return out
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *CheckRunSpec) DeepCopyInto(out *CheckRunSpec) {
*out = *in
if in.Types != nil {
in, out := &in.Types, &out.Types
*out = make([]string, len(*in))
copy(*out, *in)
}
if in.Names != nil {
in, out := &in.Names, &out.Names
*out = make([]string, len(*in))
copy(*out, *in)
}
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new CheckRunSpec.
func (in *CheckRunSpec) DeepCopy() *CheckRunSpec {
if in == nil {
return nil
}
out := new(CheckRunSpec)
in.DeepCopyInto(out)
return out
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *GitHubEventScaleUpTriggerSpec) DeepCopyInto(out *GitHubEventScaleUpTriggerSpec) {
*out = *in
if in.CheckRun != nil {
in, out := &in.CheckRun, &out.CheckRun
*out = new(CheckRunSpec)
(*in).DeepCopyInto(*out)
}
if in.PullRequest != nil {
in, out := &in.PullRequest, &out.PullRequest
*out = new(PullRequestSpec)
(*in).DeepCopyInto(*out)
}
if in.Push != nil {
in, out := &in.Push, &out.Push
*out = new(PushSpec)
**out = **in
}
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new GitHubEventScaleUpTriggerSpec.
func (in *GitHubEventScaleUpTriggerSpec) DeepCopy() *GitHubEventScaleUpTriggerSpec {
if in == nil {
return nil
}
out := new(GitHubEventScaleUpTriggerSpec)
in.DeepCopyInto(out)
return out
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil. // DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *HorizontalRunnerAutoscaler) DeepCopyInto(out *HorizontalRunnerAutoscaler) { func (in *HorizontalRunnerAutoscaler) DeepCopyInto(out *HorizontalRunnerAutoscaler) {
*out = *in *out = *in
@@ -110,6 +198,27 @@ func (in *HorizontalRunnerAutoscalerSpec) DeepCopyInto(out *HorizontalRunnerAuto
(*in)[i].DeepCopyInto(&(*out)[i]) (*in)[i].DeepCopyInto(&(*out)[i])
} }
} }
if in.ScaleUpTriggers != nil {
in, out := &in.ScaleUpTriggers, &out.ScaleUpTriggers
*out = make([]ScaleUpTrigger, len(*in))
for i := range *in {
(*in)[i].DeepCopyInto(&(*out)[i])
}
}
if in.CapacityReservations != nil {
in, out := &in.CapacityReservations, &out.CapacityReservations
*out = make([]CapacityReservation, len(*in))
for i := range *in {
(*in)[i].DeepCopyInto(&(*out)[i])
}
}
if in.ScheduledOverrides != nil {
in, out := &in.ScheduledOverrides, &out.ScheduledOverrides
*out = make([]ScheduledOverride, len(*in))
for i := range *in {
(*in)[i].DeepCopyInto(&(*out)[i])
}
}
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new HorizontalRunnerAutoscalerSpec. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new HorizontalRunnerAutoscalerSpec.
@@ -134,6 +243,18 @@ func (in *HorizontalRunnerAutoscalerStatus) DeepCopyInto(out *HorizontalRunnerAu
in, out := &in.LastSuccessfulScaleOutTime, &out.LastSuccessfulScaleOutTime in, out := &in.LastSuccessfulScaleOutTime, &out.LastSuccessfulScaleOutTime
*out = (*in).DeepCopy() *out = (*in).DeepCopy()
} }
if in.CacheEntries != nil {
in, out := &in.CacheEntries, &out.CacheEntries
*out = make([]CacheEntry, len(*in))
for i := range *in {
(*in)[i].DeepCopyInto(&(*out)[i])
}
}
if in.ScheduledOverridesSummary != nil {
in, out := &in.ScheduledOverridesSummary, &out.ScheduledOverridesSummary
*out = new(string)
**out = **in
}
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new HorizontalRunnerAutoscalerStatus. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new HorizontalRunnerAutoscalerStatus.
@@ -166,6 +287,62 @@ func (in *MetricSpec) DeepCopy() *MetricSpec {
return out return out
} }
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *PullRequestSpec) DeepCopyInto(out *PullRequestSpec) {
*out = *in
if in.Types != nil {
in, out := &in.Types, &out.Types
*out = make([]string, len(*in))
copy(*out, *in)
}
if in.Branches != nil {
in, out := &in.Branches, &out.Branches
*out = make([]string, len(*in))
copy(*out, *in)
}
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new PullRequestSpec.
func (in *PullRequestSpec) DeepCopy() *PullRequestSpec {
if in == nil {
return nil
}
out := new(PullRequestSpec)
in.DeepCopyInto(out)
return out
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *PushSpec) DeepCopyInto(out *PushSpec) {
*out = *in
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new PushSpec.
func (in *PushSpec) DeepCopy() *PushSpec {
if in == nil {
return nil
}
out := new(PushSpec)
in.DeepCopyInto(out)
return out
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *RecurrenceRule) DeepCopyInto(out *RecurrenceRule) {
*out = *in
in.UntilTime.DeepCopyInto(&out.UntilTime)
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new RecurrenceRule.
func (in *RecurrenceRule) DeepCopy() *RecurrenceRule {
if in == nil {
return nil
}
out := new(RecurrenceRule)
in.DeepCopyInto(out)
return out
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil. // DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *Runner) DeepCopyInto(out *Runner) { func (in *Runner) DeepCopyInto(out *Runner) {
*out = *in *out = *in
@@ -260,6 +437,11 @@ func (in *RunnerDeploymentSpec) DeepCopyInto(out *RunnerDeploymentSpec) {
*out = new(int) *out = new(int)
**out = **in **out = **in
} }
if in.Selector != nil {
in, out := &in.Selector, &out.Selector
*out = new(metav1.LabelSelector)
(*in).DeepCopyInto(*out)
}
in.Template.DeepCopyInto(&out.Template) in.Template.DeepCopyInto(&out.Template)
} }
@@ -276,6 +458,26 @@ func (in *RunnerDeploymentSpec) DeepCopy() *RunnerDeploymentSpec {
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil. // DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *RunnerDeploymentStatus) DeepCopyInto(out *RunnerDeploymentStatus) { func (in *RunnerDeploymentStatus) DeepCopyInto(out *RunnerDeploymentStatus) {
*out = *in *out = *in
if in.AvailableReplicas != nil {
in, out := &in.AvailableReplicas, &out.AvailableReplicas
*out = new(int)
**out = **in
}
if in.ReadyReplicas != nil {
in, out := &in.ReadyReplicas, &out.ReadyReplicas
*out = new(int)
**out = **in
}
if in.UpdatedReplicas != nil {
in, out := &in.UpdatedReplicas, &out.UpdatedReplicas
*out = new(int)
**out = **in
}
if in.DesiredReplicas != nil {
in, out := &in.DesiredReplicas, &out.DesiredReplicas
*out = new(int)
**out = **in
}
if in.Replicas != nil { if in.Replicas != nil {
in, out := &in.Replicas, &out.Replicas in, out := &in.Replicas, &out.Replicas
*out = new(int) *out = new(int)
@@ -331,7 +533,7 @@ func (in *RunnerReplicaSet) DeepCopyInto(out *RunnerReplicaSet) {
out.TypeMeta = in.TypeMeta out.TypeMeta = in.TypeMeta
in.ObjectMeta.DeepCopyInto(&out.ObjectMeta) in.ObjectMeta.DeepCopyInto(&out.ObjectMeta)
in.Spec.DeepCopyInto(&out.Spec) in.Spec.DeepCopyInto(&out.Spec)
out.Status = in.Status in.Status.DeepCopyInto(&out.Status)
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new RunnerReplicaSet. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new RunnerReplicaSet.
@@ -392,6 +594,11 @@ func (in *RunnerReplicaSetSpec) DeepCopyInto(out *RunnerReplicaSetSpec) {
*out = new(int) *out = new(int)
**out = **in **out = **in
} }
if in.Selector != nil {
in, out := &in.Selector, &out.Selector
*out = new(metav1.LabelSelector)
(*in).DeepCopyInto(*out)
}
in.Template.DeepCopyInto(&out.Template) in.Template.DeepCopyInto(&out.Template)
} }
@@ -408,6 +615,21 @@ func (in *RunnerReplicaSetSpec) DeepCopy() *RunnerReplicaSetSpec {
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil. // DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *RunnerReplicaSetStatus) DeepCopyInto(out *RunnerReplicaSetStatus) { func (in *RunnerReplicaSetStatus) DeepCopyInto(out *RunnerReplicaSetStatus) {
*out = *in *out = *in
if in.Replicas != nil {
in, out := &in.Replicas, &out.Replicas
*out = new(int)
**out = **in
}
if in.ReadyReplicas != nil {
in, out := &in.ReadyReplicas, &out.ReadyReplicas
*out = new(int)
**out = **in
}
if in.AvailableReplicas != nil {
in, out := &in.AvailableReplicas, &out.AvailableReplicas
*out = new(int)
**out = **in
}
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new RunnerReplicaSetStatus. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new RunnerReplicaSetStatus.
@@ -428,6 +650,11 @@ func (in *RunnerSpec) DeepCopyInto(out *RunnerSpec) {
*out = make([]string, len(*in)) *out = make([]string, len(*in))
copy(*out, *in) copy(*out, *in)
} }
if in.Ephemeral != nil {
in, out := &in.Ephemeral, &out.Ephemeral
*out = new(bool)
**out = **in
}
if in.Containers != nil { if in.Containers != nil {
in, out := &in.Containers, &out.Containers in, out := &in.Containers, &out.Containers
*out = make([]v1.Container, len(*in)) *out = make([]v1.Container, len(*in))
@@ -435,6 +662,14 @@ func (in *RunnerSpec) DeepCopyInto(out *RunnerSpec) {
(*in)[i].DeepCopyInto(&(*out)[i]) (*in)[i].DeepCopyInto(&(*out)[i])
} }
} }
in.DockerdContainerResources.DeepCopyInto(&out.DockerdContainerResources)
if in.DockerVolumeMounts != nil {
in, out := &in.DockerVolumeMounts, &out.DockerVolumeMounts
*out = make([]v1.VolumeMount, len(*in))
for i := range *in {
(*in)[i].DeepCopyInto(&(*out)[i])
}
}
in.Resources.DeepCopyInto(&out.Resources) in.Resources.DeepCopyInto(&out.Resources)
if in.VolumeMounts != nil { if in.VolumeMounts != nil {
in, out := &in.VolumeMounts, &out.VolumeMounts in, out := &in.VolumeMounts, &out.VolumeMounts
@@ -524,6 +759,43 @@ func (in *RunnerSpec) DeepCopyInto(out *RunnerSpec) {
*out = new(int64) *out = new(int64)
**out = **in **out = **in
} }
if in.DockerdWithinRunnerContainer != nil {
in, out := &in.DockerdWithinRunnerContainer, &out.DockerdWithinRunnerContainer
*out = new(bool)
**out = **in
}
if in.DockerEnabled != nil {
in, out := &in.DockerEnabled, &out.DockerEnabled
*out = new(bool)
**out = **in
}
if in.DockerMTU != nil {
in, out := &in.DockerMTU, &out.DockerMTU
*out = new(int64)
**out = **in
}
if in.DockerRegistryMirror != nil {
in, out := &in.DockerRegistryMirror, &out.DockerRegistryMirror
*out = new(string)
**out = **in
}
if in.HostAliases != nil {
in, out := &in.HostAliases, &out.HostAliases
*out = make([]v1.HostAlias, len(*in))
for i := range *in {
(*in)[i].DeepCopyInto(&(*out)[i])
}
}
if in.VolumeSizeLimit != nil {
in, out := &in.VolumeSizeLimit, &out.VolumeSizeLimit
x := (*in).DeepCopy()
*out = &x
}
if in.RuntimeClassName != nil {
in, out := &in.RuntimeClassName, &out.RuntimeClassName
*out = new(string)
**out = **in
}
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new RunnerSpec. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new RunnerSpec.
@@ -540,6 +812,10 @@ func (in *RunnerSpec) DeepCopy() *RunnerSpec {
func (in *RunnerStatus) DeepCopyInto(out *RunnerStatus) { func (in *RunnerStatus) DeepCopyInto(out *RunnerStatus) {
*out = *in *out = *in
in.Registration.DeepCopyInto(&out.Registration) in.Registration.DeepCopyInto(&out.Registration)
if in.LastRegistrationCheckTime != nil {
in, out := &in.LastRegistrationCheckTime, &out.LastRegistrationCheckTime
*out = (*in).DeepCopy()
}
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new RunnerStatus. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new RunnerStatus.
@@ -604,3 +880,47 @@ func (in *ScaleTargetRef) DeepCopy() *ScaleTargetRef {
in.DeepCopyInto(out) in.DeepCopyInto(out)
return out return out
} }
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *ScaleUpTrigger) DeepCopyInto(out *ScaleUpTrigger) {
*out = *in
if in.GitHubEvent != nil {
in, out := &in.GitHubEvent, &out.GitHubEvent
*out = new(GitHubEventScaleUpTriggerSpec)
(*in).DeepCopyInto(*out)
}
out.Duration = in.Duration
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ScaleUpTrigger.
func (in *ScaleUpTrigger) DeepCopy() *ScaleUpTrigger {
if in == nil {
return nil
}
out := new(ScaleUpTrigger)
in.DeepCopyInto(out)
return out
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *ScheduledOverride) DeepCopyInto(out *ScheduledOverride) {
*out = *in
in.StartTime.DeepCopyInto(&out.StartTime)
in.EndTime.DeepCopyInto(&out.EndTime)
if in.MinReplicas != nil {
in, out := &in.MinReplicas, &out.MinReplicas
*out = new(int)
**out = **in
}
in.RecurrenceRule.DeepCopyInto(&out.RecurrenceRule)
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ScheduledOverride.
func (in *ScheduledOverride) DeepCopy() *ScheduledOverride {
if in == nil {
return nil
}
out := new(ScheduledOverride)
in.DeepCopyInto(out)
return out
}

View File

@@ -0,0 +1,4 @@
# This file defines the config for "ct" (chart tester) used by the helm linting GitHub workflow
lint-conf: charts/.ci/lint-config.yaml
chart-repos:
- jetstack=https://charts.jetstack.io

View File

@@ -0,0 +1,6 @@
rules:
# One blank line is OK
empty-lines:
max-start: 1
max-end: 1
max: 1

View File

@@ -0,0 +1,3 @@
#!/bin/bash
docker run --rm -it -w /repo -v $(pwd):/repo quay.io/helmpack/chart-testing ct lint --all --config charts/.ci/ct-config.yaml

View File

@@ -0,0 +1,15 @@
#!/bin/bash
for chart in `ls charts`;
do
helm template --values charts/$chart/ci/ci-values.yaml charts/$chart | kube-score score - \
--ignore-test pod-networkpolicy \
--ignore-test deployment-has-poddisruptionbudget \
--ignore-test deployment-has-host-podantiaffinity \
--ignore-test pod-probes \
--ignore-test container-image-tag \
--enable-optional-test container-security-context-privileged \
--enable-optional-test container-security-context-readonlyrootfilesystem \
--ignore-test container-security-context
done

View File

@@ -0,0 +1,25 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/
# Docs
docs/

View File

@@ -0,0 +1,30 @@
apiVersion: v2
name: actions-runner-controller
description: A Kubernetes controller that operates self-hosted runners for GitHub Actions on your Kubernetes cluster.
# A chart can be either an 'application' or a 'library' chart.
#
# Application charts are a collection of templates that can be packaged into versioned archives
# to be deployed.
#
# Library charts provide useful utilities or functions for the chart developer. They're included as
# a dependency of application charts to inject those utilities and functions into the rendering
# pipeline. Library charts do not define any templates and therefore cannot be deployed.
type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.12.1
# Used as the default manager tag value when no tag property is provided in the values.yaml
appVersion: 0.19.0
home: https://github.com/actions-runner-controller/actions-runner-controller
sources:
- https://github.com/actions-runner-controller/actions-runner-controller
maintainers:
- name: actions-runner-controller
url: https://github.com/actions-runner-controller

View File

@@ -0,0 +1,81 @@
## Docs
All additional docs are kept in the `docs/` folder, this README is solely for documenting the values.yaml keys and values
## Values
_The values are documented as of HEAD_
_Default values are the defaults set in the charts values.yaml, some properties have default configurations in the code for when the property is omitted or invalid_
| Key | Description | Default |
|----------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------|
| `labels` | Set labels to apply to all resources in the chart | |
| `replicaCount` | Set the number of controller pods | 1 |
| `syncPeriod` | Set the period in which the controler reconciles the desired runners count | 10m |
| `githubAPICacheDuration` | Set the cache period for API calls | |
| `logLevel` | Set the log level of the controller container | |
| `authSecret.create` | Deploy the controller auth secret | true |
| `authSecret.name` | Set the name of the auth secret | controller-manager |
| `authSecret.github_app_id` | The ID of your GitHub App. **This can't be set at the same time as `authSecret.github_token`** | |
| `authSecret.github_app_installation_id` | The ID of your GitHub App installation. **This can't be set at the same time as `authSecret.github_token`** | |
| `authSecret.github_app_private_key` | The multiline string of your GitHub App's private key. **This can't be set at the same time as `authSecret.github_token`** | |
| `authSecret.github_token` | Your chosen GitHub PAT token. **This can't be set at the same time as the `authSecret.github_app_*`** | |
| `image.repository` | The "repository/image" of the controller container | summerwind/actions-runner-controller |
| `image.tag` | The tag of the controller container | |
| `image.dindSidecarRepositoryAndTag` | The "repository/image" of the dind sidecar container | docker:dind |
| `image.pullPolicy` | The pull policy of the controller image | IfNotPresent |
| `metrics.serviceMonitor` | Deploy serviceMonitor kind for for use with prometheus-operator CRDs | false |
| `metrics.port` | Set port of metrics service | 8443 |
| `metrics.proxy.enabled` | Deploy kube-rbac-proxy container in controller pod | true |
| `metrics.proxy.image.repository` | The "repository/image" of the kube-proxy container | quay.io/brancz/kube-rbac-proxy |
| `metrics.proxy.image.tag` | The tag of the kube-proxy image to use when pulling the container | v0.10.0 |
| `imagePullSecrets` | Specifies the secret to be used when pulling the controller pod containers | |
| `fullNameOverride` | Override the full resource names | |
| `nameOverride` | Override the resource name prefix | |
| `serviceAccont.annotations` | Set annotations to the service account | |
| `serviceAccount.create` | Deploy the controller pod under a service account | true |
| `podAnnotations` | Set annotations for the controller pod | |
| `podLabels` | Set labels for the controller pod | |
| `serviceAccount.name` | Set the name of the service account | |
| `securityContext` | Set the security context for each container in the controller pod | |
| `podSecurityContext` | Set the security context to controller pod | |
| `service.port` | Set controller service type | |
| `service.type` | Set controller service ports | |
| `topologySpreadConstraints` | Set the controller pod topologySpreadConstraints | |
| `nodeSelector` | Set the controller pod nodeSelector | |
| `resources` | Set the controller pod resources | |
| `affinity` | Set the controller pod affinity rules | |
| `tolerations` | Set the controller pod tolerations | |
| `env` | Set environment variables for the controller container | |
| `priorityClassName` | Set the controller pod priorityClassName | |
| `scope.watchNamespace` | Tells the controller which namespace to watch if `scope.singleNamespace` is true | |
| `scope.singleNamespace` | Limit the controller to watch a single namespace | false |
| `githubWebhookServer.logLevel` | Set the log level of the githubWebhookServer container | |
| `githubWebhookServer.replicaCount` | Set the number of webhook server pods | 1 |
| `githubWebhookServer.enabled` | Deploy the webhook server pod | false |
| `githubWebhookServer.secret.create` | Deploy the webhook hook secret | true |
| `githubWebhookServer.secret.name` | Set the name of the webhook hook secret | github-webhook-server |
| `githubWebhookServer.secret.github_webhook_secret_token` | Set the webhook secret token value | |
| `githubWebhookServer.imagePullSecrets` | Specifies the secret to be used when pulling the githubWebhookServer pod containers | |
| `githubWebhookServer.nameOveride` | Override the resource name prefix | |
| `githubWebhookServer.fullNameOveride` | Override the full resource names | |
| `githubWebhookServer.serviceAccount.create` | Deploy the githubWebhookServer under a service account | true |
| `githubWebhookServer.serviceAccount.annotations` | Set annotations for the service account | |
| `githubWebhookServer.serviceAccount.name` | Set the service account name | |
| `githubWebhookServer.podAnnotations` | Set annotations for the githubWebhookServer pod | |
| `githubWebhookServer.podLabels` | Set labels for the githubWebhookServer pod | |
| `githubWebhookServer.podSecurityContext` | Set the security context to githubWebhookServer pod | |
| `githubWebhookServer.securityContext` | Set the security context for each container in the githubWebhookServer pod | |
| `githubWebhookServer.resources` | Set the githubWebhookServer pod resources | |
| `githubWebhookServer.topologySpreadConstraints` | Set the githubWebhookServer pod topologySpreadConstraints | |
| `githubWebhookServer.nodeSelector` | Set the githubWebhookServer pod nodeSelector | |
| `githubWebhookServer.tolerations` | Set the githubWebhookServer pod tolerations | |
| `githubWebhookServer.affinity` | Set the githubWebhookServer pod affinity rules | |
| `githubWebhookServer.priorityClassName` | Set the githubWebhookServer pod priorityClassName | |
| `githubWebhookServer.service.type` | Set githubWebhookServer service type | |
| `githubWebhookServer.service.ports` | Set githubWebhookServer service ports | `[{"port":80, "targetPort:"http", "protocol":"TCP", "name":"http"}]` |
| `githubWebhookServer.ingress.enabled` | Deploy an ingress kind for the githubWebhookServer | false |
| `githubWebhookServer.ingress.annotations` | Set annotations for the ingress kind | |
| `githubWebhookServer.ingress.hosts` | Set hosts configuration for ingress | `[{"host": "chart-example.local", "paths": []}]` |
| `githubWebhookServer.ingress.tls` | Set tls configuration for ingress | |

View File

@@ -0,0 +1,30 @@
# This file sets some opinionated values for kube-score to use
# when parsing the chart
image:
pullPolicy: Always
podSecurityContext:
fsGroup: 2000
securityContext:
capabilities:
drop:
- ALL
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 2000
resources:
limits:
cpu: 100m
memory: 128Mi
requests:
cpu: 100m
memory: 128Mi
authSecret:
create: false
# Set the following to true to create a dummy secret, allowing the manager pod to start
# This is only useful in CI
createDummySecret: true

View File

@@ -0,0 +1,288 @@
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: v0.3.0
creationTimestamp: null
name: horizontalrunnerautoscalers.actions.summerwind.dev
spec:
additionalPrinterColumns:
- JSONPath: .spec.minReplicas
name: Min
type: number
- JSONPath: .spec.maxReplicas
name: Max
type: number
- JSONPath: .status.desiredReplicas
name: Desired
type: number
- JSONPath: .status.scheduledOverridesSummary
name: Schedule
type: string
group: actions.summerwind.dev
names:
kind: HorizontalRunnerAutoscaler
listKind: HorizontalRunnerAutoscalerList
plural: horizontalrunnerautoscalers
singular: horizontalrunnerautoscaler
scope: Namespaced
subresources:
status: {}
validation:
openAPIV3Schema:
description: HorizontalRunnerAutoscaler is the Schema for the horizontalrunnerautoscaler
API
properties:
apiVersion:
description: 'APIVersion defines the versioned schema of this representation
of an object. Servers should convert recognized schemas to the latest
internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
type: string
kind:
description: 'Kind is a string value representing the REST resource this
object represents. Servers may infer this from the endpoint the client
submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
type: string
metadata:
type: object
spec:
description: HorizontalRunnerAutoscalerSpec defines the desired state of
HorizontalRunnerAutoscaler
properties:
capacityReservations:
items:
description: CapacityReservation specifies the number of replicas
temporarily added to the scale target until ExpirationTime.
properties:
expirationTime:
format: date-time
type: string
name:
type: string
replicas:
type: integer
type: object
type: array
maxReplicas:
description: MinReplicas is the maximum number of replicas the deployment
is allowed to scale
type: integer
metrics:
description: Metrics is the collection of various metric targets to
calculate desired number of runners
items:
properties:
repositoryNames:
description: RepositoryNames is the list of repository names to
be used for calculating the metric. For example, a repository
name is the REPO part of `github.com/USER/REPO`.
items:
type: string
type: array
scaleDownAdjustment:
description: ScaleDownAdjustment is the number of runners removed
on scale-down. You can only specify either ScaleDownFactor or
ScaleDownAdjustment.
type: integer
scaleDownFactor:
description: ScaleDownFactor is the multiplicative factor applied
to the current number of runners used to determine how many
pods should be removed.
type: string
scaleDownThreshold:
description: ScaleDownThreshold is the percentage of busy runners
less than which will trigger the hpa to scale the runners down.
type: string
scaleUpAdjustment:
description: ScaleUpAdjustment is the number of runners added
on scale-up. You can only specify either ScaleUpFactor or ScaleUpAdjustment.
type: integer
scaleUpFactor:
description: ScaleUpFactor is the multiplicative factor applied
to the current number of runners used to determine how many
pods should be added.
type: string
scaleUpThreshold:
description: ScaleUpThreshold is the percentage of busy runners
greater than which will trigger the hpa to scale runners up.
type: string
type:
description: Type is the type of metric to be used for autoscaling.
The only supported Type is TotalNumberOfQueuedAndInProgressWorkflowRuns
type: string
type: object
type: array
minReplicas:
description: MinReplicas is the minimum number of replicas the deployment
is allowed to scale
type: integer
scaleDownDelaySecondsAfterScaleOut:
description: ScaleDownDelaySecondsAfterScaleUp is the approximate delay
for a scale down followed by a scale up Used to prevent flapping (down->up->down->...
loop)
type: integer
scaleTargetRef:
description: ScaleTargetRef sis the reference to scaled resource like
RunnerDeployment
properties:
name:
type: string
type: object
scaleUpTriggers:
description: "ScaleUpTriggers is an experimental feature to increase
the desired replicas by 1 on each webhook requested received by the
webhookBasedAutoscaler. \n This feature requires you to also enable
and deploy the webhookBasedAutoscaler onto your cluster. \n Note that
the added runners remain until the next sync period at least, and
they may or may not be used by GitHub Actions depending on the timing.
They are intended to be used to gain \"resource slack\" immediately
after you receive a webhook from GitHub, so that you can loosely expect
MinReplicas runners to be always available."
items:
properties:
amount:
type: integer
duration:
type: string
githubEvent:
properties:
checkRun:
description: https://docs.github.com/en/actions/reference/events-that-trigger-workflows#check_run
properties:
names:
description: Names is a list of GitHub Actions glob patterns.
Any check_run event whose name matches one of patterns
in the list can trigger autoscaling. Note that check_run
name seem to equal to the job name you've defined in
your actions workflow yaml file. So it is very likely
that you can utilize this to trigger depending on the
job.
items:
type: string
type: array
status:
type: string
types:
items:
type: string
type: array
type: object
pullRequest:
description: https://docs.github.com/en/actions/reference/events-that-trigger-workflows#pull_request
properties:
branches:
items:
type: string
type: array
types:
items:
type: string
type: array
type: object
push:
description: PushSpec is the condition for triggering scale-up
on push event Also see https://docs.github.com/en/actions/reference/events-that-trigger-workflows#push
type: object
type: object
type: object
type: array
scheduledOverrides:
description: ScheduledOverrides is the list of ScheduledOverride. It
can be used to override a few fields of HorizontalRunnerAutoscalerSpec
on schedule. The earlier a scheduled override is, the higher it is
prioritized.
items:
description: ScheduledOverride can be used to override a few fields
of HorizontalRunnerAutoscalerSpec on schedule. A schedule can optionally
be recurring, so that the correspoding override happens every day,
week, month, or year.
properties:
endTime:
description: EndTime is the time at which the first override ends.
format: date-time
type: string
minReplicas:
description: MinReplicas is the number of runners while overriding.
If omitted, it doesn't override minReplicas.
minimum: 0
nullable: true
type: integer
recurrenceRule:
properties:
frequency:
description: Frequency is the name of a predefined interval
of each recurrence. The valid values are "Daily", "Weekly",
"Monthly", and "Yearly". If empty, the corresponding override
happens only once.
enum:
- Daily
- Weekly
- Monthly
- Yearly
type: string
untilTime:
description: UntilTime is the time of the final recurrence.
If empty, the schedule recurs forever.
format: date-time
type: string
type: object
startTime:
description: StartTime is the time at which the first override
starts.
format: date-time
type: string
required:
- endTime
- startTime
type: object
type: array
type: object
status:
properties:
cacheEntries:
items:
properties:
expirationTime:
format: date-time
type: string
key:
type: string
value:
type: integer
type: object
type: array
desiredReplicas:
description: DesiredReplicas is the total number of desired, non-terminated
and latest pods to be set for the primary RunnerSet This doesn't include
outdated pods while upgrading the deployment and replacing the runnerset.
type: integer
lastSuccessfulScaleOutTime:
format: date-time
nullable: true
type: string
observedGeneration:
description: ObservedGeneration is the most recent generation observed
for the target. It corresponds to e.g. RunnerDeployment's generation,
which is updated on mutation by the API Server.
format: int64
type: integer
scheduledOverridesSummary:
description: ScheduledOverridesSummary is the summary of active and
upcoming scheduled overrides to be shown in e.g. a column of a `kubectl
get hra` output for observability.
type: string
type: object
type: object
version: v1alpha1
versions:
- name: v1alpha1
served: true
storage: true
status:
acceptedNames:
kind: ""
plural: ""
conditions: []
storedVersions: []

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,40 @@
## Upgrading
This project makes extensive use of CRDs to provide much of its functionality. Helm unfortunately does not support [managing](https://helm.sh/docs/chart_best_practices/custom_resource_definitions/) CRDs by design:
_The full breakdown as to how they came to this decision and why they have taken the approach they have for dealing with CRDs can be found in [Helm Improvement Proposal 11](https://github.com/helm/community/blob/main/hips/hip-0011.md)_
```
There is no support at this time for upgrading or deleting CRDs using Helm. This was an explicit decision after much
community discussion due to the danger for unintentional data loss. Furthermore, there is currently no community
consensus around how to handle CRDs and their lifecycle. As this evolves, Helm will add support for those use cases.
```
Helm will do an initial install of CRDs but it will not touch them afterwards (update or delete).
Additionally, because the project leverages CRDs so extensively you **MUST** run the matching controller app container with its matching CRDs i.e. always redeploy your CRDs if you are changing the app version.
Due to the above you can't just do a `helm upgrade` to release the latest version of the chart, the best practice steps are recorded below:
## Steps
1. Upgrade CRDs
```shell
# REMEMBER TO UPDATE THE CHART_VERSION TO RELEVANT CHART VERISON!!!!
CHART_VERSION=0.11.0
curl -L https://github.com/actions-runner-controller/actions-runner-controller/releases/download/actions-runner-controller-${CHART_VERSION}/actions-runner-controller-${CHART_VERSION}.tgz | tar zxv --strip 1 actions-runner-controller/crds
kubectl apply -f crds/
```
2. Upgrade the Helm release
```shell
helm upgrade --install \
--namespace actions-runner-system \
--version ${CHART_VERSION} \
actions-runner-controller/actions-runner-controller \
actions-runner-controller
```

View File

@@ -0,0 +1,22 @@
1. Get the application URL by running these commands:
{{- if .Values.githubWebhookServer.ingress.enabled }}
{{- range $host := .Values.githubWebhookServer.ingress.hosts }}
{{- range .paths }}
http{{ if $.Values.githubWebhookServer.ingress.tls }}s{{ end }}://{{ $host.host }}{{ . }}
{{- end }}
{{- end }}
{{- else if contains "NodePort" .Values.service.type }}
export NODE_PORT=$(kubectl get --namespace {{ .Release.Namespace }} -o jsonpath="{.spec.ports[0].nodePort}" services {{ include "actions-runner-controller.fullname" . }})
export NODE_IP=$(kubectl get nodes --namespace {{ .Release.Namespace }} -o jsonpath="{.items[0].status.addresses[0].address}")
echo http://$NODE_IP:$NODE_PORT
{{- else if contains "LoadBalancer" .Values.service.type }}
NOTE: It may take a few minutes for the LoadBalancer IP to be available.
You can watch the status of by running 'kubectl get --namespace {{ .Release.Namespace }} svc -w {{ include "actions-runner-controller.fullname" . }}'
export SERVICE_IP=$(kubectl get svc --namespace {{ .Release.Namespace }} {{ include "actions-runner-controller.fullname" . }} --template "{{"{{ range (index .status.loadBalancer.ingress 0) }}{{.}}{{ end }}"}}")
echo http://$SERVICE_IP:{{ .Values.service.port }}
{{- else if contains "ClusterIP" .Values.service.type }}
export POD_NAME=$(kubectl get pods --namespace {{ .Release.Namespace }} -l "app.kubernetes.io/name={{ include "actions-runner-controller.name" . }},app.kubernetes.io/instance={{ .Release.Name }}" -o jsonpath="{.items[0].metadata.name}")
export CONTAINER_PORT=$(kubectl get pod --namespace {{ .Release.Namespace }} $POD_NAME -o jsonpath="{.spec.containers[0].ports[0].containerPort}")
echo "Visit http://127.0.0.1:8080 to use your application"
kubectl --namespace {{ .Release.Namespace }} port-forward $POD_NAME 8080:$CONTAINER_PORT
{{- end }}

View File

@@ -0,0 +1,60 @@
{{/*
Expand the name of the chart.
*/}}
{{- define "actions-runner-controller-github-webhook-server.name" -}}
{{- default .Chart.Name .Values.githubWebhookServer.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}
{{- define "actions-runner-controller-github-webhook-server.instance" -}}
{{- printf "%s-%s" .Release.Name "github-webhook-server" }}
{{- end }}
{{/*
Create a default fully qualified app name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
If release name contains chart name it will be used as a full name.
*/}}
{{- define "actions-runner-controller-github-webhook-server.fullname" -}}
{{- if .Values.githubWebhookServer.fullnameOverride }}
{{- .Values.githubWebhookServer.fullnameOverride | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- $name := default .Chart.Name .Values.githubWebhookServer.nameOverride }}
{{- $instance := include "actions-runner-controller-github-webhook-server.instance" . }}
{{- if contains $name $instance }}
{{- $instance | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- printf "%s-%s-%s" .Release.Name $name "github-webhook-server" | trunc 63 | trimSuffix "-" }}
{{- end }}
{{- end }}
{{- end }}
{{/*
Selector labels
*/}}
{{- define "actions-runner-controller-github-webhook-server.selectorLabels" -}}
app.kubernetes.io/name: {{ include "actions-runner-controller-github-webhook-server.name" . }}
app.kubernetes.io/instance: {{ include "actions-runner-controller-github-webhook-server.instance" . }}
{{- end }}
{{/*
Create the name of the service account to use
*/}}
{{- define "actions-runner-controller-github-webhook-server.serviceAccountName" -}}
{{- if .Values.githubWebhookServer.serviceAccount.create }}
{{- default (include "actions-runner-controller-github-webhook-server.fullname" .) .Values.githubWebhookServer.serviceAccount.name }}
{{- else }}
{{- default "default" .Values.githubWebhookServer.serviceAccount.name }}
{{- end }}
{{- end }}
{{- define "actions-runner-controller-github-webhook-server.secretName" -}}
{{- default (include "actions-runner-controller-github-webhook-server.fullname" .) .Values.githubWebhookServer.secret.name }}
{{- end }}
{{- define "actions-runner-controller-github-webhook-server.roleName" -}}
{{- include "actions-runner-controller-github-webhook-server.fullname" . }}
{{- end }}
{{- define "actions-runner-controller-github-webhook-server.serviceMonitorName" -}}
{{- include "actions-runner-controller-github-webhook-server.fullname" . | trunc 47 }}-service-monitor
{{- end }}

View File

@@ -0,0 +1,109 @@
{{/*
Expand the name of the chart.
*/}}
{{- define "actions-runner-controller.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
Create a default fully qualified app name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
If release name contains chart name it will be used as a full name.
*/}}
{{- define "actions-runner-controller.fullname" -}}
{{- if .Values.fullnameOverride }}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- $name := default .Chart.Name .Values.nameOverride }}
{{- if contains $name .Release.Name }}
{{- .Release.Name | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }}
{{- end }}
{{- end }}
{{- end }}
{{/*
Create chart name and version as used by the chart label.
*/}}
{{- define "actions-runner-controller.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
Common labels
*/}}
{{- define "actions-runner-controller.labels" -}}
helm.sh/chart: {{ include "actions-runner-controller.chart" . }}
{{ include "actions-runner-controller.selectorLabels" . }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- range $k, $v := .Values.labels }}
{{ $k }}: {{ $v }}
{{- end }}
{{- end }}
{{/*
Selector labels
*/}}
{{- define "actions-runner-controller.selectorLabels" -}}
app.kubernetes.io/name: {{ include "actions-runner-controller.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}
{{/*
Create the name of the service account to use
*/}}
{{- define "actions-runner-controller.serviceAccountName" -}}
{{- if .Values.serviceAccount.create }}
{{- default (include "actions-runner-controller.fullname" .) .Values.serviceAccount.name }}
{{- else }}
{{- default "default" .Values.serviceAccount.name }}
{{- end }}
{{- end }}
{{- define "actions-runner-controller.secretName" -}}
{{- default (include "actions-runner-controller.fullname" .) .Values.authSecret.name -}}
{{- end }}
{{- define "actions-runner-controller.leaderElectionRoleName" -}}
{{- include "actions-runner-controller.fullname" . }}-leader-election
{{- end }}
{{- define "actions-runner-controller.authProxyRoleName" -}}
{{- include "actions-runner-controller.fullname" . }}-proxy
{{- end }}
{{- define "actions-runner-controller.managerRoleName" -}}
{{- include "actions-runner-controller.fullname" . }}-manager
{{- end }}
{{- define "actions-runner-controller.runnerEditorRoleName" -}}
{{- include "actions-runner-controller.fullname" . }}-runner-editor
{{- end }}
{{- define "actions-runner-controller.runnerViewerRoleName" -}}
{{- include "actions-runner-controller.fullname" . }}-runner-viewer
{{- end }}
{{- define "actions-runner-controller.webhookServiceName" -}}
{{- include "actions-runner-controller.fullname" . | trunc 55 }}-webhook
{{- end }}
{{- define "actions-runner-controller.metricsServiceName" -}}
{{- include "actions-runner-controller.fullname" . | trunc 47 }}-metrics-service
{{- end }}
{{- define "actions-runner-controller.serviceMonitorName" -}}
{{- include "actions-runner-controller.fullname" . | trunc 47 }}-service-monitor
{{- end }}
{{- define "actions-runner-controller.selfsignedIssuerName" -}}
{{- include "actions-runner-controller.fullname" . }}-selfsigned-issuer
{{- end }}
{{- define "actions-runner-controller.servingCertName" -}}
{{- include "actions-runner-controller.fullname" . }}-serving-cert
{{- end }}

View File

@@ -0,0 +1,15 @@
{{- if .Values.metrics.proxy.enabled }}
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: {{ include "actions-runner-controller.authProxyRoleName" . }}
rules:
- apiGroups: ["authentication.k8s.io"]
resources:
- tokenreviews
verbs: ["create"]
- apiGroups: ["authorization.k8s.io"]
resources:
- subjectaccessreviews
verbs: ["create"]
{{- end }}

View File

@@ -0,0 +1,14 @@
{{- if .Values.metrics.proxy.enabled }}
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: {{ include "actions-runner-controller.authProxyRoleName" . }}
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: {{ include "actions-runner-controller.authProxyRoleName" . }}
subjects:
- kind: ServiceAccount
name: {{ include "actions-runner-controller.serviceAccountName" . }}
namespace: {{ .Release.Namespace }}
{{- end }}

View File

@@ -0,0 +1,24 @@
# The following manifests contain a self-signed issuer CR and a certificate CR.
# More document can be found at https://docs.cert-manager.io
# WARNING: Targets CertManager 0.11 check https://docs.cert-manager.io/en/latest/tasks/upgrading/index.html for breaking changes
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
name: {{ include "actions-runner-controller.selfsignedIssuerName" . }}
namespace: {{ .Release.Namespace }}
spec:
selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: {{ include "actions-runner-controller.servingCertName" . }}
namespace: {{ .Release.Namespace }}
spec:
dnsNames:
- {{ include "actions-runner-controller.webhookServiceName" . }}.{{ .Release.Namespace }}.svc
- {{ include "actions-runner-controller.webhookServiceName" . }}.{{ .Release.Namespace }}.svc.cluster.local
issuerRef:
kind: Issuer
name: {{ include "actions-runner-controller.selfsignedIssuerName" . }}
secretName: webhook-server-cert # this secret will not be prefixed, since it's not managed by kustomize

View File

@@ -0,0 +1,10 @@
# This template only exists to facilitate CI testing of the chart, since
# a secret is expected to be found in the namespace by the controller manager
{{ if .Values.createDummySecret -}}
apiVersion: v1
data:
github_token: dGVzdA==
kind: Secret
metadata:
name: controller-manager
{{- end }}

View File

@@ -0,0 +1,14 @@
apiVersion: v1
kind: Service
metadata:
labels:
{{- include "actions-runner-controller.labels" . | nindent 4 }}
name: {{ include "actions-runner-controller.metricsServiceName" . }}
namespace: {{ .Release.Namespace }}
spec:
ports:
- name: metrics-port
port: {{ .Values.metrics.port }}
targetPort: metrics-port
selector:
{{- include "actions-runner-controller.selectorLabels" . | nindent 4 }}

View File

@@ -0,0 +1,15 @@
{{- if .Values.metrics.serviceMonitor }}
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
labels:
{{- include "actions-runner-controller.labels" . | nindent 4 }}
name: {{ include "actions-runner-controller.serviceMonitorName" . }}
spec:
endpoints:
- path: /metrics
port: metrics-port
selector:
matchLabels:
{{- include "actions-runner-controller.selectorLabels" . | nindent 6 }}
{{- end }}

View File

@@ -0,0 +1,147 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "actions-runner-controller.fullname" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "actions-runner-controller.labels" . | nindent 4 }}
spec:
replicas: {{ .Values.replicaCount }}
selector:
matchLabels:
{{- include "actions-runner-controller.selectorLabels" . | nindent 6 }}
template:
metadata:
{{- with .Values.podAnnotations }}
annotations:
{{- toYaml . | nindent 8 }}
{{- end }}
labels:
{{- include "actions-runner-controller.selectorLabels" . | nindent 8 }}
{{- with .Values.podLabels }}
{{- toYaml . | nindent 8 }}
{{- end }}
spec:
{{- with .Values.imagePullSecrets }}
imagePullSecrets:
{{- toYaml . | nindent 8 }}
{{- end }}
serviceAccountName: {{ include "actions-runner-controller.serviceAccountName" . }}
securityContext:
{{- toYaml .Values.podSecurityContext | nindent 8 }}
{{- with .Values.priorityClassName }}
priorityClassName: "{{ . }}"
{{- end }}
containers:
- args:
{{- $metricsHost := .Values.metrics.proxy.enabled | ternary "127.0.0.1" "0.0.0.0" }}
{{- $metricsPort := .Values.metrics.proxy.enabled | ternary "8080" .Values.metrics.port }}
- "--metrics-addr={{ $metricsHost }}:{{ $metricsPort }}"
- "--enable-leader-election"
- "--sync-period={{ .Values.syncPeriod }}"
- "--docker-image={{ .Values.image.dindSidecarRepositoryAndTag }}"
{{- if .Values.scope.singleNamespace }}
- "--watch-namespace={{ default .Release.Namespace .Values.scope.watchNamespace }}"
{{- end }}
{{- if .Values.githubAPICacheDuration }}
- "--github-api-cache-duration={{ .Values.githubAPICacheDuration }}"
{{- end }}
{{- if .Values.logLevel }}
- "--log-level={{ .Values.logLevel }}"
{{- end }}
command:
- "/manager"
env:
- name: GITHUB_TOKEN
valueFrom:
secretKeyRef:
key: github_token
name: {{ include "actions-runner-controller.secretName" . }}
optional: true
- name: GITHUB_APP_ID
valueFrom:
secretKeyRef:
key: github_app_id
name: {{ include "actions-runner-controller.secretName" . }}
optional: true
- name: GITHUB_APP_INSTALLATION_ID
valueFrom:
secretKeyRef:
key: github_app_installation_id
name: {{ include "actions-runner-controller.secretName" . }}
optional: true
- name: GITHUB_APP_PRIVATE_KEY
value: /etc/actions-runner-controller/github_app_private_key
{{- range $key, $val := .Values.env }}
- name: {{ $key }}
value: {{ $val | quote }}
{{- end }}
image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default (cat "v" .Chart.AppVersion | replace " " "") }}"
name: manager
imagePullPolicy: {{ .Values.image.pullPolicy }}
ports:
- containerPort: 9443
name: webhook-server
protocol: TCP
{{- if not .Values.metrics.proxy.enabled }}
- containerPort: {{ .Values.metrics.port }}
name: metrics-port
protocol: TCP
{{- end }}
resources:
{{- toYaml .Values.resources | nindent 12 }}
securityContext:
{{- toYaml .Values.securityContext | nindent 12 }}
volumeMounts:
- mountPath: "/etc/actions-runner-controller"
name: secret
readOnly: true
- mountPath: /tmp
name: tmp
- mountPath: /tmp/k8s-webhook-server/serving-certs
name: cert
readOnly: true
{{- if .Values.metrics.proxy.enabled }}
- args:
- "--secure-listen-address=0.0.0.0:{{ .Values.metrics.port }}"
- "--upstream=http://127.0.0.1:8080/"
- "--logtostderr=true"
- "--v=10"
image: "{{ .Values.metrics.proxy.image.repository }}:{{ .Values.metrics.proxy.image.tag }}"
name: kube-rbac-proxy
imagePullPolicy: {{ .Values.image.pullPolicy }}
ports:
- containerPort: {{ .Values.metrics.port }}
name: metrics-port
resources:
{{- toYaml .Values.resources | nindent 12 }}
securityContext:
{{- toYaml .Values.securityContext | nindent 12 }}
{{- end }}
terminationGracePeriodSeconds: 10
volumes:
- name: secret
secret:
secretName: {{ include "actions-runner-controller.secretName" . }}
- name: cert
secret:
defaultMode: 420
secretName: webhook-server-cert
- name: tmp
emptyDir: {}
{{- with .Values.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.affinity }}
affinity:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.tolerations }}
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.topologySpreadConstraints }}
topologySpreadConstraints:
{{- toYaml . | nindent 8 }}
{{- end }}

View File

@@ -0,0 +1,108 @@
{{- if .Values.githubWebhookServer.enabled }}
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "actions-runner-controller-github-webhook-server.fullname" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "actions-runner-controller.labels" . | nindent 4 }}
spec:
replicas: {{ .Values.githubWebhookServer.replicaCount }}
selector:
matchLabels:
{{- include "actions-runner-controller-github-webhook-server.selectorLabels" . | nindent 6 }}
template:
metadata:
{{- with .Values.githubWebhookServer.podAnnotations }}
annotations:
{{- toYaml . | nindent 8 }}
{{- end }}
labels:
{{- include "actions-runner-controller-github-webhook-server.selectorLabels" . | nindent 8 }}
{{- with .Values.githubWebhookServer.podLabels }}
{{- toYaml . | nindent 8 }}
{{- end }}
spec:
{{- with .Values.githubWebhookServer.imagePullSecrets }}
imagePullSecrets:
{{- toYaml . | nindent 8 }}
{{- end }}
serviceAccountName: {{ include "actions-runner-controller-github-webhook-server.serviceAccountName" . }}
securityContext:
{{- toYaml .Values.githubWebhookServer.podSecurityContext | nindent 8 }}
{{- with .Values.githubWebhookServer.priorityClassName }}
priorityClassName: "{{ . }}"
{{- end }}
containers:
- args:
{{- $metricsHost := .Values.metrics.proxy.enabled | ternary "127.0.0.1" "0.0.0.0" }}
{{- $metricsPort := .Values.metrics.proxy.enabled | ternary "8080" .Values.metrics.port }}
- "--metrics-addr={{ $metricsHost }}:{{ $metricsPort }}"
- "--sync-period={{ .Values.githubWebhookServer.syncPeriod }}"
{{- if .Values.githubWebhookServer.logLevel }}
- "--log-level={{ .Values.githubWebhookServer.logLevel }}"
{{- end }}
command:
- "/github-webhook-server"
env:
- name: GITHUB_WEBHOOK_SECRET_TOKEN
valueFrom:
secretKeyRef:
key: github_webhook_secret_token
name: {{ include "actions-runner-controller-github-webhook-server.secretName" . }}
optional: true
{{- range $key, $val := .Values.githubWebhookServer.env }}
- name: {{ $key }}
value: {{ $val | quote }}
{{- end }}
image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default (cat "v" .Chart.AppVersion | replace " " "") }}"
name: github-webhook-server
imagePullPolicy: {{ .Values.image.pullPolicy }}
ports:
- containerPort: 8000
name: http
protocol: TCP
{{- if not .Values.metrics.proxy.enabled }}
- containerPort: {{ .Values.metrics.port }}
name: metrics-port
protocol: TCP
{{- end }}
resources:
{{- toYaml .Values.githubWebhookServer.resources | nindent 12 }}
securityContext:
{{- toYaml .Values.githubWebhookServer.securityContext | nindent 12 }}
{{- if .Values.metrics.proxy.enabled }}
- args:
- "--secure-listen-address=0.0.0.0:{{ .Values.metrics.port }}"
- "--upstream=http://127.0.0.1:8080/"
- "--logtostderr=true"
- "--v=10"
image: "{{ .Values.metrics.proxy.image.repository }}:{{ .Values.metrics.proxy.image.tag }}"
name: kube-rbac-proxy
imagePullPolicy: {{ .Values.image.pullPolicy }}
ports:
- containerPort: {{ .Values.metrics.port }}
name: metrics-port
resources:
{{- toYaml .Values.resources | nindent 12 }}
securityContext:
{{- toYaml .Values.securityContext | nindent 12 }}
{{- end }}
terminationGracePeriodSeconds: 10
{{- with .Values.githubWebhookServer.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.githubWebhookServer.affinity }}
affinity:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.githubWebhookServer.tolerations }}
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.githubWebhookServer.topologySpreadConstraints }}
topologySpreadConstraints:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,41 @@
{{- if .Values.githubWebhookServer.ingress.enabled -}}
{{- $fullName := include "actions-runner-controller-github-webhook-server.fullname" . -}}
{{- $svcPort := (index .Values.githubWebhookServer.service.ports 0).port -}}
{{- if semverCompare ">=1.14-0" .Capabilities.KubeVersion.GitVersion -}}
apiVersion: networking.k8s.io/v1beta1
{{- else -}}
apiVersion: extensions/v1beta1
{{- end }}
kind: Ingress
metadata:
name: {{ $fullName }}
labels:
{{- include "actions-runner-controller.labels" . | nindent 4 }}
{{- with .Values.githubWebhookServer.ingress.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
spec:
{{- if .Values.githubWebhookServer.ingress.tls }}
tls:
{{- range .Values.githubWebhookServer.ingress.tls }}
- hosts:
{{- range .hosts }}
- {{ . | quote }}
{{- end }}
secretName: {{ .secretName }}
{{- end }}
{{- end }}
rules:
{{- range .Values.githubWebhookServer.ingress.hosts }}
- host: {{ .host | quote }}
http:
paths:
{{- range .paths }}
- path: {{ .path }}
backend:
serviceName: {{ $fullName }}
servicePort: {{ $svcPort }}
{{- end }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,70 @@
{{- if .Values.githubWebhookServer.enabled }}
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
creationTimestamp: null
name: {{ include "actions-runner-controller-github-webhook-server.roleName" . }}
rules:
- apiGroups:
- actions.summerwind.dev
resources:
- horizontalrunnerautoscalers
verbs:
- get
- list
- patch
- update
- watch
- apiGroups:
- actions.summerwind.dev
resources:
- horizontalrunnerautoscalers/finalizers
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- actions.summerwind.dev
resources:
- horizontalrunnerautoscalers/status
verbs:
- get
- patch
- update
- apiGroups:
- actions.summerwind.dev
resources:
- runnerdeployments
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- actions.summerwind.dev
resources:
- runnerdeployments/finalizers
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- actions.summerwind.dev
resources:
- runnerdeployments/status
verbs:
- get
- patch
- update
{{- end }}

View File

@@ -0,0 +1,14 @@
{{- if .Values.githubWebhookServer.enabled }}
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: {{ include "actions-runner-controller-github-webhook-server.roleName" . }}
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: {{ include "actions-runner-controller-github-webhook-server.roleName" . }}
subjects:
- kind: ServiceAccount
name: {{ include "actions-runner-controller-github-webhook-server.serviceAccountName" . }}
namespace: {{ .Release.Namespace }}
{{- end }}

View File

@@ -0,0 +1,16 @@
{{- if .Values.githubWebhookServer.enabled }}
{{- if .Values.githubWebhookServer.secret.create }}
apiVersion: v1
kind: Secret
metadata:
name: {{ include "actions-runner-controller-github-webhook-server.secretName" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "actions-runner-controller.labels" . | nindent 4 }}
type: Opaque
data:
{{- if .Values.githubWebhookServer.secret.github_webhook_secret_token }}
github_webhook_secret_token: {{ .Values.githubWebhookServer.secret.github_webhook_secret_token | toString | b64enc }}
{{- end }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,22 @@
{{- if .Values.githubWebhookServer.enabled }}
apiVersion: v1
kind: Service
metadata:
name: {{ include "actions-runner-controller-github-webhook-server.fullname" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "actions-runner-controller.labels" . | nindent 4 }}
spec:
type: {{ .Values.githubWebhookServer.service.type }}
ports:
{{ range $_, $port := .Values.githubWebhookServer.service.ports -}}
- {{ $port | toYaml | nindent 6 }}
{{- end }}
{{- if .Values.metrics.serviceMonitor }}
- name: metrics-port
port: {{ .Values.metrics.port }}
targetPort: metrics-port
{{- end }}
selector:
{{- include "actions-runner-controller-github-webhook-server.selectorLabels" . | nindent 4 }}
{{- end }}

View File

@@ -0,0 +1,15 @@
{{- if and .Values.githubWebhookServer.enabled .Values.metrics.serviceMonitor }}
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
labels:
{{- include "actions-runner-controller.labels" . | nindent 4 }}
name: {{ include "actions-runner-controller-github-webhook-server.serviceMonitorName" . }}
spec:
endpoints:
- path: /metrics
port: metrics-port
selector:
matchLabels:
{{- include "actions-runner-controller-github-webhook-server.selectorLabels" . | nindent 6 }}
{{- end }}

View File

@@ -0,0 +1,15 @@
{{- if .Values.githubWebhookServer.enabled -}}
{{- if .Values.githubWebhookServer.serviceAccount.create -}}
apiVersion: v1
kind: ServiceAccount
metadata:
name: {{ include "actions-runner-controller-github-webhook-server.serviceAccountName" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "actions-runner-controller.labels" . | nindent 4 }}
{{- with .Values.githubWebhookServer.serviceAccount.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,33 @@
# permissions to do leader election.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: {{ include "actions-runner-controller.leaderElectionRoleName" . }}
namespace: {{ .Release.Namespace }}
rules:
- apiGroups:
- ""
resources:
- configmaps
verbs:
- get
- list
- watch
- create
- update
- patch
- delete
- apiGroups:
- ""
resources:
- configmaps/status
verbs:
- get
- update
- patch
- apiGroups:
- ""
resources:
- events
verbs:
- create

View File

@@ -0,0 +1,13 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: {{ include "actions-runner-controller.leaderElectionRoleName" . }}
namespace: {{ .Release.Namespace }}
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: {{ include "actions-runner-controller.leaderElectionRoleName" . }}
subjects:
- kind: ServiceAccount
name: {{ include "actions-runner-controller.serviceAccountName" . }}
namespace: {{ .Release.Namespace }}

View File

@@ -0,0 +1,165 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
creationTimestamp: null
name: {{ include "actions-runner-controller.managerRoleName" . }}
rules:
- apiGroups:
- actions.summerwind.dev
resources:
- horizontalrunnerautoscalers
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- actions.summerwind.dev
resources:
- horizontalrunnerautoscalers/finalizers
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- actions.summerwind.dev
resources:
- horizontalrunnerautoscalers/status
verbs:
- get
- patch
- update
- apiGroups:
- actions.summerwind.dev
resources:
- runnerdeployments
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- actions.summerwind.dev
resources:
- runnerdeployments/finalizers
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- actions.summerwind.dev
resources:
- runnerdeployments/status
verbs:
- get
- patch
- update
- apiGroups:
- actions.summerwind.dev
resources:
- runnerreplicasets
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- actions.summerwind.dev
resources:
- runnerreplicasets/finalizers
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- actions.summerwind.dev
resources:
- runnerreplicasets/status
verbs:
- get
- patch
- update
- apiGroups:
- actions.summerwind.dev
resources:
- runners
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- actions.summerwind.dev
resources:
- runners/finalizers
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- actions.summerwind.dev
resources:
- runners/status
verbs:
- get
- patch
- update
- apiGroups:
- ""
resources:
- events
verbs:
- create
- patch
- apiGroups:
- ""
resources:
- pods
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- ""
resources:
- pods/finalizers
verbs:
- create
- delete
- get
- list
- patch
- update
- watch

View File

@@ -0,0 +1,12 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: {{ include "actions-runner-controller.managerRoleName" . }}
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: {{ include "actions-runner-controller.managerRoleName" . }}
subjects:
- kind: ServiceAccount
name: {{ include "actions-runner-controller.serviceAccountName" . }}
namespace: {{ .Release.Namespace }}

View File

@@ -0,0 +1,23 @@
{{- if .Values.authSecret.create }}
apiVersion: v1
kind: Secret
metadata:
name: {{ include "actions-runner-controller.secretName" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "actions-runner-controller.labels" . | nindent 4 }}
type: Opaque
data:
{{- if .Values.authSecret.github_app_id }}
github_app_id: {{ .Values.authSecret.github_app_id | toString | b64enc }}
{{- end }}
{{- if .Values.authSecret.github_app_installation_id }}
github_app_installation_id: {{ .Values.authSecret.github_app_installation_id | toString | b64enc }}
{{- end }}
{{- if .Values.authSecret.github_app_private_key }}
github_app_private_key: {{ .Values.authSecret.github_app_private_key | toString | b64enc }}
{{- end }}
{{- if .Values.authSecret.github_token }}
github_token: {{ .Values.authSecret.github_token | toString | b64enc }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,26 @@
# permissions to do edit runners.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: {{ include "actions-runner-controller.runnerEditorRoleName" . }}
rules:
- apiGroups:
- actions.summerwind.dev
resources:
- runners
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- actions.summerwind.dev
resources:
- runners/status
verbs:
- get
- patch
- update

View File

@@ -0,0 +1,20 @@
# permissions to do viewer runners.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: {{ include "actions-runner-controller.runnerViewerRoleName" . }}
rules:
- apiGroups:
- actions.summerwind.dev
resources:
- runners
verbs:
- get
- list
- watch
- apiGroups:
- actions.summerwind.dev
resources:
- runners/status
verbs:
- get

View File

@@ -0,0 +1,13 @@
{{- if .Values.serviceAccount.create -}}
apiVersion: v1
kind: ServiceAccount
metadata:
name: {{ include "actions-runner-controller.serviceAccountName" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "actions-runner-controller.labels" . | nindent 4 }}
{{- with .Values.serviceAccount.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,127 @@
---
apiVersion: admissionregistration.k8s.io/v1beta1
kind: MutatingWebhookConfiguration
metadata:
creationTimestamp: null
name: {{ include "actions-runner-controller.fullname" . }}-mutating-webhook-configuration
annotations:
cert-manager.io/inject-ca-from: {{ .Release.Namespace }}/{{ include "actions-runner-controller.servingCertName" . }}
webhooks:
- clientConfig:
service:
name: {{ include "actions-runner-controller.webhookServiceName" . }}
namespace: {{ .Release.Namespace }}
path: /mutate-actions-summerwind-dev-v1alpha1-runner
failurePolicy: Fail
name: mutate.runner.actions.summerwind.dev
rules:
- apiGroups:
- actions.summerwind.dev
apiVersions:
- v1alpha1
operations:
- CREATE
- UPDATE
resources:
- runners
sideEffects: None
- clientConfig:
service:
name: {{ include "actions-runner-controller.webhookServiceName" . }}
namespace: {{ .Release.Namespace }}
path: /mutate-actions-summerwind-dev-v1alpha1-runnerdeployment
failurePolicy: Fail
name: mutate.runnerdeployment.actions.summerwind.dev
rules:
- apiGroups:
- actions.summerwind.dev
apiVersions:
- v1alpha1
operations:
- CREATE
- UPDATE
resources:
- runnerdeployments
sideEffects: None
- clientConfig:
service:
name: {{ include "actions-runner-controller.webhookServiceName" . }}
namespace: {{ .Release.Namespace }}
path: /mutate-actions-summerwind-dev-v1alpha1-runnerreplicaset
failurePolicy: Fail
name: mutate.runnerreplicaset.actions.summerwind.dev
rules:
- apiGroups:
- actions.summerwind.dev
apiVersions:
- v1alpha1
operations:
- CREATE
- UPDATE
resources:
- runnerreplicasets
sideEffects: None
---
apiVersion: admissionregistration.k8s.io/v1beta1
kind: ValidatingWebhookConfiguration
metadata:
creationTimestamp: null
name: {{ include "actions-runner-controller.fullname" . }}-validating-webhook-configuration
annotations:
cert-manager.io/inject-ca-from: {{ .Release.Namespace }}/{{ include "actions-runner-controller.servingCertName" . }}
webhooks:
- clientConfig:
service:
name: {{ include "actions-runner-controller.webhookServiceName" . }}
namespace: {{ .Release.Namespace }}
path: /validate-actions-summerwind-dev-v1alpha1-runner
failurePolicy: Fail
name: validate.runner.actions.summerwind.dev
rules:
- apiGroups:
- actions.summerwind.dev
apiVersions:
- v1alpha1
operations:
- CREATE
- UPDATE
resources:
- runners
sideEffects: None
- clientConfig:
service:
name: {{ include "actions-runner-controller.webhookServiceName" . }}
namespace: {{ .Release.Namespace }}
path: /validate-actions-summerwind-dev-v1alpha1-runnerdeployment
failurePolicy: Fail
name: validate.runnerdeployment.actions.summerwind.dev
rules:
- apiGroups:
- actions.summerwind.dev
apiVersions:
- v1alpha1
operations:
- CREATE
- UPDATE
resources:
- runnerdeployments
sideEffects: None
- clientConfig:
service:
name: {{ include "actions-runner-controller.webhookServiceName" . }}
namespace: {{ .Release.Namespace }}
path: /validate-actions-summerwind-dev-v1alpha1-runnerreplicaset
failurePolicy: Fail
name: validate.runnerreplicaset.actions.summerwind.dev
rules:
- apiGroups:
- actions.summerwind.dev
apiVersions:
- v1alpha1
operations:
- CREATE
- UPDATE
resources:
- runnerreplicasets
sideEffects: None

View File

@@ -0,0 +1,16 @@
apiVersion: v1
kind: Service
metadata:
name: {{ include "actions-runner-controller.webhookServiceName" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "actions-runner-controller.labels" . | nindent 4 }}
spec:
type: {{ .Values.service.type }}
ports:
- port: 443
targetPort: 9443
protocol: TCP
name: https
selector:
{{- include "actions-runner-controller.selectorLabels" . | nindent 4 }}

View File

@@ -0,0 +1,162 @@
# Default values for actions-runner-controller.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.
labels: {}
replicaCount: 1
syncPeriod: 10m
# The controller tries its best not to repeat the duplicate GitHub API call
# within this duration.
# Defaults to syncPeriod - 10s.
#githubAPICacheDuration: 30s
# Only 1 authentication method can be deployed at a time
# Uncomment the configuration you are applying and fill in the details
authSecret:
create: true
name: "controller-manager"
### GitHub Apps Configuration
#github_app_id: ""
#github_app_installation_id: ""
#github_app_private_key: |
### GitHub PAT Configuration
#github_token: ""
image:
repository: summerwind/actions-runner-controller
dindSidecarRepositoryAndTag: "docker:dind"
pullPolicy: IfNotPresent
imagePullSecrets: []
nameOverride: ""
fullnameOverride: ""
serviceAccount:
# Specifies whether a service account should be created
create: true
# Annotations to add to the service account
annotations: {}
# The name of the service account to use.
# If not set and create is true, a name is generated using the fullname template
name: ""
podAnnotations: {}
podLabels: {}
podSecurityContext:
{}
# fsGroup: 2000
securityContext:
{}
# capabilities:
# drop:
# - ALL
# readOnlyRootFilesystem: true
# runAsNonRoot: true
# runAsUser: 1000
service:
type: ClusterIP
port: 443
metrics:
serviceMonitor: false
port: 8443
proxy:
enabled: true
image:
repository: quay.io/brancz/kube-rbac-proxy
tag: v0.10.0
resources:
{}
# We usually recommend not to specify default resources and to leave this as a conscious
# choice for the user. This also increases chances charts run on environments with little
# resources, such as Minikube. If you do want to specify resources, uncomment the following
# lines, adjust them as necessary, and remove the curly braces after 'resources:'.
# limits:
# cpu: 100m
# memory: 128Mi
# requests:
# cpu: 100m
# memory: 128Mi
nodeSelector: {}
tolerations: []
affinity: {}
# Leverage a PriorityClass to ensure your pods survive resource shortages
# ref: https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/
# PriorityClass: system-cluster-critical
priorityClassName: ""
env:
{}
# http_proxy: "proxy.com:8080"
# https_proxy: "proxy.com:8080"
# no_proxy: ""
scope:
# If true, the controller will only watch custom resources in a single namespace
singleNamespace: false
# If `scope.singleNamespace=true`, the controller will only watch custom resources in this namespace
# The default value is "", which means the namespace of the controller
watchNamespace: ""
githubWebhookServer:
enabled: false
replicaCount: 1
secret:
create: true
name: "github-webhook-server"
### GitHub Webhook Configuration
#github_webhook_secret_token: ""
imagePullSecrets: []
nameOverride: ""
fullnameOverride: ""
serviceAccount:
# Specifies whether a service account should be created
create: true
# Annotations to add to the service account
annotations: {}
# The name of the service account to use.
# If not set and create is true, a name is generated using the fullname template
name: ""
podAnnotations: {}
podLabels: {}
podSecurityContext: {}
# fsGroup: 2000
securityContext: {}
resources: {}
nodeSelector: {}
tolerations: []
affinity: {}
priorityClassName: ""
service:
type: ClusterIP
ports:
- port: 80
targetPort: http
protocol: TCP
name: http
#nodePort: someFixedPortForUseWithTerraformCdkCfnEtc
ingress:
enabled: false
annotations:
{}
# kubernetes.io/ingress.class: nginx
# kubernetes.io/tls-acme: "true"
hosts:
- host: chart-example.local
paths: []
tls: []
# - secretName: chart-example-tls
# hosts:
# - chart-example.local

View File

@@ -0,0 +1,191 @@
/*
Copyright 2021 The actions-runner-controller authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package main
import (
"context"
"errors"
"flag"
"net/http"
"os"
"sync"
"time"
actionsv1alpha1 "github.com/summerwind/actions-runner-controller/api/v1alpha1"
"github.com/summerwind/actions-runner-controller/controllers"
zaplib "go.uber.org/zap"
"k8s.io/apimachinery/pkg/runtime"
clientgoscheme "k8s.io/client-go/kubernetes/scheme"
_ "k8s.io/client-go/plugin/pkg/client/auth/exec"
_ "k8s.io/client-go/plugin/pkg/client/auth/gcp"
_ "k8s.io/client-go/plugin/pkg/client/auth/oidc"
ctrl "sigs.k8s.io/controller-runtime"
"sigs.k8s.io/controller-runtime/pkg/log/zap"
// +kubebuilder:scaffold:imports
)
var (
scheme = runtime.NewScheme()
setupLog = ctrl.Log.WithName("setup")
)
const (
logLevelDebug = "debug"
logLevelInfo = "info"
logLevelWarn = "warn"
logLevelError = "error"
)
func init() {
_ = clientgoscheme.AddToScheme(scheme)
_ = actionsv1alpha1.AddToScheme(scheme)
// +kubebuilder:scaffold:scheme
}
func main() {
var (
err error
webhookAddr string
metricsAddr string
// The secret token of the GitHub Webhook. See https://docs.github.com/en/developers/webhooks-and-events/securing-your-webhooks
webhookSecretToken string
watchNamespace string
enableLeaderElection bool
syncPeriod time.Duration
logLevel string
)
webhookSecretToken = os.Getenv("GITHUB_WEBHOOK_SECRET_TOKEN")
flag.StringVar(&webhookAddr, "webhook-addr", ":8000", "The address the metric endpoint binds to.")
flag.StringVar(&metricsAddr, "metrics-addr", ":8080", "The address the metric endpoint binds to.")
flag.StringVar(&watchNamespace, "watch-namespace", "", "The namespace to watch for HorizontalRunnerAutoscaler's to scale on Webhook. Set to empty for letting it watch for all namespaces.")
flag.BoolVar(&enableLeaderElection, "enable-leader-election", false,
"Enable leader election for controller manager. Enabling this will ensure there is only one active controller manager.")
flag.DurationVar(&syncPeriod, "sync-period", 10*time.Minute, "Determines the minimum frequency at which K8s resources managed by this controller are reconciled. When you use autoscaling, set to a lower value like 10 minute, because this corresponds to the minimum time to react on demand change")
flag.StringVar(&logLevel, "log-level", logLevelDebug, `The verbosity of the logging. Valid values are "debug", "info", "warn", "error". Defaults to "debug".`)
flag.Parse()
if webhookSecretToken == "" {
setupLog.Info("-webhook-secret-token is missing or empty. Create one following https://docs.github.com/en/developers/webhooks-and-events/securing-your-webhooks")
}
if watchNamespace == "" {
setupLog.Info("-watch-namespace is empty. HorizontalRunnerAutoscalers in all the namespaces are watched, cached, and considered as scale targets.")
} else {
setupLog.Info("-watch-namespace is %q. Only HorizontalRunnerAutoscalers in %q are watched, cached, and considered as scale targets.")
}
logger := zap.New(func(o *zap.Options) {
switch logLevel {
case logLevelDebug:
o.Development = true
case logLevelInfo:
lvl := zaplib.NewAtomicLevelAt(zaplib.InfoLevel)
o.Level = &lvl
case logLevelWarn:
lvl := zaplib.NewAtomicLevelAt(zaplib.WarnLevel)
o.Level = &lvl
case logLevelError:
lvl := zaplib.NewAtomicLevelAt(zaplib.ErrorLevel)
o.Level = &lvl
}
})
ctrl.SetLogger(logger)
mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
Scheme: scheme,
SyncPeriod: &syncPeriod,
LeaderElection: enableLeaderElection,
Namespace: watchNamespace,
MetricsBindAddress: metricsAddr,
Port: 9443,
})
if err != nil {
setupLog.Error(err, "unable to start manager")
os.Exit(1)
}
hraGitHubWebhook := &controllers.HorizontalRunnerAutoscalerGitHubWebhook{
Client: mgr.GetClient(),
Log: ctrl.Log.WithName("controllers").WithName("Runner"),
Recorder: nil,
Scheme: mgr.GetScheme(),
SecretKeyBytes: []byte(webhookSecretToken),
Namespace: watchNamespace,
}
if err = hraGitHubWebhook.SetupWithManager(mgr); err != nil {
setupLog.Error(err, "unable to create controller", "controller", "Runner")
os.Exit(1)
}
var wg sync.WaitGroup
ctx, cancel := context.WithCancel(context.Background())
wg.Add(1)
go func() {
defer cancel()
defer wg.Done()
setupLog.Info("starting webhook server")
if err := mgr.Start(ctx.Done()); err != nil {
setupLog.Error(err, "problem running manager")
os.Exit(1)
}
}()
mux := http.NewServeMux()
mux.HandleFunc("/", hraGitHubWebhook.Handle)
srv := http.Server{
Addr: webhookAddr,
Handler: mux,
}
wg.Add(1)
go func() {
defer cancel()
defer wg.Done()
go func() {
<-ctx.Done()
srv.Shutdown(context.Background())
}()
if err := srv.ListenAndServe(); err != nil {
if !errors.Is(err, http.ErrServerClosed) {
setupLog.Error(err, "problem running http server")
}
}
}()
go func() {
<-ctrl.SetupSignalHandler()
cancel()
}()
wg.Wait()
}

View File

@@ -18,6 +18,9 @@ spec:
- JSONPath: .status.desiredReplicas - JSONPath: .status.desiredReplicas
name: Desired name: Desired
type: number type: number
- JSONPath: .status.scheduledOverridesSummary
name: Schedule
type: string
group: actions.summerwind.dev group: actions.summerwind.dev
names: names:
kind: HorizontalRunnerAutoscaler kind: HorizontalRunnerAutoscaler
@@ -48,6 +51,20 @@ spec:
description: HorizontalRunnerAutoscalerSpec defines the desired state of description: HorizontalRunnerAutoscalerSpec defines the desired state of
HorizontalRunnerAutoscaler HorizontalRunnerAutoscaler
properties: properties:
capacityReservations:
items:
description: CapacityReservation specifies the number of replicas
temporarily added to the scale target until ExpirationTime.
properties:
expirationTime:
format: date-time
type: string
name:
type: string
replicas:
type: integer
type: object
type: array
maxReplicas: maxReplicas:
description: MinReplicas is the maximum number of replicas the deployment description: MinReplicas is the maximum number of replicas the deployment
is allowed to scale is allowed to scale
@@ -64,6 +81,33 @@ spec:
items: items:
type: string type: string
type: array type: array
scaleDownAdjustment:
description: ScaleDownAdjustment is the number of runners removed
on scale-down. You can only specify either ScaleDownFactor or
ScaleDownAdjustment.
type: integer
scaleDownFactor:
description: ScaleDownFactor is the multiplicative factor applied
to the current number of runners used to determine how many
pods should be removed.
type: string
scaleDownThreshold:
description: ScaleDownThreshold is the percentage of busy runners
less than which will trigger the hpa to scale the runners down.
type: string
scaleUpAdjustment:
description: ScaleUpAdjustment is the number of runners added
on scale-up. You can only specify either ScaleUpFactor or ScaleUpAdjustment.
type: integer
scaleUpFactor:
description: ScaleUpFactor is the multiplicative factor applied
to the current number of runners used to determine how many
pods should be added.
type: string
scaleUpThreshold:
description: ScaleUpThreshold is the percentage of busy runners
greater than which will trigger the hpa to scale runners up.
type: string
type: type:
description: Type is the type of metric to be used for autoscaling. description: Type is the type of metric to be used for autoscaling.
The only supported Type is TotalNumberOfQueuedAndInProgressWorkflowRuns The only supported Type is TotalNumberOfQueuedAndInProgressWorkflowRuns
@@ -86,9 +130,129 @@ spec:
name: name:
type: string type: string
type: object type: object
scaleUpTriggers:
description: "ScaleUpTriggers is an experimental feature to increase
the desired replicas by 1 on each webhook requested received by the
webhookBasedAutoscaler. \n This feature requires you to also enable
and deploy the webhookBasedAutoscaler onto your cluster. \n Note that
the added runners remain until the next sync period at least, and
they may or may not be used by GitHub Actions depending on the timing.
They are intended to be used to gain \"resource slack\" immediately
after you receive a webhook from GitHub, so that you can loosely expect
MinReplicas runners to be always available."
items:
properties:
amount:
type: integer
duration:
type: string
githubEvent:
properties:
checkRun:
description: https://docs.github.com/en/actions/reference/events-that-trigger-workflows#check_run
properties:
names:
description: Names is a list of GitHub Actions glob patterns.
Any check_run event whose name matches one of patterns
in the list can trigger autoscaling. Note that check_run
name seem to equal to the job name you've defined in
your actions workflow yaml file. So it is very likely
that you can utilize this to trigger depending on the
job.
items:
type: string
type: array
status:
type: string
types:
items:
type: string
type: array
type: object
pullRequest:
description: https://docs.github.com/en/actions/reference/events-that-trigger-workflows#pull_request
properties:
branches:
items:
type: string
type: array
types:
items:
type: string
type: array
type: object
push:
description: PushSpec is the condition for triggering scale-up
on push event Also see https://docs.github.com/en/actions/reference/events-that-trigger-workflows#push
type: object
type: object
type: object
type: array
scheduledOverrides:
description: ScheduledOverrides is the list of ScheduledOverride. It
can be used to override a few fields of HorizontalRunnerAutoscalerSpec
on schedule. The earlier a scheduled override is, the higher it is
prioritized.
items:
description: ScheduledOverride can be used to override a few fields
of HorizontalRunnerAutoscalerSpec on schedule. A schedule can optionally
be recurring, so that the correspoding override happens every day,
week, month, or year.
properties:
endTime:
description: EndTime is the time at which the first override ends.
format: date-time
type: string
minReplicas:
description: MinReplicas is the number of runners while overriding.
If omitted, it doesn't override minReplicas.
minimum: 0
nullable: true
type: integer
recurrenceRule:
properties:
frequency:
description: Frequency is the name of a predefined interval
of each recurrence. The valid values are "Daily", "Weekly",
"Monthly", and "Yearly". If empty, the corresponding override
happens only once.
enum:
- Daily
- Weekly
- Monthly
- Yearly
type: string
untilTime:
description: UntilTime is the time of the final recurrence.
If empty, the schedule recurs forever.
format: date-time
type: string
type: object
startTime:
description: StartTime is the time at which the first override
starts.
format: date-time
type: string
required:
- endTime
- startTime
type: object
type: array
type: object type: object
status: status:
properties: properties:
cacheEntries:
items:
properties:
expirationTime:
format: date-time
type: string
key:
type: string
value:
type: integer
type: object
type: array
desiredReplicas: desiredReplicas:
description: DesiredReplicas is the total number of desired, non-terminated description: DesiredReplicas is the total number of desired, non-terminated
and latest pods to be set for the primary RunnerSet This doesn't include and latest pods to be set for the primary RunnerSet This doesn't include
@@ -96,6 +260,7 @@ spec:
type: integer type: integer
lastSuccessfulScaleOutTime: lastSuccessfulScaleOutTime:
format: date-time format: date-time
nullable: true
type: string type: string
observedGeneration: observedGeneration:
description: ObservedGeneration is the most recent generation observed description: ObservedGeneration is the most recent generation observed
@@ -103,6 +268,11 @@ spec:
which is updated on mutation by the API Server. which is updated on mutation by the API Server.
format: int64 format: int64
type: integer type: integer
scheduledOverridesSummary:
description: ScheduledOverridesSummary is the summary of active and
upcoming scheduled overrides to be shown in e.g. a column of a `kubectl
get hra` output for observability.
type: string
type: object type: object
type: object type: object
version: v1alpha1 version: v1alpha1

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -10,7 +10,7 @@ spec:
spec: spec:
containers: containers:
- name: kube-rbac-proxy - name: kube-rbac-proxy
image: gcr.io/kubebuilder/kube-rbac-proxy:v0.4.1 image: quay.io/brancz/kube-rbac-proxy:v0.10.0
args: args:
- "--secure-listen-address=0.0.0.0:8443" - "--secure-listen-address=0.0.0.0:8443"
- "--upstream=http://127.0.0.1:8080/" - "--upstream=http://127.0.0.1:8080/"
@@ -23,3 +23,4 @@ spec:
args: args:
- "--metrics-addr=127.0.0.1:8080" - "--metrics-addr=127.0.0.1:8080"
- "--enable-leader-election" - "--enable-leader-election"
- "--sync-period=10m"

View File

@@ -4,5 +4,5 @@ apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization kind: Kustomization
images: images:
- name: controller - name: controller
newName: summerwind/actions-runner-controller newName: mumoshu/actions-runner-controller
newTag: latest newTag: dev

View File

@@ -57,7 +57,7 @@ spec:
resources: resources:
limits: limits:
cpu: 100m cpu: 100m
memory: 30Mi memory: 100Mi
requests: requests:
cpu: 100m cpu: 100m
memory: 20Mi memory: 20Mi

View File

@@ -24,6 +24,7 @@ webhooks:
- UPDATE - UPDATE
resources: resources:
- runners - runners
sideEffects: None
- clientConfig: - clientConfig:
caBundle: Cg== caBundle: Cg==
service: service:
@@ -60,6 +61,7 @@ webhooks:
- UPDATE - UPDATE
resources: resources:
- runnerreplicasets - runnerreplicasets
sideEffects: None
--- ---
apiVersion: admissionregistration.k8s.io/v1beta1 apiVersion: admissionregistration.k8s.io/v1beta1
@@ -86,6 +88,7 @@ webhooks:
- UPDATE - UPDATE
resources: resources:
- runners - runners
sideEffects: None
- clientConfig: - clientConfig:
caBundle: Cg== caBundle: Cg==
service: service:
@@ -122,3 +125,4 @@ webhooks:
- UPDATE - UPDATE
resources: resources:
- runnerreplicasets - runnerreplicasets
sideEffects: None

View File

@@ -4,20 +4,135 @@ import (
"context" "context"
"errors" "errors"
"fmt" "fmt"
"math"
"strconv"
"strings" "strings"
"time"
"github.com/summerwind/actions-runner-controller/api/v1alpha1" "github.com/summerwind/actions-runner-controller/api/v1alpha1"
kerrors "k8s.io/apimachinery/pkg/api/errors"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"sigs.k8s.io/controller-runtime/pkg/client"
) )
func (r *HorizontalRunnerAutoscalerReconciler) determineDesiredReplicas(rd v1alpha1.RunnerDeployment, hra v1alpha1.HorizontalRunnerAutoscaler) (*int, error) { const (
defaultScaleUpThreshold = 0.8
defaultScaleDownThreshold = 0.3
defaultScaleUpFactor = 1.3
defaultScaleDownFactor = 0.7
)
func getValueAvailableAt(now time.Time, from, to *time.Time, reservedValue int) *int {
if to != nil && now.After(*to) {
return nil
}
if from != nil && now.Before(*from) {
return nil
}
return &reservedValue
}
func (r *HorizontalRunnerAutoscalerReconciler) fetchSuggestedReplicasFromCache(hra v1alpha1.HorizontalRunnerAutoscaler) *int {
var entry *v1alpha1.CacheEntry
for i := range hra.Status.CacheEntries {
ent := hra.Status.CacheEntries[i]
if ent.Key != v1alpha1.CacheEntryKeyDesiredReplicas {
continue
}
if !time.Now().Before(ent.ExpirationTime.Time) {
continue
}
entry = &ent
break
}
if entry != nil {
v := getValueAvailableAt(time.Now(), nil, &entry.ExpirationTime.Time, entry.Value)
if v != nil {
return v
}
}
return nil
}
func (r *HorizontalRunnerAutoscalerReconciler) suggestDesiredReplicas(rd v1alpha1.RunnerDeployment, hra v1alpha1.HorizontalRunnerAutoscaler) (*int, error) {
if hra.Spec.MinReplicas == nil { if hra.Spec.MinReplicas == nil {
return nil, fmt.Errorf("horizontalrunnerautoscaler %s/%s is missing minReplicas", hra.Namespace, hra.Name) return nil, fmt.Errorf("horizontalrunnerautoscaler %s/%s is missing minReplicas", hra.Namespace, hra.Name)
} else if hra.Spec.MaxReplicas == nil { } else if hra.Spec.MaxReplicas == nil {
return nil, fmt.Errorf("horizontalrunnerautoscaler %s/%s is missing maxReplicas", hra.Namespace, hra.Name) return nil, fmt.Errorf("horizontalrunnerautoscaler %s/%s is missing maxReplicas", hra.Namespace, hra.Name)
} }
var repos [][]string metrics := hra.Spec.Metrics
numMetrics := len(metrics)
if numMetrics == 0 {
if len(hra.Spec.ScaleUpTriggers) == 0 {
return r.suggestReplicasByQueuedAndInProgressWorkflowRuns(rd, hra, nil)
}
return nil, nil
} else if numMetrics > 2 {
return nil, fmt.Errorf("Too many autoscaling metrics configured: It must be 0 to 2, but got %d", numMetrics)
}
primaryMetric := metrics[0]
primaryMetricType := primaryMetric.Type
var (
suggested *int
err error
)
switch primaryMetricType {
case v1alpha1.AutoscalingMetricTypeTotalNumberOfQueuedAndInProgressWorkflowRuns:
suggested, err = r.suggestReplicasByQueuedAndInProgressWorkflowRuns(rd, hra, &primaryMetric)
case v1alpha1.AutoscalingMetricTypePercentageRunnersBusy:
suggested, err = r.suggestReplicasByPercentageRunnersBusy(rd, hra, primaryMetric)
default:
return nil, fmt.Errorf("validting autoscaling metrics: unsupported metric type %q", primaryMetric)
}
if err != nil {
return nil, err
}
if suggested != nil && *suggested > 0 {
return suggested, nil
}
if len(metrics) == 1 {
// This is never supposed to happen but anyway-
// Fall-back to `minReplicas + capacityReservedThroughWebhook`.
return nil, nil
}
// At this point, we are sure that there are exactly 2 Metrics entries.
fallbackMetric := metrics[1]
fallbackMetricType := fallbackMetric.Type
if primaryMetricType != v1alpha1.AutoscalingMetricTypePercentageRunnersBusy ||
fallbackMetricType != v1alpha1.AutoscalingMetricTypeTotalNumberOfQueuedAndInProgressWorkflowRuns {
return nil, fmt.Errorf(
"invalid HRA Spec: Metrics[0] of %s cannot be combined with Metrics[1] of %s: The only allowed combination is 0=PercentageRunnersBusy and 1=TotalNumberOfQueuedAndInProgressWorkflowRuns",
primaryMetricType, fallbackMetricType,
)
}
return r.suggestReplicasByQueuedAndInProgressWorkflowRuns(rd, hra, &fallbackMetric)
}
func (r *HorizontalRunnerAutoscalerReconciler) suggestReplicasByQueuedAndInProgressWorkflowRuns(rd v1alpha1.RunnerDeployment, hra v1alpha1.HorizontalRunnerAutoscaler, metrics *v1alpha1.MetricSpec) (*int, error) {
var repos [][]string
repoID := rd.Spec.Template.Spec.Repository repoID := rd.Spec.Template.Spec.Repository
if repoID == "" { if repoID == "" {
orgName := rd.Spec.Template.Spec.Organization orgName := rd.Spec.Template.Spec.Organization
@@ -25,17 +140,18 @@ func (r *HorizontalRunnerAutoscalerReconciler) determineDesiredReplicas(rd v1alp
return nil, fmt.Errorf("asserting runner deployment spec to detect bug: spec.template.organization should not be empty on this code path") return nil, fmt.Errorf("asserting runner deployment spec to detect bug: spec.template.organization should not be empty on this code path")
} }
metrics := hra.Spec.Metrics // In case it's an organizational runners deployment without any scaling metrics defined,
// we assume that the desired replicas should always be `minReplicas + capacityReservedThroughWebhook`.
// See https://github.com/summerwind/actions-runner-controller/issues/377#issuecomment-793372693
if metrics == nil {
return nil, nil
}
if len(metrics) == 0 { if len(metrics.RepositoryNames) == 0 {
return nil, fmt.Errorf("validating autoscaling metrics: one or more metrics is required")
} else if tpe := metrics[0].Type; tpe != v1alpha1.AutoscalingMetricTypeTotalNumberOfQueuedAndInProgressWorkflowRuns {
return nil, fmt.Errorf("validting autoscaling metrics: unsupported metric type %q: only supported value is %s", tpe, v1alpha1.AutoscalingMetricTypeTotalNumberOfQueuedAndInProgressWorkflowRuns)
} else if len(metrics[0].RepositoryNames) == 0 {
return nil, errors.New("validating autoscaling metrics: spec.autoscaling.metrics[].repositoryNames is required and must have one more more entries for organizational runner deployment") return nil, errors.New("validating autoscaling metrics: spec.autoscaling.metrics[].repositoryNames is required and must have one more more entries for organizational runner deployment")
} }
for _, repoName := range metrics[0].RepositoryNames { for _, repoName := range metrics.RepositoryNames {
repos = append(repos, []string{orgName, repoName}) repos = append(repos, []string{orgName, repoName})
} }
} else { } else {
@@ -80,12 +196,12 @@ func (r *HorizontalRunnerAutoscalerReconciler) determineDesiredReplicas(rd v1alp
for _, repo := range repos { for _, repo := range repos {
user, repoName := repo[0], repo[1] user, repoName := repo[0], repo[1]
list, _, err := r.GitHubClient.Actions.ListRepositoryWorkflowRuns(context.TODO(), user, repoName, nil) workflowRuns, err := r.GitHubClient.ListRepositoryWorkflowRuns(context.TODO(), user, repoName)
if err != nil { if err != nil {
return nil, err return nil, err
} }
for _, run := range list.WorkflowRuns { for _, run := range workflowRuns {
total++ total++
// In May 2020, there are only 3 statuses. // In May 2020, there are only 3 statuses.
@@ -105,33 +221,189 @@ func (r *HorizontalRunnerAutoscalerReconciler) determineDesiredReplicas(rd v1alp
} }
} }
minReplicas := *hra.Spec.MinReplicas
maxReplicas := *hra.Spec.MaxReplicas
necessaryReplicas := queued + inProgress necessaryReplicas := queued + inProgress
var desiredReplicas int
if necessaryReplicas < minReplicas {
desiredReplicas = minReplicas
} else if necessaryReplicas > maxReplicas {
desiredReplicas = maxReplicas
} else {
desiredReplicas = necessaryReplicas
}
rd.Status.Replicas = &desiredReplicas
replicas := desiredReplicas
r.Log.V(1).Info( r.Log.V(1).Info(
"Calculated desired replicas", fmt.Sprintf("Suggested desired replicas of %d by TotalNumberOfQueuedAndInProgressWorkflowRuns", necessaryReplicas),
"computed_replicas_desired", desiredReplicas,
"spec_replicas_min", minReplicas,
"spec_replicas_max", maxReplicas,
"workflow_runs_completed", completed, "workflow_runs_completed", completed,
"workflow_runs_in_progress", inProgress, "workflow_runs_in_progress", inProgress,
"workflow_runs_queued", queued, "workflow_runs_queued", queued,
"workflow_runs_unknown", unknown, "workflow_runs_unknown", unknown,
"namespace", hra.Namespace,
"runner_deployment", rd.Name,
"horizontal_runner_autoscaler", hra.Name,
) )
return &replicas, nil return &necessaryReplicas, nil
}
func (r *HorizontalRunnerAutoscalerReconciler) suggestReplicasByPercentageRunnersBusy(rd v1alpha1.RunnerDeployment, hra v1alpha1.HorizontalRunnerAutoscaler, metrics v1alpha1.MetricSpec) (*int, error) {
ctx := context.Background()
scaleUpThreshold := defaultScaleUpThreshold
scaleDownThreshold := defaultScaleDownThreshold
scaleUpFactor := defaultScaleUpFactor
scaleDownFactor := defaultScaleDownFactor
if metrics.ScaleUpThreshold != "" {
sut, err := strconv.ParseFloat(metrics.ScaleUpThreshold, 64)
if err != nil {
return nil, errors.New("validating autoscaling metrics: spec.autoscaling.metrics[].scaleUpThreshold cannot be parsed into a float64")
}
scaleUpThreshold = sut
}
if metrics.ScaleDownThreshold != "" {
sdt, err := strconv.ParseFloat(metrics.ScaleDownThreshold, 64)
if err != nil {
return nil, errors.New("validating autoscaling metrics: spec.autoscaling.metrics[].scaleDownThreshold cannot be parsed into a float64")
}
scaleDownThreshold = sdt
}
scaleUpAdjustment := metrics.ScaleUpAdjustment
if scaleUpAdjustment != 0 {
if metrics.ScaleUpAdjustment < 0 {
return nil, errors.New("validating autoscaling metrics: spec.autoscaling.metrics[].scaleUpAdjustment cannot be lower than 0")
}
if metrics.ScaleUpFactor != "" {
return nil, errors.New("validating autoscaling metrics: spec.autoscaling.metrics[]: scaleUpAdjustment and scaleUpFactor cannot be specified together")
}
} else if metrics.ScaleUpFactor != "" {
suf, err := strconv.ParseFloat(metrics.ScaleUpFactor, 64)
if err != nil {
return nil, errors.New("validating autoscaling metrics: spec.autoscaling.metrics[].scaleUpFactor cannot be parsed into a float64")
}
scaleUpFactor = suf
}
scaleDownAdjustment := metrics.ScaleDownAdjustment
if scaleDownAdjustment != 0 {
if metrics.ScaleDownAdjustment < 0 {
return nil, errors.New("validating autoscaling metrics: spec.autoscaling.metrics[].scaleDownAdjustment cannot be lower than 0")
}
if metrics.ScaleDownFactor != "" {
return nil, errors.New("validating autoscaling metrics: spec.autoscaling.metrics[]: scaleDownAdjustment and scaleDownFactor cannot be specified together")
}
} else if metrics.ScaleDownFactor != "" {
sdf, err := strconv.ParseFloat(metrics.ScaleDownFactor, 64)
if err != nil {
return nil, errors.New("validating autoscaling metrics: spec.autoscaling.metrics[].scaleDownFactor cannot be parsed into a float64")
}
scaleDownFactor = sdf
}
// return the list of runners in namespace. Horizontal Runner Autoscaler should only be responsible for scaling resources in its own ns.
var runnerList v1alpha1.RunnerList
var opts []client.ListOption
opts = append(opts, client.InNamespace(rd.Namespace))
selector, err := metav1.LabelSelectorAsSelector(getSelector(&rd))
if err != nil {
return nil, err
}
opts = append(opts, client.MatchingLabelsSelector{Selector: selector})
r.Log.V(2).Info("Finding runners with selector", "ns", rd.Namespace)
if err := r.List(
ctx,
&runnerList,
opts...,
); err != nil {
if !kerrors.IsNotFound(err) {
return nil, err
}
}
runnerMap := make(map[string]struct{})
for _, items := range runnerList.Items {
runnerMap[items.Name] = struct{}{}
}
var (
enterprise = rd.Spec.Template.Spec.Enterprise
organization = rd.Spec.Template.Spec.Organization
repository = rd.Spec.Template.Spec.Repository
)
// ListRunners will return all runners managed by GitHub - not restricted to ns
runners, err := r.GitHubClient.ListRunners(
ctx,
enterprise,
organization,
repository)
if err != nil {
return nil, err
}
var desiredReplicasBefore int
if v := rd.Spec.Replicas; v == nil {
desiredReplicasBefore = 1
} else {
desiredReplicasBefore = *v
}
var (
numRunners int
numRunnersRegistered int
numRunnersBusy int
)
numRunners = len(runnerList.Items)
for _, runner := range runners {
if _, ok := runnerMap[*runner.Name]; ok {
numRunnersRegistered++
if runner.GetBusy() {
numRunnersBusy++
}
}
}
var desiredReplicas int
fractionBusy := float64(numRunnersBusy) / float64(desiredReplicasBefore)
if fractionBusy >= scaleUpThreshold {
if scaleUpAdjustment > 0 {
desiredReplicas = desiredReplicasBefore + scaleUpAdjustment
} else {
desiredReplicas = int(math.Ceil(float64(desiredReplicasBefore) * scaleUpFactor))
}
} else if fractionBusy < scaleDownThreshold {
if scaleDownAdjustment > 0 {
desiredReplicas = desiredReplicasBefore - scaleDownAdjustment
} else {
desiredReplicas = int(float64(desiredReplicasBefore) * scaleDownFactor)
}
} else {
desiredReplicas = *rd.Spec.Replicas
}
// NOTES for operators:
//
// - num_runners can be as twice as large as replicas_desired_before while
// the runnerdeployment controller is replacing RunnerReplicaSet for runner update.
r.Log.V(1).Info(
fmt.Sprintf("Suggested desired replicas of %d by PercentageRunnersBusy", desiredReplicas),
"replicas_desired_before", desiredReplicasBefore,
"replicas_desired", desiredReplicas,
"num_runners", numRunners,
"num_runners_registered", numRunnersRegistered,
"num_runners_busy", numRunnersBusy,
"namespace", hra.Namespace,
"runner_deployment", rd.Name,
"horizontal_runner_autoscaler", hra.Name,
"enterprise", enterprise,
"organization", organization,
"repository", repository,
)
return &desiredReplicas, nil
} }

View File

@@ -16,7 +16,10 @@ import (
) )
func newGithubClient(server *httptest.Server) *github.Client { func newGithubClient(server *httptest.Server) *github.Client {
client, err := github.NewClientWithAccessToken("token") c := github.Config{
Token: "token",
}
client, err := c.NewClient()
if err != nil { if err != nil {
panic(err) panic(err)
} }
@@ -37,14 +40,18 @@ func TestDetermineDesiredReplicas_RepositoryRunner(t *testing.T) {
metav1Now := metav1.Now() metav1Now := metav1.Now()
testcases := []struct { testcases := []struct {
repo string repo string
org string org string
fixed *int fixed *int
max *int max *int
min *int min *int
sReplicas *int sReplicas *int
sTime *metav1.Time sTime *metav1.Time
workflowRuns string
workflowRuns string
workflowRuns_queued string
workflowRuns_in_progress string
workflowJobs map[int]string workflowJobs map[int]string
want int want int
err string err string
@@ -52,87 +59,107 @@ func TestDetermineDesiredReplicas_RepositoryRunner(t *testing.T) {
// Legacy functionality // Legacy functionality
// 3 demanded, max at 3 // 3 demanded, max at 3
{ {
repo: "test/valid", repo: "test/valid",
min: intPtr(2), min: intPtr(2),
max: intPtr(3), max: intPtr(3),
workflowRuns: `{"total_count": 4, "workflow_runs":[{"status":"queued"}, {"status":"in_progress"}, {"status":"in_progress"}, {"status":"completed"}]}"`, workflowRuns: `{"total_count": 4, "workflow_runs":[{"status":"queued"}, {"status":"in_progress"}, {"status":"in_progress"}, {"status":"completed"}]}"`,
want: 3, workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"in_progress"}]}"`,
want: 3,
}, },
// 2 demanded, max at 3, currently 3, delay scaling down due to grace period // 2 demanded, max at 3, currently 3, delay scaling down due to grace period
{ {
repo: "test/valid", repo: "test/valid",
min: intPtr(2), min: intPtr(2),
max: intPtr(3), max: intPtr(3),
sReplicas: intPtr(3), sReplicas: intPtr(3),
sTime: &metav1Now, sTime: &metav1Now,
workflowRuns: `{"total_count": 4, "workflow_runs":[{"status":"queued"}, {"status":"in_progress"}, {"status":"completed"}]}"`, workflowRuns: `{"total_count": 3, "workflow_runs":[{"status":"queued"}, {"status":"in_progress"}, {"status":"completed"}]}"`,
want: 3, workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 1, "workflow_runs":[{"status":"in_progress"}]}"`,
want: 3,
}, },
// 3 demanded, max at 2 // 3 demanded, max at 2
{ {
repo: "test/valid", repo: "test/valid",
min: intPtr(2), min: intPtr(2),
max: intPtr(2), max: intPtr(2),
workflowRuns: `{"total_count": 4, "workflow_runs":[{"status":"queued"}, {"status":"in_progress"}, {"status":"in_progress"}, {"status":"completed"}]}"`, workflowRuns: `{"total_count": 4, "workflow_runs":[{"status":"queued"}, {"status":"in_progress"}, {"status":"in_progress"}, {"status":"completed"}]}"`,
want: 2, workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"in_progress"}]}"`,
want: 2,
}, },
// 2 demanded, min at 2 // 2 demanded, min at 2
{ {
repo: "test/valid", repo: "test/valid",
min: intPtr(2), min: intPtr(2),
max: intPtr(3), max: intPtr(3),
workflowRuns: `{"total_count": 3, "workflow_runs":[{"status":"queued"}, {"status":"in_progress"}, {"status":"completed"}]}"`, workflowRuns: `{"total_count": 3, "workflow_runs":[{"status":"queued"}, {"status":"in_progress"}, {"status":"completed"}]}"`,
want: 2, workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 1, "workflow_runs":[{"status":"in_progress"}]}"`,
want: 2,
}, },
// 1 demanded, min at 2 // 1 demanded, min at 2
{ {
repo: "test/valid", repo: "test/valid",
min: intPtr(2), min: intPtr(2),
max: intPtr(3), max: intPtr(3),
workflowRuns: `{"total_count": 2, "workflow_runs":[{"status":"queued"}, {"status":"completed"}]}"`, workflowRuns: `{"total_count": 2, "workflow_runs":[{"status":"queued"}, {"status":"completed"}]}"`,
want: 2, workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 0, "workflow_runs":[]}"`,
want: 2,
}, },
// 1 demanded, min at 2 // 1 demanded, min at 2
{ {
repo: "test/valid", repo: "test/valid",
min: intPtr(2), min: intPtr(2),
max: intPtr(3), max: intPtr(3),
workflowRuns: `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"completed"}]}"`, workflowRuns: `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"completed"}]}"`,
want: 2, workflowRuns_queued: `{"total_count": 0, "workflow_runs":[]}"`,
workflowRuns_in_progress: `{"total_count": 1, "workflow_runs":[{"status":"in_progress"}]}"`,
want: 2,
}, },
// 1 demanded, min at 1 // 1 demanded, min at 1
{ {
repo: "test/valid", repo: "test/valid",
min: intPtr(1), min: intPtr(1),
max: intPtr(3), max: intPtr(3),
workflowRuns: `{"total_count": 2, "workflow_runs":[{"status":"queued"}, {"status":"completed"}]}"`, workflowRuns: `{"total_count": 2, "workflow_runs":[{"status":"queued"}, {"status":"completed"}]}"`,
want: 1, workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 0, "workflow_runs":[]}"`,
want: 1,
}, },
// 1 demanded, min at 1 // 1 demanded, min at 1
{ {
repo: "test/valid", repo: "test/valid",
min: intPtr(1), min: intPtr(1),
max: intPtr(3), max: intPtr(3),
workflowRuns: `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"completed"}]}"`, workflowRuns: `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"completed"}]}"`,
want: 1, workflowRuns_queued: `{"total_count": 0, "workflow_runs":[]}"`,
workflowRuns_in_progress: `{"total_count": 1, "workflow_runs":[{"status":"in_progress"}]}"`,
want: 1,
}, },
// fixed at 3 // fixed at 3
{ {
repo: "test/valid", repo: "test/valid",
min: intPtr(1), min: intPtr(1),
max: intPtr(3), max: intPtr(3),
fixed: intPtr(3), fixed: intPtr(3),
workflowRuns: `{"total_count": 4, "workflow_runs":[{"status":"in_progress"}, {"status":"in_progress"}, {"status":"in_progress"}, {"status":"completed"}]}"`, workflowRuns: `{"total_count": 4, "workflow_runs":[{"status":"in_progress"}, {"status":"in_progress"}, {"status":"in_progress"}, {"status":"completed"}]}"`,
want: 3, workflowRuns_queued: `{"total_count": 0, "workflow_runs":[]}"`,
workflowRuns_in_progress: `{"total_count": 3, "workflow_runs":[{"status":"in_progress"}, {"status":"in_progress"}, {"status":"in_progress"}]}"`,
want: 3,
}, },
// Job-level autoscaling // Job-level autoscaling
// 5 requested from 3 workflows // 5 requested from 3 workflows
{ {
repo: "test/valid", repo: "test/valid",
min: intPtr(2), min: intPtr(2),
max: intPtr(10), max: intPtr(10),
workflowRuns: `{"total_count": 4, "workflow_runs":[{"id": 1, "status":"queued"}, {"id": 2, "status":"in_progress"}, {"id": 3, "status":"in_progress"}, {"status":"completed"}]}"`, workflowRuns: `{"total_count": 4, "workflow_runs":[{"id": 1, "status":"queued"}, {"id": 2, "status":"in_progress"}, {"id": 3, "status":"in_progress"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"id": 1, "status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 2, "workflow_runs":[{"id": 2, "status":"in_progress"}, {"id": 3, "status":"in_progress"}]}"`,
workflowJobs: map[int]string{ workflowJobs: map[int]string{
1: `{"jobs": [{"status":"queued"}, {"status":"queued"}]}`, 1: `{"jobs": [{"status":"queued"}, {"status":"queued"}]}`,
2: `{"jobs": [{"status": "in_progress"}, {"status":"completed"}]}`, 2: `{"jobs": [{"status": "in_progress"}, {"status":"completed"}]}`,
@@ -154,7 +181,11 @@ func TestDetermineDesiredReplicas_RepositoryRunner(t *testing.T) {
_ = v1alpha1.AddToScheme(scheme) _ = v1alpha1.AddToScheme(scheme)
t.Run(fmt.Sprintf("case %d", i), func(t *testing.T) { t.Run(fmt.Sprintf("case %d", i), func(t *testing.T) {
server := fake.NewServer(fake.WithListRepositoryWorkflowRunsResponse(200, tc.workflowRuns), fake.WithListWorkflowJobsResponse(200, tc.workflowJobs)) server := fake.NewServer(
fake.WithListRepositoryWorkflowRunsResponse(200, tc.workflowRuns, tc.workflowRuns_queued, tc.workflowRuns_in_progress),
fake.WithListWorkflowJobsResponse(200, tc.workflowJobs),
fake.WithListRunnersResponse(200, fake.RunnersListBody),
)
defer server.Close() defer server.Close()
client := newGithubClient(server) client := newGithubClient(server)
@@ -178,7 +209,7 @@ func TestDetermineDesiredReplicas_RepositoryRunner(t *testing.T) {
Replicas: tc.fixed, Replicas: tc.fixed,
}, },
Status: v1alpha1.RunnerDeploymentStatus{ Status: v1alpha1.RunnerDeploymentStatus{
Replicas: tc.sReplicas, DesiredReplicas: tc.sReplicas,
}, },
} }
@@ -193,7 +224,12 @@ func TestDetermineDesiredReplicas_RepositoryRunner(t *testing.T) {
}, },
} }
got, err := h.computeReplicas(rd, hra) minReplicas, _, _, err := h.getMinReplicas(log, metav1Now.Time, hra)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
got, _, _, err := h.computeReplicasWithCache(log, metav1Now.Time, rd, hra, minReplicas)
if err != nil { if err != nil {
if tc.err == "" { if tc.err == "" {
t.Fatalf("unexpected error: expected none, got %v", err) t.Fatalf("unexpected error: expected none, got %v", err)
@@ -203,12 +239,8 @@ func TestDetermineDesiredReplicas_RepositoryRunner(t *testing.T) {
return return
} }
if got == nil { if got != tc.want {
t.Fatalf("unexpected value of rs.Spec.Replicas: nil") t.Errorf("%d: incorrect desired replicas: want %d, got %d", i, tc.want, got)
}
if *got != tc.want {
t.Errorf("%d: incorrect desired replicas: want %d, got %d", i, tc.want, *got)
} }
}) })
} }
@@ -221,129 +253,157 @@ func TestDetermineDesiredReplicas_OrganizationalRunner(t *testing.T) {
metav1Now := metav1.Now() metav1Now := metav1.Now()
testcases := []struct { testcases := []struct {
repos []string repos []string
org string org string
fixed *int fixed *int
max *int max *int
min *int min *int
sReplicas *int sReplicas *int
sTime *metav1.Time sTime *metav1.Time
workflowRuns string
workflowRuns string
workflowRuns_queued string
workflowRuns_in_progress string
workflowJobs map[int]string workflowJobs map[int]string
want int want int
err string err string
}{ }{
// 3 demanded, max at 3 // 3 demanded, max at 3
{ {
org: "test", org: "test",
repos: []string{"valid"}, repos: []string{"valid"},
min: intPtr(2), min: intPtr(2),
max: intPtr(3), max: intPtr(3),
workflowRuns: `{"total_count": 4, "workflow_runs":[{"status":"queued"}, {"status":"in_progress"}, {"status":"in_progress"}, {"status":"completed"}]}"`, workflowRuns: `{"total_count": 4, "workflow_runs":[{"status":"queued"}, {"status":"in_progress"}, {"status":"in_progress"}, {"status":"completed"}]}"`,
want: 3, workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"in_progress"}]}"`,
want: 3,
}, },
// 2 demanded, max at 3, currently 3, delay scaling down due to grace period // 2 demanded, max at 3, currently 3, delay scaling down due to grace period
{ {
org: "test", org: "test",
repos: []string{"valid"}, repos: []string{"valid"},
min: intPtr(2), min: intPtr(2),
max: intPtr(3), max: intPtr(3),
sReplicas: intPtr(3), sReplicas: intPtr(3),
sTime: &metav1Now, sTime: &metav1Now,
workflowRuns: `{"total_count": 4, "workflow_runs":[{"status":"queued"}, {"status":"in_progress"}, {"status":"completed"}]}"`, workflowRuns: `{"total_count": 4, "workflow_runs":[{"status":"queued"}, {"status":"in_progress"}, {"status":"completed"}]}"`,
want: 3, workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 1, "workflow_runs":[{"status":"in_progress"}]}"`,
want: 3,
}, },
// 3 demanded, max at 2 // 3 demanded, max at 2
{ {
org: "test", org: "test",
repos: []string{"valid"}, repos: []string{"valid"},
min: intPtr(2), min: intPtr(2),
max: intPtr(2), max: intPtr(2),
workflowRuns: `{"total_count": 4, "workflow_runs":[{"status":"queued"}, {"status":"in_progress"}, {"status":"in_progress"}, {"status":"completed"}]}"`, workflowRuns: `{"total_count": 4, "workflow_runs":[{"status":"queued"}, {"status":"in_progress"}, {"status":"in_progress"}, {"status":"completed"}]}"`,
want: 2, workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"in_progress"}]}"`,
want: 2,
}, },
// 2 demanded, min at 2 // 2 demanded, min at 2
{ {
org: "test", org: "test",
repos: []string{"valid"}, repos: []string{"valid"},
min: intPtr(2), min: intPtr(2),
max: intPtr(3), max: intPtr(3),
workflowRuns: `{"total_count": 3, "workflow_runs":[{"status":"queued"}, {"status":"in_progress"}, {"status":"completed"}]}"`, workflowRuns: `{"total_count": 3, "workflow_runs":[{"status":"queued"}, {"status":"in_progress"}, {"status":"completed"}]}"`,
want: 2, workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 1, "workflow_runs":[{"status":"in_progress"}]}"`,
want: 2,
}, },
// 1 demanded, min at 2 // 1 demanded, min at 2
{ {
org: "test", org: "test",
repos: []string{"valid"}, repos: []string{"valid"},
min: intPtr(2), min: intPtr(2),
max: intPtr(3), max: intPtr(3),
workflowRuns: `{"total_count": 2, "workflow_runs":[{"status":"queued"}, {"status":"completed"}]}"`, workflowRuns: `{"total_count": 2, "workflow_runs":[{"status":"queued"}, {"status":"completed"}]}"`,
want: 2, workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 0, "workflow_runs":[]}"`,
want: 2,
}, },
// 1 demanded, min at 2 // 1 demanded, min at 2
{ {
org: "test", org: "test",
repos: []string{"valid"}, repos: []string{"valid"},
min: intPtr(2), min: intPtr(2),
max: intPtr(3), max: intPtr(3),
workflowRuns: `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"completed"}]}"`, workflowRuns: `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"completed"}]}"`,
want: 2, workflowRuns_queued: `{"total_count": 0, "workflow_runs":[]}"`,
workflowRuns_in_progress: `{"total_count": 1, "workflow_runs":[{"status":"in_progress"}]}"`,
want: 2,
}, },
// 1 demanded, min at 1 // 1 demanded, min at 1
{ {
org: "test", org: "test",
repos: []string{"valid"}, repos: []string{"valid"},
min: intPtr(1), min: intPtr(1),
max: intPtr(3), max: intPtr(3),
workflowRuns: `{"total_count": 2, "workflow_runs":[{"status":"queued"}, {"status":"completed"}]}"`, workflowRuns: `{"total_count": 2, "workflow_runs":[{"status":"queued"}, {"status":"completed"}]}"`,
want: 1, workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 0, "workflow_runs":[]}"`,
want: 1,
}, },
// 1 demanded, min at 1 // 1 demanded, min at 1
{ {
org: "test", org: "test",
repos: []string{"valid"}, repos: []string{"valid"},
min: intPtr(1), min: intPtr(1),
max: intPtr(3), max: intPtr(3),
workflowRuns: `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"completed"}]}"`, workflowRuns: `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"completed"}]}"`,
want: 1, workflowRuns_queued: `{"total_count": 0, "workflow_runs":[]}"`,
workflowRuns_in_progress: `{"total_count": 1, "workflow_runs":[{"status":"in_progress"}]}"`,
want: 1,
}, },
// fixed at 3 // fixed at 3
{ {
org: "test", org: "test",
repos: []string{"valid"}, repos: []string{"valid"},
fixed: intPtr(1), fixed: intPtr(1),
min: intPtr(1), min: intPtr(1),
max: intPtr(3), max: intPtr(3),
workflowRuns: `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"in_progress"}, {"status":"in_progress"}, {"status":"completed"}]}"`, workflowRuns: `{"total_count": 4, "workflow_runs":[{"status":"in_progress"}, {"status":"in_progress"}, {"status":"in_progress"}, {"status":"completed"}]}"`,
want: 3, workflowRuns_queued: `{"total_count": 0, "workflow_runs":[]}"`,
workflowRuns_in_progress: `{"total_count": 3, "workflow_runs":[{"status":"in_progress"},{"status":"in_progress"},{"status":"in_progress"}]}"`,
want: 3,
}, },
// org runner, fixed at 3 // org runner, fixed at 3
{ {
org: "test", org: "test",
repos: []string{"valid"}, repos: []string{"valid"},
fixed: intPtr(1), fixed: intPtr(1),
min: intPtr(1), min: intPtr(1),
max: intPtr(3), max: intPtr(3),
workflowRuns: `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"in_progress"}, {"status":"in_progress"}, {"status":"completed"}]}"`, workflowRuns: `{"total_count": 4, "workflow_runs":[{"status":"in_progress"}, {"status":"in_progress"}, {"status":"in_progress"}, {"status":"completed"}]}"`,
want: 3, workflowRuns_queued: `{"total_count": 0, "workflow_runs":[]}"`,
workflowRuns_in_progress: `{"total_count": 3, "workflow_runs":[{"status":"in_progress"},{"status":"in_progress"},{"status":"in_progress"}]}"`,
want: 3,
}, },
// org runner, 1 demanded, min at 1, no repos // org runner, 1 demanded, min at 1, no repos
{ {
org: "test", org: "test",
min: intPtr(1), min: intPtr(1),
max: intPtr(3), max: intPtr(3),
workflowRuns: `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"completed"}]}"`, workflowRuns: `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"completed"}]}"`,
err: "validating autoscaling metrics: spec.autoscaling.metrics[].repositoryNames is required and must have one more more entries for organizational runner deployment", workflowRuns_queued: `{"total_count": 0, "workflow_runs":[]}"`,
workflowRuns_in_progress: `{"total_count": 1, "workflow_runs":[{"status":"in_progress"}]}"`,
err: "validating autoscaling metrics: spec.autoscaling.metrics[].repositoryNames is required and must have one more more entries for organizational runner deployment",
}, },
// Job-level autoscaling // Job-level autoscaling
// 5 requested from 3 workflows // 5 requested from 3 workflows
{ {
org: "test", org: "test",
repos: []string{"valid"}, repos: []string{"valid"},
min: intPtr(2), min: intPtr(2),
max: intPtr(10), max: intPtr(10),
workflowRuns: `{"total_count": 4, "workflow_runs":[{"id": 1, "status":"queued"}, {"id": 2, "status":"in_progress"}, {"id": 3, "status":"in_progress"}, {"status":"completed"}]}"`, workflowRuns: `{"total_count": 4, "workflow_runs":[{"id": 1, "status":"queued"}, {"id": 2, "status":"in_progress"}, {"id": 3, "status":"in_progress"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"id": 1, "status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 2, "workflow_runs":[{"id": 2, "status":"in_progress"}, {"id": 3, "status":"in_progress"}, {"status":"completed"}]}"`,
workflowJobs: map[int]string{ workflowJobs: map[int]string{
1: `{"jobs": [{"status":"queued"}, {"status":"queued"}]}`, 1: `{"jobs": [{"status":"queued"}, {"status":"queued"}]}`,
2: `{"jobs": [{"status": "in_progress"}, {"status":"completed"}]}`, 2: `{"jobs": [{"status": "in_progress"}, {"status":"completed"}]}`,
@@ -365,7 +425,13 @@ func TestDetermineDesiredReplicas_OrganizationalRunner(t *testing.T) {
_ = v1alpha1.AddToScheme(scheme) _ = v1alpha1.AddToScheme(scheme)
t.Run(fmt.Sprintf("case %d", i), func(t *testing.T) { t.Run(fmt.Sprintf("case %d", i), func(t *testing.T) {
server := fake.NewServer(fake.WithListRepositoryWorkflowRunsResponse(200, tc.workflowRuns), fake.WithListWorkflowJobsResponse(200, tc.workflowJobs)) t.Helper()
server := fake.NewServer(
fake.WithListRepositoryWorkflowRunsResponse(200, tc.workflowRuns, tc.workflowRuns_queued, tc.workflowRuns_in_progress),
fake.WithListWorkflowJobsResponse(200, tc.workflowJobs),
fake.WithListRunnersResponse(200, fake.RunnersListBody),
)
defer server.Close() defer server.Close()
client := newGithubClient(server) client := newGithubClient(server)
@@ -380,7 +446,17 @@ func TestDetermineDesiredReplicas_OrganizationalRunner(t *testing.T) {
Name: "testrd", Name: "testrd",
}, },
Spec: v1alpha1.RunnerDeploymentSpec{ Spec: v1alpha1.RunnerDeploymentSpec{
Selector: &metav1.LabelSelector{
MatchLabels: map[string]string{
"foo": "bar",
},
},
Template: v1alpha1.RunnerTemplate{ Template: v1alpha1.RunnerTemplate{
ObjectMeta: metav1.ObjectMeta{
Labels: map[string]string{
"foo": "bar",
},
},
Spec: v1alpha1.RunnerSpec{ Spec: v1alpha1.RunnerSpec{
Organization: tc.org, Organization: tc.org,
}, },
@@ -388,7 +464,7 @@ func TestDetermineDesiredReplicas_OrganizationalRunner(t *testing.T) {
Replicas: tc.fixed, Replicas: tc.fixed,
}, },
Status: v1alpha1.RunnerDeploymentStatus{ Status: v1alpha1.RunnerDeploymentStatus{
Replicas: tc.sReplicas, DesiredReplicas: tc.sReplicas,
}, },
} }
@@ -412,7 +488,12 @@ func TestDetermineDesiredReplicas_OrganizationalRunner(t *testing.T) {
}, },
} }
got, err := h.computeReplicas(rd, hra) minReplicas, _, _, err := h.getMinReplicas(log, metav1Now.Time, hra)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
got, _, _, err := h.computeReplicasWithCache(log, metav1Now.Time, rd, hra, minReplicas)
if err != nil { if err != nil {
if tc.err == "" { if tc.err == "" {
t.Fatalf("unexpected error: expected none, got %v", err) t.Fatalf("unexpected error: expected none, got %v", err)
@@ -422,12 +503,8 @@ func TestDetermineDesiredReplicas_OrganizationalRunner(t *testing.T) {
return return
} }
if got == nil { if got != tc.want {
t.Fatalf("unexpected value of rs.Spec.Replicas: nil, wanted %v", tc.want) t.Errorf("%d: incorrect desired replicas: want %d, got %d", i, tc.want, got)
}
if *got != tc.want {
t.Errorf("%d: incorrect desired replicas: want %d, got %d", i, tc.want, *got)
} }
}) })
} }

View File

@@ -0,0 +1,468 @@
/*
Copyright 2020 The actions-runner-controller authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package controllers
import (
"context"
"fmt"
"io/ioutil"
"net/http"
"strings"
"time"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/types"
"sigs.k8s.io/controller-runtime/pkg/reconcile"
"github.com/go-logr/logr"
gogithub "github.com/google/go-github/v33/github"
"k8s.io/apimachinery/pkg/runtime"
"k8s.io/client-go/tools/record"
ctrl "sigs.k8s.io/controller-runtime"
"sigs.k8s.io/controller-runtime/pkg/client"
"github.com/summerwind/actions-runner-controller/api/v1alpha1"
)
const (
scaleTargetKey = "scaleTarget"
)
// HorizontalRunnerAutoscalerGitHubWebhook autoscales a HorizontalRunnerAutoscaler and the RunnerDeployment on each
// GitHub Webhook received
type HorizontalRunnerAutoscalerGitHubWebhook struct {
client.Client
Log logr.Logger
Recorder record.EventRecorder
Scheme *runtime.Scheme
// SecretKeyBytes is the byte representation of the Webhook secret token
// the administrator is generated and specified in GitHub Web UI.
SecretKeyBytes []byte
// Namespace is the namespace to watch for HorizontalRunnerAutoscaler's to be
// scaled on Webhook.
// Set to empty for letting it watch for all namespaces.
Namespace string
Name string
}
func (autoscaler *HorizontalRunnerAutoscalerGitHubWebhook) Reconcile(request reconcile.Request) (reconcile.Result, error) {
return ctrl.Result{}, nil
}
// +kubebuilder:rbac:groups=actions.summerwind.dev,resources=horizontalrunnerautoscalers,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=actions.summerwind.dev,resources=horizontalrunnerautoscalers/finalizers,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=actions.summerwind.dev,resources=horizontalrunnerautoscalers/status,verbs=get;update;patch
// +kubebuilder:rbac:groups=core,resources=events,verbs=create;patch
func (autoscaler *HorizontalRunnerAutoscalerGitHubWebhook) Handle(w http.ResponseWriter, r *http.Request) {
var (
ok bool
err error
)
defer func() {
if !ok {
w.WriteHeader(http.StatusInternalServerError)
if err != nil {
msg := err.Error()
if written, err := w.Write([]byte(msg)); err != nil {
autoscaler.Log.Error(err, "failed writing http error response", "msg", msg, "written", written)
}
}
}
}()
defer func() {
if r.Body != nil {
r.Body.Close()
}
}()
// respond ok to GET / e.g. for health check
if r.Method == http.MethodGet {
fmt.Fprintln(w, "webhook server is running")
return
}
var payload []byte
if len(autoscaler.SecretKeyBytes) > 0 {
payload, err = gogithub.ValidatePayload(r, autoscaler.SecretKeyBytes)
if err != nil {
autoscaler.Log.Error(err, "error validating request body")
return
}
} else {
payload, err = ioutil.ReadAll(r.Body)
if err != nil {
autoscaler.Log.Error(err, "error reading request body")
return
}
}
webhookType := gogithub.WebHookType(r)
event, err := gogithub.ParseWebHook(webhookType, payload)
if err != nil {
var s string
if payload != nil {
s = string(payload)
}
autoscaler.Log.Error(err, "could not parse webhook", "webhookType", webhookType, "payload", s)
return
}
var target *ScaleTarget
log := autoscaler.Log.WithValues(
"event", webhookType,
"hookID", r.Header.Get("X-GitHub-Hook-ID"),
"delivery", r.Header.Get("X-GitHub-Delivery"),
)
switch e := event.(type) {
case *gogithub.PushEvent:
target, err = autoscaler.getScaleUpTarget(
context.TODO(),
log,
e.Repo.GetName(),
e.Repo.Owner.GetLogin(),
e.Repo.Owner.GetType(),
autoscaler.MatchPushEvent(e),
)
case *gogithub.PullRequestEvent:
target, err = autoscaler.getScaleUpTarget(
context.TODO(),
log,
e.Repo.GetName(),
e.Repo.Owner.GetLogin(),
e.Repo.Owner.GetType(),
autoscaler.MatchPullRequestEvent(e),
)
if pullRequest := e.PullRequest; pullRequest != nil {
log = log.WithValues(
"pullRequest.base.ref", e.PullRequest.Base.GetRef(),
"action", e.GetAction(),
)
}
case *gogithub.CheckRunEvent:
target, err = autoscaler.getScaleUpTarget(
context.TODO(),
log,
e.Repo.GetName(),
e.Repo.Owner.GetLogin(),
e.Repo.Owner.GetType(),
autoscaler.MatchCheckRunEvent(e),
)
if checkRun := e.GetCheckRun(); checkRun != nil {
log = log.WithValues(
"checkRun.status", checkRun.GetStatus(),
"action", e.GetAction(),
)
}
case *gogithub.PingEvent:
ok = true
w.WriteHeader(http.StatusOK)
msg := "pong"
if written, err := w.Write([]byte(msg)); err != nil {
log.Error(err, "failed writing http response", "msg", msg, "written", written)
}
log.Info("received ping event")
return
default:
log.Info("unknown event type", "eventType", webhookType)
return
}
if err != nil {
log.Error(err, "handling check_run event")
return
}
if target == nil {
log.Info(
"Scale target not found. If this is unexpected, ensure that there is exactly one repository-wide or organizational runner deployment that matches this webhook event",
)
msg := "no horizontalrunnerautoscaler to scale for this github event"
ok = true
w.WriteHeader(http.StatusOK)
if written, err := w.Write([]byte(msg)); err != nil {
log.Error(err, "failed writing http response", "msg", msg, "written", written)
}
return
}
if err := autoscaler.tryScaleUp(context.TODO(), target); err != nil {
log.Error(err, "could not scale up")
return
}
ok = true
w.WriteHeader(http.StatusOK)
msg := fmt.Sprintf("scaled %s by 1", target.Name)
autoscaler.Log.Info(msg)
if written, err := w.Write([]byte(msg)); err != nil {
log.Error(err, "failed writing http response", "msg", msg, "written", written)
}
}
func (autoscaler *HorizontalRunnerAutoscalerGitHubWebhook) findHRAsByKey(ctx context.Context, value string) ([]v1alpha1.HorizontalRunnerAutoscaler, error) {
ns := autoscaler.Namespace
var defaultListOpts []client.ListOption
if ns != "" {
defaultListOpts = append(defaultListOpts, client.InNamespace(ns))
}
var hras []v1alpha1.HorizontalRunnerAutoscaler
if value != "" {
opts := append([]client.ListOption{}, defaultListOpts...)
opts = append(opts, client.MatchingFields{scaleTargetKey: value})
if autoscaler.Namespace != "" {
opts = append(opts, client.InNamespace(autoscaler.Namespace))
}
var hraList v1alpha1.HorizontalRunnerAutoscalerList
if err := autoscaler.List(ctx, &hraList, opts...); err != nil {
return nil, err
}
for _, d := range hraList.Items {
hras = append(hras, d)
}
}
return hras, nil
}
func matchTriggerConditionAgainstEvent(types []string, eventAction *string) bool {
if len(types) == 0 {
return true
}
if eventAction == nil {
return false
}
for _, tpe := range types {
if tpe == *eventAction {
return true
}
}
return false
}
type ScaleTarget struct {
v1alpha1.HorizontalRunnerAutoscaler
v1alpha1.ScaleUpTrigger
}
func (autoscaler *HorizontalRunnerAutoscalerGitHubWebhook) searchScaleTargets(hras []v1alpha1.HorizontalRunnerAutoscaler, f func(v1alpha1.ScaleUpTrigger) bool) []ScaleTarget {
var matched []ScaleTarget
for _, hra := range hras {
if !hra.ObjectMeta.DeletionTimestamp.IsZero() {
continue
}
for _, scaleUpTrigger := range hra.Spec.ScaleUpTriggers {
if !f(scaleUpTrigger) {
continue
}
matched = append(matched, ScaleTarget{
HorizontalRunnerAutoscaler: hra,
ScaleUpTrigger: scaleUpTrigger,
})
}
}
return matched
}
func (autoscaler *HorizontalRunnerAutoscalerGitHubWebhook) getScaleTarget(ctx context.Context, name string, f func(v1alpha1.ScaleUpTrigger) bool) (*ScaleTarget, error) {
hras, err := autoscaler.findHRAsByKey(ctx, name)
if err != nil {
return nil, err
}
autoscaler.Log.V(1).Info(fmt.Sprintf("Found %d HRAs by key", len(hras)), "key", name)
targets := autoscaler.searchScaleTargets(hras, f)
n := len(targets)
if n == 0 {
return nil, nil
}
if n > 1 {
var scaleTargetIDs []string
for _, t := range targets {
scaleTargetIDs = append(scaleTargetIDs, t.HorizontalRunnerAutoscaler.Name)
}
autoscaler.Log.Info(
"Found too many scale targets: "+
"It must be exactly one to avoid ambiguity. "+
"Either set Namespace for the webhook-based autoscaler to let it only find HRAs in the namespace, "+
"or update Repository or Organization fields in your RunnerDeployment resources to fix the ambiguity.",
"scaleTargets", strings.Join(scaleTargetIDs, ","))
return nil, nil
}
return &targets[0], nil
}
func (autoscaler *HorizontalRunnerAutoscalerGitHubWebhook) getScaleUpTarget(ctx context.Context, log logr.Logger, repo, owner, ownerType string, f func(v1alpha1.ScaleUpTrigger) bool) (*ScaleTarget, error) {
repositoryRunnerKey := owner + "/" + repo
if target, err := autoscaler.getScaleTarget(ctx, repositoryRunnerKey, f); err != nil {
log.Info("finding repository-wide runner", "repository", repositoryRunnerKey)
return nil, err
} else if target != nil {
log.Info("scale up target is repository-wide runners", "repository", repo)
return target, nil
}
if ownerType == "User" {
log.V(1).Info("no repository runner found", "organization", owner)
return nil, nil
}
if target, err := autoscaler.getScaleTarget(ctx, owner, f); err != nil {
log.Info("finding organizational runner", "organization", owner)
return nil, err
} else if target != nil {
log.Info("scale up target is organizational runners", "organization", owner)
return target, nil
} else {
log.V(1).Info("no repository runner or organizational runner found",
"repository", repositoryRunnerKey,
"organization", owner,
)
}
return nil, nil
}
func (autoscaler *HorizontalRunnerAutoscalerGitHubWebhook) tryScaleUp(ctx context.Context, target *ScaleTarget) error {
if target == nil {
return nil
}
copy := target.HorizontalRunnerAutoscaler.DeepCopy()
amount := 1
if target.ScaleUpTrigger.Amount > 0 {
amount = target.ScaleUpTrigger.Amount
}
capacityReservations := getValidCapacityReservations(copy)
copy.Spec.CapacityReservations = append(capacityReservations, v1alpha1.CapacityReservation{
ExpirationTime: metav1.Time{Time: time.Now().Add(target.ScaleUpTrigger.Duration.Duration)},
Replicas: amount,
})
if err := autoscaler.Client.Patch(ctx, copy, client.MergeFrom(&target.HorizontalRunnerAutoscaler)); err != nil {
return fmt.Errorf("patching horizontalrunnerautoscaler to add capacity reservation: %w", err)
}
return nil
}
func getValidCapacityReservations(autoscaler *v1alpha1.HorizontalRunnerAutoscaler) []v1alpha1.CapacityReservation {
var capacityReservations []v1alpha1.CapacityReservation
now := time.Now()
for _, reservation := range autoscaler.Spec.CapacityReservations {
if reservation.ExpirationTime.Time.After(now) {
capacityReservations = append(capacityReservations, reservation)
}
}
return capacityReservations
}
func (autoscaler *HorizontalRunnerAutoscalerGitHubWebhook) SetupWithManager(mgr ctrl.Manager) error {
name := "webhookbasedautoscaler"
if autoscaler.Name != "" {
name = autoscaler.Name
}
autoscaler.Recorder = mgr.GetEventRecorderFor(name)
if err := mgr.GetFieldIndexer().IndexField(&v1alpha1.HorizontalRunnerAutoscaler{}, scaleTargetKey, func(rawObj runtime.Object) []string {
hra := rawObj.(*v1alpha1.HorizontalRunnerAutoscaler)
if hra.Spec.ScaleTargetRef.Name == "" {
return nil
}
var rd v1alpha1.RunnerDeployment
if err := autoscaler.Client.Get(context.Background(), types.NamespacedName{Namespace: hra.Namespace, Name: hra.Spec.ScaleTargetRef.Name}, &rd); err != nil {
return nil
}
return []string{rd.Spec.Template.Spec.Repository, rd.Spec.Template.Spec.Organization}
}); err != nil {
return err
}
return ctrl.NewControllerManagedBy(mgr).
For(&v1alpha1.HorizontalRunnerAutoscaler{}).
Named(name).
Complete(autoscaler)
}

View File

@@ -0,0 +1,43 @@
package controllers
import (
"github.com/google/go-github/v33/github"
"github.com/summerwind/actions-runner-controller/api/v1alpha1"
"github.com/summerwind/actions-runner-controller/pkg/actionsglob"
)
func (autoscaler *HorizontalRunnerAutoscalerGitHubWebhook) MatchCheckRunEvent(event *github.CheckRunEvent) func(scaleUpTrigger v1alpha1.ScaleUpTrigger) bool {
return func(scaleUpTrigger v1alpha1.ScaleUpTrigger) bool {
g := scaleUpTrigger.GitHubEvent
if g == nil {
return false
}
cr := g.CheckRun
if cr == nil {
return false
}
if !matchTriggerConditionAgainstEvent(cr.Types, event.Action) {
return false
}
if cr.Status != "" && (event.CheckRun == nil || event.CheckRun.Status == nil || *event.CheckRun.Status != cr.Status) {
return false
}
if checkRun := event.CheckRun; checkRun != nil && len(cr.Names) > 0 {
for _, pat := range cr.Names {
if r := actionsglob.Match(pat, checkRun.GetName()); r {
return true
}
}
return false
}
return true
}
}

View File

@@ -0,0 +1,32 @@
package controllers
import (
"github.com/google/go-github/v33/github"
"github.com/summerwind/actions-runner-controller/api/v1alpha1"
)
func (autoscaler *HorizontalRunnerAutoscalerGitHubWebhook) MatchPullRequestEvent(event *github.PullRequestEvent) func(scaleUpTrigger v1alpha1.ScaleUpTrigger) bool {
return func(scaleUpTrigger v1alpha1.ScaleUpTrigger) bool {
g := scaleUpTrigger.GitHubEvent
if g == nil {
return false
}
pr := g.PullRequest
if pr == nil {
return false
}
if !matchTriggerConditionAgainstEvent(pr.Types, event.Action) {
return false
}
if !matchTriggerConditionAgainstEvent(pr.Branches, event.PullRequest.Base.Ref) {
return false
}
return true
}
}

View File

@@ -0,0 +1,24 @@
package controllers
import (
"github.com/google/go-github/v33/github"
"github.com/summerwind/actions-runner-controller/api/v1alpha1"
)
func (autoscaler *HorizontalRunnerAutoscalerGitHubWebhook) MatchPushEvent(event *github.PushEvent) func(scaleUpTrigger v1alpha1.ScaleUpTrigger) bool {
return func(scaleUpTrigger v1alpha1.ScaleUpTrigger) bool {
g := scaleUpTrigger.GitHubEvent
if g == nil {
return false
}
push := g.Push
if push == nil {
return false
}
return true
}
}

View File

@@ -0,0 +1,314 @@
package controllers
import (
"bytes"
"encoding/json"
"fmt"
"github.com/go-logr/logr"
"github.com/google/go-github/v33/github"
actionsv1alpha1 "github.com/summerwind/actions-runner-controller/api/v1alpha1"
"io"
"io/ioutil"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/runtime"
clientgoscheme "k8s.io/client-go/kubernetes/scheme"
"net/http"
"net/http/httptest"
"net/url"
"os"
"sigs.k8s.io/controller-runtime/pkg/client/fake"
"testing"
"time"
)
var (
sc = runtime.NewScheme()
)
func init() {
_ = clientgoscheme.AddToScheme(sc)
_ = actionsv1alpha1.AddToScheme(sc)
}
func TestOrgWebhookCheckRun(t *testing.T) {
f, err := os.Open("testdata/org_webhook_check_run_payload.json")
if err != nil {
t.Fatalf("could not open the fixture: %s", err)
}
defer f.Close()
var e github.CheckRunEvent
if err := json.NewDecoder(f).Decode(&e); err != nil {
t.Fatalf("invalid json: %s", err)
}
testServer(t,
"check_run",
&e,
200,
"no horizontalrunnerautoscaler to scale for this github event",
)
}
func TestRepoWebhookCheckRun(t *testing.T) {
f, err := os.Open("testdata/repo_webhook_check_run_payload.json")
if err != nil {
t.Fatalf("could not open the fixture: %s", err)
}
defer f.Close()
var e github.CheckRunEvent
if err := json.NewDecoder(f).Decode(&e); err != nil {
t.Fatalf("invalid json: %s", err)
}
testServer(t,
"check_run",
&e,
200,
"no horizontalrunnerautoscaler to scale for this github event",
)
}
func TestWebhookPullRequest(t *testing.T) {
testServer(t,
"pull_request",
&github.PullRequestEvent{
PullRequest: &github.PullRequest{
Base: &github.PullRequestBranch{
Ref: github.String("main"),
},
},
Repo: &github.Repository{
Name: github.String("myorg/myrepo"),
Organization: &github.Organization{
Name: github.String("myorg"),
},
},
Action: github.String("created"),
},
200,
"no horizontalrunnerautoscaler to scale for this github event",
)
}
func TestWebhookPush(t *testing.T) {
testServer(t,
"push",
&github.PushEvent{
Repo: &github.PushEventRepository{
Name: github.String("myrepo"),
Organization: github.String("myorg"),
},
},
200,
"no horizontalrunnerautoscaler to scale for this github event",
)
}
func TestWebhookPing(t *testing.T) {
testServer(t,
"ping",
&github.PingEvent{
Zen: github.String("zen"),
},
200,
"pong",
)
}
func TestGetRequest(t *testing.T) {
hra := HorizontalRunnerAutoscalerGitHubWebhook{}
request, _ := http.NewRequest(http.MethodGet, "/", nil)
recorder := httptest.ResponseRecorder{}
hra.Handle(&recorder, request)
response := recorder.Result()
if response.StatusCode != http.StatusOK {
t.Errorf("want %d, got %d", http.StatusOK, response.StatusCode)
}
}
func TestGetValidCapacityReservations(t *testing.T) {
now := time.Now()
hra := &actionsv1alpha1.HorizontalRunnerAutoscaler{
Spec: actionsv1alpha1.HorizontalRunnerAutoscalerSpec{
CapacityReservations: []actionsv1alpha1.CapacityReservation{
{
ExpirationTime: metav1.Time{Time: now.Add(-time.Second)},
Replicas: 1,
},
{
ExpirationTime: metav1.Time{Time: now},
Replicas: 2,
},
{
ExpirationTime: metav1.Time{Time: now.Add(time.Second)},
Replicas: 3,
},
},
},
}
revs := getValidCapacityReservations(hra)
var count int
for _, r := range revs {
count += r.Replicas
}
want := 3
if count != want {
t.Errorf("want %d, got %d", want, count)
}
}
func installTestLogger(webhook *HorizontalRunnerAutoscalerGitHubWebhook) *bytes.Buffer {
logs := &bytes.Buffer{}
log := testLogger{
name: "testlog",
writer: logs,
}
webhook.Log = &log
return logs
}
func testServer(t *testing.T, eventType string, event interface{}, wantCode int, wantBody string) {
t.Helper()
hraWebhook := &HorizontalRunnerAutoscalerGitHubWebhook{}
var initObjs []runtime.Object
client := fake.NewFakeClientWithScheme(sc, initObjs...)
logs := installTestLogger(hraWebhook)
defer func() {
if t.Failed() {
t.Logf("diagnostics: %s", logs.String())
}
}()
hraWebhook.Client = client
mux := http.NewServeMux()
mux.HandleFunc("/", hraWebhook.Handle)
server := httptest.NewServer(mux)
defer server.Close()
resp, err := sendWebhook(server, eventType, event)
if err != nil {
t.Fatal(err)
}
defer func() {
if resp != nil {
resp.Body.Close()
}
}()
if resp.StatusCode != wantCode {
t.Error("status:", resp.StatusCode)
}
respBody, err := ioutil.ReadAll(resp.Body)
if err != nil {
t.Fatal(err)
}
if string(respBody) != wantBody {
t.Fatal("body:", string(respBody))
}
}
func sendWebhook(server *httptest.Server, eventType string, event interface{}) (*http.Response, error) {
jsonBuf := &bytes.Buffer{}
enc := json.NewEncoder(jsonBuf)
enc.SetIndent(" ", "")
err := enc.Encode(event)
if err != nil {
return nil, fmt.Errorf("[bug in test] encoding event to json: %+v", err)
}
reqBody := jsonBuf.Bytes()
u, err := url.Parse(server.URL)
if err != nil {
return nil, fmt.Errorf("parsing server url: %v", err)
}
req := &http.Request{
Method: http.MethodPost,
URL: u,
Header: map[string][]string{
"X-GitHub-Event": {eventType},
"Content-Type": {"application/json"},
},
Body: ioutil.NopCloser(bytes.NewBuffer(reqBody)),
}
return http.DefaultClient.Do(req)
}
// testLogger is a sample logr.Logger that logs in-memory.
// It's only for testing log outputs.
type testLogger struct {
name string
keyValues map[string]interface{}
writer io.Writer
}
var _ logr.Logger = &testLogger{}
func (l *testLogger) Info(msg string, kvs ...interface{}) {
fmt.Fprintf(l.writer, "%s] %s\t", l.name, msg)
for k, v := range l.keyValues {
fmt.Fprintf(l.writer, "%s=%+v ", k, v)
}
for i := 0; i < len(kvs); i += 2 {
fmt.Fprintf(l.writer, "%s=%+v ", kvs[i], kvs[i+1])
}
fmt.Fprintf(l.writer, "\n")
}
func (_ *testLogger) Enabled() bool {
return true
}
func (l *testLogger) Error(err error, msg string, kvs ...interface{}) {
kvs = append(kvs, "error", err)
l.Info(msg, kvs...)
}
func (l *testLogger) V(_ int) logr.InfoLogger {
return l
}
func (l *testLogger) WithName(name string) logr.Logger {
return &testLogger{
name: l.name + "." + name,
keyValues: l.keyValues,
writer: l.writer,
}
}
func (l *testLogger) WithValues(kvs ...interface{}) logr.Logger {
newMap := make(map[string]interface{}, len(l.keyValues)+len(kvs)/2)
for k, v := range l.keyValues {
newMap[k] = v
}
for i := 0; i < len(kvs); i += 2 {
newMap[kvs[i].(string)] = kvs[i+1]
}
return &testLogger{
name: l.name,
keyValues: newMap,
writer: l.writer,
}
}

View File

@@ -18,8 +18,12 @@ package controllers
import ( import (
"context" "context"
"fmt"
"reflect"
"time" "time"
corev1 "k8s.io/api/core/v1"
"github.com/summerwind/actions-runner-controller/github" "github.com/summerwind/actions-runner-controller/github"
"k8s.io/apimachinery/pkg/types" "k8s.io/apimachinery/pkg/types"
@@ -29,10 +33,10 @@ import (
ctrl "sigs.k8s.io/controller-runtime" ctrl "sigs.k8s.io/controller-runtime"
"sigs.k8s.io/controller-runtime/pkg/client" "sigs.k8s.io/controller-runtime/pkg/client"
corev1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"github.com/summerwind/actions-runner-controller/api/v1alpha1" "github.com/summerwind/actions-runner-controller/api/v1alpha1"
"github.com/summerwind/actions-runner-controller/controllers/metrics"
) )
const ( const (
@@ -46,8 +50,13 @@ type HorizontalRunnerAutoscalerReconciler struct {
Log logr.Logger Log logr.Logger
Recorder record.EventRecorder Recorder record.EventRecorder
Scheme *runtime.Scheme Scheme *runtime.Scheme
CacheDuration time.Duration
Name string
} }
const defaultReplicas = 1
// +kubebuilder:rbac:groups=actions.summerwind.dev,resources=runnerdeployments,verbs=get;list;watch;update;patch // +kubebuilder:rbac:groups=actions.summerwind.dev,resources=runnerdeployments,verbs=get;list;watch;update;patch
// +kubebuilder:rbac:groups=actions.summerwind.dev,resources=horizontalrunnerautoscalers,verbs=get;list;watch;create;update;patch;delete // +kubebuilder:rbac:groups=actions.summerwind.dev,resources=horizontalrunnerautoscalers,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=actions.summerwind.dev,resources=horizontalrunnerautoscalers/finalizers,verbs=get;list;watch;create;update;patch;delete // +kubebuilder:rbac:groups=actions.summerwind.dev,resources=horizontalrunnerautoscalers/finalizers,verbs=get;list;watch;create;update;patch;delete
@@ -67,6 +76,8 @@ func (r *HorizontalRunnerAutoscalerReconciler) Reconcile(req ctrl.Request) (ctrl
return ctrl.Result{}, nil return ctrl.Result{}, nil
} }
metrics.SetHorizontalRunnerAutoscalerSpec(hra.ObjectMeta, hra.Spec)
var rd v1alpha1.RunnerDeployment var rd v1alpha1.RunnerDeployment
if err := r.Get(ctx, types.NamespacedName{ if err := r.Get(ctx, types.NamespacedName{
Namespace: req.Namespace, Namespace: req.Namespace,
@@ -79,7 +90,16 @@ func (r *HorizontalRunnerAutoscalerReconciler) Reconcile(req ctrl.Request) (ctrl
return ctrl.Result{}, nil return ctrl.Result{}, nil
} }
replicas, err := r.computeReplicas(rd, hra) now := time.Now()
minReplicas, active, upcoming, err := r.getMinReplicas(log, now, hra)
if err != nil {
log.Error(err, "Could not compute min replicas")
return ctrl.Result{}, err
}
newDesiredReplicas, computedReplicas, computedReplicasFromCache, err := r.computeReplicasWithCache(log, now, rd, hra, minReplicas)
if err != nil { if err != nil {
r.Recorder.Event(&hra, corev1.EventTypeNormal, "RunnerAutoscalingFailure", err.Error()) r.Recorder.Event(&hra, corev1.EventTypeNormal, "RunnerAutoscalingFailure", err.Error())
@@ -88,62 +108,234 @@ func (r *HorizontalRunnerAutoscalerReconciler) Reconcile(req ctrl.Request) (ctrl
return ctrl.Result{}, err return ctrl.Result{}, err
} }
const defaultReplicas = 1
currentDesiredReplicas := getIntOrDefault(rd.Spec.Replicas, defaultReplicas) currentDesiredReplicas := getIntOrDefault(rd.Spec.Replicas, defaultReplicas)
newDesiredReplicas := getIntOrDefault(replicas, defaultReplicas)
// Please add more conditions that we can in-place update the newest runnerreplicaset without disruption // Please add more conditions that we can in-place update the newest runnerreplicaset without disruption
if currentDesiredReplicas != newDesiredReplicas { if currentDesiredReplicas != newDesiredReplicas {
copy := rd.DeepCopy() copy := rd.DeepCopy()
copy.Spec.Replicas = &newDesiredReplicas copy.Spec.Replicas = &newDesiredReplicas
if err := r.Client.Update(ctx, copy); err != nil { if err := r.Client.Patch(ctx, copy, client.MergeFrom(&rd)); err != nil {
log.Error(err, "Failed to update runnerderployment resource") return ctrl.Result{}, fmt.Errorf("patching runnerdeployment to have %d replicas: %w", newDesiredReplicas, err)
return ctrl.Result{}, err
} }
return ctrl.Result{}, err
} }
if hra.Status.DesiredReplicas == nil || *hra.Status.DesiredReplicas != *replicas { updated := hra.DeepCopy()
updated := hra.DeepCopy()
if (hra.Status.DesiredReplicas == nil && *replicas > 1) || if hra.Status.DesiredReplicas == nil || *hra.Status.DesiredReplicas != newDesiredReplicas {
(hra.Status.DesiredReplicas != nil && *replicas > *hra.Status.DesiredReplicas) { if (hra.Status.DesiredReplicas == nil && newDesiredReplicas > 1) ||
(hra.Status.DesiredReplicas != nil && newDesiredReplicas > *hra.Status.DesiredReplicas) {
updated.Status.LastSuccessfulScaleOutTime = &metav1.Time{Time: time.Now()} updated.Status.LastSuccessfulScaleOutTime = &metav1.Time{Time: time.Now()}
} }
updated.Status.DesiredReplicas = replicas updated.Status.DesiredReplicas = &newDesiredReplicas
}
if err := r.Status().Update(ctx, updated); err != nil { if computedReplicasFromCache == nil {
log.Error(err, "Failed to update horizontalrunnerautoscaler status") cacheEntries := getValidCacheEntries(updated, now)
return ctrl.Result{}, err var cacheDuration time.Duration
if r.CacheDuration > 0 {
cacheDuration = r.CacheDuration
} else {
cacheDuration = 10 * time.Minute
}
updated.Status.CacheEntries = append(cacheEntries, v1alpha1.CacheEntry{
Key: v1alpha1.CacheEntryKeyDesiredReplicas,
Value: computedReplicas,
ExpirationTime: metav1.Time{Time: time.Now().Add(cacheDuration)},
})
}
var overridesSummary string
if (active != nil && upcoming == nil) || (active != nil && upcoming != nil && active.Period.EndTime.Before(upcoming.Period.StartTime)) {
after := defaultReplicas
if hra.Spec.MinReplicas != nil && *hra.Spec.MinReplicas >= 0 {
after = *hra.Spec.MinReplicas
}
overridesSummary = fmt.Sprintf("min=%d time=%s", after, active.Period.EndTime)
}
if active == nil && upcoming != nil || (active != nil && upcoming != nil && active.Period.EndTime.After(upcoming.Period.StartTime)) {
if upcoming.ScheduledOverride.MinReplicas != nil {
overridesSummary = fmt.Sprintf("min=%d time=%s", *upcoming.ScheduledOverride.MinReplicas, upcoming.Period.StartTime)
}
}
if overridesSummary != "" {
updated.Status.ScheduledOverridesSummary = &overridesSummary
} else {
updated.Status.ScheduledOverridesSummary = nil
}
if !reflect.DeepEqual(hra.Status, updated.Status) {
metrics.SetHorizontalRunnerAutoscalerStatus(updated.ObjectMeta, updated.Status)
if err := r.Status().Patch(ctx, updated, client.MergeFrom(&hra)); err != nil {
return ctrl.Result{}, fmt.Errorf("patching horizontalrunnerautoscaler status: %w", err)
} }
} }
return ctrl.Result{}, nil return ctrl.Result{}, nil
} }
func getValidCacheEntries(hra *v1alpha1.HorizontalRunnerAutoscaler, now time.Time) []v1alpha1.CacheEntry {
var cacheEntries []v1alpha1.CacheEntry
for _, ent := range hra.Status.CacheEntries {
if ent.ExpirationTime.After(now) {
cacheEntries = append(cacheEntries, ent)
}
}
return cacheEntries
}
func (r *HorizontalRunnerAutoscalerReconciler) SetupWithManager(mgr ctrl.Manager) error { func (r *HorizontalRunnerAutoscalerReconciler) SetupWithManager(mgr ctrl.Manager) error {
r.Recorder = mgr.GetEventRecorderFor("horizontalrunnerautoscaler-controller") name := "horizontalrunnerautoscaler-controller"
if r.Name != "" {
name = r.Name
}
r.Recorder = mgr.GetEventRecorderFor(name)
return ctrl.NewControllerManagedBy(mgr). return ctrl.NewControllerManagedBy(mgr).
For(&v1alpha1.HorizontalRunnerAutoscaler{}). For(&v1alpha1.HorizontalRunnerAutoscaler{}).
Named(name).
Complete(r) Complete(r)
} }
func (r *HorizontalRunnerAutoscalerReconciler) computeReplicas(rd v1alpha1.RunnerDeployment, hra v1alpha1.HorizontalRunnerAutoscaler) (*int, error) { type Override struct {
var computedReplicas *int ScheduledOverride v1alpha1.ScheduledOverride
Period Period
}
replicas, err := r.determineDesiredReplicas(rd, hra) func (r *HorizontalRunnerAutoscalerReconciler) matchScheduledOverrides(log logr.Logger, now time.Time, hra v1alpha1.HorizontalRunnerAutoscaler) (*int, *Override, *Override, error) {
if err != nil { var minReplicas *int
return nil, err var active, upcoming *Override
for _, o := range hra.Spec.ScheduledOverrides {
log.V(1).Info(
"Checking scheduled override",
"now", now,
"startTime", o.StartTime,
"endTime", o.EndTime,
"frequency", o.RecurrenceRule.Frequency,
"untilTime", o.RecurrenceRule.UntilTime,
)
a, u, err := MatchSchedule(
now, o.StartTime.Time, o.EndTime.Time,
RecurrenceRule{
Frequency: o.RecurrenceRule.Frequency,
UntilTime: o.RecurrenceRule.UntilTime.Time,
},
)
if err != nil {
return minReplicas, nil, nil, err
}
// Use the first when there are two or more active scheduled overrides,
// as the spec defines that the earlier scheduled override is prioritized higher than later ones.
if a != nil && active == nil {
active = &Override{Period: *a, ScheduledOverride: o}
if o.MinReplicas != nil {
minReplicas = o.MinReplicas
log.V(1).Info(
"Found active scheduled override",
"activeStartTime", a.StartTime,
"activeEndTime", a.EndTime,
"activeMinReplicas", minReplicas,
)
}
}
if u != nil && (upcoming == nil || u.StartTime.Before(upcoming.Period.StartTime)) {
upcoming = &Override{Period: *u, ScheduledOverride: o}
log.V(1).Info(
"Found upcoming scheduled override",
"upcomingStartTime", u.StartTime,
"upcomingEndTime", u.EndTime,
"upcomingMinReplicas", o.MinReplicas,
)
}
} }
return minReplicas, active, upcoming, nil
}
func (r *HorizontalRunnerAutoscalerReconciler) getMinReplicas(log logr.Logger, now time.Time, hra v1alpha1.HorizontalRunnerAutoscaler) (int, *Override, *Override, error) {
minReplicas := defaultReplicas
if hra.Spec.MinReplicas != nil && *hra.Spec.MinReplicas >= 0 {
minReplicas = *hra.Spec.MinReplicas
}
m, active, upcoming, err := r.matchScheduledOverrides(log, now, hra)
if err != nil {
return 0, nil, nil, err
} else if m != nil {
minReplicas = *m
}
return minReplicas, active, upcoming, nil
}
func (r *HorizontalRunnerAutoscalerReconciler) computeReplicasWithCache(log logr.Logger, now time.Time, rd v1alpha1.RunnerDeployment, hra v1alpha1.HorizontalRunnerAutoscaler, minReplicas int) (int, int, *int, error) {
var suggestedReplicas int
suggestedReplicasFromCache := r.fetchSuggestedReplicasFromCache(hra)
var cached *int
if suggestedReplicasFromCache != nil {
cached = suggestedReplicasFromCache
if cached == nil {
suggestedReplicas = minReplicas
} else {
suggestedReplicas = *cached
}
} else {
v, err := r.suggestDesiredReplicas(rd, hra)
if err != nil {
return 0, 0, nil, err
}
if v == nil {
suggestedReplicas = minReplicas
} else {
suggestedReplicas = *v
}
}
var reserved int
for _, reservation := range hra.Spec.CapacityReservations {
if reservation.ExpirationTime.Time.After(now) {
reserved += reservation.Replicas
}
}
newDesiredReplicas := suggestedReplicas + reserved
if newDesiredReplicas < minReplicas {
newDesiredReplicas = minReplicas
} else if hra.Spec.MaxReplicas != nil && newDesiredReplicas > *hra.Spec.MaxReplicas {
newDesiredReplicas = *hra.Spec.MaxReplicas
}
//
// Delay scaling-down for ScaleDownDelaySecondsAfterScaleUp or DefaultScaleDownDelay
//
var scaleDownDelay time.Duration var scaleDownDelay time.Duration
if hra.Spec.ScaleDownDelaySecondsAfterScaleUp != nil { if hra.Spec.ScaleDownDelaySecondsAfterScaleUp != nil {
@@ -152,17 +344,50 @@ func (r *HorizontalRunnerAutoscalerReconciler) computeReplicas(rd v1alpha1.Runne
scaleDownDelay = DefaultScaleDownDelay scaleDownDelay = DefaultScaleDownDelay
} }
now := time.Now() var scaleDownDelayUntil *time.Time
if hra.Status.DesiredReplicas == nil || if hra.Status.DesiredReplicas == nil ||
*hra.Status.DesiredReplicas < *replicas || *hra.Status.DesiredReplicas < newDesiredReplicas ||
hra.Status.LastSuccessfulScaleOutTime == nil || hra.Status.LastSuccessfulScaleOutTime == nil {
hra.Status.LastSuccessfulScaleOutTime.Add(scaleDownDelay).Before(now) {
computedReplicas = replicas } else if hra.Status.LastSuccessfulScaleOutTime != nil {
t := hra.Status.LastSuccessfulScaleOutTime.Add(scaleDownDelay)
// ScaleDownDelay is not passed
if t.After(now) {
scaleDownDelayUntil = &t
newDesiredReplicas = *hra.Status.DesiredReplicas
}
} else { } else {
computedReplicas = hra.Status.DesiredReplicas newDesiredReplicas = *hra.Status.DesiredReplicas
} }
return computedReplicas, nil //
// Logs various numbers for monitoring and debugging purpose
//
kvs := []interface{}{
"suggested", suggestedReplicas,
"reserved", reserved,
"min", minReplicas,
}
if cached != nil {
kvs = append(kvs, "cached", *cached)
}
if scaleDownDelayUntil != nil {
kvs = append(kvs, "last_scale_up_time", *hra.Status.LastSuccessfulScaleOutTime)
kvs = append(kvs, "scale_down_delay_until", scaleDownDelayUntil)
}
if maxReplicas := hra.Spec.MaxReplicas; maxReplicas != nil {
kvs = append(kvs, "max", *maxReplicas)
}
log.V(1).Info(fmt.Sprintf("Calculated desired replicas of %d", newDesiredReplicas),
kvs...,
)
return newDesiredReplicas, suggestedReplicas, suggestedReplicasFromCache, nil
} }

View File

@@ -0,0 +1,49 @@
package controllers
import (
"github.com/google/go-cmp/cmp"
actionsv1alpha1 "github.com/summerwind/actions-runner-controller/api/v1alpha1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"testing"
"time"
)
func TestGetValidCacheEntries(t *testing.T) {
now := time.Now()
hra := &actionsv1alpha1.HorizontalRunnerAutoscaler{
Status: actionsv1alpha1.HorizontalRunnerAutoscalerStatus{
CacheEntries: []actionsv1alpha1.CacheEntry{
{
Key: "foo",
Value: 1,
ExpirationTime: metav1.Time{Time: now.Add(-time.Second)},
},
{
Key: "foo",
Value: 2,
ExpirationTime: metav1.Time{Time: now},
},
{
Key: "foo",
Value: 3,
ExpirationTime: metav1.Time{Time: now.Add(time.Second)},
},
},
},
}
revs := getValidCacheEntries(hra, now)
counts := map[string]int{}
for _, r := range revs {
counts[r.Key] += r.Value
}
want := map[string]int{"foo": 3}
if d := cmp.Diff(want, counts); d != "" {
t.Errorf("%s", d)
}
}

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,67 @@
package metrics
import (
"github.com/prometheus/client_golang/prometheus"
"github.com/summerwind/actions-runner-controller/api/v1alpha1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)
const (
hraName = "horizontalrunnerautoscaler"
hraNamespace = "namespace"
)
var (
horizontalRunnerAutoscalerMetrics = []prometheus.Collector{
horizontalRunnerAutoscalerMinReplicas,
horizontalRunnerAutoscalerMaxReplicas,
horizontalRunnerAutoscalerDesiredReplicas,
}
)
var (
horizontalRunnerAutoscalerMinReplicas = prometheus.NewGaugeVec(
prometheus.GaugeOpts{
Name: "horizontalrunnerautoscaler_spec_min_replicas",
Help: "minReplicas of HorizontalRunnerAutoscaler",
},
[]string{hraName, hraNamespace},
)
horizontalRunnerAutoscalerMaxReplicas = prometheus.NewGaugeVec(
prometheus.GaugeOpts{
Name: "horizontalrunnerautoscaler_spec_max_replicas",
Help: "maxReplicas of HorizontalRunnerAutoscaler",
},
[]string{hraName, hraNamespace},
)
horizontalRunnerAutoscalerDesiredReplicas = prometheus.NewGaugeVec(
prometheus.GaugeOpts{
Name: "horizontalrunnerautoscaler_status_desired_replicas",
Help: "desiredReplicas of HorizontalRunnerAutoscaler",
},
[]string{hraName, hraNamespace},
)
)
func SetHorizontalRunnerAutoscalerSpec(o metav1.ObjectMeta, spec v1alpha1.HorizontalRunnerAutoscalerSpec) {
labels := prometheus.Labels{
hraName: o.Name,
hraNamespace: o.Namespace,
}
if spec.MaxReplicas != nil {
horizontalRunnerAutoscalerMaxReplicas.With(labels).Set(float64(*spec.MaxReplicas))
}
if spec.MinReplicas != nil {
horizontalRunnerAutoscalerMinReplicas.With(labels).Set(float64(*spec.MinReplicas))
}
}
func SetHorizontalRunnerAutoscalerStatus(o metav1.ObjectMeta, status v1alpha1.HorizontalRunnerAutoscalerStatus) {
labels := prometheus.Labels{
hraName: o.Name,
hraNamespace: o.Namespace,
}
if status.DesiredReplicas != nil {
horizontalRunnerAutoscalerDesiredReplicas.With(labels).Set(float64(*status.DesiredReplicas))
}
}

View File

@@ -0,0 +1,14 @@
// Package metrics provides the metrics of custom resources such as HRA.
//
// This depends on the metrics exporter of kubebuilder.
// See https://book.kubebuilder.io/reference/metrics.html for details.
package metrics
import (
"sigs.k8s.io/controller-runtime/pkg/metrics"
)
func init() {
metrics.Registry.MustRegister(runnerDeploymentMetrics...)
metrics.Registry.MustRegister(horizontalRunnerAutoscalerMetrics...)
}

View File

@@ -0,0 +1,37 @@
package metrics
import (
"github.com/prometheus/client_golang/prometheus"
"github.com/summerwind/actions-runner-controller/api/v1alpha1"
)
const (
rdName = "runnerdeployment"
rdNamespace = "namespace"
)
var (
runnerDeploymentMetrics = []prometheus.Collector{
runnerDeploymentReplicas,
}
)
var (
runnerDeploymentReplicas = prometheus.NewGaugeVec(
prometheus.GaugeOpts{
Name: "runnerdeployment_spec_replicas",
Help: "replicas of RunnerDeployment",
},
[]string{rdName, rdNamespace},
)
)
func SetRunnerDeployment(rd v1alpha1.RunnerDeployment) {
labels := prometheus.Labels{
rdName: rd.Name,
rdNamespace: rd.Namespace,
}
if rd.Spec.Replicas != nil {
runnerDeploymentReplicas.With(labels).Set(float64(*rd.Spec.Replicas))
}
}

View File

@@ -18,12 +18,17 @@ package controllers
import ( import (
"context" "context"
"errors"
"fmt" "fmt"
"reflect"
"strings" "strings"
"time"
gogithub "github.com/google/go-github/v33/github"
"github.com/summerwind/actions-runner-controller/hash"
"k8s.io/apimachinery/pkg/util/wait"
"github.com/go-logr/logr" "github.com/go-logr/logr"
"k8s.io/apimachinery/pkg/api/errors" kerrors "k8s.io/apimachinery/pkg/api/errors"
"k8s.io/apimachinery/pkg/runtime" "k8s.io/apimachinery/pkg/runtime"
"k8s.io/client-go/tools/record" "k8s.io/client-go/tools/record"
ctrl "sigs.k8s.io/controller-runtime" ctrl "sigs.k8s.io/controller-runtime"
@@ -39,17 +44,27 @@ import (
const ( const (
containerName = "runner" containerName = "runner"
finalizerName = "runner.actions.summerwind.dev" finalizerName = "runner.actions.summerwind.dev"
LabelKeyPodTemplateHash = "pod-template-hash"
retryDelayOnGitHubAPIRateLimitError = 30 * time.Second
// This is an annotation internal to actions-runner-controller and can change in backward-incompatible ways
annotationKeyRegistrationOnly = "actions-runner-controller/registration-only"
) )
// RunnerReconciler reconciles a Runner object // RunnerReconciler reconciles a Runner object
type RunnerReconciler struct { type RunnerReconciler struct {
client.Client client.Client
Log logr.Logger Log logr.Logger
Recorder record.EventRecorder Recorder record.EventRecorder
Scheme *runtime.Scheme Scheme *runtime.Scheme
GitHubClient *github.Client GitHubClient *github.Client
RunnerImage string RunnerImage string
DockerImage string DockerImage string
Name string
RegistrationRecheckInterval time.Duration
RegistrationRecheckJitter time.Duration
} }
// +kubebuilder:rbac:groups=actions.summerwind.dev,resources=runners,verbs=get;list;watch;create;update;patch;delete // +kubebuilder:rbac:groups=actions.summerwind.dev,resources=runners,verbs=get;list;watch;create;update;patch;delete
@@ -93,9 +108,22 @@ func (r *RunnerReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
if removed { if removed {
if len(runner.Status.Registration.Token) > 0 { if len(runner.Status.Registration.Token) > 0 {
ok, err := r.unregisterRunner(ctx, runner.Spec.Organization, runner.Spec.Repository, runner.Name) ok, err := r.unregisterRunner(ctx, runner.Spec.Enterprise, runner.Spec.Organization, runner.Spec.Repository, runner.Name)
if err != nil { if err != nil {
log.Error(err, "Failed to unregister runner") if errors.Is(err, &gogithub.RateLimitError{}) {
// We log the underlying error when we failed calling GitHub API to list or unregisters,
// or the runner is still busy.
log.Error(
err,
fmt.Sprintf(
"Failed to unregister runner due to GitHub API rate limits. Delaying retry for %s to avoid excessive GitHub API calls",
retryDelayOnGitHubAPIRateLimitError,
),
)
return ctrl.Result{RequeueAfter: retryDelayOnGitHubAPIRateLimitError}, err
}
return ctrl.Result{}, err return ctrl.Result{}, err
} }
@@ -109,8 +137,8 @@ func (r *RunnerReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
newRunner := runner.DeepCopy() newRunner := runner.DeepCopy()
newRunner.ObjectMeta.Finalizers = finalizers newRunner.ObjectMeta.Finalizers = finalizers
if err := r.Update(ctx, newRunner); err != nil { if err := r.Patch(ctx, newRunner, client.MergeFrom(&runner)); err != nil {
log.Error(err, "Failed to update runner") log.Error(err, "Failed to update runner for finalizer removal")
return ctrl.Result{}, err return ctrl.Result{}, err
} }
@@ -120,39 +148,46 @@ func (r *RunnerReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
return ctrl.Result{}, nil return ctrl.Result{}, nil
} }
if !runner.IsRegisterable() { registrationOnly := metav1.HasAnnotation(runner.ObjectMeta, annotationKeyRegistrationOnly)
rt, err := r.GitHubClient.GetRegistrationToken(ctx, runner.Spec.Organization, runner.Spec.Repository, runner.Name) if registrationOnly && runner.Status.Phase != "" {
if err != nil { // At this point we are sure that the registration-only runner has successfully configured and
r.Recorder.Event(&runner, corev1.EventTypeWarning, "FailedUpdateRegistrationToken", "Updating registration token failed") // is of `offline` status, because we set runner.Status.Phase to that of the runner pod only after
log.Error(err, "Failed to get new registration token") // successful registration.
return ctrl.Result{}, err
var pod corev1.Pod
if err := r.Get(ctx, req.NamespacedName, &pod); err != nil {
if !kerrors.IsNotFound(err) {
log.Info(fmt.Sprintf("Retrying soon as we failed to get registration-only runner pod: %v", err))
return ctrl.Result{Requeue: true}, nil
}
} else if err := r.Delete(ctx, &pod); err != nil {
if !kerrors.IsNotFound(err) {
log.Info(fmt.Sprintf("Retrying soon as we failed to delete registration-only runner pod: %v", err))
return ctrl.Result{Requeue: true}, nil
}
} }
updated := runner.DeepCopy() log.Info("Successfully deleted egistration-only runner pod to free node and cluster resource")
updated.Status.Registration = v1alpha1.RunnerStatusRegistration{
Organization: runner.Spec.Organization,
Repository: runner.Spec.Repository,
Labels: runner.Spec.Labels,
Token: rt.GetToken(),
ExpiresAt: metav1.NewTime(rt.GetExpiresAt().Time),
}
if err := r.Status().Update(ctx, updated); err != nil { // Return here to not recreate the deleted pod, because recreating it is the waste of cluster and node resource,
log.Error(err, "Failed to update runner status") // and also defeats the original purpose of scale-from/to-zero we're trying to implement by using the registration-only runner.
return ctrl.Result{}, err
}
r.Recorder.Event(&runner, corev1.EventTypeNormal, "RegistrationTokenUpdated", "Successfully update registration token")
log.Info("Updated registration token", "repository", runner.Spec.Repository)
return ctrl.Result{}, nil return ctrl.Result{}, nil
} }
var pod corev1.Pod var pod corev1.Pod
if err := r.Get(ctx, req.NamespacedName, &pod); err != nil { if err := r.Get(ctx, req.NamespacedName, &pod); err != nil {
if !errors.IsNotFound(err) { if !kerrors.IsNotFound(err) {
return ctrl.Result{}, err return ctrl.Result{}, err
} }
if updated, err := r.updateRegistrationToken(ctx, runner); err != nil {
return ctrl.Result{}, err
} else if updated {
return ctrl.Result{Requeue: true}, nil
}
newPod, err := r.newPod(runner) newPod, err := r.newPod(runner)
if err != nil { if err != nil {
log.Error(err, "Could not create pod") log.Error(err, "Could not create pod")
@@ -160,65 +195,299 @@ func (r *RunnerReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
} }
if err := r.Create(ctx, &newPod); err != nil { if err := r.Create(ctx, &newPod); err != nil {
if kerrors.IsAlreadyExists(err) {
// Gracefully handle pod-already-exists errors due to informer cache delay.
// Without this we got a few errors like the below on new runner pod:
// 2021-03-16T00:23:10.116Z ERROR controller-runtime.controller Reconciler error {"controller": "runner-controller", "request": "default/example-runnerdeploy-b2g2g-j4mcp", "error": "pods \"example-runnerdeploy-b2g2g-j4mcp\" already exists"}
log.Info(
"Failed to create pod due to AlreadyExists error. Probably this pod has been already created in previous reconcilation but is still not in the informer cache. Will retry on pod created. If it doesn't repeat, there's no problem",
)
return ctrl.Result{}, nil
}
log.Error(err, "Failed to create pod resource") log.Error(err, "Failed to create pod resource")
return ctrl.Result{}, err return ctrl.Result{}, err
} }
r.Recorder.Event(&runner, corev1.EventTypeNormal, "PodCreated", fmt.Sprintf("Created pod '%s'", newPod.Name)) r.Recorder.Event(&runner, corev1.EventTypeNormal, "PodCreated", fmt.Sprintf("Created pod '%s'", newPod.Name))
log.Info("Created runner pod", "repository", runner.Spec.Repository) log.Info("Created runner pod", "repository", runner.Spec.Repository)
} else { } else {
if runner.Status.Phase != string(pod.Status.Phase) { if !pod.ObjectMeta.DeletionTimestamp.IsZero() {
updated := runner.DeepCopy() deletionTimeout := 1 * time.Minute
updated.Status.Phase = string(pod.Status.Phase) currentTime := time.Now()
updated.Status.Reason = pod.Status.Reason deletionDidTimeout := currentTime.Sub(pod.DeletionTimestamp.Add(deletionTimeout)) > 0
updated.Status.Message = pod.Status.Message
if err := r.Status().Update(ctx, updated); err != nil { if deletionDidTimeout {
log.Error(err, "Failed to update runner status") log.Info(
fmt.Sprintf("Failed to delete pod within %s. ", deletionTimeout)+
"This is typically the case when a Kubernetes node became unreachable "+
"and the kube controller started evicting nodes. Forcefully deleting the pod to not get stuck.",
"podDeletionTimestamp", pod.DeletionTimestamp,
"currentTime", currentTime,
"configuredDeletionTimeout", deletionTimeout,
)
var force int64 = 0
// forcefully delete runner as we would otherwise get stuck if the node stays unreachable
if err := r.Delete(ctx, &pod, &client.DeleteOptions{GracePeriodSeconds: &force}); err != nil {
// probably
if !kerrors.IsNotFound(err) {
log.Error(err, "Failed to forcefully delete pod resource ...")
return ctrl.Result{}, err
}
// forceful deletion finally succeeded
return ctrl.Result{Requeue: true}, nil
}
r.Recorder.Event(&runner, corev1.EventTypeNormal, "PodDeleted", fmt.Sprintf("Forcefully deleted pod '%s'", pod.Name))
log.Info("Forcefully deleted runner pod", "repository", runner.Spec.Repository)
// give kube manager a little time to forcefully delete the stuck pod
return ctrl.Result{RequeueAfter: 3 * time.Second}, err
} else {
return ctrl.Result{}, err return ctrl.Result{}, err
} }
return ctrl.Result{}, nil
} }
if !pod.ObjectMeta.DeletionTimestamp.IsZero() { // If pod has ended up succeeded we need to restart it
return ctrl.Result{}, err // Happens e.g. when dind is in runner and run completes
} stopped := pod.Status.Phase == corev1.PodSucceeded
restart := false if !stopped {
if pod.Status.Phase == corev1.PodRunning {
for _, status := range pod.Status.ContainerStatuses {
if status.Name != containerName {
continue
}
if pod.Status.Phase == corev1.PodRunning { if status.State.Terminated != nil && status.State.Terminated.ExitCode == 0 {
for _, status := range pod.Status.ContainerStatuses { stopped = true
if status.Name != containerName { }
continue
}
if status.State.Terminated != nil && status.State.Terminated.ExitCode == 0 {
restart = true
} }
} }
} }
restart := stopped
if registrationOnly && stopped {
restart = false
log.Info(
"Observed that registration-only runner for scaling-from-zero has successfully stopped. " +
"Unlike other pods, this one will be recreated only when runner spec changes.",
)
}
if updated, err := r.updateRegistrationToken(ctx, runner); err != nil {
return ctrl.Result{}, err
} else if updated {
return ctrl.Result{Requeue: true}, nil
}
newPod, err := r.newPod(runner) newPod, err := r.newPod(runner)
if err != nil { if err != nil {
log.Error(err, "Could not create pod") log.Error(err, "Could not create pod")
return ctrl.Result{}, err return ctrl.Result{}, err
} }
runnerBusy, err := r.isRunnerBusy(ctx, runner.Spec.Organization, runner.Spec.Repository, runner.Name) if registrationOnly {
if err != nil { newPod.Spec.Containers[0].Env = append(
log.Error(err, "Failed to check if runner is busy") newPod.Spec.Containers[0].Env,
corev1.EnvVar{
Name: "RUNNER_REGISTRATION_ONLY",
Value: "true",
},
)
}
var registrationRecheckDelay time.Duration
// all checks done below only decide whether a restart is needed
// if a restart was already decided before, there is no need for the checks
// saving API calls and scary log messages
if !restart {
registrationCheckInterval := time.Minute
if r.RegistrationRecheckInterval > 0 {
registrationCheckInterval = r.RegistrationRecheckInterval
}
// We want to call ListRunners GitHub Actions API only once per runner per minute.
// This if block, in conjunction with:
// return ctrl.Result{RequeueAfter: registrationRecheckDelay}, nil
// achieves that.
if lastCheckTime := runner.Status.LastRegistrationCheckTime; lastCheckTime != nil {
nextCheckTime := lastCheckTime.Add(registrationCheckInterval)
now := time.Now()
// Requeue scheduled by RequeueAfter can happen a bit earlier (like dozens of milliseconds)
// so to avoid excessive, in-effective retry, we heuristically ignore the remaining delay in case it is
// shorter than 1s
requeueAfter := nextCheckTime.Sub(now) - time.Second
if requeueAfter > 0 {
log.Info(
fmt.Sprintf("Skipped registration check because it's deferred until %s. Retrying in %s at latest", nextCheckTime, requeueAfter),
"lastRegistrationCheckTime", lastCheckTime,
"registrationCheckInterval", registrationCheckInterval,
)
// Without RequeueAfter, the controller may not retry on scheduled. Instead, it must wait until the
// next sync period passes, which can be too much later than nextCheckTime.
//
// We need to requeue on this reconcilation even though we have already scheduled the initial
// requeue previously with `return ctrl.Result{RequeueAfter: registrationRecheckDelay}, nil`.
// Apparently, the workqueue used by controller-runtime seems to deduplicate and resets the delay on
// other requeues- so the initial scheduled requeue may have been reset due to requeue on
// spec/status change.
return ctrl.Result{RequeueAfter: requeueAfter}, nil
}
}
notFound := false
offline := false
runnerBusy, err := r.GitHubClient.IsRunnerBusy(ctx, runner.Spec.Enterprise, runner.Spec.Organization, runner.Spec.Repository, runner.Name)
currentTime := time.Now()
if err != nil {
var notFoundException *github.RunnerNotFound
var offlineException *github.RunnerOffline
if errors.As(err, &notFoundException) {
notFound = true
} else if errors.As(err, &offlineException) {
offline = true
} else {
var e *gogithub.RateLimitError
if errors.As(err, &e) {
// We log the underlying error when we failed calling GitHub API to list or unregisters,
// or the runner is still busy.
log.Error(
err,
fmt.Sprintf(
"Failed to check if runner is busy due to Github API rate limit. Retrying in %s to avoid excessive GitHub API calls",
retryDelayOnGitHubAPIRateLimitError,
),
)
return ctrl.Result{RequeueAfter: retryDelayOnGitHubAPIRateLimitError}, err
}
return ctrl.Result{}, err
}
}
// See the `newPod` function called above for more information
// about when this hash changes.
curHash := pod.Labels[LabelKeyPodTemplateHash]
newHash := newPod.Labels[LabelKeyPodTemplateHash]
if !runnerBusy && curHash != newHash {
restart = true
}
registrationTimeout := 10 * time.Minute
durationAfterRegistrationTimeout := currentTime.Sub(pod.CreationTimestamp.Add(registrationTimeout))
registrationDidTimeout := durationAfterRegistrationTimeout > 0
if notFound {
if registrationDidTimeout {
log.Info(
"Runner failed to register itself to GitHub in timely manner. "+
"Recreating the pod to see if it resolves the issue. "+
"CAUTION: If you see this a lot, you should investigate the root cause. "+
"See https://github.com/summerwind/actions-runner-controller/issues/288",
"podCreationTimestamp", pod.CreationTimestamp,
"currentTime", currentTime,
"configuredRegistrationTimeout", registrationTimeout,
)
restart = true
} else {
log.V(1).Info(
"Runner pod exists but we failed to check if runner is busy. Apparently it still needs more time.",
"runnerName", runner.Name,
)
}
} else if offline {
if registrationOnly {
log.Info(
"Observed that registration-only runner for scaling-from-zero has successfully been registered.",
"podCreationTimestamp", pod.CreationTimestamp,
"currentTime", currentTime,
"configuredRegistrationTimeout", registrationTimeout,
)
} else if registrationDidTimeout {
log.Info(
"Already existing GitHub runner still appears offline . "+
"Recreating the pod to see if it resolves the issue. "+
"CAUTION: If you see this a lot, you should investigate the root cause. ",
"podCreationTimestamp", pod.CreationTimestamp,
"currentTime", currentTime,
"configuredRegistrationTimeout", registrationTimeout,
)
restart = true
} else {
log.V(1).Info(
"Runner pod exists but the GitHub runner appears to be still offline. Waiting for runner to get online ...",
"runnerName", runner.Name,
)
}
}
if (notFound || (offline && !registrationOnly)) && !registrationDidTimeout {
registrationRecheckJitter := 10 * time.Second
if r.RegistrationRecheckJitter > 0 {
registrationRecheckJitter = r.RegistrationRecheckJitter
}
registrationRecheckDelay = registrationCheckInterval + wait.Jitter(registrationRecheckJitter, 0.1)
}
}
// Don't do anything if there's no need to restart the runner
if !restart {
// This guard enables us to update runner.Status.Phase to `Running` only after
// the runner is registered to GitHub.
if registrationRecheckDelay > 0 {
log.V(1).Info(fmt.Sprintf("Rechecking the runner registration in %s", registrationRecheckDelay))
updated := runner.DeepCopy()
updated.Status.LastRegistrationCheckTime = &metav1.Time{Time: time.Now()}
if err := r.Status().Patch(ctx, updated, client.MergeFrom(&runner)); err != nil {
log.Error(err, "Failed to update runner status for LastRegistrationCheckTime")
return ctrl.Result{}, err
}
return ctrl.Result{RequeueAfter: registrationRecheckDelay}, nil
}
if runner.Status.Phase != string(pod.Status.Phase) {
if pod.Status.Phase == corev1.PodRunning {
// Seeing this message, you can expect the runner to become `Running` soon.
log.Info(
"Runner appears to have registered and running.",
"podCreationTimestamp", pod.CreationTimestamp,
)
}
updated := runner.DeepCopy()
updated.Status.Phase = string(pod.Status.Phase)
updated.Status.Reason = pod.Status.Reason
updated.Status.Message = pod.Status.Message
if err := r.Status().Patch(ctx, updated, client.MergeFrom(&runner)); err != nil {
log.Error(err, "Failed to update runner status for Phase/Reason/Message")
return ctrl.Result{}, err
}
}
return ctrl.Result{}, nil return ctrl.Result{}, nil
} }
if !runnerBusy && (!reflect.DeepEqual(pod.Spec.Containers[0].Env, newPod.Spec.Containers[0].Env) || pod.Spec.Containers[0].Image != newPod.Spec.Containers[0].Image) { // Delete current pod if recreation is needed
restart = true
}
if !restart {
return ctrl.Result{}, err
}
if err := r.Delete(ctx, &pod); err != nil { if err := r.Delete(ctx, &pod); err != nil {
log.Error(err, "Failed to delete pod resource") log.Error(err, "Failed to delete pod resource")
return ctrl.Result{}, err return ctrl.Result{}, err
@@ -231,23 +500,8 @@ func (r *RunnerReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
return ctrl.Result{}, nil return ctrl.Result{}, nil
} }
func (r *RunnerReconciler) isRunnerBusy(ctx context.Context, org, repo, name string) (bool, error) { func (r *RunnerReconciler) unregisterRunner(ctx context.Context, enterprise, org, repo, name string) (bool, error) {
runners, err := r.GitHubClient.ListRunners(ctx, org, repo) runners, err := r.GitHubClient.ListRunners(ctx, enterprise, org, repo)
if err != nil {
return false, err
}
for _, runner := range runners {
if runner.GetName() == name {
return runner.GetBusy(), nil
}
}
return false, fmt.Errorf("runner not found")
}
func (r *RunnerReconciler) unregisterRunner(ctx context.Context, org, repo, name string) (bool, error) {
runners, err := r.GitHubClient.ListRunners(ctx, org, repo)
if err != nil { if err != nil {
return false, err return false, err
} }
@@ -267,17 +521,54 @@ func (r *RunnerReconciler) unregisterRunner(ctx context.Context, org, repo, name
return false, nil return false, nil
} }
if err := r.GitHubClient.RemoveRunner(ctx, org, repo, id); err != nil { if err := r.GitHubClient.RemoveRunner(ctx, enterprise, org, repo, id); err != nil {
return false, err return false, err
} }
return true, nil return true, nil
} }
func (r *RunnerReconciler) updateRegistrationToken(ctx context.Context, runner v1alpha1.Runner) (bool, error) {
if runner.IsRegisterable() {
return false, nil
}
log := r.Log.WithValues("runner", runner.Name)
rt, err := r.GitHubClient.GetRegistrationToken(ctx, runner.Spec.Enterprise, runner.Spec.Organization, runner.Spec.Repository, runner.Name)
if err != nil {
r.Recorder.Event(&runner, corev1.EventTypeWarning, "FailedUpdateRegistrationToken", "Updating registration token failed")
log.Error(err, "Failed to get new registration token")
return false, err
}
updated := runner.DeepCopy()
updated.Status.Registration = v1alpha1.RunnerStatusRegistration{
Organization: runner.Spec.Organization,
Repository: runner.Spec.Repository,
Labels: runner.Spec.Labels,
Token: rt.GetToken(),
ExpiresAt: metav1.NewTime(rt.GetExpiresAt().Time),
}
if err := r.Status().Patch(ctx, updated, client.MergeFrom(&runner)); err != nil {
log.Error(err, "Failed to update runner status for Registration")
return false, err
}
r.Recorder.Event(&runner, corev1.EventTypeNormal, "RegistrationTokenUpdated", "Successfully update registration token")
log.Info("Updated registration token", "repository", runner.Spec.Repository)
return true, nil
}
func (r *RunnerReconciler) newPod(runner v1alpha1.Runner) (corev1.Pod, error) { func (r *RunnerReconciler) newPod(runner v1alpha1.Runner) (corev1.Pod, error) {
var ( var (
privileged bool = true privileged bool = true
group int64 = 0 dockerdInRunner bool = runner.Spec.DockerdWithinRunnerContainer != nil && *runner.Spec.DockerdWithinRunnerContainer
dockerEnabled bool = runner.Spec.DockerEnabled == nil || *runner.Spec.DockerEnabled
ephemeral bool = runner.Spec.Ephemeral == nil || *runner.Spec.Ephemeral
dockerdInRunnerPrivileged bool = dockerdInRunner
) )
runnerImage := runner.Spec.Image runnerImage := runner.Spec.Image
@@ -285,6 +576,11 @@ func (r *RunnerReconciler) newPod(runner v1alpha1.Runner) (corev1.Pod, error) {
runnerImage = r.RunnerImage runnerImage = r.RunnerImage
} }
workDir := runner.Spec.WorkDir
if workDir == "" {
workDir = "/runner/_work"
}
runnerImagePullPolicy := runner.Spec.ImagePullPolicy runnerImagePullPolicy := runner.Spec.ImagePullPolicy
if runnerImagePullPolicy == "" { if runnerImagePullPolicy == "" {
runnerImagePullPolicy = corev1.PullAlways runnerImagePullPolicy = corev1.PullAlways
@@ -303,22 +599,96 @@ func (r *RunnerReconciler) newPod(runner v1alpha1.Runner) (corev1.Pod, error) {
Name: "RUNNER_REPO", Name: "RUNNER_REPO",
Value: runner.Spec.Repository, Value: runner.Spec.Repository,
}, },
{
Name: "RUNNER_ENTERPRISE",
Value: runner.Spec.Enterprise,
},
{ {
Name: "RUNNER_LABELS", Name: "RUNNER_LABELS",
Value: strings.Join(runner.Spec.Labels, ","), Value: strings.Join(runner.Spec.Labels, ","),
}, },
{
Name: "RUNNER_GROUP",
Value: runner.Spec.Group,
},
{ {
Name: "RUNNER_TOKEN", Name: "RUNNER_TOKEN",
Value: runner.Status.Registration.Token, Value: runner.Status.Registration.Token,
}, },
{
Name: "DOCKERD_IN_RUNNER",
Value: fmt.Sprintf("%v", dockerdInRunner),
},
{
Name: "GITHUB_URL",
Value: r.GitHubClient.GithubBaseURL,
},
{
Name: "RUNNER_WORKDIR",
Value: workDir,
},
{
Name: "RUNNER_EPHEMERAL",
Value: fmt.Sprintf("%v", ephemeral),
},
}
if metav1.HasAnnotation(runner.ObjectMeta, annotationKeyRegistrationOnly) {
env = append(env, corev1.EnvVar{
Name: "RUNNER_REGISTRATION_ONLY",
Value: "true",
},
)
} }
env = append(env, runner.Spec.Env...) env = append(env, runner.Spec.Env...)
labels := map[string]string{}
for k, v := range runner.Labels {
labels[k] = v
}
// This implies that...
//
// (1) We recreate the runner pod whenever the runner has changes in:
// - metadata.labels (excluding "runner-template-hash" added by the parent RunnerReplicaSet
// - metadata.annotations
// - metadata.spec (including image, env, organization, repository, group, and so on)
// - GithubBaseURL setting of the controller (can be configured via GITHUB_ENTERPRISE_URL)
//
// (2) We don't recreate the runner pod when there are changes in:
// - runner.status.registration.token
// - This token expires and changes hourly, but you don't need to recreate the pod due to that.
// It's the opposite.
// An unexpired token is required only when the runner agent is registering itself on launch.
//
// In other words, the registered runner doesn't get invalidated on registration token expiration.
// A registered runner's session and the a registration token seem to have two different and independent
// lifecycles.
//
// See https://github.com/summerwind/actions-runner-controller/issues/143 for more context.
labels[LabelKeyPodTemplateHash] = hash.FNVHashStringObjects(
filterLabels(runner.Labels, LabelKeyRunnerTemplateHash),
runner.Annotations,
runner.Spec,
r.GitHubClient.GithubBaseURL,
)
var seLinuxOptions *corev1.SELinuxOptions
if runner.Spec.SecurityContext != nil {
seLinuxOptions = runner.Spec.SecurityContext.SELinuxOptions
if seLinuxOptions != nil {
privileged = false
dockerdInRunnerPrivileged = false
}
}
pod := corev1.Pod{ pod := corev1.Pod{
ObjectMeta: metav1.ObjectMeta{ ObjectMeta: metav1.ObjectMeta{
Name: runner.Name, Name: runner.Name,
Namespace: runner.Namespace, Namespace: runner.Namespace,
Labels: runner.Labels, Labels: labels,
Annotations: runner.Annotations, Annotations: runner.Annotations,
}, },
Spec: corev1.PodSpec{ Spec: corev1.PodSpec{
@@ -330,58 +700,172 @@ func (r *RunnerReconciler) newPod(runner v1alpha1.Runner) (corev1.Pod, error) {
ImagePullPolicy: runnerImagePullPolicy, ImagePullPolicy: runnerImagePullPolicy,
Env: env, Env: env,
EnvFrom: runner.Spec.EnvFrom, EnvFrom: runner.Spec.EnvFrom,
VolumeMounts: []corev1.VolumeMount{
{
Name: "work",
MountPath: "/runner/_work",
},
{
Name: "docker",
MountPath: "/var/run",
},
},
SecurityContext: &corev1.SecurityContext{ SecurityContext: &corev1.SecurityContext{
RunAsGroup: &group, // Runner need to run privileged if it contains DinD
Privileged: &dockerdInRunnerPrivileged,
}, },
Resources: runner.Spec.Resources, Resources: runner.Spec.Resources,
}, },
{
Name: "docker",
Image: r.DockerImage,
VolumeMounts: []corev1.VolumeMount{
{
Name: "work",
MountPath: "/runner/_work",
},
{
Name: "docker",
MountPath: "/var/run",
},
},
SecurityContext: &corev1.SecurityContext{
Privileged: &privileged,
},
},
},
Volumes: []corev1.Volume{
{
Name: "work",
VolumeSource: corev1.VolumeSource{
EmptyDir: &corev1.EmptyDirVolumeSource{},
},
},
{
Name: "docker",
VolumeSource: corev1.VolumeSource{
EmptyDir: &corev1.EmptyDirVolumeSource{},
},
},
}, },
}, },
} }
if mtu := runner.Spec.DockerMTU; mtu != nil && dockerdInRunner {
pod.Spec.Containers[0].Env = append(pod.Spec.Containers[0].Env, []corev1.EnvVar{
{
Name: "MTU",
Value: fmt.Sprintf("%d", *runner.Spec.DockerMTU),
},
}...)
}
if mirror := runner.Spec.DockerRegistryMirror; mirror != nil && dockerdInRunner {
pod.Spec.Containers[0].Env = append(pod.Spec.Containers[0].Env, []corev1.EnvVar{
{
Name: "DOCKER_REGISTRY_MIRROR",
Value: *runner.Spec.DockerRegistryMirror,
},
}...)
}
//
// /runner must be generated on runtime from /runnertmp embedded in the container image.
//
// When you're NOT using dindWithinRunner=true,
// it must also be shared with the dind container as it seems like required to run docker steps.
//
runnerVolumeName := "runner"
runnerVolumeMountPath := "/runner"
runnerVolumeEmptyDir := &corev1.EmptyDirVolumeSource{}
if runner.Spec.VolumeSizeLimit != nil {
runnerVolumeEmptyDir.SizeLimit = runner.Spec.VolumeSizeLimit
}
pod.Spec.Volumes = append(pod.Spec.Volumes,
corev1.Volume{
Name: runnerVolumeName,
VolumeSource: corev1.VolumeSource{
EmptyDir: runnerVolumeEmptyDir,
},
},
)
pod.Spec.Containers[0].VolumeMounts = append(pod.Spec.Containers[0].VolumeMounts,
corev1.VolumeMount{
Name: runnerVolumeName,
MountPath: runnerVolumeMountPath,
},
)
if !dockerdInRunner && dockerEnabled {
pod.Spec.Volumes = append(pod.Spec.Volumes,
corev1.Volume{
Name: "work",
VolumeSource: corev1.VolumeSource{
EmptyDir: &corev1.EmptyDirVolumeSource{},
},
},
corev1.Volume{
Name: "certs-client",
VolumeSource: corev1.VolumeSource{
EmptyDir: &corev1.EmptyDirVolumeSource{},
},
},
)
pod.Spec.Containers[0].VolumeMounts = append(pod.Spec.Containers[0].VolumeMounts,
corev1.VolumeMount{
Name: "work",
MountPath: workDir,
},
corev1.VolumeMount{
Name: "certs-client",
MountPath: "/certs/client",
ReadOnly: true,
},
)
pod.Spec.Containers[0].Env = append(pod.Spec.Containers[0].Env, []corev1.EnvVar{
{
Name: "DOCKER_HOST",
Value: "tcp://localhost:2376",
},
{
Name: "DOCKER_TLS_VERIFY",
Value: "1",
},
{
Name: "DOCKER_CERT_PATH",
Value: "/certs/client",
},
}...)
// Determine the volume mounts assigned to the docker sidecar. In case extra mounts are included in the RunnerSpec, append them to the standard
// set of mounts. See https://github.com/summerwind/actions-runner-controller/issues/435 for context.
dockerVolumeMounts := []corev1.VolumeMount{
{
Name: "work",
MountPath: workDir,
},
{
Name: runnerVolumeName,
MountPath: runnerVolumeMountPath,
},
{
Name: "certs-client",
MountPath: "/certs/client",
},
}
if extraDockerVolumeMounts := runner.Spec.DockerVolumeMounts; extraDockerVolumeMounts != nil {
dockerVolumeMounts = append(dockerVolumeMounts, extraDockerVolumeMounts...)
}
pod.Spec.Containers = append(pod.Spec.Containers, corev1.Container{
Name: "docker",
Image: r.DockerImage,
VolumeMounts: dockerVolumeMounts,
Env: []corev1.EnvVar{
{
Name: "DOCKER_TLS_CERTDIR",
Value: "/certs",
},
},
SecurityContext: &corev1.SecurityContext{
Privileged: &privileged,
SELinuxOptions: seLinuxOptions,
},
Resources: runner.Spec.DockerdContainerResources,
})
if mtu := runner.Spec.DockerMTU; mtu != nil {
pod.Spec.Containers[1].Env = append(pod.Spec.Containers[1].Env, []corev1.EnvVar{
// See https://docs.docker.com/engine/security/rootless/
{
Name: "DOCKERD_ROOTLESS_ROOTLESSKIT_MTU",
Value: fmt.Sprintf("%d", *runner.Spec.DockerMTU),
},
}...)
pod.Spec.Containers[1].Args = append(pod.Spec.Containers[1].Args,
"--mtu",
fmt.Sprintf("%d", *runner.Spec.DockerMTU),
)
}
if mirror := runner.Spec.DockerRegistryMirror; mirror != nil {
pod.Spec.Containers[1].Args = append(pod.Spec.Containers[1].Args,
fmt.Sprintf("--registry-mirror=%s", *runner.Spec.DockerRegistryMirror),
)
}
}
if len(runner.Spec.Containers) != 0 { if len(runner.Spec.Containers) != 0 {
pod.Spec.Containers = runner.Spec.Containers pod.Spec.Containers = runner.Spec.Containers
for i := 0; i < len(pod.Spec.Containers); i++ {
if pod.Spec.Containers[i].Name == containerName {
pod.Spec.Containers[i].Env = append(pod.Spec.Containers[i].Env, env...)
}
}
} }
if len(runner.Spec.VolumeMounts) != 0 { if len(runner.Spec.VolumeMounts) != 0 {
@@ -433,6 +917,14 @@ func (r *RunnerReconciler) newPod(runner v1alpha1.Runner) (corev1.Pod, error) {
pod.Spec.TerminationGracePeriodSeconds = runner.Spec.TerminationGracePeriodSeconds pod.Spec.TerminationGracePeriodSeconds = runner.Spec.TerminationGracePeriodSeconds
} }
if len(runner.Spec.HostAliases) != 0 {
pod.Spec.HostAliases = runner.Spec.HostAliases
}
if runner.Spec.RuntimeClassName != nil {
pod.Spec.RuntimeClassName = runner.Spec.RuntimeClassName
}
if err := ctrl.SetControllerReference(&runner, &pod, r.Scheme); err != nil { if err := ctrl.SetControllerReference(&runner, &pod, r.Scheme); err != nil {
return pod, err return pod, err
} }
@@ -441,11 +933,17 @@ func (r *RunnerReconciler) newPod(runner v1alpha1.Runner) (corev1.Pod, error) {
} }
func (r *RunnerReconciler) SetupWithManager(mgr ctrl.Manager) error { func (r *RunnerReconciler) SetupWithManager(mgr ctrl.Manager) error {
r.Recorder = mgr.GetEventRecorderFor("runner-controller") name := "runner-controller"
if r.Name != "" {
name = r.Name
}
r.Recorder = mgr.GetEventRecorderFor(name)
return ctrl.NewControllerManagedBy(mgr). return ctrl.NewControllerManagedBy(mgr).
For(&v1alpha1.Runner{}). For(&v1alpha1.Runner{}).
Owns(&corev1.Pod{}). Owns(&corev1.Pod{}).
Named(name).
Complete(r) Complete(r)
} }

View File

@@ -20,6 +20,7 @@ import (
"context" "context"
"fmt" "fmt"
"hash/fnv" "hash/fnv"
"reflect"
"sort" "sort"
"time" "time"
@@ -37,10 +38,12 @@ import (
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"github.com/summerwind/actions-runner-controller/api/v1alpha1" "github.com/summerwind/actions-runner-controller/api/v1alpha1"
"github.com/summerwind/actions-runner-controller/controllers/metrics"
) )
const ( const (
LabelKeyRunnerTemplateHash = "runner-template-hash" LabelKeyRunnerTemplateHash = "runner-template-hash"
LabelKeyRunnerDeploymentName = "runner-deployment-name"
runnerSetOwnerKey = ".metadata.controller" runnerSetOwnerKey = ".metadata.controller"
) )
@@ -48,9 +51,11 @@ const (
// RunnerDeploymentReconciler reconciles a Runner object // RunnerDeploymentReconciler reconciles a Runner object
type RunnerDeploymentReconciler struct { type RunnerDeploymentReconciler struct {
client.Client client.Client
Log logr.Logger Log logr.Logger
Recorder record.EventRecorder Recorder record.EventRecorder
Scheme *runtime.Scheme Scheme *runtime.Scheme
CommonRunnerLabels []string
Name string
} }
// +kubebuilder:rbac:groups=actions.summerwind.dev,resources=runnerdeployments,verbs=get;list;watch;create;update;patch;delete // +kubebuilder:rbac:groups=actions.summerwind.dev,resources=runnerdeployments,verbs=get;list;watch;create;update;patch;delete
@@ -73,6 +78,8 @@ func (r *RunnerDeploymentReconciler) Reconcile(req ctrl.Request) (ctrl.Result, e
return ctrl.Result{}, nil return ctrl.Result{}, nil
} }
metrics.SetRunnerDeployment(rd)
var myRunnerReplicaSetList v1alpha1.RunnerReplicaSetList var myRunnerReplicaSetList v1alpha1.RunnerReplicaSetList
if err := r.List(ctx, &myRunnerReplicaSetList, client.InNamespace(req.Namespace), client.MatchingFields{runnerSetOwnerKey: req.Name}); err != nil { if err := r.List(ctx, &myRunnerReplicaSetList, client.InNamespace(req.Namespace), client.MatchingFields{runnerSetOwnerKey: req.Name}); err != nil {
return ctrl.Result{}, err return ctrl.Result{}, err
@@ -141,6 +148,28 @@ func (r *RunnerDeploymentReconciler) Reconcile(req ctrl.Request) (ctrl.Result, e
return ctrl.Result{RequeueAfter: 5 * time.Second}, nil return ctrl.Result{RequeueAfter: 5 * time.Second}, nil
} }
if !reflect.DeepEqual(newestSet.Spec.Selector, desiredRS.Spec.Selector) {
updateSet := newestSet.DeepCopy()
updateSet.Spec = *desiredRS.Spec.DeepCopy()
// A selector update change doesn't trigger replicaset replacement,
// but we still need to update the existing replicaset with it.
// Otherwise selector-based runner query will never work on replicasets created before the controller v0.17.0
// See https://github.com/summerwind/actions-runner-controller/pull/355#discussion_r585379259
if err := r.Client.Update(ctx, updateSet); err != nil {
log.Error(err, "Failed to update runnerreplicaset resource")
return ctrl.Result{}, err
}
// At this point, we are already sure that there's no need to create a new replicaset
// as the runner template hash is not changed.
//
// But we still need to requeue for the (possibly rare) cases that there are still old replicasets that needs
// to be cleaned up.
return ctrl.Result{RequeueAfter: 5 * time.Second}, nil
}
const defaultReplicas = 1 const defaultReplicas = 1
currentDesiredReplicas := getIntOrDefault(newestSet.Spec.Replicas, defaultReplicas) currentDesiredReplicas := getIntOrDefault(newestSet.Spec.Replicas, defaultReplicas)
@@ -159,25 +188,42 @@ func (r *RunnerDeploymentReconciler) Reconcile(req ctrl.Request) (ctrl.Result, e
return ctrl.Result{}, err return ctrl.Result{}, err
} }
// Do we old runner replica sets that should eventually deleted? // Do we have old runner replica sets that should eventually deleted?
if len(oldSets) > 0 { if len(oldSets) > 0 {
readyReplicas := newestSet.Status.ReadyReplicas var readyReplicas int
if newestSet.Status.ReadyReplicas != nil {
readyReplicas = *newestSet.Status.ReadyReplicas
}
if readyReplicas < currentDesiredReplicas { oldSetsCount := len(oldSets)
log.WithValues("runnerreplicaset", types.NamespacedName{
logWithDebugInfo := log.WithValues(
"newest_runnerreplicaset", types.NamespacedName{
Namespace: newestSet.Namespace, Namespace: newestSet.Namespace,
Name: newestSet.Name, Name: newestSet.Name,
}). },
Info("Waiting until the newest runner replica set to be 100% available") "newest_runnerreplicaset_replicas_ready", readyReplicas,
"newest_runnerreplicaset_replicas_desired", currentDesiredReplicas,
"old_runnerreplicasets_count", oldSetsCount,
)
return ctrl.Result{RequeueAfter: 10 * time.Second}, nil if readyReplicas < currentDesiredReplicas {
logWithDebugInfo.
Info("Waiting until the newest runnerreplicaset to be 100% available")
return ctrl.Result{}, nil
}
if oldSetsCount > 0 {
logWithDebugInfo.
Info("The newest runnerreplicaset is 100% available. Deleting old runnerreplicasets")
} }
for i := range oldSets { for i := range oldSets {
rs := oldSets[i] rs := oldSets[i]
if err := r.Client.Delete(ctx, &rs); err != nil { if err := r.Client.Delete(ctx, &rs); err != nil {
log.Error(err, "Failed to delete runner resource") log.Error(err, "Failed to delete runnerreplicaset resource")
return ctrl.Result{}, err return ctrl.Result{}, err
} }
@@ -188,14 +234,49 @@ func (r *RunnerDeploymentReconciler) Reconcile(req ctrl.Request) (ctrl.Result, e
} }
} }
if rd.Spec.Replicas == nil && desiredRS.Spec.Replicas != nil { var replicaSets []v1alpha1.RunnerReplicaSet
replicaSets = append(replicaSets, *newestSet)
replicaSets = append(replicaSets, oldSets...)
var totalCurrentReplicas, totalStatusAvailableReplicas, updatedReplicas int
for _, rs := range replicaSets {
var current, available int
if rs.Status.Replicas != nil {
current = *rs.Status.Replicas
}
if rs.Status.AvailableReplicas != nil {
available = *rs.Status.AvailableReplicas
}
totalCurrentReplicas += current
totalStatusAvailableReplicas += available
}
if newestSet.Status.Replicas != nil {
updatedReplicas = *newestSet.Status.Replicas
}
var status v1alpha1.RunnerDeploymentStatus
status.AvailableReplicas = &totalStatusAvailableReplicas
status.ReadyReplicas = &totalStatusAvailableReplicas
status.DesiredReplicas = &newDesiredReplicas
status.Replicas = &totalCurrentReplicas
status.UpdatedReplicas = &updatedReplicas
if !reflect.DeepEqual(rd.Status, status) {
updated := rd.DeepCopy() updated := rd.DeepCopy()
updated.Status.Replicas = desiredRS.Spec.Replicas updated.Status = status
if err := r.Status().Update(ctx, updated); err != nil { if err := r.Status().Patch(ctx, updated, client.MergeFrom(&rd)); err != nil {
log.Error(err, "Failed to update runnerdeployment status") log.Info("Failed to patch runnerdeployment status. Retrying immediately", "error", err.Error())
return ctrl.Result{
return ctrl.Result{}, err Requeue: true,
}, nil
} }
} }
@@ -256,28 +337,94 @@ func CloneAndAddLabel(labels map[string]string, labelKey, labelValue string) map
return newLabels return newLabels
} }
func (r *RunnerDeploymentReconciler) newRunnerReplicaSet(rd v1alpha1.RunnerDeployment) (*v1alpha1.RunnerReplicaSet, error) { // Clones the given selector and returns a new selector with the given key and value added.
newRSTemplate := *rd.Spec.Template.DeepCopy() // Returns the given selector, if labelKey is empty.
templateHash := ComputeHash(&newRSTemplate) //
// Add template hash label to selector. // Proudly copied from k8s.io/kubernetes/pkg/util/labels.CloneSelectorAndAddLabel
labels := CloneAndAddLabel(rd.Spec.Template.Labels, LabelKeyRunnerTemplateHash, templateHash) func CloneSelectorAndAddLabel(selector *metav1.LabelSelector, labelKey, labelValue string) *metav1.LabelSelector {
if labelKey == "" {
// Don't need to add a label.
return selector
}
newRSTemplate.Labels = labels // Clone.
newSelector := new(metav1.LabelSelector)
newSelector.MatchLabels = make(map[string]string)
if selector.MatchLabels != nil {
for key, val := range selector.MatchLabels {
newSelector.MatchLabels[key] = val
}
}
newSelector.MatchLabels[labelKey] = labelValue
if selector.MatchExpressions != nil {
newMExps := make([]metav1.LabelSelectorRequirement, len(selector.MatchExpressions))
for i, me := range selector.MatchExpressions {
newMExps[i].Key = me.Key
newMExps[i].Operator = me.Operator
if me.Values != nil {
newMExps[i].Values = make([]string, len(me.Values))
copy(newMExps[i].Values, me.Values)
} else {
newMExps[i].Values = nil
}
}
newSelector.MatchExpressions = newMExps
} else {
newSelector.MatchExpressions = nil
}
return newSelector
}
func (r *RunnerDeploymentReconciler) newRunnerReplicaSet(rd v1alpha1.RunnerDeployment) (*v1alpha1.RunnerReplicaSet, error) {
return newRunnerReplicaSet(&rd, r.CommonRunnerLabels, r.Scheme)
}
func getSelector(rd *v1alpha1.RunnerDeployment) *metav1.LabelSelector {
selector := rd.Spec.Selector
if selector == nil {
selector = &metav1.LabelSelector{MatchLabels: map[string]string{LabelKeyRunnerDeploymentName: rd.Name}}
}
return selector
}
func newRunnerReplicaSet(rd *v1alpha1.RunnerDeployment, commonRunnerLabels []string, scheme *runtime.Scheme) (*v1alpha1.RunnerReplicaSet, error) {
newRSTemplate := *rd.Spec.Template.DeepCopy()
for _, l := range commonRunnerLabels {
newRSTemplate.Spec.Labels = append(newRSTemplate.Spec.Labels, l)
}
templateHash := ComputeHash(&newRSTemplate)
// Add template hash label to selector.
newRSTemplate.ObjectMeta.Labels = CloneAndAddLabel(newRSTemplate.ObjectMeta.Labels, LabelKeyRunnerTemplateHash, templateHash)
// This label selector is used by default when rd.Spec.Selector is empty.
newRSTemplate.ObjectMeta.Labels = CloneAndAddLabel(newRSTemplate.ObjectMeta.Labels, LabelKeyRunnerDeploymentName, rd.Name)
selector := getSelector(rd)
newRSSelector := CloneSelectorAndAddLabel(selector, LabelKeyRunnerTemplateHash, templateHash)
rs := v1alpha1.RunnerReplicaSet{ rs := v1alpha1.RunnerReplicaSet{
TypeMeta: metav1.TypeMeta{}, TypeMeta: metav1.TypeMeta{},
ObjectMeta: metav1.ObjectMeta{ ObjectMeta: metav1.ObjectMeta{
GenerateName: rd.ObjectMeta.Name + "-", GenerateName: rd.ObjectMeta.Name + "-",
Namespace: rd.ObjectMeta.Namespace, Namespace: rd.ObjectMeta.Namespace,
Labels: labels, Labels: newRSTemplate.ObjectMeta.Labels,
}, },
Spec: v1alpha1.RunnerReplicaSetSpec{ Spec: v1alpha1.RunnerReplicaSetSpec{
Replicas: rd.Spec.Replicas, Replicas: rd.Spec.Replicas,
Selector: newRSSelector,
Template: newRSTemplate, Template: newRSTemplate,
}, },
} }
if err := ctrl.SetControllerReference(&rd, &rs, r.Scheme); err != nil { if err := ctrl.SetControllerReference(rd, &rs, scheme); err != nil {
return &rs, err return &rs, err
} }
@@ -285,7 +432,12 @@ func (r *RunnerDeploymentReconciler) newRunnerReplicaSet(rd v1alpha1.RunnerDeplo
} }
func (r *RunnerDeploymentReconciler) SetupWithManager(mgr ctrl.Manager) error { func (r *RunnerDeploymentReconciler) SetupWithManager(mgr ctrl.Manager) error {
r.Recorder = mgr.GetEventRecorderFor("runnerdeployment-controller") name := "runnerdeployment-controller"
if r.Name != "" {
name = r.Name
}
r.Recorder = mgr.GetEventRecorderFor(name)
if err := mgr.GetFieldIndexer().IndexField(&v1alpha1.RunnerReplicaSet{}, runnerSetOwnerKey, func(rawObj runtime.Object) []string { if err := mgr.GetFieldIndexer().IndexField(&v1alpha1.RunnerReplicaSet{}, runnerSetOwnerKey, func(rawObj runtime.Object) []string {
runnerSet := rawObj.(*v1alpha1.RunnerReplicaSet) runnerSet := rawObj.(*v1alpha1.RunnerReplicaSet)
@@ -306,5 +458,6 @@ func (r *RunnerDeploymentReconciler) SetupWithManager(mgr ctrl.Manager) error {
return ctrl.NewControllerManagedBy(mgr). return ctrl.NewControllerManagedBy(mgr).
For(&v1alpha1.RunnerDeployment{}). For(&v1alpha1.RunnerDeployment{}).
Owns(&v1alpha1.RunnerReplicaSet{}). Owns(&v1alpha1.RunnerReplicaSet{}).
Named(name).
Complete(r) Complete(r)
} }

Some files were not shown because too many files have changed in this diff Show More