Commit Graph

425 Commits

Author SHA1 Message Date
toast-gear
cb60c1ec3b docs: add explicit permission list (#593)
Fixes https://github.com/actions-runner-controller/actions-runner-controller/issues/543

Co-authored-by: Callum James Tait <callum.tait@photobox.com>
2021-06-02 08:52:14 +09:00
Christian Dobinsky
e108e04dda chart: add podLabels to helm chart (#583)
* Add pod labels to helm chart

* fix: make podLabels consistent to podAnnotations

* Update charts/actions-runner-controller/Chart.yaml

Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2021-06-01 09:21:32 +09:00
toast-gear
2e083bca28 fix: fixing mising pip PATH (#585)
* fix: fixing mising pip PATH

* chore: removing User Site Directory

Co-authored-by: Callum James Tait <callum.tait@photobox.com>
2021-06-01 09:21:14 +09:00
toast-gear
198b13324d ci: only run latest tag job on push / release (#586)
* ci: only run latest tag job on merge

* ci: update job conditional
2021-06-01 09:18:50 +09:00
toast-gear
605dae3995 docs: add docs for upgrading the project when using Helm (#582)
* docs: adding upgrade notes for Helm

* chore: adding new ignore

* docs: add in cmd to check for stuck runners

* docs: better format

* docs: removing superfluous steps

* docs: moved location of docs

Co-authored-by: Callum James Tait <callum.tait@photobox.com>
2021-05-29 10:37:07 +09:00
toast-gear
d2b0920454 chore: removing dead chart parameters (#577)
* chore: removing autoscale parameters

* chore: removing dead parameter

* chore: removing dead parameters
2021-05-28 08:57:25 +09:00
Yair Fried
2cbeca0e7c chart: Add service monitor and remove kube_rbac_proxy leftovers (#527)
* remove all authProxy refs

* Add serviceMonitor

* fix metrics port

* fix newline

* fix newline

* bump chart version

* fix indentation typo

* Rename metrics.proxy

* Make metrics.portNumber configurable

* fix metrics port

* revert: chart version change

Co-authored-by: toast-gear <15716903+toast-gear@users.noreply.github.com>
2021-05-26 12:10:25 +01:00
Callum James Tait
859e04a680 chore: moving python to alphabetical order 2021-05-26 09:32:01 +09:00
Callum James Tait
c0821d4ede chore: correcting lists removal path 2021-05-26 09:32:01 +09:00
Callum James Tait
c3a6e45920 chore: aligning package order 2021-05-26 09:32:01 +09:00
Callum James Tait
818dfd6515 chore: whitespace alignment 2021-05-26 09:32:01 +09:00
Callum James Tait
726b39aedd feat: adding pip to base image 2021-05-26 09:32:01 +09:00
toast-gear
7638c21e92 docs: adding caveat to scaling metric (#570)
* docs: adding caveat to scaling metric

* docs: better wording

Fixes #338
2021-05-25 10:23:32 +09:00
Viktor Anderling
c09d6075c6 Add topologySpreadConstraints to helm chart (#569)
This commit adds the ability to use topologySpreadConstraints in the
helm chart by populating either one or both of topologySpreadConstraints
and githubWebhookServer.topologySpreadConstraints values.

See the official docs:
https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/

Resolves #567
2021-05-25 10:23:08 +09:00
callum-tait-pbx
39d37a7d28 docs: removing git version (#572)
The version of git bundled isn't pinned
2021-05-24 21:47:33 +01:00
toast-gear
de0315380d docs: better formating (#571) 2021-05-24 21:25:27 +01:00
toast-gear
906ddacbc6 chore: lowering daysUntilStale config (#568) 2021-05-24 09:41:24 +01:00
toast-gear
c388446668 docs: adding comment on permissions being included (#565)
* docs: adding comment on permissions being included

* docs: aligning text across readme
2021-05-22 20:05:19 +09:00
Yusuke Kuoka
d56971ca7c Fix typo (sucessfully -> successfully (#563)
Follow-up for #556
2021-05-22 08:36:18 +09:00
Yusuke Kuoka
cb14d7530b Add HRA printer column "SCHEDULE" (#561)
Adds a column to help the operator see if they configured HRA.Spec.ScheduledOverrides correctly, in a form of "next override schedule recognized by the controller":

```
$ k get horizontalrunnerautoscaler
NAME                            MIN   MAX   DESIRED   SCHEDULE
actions-runner-aos-autoscaler   0     5     0
org                             0     5     0         min=0 time=2021-05-21 15:00:00 +0000 UTC
```

Ref https://github.com/actions-runner-controller/actions-runner-controller/issues/484
2021-05-22 08:29:53 +09:00
Yusuke Kuoka
fbb24c8c0a chore: update issue templates (#559)
* Update bug_report.md

* chore: removing default label for enhancement

Co-authored-by: toast-gear <15716903+toast-gear@users.noreply.github.com>
2021-05-21 16:51:07 +01:00
Yusuke Kuoka
0b88b246d3 Fix additionalPrinterColumns (#556)
This fixes human-readable output of `kubectl get` on `runnerdeployment`, `runnerreplicaset`, and `runner`.

Most notably, CURRENT and READY of runner replicasets are now computed and printed correctly. Runner deployments now have UP-TO-DATE and AVAILABLE instead of READY so that it is consistent with columns of K8s deployments.

A few fixes has been also made to runner deployment and runner replicaset controllers so that those numbers stored in Status objects are reliably updated and in-sync with actual values.

Finally, `AGE` columns are added to runnerdeployment, runnerreplicaset, runnner to make that more visible to users.

`kubectl get` outputs should now look like the below examples:

```
# Immediately after runnerdeployment updated/created
$ k get runnerdeployment
NAME                   DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
example-runnerdeploy   0         0         0            0           8d
org-runnerdeploy       5         5         5            0           8d

# A few dozens of seconds after update/create all the runners are registered that "available" numbers increase
$ k get runnerdeployment
NAME                   DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
example-runnerdeploy   0         0         0            0           8d
org-runnerdeploy       5         5         5            5           8d
```

```
$ k get runnerreplicaset
NAME                         DESIRED   CURRENT   READY   AGE
example-runnerdeploy-wnpf6   0         0         0       61m
org-runnerdeploy-fsnmr       2         2         0       8m41s
```

```
$ k get runner
NAME                                           ENTERPRISE   ORGANIZATION                REPOSITORY                                       LABELS                      STATUS    AGE
example-runnerdeploy-wnpf6-registration-only                                            actions-runner-controller/mumoshu-actions-test                               Running   61m
org-runnerdeploy-fsnmr-n8kkx                                actions-runner-controller                                                    ["mylabel 1","mylabel 2"]             21s
org-runnerdeploy-fsnmr-sq6m8                                actions-runner-controller                                                    ["mylabel 1","mylabel 2"]             21s
```

Fixes #490
2021-05-21 09:10:47 +09:00
Yusuke Kuoka
a4631f345b Update issue templates (#552) 2021-05-18 18:15:00 +09:00
Yusuke Kuoka
7be31ce3e5 kubectl-diff / dry-run support (#549)
Resolves #266
2021-05-17 09:36:13 +09:00
toast-gear
57a7b8076f docs: correcting shell command (#548)
Fixes #546
2021-05-16 09:08:41 +09:00
ToMe25
5309b1c02c Fix acceptance test not working due to missing SYNC_PERIOD (#542)
Fixes #533
2021-05-11 20:30:34 +09:00
Yusuke Kuoka
ae09e6ebb7 Make log level configurable (#541)
Resolves #425
2021-05-11 20:23:06 +09:00
Yusuke Kuoka
3cd124dce3 chore: Add debug logs for scheduledOverrides (#540)
Follow-up for #515
Ref #484
2021-05-11 17:30:22 +09:00
Yusuke Kuoka
25f5817a5e Improve debug log in webhook-based autoscaling
Adds some helpful debug log messages I have used while verifying #534
2021-05-11 15:49:03 +09:00
Yusuke Kuoka
0510f19607 chore: Enhance acceptance test to cover webhook-based autoscaling for repo and org runners
Adds what I used while verifying #534
2021-05-11 15:36:02 +09:00
Yusuke Kuoka
9d961c58ff Log used settings on startup 2021-05-11 11:46:35 +09:00
Yusuke Kuoka
ab25907050 chart: Add githubAPICacheDuration
Ref #502
2021-05-11 11:46:35 +09:00
Yusuke Kuoka
6cbba80df1 Add --github-api-cache-duration
Resolves #502
2021-05-11 11:46:35 +09:00
Liam Gibson
082245c5db Fix typos in README.md (#528) 2021-05-08 21:29:11 +09:00
Yusuke Kuoka
a82e020daa Add notes for unreleased features (#526) 2021-05-05 14:59:36 +09:00
Yusuke Kuoka
c8c2d44a5c Add documentation for ScheduledOverrides (#525)
Ref #484
2021-05-05 14:54:50 +09:00
Yusuke Kuoka
4e7b8b57c0 edge: Enable scaling from zero with PercentageRunnersBusy (#524)
`PercentageRunnersBusy`, in combination with a secondary `TotalInProgressAndQueuedWorkflowRuns` metric, enables scale-from-zero for PercentageRunnersBusy.

Please see the new `Autoscaling to/from 0` section in the updated documentation about how it works.

Resolves #522
2021-05-05 14:27:17 +09:00
Yusuke Kuoka
e7020c7c0f Fix scale-from-zero to retain the reg-only runner until other pods come up (#523)
Fixes #516
2021-05-05 12:13:51 +09:00
Yair Fried
cb54864387 chart: Allow to disabling kube-rbac-proxy and expose metrics (#511)
Fixes #454
2021-05-03 23:36:01 +09:00
Yusuke Kuoka
0e0f385f72 Experimental support for ScheduledOverrides (#515)
This adds the initial version of ScheduledOverrides to HorizontalRunnerAutoscaler.
`MinReplicas` overriding should just work.
When there are two or more ScheduledOverrides, the earliest one that matched is activated. Each ScheduledOverride can be recurring or one-time. If you have two or more ScheduledOverrides, only one of them should be one-time. And the one-time override should be the earliest item in the list to make sense.

Tests will be added in another commit. Logging improvements and additional observability in HRA.Status will also be added in yet another commits.

Ref #484
2021-05-03 23:31:17 +09:00
Yusuke Kuoka
b3cae25741 Enhance HorizontalRunnerAutoscaler API for ScheduledOverrides (#514)
This adds types and CRD changes related to HorizontalRunnerAutoscaler for the upcoming ScheduledOverrides feature.

Ref #484
2021-05-03 22:31:54 +09:00
Yusuke Kuoka
469b117a09 Foundation for ScheduledOverrides (#513)
Adds two types `RecurrenceRule` and `Period` and one function `MatchSchedule` as the foundation for building the upcoming ScheduledOverrides feature.

Ref #484
2021-05-03 22:03:49 +09:00
Yusuke Kuoka
5f59734078 Fix docker-login failing since move to GitHub organization (#510)
Fixes #509
2021-05-03 14:56:58 +09:00
Yusuke Kuoka
e00b3b9714 Make development cycle faster (#508)
Improves Makefile, acceptance/deploy.sh, acceptance/testdata/runnerdeploy.yaml, and the documentation to help developers and contributors.
2021-05-03 13:03:17 +09:00
Thejas N
588872a316 feat: allow ephemeral runner to be optional (#498)
- Adds `ephemeral` option to `runner.spec` 
    
    ```
      ....
      template:
         spec:
             ephemeral: false
             repository: mumoshu/actions-runner-controller-ci
      ....
    ```
- `ephemeral` defaults to `true`
- `entrypoint.sh` in runner/Dockerfile modified to read `RUNNER_EPHEMERAL` flag
- Runner images are backward-compatible. `--once` is omitted only when the new envvar `RUNNER_EPHEMERAL` is explicitly set to `false`.

Resolves #457
2021-05-02 19:04:14 +09:00
Yusuke Kuoka
a0feee257f Add .dockerignore for controller to accelerate image rebuild in local dev env (#504)
Previously any non-go changes resulted in `make docker-build` rerunning time-consufming `go build`. This fixes that by adding clearly unnecessary files .dockerignore
2021-05-02 16:47:07 +09:00
Christoph Brand
a18ac330bb feature(controller): allow autoscaler to scale down to 0 (#447) 2021-05-02 16:46:51 +09:00
Yusuke Kuoka
0901456320 Update README with more detailed test instructions (#503)
- You can now use `make acceptance/run` to run only a specific acceptance test case
- Add note about Ubuntu 20.04 users / snap-provided docker
- Add instruction to run Ginkgo tests
- Extract acceptance/load from acceptance/kind
- Make `acceptance/pull` not depend on `docker-build`, so that you can do `make docker-build acceptance/load` for faster image reload
2021-05-02 16:31:07 +09:00
Yusuke Kuoka
dbd7b486d2 feat: Support for scaling from/to zero (#465)
This is an attempt to support scaling from/to zero.

The basic idea is that we create a one-off "registration-only" runner pod on RunnerReplicaSet being scaled to zero, so that there is one "offline" runner, which enables GitHub Actions to queue jobs instead of discarding those.

GitHub Actions seems to immediately throw away the new job when there are no runners at all. Generally, having runners of any status, `busy`, `idle`, or `offline` would prevent GitHub actions from failing jobs. But retaining `busy` or `idle` runners means that we need to keep runner pods running, which conflicts with our desired to scale to/from zero, hence we retain `offline` runners.

In this change, I enhanced the runnerreplicaset controller to create a registration-only runner on very beginning of its reconciliation logic, only when a runnerreplicaset is scaled to zero. The runner controller creates the registration-only runner pod, waits for it to become "offline", and then removes the runner pod. The runner on GitHub stays `offline`, until the runner resource on K8s is deleted. As we remove the registration-only runner pod as soon as it registers, this doesn't block cluster-autoscaler.

Related to #447
2021-05-02 16:11:36 +09:00
callum-tait-pbx
7e766282aa ci: updating paths-ignore (#496)
* chore: updating paths-ignore

* chore: adding more path-ignores
2021-05-01 21:36:45 +09:00