Commit Graph

145 Commits

Author SHA1 Message Date
Felipe Galindo Sanchez
d0d316252e Option to consider runner group visibility on scale based on webhook (#1062)
This will work on GHES but GitHub Enterprise Cloud due to excessive GitHub API calls required.
More work is needed, like adding a cache layer to the GitHub client, to make it usable on GitHub Enterprise Cloud.

Fixes additional cases from https://github.com/actions-runner-controller/actions-runner-controller/pull/1012

If GitHub auth is provided in the webhooks controller then runner groups with custom visibility are supported. Otherwise, all runner groups will be assumed to be visible to all repositories

`getScaleUpTargetWithFunction()` will check if there is an HRA available with the following flow:

1. Search for **repository** HRAs - if so it ends here
2. Get available HRAs in k8s
3. Compute visible runner groups
  a. If GitHub auth is provided - get all the runner groups that are visible to the repository of the incoming webhook using GitHub API calls.  
  b. If GitHub auth is not provided - assume all runner groups are visible to all repositories
4. Search for **default organization** runners (a.k.a runners from organization's visible default runner group) with matching labels
5. Search for **default enterprise** runners (a.k.a runners from enterprise's visible default runner group) with matching labels
6. Search for **custom organization runner groups** with matching labels
7. Search for **custom enterprise runner groups** with matching labels

Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2022-02-16 19:08:56 +09:00
Yusuke Kuoka
59437ef79f Update README.md
Ref https://github.com/actions-runner-controller/actions-runner-controller/issues/1100#issuecomment-1032775144
2022-02-09 09:16:46 +09:00
Callum Tait
eb53d238d1 docs: move istio to troubleshooting (#1097)
Co-authored-by: toast-gear <toast-gear@users.noreply.github.com>
2022-02-07 20:49:26 +00:00
Chris Bui
1b911749a6 feat: disable automatic runner updates (#1088)
* Add env variable to configure `disablupdate` flag

* Write test for entrypoint disable update

* Rename flag, update docs for DISABLE_RUNNER_UPDATE

* chore: bump runner version in makefile

Co-authored-by: Callum Tait <15716903+toast-gear@users.noreply.github.com>
2022-02-03 21:03:38 +00:00
Yusuke Kuoka
01301d3ce8 Stop creating registration-only runners on scale-to-zero (#1028)
Resolves #859
2022-01-07 09:56:21 +09:00
Hyeonmin Park
1a6e5719c3 test: Add tests with self-hosted label for #953 (#1030) 2022-01-07 08:50:26 +09:00
Callum Tait
f72d871c5b docs: move troubleshooting out of main readme (#1023)
Co-authored-by: toast-gear <toast-gear@users.noreply.github.com>
2021-12-29 10:24:09 +09:00
Callum Tait
ad48851dc9 feat: expose if docker is enabled and wait for docker to be ready (#962)
Resolves #897
Resolves #915
2021-12-29 10:23:35 +09:00
clement-loiselet-talend
0c34196d87 fix(#951): add exception for self-hosted label in webhook search (#953)
The webhook "workflowJob" pass the labels the job needs to the controller, who in turns search for them in its RunnerDeployment / RunnerSet. The current implementation ignore the search for `self-hosted` if this is the only label, however if multiple labels are found the `self-hosted` label must be declared explicitely or the RD / RS will not be selected for the autoscaling.

This PR fixes the behavior by ignoring this label, and add documentation on this webhook for the other labels that will still require an explicit declaration (OS and architecture). 

The exception should be temporary, ideally the labels implicitely created (self-hosted, OS, architecture) should be searchable alongside the explicitly declared labels.

code tested, work with `["self-hosted"]` and `["self-hosted","anotherLabel"]`

Fixes #951
2021-12-19 10:55:23 +09:00
Jacob Lauritzen
83c8a9809e Add GKE firewall issues to common errors (#1010)
A lot of people have issues with private GKE clusters and it seems they are all solved by setting up a firewall policy. I think it would be relevant to include this in a troubleshooting-section since so many people are searching around issues for it. I myself just spent most of my day trying to figure it out.

Issues where this is the solution:
* #293
* #335
* #68
2021-12-17 09:10:15 +09:00
Pavel Smalenski
91102c8088 Add dockerEnv variable for RunnerDeployment (#912)
Resolves #878

Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2021-12-14 17:13:24 +09:00
kannappan senthilnathan
473fe7f736 update readme instruction for webhook scaling (#987)
added note about mandatory parameters for webhook driven scaling
2021-12-12 16:16:54 +09:00
Callum Tait
acb004f291 docs: remove RunnerSet limitation (#991) 2021-12-08 22:03:42 +00:00
Jonathan Sokolowski
3de4e7e9c6 Support installing without cert-manager (#834)
* Support installing without cert-manager
2021-12-08 21:58:46 +00:00
Callum Tait
23841642df docs: bring ephemeral runners to current (#981)
Co-authored-by: toast-gear <toast-gear@users.noreply.github.com>
2021-12-04 11:12:24 +00:00
Callum Tait
550717020d ci: remove ubuntu 18.04 image (#980)
* ci: remove Ubuntu 18.04 runner

Co-authored-by: toast-gear <toast-gear@users.noreply.github.com>
2021-12-04 10:46:08 +00:00
Callum Tait
9a5ae93cb7 docs: various clarifications (#976)
* docs: various clarifications
2021-11-30 19:22:49 +00:00
Yusuke Kuoka
b87e6e3966 doc: Add GitHub App Permission for Workflow Job Webhook (#932)
* doc: Add GitHub App Permission for Workflow Job Webhook

I think we've missed documenting about the app permission for receiving workflow_job events via the app webhook

* docs: clarification on webhook events

Co-authored-by: Callum Tait <15716903+toast-gear@users.noreply.github.com>
2021-11-09 09:12:47 +09:00
Callum Tait
f7e14e06e8 docs: clarify runnerset limitation (#919)
* docs: clarify runnerset limitation

* docs: correct terminology

* docs: missing word
2021-10-28 11:19:56 +01:00
Callum Tait
09e6b1839b docs: correct anti-flapping example (#918)
* docs: correct anti-flapping example

* docs: provide complete example

* docs: use example/myrepo everywhere
2021-10-28 10:52:54 +01:00
Callum Tait
0416a9272f docs: minor correction to autoscaling description (#917)
* docs: minor correction to autoscaling description

* docs: make pull metrics plural

* docs: remove redundant words

* docs: make plural and expand anti-flapping
2021-10-28 10:33:20 +01:00
Huu Khiem (Mark)
c33578a041 Update examples of RunnerDeployment in README.md (#913)
The README had some examples of RunnerDeployment with wrong/outdated
spec.

This caused some pain when trying to create them based on the examples.

=> Fixed using the spec declared in: api/v1alpha1/runner_types.go
2021-10-28 10:12:48 +01:00
Callum Tait
49566aaebd docs: fix broken link (#916) 2021-10-28 09:57:46 +01:00
Callum Tait
f66e6a00fa docs: clean up auto scaling documentation (#909)
* docs: clean up of autoscaling section

* docs: clarifying anti-flapping

* docs: more improvements

* docs: more improvements

* docs: adding duration details and cavaets

* docs: smaller english and better structure

* docs: use consistent wording

* docs: adding limitation cavaet for RunnerSets

* docs: correct helm uprgade order

* docs: lines helm upgrade command with help switch

* docs: use existing limitations section

* docs: fix table of headers and contents

* docs: add link to runnersets on first mention

* docs: adding runnerset limitation

* chore: use new enterprise permission for PAT

* docs: bump example deploy to latest version

* docs: adding oauth apps link

* docs: adding cavaet to the oauth apps doc
2021-10-28 09:55:04 +01:00
Maxim Pogozhiy
fce7d6d2a7 Add topologySpreadConstraints (#814) 2021-10-17 21:49:44 +01:00
Callum Tait
24224613f3 docs: updating to reflect the latest release (#844)
* docs: updating to reflect the latest release

* docs: further updates

* docs: format updates

* docs: correcting title sizes

* docs: correcting links

* docs: correcting the grammar of multi controllers

* docs: slightly less awkward English
2021-09-28 09:44:15 +01:00
Callum Tait
40c88eb490 docs: slight update to the wording 2021-09-14 17:30:46 +09:00
Yusuke Kuoka
fe64850d3d Document and values.yaml updates for leader election customization
Follow-up for #806
2021-09-14 17:30:46 +09:00
toast-gear
6f27b4920e docs: watch namespace feature (#786)
Fixes #455
2021-09-06 08:46:01 +09:00
Tarasovych
3f801af72a Update README.md (#774) 2021-08-31 09:47:20 +09:00
Tarasovych
7008b0c257 feat: Organization RunnerDeployment with webhook-based autoscaling only for certain repositories (#766)
Resolves #765

Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2021-08-31 09:46:36 +09:00
Tarasovych
d9df455781 Update README.md (#775)
Update organization App url query parameters
2021-08-31 09:44:29 +09:00
Patrick Ellis
a815c37614 docs: fix a few small YAML typos (#763)
- Remove two extra colons that were making the yaml invalid 🕵️
- Add `yaml` tags to the markdown blocks 🧹
2021-08-25 09:13:20 +01:00
Yusuke Kuoka
2c711506ea Update documentation about epehemral runners and RunnerSet (#727)
Follow-up for #721 and #629
2021-08-25 09:26:26 +09:00
Tarasovych
dfa0f2eef4 Update README.md (#735)
Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
Co-authored-by: Erik Sundell <erik.i.sundell@gmail.com>
2021-08-25 09:25:51 +09:00
Patrick Ellis
91c22ef964 doc: GitHub calls it admin:enterprise, not enterprise:admin (#759) 2021-08-24 10:26:43 +09:00
Yusuke Kuoka
fabead8c8e feat: Workflow job based ephemeral runner scaling (#721)
This add support for two upcoming enhancements on the GitHub side of self-hosted runners, ephemeral runners, and `workflow_jow` events. You can't use these yet.

**These features are not yet generally available to all GitHub users**. Please take this pull request as a preparation to make it available to actions-runner-controller users as soon as possible after GitHub released the necessary features on their end.

**Ephemeral runners**:

The former, ephemeral runners, is basically the reliable alternative to `--once`, which we've been using when you enabled `ephemeral: true` (default in actions-runner-controller).

`--once` has been suffering from a race issue #466. `--ephemeral` fixes that.

To enable ephemeral runners with `actions/runner`, you give `--ephemeral` to `config.sh`. This updated version of `actions-runner-controller` does it for you, by using `--ephemeral` instead of `--once` when you set `RUNNER_FEATURE_FLAG_EPHEMERAL=true`.

Please read the section `Ephemeral Runners` in the updated version of our README for more information.

Note that ephemeral runners is not released on GitHub yet. And `RUNNER_FEATURE_FLAG_EPHEMERAL=true` won't work at all until the feature gets released on GitHub. Stay tuned for an announcement from GitHub!

**`workflow_job` events**:

`workflow_job` is the additional webhook event that corresponds to each GitHub Actions workflow job run. It provides `actions-runner-controller` a solid foundation to improve our webhook-based autoscale.

Formerly, we've been exploiting webhook events like `check_run` for autoscaling. However, as none of our supported events has included `labels`, you had to configure an HRA to only match relevant `check_run` events. It wasn't trivial.

In contrast, a `workflow_job` event payload contains `labels` of runners requested. `actions-runner-controller` is able to automatically decide which HRA to scale by filtering the corresponding RunnerDeployment by `labels` included in the webhook payload. So all you need to use webhook-based autoscale will be to enable `workflow_job` on GitHub and expose actions-runner-controller's webhook server to the internet.

Note that the current implementation of `workflow_job` support works in two ways, increment, and decrement. An increment happens when the webhook server receives` workflow_job` of `queued` status. A decrement happens when it receives `workflow_job` of `completed` status. The latter is used to make scaling-down faster so that you waste money less than before. You still don't suffer from flapping, as a scale-down is still subject to `scaleDownDelaySecondsAfterScaleOut `.

Please read the section `Example 3: Scale on each `workflow_job` event` in the updated version of our README for more information on its usage.
2021-08-11 09:52:04 +09:00
Rolf Ahrenberg
d528d18211 Fix markdown header (#718) 2021-08-09 14:37:57 +01:00
toast-gear
7e593a80ff docs: more improvements to the english used 2021-08-06 17:36:11 +01:00
toast-gear
27bdc780a3 docs: better english 2021-08-06 17:34:53 +01:00
toast-gear
3948406374 docs: using better english 2021-08-06 17:32:58 +01:00
toast-gear
743e6d6202 feat: bump runner version (#705)
* feat: bump runner version

* feat: remove deprecated env var

* docs: updating the docs

Co-authored-by: Callum James Tait <callum.tait@photobox.com>
2021-07-30 19:58:04 +09:00
Rolf Ahrenberg
29260549fa Document volumeStorageMedium and volumeSizeLimit (#700)
Related to #674
2021-07-21 07:50:25 +09:00
Rolf Ahrenberg
14564c7b8e Allow disabling /runner emptydir mounts and setting storage volume (#674)
* Allow disabling /runner emptydir mounts

* Support defining storage medium for emptydirs

* Fix typos
2021-07-15 06:29:58 +09:00
lucas-pate
dcea0f7f79 Update README.md to fix scaleUp/Down examples (#684)
* Update README.md to fix scaleUp/Down examples

* fix comment formatting
2021-07-11 09:05:43 +09:00
toast-gear
9437e164b4 docs: runner startup delay docs PR #678 (#679)
* docs: runner startup delay docs PR #678

* docs: adding in immutable tag into the docs
2021-07-03 12:02:37 +01:00
Yusuke Kuoka
3b45d1b334 doc: Describe RunnerSet (#654)
Ref #629
Ref #613
Ref #612
2021-06-26 07:34:58 +09:00
Callum James Tait
774db3fef4 docs: moving dev docs to contributing md 2021-06-23 08:48:43 +09:00
Yusuke Kuoka
8b90b0f0e3 Clean up import list (#645)
Resolves #644
2021-06-22 17:55:06 +09:00
Yusuke Kuoka
af0ca03752 doc: Introduce summerwind/actions-runner images (#634)
I have noticed that this isnt documented anywhere while working on https://github.com/actions-runner-controller/actions-runner-controller/issues/631#issuecomment-862807900
2021-06-22 17:07:36 +09:00