Compare commits

..

83 Commits

Author SHA1 Message Date
Nikola Jokic
abc0b678d3 Revert chart name and use helper constant to trim the name base (#2824)
Co-authored-by: Bassem Dghaidi <568794+Link-@users.noreply.github.com>
2023-08-21 15:36:14 +02:00
Bassem Dghaidi
963ab2a748 Fix workflow after chart renaming (#2822) 2023-08-21 14:28:55 +02:00
Bassem Dghaidi
8a41a596b6 Prepare 0.5.0 release (#2783) 2023-08-21 14:10:36 +02:00
Bassem Dghaidi
e10c437f46 Move gha-* docs out of preview (#2779) 2023-08-21 14:06:12 +02:00
Nikola Jokic
a0a3916c80 Provide scale-set listener metrics (#2559)
Co-authored-by: Tingluo Huang <tingluohuang@github.com>
Co-authored-by: Bassem Dghaidi <568794+Link-@users.noreply.github.com>
2023-08-21 13:50:07 +02:00
Nikola Jokic
1c360d7e26 Document customization for containerModes (#2777) 2023-08-18 11:03:28 +02:00
github-actions[bot]
20bb860a37 Updates: runner to v2.308.0 (#2814)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-08-15 12:53:03 +02:00
Nikola Jokic
6a75bc0880 Trim gha-runner-scale-set to gha-rs in names and remove role type suffixes (#2706) 2023-08-09 11:11:45 +02:00
Lukas Hauser
78271000c0 Logs - Add missing formatting (#2780) 2023-08-09 17:54:24 +09:00
Juliet Boyd
a36b0e58b0 Clarify multiple metrics in docs (#2712)
Co-authored-by: Dylan Boyd <5061312+dylanjboyd@users.noreply.github.com>
2023-08-09 17:53:39 +09:00
Nikola Jokic
336e11a4e9 Fix scaling back to 0 after min runners were set to number > 0 (#2742) 2023-08-09 10:32:08 +02:00
github-actions[bot]
dcb64f0b9e Updates: runner to v2.307.1 (#2778)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-07-26 20:02:33 +02:00
Nikola Jokic
0dadfc4d37 ADR: Customize listener pod (#2752) 2023-07-25 16:47:26 +02:00
Thorsten Wildberger
dc58f6ba13 feat: allow more dockerd options (#2701) 2023-07-25 13:59:49 +09:00
arielly-parussulo
06cbd632b8 add interval and timeout configuration for the actions-runner-controler serviceMonitors (#2654)
Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2023-07-25 13:59:41 +09:00
Paweł Rein
9f33ae1507 fixed indent in a README example (#2725) 2023-07-25 13:45:44 +09:00
Ekaterina Sobolevskaia
63a6b5a7f0 add opportunity write dnsPolicy for controller by helm values (#2708) 2023-07-25 13:38:13 +09:00
marcin-motyl
fddc5bf1c8 Fix deployment & service values in actionsMetrics (#2683) 2023-07-25 09:56:20 +09:00
Daniel Kubat
d90ce2bed5 Upgrade Docker Compose to v2.20.0 (#2738) 2023-07-25 09:54:09 +09:00
Gavin Williams
cd996e7c27 Fix panic: slice bounds out of range when runner spec contains volumeMounts. (#2720)
Signed-off-by: Gavin Williams <gavin.williams@machinemax.com>
2023-07-25 09:53:50 +09:00
Lars Lange
297442975e fix: remove callbacks resulting in scales due to incomplete response (#2671)
Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2023-07-25 09:04:54 +09:00
dependabot[bot]
5271f316e6 Bump golang.org/x/net from 0.11.0 to 0.12.0 (#2750)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-18 12:07:33 +02:00
dependabot[bot]
9845a934f4 Bump github.com/cloudflare/circl from 1.1.0 to 1.3.3 (#2628)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Nikola Jokic <jokicnikola07@gmail.com>
2023-07-14 13:48:27 +02:00
github-actions[bot]
e0a7e142e0 Updates: runner to v2.306.0 (#2727)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-07-07 14:48:40 +02:00
dependabot[bot]
f9a11a8b0b chore(deps): bump github.com/stretchr/testify from 1.8.2 to 1.8.4 (#2716)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Nikola Jokic <jokicnikola07@gmail.com>
2023-07-06 12:41:55 +02:00
Nikola Jokic
fde1893494 Add status check before deserializing runner-registration response (#2699) 2023-07-05 21:09:07 +02:00
Nikola Jokic
6fe8008640 Add configurable log format to values.yaml and propagate it to listener (#2686) 2023-07-05 21:06:42 +02:00
Yusuke Kuoka
2fee26ddce chore: Set build version on make-runscaleset (#2713) 2023-07-03 11:52:04 +02:00
marcin-motyl
685f7162a4 Fix serviceMonitor labels in actionsMetrics (#2682) 2023-07-01 13:59:44 +09:00
Lars Lange
d134dee14b fix: template test of service account (#2705) 2023-06-28 10:24:49 +02:00
dependabot[bot]
c33ce998f4 chore(deps): bump github.com/onsi/ginkgo/v2 from 2.9.1 to 2.11.0 (#2689)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Nikola Jokic <jokicnikola07@gmail.com>
2023-06-27 13:14:31 +02:00
kahirokunn
78a93566af chore: remove 16 characters from -service-account (#2567)
Signed-off-by: kahirokunn <okinakahiro@gmail.com>
Co-authored-by: Nikola Jokic <jokicnikola07@gmail.com>
2023-06-27 12:32:47 +02:00
Rose Soriano
81dea9b3dc Fix more broken links in docs (#2473)
Co-authored-by: Bassem Dghaidi <568794+Link-@users.noreply.github.com>
2023-06-23 08:54:13 -04:00
Nikola Jokic
7ca3df3605 fix chart test (#2694) 2023-06-21 08:43:03 -04:00
kahirokunn
2343cd2d7b chore(gha-runner-scale-set): update indentation of initContainers (#2638) 2023-06-21 13:50:02 +02:00
Timm Drevensek
cf18cb3fb0 Adapt role name to prevent namespace collision (#2617) 2023-06-20 17:35:53 +02:00
Bassem Dghaidi
ae8b27a9a3 Apply the label "runners update" on runner update PRs (#2680) 2023-06-16 09:11:58 -04:00
dependabot[bot]
58ee5e8c4e chore(deps): bump github.com/onsi/ginkgo/v2 from 2.9.0 to 2.9.1 (#2401)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Nikola Jokic <jokicnikola07@gmail.com>
2023-06-15 14:56:53 +02:00
dependabot[bot]
fade63a663 chore(deps): bump go.uber.org/multierr from 1.7.0 to 1.10.0 (#2400)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Nikola Jokic <jokicnikola07@gmail.com>
2023-06-15 14:05:35 +02:00
Nikola Jokic
ac4056f85b Upgrade golang.org/x/net to 0.11 (#2676) 2023-06-15 13:38:55 +02:00
github-actions[bot]
462d044604 Updates: runner to v2.305.0 (#2674)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-06-15 06:07:09 -04:00
Nikola Jokic
94934819c4 Trim repo/org/enterprise to 63 characters in label values (#2657) 2023-06-09 20:57:20 +02:00
Nuru
aac811f210 Update unconsumed HRA capacity reservation's expiration more frequently and consistently (#2502)
Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2023-05-30 09:04:57 +09:00
Thang Le
e7ec736738 Use head_branch metric (#2549)
Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2023-05-28 16:36:55 +09:00
Daniel Hobley
90ea691e72 feat: allow for modifying var-run mount maximum size limit (#2624) 2023-05-27 11:47:23 +09:00
robert lestak
32a653c0ca enable passing docker-gid in helm chart (#2574) 2023-05-27 11:33:46 +09:00
Vincent Rivellino
c7b2dd1764 fix: labels on github webhook service template (#2582) 2023-05-27 11:33:20 +09:00
Changliang Wu
80af7fc125 feat: support configure docker insecure registry with env (#2606) 2023-05-27 11:32:46 +09:00
Armin Becher
34909f0cf1 Fix typo in HorizontalRunnerAutoscaler (#2563)
Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2023-05-27 11:22:44 +09:00
Bassem Dghaidi
8afef51c8b Add DrainJobsMode (aka UpdateStrategy feature) (#2569) 2023-05-23 07:42:30 -04:00
Bassem Dghaidi
032443fcfd Fix workflows concurrency group names (#2611) 2023-05-22 07:16:38 -04:00
Nikola Jokic
91c8991835 Scale Set Metrics ADR (#2568)
Co-authored-by: Bassem Dghaidi <568794+Link-@users.noreply.github.com>
2023-05-18 15:37:41 +02:00
Nikola Jokic
c5ebe750dc Discard logs on helm chart tests (#2607) 2023-05-18 14:15:05 +02:00
Bassem Dghaidi
34fdbf1231 Add concurrency limits on all workflows to eliminate wasted cycles (#2603) 2023-05-18 04:55:03 -04:00
Bassem Dghaidi
44e9b7d8eb Add new architecture diagram (#2598) 2023-05-17 08:36:16 -04:00
Bassem Dghaidi
7ab516fdab Update CONTRIBUTING.md with new contribution guidelines and release process documentation (#2596)
Co-authored-by: John Sudol <24583161+johnsudol@users.noreply.github.com>
2023-05-17 07:42:35 -04:00
github-actions[bot]
e571df52b5 Updates: container-hooks to v0.3.2 (#2597)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-05-17 05:57:23 -04:00
Bassem Dghaidi
706ec17bf4 Fix broken chart validation workflows (#2589) 2023-05-15 10:12:03 -04:00
Bassem Dghaidi
30355f742b Apply naming convention to workflows (#2581)
Co-authored-by: John Sudol <24583161+johnsudol@users.noreply.github.com>
2023-05-15 08:31:18 -04:00
Yusuke Kuoka
8a5fb6ccb7 Bump chart version to v0.23.3 for ARC v0.27.4 (#2577) 2023-05-12 09:10:59 -04:00
github-actions[bot]
e930ba6e98 Updates: container-hooks to v0.3.1 (#2580)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-05-12 05:55:09 -04:00
Bassem Dghaidi
5ba3805a3f Fix update runners scheduled workflow to check for container-hooks upgrades (#2576) 2023-05-12 05:52:24 -04:00
kahirokunn
f798cddca1 docs: use INSTALLATION_NAME (#2552)
Signed-off-by: kahirokunn <okinakahiro@gmail.com>
2023-05-10 10:39:54 -04:00
Y. Luis
367ee46122 Fixed scaling runners doc link (#2474)
Co-authored-by: Bassem Dghaidi <568794+Link-@users.noreply.github.com>
2023-05-09 14:45:18 -04:00
Seonghyeon Cho
f4a318fca6 docs: Update github docs links under /managing-self-hosted-runners (#2554)
Co-authored-by: Bassem Dghaidi <568794+Link-@users.noreply.github.com>
2023-05-09 14:43:15 -04:00
Bassem Dghaidi
4ee21cb24b Add link to walkthrough video on youtube (#2570) 2023-05-08 15:24:32 -04:00
Yusuke Kuoka
102c9e1afa Update "People" section in README (#2537)
Co-authored-by: Bassem Dghaidi <568794+Link-@users.noreply.github.com>
2023-05-04 08:04:42 -04:00
Nikola Jokic
73e676f951 Check release tag version and chart versions during the release process (#2524)
Co-authored-by: Bassem Dghaidi <568794+Link-@users.noreply.github.com>
2023-05-03 11:53:42 +02:00
github-actions[bot]
41ebb43c65 Update runner to version 2.304.0 (#2543)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-04-28 10:05:46 -04:00
mspasoje
aa50b62c01 Fix for GHES when authorized through GitHub App with GITHUB_URL instead of GITHUB_ENTERPRISE_URL (#2464)
Ref #2457
2023-04-27 13:53:22 +09:00
Alex Williams
942f773fef Update helm chart to support actions metrics graceful termiantion (#2498)
# Summary

- add lifecycle, terminationGracePeriodSeconds, and loadBalancerSource ranges to metrics server
- these were missed when copying from the other webhook server
- original PR adding them to the other webhook server is here https://github.com/actions/actions-runner-controller/pull/2305

Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2023-04-27 13:50:31 +09:00
Thomas B
21722a5de8 Add CR and CRB to the helm chart (#2504)
In response to https://github.com/actions/actions-runner-controller/issues/2212 , the ARC helm chart is missing ClusterRoleBinding and ClusterRole for the ActionsMetricsServer resulting on missing permissions.

This also fix the labels of the ActionsMetricsServer Service as it is selected by the ServiceMonitor with those labels.

Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2023-04-27 13:33:48 +09:00
argokasper
a2d4b95b79 Fix GET validation for lowercase http methods (#2497)
Some requests send method in lowercase (verified with curl and as a default for AWS ALB health check requests), but Go HTTP library constant MethodGet is in upper.
2023-04-27 13:22:41 +09:00
Thilo Uttendorfer
04fb9f4fa1 Fix the default version of kube-rbac-proxy in the docs (#2535) 2023-04-27 13:16:12 +09:00
Paul Brousseau
8304b80955 docs: minor correction for actions metrics server secret (#2542)
Aligning docs with what the Helm chart produces
2023-04-27 13:15:49 +09:00
Nuru
9bd4025e9c Stricter filtering of check run completion events (#2520)
I observed that 100% of canceled jobs in my runner pool were not causing scale down events. This PR fixes that.

The problem was caused by #2119. 

#2119 ignores certain webhook events in order to fix #2118. However, #2119 overdoes it and filters out valid job cancellation events. This PR uses stricter filtering and add visibility for future troubleshooting.

<details><summary>Example cancellation event</summary>

This is the redacted top portion of a valid cancellation event my runner pool received and ignored.

```json
{
  "action": "completed",
  "workflow_job": {
    "id": 12848997134,
    "run_id": 4738060033,
    "workflow_name": "slack-notifier",
    "head_branch": "auto-update/slack-notifier-0.5.1",
    "run_url": "https://api.github.com/repos/nuru/<redacted>/actions/runs/4738060033",
    "run_attempt": 1,
    "node_id": "CR_kwDOB8Xtbc8AAAAC_dwjDg",
    "head_sha": "55bada8f3d0d3e12a510a1bf34d0c3e169b65f89",
    "url": "https://api.github.com/repos/nuru/<redacted>/actions/jobs/12848997134",
    "html_url": "https://github.com/nuru/<redacted>/actions/runs/4738060033/jobs/8411515430",
    "status": "completed",
    "conclusion": "cancelled",
    "created_at": "2023-04-19T00:03:12Z",
    "started_at": "2023-04-19T00:03:42Z",
    "completed_at": "2023-04-19T00:03:42Z",
    "name": "build (arm64)",
    "steps": [

    ],
    "check_run_url": "https://api.github.com/repos/nuru/<redacted>/check-runs/12848997134",
    "labels": [
      "self-hosted",
      "arm64"
    ],
    "runner_id": 0,
    "runner_name": "",
    "runner_group_id": 0,
    "runner_group_name": ""
  },
```

</details>
2023-04-27 13:15:23 +09:00
Yusuke Kuoka
94c089c407 Revert docker.sock path to /var/run/docker.sock (#2536)
Starting ARC v0.27.2, we've changed the `docker.sock` path from `/var/run/docker.sock` to `/var/run/docker/docker.sock`. That resulted in breaking some container-based actions due to the hard-coded `docker.sock` path in various places.

Even `actions/runner` seem to use `/var/run/docker.sock` for building container-based actions and for service containers?

Anyway, this fixes that by moving the sock file back to the previous location.

Once this gets merged, users stuck at ARC v0.27.1, previously upgraded to 0.27.2 or 0.27.3 and reverted back to v0.27.1 due to #2519, should be able to upgrade to the upcoming v0.27.4.

Resolves #2519
Resolves #2538
2023-04-27 13:06:35 +09:00
Nikola Jokic
9859bbc7f2 Use build.Version to check if resource version is a mismatch (#2521)
Co-authored-by: Bassem Dghaidi <568794+Link-@users.noreply.github.com>
2023-04-24 10:40:15 +02:00
Thomas
c1e2c4ef9d docs: Fix typo for automatic runner scaling (#2375) 2023-04-21 11:15:53 +09:00
Edgar Kalinovski
2ee15dbca3 Add description for "dockerRegistryMirror" key (#2488) 2023-04-21 11:10:55 +09:00
Sam Greening
a4cf626410 Revert actions-runner-controller image tag in kustomization to latest (#2522) 2023-04-21 10:59:34 +09:00
cavila-evoliq
58f4b6ff2d Update ubuntu-22.04 Dockerfile to add python user script dir (#2508) 2023-04-18 08:26:14 +09:00
Bassem Dghaidi
22fbd10bd3 Fix the path of the index.yaml in job summary (#2515) 2023-04-17 14:09:56 -04:00
146 changed files with 4174 additions and 1504 deletions

1
.gitattributes vendored Normal file
View File

@@ -0,0 +1 @@
*.png filter=lfs diff=lfs merge=lfs -text

View File

@@ -23,6 +23,14 @@ inputs:
arc-controller-namespace:
description: 'The namespace of the configured gha-runner-scale-set-controller'
required: true
wait-to-finish:
description: 'Wait for the workflow run to finish'
required: true
default: "true"
wait-to-running:
description: 'Wait for the workflow run to start running'
required: true
default: "false"
runs:
using: "composite"
@@ -111,14 +119,43 @@ runs:
- name: Generate summary about the triggered workflow run
shell: bash
run: |
run: |
cat <<-EOF > $GITHUB_STEP_SUMMARY
| **Triggered workflow run** |
|:--------------------------:|
| ${{steps.query_workflow.outputs.workflow_run_url}} |
EOF
- name: Wait for workflow to start running
if: inputs.wait-to-running == 'true' && inputs.wait-to-finish == 'false'
uses: actions/github-script@v6
with:
script: |
function sleep(ms) {
return new Promise(resolve => setTimeout(resolve, ms))
}
const owner = '${{inputs.repo-owner}}'
const repo = '${{inputs.repo-name}}'
const workflow_run_id = ${{steps.query_workflow.outputs.workflow_run}}
const workflow_job_id = ${{steps.query_workflow.outputs.workflow_job}}
let count = 0
while (count++<10) {
await sleep(30 * 1000);
let getRunResponse = await github.rest.actions.getWorkflowRun({
owner: owner,
repo: repo,
run_id: workflow_run_id
})
console.log(`${getRunResponse.data.html_url}: ${getRunResponse.data.status} (${getRunResponse.data.conclusion})`);
if (getRunResponse.data.status == 'in_progress') {
console.log(`Workflow run is in progress.`)
return
}
}
core.setFailed(`The triggered workflow run didn't start properly using ${{inputs.arc-name}}`)
- name: Wait for workflow to finish successfully
if: inputs.wait-to-finish == 'true'
uses: actions/github-script@v6
with:
script: |
@@ -151,10 +188,15 @@ runs:
}
core.setFailed(`The triggered workflow run didn't finish properly using ${{inputs.arc-name}}`)
- name: cleanup
if: inputs.wait-to-finish == 'true'
shell: bash
run: |
helm uninstall ${{ inputs.arc-name }} --namespace ${{inputs.arc-namespace}} --debug
kubectl wait --timeout=10s --for=delete AutoScalingRunnerSet -n ${{inputs.arc-name}} -l app.kubernetes.io/instance=${{ inputs.arc-name }}
- name: Gather logs and cleanup
shell: bash
if: always()
run: |
helm uninstall ${{ inputs.arc-name }} --namespace ${{inputs.arc-namespace}} --debug
kubectl wait --timeout=10s --for=delete AutoScalingRunnerSet -n ${{inputs.arc-name}} -l app.kubernetes.io/instance=${{ inputs.arc-name }}
kubectl logs deployment/arc-gha-runner-scale-set-controller -n ${{inputs.arc-controller-namespace}}
kubectl logs deployment/arc-gha-rs-controller -n ${{inputs.arc-controller-namespace}}

View File

@@ -1,4 +1,4 @@
name: Publish Helm Chart
name: Publish ARC Helm Charts
# Revert to https://github.com/actions-runner-controller/releases#releases
# for details on why we use this approach
@@ -8,7 +8,7 @@ on:
- master
paths:
- 'charts/**'
- '.github/workflows/publish-chart.yaml'
- '.github/workflows/arc-publish-chart.yaml'
- '!charts/actions-runner-controller/docs/**'
- '!charts/gha-runner-scale-set-controller/**'
- '!charts/gha-runner-scale-set/**'
@@ -28,6 +28,10 @@ env:
permissions:
contents: write
concurrency:
group: ${{ github.workflow }}
cancel-in-progress: true
jobs:
lint-chart:
name: Lint Chart
@@ -171,6 +175,7 @@ jobs:
--owner "$(echo ${{ github.repository }} | cut -d '/' -f 1)" \
--git-repo "$(echo ${{ github.repository }} | cut -d '/' -f 2)" \
--index-path ${{ github.workspace }}/index.yaml \
--token ${{ secrets.GITHUB_TOKEN }} \
--push \
--pages-branch 'gh-pages' \
--pages-index-path 'index.yaml'
@@ -204,4 +209,4 @@ jobs:
echo "New helm chart has been published" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
echo "**Status:**" >> $GITHUB_STEP_SUMMARY
echo "- New [index.yaml](https://github.com/${{ env.CHART_TARGET_ORG }}/${{ env.CHART_TARGET_REPO }}/tree/main/actions-runner-controller) pushed" >> $GITHUB_STEP_SUMMARY
echo "- New [index.yaml](https://github.com/${{ env.CHART_TARGET_ORG }}/${{ env.CHART_TARGET_REPO }}/tree/master/actions-runner-controller) pushed" >> $GITHUB_STEP_SUMMARY

View File

@@ -1,4 +1,4 @@
name: Publish ARC
name: Publish ARC Image
# Revert to https://github.com/actions-runner-controller/releases#releases
# for details on why we use this approach
@@ -25,6 +25,10 @@ env:
TARGET_ORG: actions-runner-controller
TARGET_REPO: actions-runner-controller
concurrency:
group: ${{ github.workflow }}
cancel-in-progress: true
jobs:
release-controller:
name: Release

View File

@@ -1,4 +1,4 @@
name: Runners
name: Release ARC Runner Images
# Revert to https://github.com/actions-runner-controller/releases#releases
# for details on why we use this approach
@@ -10,7 +10,7 @@ on:
- 'master'
paths:
- 'runner/VERSION'
- '.github/workflows/release-runners.yaml'
- '.github/workflows/arc-release-runners.yaml'
env:
# Safeguard to prevent pushing images to registeries after build
@@ -18,7 +18,10 @@ env:
TARGET_ORG: actions-runner-controller
TARGET_WORKFLOW: release-runners.yaml
DOCKER_VERSION: 20.10.23
RUNNER_CONTAINER_HOOKS_VERSION: 0.2.0
concurrency:
group: ${{ github.workflow }}
cancel-in-progress: true
jobs:
build-runners:
@@ -27,10 +30,12 @@ jobs:
steps:
- uses: actions/checkout@v3
- name: Get runner version
id: runner_version
id: versions
run: |
version=$(echo -n $(cat runner/VERSION))
echo runner_version=$version >> $GITHUB_OUTPUT
runner_current_version="$(echo -n $(cat runner/VERSION | grep 'RUNNER_VERSION=' | cut -d '=' -f2))"
container_hooks_current_version="$(echo -n $(cat runner/VERSION | grep 'RUNNER_CONTAINER_HOOKS_VERSION=' | cut -d '=' -f2))"
echo runner_version=$runner_current_version >> $GITHUB_OUTPUT
echo container_hooks_version=$container_hooks_current_version >> $GITHUB_OUTPUT
- name: Get Token
id: get_workflow_token
@@ -42,7 +47,8 @@ jobs:
- name: Trigger Build And Push Runner Images To Registries
env:
RUNNER_VERSION: ${{ steps.runner_version.outputs.runner_version }}
RUNNER_VERSION: ${{ steps.versions.outputs.runner_version }}
CONTAINER_HOOKS_VERSION: ${{ steps.versions.outputs.container_hooks_version }}
run: |
# Authenticate
gh auth login --with-token <<< ${{ steps.get_workflow_token.outputs.token }}
@@ -51,20 +57,21 @@ jobs:
gh workflow run ${{ env.TARGET_WORKFLOW }} -R ${{ env.TARGET_ORG }}/releases \
-f runner_version=${{ env.RUNNER_VERSION }} \
-f docker_version=${{ env.DOCKER_VERSION }} \
-f runner_container_hooks_version=${{ env.RUNNER_CONTAINER_HOOKS_VERSION }} \
-f runner_container_hooks_version=${{ env.CONTAINER_HOOKS_VERSION }} \
-f sha='${{ github.sha }}' \
-f push_to_registries=${{ env.PUSH_TO_REGISTRIES }}
- name: Job summary
env:
RUNNER_VERSION: ${{ steps.runner_version.outputs.runner_version }}
RUNNER_VERSION: ${{ steps.versions.outputs.runner_version }}
CONTAINER_HOOKS_VERSION: ${{ steps.versions.outputs.container_hooks_version }}
run: |
echo "The [release-runners.yaml](https://github.com/actions-runner-controller/releases/blob/main/.github/workflows/release-runners.yaml) workflow has been triggered!" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
echo "**Parameters:**" >> $GITHUB_STEP_SUMMARY
echo "- runner_version: ${{ env.RUNNER_VERSION }}" >> $GITHUB_STEP_SUMMARY
echo "- docker_version: ${{ env.DOCKER_VERSION }}" >> $GITHUB_STEP_SUMMARY
echo "- runner_container_hooks_version: ${{ env.RUNNER_CONTAINER_HOOKS_VERSION }}" >> $GITHUB_STEP_SUMMARY
echo "- runner_container_hooks_version: ${{ env.CONTAINER_HOOKS_VERSION }}" >> $GITHUB_STEP_SUMMARY
echo "- sha: ${{ github.sha }}" >> $GITHUB_STEP_SUMMARY
echo "- push_to_registries: ${{ env.PUSH_TO_REGISTRIES }}" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY

View File

@@ -0,0 +1,149 @@
# This workflows polls releases from actions/runner and in case of a new one it
# updates files containing runner version and opens a pull request.
name: Runner Updates Check (Scheduled Job)
on:
schedule:
# run daily
- cron: "0 9 * * *"
workflow_dispatch:
jobs:
# check_versions compares our current version and the latest available runner
# version and sets them as outputs.
check_versions:
runs-on: ubuntu-latest
env:
GH_TOKEN: ${{ github.token }}
outputs:
runner_current_version: ${{ steps.runner_versions.outputs.runner_current_version }}
runner_latest_version: ${{ steps.runner_versions.outputs.runner_latest_version }}
container_hooks_current_version: ${{ steps.container_hooks_versions.outputs.container_hooks_current_version }}
container_hooks_latest_version: ${{ steps.container_hooks_versions.outputs.container_hooks_latest_version }}
steps:
- uses: actions/checkout@v3
- name: Get runner current and latest versions
id: runner_versions
run: |
CURRENT_VERSION="$(echo -n $(cat runner/VERSION | grep 'RUNNER_VERSION=' | cut -d '=' -f2))"
echo "Current version: $CURRENT_VERSION"
echo runner_current_version=$CURRENT_VERSION >> $GITHUB_OUTPUT
LATEST_VERSION=$(gh release list --exclude-drafts --exclude-pre-releases --limit 1 -R actions/runner | grep -oP '(?<=v)[0-9.]+' | head -1)
echo "Latest version: $LATEST_VERSION"
echo runner_latest_version=$LATEST_VERSION >> $GITHUB_OUTPUT
- name: Get container-hooks current and latest versions
id: container_hooks_versions
run: |
CURRENT_VERSION="$(echo -n $(cat runner/VERSION | grep 'RUNNER_CONTAINER_HOOKS_VERSION=' | cut -d '=' -f2))"
echo "Current version: $CURRENT_VERSION"
echo container_hooks_current_version=$CURRENT_VERSION >> $GITHUB_OUTPUT
LATEST_VERSION=$(gh release list --exclude-drafts --exclude-pre-releases --limit 1 -R actions/runner-container-hooks | grep -oP '(?<=v)[0-9.]+' | head -1)
echo "Latest version: $LATEST_VERSION"
echo container_hooks_latest_version=$LATEST_VERSION >> $GITHUB_OUTPUT
# check_pr checks if a PR for the same update already exists. It only runs if
# runner latest version != our current version. If no existing PR is found,
# it sets a PR name as output.
check_pr:
runs-on: ubuntu-latest
needs: check_versions
if: needs.check_versions.outputs.runner_current_version != needs.check_versions.outputs.runner_latest_version || needs.check_versions.outputs.container_hooks_current_version != needs.check_versions.outputs.container_hooks_latest_version
outputs:
pr_name: ${{ steps.pr_name.outputs.pr_name }}
env:
GH_TOKEN: ${{ github.token }}
steps:
- name: debug
run:
echo "RUNNER_CURRENT_VERSION=${{ needs.check_versions.outputs.runner_current_version }}"
echo "RUNNER_LATEST_VERSION=${{ needs.check_versions.outputs.runner_latest_version }}"
echo "CONTAINER_HOOKS_CURRENT_VERSION=${{ needs.check_versions.outputs.container_hooks_current_version }}"
echo "CONTAINER_HOOKS_LATEST_VERSION=${{ needs.check_versions.outputs.container_hooks_latest_version }}"
- uses: actions/checkout@v3
- name: PR Name
id: pr_name
env:
RUNNER_CURRENT_VERSION: ${{ needs.check_versions.outputs.runner_current_version }}
RUNNER_LATEST_VERSION: ${{ needs.check_versions.outputs.runner_latest_version }}
CONTAINER_HOOKS_CURRENT_VERSION: ${{ needs.check_versions.outputs.container_hooks_current_version }}
CONTAINER_HOOKS_LATEST_VERSION: ${{ needs.check_versions.outputs.container_hooks_latest_version }}
# Generate a PR name with the following title:
# Updates: runner to v2.304.0 and container-hooks to v0.3.1
run: |
RUNNER_MESSAGE="runner to v${RUNNER_LATEST_VERSION}"
CONTAINER_HOOKS_MESSAGE="container-hooks to v${CONTAINER_HOOKS_LATEST_VERSION}"
PR_NAME="Updates:"
if [ "$RUNNER_CURRENT_VERSION" != "$RUNNER_LATEST_VERSION" ]
then
PR_NAME="$PR_NAME $RUNNER_MESSAGE"
fi
if [ "$CONTAINER_HOOKS_CURRENT_VERSION" != "$CONTAINER_HOOKS_LATEST_VERSION" ]
then
PR_NAME="$PR_NAME $CONTAINER_HOOKS_MESSAGE"
fi
result=$(gh pr list --search "$PR_NAME" --json number --jq ".[].number" --limit 1)
if [ -z "$result" ]
then
echo "No existing PRs found, setting output with pr_name=$PR_NAME"
echo pr_name=$PR_NAME >> $GITHUB_OUTPUT
else
echo "Found a PR with title '$PR_NAME' already existing: ${{ github.server_url }}/${{ github.repository }}/pull/$result"
fi
# update_version updates runner version in the files listed below, commits
# the changes and opens a pull request as `github-actions` bot.
update_version:
runs-on: ubuntu-latest
needs:
- check_versions
- check_pr
if: needs.check_pr.outputs.pr_name
permissions:
pull-requests: write
contents: write
actions: write
env:
GH_TOKEN: ${{ github.token }}
RUNNER_CURRENT_VERSION: ${{ needs.check_versions.outputs.runner_current_version }}
RUNNER_LATEST_VERSION: ${{ needs.check_versions.outputs.runner_latest_version }}
CONTAINER_HOOKS_CURRENT_VERSION: ${{ needs.check_versions.outputs.container_hooks_current_version }}
CONTAINER_HOOKS_LATEST_VERSION: ${{ needs.check_versions.outputs.container_hooks_latest_version }}
PR_NAME: ${{ needs.check_pr.outputs.pr_name }}
steps:
- uses: actions/checkout@v3
- name: New branch
run: git checkout -b update-runner-"$(date +%Y-%m-%d)"
- name: Update files
run: |
sed -i "s/$RUNNER_CURRENT_VERSION/$RUNNER_LATEST_VERSION/g" runner/VERSION
sed -i "s/$RUNNER_CURRENT_VERSION/$RUNNER_LATEST_VERSION/g" runner/Makefile
sed -i "s/$RUNNER_CURRENT_VERSION/$RUNNER_LATEST_VERSION/g" Makefile
sed -i "s/$RUNNER_CURRENT_VERSION/$RUNNER_LATEST_VERSION/g" test/e2e/e2e_test.go
sed -i "s/$CONTAINER_HOOKS_CURRENT_VERSION/$CONTAINER_HOOKS_LATEST_VERSION/g" runner/VERSION
sed -i "s/$CONTAINER_HOOKS_CURRENT_VERSION/$CONTAINER_HOOKS_LATEST_VERSION/g" runner/Makefile
sed -i "s/$CONTAINER_HOOKS_CURRENT_VERSION/$CONTAINER_HOOKS_LATEST_VERSION/g" Makefile
sed -i "s/$CONTAINER_HOOKS_CURRENT_VERSION/$CONTAINER_HOOKS_LATEST_VERSION/g" test/e2e/e2e_test.go
- name: Commit changes
run: |
# from https://github.com/orgs/community/discussions/26560
git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
git config user.name "github-actions[bot]"
git add .
git commit -m "$PR_NAME"
git push -u origin HEAD
- name: Create pull request
run: gh pr create -f -l "runners update"

View File

@@ -6,7 +6,7 @@ on:
- master
paths:
- 'charts/**'
- '.github/workflows/validate-chart.yaml'
- '.github/workflows/arc-validate-chart.yaml'
- '!charts/actions-runner-controller/docs/**'
- '!**.md'
- '!charts/gha-runner-scale-set-controller/**'
@@ -14,7 +14,7 @@ on:
push:
paths:
- 'charts/**'
- '.github/workflows/validate-chart.yaml'
- '.github/workflows/arc-validate-chart.yaml'
- '!charts/actions-runner-controller/docs/**'
- '!**.md'
- '!charts/gha-runner-scale-set-controller/**'
@@ -27,6 +27,13 @@ env:
permissions:
contents: read
concurrency:
# This will make sure we only apply the concurrency limits on pull requests
# but not pushes to master branch by making the concurrency group name unique
# for pushes
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
jobs:
validate-chart:
name: Lint Chart
@@ -65,7 +72,7 @@ jobs:
python-version: '3.7'
- name: Set up chart-testing
uses: helm/chart-testing-action@v2.3.1
uses: helm/chart-testing-action@v2.4.0
- name: Run chart-testing (list-changed)
id: list-changed

View File

@@ -1,4 +1,4 @@
name: Validate Runners
name: Validate ARC Runners
on:
pull_request:
@@ -12,6 +12,13 @@ on:
permissions:
contents: read
concurrency:
# This will make sure we only apply the concurrency limits on pull requests
# but not pushes to master branch by making the concurrency group name unique
# for pushes
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
jobs:
shellcheck:
name: runner / shellcheck

View File

@@ -1,4 +1,4 @@
name: CI ARC E2E Linux VM Test
name: (gha) E2E Tests
on:
push:
@@ -16,7 +16,14 @@ env:
TARGET_ORG: actions-runner-controller
TARGET_REPO: arc_e2e_test_dummy
IMAGE_NAME: "arc-test-image"
IMAGE_VERSION: "dev"
IMAGE_VERSION: "0.4.0"
concurrency:
# This will make sure we only apply the concurrency limits on pull requests
# but not pushes to master branch by making the concurrency group name unique
# for pushes
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
jobs:
default-setup:
@@ -51,21 +58,21 @@ jobs:
--debug
count=0
while true; do
POD_NAME=$(kubectl get pods -n arc-systems -l app.kubernetes.io/name=gha-runner-scale-set-controller -o name)
POD_NAME=$(kubectl get pods -n arc-systems -l app.kubernetes.io/name=gha-rs-controller -o name)
if [ -n "$POD_NAME" ]; then
echo "Pod found: $POD_NAME"
break
fi
if [ "$count" -ge 60 ]; then
echo "Timeout waiting for controller pod with label app.kubernetes.io/name=gha-runner-scale-set-controller"
echo "Timeout waiting for controller pod with label app.kubernetes.io/name=gha-rs-controller"
exit 1
fi
sleep 1
count=$((count+1))
done
kubectl wait --timeout=30s --for=condition=ready pod -n arc-systems -l app.kubernetes.io/name=gha-runner-scale-set-controller
kubectl wait --timeout=30s --for=condition=ready pod -n arc-systems -l app.kubernetes.io/name=gha-rs-controller
kubectl get pod -n arc-systems
kubectl describe deployment arc-gha-runner-scale-set-controller -n arc-systems
kubectl describe deployment arc-gha-rs-controller -n arc-systems
- name: Install gha-runner-scale-set
id: install_arc
@@ -142,21 +149,21 @@ jobs:
--debug
count=0
while true; do
POD_NAME=$(kubectl get pods -n arc-systems -l app.kubernetes.io/name=gha-runner-scale-set-controller -o name)
POD_NAME=$(kubectl get pods -n arc-systems -l app.kubernetes.io/name=gha-rs-controller -o name)
if [ -n "$POD_NAME" ]; then
echo "Pod found: $POD_NAME"
break
fi
if [ "$count" -ge 60 ]; then
echo "Timeout waiting for controller pod with label app.kubernetes.io/name=gha-runner-scale-set-controller"
echo "Timeout waiting for controller pod with label app.kubernetes.io/name=gha-rs-controller"
exit 1
fi
sleep 1
count=$((count+1))
done
kubectl wait --timeout=30s --for=condition=ready pod -n arc-systems -l app.kubernetes.io/name=gha-runner-scale-set-controller
kubectl wait --timeout=30s --for=condition=ready pod -n arc-systems -l app.kubernetes.io/name=gha-rs-controller
kubectl get pod -n arc-systems
kubectl describe deployment arc-gha-runner-scale-set-controller -n arc-systems
kubectl describe deployment arc-gha-rs-controller -n arc-systems
- name: Install gha-runner-scale-set
id: install_arc
@@ -231,21 +238,21 @@ jobs:
--debug
count=0
while true; do
POD_NAME=$(kubectl get pods -n arc-systems -l app.kubernetes.io/name=gha-runner-scale-set-controller -o name)
POD_NAME=$(kubectl get pods -n arc-systems -l app.kubernetes.io/name=gha-rs-controller -o name)
if [ -n "$POD_NAME" ]; then
echo "Pod found: $POD_NAME"
break
fi
if [ "$count" -ge 60 ]; then
echo "Timeout waiting for controller pod with label app.kubernetes.io/name=gha-runner-scale-set-controller"
echo "Timeout waiting for controller pod with label app.kubernetes.io/name=gha-rs-controller"
exit 1
fi
sleep 1
count=$((count+1))
done
kubectl wait --timeout=30s --for=condition=ready pod -n arc-systems -l app.kubernetes.io/name=gha-runner-scale-set-controller
kubectl wait --timeout=30s --for=condition=ready pod -n arc-systems -l app.kubernetes.io/name=gha-rs-controller
kubectl get pod -n arc-systems
kubectl describe deployment arc-gha-runner-scale-set-controller -n arc-systems
kubectl describe deployment arc-gha-rs-controller -n arc-systems
- name: Install gha-runner-scale-set
id: install_arc
@@ -326,21 +333,21 @@ jobs:
--debug
count=0
while true; do
POD_NAME=$(kubectl get pods -n arc-systems -l app.kubernetes.io/name=gha-runner-scale-set-controller -o name)
POD_NAME=$(kubectl get pods -n arc-systems -l app.kubernetes.io/name=gha-rs-controller -o name)
if [ -n "$POD_NAME" ]; then
echo "Pod found: $POD_NAME"
break
fi
if [ "$count" -ge 60 ]; then
echo "Timeout waiting for controller pod with label app.kubernetes.io/name=gha-runner-scale-set-controller"
echo "Timeout waiting for controller pod with label app.kubernetes.io/name=gha-rs-controller"
exit 1
fi
sleep 1
count=$((count+1))
done
kubectl wait --timeout=30s --for=condition=ready pod -n arc-systems -l app.kubernetes.io/name=gha-runner-scale-set-controller
kubectl wait --timeout=30s --for=condition=ready pod -n arc-systems -l app.kubernetes.io/name=gha-rs-controller
kubectl get pod -n arc-systems
kubectl describe deployment arc-gha-runner-scale-set-controller -n arc-systems
kubectl describe deployment arc-gha-rs-controller -n arc-systems
kubectl wait --timeout=30s --for=condition=ready pod -n openebs -l name=openebs-localpv-provisioner
- name: Install gha-runner-scale-set
@@ -420,21 +427,21 @@ jobs:
--debug
count=0
while true; do
POD_NAME=$(kubectl get pods -n arc-systems -l app.kubernetes.io/name=gha-runner-scale-set-controller -o name)
POD_NAME=$(kubectl get pods -n arc-systems -l app.kubernetes.io/name=gha-rs-controller -o name)
if [ -n "$POD_NAME" ]; then
echo "Pod found: $POD_NAME"
break
fi
if [ "$count" -ge 60 ]; then
echo "Timeout waiting for controller pod with label app.kubernetes.io/name=gha-runner-scale-set-controller"
echo "Timeout waiting for controller pod with label app.kubernetes.io/name=gha-rs-controller"
exit 1
fi
sleep 1
count=$((count+1))
done
kubectl wait --timeout=30s --for=condition=ready pod -n arc-systems -l app.kubernetes.io/name=gha-runner-scale-set-controller
kubectl wait --timeout=30s --for=condition=ready pod -n arc-systems -l app.kubernetes.io/name=gha-rs-controller
kubectl get pod -n arc-systems
kubectl describe deployment arc-gha-runner-scale-set-controller -n arc-systems
kubectl describe deployment arc-gha-rs-controller -n arc-systems
- name: Install gha-runner-scale-set
id: install_arc
@@ -521,21 +528,21 @@ jobs:
--debug
count=0
while true; do
POD_NAME=$(kubectl get pods -n arc-systems -l app.kubernetes.io/name=gha-runner-scale-set-controller -o name)
POD_NAME=$(kubectl get pods -n arc-systems -l app.kubernetes.io/name=gha-rs-controller -o name)
if [ -n "$POD_NAME" ]; then
echo "Pod found: $POD_NAME"
break
fi
if [ "$count" -ge 60 ]; then
echo "Timeout waiting for controller pod with label app.kubernetes.io/name=gha-runner-scale-set-controller"
echo "Timeout waiting for controller pod with label app.kubernetes.io/name=gha-rs-controller"
exit 1
fi
sleep 1
count=$((count+1))
done
kubectl wait --timeout=30s --for=condition=ready pod -n arc-systems -l app.kubernetes.io/name=gha-runner-scale-set-controller
kubectl wait --timeout=30s --for=condition=ready pod -n arc-systems -l app.kubernetes.io/name=gha-rs-controller
kubectl get pod -n arc-systems
kubectl describe deployment arc-gha-runner-scale-set-controller -n arc-systems
kubectl describe deployment arc-gha-rs-controller -n arc-systems
- name: Install gha-runner-scale-set
id: install_arc
@@ -616,21 +623,21 @@ jobs:
--debug
count=0
while true; do
POD_NAME=$(kubectl get pods -n arc-systems -l app.kubernetes.io/name=gha-runner-scale-set-controller -o name)
POD_NAME=$(kubectl get pods -n arc-systems -l app.kubernetes.io/name=gha-rs-controller -o name)
if [ -n "$POD_NAME" ]; then
echo "Pod found: $POD_NAME"
break
fi
if [ "$count" -ge 60 ]; then
echo "Timeout waiting for controller pod with label app.kubernetes.io/name=gha-runner-scale-set-controller"
echo "Timeout waiting for controller pod with label app.kubernetes.io/name=gha-rs-controller"
exit 1
fi
sleep 1
count=$((count+1))
done
kubectl wait --timeout=30s --for=condition=ready pod -n arc-systems -l app.kubernetes.io/name=gha-runner-scale-set-controller
kubectl wait --timeout=30s --for=condition=ready pod -n arc-systems -l app.kubernetes.io/name=gha-rs-controller
kubectl get pod -n arc-systems
kubectl describe deployment arc-gha-runner-scale-set-controller -n arc-systems
kubectl describe deployment arc-gha-rs-controller -n arc-systems
- name: Install gha-runner-scale-set
id: install_arc
@@ -703,3 +710,173 @@ jobs:
arc-name: ${{steps.install_arc.outputs.ARC_NAME}}
arc-namespace: "arc-runners"
arc-controller-namespace: "arc-systems"
update-strategy-tests:
runs-on: ubuntu-latest
timeout-minutes: 20
if: github.event_name != 'pull_request' || github.event.pull_request.head.repo.id == github.repository_id
env:
WORKFLOW_FILE: "arc-test-sleepy-matrix.yaml"
steps:
- uses: actions/checkout@v3
with:
ref: ${{github.head_ref}}
- uses: ./.github/actions/setup-arc-e2e
id: setup
with:
app-id: ${{secrets.E2E_TESTS_ACCESS_APP_ID}}
app-pk: ${{secrets.E2E_TESTS_ACCESS_PK}}
image-name: ${{env.IMAGE_NAME}}
image-tag: ${{env.IMAGE_VERSION}}
target-org: ${{env.TARGET_ORG}}
- name: Install gha-runner-scale-set-controller
id: install_arc_controller
run: |
helm install arc \
--namespace "arc-systems" \
--create-namespace \
--set image.repository=${{ env.IMAGE_NAME }} \
--set image.tag=${{ env.IMAGE_VERSION }} \
--set flags.updateStrategy="eventual" \
./charts/gha-runner-scale-set-controller \
--debug
count=0
while true; do
POD_NAME=$(kubectl get pods -n arc-systems -l app.kubernetes.io/name=gha-rs-controller -o name)
if [ -n "$POD_NAME" ]; then
echo "Pod found: $POD_NAME"
break
fi
if [ "$count" -ge 60 ]; then
echo "Timeout waiting for controller pod with label app.kubernetes.io/name=gha-rs-controller"
exit 1
fi
sleep 1
count=$((count+1))
done
kubectl wait --timeout=30s --for=condition=ready pod -n arc-systems -l app.kubernetes.io/name=gha-rs-controller
kubectl get pod -n arc-systems
kubectl describe deployment arc-gha-rs-controller -n arc-systems
- name: Install gha-runner-scale-set
id: install_arc
run: |
ARC_NAME=${{github.job}}-$(date +'%M%S')$((($RANDOM + 100) % 100 + 1))
helm install "$ARC_NAME" \
--namespace "arc-runners" \
--create-namespace \
--set githubConfigUrl="https://github.com/${{ env.TARGET_ORG }}/${{env.TARGET_REPO}}" \
--set githubConfigSecret.github_token="${{ steps.setup.outputs.token }}" \
./charts/gha-runner-scale-set \
--debug
echo "ARC_NAME=$ARC_NAME" >> $GITHUB_OUTPUT
count=0
while true; do
POD_NAME=$(kubectl get pods -n arc-systems -l actions.github.com/scale-set-name=$ARC_NAME -o name)
if [ -n "$POD_NAME" ]; then
echo "Pod found: $POD_NAME"
break
fi
if [ "$count" -ge 60 ]; then
echo "Timeout waiting for listener pod with label actions.github.com/scale-set-name=$ARC_NAME"
exit 1
fi
sleep 1
count=$((count+1))
done
kubectl wait --timeout=30s --for=condition=ready pod -n arc-systems -l actions.github.com/scale-set-name=$ARC_NAME
kubectl get pod -n arc-systems
- name: Trigger long running jobs and wait for runners to pick them up
uses: ./.github/actions/execute-assert-arc-e2e
timeout-minutes: 10
with:
auth-token: ${{ steps.setup.outputs.token }}
repo-owner: ${{ env.TARGET_ORG }}
repo-name: ${{env.TARGET_REPO}}
workflow-file: ${{env.WORKFLOW_FILE}}
arc-name: ${{steps.install_arc.outputs.ARC_NAME}}
arc-namespace: "arc-runners"
arc-controller-namespace: "arc-systems"
wait-to-running: "true"
wait-to-finish: "false"
- name: Upgrade the gha-runner-scale-set
shell: bash
run: |
helm upgrade --install "${{ steps.install_arc.outputs.ARC_NAME }}" \
--namespace "arc-runners" \
--create-namespace \
--set githubConfigUrl="https://github.com/${{ env.TARGET_ORG }}/${{ env.TARGET_REPO }}" \
--set githubConfigSecret.github_token="${{ steps.setup.outputs.token }}" \
--set template.spec.containers[0].name="runner" \
--set template.spec.containers[0].image="ghcr.io/actions/actions-runner:latest" \
--set template.spec.containers[0].command={"/home/runner/run.sh"} \
--set template.spec.containers[0].env[0].name="TEST" \
--set template.spec.containers[0].env[0].value="E2E TESTS" \
./charts/gha-runner-scale-set \
--debug
- name: Assert that the listener is deleted while jobs are running
shell: bash
run: |
count=0
while true; do
LISTENER_COUNT="$(kubectl get pods -l actions.github.com/scale-set-name=${{ steps.install_arc.outputs.ARC_NAME }} -n arc-systems --field-selector=status.phase=Running -o=jsonpath='{.items}' | jq 'length')"
RUNNERS_COUNT="$(kubectl get pods -l app.kubernetes.io/component=runner -n arc-runners --field-selector=status.phase=Running -o=jsonpath='{.items}' | jq 'length')"
RESOURCES="$(kubectl get pods -A)"
if [ "$LISTENER_COUNT" -eq 0 ]; then
echo "Listener has been deleted"
echo "$RESOURCES"
exit 0
fi
if [ "$count" -ge 60 ]; then
echo "Timeout waiting for listener to be deleted"
echo "$RESOURCES"
exit 1
fi
echo "Waiting for listener to be deleted"
echo "Listener count: $LISTENER_COUNT target: 0 | Runners count: $RUNNERS_COUNT target: 3"
sleep 1
count=$((count+1))
done
- name: Assert that the listener goes back up after the jobs are done
shell: bash
run: |
count=0
while true; do
LISTENER_COUNT="$(kubectl get pods -l actions.github.com/scale-set-name=${{ steps.install_arc.outputs.ARC_NAME }} -n arc-systems --field-selector=status.phase=Running -o=jsonpath='{.items}' | jq 'length')"
RUNNERS_COUNT="$(kubectl get pods -l app.kubernetes.io/component=runner -n arc-runners --field-selector=status.phase=Running -o=jsonpath='{.items}' | jq 'length')"
RESOURCES="$(kubectl get pods -A)"
if [ "$LISTENER_COUNT" -eq 1 ]; then
echo "Listener is up!"
echo "$RESOURCES"
exit 0
fi
if [ "$count" -ge 120 ]; then
echo "Timeout waiting for listener to be recreated"
echo "$RESOURCES"
exit 1
fi
echo "Waiting for listener to be recreated"
echo "Listener count: $LISTENER_COUNT target: 1 | Runners count: $RUNNERS_COUNT target: 0"
sleep 1
count=$((count+1))
done
- name: Gather logs and cleanup
shell: bash
if: always()
run: |
helm uninstall "${{ steps.install_arc.outputs.ARC_NAME }}" --namespace "arc-runners" --debug
kubectl wait --timeout=10s --for=delete AutoScalingRunnerSet -n "${{ steps.install_arc.outputs.ARC_NAME }}" -l app.kubernetes.io/instance="${{ steps.install_arc.outputs.ARC_NAME }}"
kubectl logs deployment/arc-gha-rs-controller -n "arc-systems"

View File

@@ -1,4 +1,4 @@
name: Publish Runner Scale Set Controller Charts
name: (gha) Publish Helm Charts
on:
workflow_dispatch:
@@ -33,10 +33,14 @@ env:
HELM_VERSION: v3.8.0
permissions:
packages: write
packages: write
concurrency:
group: ${{ github.workflow }}
cancel-in-progress: true
jobs:
build-push-image:
build-push-image:
name: Build and push controller image
runs-on: ubuntu-latest
steps:
@@ -46,7 +50,14 @@ jobs:
# If inputs.ref is empty, it'll resolve to the default branch
ref: ${{ inputs.ref }}
- name: Resolve parameters
- name: Check chart versions
# Binary version and chart versions need to match.
# In case of an upgrade, the controller will try to clean up
# resources with older versions that should have been cleaned up
# during the upgrade process
run: ./hack/check-gh-chart-versions.sh ${{ inputs.release_tag_name }}
- name: Resolve parameters
id: resolve_parameters
run: |
resolvedRef="${{ inputs.ref }}"
@@ -67,7 +78,7 @@ jobs:
uses: docker/setup-buildx-action@v2
with:
# Pinning v0.9.1 for Buildx and BuildKit v0.10.6
# BuildKit v0.11 which has a bug causing intermittent
# BuildKit v0.11 which has a bug causing intermittent
# failures pushing images to GHCR
version: v0.9.1
driver-opts: image=moby/buildkit:v0.10.6
@@ -94,7 +105,7 @@ jobs:
- name: Job summary
run: |
echo "The [publish-runner-scale-set.yaml](https://github.com/actions/actions-runner-controller/blob/main/.github/workflows/publish-runner-scale-set.yaml) workflow run was completed successfully!" >> $GITHUB_STEP_SUMMARY
echo "The [gha-publish-chart.yaml](https://github.com/actions/actions-runner-controller/blob/main/.github/workflows/gha-publish-chart.yaml) workflow run was completed successfully!" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
echo "**Parameters:**" >> $GITHUB_STEP_SUMMARY
echo "- Ref: ${{ steps.resolve_parameters.outputs.resolvedRef }}" >> $GITHUB_STEP_SUMMARY
@@ -115,7 +126,7 @@ jobs:
# If inputs.ref is empty, it'll resolve to the default branch
ref: ${{ inputs.ref }}
- name: Resolve parameters
- name: Resolve parameters
id: resolve_parameters
run: |
resolvedRef="${{ inputs.ref }}"
@@ -126,7 +137,7 @@ jobs:
echo "INFO: Resolving short SHA for $resolvedRef"
echo "short_sha=$(git rev-parse --short $resolvedRef)" >> $GITHUB_OUTPUT
echo "INFO: Normalizing repository name (lowercase)"
echo "repository_owner=$(echo ${{ github.repository_owner }} | tr '[:upper:]' '[:lower:]')" >> $GITHUB_OUTPUT
echo "repository_owner=$(echo ${{ github.repository_owner }} | tr '[:upper:]' '[:lower:]')" >> $GITHUB_OUTPUT
- name: Set up Helm
# Using https://github.com/Azure/setup-helm/releases/tag/v3.5
@@ -140,7 +151,7 @@ jobs:
GHA_RUNNER_SCALE_SET_CONTROLLER_CHART_VERSION_TAG=$(cat charts/gha-runner-scale-set-controller/Chart.yaml | grep version: | cut -d " " -f 2)
echo "GHA_RUNNER_SCALE_SET_CONTROLLER_CHART_VERSION_TAG=${GHA_RUNNER_SCALE_SET_CONTROLLER_CHART_VERSION_TAG}" >> $GITHUB_ENV
helm package charts/gha-runner-scale-set-controller/ --version="${GHA_RUNNER_SCALE_SET_CONTROLLER_CHART_VERSION_TAG}"
helm push gha-runner-scale-set-controller-"${GHA_RUNNER_SCALE_SET_CONTROLLER_CHART_VERSION_TAG}".tgz oci://ghcr.io/${{ steps.resolve_parameters.outputs.repository_owner }}/actions-runner-controller-charts
helm push gha-rs-controller-"${GHA_RUNNER_SCALE_SET_CONTROLLER_CHART_VERSION_TAG}".tgz oci://ghcr.io/${{ steps.resolve_parameters.outputs.repository_owner }}/actions-runner-controller-charts
- name: Job summary
run: |
@@ -163,7 +174,7 @@ jobs:
# If inputs.ref is empty, it'll resolve to the default branch
ref: ${{ inputs.ref }}
- name: Resolve parameters
- name: Resolve parameters
id: resolve_parameters
run: |
resolvedRef="${{ inputs.ref }}"
@@ -174,7 +185,7 @@ jobs:
echo "INFO: Resolving short SHA for $resolvedRef"
echo "short_sha=$(git rev-parse --short $resolvedRef)" >> $GITHUB_OUTPUT
echo "INFO: Normalizing repository name (lowercase)"
echo "repository_owner=$(echo ${{ github.repository_owner }} | tr '[:upper:]' '[:lower:]')" >> $GITHUB_OUTPUT
echo "repository_owner=$(echo ${{ github.repository_owner }} | tr '[:upper:]' '[:lower:]')" >> $GITHUB_OUTPUT
- name: Set up Helm
# Using https://github.com/Azure/setup-helm/releases/tag/v3.5
@@ -189,7 +200,7 @@ jobs:
GHA_RUNNER_SCALE_SET_CHART_VERSION_TAG=$(cat charts/gha-runner-scale-set/Chart.yaml | grep version: | cut -d " " -f 2)
echo "GHA_RUNNER_SCALE_SET_CHART_VERSION_TAG=${GHA_RUNNER_SCALE_SET_CHART_VERSION_TAG}" >> $GITHUB_ENV
helm package charts/gha-runner-scale-set/ --version="${GHA_RUNNER_SCALE_SET_CHART_VERSION_TAG}"
helm push gha-runner-scale-set-"${GHA_RUNNER_SCALE_SET_CHART_VERSION_TAG}".tgz oci://ghcr.io/${{ steps.resolve_parameters.outputs.repository_owner }}/actions-runner-controller-charts
helm push gha-rs-"${GHA_RUNNER_SCALE_SET_CHART_VERSION_TAG}".tgz oci://ghcr.io/${{ steps.resolve_parameters.outputs.repository_owner }}/actions-runner-controller-charts
- name: Job summary
run: |

View File

@@ -1,4 +1,4 @@
name: Validate Helm Chart (gha-runner-scale-set-controller and gha-runner-scale-set)
name: (gha) Validate Helm Charts
on:
pull_request:
@@ -6,13 +6,13 @@ on:
- master
paths:
- 'charts/**'
- '.github/workflows/validate-gha-chart.yaml'
- '.github/workflows/gha-validate-chart.yaml'
- '!charts/actions-runner-controller/**'
- '!**.md'
push:
paths:
- 'charts/**'
- '.github/workflows/validate-gha-chart.yaml'
- '.github/workflows/gha-validate-chart.yaml'
- '!charts/actions-runner-controller/**'
- '!**.md'
workflow_dispatch:
@@ -23,6 +23,13 @@ env:
permissions:
contents: read
concurrency:
# This will make sure we only apply the concurrency limits on pull requests
# but not pushes to master branch by making the concurrency group name unique
# for pushes
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
jobs:
validate-chart:
name: Lint Chart
@@ -61,23 +68,7 @@ jobs:
python-version: '3.7'
- name: Set up chart-testing
uses: helm/chart-testing-action@v2.3.1
- name: Set up latest version chart-testing
run: |
echo 'deb [trusted=yes] https://repo.goreleaser.com/apt/ /' | sudo tee /etc/apt/sources.list.d/goreleaser.list
sudo apt update
sudo apt install goreleaser
git clone https://github.com/helm/chart-testing
cd chart-testing
unset CT_CONFIG_DIR
goreleaser build --clean --skip-validate
./dist/chart-testing_linux_amd64_v1/ct version
echo 'Adding ct directory to PATH...'
echo "$RUNNER_TEMP/chart-testing/dist/chart-testing_linux_amd64_v1" >> "$GITHUB_PATH"
echo 'Setting CT_CONFIG_DIR...'
echo "CT_CONFIG_DIR=$RUNNER_TEMP/chart-testing/etc" >> "$GITHUB_ENV"
working-directory: ${{ runner.temp }}
uses: helm/chart-testing-action@v2.4.0
- name: Run chart-testing (list-changed)
id: list-changed

View File

@@ -1,4 +1,4 @@
name: Publish Canary Image
name: Publish Canary Images
# Revert to https://github.com/actions-runner-controller/releases#releases
# for details on why we use this approach
@@ -11,19 +11,19 @@ on:
- '.github/actions/**'
- '.github/ISSUE_TEMPLATE/**'
- '.github/workflows/e2e-test-dispatch-workflow.yaml'
- '.github/workflows/e2e-test-linux-vm.yaml'
- '.github/workflows/publish-arc.yaml'
- '.github/workflows/publish-chart.yaml'
- '.github/workflows/publish-runner-scale-set.yaml'
- '.github/workflows/release-runners.yaml'
- '.github/workflows/run-codeql.yaml'
- '.github/workflows/run-first-interaction.yaml'
- '.github/workflows/run-stale.yaml'
- '.github/workflows/update-runners.yaml'
- '.github/workflows/gha-e2e-tests.yaml'
- '.github/workflows/arc-publish.yaml'
- '.github/workflows/arc-publish-chart.yaml'
- '.github/workflows/gha-publish-chart.yaml'
- '.github/workflows/arc-release-runners.yaml'
- '.github/workflows/global-run-codeql.yaml'
- '.github/workflows/global-run-first-interaction.yaml'
- '.github/workflows/global-run-stale.yaml'
- '.github/workflows/arc-update-runners-scheduled.yaml'
- '.github/workflows/validate-arc.yaml'
- '.github/workflows/validate-chart.yaml'
- '.github/workflows/validate-gha-chart.yaml'
- '.github/workflows/validate-runners.yaml'
- '.github/workflows/arc-validate-chart.yaml'
- '.github/workflows/gha-validate-chart.yaml'
- '.github/workflows/arc-validate-runners.yaml'
- '.github/dependabot.yml'
- '.github/RELEASE_NOTE_TEMPLATE.md'
- 'runner/**'
@@ -37,6 +37,10 @@ permissions:
contents: read
packages: write
concurrency:
group: ${{ github.workflow }}
cancel-in-progress: true
env:
# Safeguard to prevent pushing images to registeries after build
PUSH_TO_REGISTRIES: true
@@ -126,4 +130,4 @@ jobs:
ghcr.io/${{ steps.resolve_parameters.outputs.repository_owner }}/gha-runner-scale-set-controller:canary
ghcr.io/${{ steps.resolve_parameters.outputs.repository_owner }}/gha-runner-scale-set-controller:canary-${{ steps.resolve_parameters.outputs.short_sha }}
cache-from: type=gha
cache-to: type=gha,mode=max
cache-to: type=gha,mode=max

View File

@@ -10,6 +10,13 @@ on:
schedule:
- cron: '30 1 * * 0'
concurrency:
# This will make sure we only apply the concurrency limits on pull requests
# but not pushes to master branch by making the concurrency group name unique
# for pushes
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
jobs:
analyze:
name: Analyze

View File

@@ -1,4 +1,4 @@
name: first-interaction
name: First Interaction
on:
issues:

View File

@@ -8,7 +8,6 @@ on:
- '**.go'
- 'go.mod'
- 'go.sum'
pull_request:
paths:
- '.github/workflows/go.yaml'
@@ -19,6 +18,13 @@ on:
permissions:
contents: read
concurrency:
# This will make sure we only apply the concurrency limits on pull requests
# but not pushes to master branch by making the concurrency group name unique
# for pushes
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
jobs:
fmt:
runs-on: ubuntu-latest
@@ -72,9 +78,11 @@ jobs:
run: git diff --exit-code
- name: Install kubebuilder
run: |
curl -L -O https://github.com/kubernetes-sigs/kubebuilder/releases/download/v2.3.2/kubebuilder_2.3.2_linux_amd64.tar.gz
tar zxvf kubebuilder_2.3.2_linux_amd64.tar.gz
sudo mv kubebuilder_2.3.2_linux_amd64 /usr/local/kubebuilder
curl -D headers.txt -fsL "https://storage.googleapis.com/kubebuilder-tools/kubebuilder-tools-1.26.1-linux-amd64.tar.gz" -o kubebuilder-tools
echo "$(grep -i etag headers.txt -m 1 | cut -d'"' -f2) kubebuilder-tools" > sum
md5sum -c sum
tar -zvxf kubebuilder-tools
sudo mv kubebuilder /usr/local/
- name: Run go tests
run: |
go test -short `go list ./... | grep -v ./test_e2e_arc`

View File

@@ -1,109 +0,0 @@
# This workflows polls releases from actions/runner and in case of a new one it
# updates files containing runner version and opens a pull request.
name: Update runners
on:
schedule:
# run daily
- cron: "0 9 * * *"
workflow_dispatch:
jobs:
# check_versions compares our current version and the latest available runner
# version and sets them as outputs.
check_versions:
runs-on: ubuntu-latest
env:
GH_TOKEN: ${{ github.token }}
outputs:
current_version: ${{ steps.versions.outputs.current_version }}
latest_version: ${{ steps.versions.outputs.latest_version }}
steps:
- uses: actions/checkout@v3
- name: Get current and latest versions
id: versions
run: |
CURRENT_VERSION=$(echo -n $(cat runner/VERSION))
echo "Current version: $CURRENT_VERSION"
echo current_version=$CURRENT_VERSION >> $GITHUB_OUTPUT
LATEST_VERSION=$(gh release list --exclude-drafts --exclude-pre-releases --limit 1 -R actions/runner | grep -oP '(?<=v)[0-9.]+' | head -1)
echo "Latest version: $LATEST_VERSION"
echo latest_version=$LATEST_VERSION >> $GITHUB_OUTPUT
# check_pr checks if a PR for the same update already exists. It only runs if
# runner latest version != our current version. If no existing PR is found,
# it sets a PR name as output.
check_pr:
runs-on: ubuntu-latest
needs: check_versions
if: needs.check_versions.outputs.current_version != needs.check_versions.outputs.latest_version
outputs:
pr_name: ${{ steps.pr_name.outputs.pr_name }}
env:
GH_TOKEN: ${{ github.token }}
steps:
- name: debug
run:
echo ${{ needs.check_versions.outputs.current_version }}
echo ${{ needs.check_versions.outputs.latest_version }}
- uses: actions/checkout@v3
- name: PR Name
id: pr_name
env:
LATEST_VERSION: ${{ needs.check_versions.outputs.latest_version }}
run: |
PR_NAME="Update runner to version ${LATEST_VERSION}"
result=$(gh pr list --search "$PR_NAME" --json number --jq ".[].number" --limit 1)
if [ -z "$result" ]
then
echo "No existing PRs found, setting output with pr_name=$PR_NAME"
echo pr_name=$PR_NAME >> $GITHUB_OUTPUT
else
echo "Found a PR with title '$PR_NAME' already existing: ${{ github.server_url }}/${{ github.repository }}/pull/$result"
fi
# update_version updates runner version in the files listed below, commits
# the changes and opens a pull request as `github-actions` bot.
update_version:
runs-on: ubuntu-latest
needs:
- check_versions
- check_pr
if: needs.check_pr.outputs.pr_name
permissions:
pull-requests: write
contents: write
actions: write
env:
GH_TOKEN: ${{ github.token }}
CURRENT_VERSION: ${{ needs.check_versions.outputs.current_version }}
LATEST_VERSION: ${{ needs.check_versions.outputs.latest_version }}
PR_NAME: ${{ needs.check_pr.outputs.pr_name }}
steps:
- uses: actions/checkout@v3
- name: New branch
run: git checkout -b update-runner-$LATEST_VERSION
- name: Update files
run: |
sed -i "s/$CURRENT_VERSION/$LATEST_VERSION/g" runner/VERSION
sed -i "s/$CURRENT_VERSION/$LATEST_VERSION/g" runner/Makefile
sed -i "s/$CURRENT_VERSION/$LATEST_VERSION/g" Makefile
sed -i "s/$CURRENT_VERSION/$LATEST_VERSION/g" test/e2e/e2e_test.go
sed -i "s/$CURRENT_VERSION/$LATEST_VERSION/g" .github/workflows/e2e-test-linux-vm.yaml
- name: Commit changes
run: |
# from https://github.com/orgs/community/discussions/26560
git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
git config user.name "github-actions[bot]"
git add .
git commit -m "$PR_NAME"
git push -u origin HEAD
- name: Create pull request
run: gh pr create -f

1
.gitignore vendored
View File

@@ -35,3 +35,4 @@ bin
.DS_STORE
/test-assets
/.tools

View File

@@ -15,6 +15,13 @@
- [Opening the Pull Request](#opening-the-pull-request)
- [Helm Version Changes](#helm-version-changes)
- [Testing Controller Built from a Pull Request](#testing-controller-built-from-a-pull-request)
- [Release process](#release-process)
- [Workflow structure](#workflow-structure)
- [Releasing legacy actions-runner-controller image and helm charts](#releasing-legacy-actions-runner-controller-image-and-helm-charts)
- [Release actions-runner-controller runner images](#release-actions-runner-controller-runner-images)
- [Release gha-runner-scale-set-controller image and helm charts](#release-gha-runner-scale-set-controller-image-and-helm-charts)
- [Release actions/runner image](#release-actionsrunner-image)
- [Canary releases](#canary-releases)
## Welcome
@@ -25,14 +32,13 @@ reviewed and merged.
## Before contributing code
We welcome code patches, but to make sure things are well coordinated you should discuss any significant change before starting the work.
The maintainers ask that you signal your intention to contribute to the project using the issue tracker.
If there is an existing issue that you want to work on, please let us know so we can get it assigned to you.
If you noticed a bug or want to add a new feature, there are issue templates you can fill out.
We welcome code patches, but to make sure things are well coordinated you should discuss any significant change before starting the work. The maintainers ask that you signal your intention to contribute to the project using the issue tracker. If there is an existing issue that you want to work on, please let us know so we can get it assigned to you. If you noticed a bug or want to add a new feature, there are issue templates you can fill out.
When filing a feature request, the maintainers will review the change and give you a decision on whether we are willing to accept the feature into the project.
For significantly large and/or complex features, we may request that you write up an architectural decision record ([ADR](https://github.blog/2020-08-13-why-write-adrs/)) detailing the change.
Please use the [template](/adrs/0000-TEMPLATE.md) as guidance.
Please use the [template](/docs/adrs/yyyy-mm-dd-TEMPLATE) as guidance.
<!--
TODO: Add a pre-requisite section describing what developers should
@@ -45,6 +51,7 @@ Depending on what you are patching depends on how you should go about it.
Below are some guides on how to test patches locally as well as develop the controller and runners.
When submitting a PR for a change please provide evidence that your change works as we still need to work on improving the CI of the project.
Some resources are provided for helping achieve this, see this guide for details.
### Developing the Controller
@@ -130,7 +137,7 @@ GINKGO_FOCUS='[It] should create a new Runner resource from the specified templa
>
> If you want to stick with `snap`-provided `docker`, do not forget to set `TMPDIR` to somewhere under `$HOME`.
> Otherwise `kind load docker-image` fail while running `docker save`.
> See https://kind.sigs.k8s.io/docs/user/known-issues/#docker-installed-with-snap for more information.
> See <https://kind.sigs.k8s.io/docs/user/known-issues/#docker-installed-with-snap> for more information.
To test your local changes against both PAT and App based authentication please run the `acceptance` make target with the authentication configuration details provided:
@@ -186,7 +193,7 @@ Before shipping your PR, please check the following items to make sure CI passes
- Run `go mod tidy` if you made changes to dependencies.
- Format the code using `gofmt`
- Run the `golangci-lint` tool locally.
- We recommend you use `make lint` to run the tool using a Docker container matching the CI version.
- We recommend you use `make lint` to run the tool using a Docker container matching the CI version.
### Opening the Pull Request
@@ -217,3 +224,146 @@ Please also note that you need to replace `$DOCKER_USER` with your own DockerHub
Only the maintainers can release a new version of actions-runner-controller, publish a new version of the helm charts, and runner images.
All release workflows have been moved to [actions-runner-controller/releases](https://github.com/actions-runner-controller/releases) since the packages are owned by the former organization.
### Workflow structure
Following the migration of actions-runner-controller into GitHub actions, all the workflows had to be modified to accommodate the move to a new organization. The following table describes the workflows, their purpose and dependencies.
| Filename | Workflow name | Purpose |
|-----------------------------------|--------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| gha-e2e-tests.yaml | (gha) E2E Tests | Tests the Autoscaling Runner Set mode end to end. Coverage is restricted to this mode. Legacy modes are not tested. |
| go.yaml | Format, Lint, Unit Tests | Formats, lints and runs unit tests for the entire codebase. |
| arc-publish.yaml | Publish ARC Image | Uploads release/actions-runner-controller.yaml as an artifact to the newly created release and triggers the [build and publication of the controller image](https://github.com/actions-runner-controller/releases/blob/main/.github/workflows/publish-arc.yaml) |
| global-publish-canary.yaml | Publish Canary Images | Builds and publishes canary controller container images for both new and legacy modes. |
| arc-publish-chart.yaml | Publish ARC Helm Charts | Packages and publishes charts/actions-runner-controller (via GitHub Pages) |
| gha-publish-chart.yaml | (gha) Publish Helm Charts | Packages and publishes charts/gha-runner-scale-set-controller and charts/gha-runner-scale-set charts (OCI to GHCR) |
| arc-release-runners.yaml | Release ARC Runner Images | Triggers [release-runners.yaml](https://github.com/actions-runner-controller/releases/blob/main/.github/workflows/release-runners.yaml) which will build and push new runner images used with the legacy ARC modes. |
| global-run-codeql.yaml | Run CodeQL | Run CodeQL on all the codebase |
| global-run-first-interaction.yaml | First Interaction | Informs first time contributors what to expect when they open a new issue / PR |
| global-run-stale.yaml | Run Stale Bot | Closes issues / PRs without activity |
| arc-update-runners-scheduled.yaml | Runner Updates Check (Scheduled Job) | Polls [actions/runner](https://github.com/actions/runner) and [actions/runner-container-hooks](https://github.com/actions/runner-container-hooks) for new releases. If found, a PR is created to publish new runner images |
| arc-validate-chart.yaml | Validate Helm Chart | Run helm chart validators for charts/actions-runner-controller |
| gha-validate-chart.yaml | (gha) Validate Helm Charts | Run helm chart validators for charts/gha-runner-scale-set-controller and charts/gha-runner-scale-set charts |
| arc-validate-runners.yaml | Validate ARC Runners | Run validators for runners |
There are 7 components that we release regularly:
1. legacy [actions-runner-controller controller image](https://github.com/actions-runner-controller/actions-runner-controller/pkgs/container/actions-runner-controller)
2. legacy [actions-runner-controller helm charts](https://actions-runner-controller.github.io/actions-runner-controller/)
3. legacy actions-runner-controller runner images
1. [ubuntu-20.04](https://github.com/actions-runner-controller/actions-runner-controller/pkgs/container/actions-runner-controller%2Factions-runner)
2. [ubuntu-22.04](https://github.com/actions-runner-controller/actions-runner-controller/pkgs/container/actions-runner-controller%2Factions-runner)
3. [dind-ubuntu-20.04](https://github.com/actions-runner-controller/actions-runner-controller/pkgs/container/actions-runner-controller%2Factions-runner-dind)
4. [dind-ubuntu-22.04](https://github.com/actions-runner-controller/actions-runner-controller/pkgs/container/actions-runner-controller%2Factions-runner-dind)
5. [dind-rootless-ubuntu-20.04](https://github.com/actions-runner-controller/actions-runner-controller/pkgs/container/actions-runner-controller%2Factions-runner-dind-rootless)
6. [dind-rootless-ubuntu-22.04](https://github.com/actions-runner-controller/actions-runner-controller/pkgs/container/actions-runner-controller%2Factions-runner-dind-rootless)
4. [gha-runner-scale-set-controller image](https://github.com/actions/actions-runner-controller/pkgs/container/gha-runner-scale-set-controller)
5. [gha-runner-scale-set-controller helm charts](https://github.com/actions/actions-runner-controller/pkgs/container/actions-runner-controller-charts%2Fgha-runner-scale-set-controller)
6. [gha-runner-scale-set runner helm charts](https://github.com/actions/actions-runner-controller/pkgs/container/actions-runner-controller-charts%2Fgha-runner-scale-set)
7. [actions/runner image](https://github.com/actions/actions-runner-controller/pkgs/container/actions-runner-controller%2Factions-runner)
#### Releasing legacy actions-runner-controller image and helm charts
1. Start by making sure the master branch is stable and all CI jobs are passing
2. Create a new release in <https://github.com/actions/actions-runner-controller/releases> (Draft a new release)
3. Bump up the `version` and `appVersion` in charts/actions-runner-controller/Chart.yaml - make sure the `version` matches the release version you just created. (Example: <https://github.com/actions/actions-runner-controller/pull/2577>)
4. When the workflows finish execution, you will see:
1. A new controller image published to: <https://github.com/actions-runner-controller/actions-runner-controller/pkgs/container/actions-runner-controller>
2. Helm charts published to: <https://github.com/actions-runner-controller/actions-runner-controller.github.io/tree/master/actions-runner-controller> (the index.yaml file is updated)
When a new release is created, the [Publish ARC Image](https://github.com/actions/actions-runner-controller/blob/master/.github/workflows/arc-publish.yaml) workflow is triggered.
```mermaid
flowchart LR
subgraph repository: actions/actions-runner-controller
event_a{{"release: published"}} -- triggers --> workflow_a["arc-publish.yaml"]
event_b{{"workflow_dispatch"}} -- triggers --> workflow_a["arc-publish.yaml"]
workflow_a["arc-publish.yaml"] -- uploads --> package["actions-runner-controller.tar.gz"]
end
subgraph repository: actions-runner-controller/releases
workflow_a["arc-publish.yaml"] -- triggers --> event_d{{"repository_dispatch"}} --> workflow_b["publish-arc.yaml"]
workflow_b["publish-arc.yaml"] -- push --> A["GHCR: \nactions-runner-controller/actions-runner-controller:*"]
workflow_b["publish-arc.yaml"] -- push --> B["DockerHub: \nsummerwind/actions-runner-controller:*"]
end
```
#### Release actions-runner-controller runner images
**Manual steps:**
1. Navigate to the [actions-runner-controller/releases](https://github.com/actions-runner-controller/releases) repository
2. Trigger [the release-runners.yaml](https://github.com/actions-runner-controller/releases/actions/workflows/release-runners.yaml) workflow.
1. The list of input prameters for this workflow is defined in the table below (always inspect the workflow file for the latest version)
<!-- Table of Paramters -->
| Parameter | Description | Default |
|----------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------|
| `runner_version` | The version of the [actions/runner](https://github.com/actions/runner) to use | `2.300.2` |
| `docker_version` | The version of docker to use | `20.10.12` |
| `runner_container_hooks_version` | The version of [actions/runner-container-hooks](https://github.com/actions/runner-container-hooks) to use | `0.2.0` |
| `sha` | The commit sha from [actions/actions-runner-controller](https://github.com/actions/actions-runner-controller) to be used to build the runner images. This will be provided to `actions/checkout` & used to tag the container images | Empty string. |
| `push_to_registries` | Whether to push the images to the registries. Use false to test the build | false |
**Automated steps:**
```mermaid
flowchart LR
workflow["release-runners.yaml"] -- workflow_dispatch* --> workflow_b["release-runners.yaml"]
subgraph repository: actions/actions-runner-controller
runner_updates_check["arc-update-runners-scheduled.yaml"] -- "polls (daily)" --> runner_releases["actions/runner/releases"]
runner_updates_check -- creates --> runner_update_pr["PR: update /runner/VERSION"]****
runner_update_pr --> runner_update_pr_merge{{"merge"}}
runner_update_pr_merge -- triggers --> workflow["release-runners.yaml"]
end
subgraph repository: actions-runner-controller/releases
workflow_b["release-runners.yaml"] -- push --> A["GHCR: \n actions-runner-controller/actions-runner:* \n actions-runner-controller/actions-runner-dind:* \n actions-runner-controller/actions-runner-dind-rootless:*"]
workflow_b["release-runners.yaml"] -- push --> B["DockerHub: \n summerwind/actions-runner:* \n summerwind/actions-runner-dind:* \n summerwind/actions-runner-dind-rootless:*"]
event_b{{"workflow_dispatch"}} -- triggers --> workflow_b["release-runners.yaml"]
end
```
#### Release gha-runner-scale-set-controller image and helm charts
1. Make sure the master branch is stable and all CI jobs are passing
1. Prepare a release PR (example: <https://github.com/actions/actions-runner-controller/pull/2467>)
1. Bump up the version of the chart in: charts/gha-runner-scale-set-controller/Chart.yaml
2. Bump up the version of the chart in: charts/gha-runner-scale-set/Chart.yaml
1. Make sure that `version`, `appVersion` of both charts are always the same. These versions cannot diverge.
3. Update the quickstart guide to reflect the latest versions: docs/preview/gha-runner-scale-set-controller/README.md
4. Add changelog to the PR as well as the quickstart guide
1. Merge the release PR
1. Manually trigger the [(gha) Publish Helm Charts](https://github.com/actions/actions-runner-controller/actions/workflows/gha-publish-chart.yaml) workflow
1. Manually create a tag and release in [actions/actions-runner-controller](https://github.com/actions/actions-runner-controller/releases) with the format: `gha-runner-scale-set-x.x.x` where the version (x.x.x) matches that of the Helm chart
| Parameter | Description | Default |
|-------------------------------------------------|--------------------------------------------------------------------------------------------------------|----------------|
| `ref` | The branch, tag or SHA to cut a release from. | default branch |
| `release_tag_name` | The tag of the controller image. This is not a git tag. | canary |
| `push_to_registries` | Push images to registries. Use false to test the build process. | false |
| `publish_gha_runner_scale_set_controller_chart` | Publish new helm chart for gha-runner-scale-set-controller. This will push the new OCI archive to GHCR | false |
| `publish_gha_runner_scale_set_chart` | Publish new helm chart for gha-runner-scale-set. This will push the new OCI archive to GHCR | false |
#### Release actions/runner image
A new runner image is built and published to <https://github.com/actions/runner/pkgs/container/actions-runner> whenever a new runner binary has been released. There's nothing to do here.
#### Canary releases
We publish canary images for both the legacy actions-runner-controller and gha-runner-scale-set-controller images.
```mermaid
flowchart LR
subgraph org: actions
event_a{{"push: [master]"}} -- triggers --> workflow_a["publish-canary.yaml"]
end
subgraph org: actions-runner-controller
workflow_a["publish-canary.yaml"] -- triggers --> event_d{{"repository_dispatch"}} --> workflow_b["publish-canary.yaml"]
workflow_b["publish-canary.yaml"] -- push --> A["GHCR: \nactions-runner-controller/actions-runner-controller:canary"]
workflow_b["publish-canary.yaml"] -- push --> B["DockerHub: \nsummerwind/actions-runner-controller:canary"]
end
```
1. [actions-runner-controller canary image](https://github.com/actions-runner-controller/actions-runner-controller/pkgs/container/actions-runner-controller)
2. [gha-runner-scale-set-controller image](https://github.com/actions/actions-runner-controller/pkgs/container/gha-runner-scale-set-controller)
These canary images are automatically built and released on each push to the master branch.

View File

@@ -1,5 +1,5 @@
# Build the manager binary
FROM --platform=$BUILDPLATFORM golang:1.19.4 as builder
FROM --platform=$BUILDPLATFORM golang:1.20.7 as builder
WORKDIR /workspace

View File

@@ -5,7 +5,7 @@ else
endif
DOCKER_USER ?= $(shell echo ${DOCKER_IMAGE_NAME} | cut -d / -f1)
VERSION ?= dev
RUNNER_VERSION ?= 2.303.0
RUNNER_VERSION ?= 2.308.0
TARGETPLATFORM ?= $(shell arch)
RUNNER_NAME ?= ${DOCKER_USER}/actions-runner
RUNNER_TAG ?= ${VERSION}
@@ -95,7 +95,8 @@ run: generate fmt vet manifests
run-scaleset: generate fmt vet
CONTROLLER_MANAGER_POD_NAMESPACE=default \
CONTROLLER_MANAGER_CONTAINER_IMAGE="${DOCKER_IMAGE_NAME}:${VERSION}" \
go run ./main.go --auto-scaling-runner-set-only
go run -ldflags="-s -w -X 'github.com/actions/actions-runner-controller/build.Version=$(VERSION)'" \
./main.go --auto-scaling-runner-set-only
# Install CRDs into a cluster
install: manifests
@@ -202,7 +203,7 @@ generate: controller-gen
# Run shellcheck on runner scripts
shellcheck: shellcheck-install
$(TOOLS_PATH)/shellcheck --shell bash --source-path runner runner/*.sh
$(TOOLS_PATH)/shellcheck --shell bash --source-path runner runner/*.sh hack/*.sh
docker-buildx:
export DOCKER_CLI_EXPERIMENTAL=enabled ;\

View File

@@ -4,42 +4,40 @@
[![awesome-runners](https://img.shields.io/badge/listed%20on-awesome--runners-blue.svg)](https://github.com/jonico/awesome-runners)
[![Artifact Hub](https://img.shields.io/endpoint?url=https://artifacthub.io/badge/repository/actions-runner-controller)](https://artifacthub.io/packages/search?repo=actions-runner-controller)
## About
Actions Runner Controller (ARC) is a Kubernetes operator that orchestrates and scales self-hosted runners for GitHub Actions.
With ARC, you can create runner scale sets that automatically scale based on the number of workflows running in your repository, organization, or enterprise. Because controlled runners can be ephemeral and based on containers, new runner instances can scale up or down rapidly and cleanly. For more information about autoscaling, see ["Autoscaling with self-hosted runners."](https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners/autoscaling-with-self-hosted-runners)
You can set up ARC on Kubernetes using Helm, then create and run a workflow that uses runner scale sets. For more information about runner scale sets, see ["Deploying runner scale sets with Actions Runner Controller."](https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners-with-actions-runner-controller/deploying-runner-scale-sets-with-actions-runner-controller#runner-scale-set)
## People
`actions-runner-controller` is an open-source project currently developed and maintained in collaboration with maintainers @mumoshu and @toast-gear, various [contributors](https://github.com/actions/actions-runner-controller/graphs/contributors), and the [awesome community](https://github.com/actions/actions-runner-controller/discussions), mostly in their spare time.
Actions Runner Controller (ARC) is an open-source project currently developed and maintained in collaboration with the GitHub Actions team, external maintainers @mumoshu and @toast-gear, various [contributors](https://github.com/actions/actions-runner-controller/graphs/contributors), and the [awesome community](https://github.com/actions/actions-runner-controller/discussions).
If you think the project is awesome and it's becoming a basis for your important business, consider [sponsoring us](https://github.com/sponsors/actions-runner-controller)!
If you think the project is awesome and is adding value to your business, please consider directly sponsoring [community maintainers](https://github.com/sponsors/actions-runner-controller) and individual contributors via GitHub Sponsors.
In case you are already the employer of one of contributors, sponsoring via GitHub Sponsors might not be an option. Just support them in other means!
We don't currently have [any sponsors dedicated to this project yet](https://github.com/sponsors/actions-runner-controller).
However, [HelloFresh](https://www.hellofreshgroup.com/en/) has recently started sponsoring @mumoshu for this project along with his other works. A part of their sponsorship will enable @mumoshu to add an E2E test to keep ARC even more reliable on AWS. Thank you for your sponsorship!
[<img src="https://user-images.githubusercontent.com/22009/170898715-07f02941-35ec-418b-8cd4-251b422fa9ac.png" width="219" height="71" />](https://careers.hellofresh.com/)
## Status
Even though actions-runner-controller is used in production environments, it is still in its early stage of development, hence versioned 0.x.
actions-runner-controller complies to Semantic Versioning 2.0.0 in which v0.x means that there could be backward-incompatible changes for every release.
The documentation is kept inline with master@HEAD, we do our best to highlight any features that require a specific ARC version or higher however this is not always easily done due to there being many moving parts. Additionally, we actively do not retain compatibly with every GitHub Enterprise Server version nor every Kubernetes version so you will need to ensure you stay current within a reasonable timespan.
## About
[GitHub Actions](https://github.com/features/actions) is a very useful tool for automating development. GitHub Actions jobs are run in the cloud by default, but you may want to run your jobs in your environment. [Self-hosted runner](https://github.com/actions/runner) can be used for such use cases, but requires the provisioning and configuration of a virtual machine instance. Instead if you already have a Kubernetes cluster, it makes more sense to run the self-hosted runner on top of it.
**actions-runner-controller** makes that possible. Just create a *Runner* resource on your Kubernetes, and it will run and operate the self-hosted runner for the specified repository. Combined with Kubernetes RBAC, you can also build simple Self-hosted runners as a Service.
See [the sponsorship dashboard](https://github.com/sponsors/actions-runner-controller) for the former and the current sponsors.
## Getting Started
To give ARC a try with just a handful of commands, Please refer to the [Quickstart guide](/docs/quickstart.md).
For an overview of ARC, please refer to [About ARC](https://github.com/actions/actions-runner-controller/blob/master/docs/about-arc.md)
To give ARC a try with just a handful of commands, Please refer to the [Quickstart guide](https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners-with-actions-runner-controller/quickstart-for-actions-runner-controller).
For more information, please refer to detailed documentation below!
For an overview of ARC, please refer to [About ARC](https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners-with-actions-runner-controller/about-actions-runner-controller)
## Documentation
With the introduction of [autoscaling runner scale sets](https://github.com/actions/actions-runner-controller/discussions/2775), the existing [autoscaling modes](./docs/automatically-scaling-runners.md) are now legacy. The legacy modes have certain use cases and will continue to be maintained by the community only.
For further information on what is supported by GitHub and what's managed by the community, please refer to [this announcement discussion.](https://github.com/actions/actions-runner-controller/discussions/2775)
### Documentation
ARC documentation is available on [docs.github.com](https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners-with-actions-runner-controller/quickstart-for-actions-runner-controller).
### Legacy documentation
The following documentation is for the legacy autoscaling modes that continue to be maintained by the community
- [Quickstart guide](/docs/quickstart.md)
- [About ARC](/docs/about-arc.md)

View File

@@ -304,3 +304,27 @@ If you noticed that it takes several minutes for sidecar dind container to be cr
**Solution**
The solution is to switch to using faster storage, if you are experiencing this issue you are probably using HDD storage. Switching to SSD storage fixed the problem in my case. Most cloud providers have a list of storage options to use just pick something faster that your current disk, for on prem clusters you will need to invest in some SSDs.
### Dockerd no space left on device
**Problem**
If you are running many containers on your runner you might encounter an issue where docker daemon is unable to start new containers and you see error `no space left on device`.
**Solution**
Add a `dockerVarRunVolumeSizeLimit` key in your runner's spec with a higher size limit (the default is 1M) For instance:
```yaml
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
name: github-runner
namespace: github-system
spec:
replicas: 6
template:
spec:
dockerVarRunVolumeSizeLimit: 50M
env: []
```

View File

@@ -102,6 +102,7 @@ if [ "${tool}" == "helm" ]; then
--set githubWebhookServer.podAnnotations.test-id=${TEST_ID} \
--set actionsMetricsServer.podAnnotations.test-id=${TEST_ID} \
${flags[@]} --set image.imagePullPolicy=${IMAGE_PULL_POLICY} \
--set image.dindSidecarRepositoryAndTag=${DIND_SIDECAR_REPOSITORY_AND_TAG} \
-f ${VALUES_FILE}
set +v
# To prevent `CustomResourceDefinition.apiextensions.k8s.io "runners.actions.summerwind.dev" is invalid: metadata.annotations: Too long: must have at most 262144 bytes`

View File

@@ -6,6 +6,10 @@ OP=${OP:-apply}
RUNNER_LABEL=${RUNNER_LABEL:-self-hosted}
# See https://github.com/actions/actions-runner-controller/issues/2123
kubectl delete secret generic docker-config || :
kubectl create secret generic docker-config --from-file .dockerconfigjson=<(jq -M 'del(.aliases)' $HOME/.docker/config.json) --type=kubernetes.io/dockerconfigjson || :
cat acceptance/testdata/kubernetes_container_mode.envsubst.yaml | NAMESPACE=${RUNNER_NAMESPACE} envsubst | kubectl apply -f -
if [ -n "${TEST_REPO}" ]; then

View File

@@ -95,6 +95,24 @@ spec:
# that part is created by dockerd.
mountPath: /home/runner/.local
readOnly: false
# See https://github.com/actions/actions-runner-controller/issues/2123
# Be sure to omit the "aliases" field from the config.json.
# Otherwise you may encounter nasty errors like:
# $ docker build
# docker: 'buildx' is not a docker command.
# See 'docker --help'
# due to the incompatibility between your host docker config.json and the runner environment.
# That is, your host dockcer config.json might contain this:
# "aliases": {
# "builder": "buildx"
# }
# And this results in the above error when the runner does not have buildx installed yet.
- name: docker-config
mountPath: /home/runner/.docker/config.json
subPath: config.json
readOnly: true
- name: docker-config-root
mountPath: /home/runner/.docker
volumes:
- name: rootless-dind-work-dir
ephemeral:
@@ -105,6 +123,15 @@ spec:
resources:
requests:
storage: 3Gi
- name: docker-config
# Refer to .dockerconfigjson/.docker/config.json
secret:
secretName: docker-config
items:
- key: .dockerconfigjson
path: config.json
- name: docker-config-root
emptyDir: {}
#
# Non-standard working directory

View File

@@ -22,7 +22,7 @@ import (
// HorizontalRunnerAutoscalerSpec defines the desired state of HorizontalRunnerAutoscaler
type HorizontalRunnerAutoscalerSpec struct {
// ScaleTargetRef sis the reference to scaled resource like RunnerDeployment
// ScaleTargetRef is the reference to scaled resource like RunnerDeployment
ScaleTargetRef ScaleTargetRef `json:"scaleTargetRef,omitempty"`
// MinReplicas is the minimum number of replicas the deployment is allowed to scale

View File

@@ -70,6 +70,8 @@ type RunnerConfig struct {
// +optional
DockerRegistryMirror *string `json:"dockerRegistryMirror,omitempty"`
// +optional
DockerVarRunVolumeSizeLimit *resource.Quantity `json:"dockerVarRunVolumeSizeLimit,omitempty"`
// +optional
VolumeSizeLimit *resource.Quantity `json:"volumeSizeLimit,omitempty"`
// +optional
VolumeStorageMedium *string `json:"volumeStorageMedium,omitempty"`

View File

@@ -436,6 +436,11 @@ func (in *RunnerConfig) DeepCopyInto(out *RunnerConfig) {
*out = new(string)
**out = **in
}
if in.DockerVarRunVolumeSizeLimit != nil {
in, out := &in.DockerVarRunVolumeSizeLimit, &out.DockerVarRunVolumeSizeLimit
x := (*in).DeepCopy()
*out = &x
}
if in.VolumeSizeLimit != nil {
in, out := &in.VolumeSizeLimit, &out.VolumeSizeLimit
x := (*in).DeepCopy()

View File

@@ -15,10 +15,10 @@ type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.23.2
version: 0.23.3
# Used as the default manager tag value when no tag property is provided in the values.yaml
appVersion: 0.27.3
appVersion: 0.27.4
home: https://github.com/actions/actions-runner-controller

View File

@@ -35,18 +35,21 @@ All additional docs are kept in the `docs/` folder, this README is solely for do
| `authSecret.github_basicauth_password` | Password for GitHub basic auth to use instead of PAT or GitHub APP in case it's running behind a proxy API | |
| `dockerRegistryMirror` | The default Docker Registry Mirror used by runners. | |
| `hostNetwork` | The "hostNetwork" of the controller container | false |
| `dnsPolicy` | The "dnsPolicy" of the controller container | ClusterFirst |
| `image.repository` | The "repository/image" of the controller container | summerwind/actions-runner-controller |
| `image.tag` | The tag of the controller container | |
| `image.actionsRunnerRepositoryAndTag` | The "repository/image" of the actions runner container | summerwind/actions-runner:latest |
| `image.actionsRunnerImagePullSecrets` | Optional image pull secrets to be included in the runner pod's ImagePullSecrets | |
| `image.dindSidecarRepositoryAndTag` | The "repository/image" of the dind sidecar container | docker:dind |
| `image.pullPolicy` | The pull policy of the controller image | IfNotPresent |
| `metrics.serviceMonitor` | Deploy serviceMonitor kind for for use with prometheus-operator CRDs | false |
| `metrics.serviceMonitor.enable` | Deploy serviceMonitor kind for for use with prometheus-operator CRDs | false |
| `metrics.serviceMonitor.interval` | Configure the interval that Prometheus should scrap the controller's metrics | 1m |
| `metrics.serviceMonitor.timeout` | Configure the timeout the timeout of Prometheus scrapping. | 30s |
| `metrics.serviceAnnotations` | Set annotations for the provisioned metrics service resource | |
| `metrics.port` | Set port of metrics service | 8443 |
| `metrics.proxy.enabled` | Deploy kube-rbac-proxy container in controller pod | true |
| `metrics.proxy.image.repository` | The "repository/image" of the kube-proxy container | quay.io/brancz/kube-rbac-proxy |
| `metrics.proxy.image.tag` | The tag of the kube-proxy image to use when pulling the container | v0.10.0 |
| `metrics.proxy.image.tag` | The tag of the kube-proxy image to use when pulling the container | v0.13.1 |
| `metrics.serviceMonitorLabels` | Set labels to apply to ServiceMonitor resources | |
| `imagePullSecrets` | Specifies the secret to be used when pulling the controller pod containers | |
| `fullnameOverride` | Override the full resource names | |
@@ -102,8 +105,11 @@ All additional docs are kept in the `docs/` folder, this README is solely for do
| `githubWebhookServer.tolerations` | Set the githubWebhookServer pod tolerations | |
| `githubWebhookServer.affinity` | Set the githubWebhookServer pod affinity rules | |
| `githubWebhookServer.priorityClassName` | Set the githubWebhookServer pod priorityClassName | |
| `githubWebhookServer.terminationGracePeriodSeconds` | Set the githubWebhookServer pod terminationGracePeriodSeconds. Useful when using preStop hooks to drain/sleep. | `10` |
| `githubWebhookServer.lifecycle` | Set the githubWebhookServer pod lifecycle hooks | `{}` |
| `githubWebhookServer.service.type` | Set githubWebhookServer service type | |
| `githubWebhookServer.service.ports` | Set githubWebhookServer service ports | `[{"port":80, "targetPort:"http", "protocol":"TCP", "name":"http"}]` |
| `githubWebhookServer.service.loadBalancerSourceRanges` | Set githubWebhookServer loadBalancerSourceRanges for restricting loadBalancer type services | `[]` |
| `githubWebhookServer.ingress.enabled` | Deploy an ingress kind for the githubWebhookServer | false |
| `githubWebhookServer.ingress.annotations` | Set annotations for the ingress kind | |
| `githubWebhookServer.ingress.hosts` | Set hosts configuration for ingress | `[{"host": "chart-example.local", "paths": []}]` |
@@ -115,9 +121,9 @@ All additional docs are kept in the `docs/` folder, this README is solely for do
| `actionsMetricsServer.logLevel` | Set the log level of the actionsMetricsServer container | |
| `actionsMetricsServer.logFormat` | Set the log format of the actionsMetricsServer controller. Valid options are "text" and "json" | text |
| `actionsMetricsServer.enabled` | Deploy the actions metrics server pod | false |
| `actionsMetricsServer.secret.enabled` | Passes the webhook hook secret to the github-webhook-server | false |
| `actionsMetricsServer.secret.enabled` | Passes the webhook hook secret to the actions-metrics-server | false |
| `actionsMetricsServer.secret.create` | Deploy the webhook hook secret | false |
| `actionsMetricsServer.secret.name` | Set the name of the webhook hook secret | github-webhook-server |
| `actionsMetricsServer.secret.name` | Set the name of the webhook hook secret | actions-metrics-server |
| `actionsMetricsServer.secret.github_webhook_secret_token` | Set the webhook secret token value | |
| `actionsMetricsServer.imagePullSecrets` | Specifies the secret to be used when pulling the actionsMetricsServer pod containers | |
| `actionsMetricsServer.nameOverride` | Override the resource name prefix | |
@@ -135,17 +141,22 @@ All additional docs are kept in the `docs/` folder, this README is solely for do
| `actionsMetricsServer.tolerations` | Set the actionsMetricsServer pod tolerations | |
| `actionsMetricsServer.affinity` | Set the actionsMetricsServer pod affinity rules | |
| `actionsMetricsServer.priorityClassName` | Set the actionsMetricsServer pod priorityClassName | |
| `actionsMetricsServer.terminationGracePeriodSeconds` | Set the actionsMetricsServer pod terminationGracePeriodSeconds. Useful when using preStop hooks to drain/sleep. | `10` |
| `actionsMetricsServer.lifecycle` | Set the actionsMetricsServer pod lifecycle hooks | `{}` |
| `actionsMetricsServer.service.type` | Set actionsMetricsServer service type | |
| `actionsMetricsServer.service.ports` | Set actionsMetricsServer service ports | `[{"port":80, "targetPort:"http", "protocol":"TCP", "name":"http"}]` |
| `actionsMetricsServer.service.loadBalancerSourceRanges` | Set actionsMetricsServer loadBalancerSourceRanges for restricting loadBalancer type services | `[]` |
| `actionsMetricsServer.ingress.enabled` | Deploy an ingress kind for the actionsMetricsServer | false |
| `actionsMetricsServer.ingress.annotations` | Set annotations for the ingress kind | |
| `actionsMetricsServer.ingress.hosts` | Set hosts configuration for ingress | `[{"host": "chart-example.local", "paths": []}]` |
| `actionsMetricsServer.ingress.tls` | Set tls configuration for ingress | |
| `actionsMetricsServer.ingress.ingressClassName` | Set ingress class name | |
| `actionsMetrics.serviceMonitor` | Deploy serviceMonitor kind for for use with prometheus-operator CRDs | false |
| `actionsMetrics.serviceAnnotations` | Set annotations for the provisioned actions metrics service resource | |
| `actionsMetrics.port` | Set port of actions metrics service | 8443 |
| `actionsMetrics.proxy.enabled` | Deploy kube-rbac-proxy container in controller pod | true |
| `actionsMetrics.proxy.image.repository` | The "repository/image" of the kube-proxy container | quay.io/brancz/kube-rbac-proxy |
| `actionsMetrics.proxy.image.tag` | The tag of the kube-proxy image to use when pulling the container | v0.10.0 |
| `actionsMetrics.serviceMonitorLabels` | Set labels to apply to ServiceMonitor resources | |
| `actionsMetrics.serviceMonitor.enable` | Deploy serviceMonitor kind for for use with prometheus-operator CRDs | false |
| `actionsMetrics.serviceMonitor.interval` | Configure the interval that Prometheus should scrap the controller's metrics | 1m |
| `actionsMetrics.serviceMonitor.timeout` | Configure the timeout the timeout of Prometheus scrapping. | 30s |
| `actionsMetrics.serviceAnnotations` | Set annotations for the provisioned actions metrics service resource | |
| `actionsMetrics.port` | Set port of actions metrics service | 8443 |
| `actionsMetrics.proxy.enabled` | Deploy kube-rbac-proxy container in controller pod | true |
| `actionsMetrics.proxy.image.repository` | The "repository/image" of the kube-proxy container | quay.io/brancz/kube-rbac-proxy |
| `actionsMetrics.proxy.image.tag` | The tag of the kube-proxy image to use when pulling the container | v0.13.1 |
| `actionsMetrics.serviceMonitorLabels` | Set labels to apply to ServiceMonitor resources | |

View File

@@ -113,7 +113,7 @@ spec:
description: ScaleDownDelaySecondsAfterScaleUp is the approximate delay for a scale down followed by a scale up Used to prevent flapping (down->up->down->... loop)
type: integer
scaleTargetRef:
description: ScaleTargetRef sis the reference to scaled resource like RunnerDeployment
description: ScaleTargetRef is the reference to scaled resource like RunnerDeployment
properties:
kind:
description: Kind is the type of resource being referenced

View File

@@ -1497,6 +1497,12 @@ spec:
type: integer
dockerRegistryMirror:
type: string
dockerVarRunVolumeSizeLimit:
anyOf:
- type: integer
- type: string
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
dockerVolumeMounts:
items:
description: VolumeMount describes a mounting of a Volume within a container.

View File

@@ -1479,6 +1479,12 @@ spec:
type: integer
dockerRegistryMirror:
type: string
dockerVarRunVolumeSizeLimit:
anyOf:
- type: integer
- type: string
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
dockerVolumeMounts:
items:
description: VolumeMount describes a mounting of a Volume within a container.

View File

@@ -1432,6 +1432,12 @@ spec:
type: integer
dockerRegistryMirror:
type: string
dockerVarRunVolumeSizeLimit:
anyOf:
- type: integer
- type: string
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
dockerVolumeMounts:
items:
description: VolumeMount describes a mounting of a Volume within a container.

View File

@@ -55,6 +55,12 @@ spec:
type: integer
dockerRegistryMirror:
type: string
dockerVarRunVolumeSizeLimit:
anyOf:
- type: integer
- type: string
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
dockerdWithinRunnerContainer:
type: boolean
effectiveTime:

View File

@@ -36,8 +36,8 @@ spec:
{{- end }}
containers:
- args:
{{- $metricsHost := .Values.metrics.proxy.enabled | ternary "127.0.0.1" "0.0.0.0" }}
{{- $metricsPort := .Values.metrics.proxy.enabled | ternary "8080" .Values.metrics.port }}
{{- $metricsHost := .Values.actionsMetrics.proxy.enabled | ternary "127.0.0.1" "0.0.0.0" }}
{{- $metricsPort := .Values.actionsMetrics.proxy.enabled | ternary "8080" .Values.actionsMetrics.port }}
- "--metrics-addr={{ $metricsHost }}:{{ $metricsPort }}"
{{- if .Values.actionsMetricsServer.logLevel }}
- "--log-level={{ .Values.actionsMetricsServer.logLevel }}"
@@ -50,6 +50,12 @@ spec:
{{- end }}
command:
- "/actions-metrics-server"
{{- if .Values.actionsMetricsServer.lifecycle }}
{{- with .Values.actionsMetricsServer.lifecycle }}
lifecycle:
{{- toYaml . | nindent 10 }}
{{- end }}
{{- end }}
env:
- name: GITHUB_WEBHOOK_SECRET_TOKEN
valueFrom:
@@ -116,8 +122,8 @@ spec:
- containerPort: 8000
name: http
protocol: TCP
{{- if not .Values.metrics.proxy.enabled }}
- containerPort: {{ .Values.metrics.port }}
{{- if not .Values.actionsMetrics.proxy.enabled }}
- containerPort: {{ .Values.actionsMetrics.port }}
name: metrics-port
protocol: TCP
{{- end }}
@@ -125,24 +131,24 @@ spec:
{{- toYaml .Values.actionsMetricsServer.resources | nindent 12 }}
securityContext:
{{- toYaml .Values.actionsMetricsServer.securityContext | nindent 12 }}
{{- if .Values.metrics.proxy.enabled }}
{{- if .Values.actionsMetrics.proxy.enabled }}
- args:
- "--secure-listen-address=0.0.0.0:{{ .Values.metrics.port }}"
- "--secure-listen-address=0.0.0.0:{{ .Values.actionsMetrics.port }}"
- "--upstream=http://127.0.0.1:8080/"
- "--logtostderr=true"
- "--v=10"
image: "{{ .Values.metrics.proxy.image.repository }}:{{ .Values.metrics.proxy.image.tag }}"
image: "{{ .Values.actionsMetrics.proxy.image.repository }}:{{ .Values.actionsMetrics.proxy.image.tag }}"
name: kube-rbac-proxy
imagePullPolicy: {{ .Values.image.pullPolicy }}
ports:
- containerPort: {{ .Values.metrics.port }}
- containerPort: {{ .Values.actionsMetrics.port }}
name: metrics-port
resources:
{{- toYaml .Values.resources | nindent 12 }}
securityContext:
{{- toYaml .Values.securityContext | nindent 12 }}
{{- end }}
terminationGracePeriodSeconds: 10
terminationGracePeriodSeconds: {{ .Values.actionsMetricsServer.terminationGracePeriodSeconds }}
{{- with .Values.actionsMetricsServer.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}

View File

@@ -0,0 +1,90 @@
{{- if .Values.actionsMetricsServer.enabled }}
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
creationTimestamp: null
name: {{ include "actions-runner-controller-actions-metrics-server.roleName" . }}
rules:
- apiGroups:
- actions.summerwind.dev
resources:
- horizontalrunnerautoscalers
verbs:
- get
- list
- patch
- update
- watch
- apiGroups:
- actions.summerwind.dev
resources:
- horizontalrunnerautoscalers/finalizers
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- actions.summerwind.dev
resources:
- horizontalrunnerautoscalers/status
verbs:
- get
- patch
- update
- apiGroups:
- actions.summerwind.dev
resources:
- runnersets
verbs:
- get
- list
- watch
- apiGroups:
- actions.summerwind.dev
resources:
- runnerdeployments
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- actions.summerwind.dev
resources:
- runnerdeployments/finalizers
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- actions.summerwind.dev
resources:
- runnerdeployments/status
verbs:
- get
- patch
- update
- apiGroups:
- authentication.k8s.io
resources:
- tokenreviews
verbs:
- create
- apiGroups:
- authorization.k8s.io
resources:
- subjectaccessreviews
verbs:
- create
{{- end }}

View File

@@ -0,0 +1,14 @@
{{- if .Values.actionsMetricsServer.enabled }}
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: {{ include "actions-runner-controller-actions-metrics-server.roleName" . }}
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: {{ include "actions-runner-controller-actions-metrics-server.roleName" . }}
subjects:
- kind: ServiceAccount
name: {{ include "actions-runner-controller-actions-metrics-server.serviceAccountName" . }}
namespace: {{ .Release.Namespace }}
{{- end }}

View File

@@ -5,7 +5,7 @@ metadata:
name: {{ include "actions-runner-controller-actions-metrics-server.fullname" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "actions-runner-controller.labels" . | nindent 4 }}
{{- include "actions-runner-controller-actions-metrics-server.selectorLabels" . | nindent 4 }}
{{- if .Values.actionsMetricsServer.service.annotations }}
annotations:
{{ toYaml .Values.actionsMetricsServer.service.annotations | nindent 4 }}
@@ -16,11 +16,17 @@ spec:
{{ range $_, $port := .Values.actionsMetricsServer.service.ports -}}
- {{ $port | toYaml | nindent 6 }}
{{- end }}
{{- if .Values.metrics.serviceMonitor }}
{{- if .Values.actionsMetrics.serviceMonitor.enable }}
- name: metrics-port
port: {{ .Values.metrics.port }}
port: {{ .Values.actionsMetrics.port }}
targetPort: metrics-port
{{- end }}
selector:
{{- include "actions-runner-controller-actions-metrics-server.selectorLabels" . | nindent 4 }}
{{- if .Values.actionsMetricsServer.service.loadBalancerSourceRanges }}
loadBalancerSourceRanges:
{{- range $ip := .Values.actionsMetricsServer.service.loadBalancerSourceRanges }}
- {{ $ip -}}
{{- end }}
{{- end }}
{{- end }}

View File

@@ -1,10 +1,10 @@
{{- if and .Values.actionsMetricsServer.enabled .Values.actionsMetrics.serviceMonitor }}
{{- if and .Values.actionsMetricsServer.enabled .Values.actionsMetrics.serviceMonitor.enable }}
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
labels:
{{- include "actions-runner-controller.labels" . | nindent 4 }}
{{- with .Values.actionsMetricsServer.serviceMonitorLabels }}
{{- with .Values.actionsMetrics.serviceMonitorLabels }}
{{- toYaml . | nindent 4 }}
{{- end }}
name: {{ include "actions-runner-controller-actions-metrics-server.serviceMonitorName" . }}
@@ -19,6 +19,8 @@ spec:
tlsConfig:
insecureSkipVerify: true
{{- end }}
interval: {{ .Values.actionsMetrics.serviceMonitor.interval }}
scrapeTimeout: {{ .Values.actionsMetrics.serviceMonitor.timeout }}
selector:
matchLabels:
{{- include "actions-runner-controller-actions-metrics-server.selectorLabels" . | nindent 6 }}

View File

@@ -1,4 +1,4 @@
{{- if .Values.metrics.serviceMonitor }}
{{- if .Values.metrics.serviceMonitor.enable }}
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
@@ -19,6 +19,8 @@ spec:
tlsConfig:
insecureSkipVerify: true
{{- end }}
interval: {{ .Values.metrics.serviceMonitor.interval }}
scrapeTimeout: {{ .Values.metrics.serviceMonitor.timeout }}
selector:
matchLabels:
{{- include "actions-runner-controller.selectorLabels" . | nindent 6 }}

View File

@@ -70,6 +70,9 @@ spec:
{{- if .Values.logFormat }}
- "--log-format={{ .Values.logFormat }}"
{{- end }}
{{- if .Values.dockerGID }}
- "--docker-gid={{ .Values.dockerGID }}"
{{- end }}
command:
- "/manager"
env:
@@ -211,3 +214,6 @@ spec:
{{- if .Values.hostNetwork }}
hostNetwork: {{ .Values.hostNetwork }}
{{- end }}
{{- if .Values.dnsPolicy }}
dnsPolicy: {{ .Values.dnsPolicy }}
{{- end }}

View File

@@ -5,7 +5,7 @@ metadata:
name: {{ include "actions-runner-controller-github-webhook-server.fullname" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "actions-runner-controller.labels" . | nindent 4 }}
{{- include "actions-runner-controller-github-webhook-server.selectorLabels" . | nindent 4 }}
{{- if .Values.githubWebhookServer.service.annotations }}
annotations:
{{ toYaml .Values.githubWebhookServer.service.annotations | nindent 4 }}
@@ -16,7 +16,7 @@ spec:
{{ range $_, $port := .Values.githubWebhookServer.service.ports -}}
- {{ $port | toYaml | nindent 6 }}
{{- end }}
{{- if .Values.metrics.serviceMonitor }}
{{- if .Values.metrics.serviceMonitor.enable }}
- name: metrics-port
port: {{ .Values.metrics.port }}
targetPort: metrics-port

View File

@@ -1,4 +1,4 @@
{{- if and .Values.githubWebhookServer.enabled .Values.metrics.serviceMonitor }}
{{- if and .Values.githubWebhookServer.enabled .Values.metrics.serviceMonitor.enable }}
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
@@ -19,6 +19,8 @@ spec:
tlsConfig:
insecureSkipVerify: true
{{- end }}
interval: {{ .Values.metrics.serviceMonitor.interval }}
scrapeTimeout: {{ .Values.metrics.serviceMonitor.timeout }}
selector:
matchLabels:
{{- include "actions-runner-controller-github-webhook-server.selectorLabels" . | nindent 6 }}

View File

@@ -47,6 +47,7 @@ authSecret:
#github_basicauth_username: ""
#github_basicauth_password: ""
# http(s) should be specified for dockerRegistryMirror, e.g.: dockerRegistryMirror="https://<your-docker-registry-mirror>"
dockerRegistryMirror: ""
image:
repository: "summerwind/actions-runner-controller"
@@ -108,7 +109,10 @@ service:
# Metrics service resource
metrics:
serviceAnnotations: {}
serviceMonitor: false
serviceMonitor:
enable: false
timeout: 30s
interval: 1m
serviceMonitorLabels: {}
port: 8443
proxy:
@@ -188,9 +192,17 @@ admissionWebHooks:
# https://github.com/actions/actions-runner-controller/issues/1005#issuecomment-993097155
#hostNetwork: true
# If you use `hostNetwork: true`, then you need dnsPolicy: ClusterFirstWithHostNet
# https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy
#dnsPolicy: ClusterFirst
## specify log format for actions runner controller. Valid options are "text" and "json"
logFormat: text
# enable setting the docker group id for the runner container
# https://github.com/actions/actions-runner-controller/pull/2499
#dockerGID: 121
githubWebhookServer:
enabled: false
replicaCount: 1
@@ -299,7 +311,10 @@ actionsMetrics:
# as a part of the helm release.
# Do note that you also need actionsMetricsServer.enabled=true
# to deploy the actions-metrics-server whose k8s service is referenced by the service monitor.
serviceMonitor: false
serviceMonitor:
enable: false
timeout: 30s
interval: 1m
serviceMonitorLabels: {}
port: 8443
proxy:
@@ -359,6 +374,7 @@ actionsMetricsServer:
protocol: TCP
name: http
#nodePort: someFixedPortForUseWithTerraformCdkCfnEtc
loadBalancerSourceRanges: []
ingress:
enabled: false
ingressClassName: ""
@@ -388,4 +404,5 @@ actionsMetricsServer:
# - secretName: chart-example-tls
# hosts:
# - chart-example.local
terminationGracePeriodSeconds: 10
lifecycle: {}

View File

@@ -15,13 +15,13 @@ type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.4.0
version: 0.5.0
# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to
# follow Semantic Versioning. They should reflect the version the application is using.
# It is recommended to use it with quotes.
appVersion: "0.4.0"
appVersion: "0.5.0"
home: https://github.com/actions/actions-runner-controller

View File

@@ -1,8 +1,14 @@
{{/*
Expand the name of the chart.
*/}}
{{- define "gha-base-name" -}}
gha-rs-controller
{{- end }}
{{- define "gha-runner-scale-set-controller.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }}
{{- default (include "gha-base-name" .) .Values.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
@@ -14,7 +20,7 @@ If release name contains chart name it will be used as a full name.
{{- if .Values.fullnameOverride }}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- $name := default .Chart.Name .Values.nameOverride }}
{{- $name := default (include "gha-base-name" .) .Values.nameOverride }}
{{- if contains $name .Release.Name }}
{{- .Release.Name | trunc 63 | trimSuffix "-" }}
{{- else }}
@@ -27,7 +33,7 @@ If release name contains chart name it will be used as a full name.
Create chart name and version as used by the chart label.
*/}}
{{- define "gha-runner-scale-set-controller.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
{{- printf "%s-%s" (include "gha-base-name" .) .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
@@ -39,7 +45,7 @@ helm.sh/chart: {{ include "gha-runner-scale-set-controller.chart" . }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/part-of: gha-runner-scale-set-controller
app.kubernetes.io/part-of: gha-rs-controller
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- range $k, $v := .Values.labels }}
{{ $k }}: {{ $v }}
@@ -51,6 +57,7 @@ Selector labels
*/}}
{{- define "gha-runner-scale-set-controller.selectorLabels" -}}
app.kubernetes.io/name: {{ include "gha-runner-scale-set-controller.name" . }}
app.kubernetes.io/namespace: {{ .Release.Namespace }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}
@@ -73,35 +80,43 @@ Create the name of the service account to use
{{- end }}
{{- define "gha-runner-scale-set-controller.managerClusterRoleName" -}}
{{- include "gha-runner-scale-set-controller.fullname" . }}-manager-cluster-role
{{- include "gha-runner-scale-set-controller.fullname" . }}
{{- end }}
{{- define "gha-runner-scale-set-controller.managerClusterRoleBinding" -}}
{{- include "gha-runner-scale-set-controller.fullname" . }}-manager-cluster-rolebinding
{{- include "gha-runner-scale-set-controller.fullname" . }}
{{- end }}
{{- define "gha-runner-scale-set-controller.managerSingleNamespaceRoleName" -}}
{{- include "gha-runner-scale-set-controller.fullname" . }}-manager-single-namespace-role
{{- include "gha-runner-scale-set-controller.fullname" . }}-single-namespace
{{- end }}
{{- define "gha-runner-scale-set-controller.managerSingleNamespaceRoleBinding" -}}
{{- include "gha-runner-scale-set-controller.fullname" . }}-manager-single-namespace-rolebinding
{{- include "gha-runner-scale-set-controller.fullname" . }}-single-namespace
{{- end }}
{{- define "gha-runner-scale-set-controller.managerSingleNamespaceWatchRoleName" -}}
{{- include "gha-runner-scale-set-controller.fullname" . }}-single-namespace-watch
{{- end }}
{{- define "gha-runner-scale-set-controller.managerSingleNamespaceWatchRoleBinding" -}}
{{- include "gha-runner-scale-set-controller.fullname" . }}-single-namespace-watch
{{- end }}
{{- define "gha-runner-scale-set-controller.managerListenerRoleName" -}}
{{- include "gha-runner-scale-set-controller.fullname" . }}-manager-listener-role
{{- include "gha-runner-scale-set-controller.fullname" . }}-listener
{{- end }}
{{- define "gha-runner-scale-set-controller.managerListenerRoleBinding" -}}
{{- include "gha-runner-scale-set-controller.fullname" . }}-manager-listener-rolebinding
{{- include "gha-runner-scale-set-controller.fullname" . }}-listener
{{- end }}
{{- define "gha-runner-scale-set-controller.leaderElectionRoleName" -}}
{{- include "gha-runner-scale-set-controller.fullname" . }}-leader-election-role
{{- include "gha-runner-scale-set-controller.fullname" . }}-leader-election
{{- end }}
{{- define "gha-runner-scale-set-controller.leaderElectionRoleBinding" -}}
{{- include "gha-runner-scale-set-controller.fullname" . }}-leader-election-rolebinding
{{- include "gha-runner-scale-set-controller.fullname" . }}-leader-election
{{- end }}
{{- define "gha-runner-scale-set-controller.imagePullSecretsNames" -}}
@@ -111,3 +126,7 @@ Create the name of the service account to use
{{- end }}
{{- $names | join ","}}
{{- end }}
{{- define "gha-runner-scale-set-controller.serviceMonitorName" -}}
{{- include "gha-runner-scale-set-controller.fullname" . }}-service-monitor
{{- end }}

View File

@@ -23,7 +23,7 @@ spec:
{{- toYaml . | nindent 8 }}
{{- end }}
labels:
app.kubernetes.io/part-of: gha-runner-scale-set-controller
app.kubernetes.io/part-of: gha-rs-controller
app.kubernetes.io/component: controller-manager
app.kubernetes.io/version: {{ .Chart.Version }}
{{- include "gha-runner-scale-set-controller.selectorLabels" . | nindent 8 }}
@@ -56,11 +56,34 @@ spec:
{{- with .Values.flags.logLevel }}
- "--log-level={{ . }}"
{{- end }}
{{- with .Values.flags.logFormat }}
- "--log-format={{ . }}"
{{- end }}
{{- with .Values.flags.watchSingleNamespace }}
- "--watch-single-namespace={{ . }}"
{{- end }}
{{- with .Values.flags.updateStrategy }}
- "--update-strategy={{ . }}"
{{- end }}
{{- if .Values.metrics }}
{{- with .Values.metrics }}
- "--listener-metrics-addr={{ .listenerAddr }}"
- "--listener-metrics-endpoint={{ .listenerEndpoint }}"
- "--metrics-addr={{ .controllerManagerAddr }}"
{{- end }}
{{- else }}
- "--listener-metrics-addr=0"
- "--listener-metrics-endpoint="
- "--metrics-addr=0"
{{- end }}
command:
- "/manager"
{{- with .Values.metrics }}
ports:
- containerPort: {{regexReplaceAll ":([0-9]+)" .controllerManagerAddr "${1}"}}
protocol: TCP
name: metrics
{{- end }}
env:
- name: CONTROLLER_MANAGER_CONTAINER_IMAGE
value: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"

View File

@@ -2,7 +2,7 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: {{ include "gha-runner-scale-set-controller.managerSingleNamespaceRoleName" . }}
name: {{ include "gha-runner-scale-set-controller.managerSingleNamespaceWatchRoleName" . }}
namespace: {{ .Values.flags.watchSingleNamespace }}
rules:
- apiGroups:

View File

@@ -2,14 +2,14 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: {{ include "gha-runner-scale-set-controller.managerSingleNamespaceRoleBinding" . }}
name: {{ include "gha-runner-scale-set-controller.managerSingleNamespaceWatchRoleBinding" . }}
namespace: {{ .Values.flags.watchSingleNamespace }}
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: {{ include "gha-runner-scale-set-controller.managerSingleNamespaceRoleName" . }}
name: {{ include "gha-runner-scale-set-controller.managerSingleNamespaceWatchRoleName" . }}
subjects:
- kind: ServiceAccount
name: {{ include "gha-runner-scale-set-controller.serviceAccountName" . }}
namespace: {{ .Release.Namespace }}
{{- end }}
{{- end }}

View File

@@ -1,6 +1,7 @@
package tests
import (
"fmt"
"os"
"path/filepath"
"strings"
@@ -8,6 +9,7 @@ import (
"github.com/gruntwork-io/terratest/modules/helm"
"github.com/gruntwork-io/terratest/modules/k8s"
"github.com/gruntwork-io/terratest/modules/logger"
"github.com/gruntwork-io/terratest/modules/random"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
@@ -33,6 +35,7 @@ func TestTemplate_CreateServiceAccount(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"serviceAccount.create": "true",
"serviceAccount.annotations.foo": "bar",
@@ -46,7 +49,7 @@ func TestTemplate_CreateServiceAccount(t *testing.T) {
helm.UnmarshalK8SYaml(t, output, &serviceAccount)
assert.Equal(t, namespaceName, serviceAccount.Namespace)
assert.Equal(t, "test-arc-gha-runner-scale-set-controller", serviceAccount.Name)
assert.Equal(t, "test-arc-gha-rs-controller", serviceAccount.Name)
assert.Equal(t, "bar", string(serviceAccount.Annotations["foo"]))
}
@@ -61,6 +64,7 @@ func TestTemplate_CreateServiceAccount_OverwriteName(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"serviceAccount.create": "true",
"serviceAccount.name": "overwritten-name",
@@ -90,6 +94,7 @@ func TestTemplate_CreateServiceAccount_CannotUseDefaultServiceAccount(t *testing
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"serviceAccount.create": "true",
"serviceAccount.name": "default",
@@ -113,6 +118,7 @@ func TestTemplate_NotCreateServiceAccount(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"serviceAccount.create": "false",
"serviceAccount.name": "overwritten-name",
@@ -136,6 +142,7 @@ func TestTemplate_NotCreateServiceAccount_ServiceAccountNotSet(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"serviceAccount.create": "false",
"serviceAccount.annotations.foo": "bar",
@@ -158,6 +165,7 @@ func TestTemplate_CreateManagerClusterRole(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{},
KubectlOptions: k8s.NewKubectlOptions("", "", namespaceName),
}
@@ -168,7 +176,7 @@ func TestTemplate_CreateManagerClusterRole(t *testing.T) {
helm.UnmarshalK8SYaml(t, output, &managerClusterRole)
assert.Empty(t, managerClusterRole.Namespace, "ClusterRole should not have a namespace")
assert.Equal(t, "test-arc-gha-runner-scale-set-controller-manager-cluster-role", managerClusterRole.Name)
assert.Equal(t, "test-arc-gha-rs-controller", managerClusterRole.Name)
assert.Equal(t, 16, len(managerClusterRole.Rules))
_, err = helm.RenderTemplateE(t, options, helmChartPath, releaseName, []string{"templates/manager_single_namespace_controller_role.yaml"})
@@ -189,6 +197,7 @@ func TestTemplate_ManagerClusterRoleBinding(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"serviceAccount.create": "true",
},
@@ -201,9 +210,9 @@ func TestTemplate_ManagerClusterRoleBinding(t *testing.T) {
helm.UnmarshalK8SYaml(t, output, &managerClusterRoleBinding)
assert.Empty(t, managerClusterRoleBinding.Namespace, "ClusterRoleBinding should not have a namespace")
assert.Equal(t, "test-arc-gha-runner-scale-set-controller-manager-cluster-rolebinding", managerClusterRoleBinding.Name)
assert.Equal(t, "test-arc-gha-runner-scale-set-controller-manager-cluster-role", managerClusterRoleBinding.RoleRef.Name)
assert.Equal(t, "test-arc-gha-runner-scale-set-controller", managerClusterRoleBinding.Subjects[0].Name)
assert.Equal(t, "test-arc-gha-rs-controller", managerClusterRoleBinding.Name)
assert.Equal(t, "test-arc-gha-rs-controller", managerClusterRoleBinding.RoleRef.Name)
assert.Equal(t, "test-arc-gha-rs-controller", managerClusterRoleBinding.Subjects[0].Name)
assert.Equal(t, namespaceName, managerClusterRoleBinding.Subjects[0].Namespace)
_, err = helm.RenderTemplateE(t, options, helmChartPath, releaseName, []string{"templates/manager_single_namespace_controller_role_binding.yaml"})
@@ -224,6 +233,7 @@ func TestTemplate_CreateManagerListenerRole(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{},
KubectlOptions: k8s.NewKubectlOptions("", "", namespaceName),
}
@@ -234,7 +244,7 @@ func TestTemplate_CreateManagerListenerRole(t *testing.T) {
helm.UnmarshalK8SYaml(t, output, &managerListenerRole)
assert.Equal(t, namespaceName, managerListenerRole.Namespace, "Role should have a namespace")
assert.Equal(t, "test-arc-gha-runner-scale-set-controller-manager-listener-role", managerListenerRole.Name)
assert.Equal(t, "test-arc-gha-rs-controller-listener", managerListenerRole.Name)
assert.Equal(t, 4, len(managerListenerRole.Rules))
assert.Equal(t, "pods", managerListenerRole.Rules[0].Resources[0])
assert.Equal(t, "pods/status", managerListenerRole.Rules[1].Resources[0])
@@ -253,6 +263,7 @@ func TestTemplate_ManagerListenerRoleBinding(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"serviceAccount.create": "true",
},
@@ -265,9 +276,9 @@ func TestTemplate_ManagerListenerRoleBinding(t *testing.T) {
helm.UnmarshalK8SYaml(t, output, &managerListenerRoleBinding)
assert.Equal(t, namespaceName, managerListenerRoleBinding.Namespace, "RoleBinding should have a namespace")
assert.Equal(t, "test-arc-gha-runner-scale-set-controller-manager-listener-rolebinding", managerListenerRoleBinding.Name)
assert.Equal(t, "test-arc-gha-runner-scale-set-controller-manager-listener-role", managerListenerRoleBinding.RoleRef.Name)
assert.Equal(t, "test-arc-gha-runner-scale-set-controller", managerListenerRoleBinding.Subjects[0].Name)
assert.Equal(t, "test-arc-gha-rs-controller-listener", managerListenerRoleBinding.Name)
assert.Equal(t, "test-arc-gha-rs-controller-listener", managerListenerRoleBinding.RoleRef.Name)
assert.Equal(t, "test-arc-gha-rs-controller", managerListenerRoleBinding.Subjects[0].Name)
assert.Equal(t, namespaceName, managerListenerRoleBinding.Subjects[0].Namespace)
}
@@ -289,6 +300,7 @@ func TestTemplate_ControllerDeployment_Defaults(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"image.tag": "dev",
},
@@ -301,29 +313,29 @@ func TestTemplate_ControllerDeployment_Defaults(t *testing.T) {
helm.UnmarshalK8SYaml(t, output, &deployment)
assert.Equal(t, namespaceName, deployment.Namespace)
assert.Equal(t, "test-arc-gha-runner-scale-set-controller", deployment.Name)
assert.Equal(t, "gha-runner-scale-set-controller-"+chart.Version, deployment.Labels["helm.sh/chart"])
assert.Equal(t, "gha-runner-scale-set-controller", deployment.Labels["app.kubernetes.io/name"])
assert.Equal(t, "test-arc-gha-rs-controller", deployment.Name)
assert.Equal(t, "gha-rs-controller-"+chart.Version, deployment.Labels["helm.sh/chart"])
assert.Equal(t, "gha-rs-controller", deployment.Labels["app.kubernetes.io/name"])
assert.Equal(t, "test-arc", deployment.Labels["app.kubernetes.io/instance"])
assert.Equal(t, chart.AppVersion, deployment.Labels["app.kubernetes.io/version"])
assert.Equal(t, "Helm", deployment.Labels["app.kubernetes.io/managed-by"])
assert.Equal(t, namespaceName, deployment.Labels["actions.github.com/controller-service-account-namespace"])
assert.Equal(t, "test-arc-gha-runner-scale-set-controller", deployment.Labels["actions.github.com/controller-service-account-name"])
assert.Equal(t, "test-arc-gha-rs-controller", deployment.Labels["actions.github.com/controller-service-account-name"])
assert.NotContains(t, deployment.Labels, "actions.github.com/controller-watch-single-namespace")
assert.Equal(t, "gha-runner-scale-set-controller", deployment.Labels["app.kubernetes.io/part-of"])
assert.Equal(t, "gha-rs-controller", deployment.Labels["app.kubernetes.io/part-of"])
assert.Equal(t, int32(1), *deployment.Spec.Replicas)
assert.Equal(t, "gha-runner-scale-set-controller", deployment.Spec.Selector.MatchLabels["app.kubernetes.io/name"])
assert.Equal(t, "gha-rs-controller", deployment.Spec.Selector.MatchLabels["app.kubernetes.io/name"])
assert.Equal(t, "test-arc", deployment.Spec.Selector.MatchLabels["app.kubernetes.io/instance"])
assert.Equal(t, "gha-runner-scale-set-controller", deployment.Spec.Template.Labels["app.kubernetes.io/name"])
assert.Equal(t, "gha-rs-controller", deployment.Spec.Template.Labels["app.kubernetes.io/name"])
assert.Equal(t, "test-arc", deployment.Spec.Template.Labels["app.kubernetes.io/instance"])
assert.Equal(t, "manager", deployment.Spec.Template.Annotations["kubectl.kubernetes.io/default-container"])
assert.Len(t, deployment.Spec.Template.Spec.ImagePullSecrets, 0)
assert.Equal(t, "test-arc-gha-runner-scale-set-controller", deployment.Spec.Template.Spec.ServiceAccountName)
assert.Equal(t, "test-arc-gha-rs-controller", deployment.Spec.Template.Spec.ServiceAccountName)
assert.Nil(t, deployment.Spec.Template.Spec.SecurityContext)
assert.Empty(t, deployment.Spec.Template.Spec.PriorityClassName)
assert.Equal(t, int64(10), *deployment.Spec.Template.Spec.TerminationGracePeriodSeconds)
@@ -345,9 +357,16 @@ func TestTemplate_ControllerDeployment_Defaults(t *testing.T) {
assert.Len(t, deployment.Spec.Template.Spec.Containers[0].Command, 1)
assert.Equal(t, "/manager", deployment.Spec.Template.Spec.Containers[0].Command[0])
assert.Len(t, deployment.Spec.Template.Spec.Containers[0].Args, 2)
assert.Equal(t, "--auto-scaling-runner-set-only", deployment.Spec.Template.Spec.Containers[0].Args[0])
assert.Equal(t, "--log-level=debug", deployment.Spec.Template.Spec.Containers[0].Args[1])
expectedArgs := []string{
"--auto-scaling-runner-set-only",
"--log-level=debug",
"--log-format=text",
"--update-strategy=immediate",
"--metrics-addr=0",
"--listener-metrics-addr=0",
"--listener-metrics-endpoint=",
}
assert.ElementsMatch(t, expectedArgs, deployment.Spec.Template.Spec.Containers[0].Args)
assert.Len(t, deployment.Spec.Template.Spec.Containers[0].Env, 3)
assert.Equal(t, "CONTROLLER_MANAGER_CONTAINER_IMAGE", deployment.Spec.Template.Spec.Containers[0].Env[0].Name)
@@ -384,6 +403,7 @@ func TestTemplate_ControllerDeployment_Customize(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"labels.foo": "bar",
"labels.github": "actions",
@@ -391,11 +411,11 @@ func TestTemplate_ControllerDeployment_Customize(t *testing.T) {
"image.pullPolicy": "Always",
"image.tag": "dev",
"imagePullSecrets[0].name": "dockerhub",
"nameOverride": "gha-runner-scale-set-controller-override",
"fullnameOverride": "gha-runner-scale-set-controller-fullname-override",
"nameOverride": "gha-rs-controller-override",
"fullnameOverride": "gha-rs-controller-fullname-override",
"env[0].name": "ENV_VAR_NAME_1",
"env[0].value": "ENV_VAR_VALUE_1",
"serviceAccount.name": "gha-runner-scale-set-controller-sa",
"serviceAccount.name": "gha-rs-controller-sa",
"podAnnotations.foo": "bar",
"podSecurityContext.fsGroup": "1000",
"securityContext.runAsUser": "1000",
@@ -405,7 +425,10 @@ func TestTemplate_ControllerDeployment_Customize(t *testing.T) {
"tolerations[0].key": "foo",
"affinity.nodeAffinity.requiredDuringSchedulingIgnoredDuringExecution.nodeSelectorTerms[0].matchExpressions[0].key": "foo",
"affinity.nodeAffinity.requiredDuringSchedulingIgnoredDuringExecution.nodeSelectorTerms[0].matchExpressions[0].operator": "bar",
"priorityClassName": "test-priority-class",
"priorityClassName": "test-priority-class",
"flags.updateStrategy": "eventual",
"flags.logLevel": "info",
"flags.logFormat": "json",
},
KubectlOptions: k8s.NewKubectlOptions("", "", namespaceName),
}
@@ -416,22 +439,22 @@ func TestTemplate_ControllerDeployment_Customize(t *testing.T) {
helm.UnmarshalK8SYaml(t, output, &deployment)
assert.Equal(t, namespaceName, deployment.Namespace)
assert.Equal(t, "gha-runner-scale-set-controller-fullname-override", deployment.Name)
assert.Equal(t, "gha-runner-scale-set-controller-"+chart.Version, deployment.Labels["helm.sh/chart"])
assert.Equal(t, "gha-runner-scale-set-controller-override", deployment.Labels["app.kubernetes.io/name"])
assert.Equal(t, "gha-rs-controller-fullname-override", deployment.Name)
assert.Equal(t, "gha-rs-controller-"+chart.Version, deployment.Labels["helm.sh/chart"])
assert.Equal(t, "gha-rs-controller-override", deployment.Labels["app.kubernetes.io/name"])
assert.Equal(t, "test-arc", deployment.Labels["app.kubernetes.io/instance"])
assert.Equal(t, chart.AppVersion, deployment.Labels["app.kubernetes.io/version"])
assert.Equal(t, "Helm", deployment.Labels["app.kubernetes.io/managed-by"])
assert.Equal(t, "gha-runner-scale-set-controller", deployment.Labels["app.kubernetes.io/part-of"])
assert.Equal(t, "gha-rs-controller", deployment.Labels["app.kubernetes.io/part-of"])
assert.Equal(t, "bar", deployment.Labels["foo"])
assert.Equal(t, "actions", deployment.Labels["github"])
assert.Equal(t, int32(1), *deployment.Spec.Replicas)
assert.Equal(t, "gha-runner-scale-set-controller-override", deployment.Spec.Selector.MatchLabels["app.kubernetes.io/name"])
assert.Equal(t, "gha-rs-controller-override", deployment.Spec.Selector.MatchLabels["app.kubernetes.io/name"])
assert.Equal(t, "test-arc", deployment.Spec.Selector.MatchLabels["app.kubernetes.io/instance"])
assert.Equal(t, "gha-runner-scale-set-controller-override", deployment.Spec.Template.Labels["app.kubernetes.io/name"])
assert.Equal(t, "gha-rs-controller-override", deployment.Spec.Template.Labels["app.kubernetes.io/name"])
assert.Equal(t, "test-arc", deployment.Spec.Template.Labels["app.kubernetes.io/instance"])
assert.Equal(t, "bar", deployment.Spec.Template.Annotations["foo"])
@@ -442,7 +465,7 @@ func TestTemplate_ControllerDeployment_Customize(t *testing.T) {
assert.Len(t, deployment.Spec.Template.Spec.ImagePullSecrets, 1)
assert.Equal(t, "dockerhub", deployment.Spec.Template.Spec.ImagePullSecrets[0].Name)
assert.Equal(t, "gha-runner-scale-set-controller-sa", deployment.Spec.Template.Spec.ServiceAccountName)
assert.Equal(t, "gha-rs-controller-sa", deployment.Spec.Template.Spec.ServiceAccountName)
assert.Equal(t, int64(1000), *deployment.Spec.Template.Spec.SecurityContext.FSGroup)
assert.Equal(t, "test-priority-class", deployment.Spec.Template.Spec.PriorityClassName)
assert.Equal(t, int64(10), *deployment.Spec.Template.Spec.TerminationGracePeriodSeconds)
@@ -470,10 +493,18 @@ func TestTemplate_ControllerDeployment_Customize(t *testing.T) {
assert.Len(t, deployment.Spec.Template.Spec.Containers[0].Command, 1)
assert.Equal(t, "/manager", deployment.Spec.Template.Spec.Containers[0].Command[0])
assert.Len(t, deployment.Spec.Template.Spec.Containers[0].Args, 3)
assert.Equal(t, "--auto-scaling-runner-set-only", deployment.Spec.Template.Spec.Containers[0].Args[0])
assert.Equal(t, "--auto-scaler-image-pull-secrets=dockerhub", deployment.Spec.Template.Spec.Containers[0].Args[1])
assert.Equal(t, "--log-level=debug", deployment.Spec.Template.Spec.Containers[0].Args[2])
expectArgs := []string{
"--auto-scaling-runner-set-only",
"--auto-scaler-image-pull-secrets=dockerhub",
"--log-level=info",
"--log-format=json",
"--update-strategy=eventual",
"--listener-metrics-addr=0",
"--listener-metrics-endpoint=",
"--metrics-addr=0",
}
assert.ElementsMatch(t, expectArgs, deployment.Spec.Template.Spec.Containers[0].Args)
assert.Len(t, deployment.Spec.Template.Spec.Containers[0].Env, 4)
assert.Equal(t, "CONTROLLER_MANAGER_CONTAINER_IMAGE", deployment.Spec.Template.Spec.Containers[0].Env[0].Name)
@@ -508,6 +539,7 @@ func TestTemplate_EnableLeaderElectionRole(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"replicaCount": "2",
},
@@ -519,7 +551,7 @@ func TestTemplate_EnableLeaderElectionRole(t *testing.T) {
var leaderRole rbacv1.Role
helm.UnmarshalK8SYaml(t, output, &leaderRole)
assert.Equal(t, "test-arc-gha-runner-scale-set-controller-leader-election-role", leaderRole.Name)
assert.Equal(t, "test-arc-gha-rs-controller-leader-election", leaderRole.Name)
assert.Equal(t, namespaceName, leaderRole.Namespace)
}
@@ -534,6 +566,7 @@ func TestTemplate_EnableLeaderElectionRoleBinding(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"replicaCount": "2",
},
@@ -545,10 +578,10 @@ func TestTemplate_EnableLeaderElectionRoleBinding(t *testing.T) {
var leaderRoleBinding rbacv1.RoleBinding
helm.UnmarshalK8SYaml(t, output, &leaderRoleBinding)
assert.Equal(t, "test-arc-gha-runner-scale-set-controller-leader-election-rolebinding", leaderRoleBinding.Name)
assert.Equal(t, "test-arc-gha-rs-controller-leader-election", leaderRoleBinding.Name)
assert.Equal(t, namespaceName, leaderRoleBinding.Namespace)
assert.Equal(t, "test-arc-gha-runner-scale-set-controller-leader-election-role", leaderRoleBinding.RoleRef.Name)
assert.Equal(t, "test-arc-gha-runner-scale-set-controller", leaderRoleBinding.Subjects[0].Name)
assert.Equal(t, "test-arc-gha-rs-controller-leader-election", leaderRoleBinding.RoleRef.Name)
assert.Equal(t, "test-arc-gha-rs-controller", leaderRoleBinding.Subjects[0].Name)
}
func TestTemplate_EnableLeaderElection(t *testing.T) {
@@ -562,6 +595,7 @@ func TestTemplate_EnableLeaderElection(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"replicaCount": "2",
"image.tag": "dev",
@@ -575,7 +609,7 @@ func TestTemplate_EnableLeaderElection(t *testing.T) {
helm.UnmarshalK8SYaml(t, output, &deployment)
assert.Equal(t, namespaceName, deployment.Namespace)
assert.Equal(t, "test-arc-gha-runner-scale-set-controller", deployment.Name)
assert.Equal(t, "test-arc-gha-rs-controller", deployment.Name)
assert.Equal(t, int32(2), *deployment.Spec.Replicas)
@@ -587,11 +621,19 @@ func TestTemplate_EnableLeaderElection(t *testing.T) {
assert.Len(t, deployment.Spec.Template.Spec.Containers[0].Command, 1)
assert.Equal(t, "/manager", deployment.Spec.Template.Spec.Containers[0].Command[0])
assert.Len(t, deployment.Spec.Template.Spec.Containers[0].Args, 4)
assert.Equal(t, "--auto-scaling-runner-set-only", deployment.Spec.Template.Spec.Containers[0].Args[0])
assert.Equal(t, "--enable-leader-election", deployment.Spec.Template.Spec.Containers[0].Args[1])
assert.Equal(t, "--leader-election-id=test-arc-gha-runner-scale-set-controller", deployment.Spec.Template.Spec.Containers[0].Args[2])
assert.Equal(t, "--log-level=debug", deployment.Spec.Template.Spec.Containers[0].Args[3])
expectedArgs := []string{
"--auto-scaling-runner-set-only",
"--enable-leader-election",
"--leader-election-id=test-arc-gha-rs-controller",
"--log-level=debug",
"--log-format=text",
"--update-strategy=immediate",
"--listener-metrics-addr=0",
"--listener-metrics-endpoint=",
"--metrics-addr=0",
}
assert.ElementsMatch(t, expectedArgs, deployment.Spec.Template.Spec.Containers[0].Args)
}
func TestTemplate_ControllerDeployment_ForwardImagePullSecrets(t *testing.T) {
@@ -605,6 +647,7 @@ func TestTemplate_ControllerDeployment_ForwardImagePullSecrets(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"imagePullSecrets[0].name": "dockerhub",
"imagePullSecrets[1].name": "ghcr",
@@ -619,10 +662,18 @@ func TestTemplate_ControllerDeployment_ForwardImagePullSecrets(t *testing.T) {
assert.Equal(t, namespaceName, deployment.Namespace)
assert.Len(t, deployment.Spec.Template.Spec.Containers[0].Args, 3)
assert.Equal(t, "--auto-scaling-runner-set-only", deployment.Spec.Template.Spec.Containers[0].Args[0])
assert.Equal(t, "--auto-scaler-image-pull-secrets=dockerhub,ghcr", deployment.Spec.Template.Spec.Containers[0].Args[1])
assert.Equal(t, "--log-level=debug", deployment.Spec.Template.Spec.Containers[0].Args[2])
expectedArgs := []string{
"--auto-scaling-runner-set-only",
"--auto-scaler-image-pull-secrets=dockerhub,ghcr",
"--log-level=debug",
"--log-format=text",
"--update-strategy=immediate",
"--listener-metrics-addr=0",
"--listener-metrics-endpoint=",
"--metrics-addr=0",
}
assert.ElementsMatch(t, expectedArgs, deployment.Spec.Template.Spec.Containers[0].Args)
}
func TestTemplate_ControllerDeployment_WatchSingleNamespace(t *testing.T) {
@@ -643,6 +694,7 @@ func TestTemplate_ControllerDeployment_WatchSingleNamespace(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"image.tag": "dev",
"flags.watchSingleNamespace": "demo",
@@ -656,28 +708,28 @@ func TestTemplate_ControllerDeployment_WatchSingleNamespace(t *testing.T) {
helm.UnmarshalK8SYaml(t, output, &deployment)
assert.Equal(t, namespaceName, deployment.Namespace)
assert.Equal(t, "test-arc-gha-runner-scale-set-controller", deployment.Name)
assert.Equal(t, "gha-runner-scale-set-controller-"+chart.Version, deployment.Labels["helm.sh/chart"])
assert.Equal(t, "gha-runner-scale-set-controller", deployment.Labels["app.kubernetes.io/name"])
assert.Equal(t, "test-arc-gha-rs-controller", deployment.Name)
assert.Equal(t, "gha-rs-controller-"+chart.Version, deployment.Labels["helm.sh/chart"])
assert.Equal(t, "gha-rs-controller", deployment.Labels["app.kubernetes.io/name"])
assert.Equal(t, "test-arc", deployment.Labels["app.kubernetes.io/instance"])
assert.Equal(t, chart.AppVersion, deployment.Labels["app.kubernetes.io/version"])
assert.Equal(t, "Helm", deployment.Labels["app.kubernetes.io/managed-by"])
assert.Equal(t, namespaceName, deployment.Labels["actions.github.com/controller-service-account-namespace"])
assert.Equal(t, "test-arc-gha-runner-scale-set-controller", deployment.Labels["actions.github.com/controller-service-account-name"])
assert.Equal(t, "test-arc-gha-rs-controller", deployment.Labels["actions.github.com/controller-service-account-name"])
assert.Equal(t, "demo", deployment.Labels["actions.github.com/controller-watch-single-namespace"])
assert.Equal(t, int32(1), *deployment.Spec.Replicas)
assert.Equal(t, "gha-runner-scale-set-controller", deployment.Spec.Selector.MatchLabels["app.kubernetes.io/name"])
assert.Equal(t, "gha-rs-controller", deployment.Spec.Selector.MatchLabels["app.kubernetes.io/name"])
assert.Equal(t, "test-arc", deployment.Spec.Selector.MatchLabels["app.kubernetes.io/instance"])
assert.Equal(t, "gha-runner-scale-set-controller", deployment.Spec.Template.Labels["app.kubernetes.io/name"])
assert.Equal(t, "gha-rs-controller", deployment.Spec.Template.Labels["app.kubernetes.io/name"])
assert.Equal(t, "test-arc", deployment.Spec.Template.Labels["app.kubernetes.io/instance"])
assert.Equal(t, "manager", deployment.Spec.Template.Annotations["kubectl.kubernetes.io/default-container"])
assert.Len(t, deployment.Spec.Template.Spec.ImagePullSecrets, 0)
assert.Equal(t, "test-arc-gha-runner-scale-set-controller", deployment.Spec.Template.Spec.ServiceAccountName)
assert.Equal(t, "test-arc-gha-rs-controller", deployment.Spec.Template.Spec.ServiceAccountName)
assert.Nil(t, deployment.Spec.Template.Spec.SecurityContext)
assert.Empty(t, deployment.Spec.Template.Spec.PriorityClassName)
assert.Equal(t, int64(10), *deployment.Spec.Template.Spec.TerminationGracePeriodSeconds)
@@ -699,10 +751,18 @@ func TestTemplate_ControllerDeployment_WatchSingleNamespace(t *testing.T) {
assert.Len(t, deployment.Spec.Template.Spec.Containers[0].Command, 1)
assert.Equal(t, "/manager", deployment.Spec.Template.Spec.Containers[0].Command[0])
assert.Len(t, deployment.Spec.Template.Spec.Containers[0].Args, 3)
assert.Equal(t, "--auto-scaling-runner-set-only", deployment.Spec.Template.Spec.Containers[0].Args[0])
assert.Equal(t, "--log-level=debug", deployment.Spec.Template.Spec.Containers[0].Args[1])
assert.Equal(t, "--watch-single-namespace=demo", deployment.Spec.Template.Spec.Containers[0].Args[2])
expectedArgs := []string{
"--auto-scaling-runner-set-only",
"--log-level=debug",
"--log-format=text",
"--watch-single-namespace=demo",
"--update-strategy=immediate",
"--listener-metrics-addr=0",
"--listener-metrics-endpoint=",
"--metrics-addr=0",
}
assert.ElementsMatch(t, expectedArgs, deployment.Spec.Template.Spec.Containers[0].Args)
assert.Len(t, deployment.Spec.Template.Spec.Containers[0].Env, 3)
assert.Equal(t, "CONTROLLER_MANAGER_CONTAINER_IMAGE", deployment.Spec.Template.Spec.Containers[0].Env[0].Name)
@@ -732,6 +792,7 @@ func TestTemplate_ControllerContainerEnvironmentVariables(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"env[0].Name": "ENV_VAR_NAME_1",
"env[0].Value": "ENV_VAR_VALUE_1",
@@ -752,7 +813,7 @@ func TestTemplate_ControllerContainerEnvironmentVariables(t *testing.T) {
helm.UnmarshalK8SYaml(t, output, &deployment)
assert.Equal(t, namespaceName, deployment.Namespace)
assert.Equal(t, "test-arc-gha-runner-scale-set-controller", deployment.Name)
assert.Equal(t, "test-arc-gha-rs-controller", deployment.Name)
assert.Len(t, deployment.Spec.Template.Spec.Containers[0].Env, 7)
assert.Equal(t, "ENV_VAR_NAME_1", deployment.Spec.Template.Spec.Containers[0].Env[3].Name)
@@ -778,6 +839,7 @@ func TestTemplate_WatchSingleNamespace_NotCreateManagerClusterRole(t *testing.T)
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"flags.watchSingleNamespace": "demo",
},
@@ -799,6 +861,7 @@ func TestTemplate_WatchSingleNamespace_NotManagerClusterRoleBinding(t *testing.T
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"serviceAccount.create": "true",
"flags.watchSingleNamespace": "demo",
@@ -821,6 +884,7 @@ func TestTemplate_CreateManagerSingleNamespaceRole(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"flags.watchSingleNamespace": "demo",
},
@@ -832,7 +896,7 @@ func TestTemplate_CreateManagerSingleNamespaceRole(t *testing.T) {
var managerSingleNamespaceControllerRole rbacv1.Role
helm.UnmarshalK8SYaml(t, output, &managerSingleNamespaceControllerRole)
assert.Equal(t, "test-arc-gha-runner-scale-set-controller-manager-single-namespace-role", managerSingleNamespaceControllerRole.Name)
assert.Equal(t, "test-arc-gha-rs-controller-single-namespace", managerSingleNamespaceControllerRole.Name)
assert.Equal(t, namespaceName, managerSingleNamespaceControllerRole.Namespace)
assert.Equal(t, 10, len(managerSingleNamespaceControllerRole.Rules))
@@ -841,7 +905,7 @@ func TestTemplate_CreateManagerSingleNamespaceRole(t *testing.T) {
var managerSingleNamespaceWatchRole rbacv1.Role
helm.UnmarshalK8SYaml(t, output, &managerSingleNamespaceWatchRole)
assert.Equal(t, "test-arc-gha-runner-scale-set-controller-manager-single-namespace-role", managerSingleNamespaceWatchRole.Name)
assert.Equal(t, "test-arc-gha-rs-controller-single-namespace-watch", managerSingleNamespaceWatchRole.Name)
assert.Equal(t, "demo", managerSingleNamespaceWatchRole.Namespace)
assert.Equal(t, 14, len(managerSingleNamespaceWatchRole.Rules))
}
@@ -857,6 +921,7 @@ func TestTemplate_ManagerSingleNamespaceRoleBinding(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"flags.watchSingleNamespace": "demo",
},
@@ -868,10 +933,10 @@ func TestTemplate_ManagerSingleNamespaceRoleBinding(t *testing.T) {
var managerSingleNamespaceControllerRoleBinding rbacv1.RoleBinding
helm.UnmarshalK8SYaml(t, output, &managerSingleNamespaceControllerRoleBinding)
assert.Equal(t, "test-arc-gha-runner-scale-set-controller-manager-single-namespace-rolebinding", managerSingleNamespaceControllerRoleBinding.Name)
assert.Equal(t, "test-arc-gha-rs-controller-single-namespace", managerSingleNamespaceControllerRoleBinding.Name)
assert.Equal(t, namespaceName, managerSingleNamespaceControllerRoleBinding.Namespace)
assert.Equal(t, "test-arc-gha-runner-scale-set-controller-manager-single-namespace-role", managerSingleNamespaceControllerRoleBinding.RoleRef.Name)
assert.Equal(t, "test-arc-gha-runner-scale-set-controller", managerSingleNamespaceControllerRoleBinding.Subjects[0].Name)
assert.Equal(t, "test-arc-gha-rs-controller-single-namespace", managerSingleNamespaceControllerRoleBinding.RoleRef.Name)
assert.Equal(t, "test-arc-gha-rs-controller", managerSingleNamespaceControllerRoleBinding.Subjects[0].Name)
assert.Equal(t, namespaceName, managerSingleNamespaceControllerRoleBinding.Subjects[0].Namespace)
output = helm.RenderTemplate(t, options, helmChartPath, releaseName, []string{"templates/manager_single_namespace_watch_role_binding.yaml"})
@@ -879,9 +944,81 @@ func TestTemplate_ManagerSingleNamespaceRoleBinding(t *testing.T) {
var managerSingleNamespaceWatchRoleBinding rbacv1.RoleBinding
helm.UnmarshalK8SYaml(t, output, &managerSingleNamespaceWatchRoleBinding)
assert.Equal(t, "test-arc-gha-runner-scale-set-controller-manager-single-namespace-rolebinding", managerSingleNamespaceWatchRoleBinding.Name)
assert.Equal(t, "test-arc-gha-rs-controller-single-namespace-watch", managerSingleNamespaceWatchRoleBinding.Name)
assert.Equal(t, "demo", managerSingleNamespaceWatchRoleBinding.Namespace)
assert.Equal(t, "test-arc-gha-runner-scale-set-controller-manager-single-namespace-role", managerSingleNamespaceWatchRoleBinding.RoleRef.Name)
assert.Equal(t, "test-arc-gha-runner-scale-set-controller", managerSingleNamespaceWatchRoleBinding.Subjects[0].Name)
assert.Equal(t, "test-arc-gha-rs-controller-single-namespace-watch", managerSingleNamespaceWatchRoleBinding.RoleRef.Name)
assert.Equal(t, "test-arc-gha-rs-controller", managerSingleNamespaceWatchRoleBinding.Subjects[0].Name)
assert.Equal(t, namespaceName, managerSingleNamespaceWatchRoleBinding.Subjects[0].Namespace)
}
func TestControllerDeployment_MetricsPorts(t *testing.T) {
t.Parallel()
// Path to the helm chart we will test
helmChartPath, err := filepath.Abs("../../gha-runner-scale-set-controller")
require.NoError(t, err)
chartContent, err := os.ReadFile(filepath.Join(helmChartPath, "Chart.yaml"))
require.NoError(t, err)
chart := new(Chart)
err = yaml.Unmarshal(chartContent, chart)
require.NoError(t, err)
releaseName := "test-arc"
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"image.tag": "dev",
"metrics.controllerManagerAddr": ":8080",
"metrics.listenerAddr": ":8081",
"metrics.listenerEndpoint": "/metrics",
},
KubectlOptions: k8s.NewKubectlOptions("", "", namespaceName),
}
output := helm.RenderTemplate(t, options, helmChartPath, releaseName, []string{"templates/deployment.yaml"})
var deployment appsv1.Deployment
helm.UnmarshalK8SYaml(t, output, &deployment)
require.Len(t, deployment.Spec.Template.Spec.Containers, 1, "Expected one container")
container := deployment.Spec.Template.Spec.Containers[0]
assert.Len(t, container.Ports, 1)
port := container.Ports[0]
assert.Equal(t, corev1.Protocol("TCP"), port.Protocol)
assert.Equal(t, int32(8080), port.ContainerPort)
metricsFlags := map[string]*struct {
expect string
frequency int
}{
"--listener-metrics-addr": {
expect: ":8081",
},
"--listener-metrics-endpoint": {
expect: "/metrics",
},
"--metrics-addr": {
expect: ":8080",
},
}
for _, cmd := range container.Args {
s := strings.Split(cmd, "=")
if len(s) != 2 {
continue
}
flag, ok := metricsFlags[s[0]]
if !ok {
continue
}
flag.frequency++
assert.Equal(t, flag.expect, s[1])
}
for key, value := range metricsFlags {
assert.Equal(t, value.frequency, 1, fmt.Sprintf("frequency of %q is not 1", key))
}
}

View File

@@ -76,10 +76,40 @@ affinity: {}
priorityClassName: ""
flags:
# Log level can be set here with one of the following values: "debug", "info", "warn", "error".
# Defaults to "debug".
## Log level can be set here with one of the following values: "debug", "info", "warn", "error".
## Defaults to "debug".
logLevel: "debug"
## Log format can be set with one of the following values: "text", "json"
## Defaults to "text"
logFormat: "text"
## Restricts the controller to only watch resources in the desired namespace.
## Defaults to watch all namespaces when unset.
# watchSingleNamespace: ""
## Defines how the controller should handle upgrades while having running jobs.
##
## The srategies available are:
## - "immediate": (default) The controller will immediately apply the change causing the
## recreation of the listener and ephemeral runner set. This can lead to an
## overprovisioning of runners, if there are pending / running jobs. This should not
## be a problem at a small scale, but it could lead to a significant increase of
## resources if you have a lot of jobs running concurrently.
##
## - "eventual": The controller will remove the listener and ephemeral runner set
## immediately, but will not recreate them (to apply changes) until all
## pending / running jobs have completed.
## This can lead to a longer time to apply the change but it will ensure
## that you don't have any overprovisioning of runners.
updateStrategy: "immediate"
## If `metrics:` object is not provided, or commented out, the following flags
## will be applied the controller-manager and listener pods with empty values:
## `--metrics-addr`, `--listener-metrics-addr`, `--listener-metrics-endpoint`.
## This will disable metrics.
##
## To enable metrics, uncomment the following lines.
# metrics:
# controllerManagerAddr: ":8080"
# listenerAddr: ":8080"
# listenerEndpoint: "/metrics"

View File

@@ -15,13 +15,13 @@ type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.4.0
version: 0.5.0
# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to
# follow Semantic Versioning. They should reflect the version the application is using.
# It is recommended to use it with quotes.
appVersion: "0.4.0"
appVersion: "0.5.0"
home: https://github.com/actions/dev-arc

View File

@@ -1,8 +1,13 @@
{{/*
Expand the name of the chart.
*/}}
{{- define "gha-base-name" -}}
gha-rs
{{- end }}
{{- define "gha-runner-scale-set.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }}
{{- default (include "gha-base-name" .) .Values.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
@@ -11,7 +16,7 @@ We truncate at 63 chars because some Kubernetes name fields are limited to this
If release name contains chart name it will be used as a full name.
*/}}
{{- define "gha-runner-scale-set.fullname" -}}
{{- $name := default .Chart.Name }}
{{- $name := default (include "gha-base-name" .) }}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }}
{{- end }}
@@ -19,7 +24,7 @@ If release name contains chart name it will be used as a full name.
Create chart name and version as used by the chart label.
*/}}
{{- define "gha-runner-scale-set.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
{{- printf "%s-%s" (include "gha-base-name" .) .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
@@ -32,7 +37,7 @@ helm.sh/chart: {{ include "gha-runner-scale-set.chart" . }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
app.kubernetes.io/part-of: gha-runner-scale-set
app.kubernetes.io/part-of: gha-rs
actions.github.com/scale-set-name: {{ .Release.Name }}
actions.github.com/scale-set-namespace: {{ .Release.Namespace }}
{{- end }}
@@ -58,19 +63,19 @@ app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}
{{- define "gha-runner-scale-set.noPermissionServiceAccountName" -}}
{{- include "gha-runner-scale-set.fullname" . }}-no-permission-service-account
{{- include "gha-runner-scale-set.fullname" . }}-no-permission
{{- end }}
{{- define "gha-runner-scale-set.kubeModeRoleName" -}}
{{- include "gha-runner-scale-set.fullname" . }}-kube-mode-role
{{- include "gha-runner-scale-set.fullname" . }}-kube-mode
{{- end }}
{{- define "gha-runner-scale-set.kubeModeRoleBindingName" -}}
{{- include "gha-runner-scale-set.fullname" . }}-kube-mode-role-binding
{{- include "gha-runner-scale-set.fullname" . }}-kube-mode
{{- end }}
{{- define "gha-runner-scale-set.kubeModeServiceAccountName" -}}
{{- include "gha-runner-scale-set.fullname" . }}-kube-mode-service-account
{{- include "gha-runner-scale-set.fullname" . }}-kube-mode
{{- end }}
{{- define "gha-runner-scale-set.dind-init-container" -}}
@@ -428,11 +433,11 @@ volumeMounts:
{{- end }}
{{- define "gha-runner-scale-set.managerRoleName" -}}
{{- include "gha-runner-scale-set.fullname" . }}-manager-role
{{- include "gha-runner-scale-set.fullname" . }}-manager
{{- end }}
{{- define "gha-runner-scale-set.managerRoleBindingName" -}}
{{- include "gha-runner-scale-set.fullname" . }}-manager-role-binding
{{- include "gha-runner-scale-set.fullname" . }}-manager
{{- end }}
{{- define "gha-runner-scale-set.managerServiceAccountName" -}}
@@ -451,7 +456,7 @@ volumeMounts:
{{- $managerServiceAccountName := "" }}
{{- range $index, $deployment := (lookup "apps/v1" "Deployment" "" "").items }}
{{- if kindIs "map" $deployment.metadata.labels }}
{{- if eq (get $deployment.metadata.labels "app.kubernetes.io/part-of") "gha-runner-scale-set-controller" }}
{{- if eq (get $deployment.metadata.labels "app.kubernetes.io/part-of") "gha-rs-controller" }}
{{- if hasKey $deployment.metadata.labels "actions.github.com/controller-watch-single-namespace" }}
{{- $singleNamespaceCounter = add $singleNamespaceCounter 1 }}
{{- $_ := set $singleNamespaceControllerDeployments (get $deployment.metadata.labels "actions.github.com/controller-watch-single-namespace") $deployment}}
@@ -463,13 +468,13 @@ volumeMounts:
{{- end }}
{{- end }}
{{- if and (eq $multiNamespacesCounter 0) (eq $singleNamespaceCounter 0) }}
{{- fail "No gha-runner-scale-set-controller deployment found using label (app.kubernetes.io/part-of=gha-runner-scale-set-controller). Consider setting controllerServiceAccount.name in values.yaml to be explicit if you think the discovery is wrong." }}
{{- fail "No gha-rs-controller deployment found using label (app.kubernetes.io/part-of=gha-rs-controller). Consider setting controllerServiceAccount.name in values.yaml to be explicit if you think the discovery is wrong." }}
{{- end }}
{{- if and (gt $multiNamespacesCounter 0) (gt $singleNamespaceCounter 0) }}
{{- fail "Found both gha-runner-scale-set-controller installed with flags.watchSingleNamespace set and unset in cluster, this is not supported. Consider setting controllerServiceAccount.name in values.yaml to be explicit if you think the discovery is wrong." }}
{{- fail "Found both gha-rs-controller installed with flags.watchSingleNamespace set and unset in cluster, this is not supported. Consider setting controllerServiceAccount.name in values.yaml to be explicit if you think the discovery is wrong." }}
{{- end }}
{{- if gt $multiNamespacesCounter 1 }}
{{- fail "More than one gha-runner-scale-set-controller deployment found using label (app.kubernetes.io/part-of=gha-runner-scale-set-controller). Consider setting controllerServiceAccount.name in values.yaml to be explicit if you think the discovery is wrong." }}
{{- fail "More than one gha-rs-controller deployment found using label (app.kubernetes.io/part-of=gha-rs-controller). Consider setting controllerServiceAccount.name in values.yaml to be explicit if you think the discovery is wrong." }}
{{- end }}
{{- if eq $multiNamespacesCounter 1 }}
{{- with $controllerDeployment.metadata }}
@@ -482,11 +487,11 @@ volumeMounts:
{{- $managerServiceAccountName = (get $controllerDeployment.metadata.labels "actions.github.com/controller-service-account-name") }}
{{- end }}
{{- else }}
{{- fail "No gha-runner-scale-set-controller deployment that watch this namespace found using label (actions.github.com/controller-watch-single-namespace). Consider setting controllerServiceAccount.name in values.yaml to be explicit if you think the discovery is wrong." }}
{{- fail "No gha-rs-controller deployment that watch this namespace found using label (actions.github.com/controller-watch-single-namespace). Consider setting controllerServiceAccount.name in values.yaml to be explicit if you think the discovery is wrong." }}
{{- end }}
{{- end }}
{{- if eq $managerServiceAccountName "" }}
{{- fail "No service account name found for gha-runner-scale-set-controller deployment using label (actions.github.com/controller-service-account-name), consider setting controllerServiceAccount.name in values.yaml to be explicit if you think the discovery is wrong." }}
{{- fail "No service account name found for gha-rs-controller deployment using label (actions.github.com/controller-service-account-name), consider setting controllerServiceAccount.name in values.yaml to be explicit if you think the discovery is wrong." }}
{{- end }}
{{- $managerServiceAccountName }}
{{- end }}
@@ -508,7 +513,7 @@ volumeMounts:
{{- $managerServiceAccountNamespace := "" }}
{{- range $index, $deployment := (lookup "apps/v1" "Deployment" "" "").items }}
{{- if kindIs "map" $deployment.metadata.labels }}
{{- if eq (get $deployment.metadata.labels "app.kubernetes.io/part-of") "gha-runner-scale-set-controller" }}
{{- if eq (get $deployment.metadata.labels "app.kubernetes.io/part-of") "gha-rs-controller" }}
{{- if hasKey $deployment.metadata.labels "actions.github.com/controller-watch-single-namespace" }}
{{- $singleNamespaceCounter = add $singleNamespaceCounter 1 }}
{{- $_ := set $singleNamespaceControllerDeployments (get $deployment.metadata.labels "actions.github.com/controller-watch-single-namespace") $deployment}}
@@ -520,13 +525,13 @@ volumeMounts:
{{- end }}
{{- end }}
{{- if and (eq $multiNamespacesCounter 0) (eq $singleNamespaceCounter 0) }}
{{- fail "No gha-runner-scale-set-controller deployment found using label (app.kubernetes.io/part-of=gha-runner-scale-set-controller). Consider setting controllerServiceAccount.name in values.yaml to be explicit if you think the discovery is wrong." }}
{{- fail "No gha-rs-controller deployment found using label (app.kubernetes.io/part-of=gha-rs-controller). Consider setting controllerServiceAccount.name in values.yaml to be explicit if you think the discovery is wrong." }}
{{- end }}
{{- if and (gt $multiNamespacesCounter 0) (gt $singleNamespaceCounter 0) }}
{{- fail "Found both gha-runner-scale-set-controller installed with flags.watchSingleNamespace set and unset in cluster, this is not supported. Consider setting controllerServiceAccount.name in values.yaml to be explicit if you think the discovery is wrong." }}
{{- fail "Found both gha-rs-controller installed with flags.watchSingleNamespace set and unset in cluster, this is not supported. Consider setting controllerServiceAccount.name in values.yaml to be explicit if you think the discovery is wrong." }}
{{- end }}
{{- if gt $multiNamespacesCounter 1 }}
{{- fail "More than one gha-runner-scale-set-controller deployment found using label (app.kubernetes.io/part-of=gha-runner-scale-set-controller). Consider setting controllerServiceAccount.name in values.yaml to be explicit if you think the discovery is wrong." }}
{{- fail "More than one gha-rs-controller deployment found using label (app.kubernetes.io/part-of=gha-rs-controller). Consider setting controllerServiceAccount.name in values.yaml to be explicit if you think the discovery is wrong." }}
{{- end }}
{{- if eq $multiNamespacesCounter 1 }}
{{- with $controllerDeployment.metadata }}
@@ -539,11 +544,11 @@ volumeMounts:
{{- $managerServiceAccountNamespace = (get $controllerDeployment.metadata.labels "actions.github.com/controller-service-account-namespace") }}
{{- end }}
{{- else }}
{{- fail "No gha-runner-scale-set-controller deployment that watch this namespace found using label (actions.github.com/controller-watch-single-namespace). Consider setting controllerServiceAccount.name in values.yaml to be explicit if you think the discovery is wrong." }}
{{- fail "No gha-rs-controller deployment that watch this namespace found using label (actions.github.com/controller-watch-single-namespace). Consider setting controllerServiceAccount.name in values.yaml to be explicit if you think the discovery is wrong." }}
{{- end }}
{{- end }}
{{- if eq $managerServiceAccountNamespace "" }}
{{- fail "No service account namespace found for gha-runner-scale-set-controller deployment using label (actions.github.com/controller-service-account-namespace), consider setting controllerServiceAccount.name in values.yaml to be explicit if you think the discovery is wrong." }}
{{- fail "No service account namespace found for gha-rs-controller deployment using label (actions.github.com/controller-service-account-namespace), consider setting controllerServiceAccount.name in values.yaml to be explicit if you think the discovery is wrong." }}
{{- end }}
{{- $managerServiceAccountNamespace }}
{{- end }}

View File

@@ -119,7 +119,7 @@ spec:
{{- include "gha-runner-scale-set.dind-init-container" . | nindent 8 }}
{{- end }}
{{- with .Values.template.spec.initContainers }}
{{- toYaml . | nindent 8 }}
{{- toYaml . | nindent 6 }}
{{- end }}
{{- end }}
containers:

View File

@@ -10,6 +10,7 @@ import (
actionsgithubcom "github.com/actions/actions-runner-controller/controllers/actions.github.com"
"github.com/gruntwork-io/terratest/modules/helm"
"github.com/gruntwork-io/terratest/modules/k8s"
"github.com/gruntwork-io/terratest/modules/logger"
"github.com/gruntwork-io/terratest/modules/random"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
@@ -28,6 +29,7 @@ func TestTemplateRenderedGitHubSecretWithGitHubToken(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret.github_token": "gh_token12345",
@@ -43,7 +45,7 @@ func TestTemplateRenderedGitHubSecretWithGitHubToken(t *testing.T) {
helm.UnmarshalK8SYaml(t, output, &githubSecret)
assert.Equal(t, namespaceName, githubSecret.Namespace)
assert.Equal(t, "test-runners-gha-runner-scale-set-github-secret", githubSecret.Name)
assert.Equal(t, "test-runners-gha-rs-github-secret", githubSecret.Name)
assert.Equal(t, "gh_token12345", string(githubSecret.Data["github_token"]))
assert.Equal(t, "actions.github.com/cleanup-protection", githubSecret.Finalizers[0])
}
@@ -59,6 +61,7 @@ func TestTemplateRenderedGitHubSecretWithGitHubApp(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret.github_app_id": "10",
@@ -92,6 +95,7 @@ func TestTemplateRenderedGitHubSecretErrorWithMissingAuthInput(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret.github_app_id": "",
@@ -119,6 +123,7 @@ func TestTemplateRenderedGitHubSecretErrorWithMissingAppInput(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret.github_app_id": "10",
@@ -145,6 +150,7 @@ func TestTemplateNotRenderedGitHubSecretWithPredefinedSecret(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret": "pre-defined-secret",
@@ -169,6 +175,7 @@ func TestTemplateRenderedSetServiceAccountToNoPermission(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret.github_token": "gh_token12345",
@@ -183,13 +190,13 @@ func TestTemplateRenderedSetServiceAccountToNoPermission(t *testing.T) {
helm.UnmarshalK8SYaml(t, output, &serviceAccount)
assert.Equal(t, namespaceName, serviceAccount.Namespace)
assert.Equal(t, "test-runners-gha-runner-scale-set-no-permission-service-account", serviceAccount.Name)
assert.Equal(t, "test-runners-gha-rs-no-permission", serviceAccount.Name)
output = helm.RenderTemplate(t, options, helmChartPath, releaseName, []string{"templates/autoscalingrunnerset.yaml"})
var ars v1alpha1.AutoscalingRunnerSet
helm.UnmarshalK8SYaml(t, output, &ars)
assert.Equal(t, "test-runners-gha-runner-scale-set-no-permission-service-account", ars.Spec.Template.Spec.ServiceAccountName)
assert.Equal(t, "test-runners-gha-rs-no-permission", ars.Spec.Template.Spec.ServiceAccountName)
assert.Empty(t, ars.Annotations[actionsgithubcom.AnnotationKeyKubernetesModeServiceAccountName]) // no finalizer protections in place
}
@@ -204,6 +211,7 @@ func TestTemplateRenderedSetServiceAccountToKubeMode(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret.github_token": "gh_token12345",
@@ -219,7 +227,7 @@ func TestTemplateRenderedSetServiceAccountToKubeMode(t *testing.T) {
helm.UnmarshalK8SYaml(t, output, &serviceAccount)
assert.Equal(t, namespaceName, serviceAccount.Namespace)
assert.Equal(t, "test-runners-gha-runner-scale-set-kube-mode-service-account", serviceAccount.Name)
assert.Equal(t, "test-runners-gha-rs-kube-mode", serviceAccount.Name)
assert.Equal(t, "actions.github.com/cleanup-protection", serviceAccount.Finalizers[0])
output = helm.RenderTemplate(t, options, helmChartPath, releaseName, []string{"templates/kube_mode_role.yaml"})
@@ -227,7 +235,7 @@ func TestTemplateRenderedSetServiceAccountToKubeMode(t *testing.T) {
helm.UnmarshalK8SYaml(t, output, &role)
assert.Equal(t, namespaceName, role.Namespace)
assert.Equal(t, "test-runners-gha-runner-scale-set-kube-mode-role", role.Name)
assert.Equal(t, "test-runners-gha-rs-kube-mode", role.Name)
assert.Equal(t, "actions.github.com/cleanup-protection", role.Finalizers[0])
@@ -243,11 +251,11 @@ func TestTemplateRenderedSetServiceAccountToKubeMode(t *testing.T) {
helm.UnmarshalK8SYaml(t, output, &roleBinding)
assert.Equal(t, namespaceName, roleBinding.Namespace)
assert.Equal(t, "test-runners-gha-runner-scale-set-kube-mode-role-binding", roleBinding.Name)
assert.Equal(t, "test-runners-gha-rs-kube-mode", roleBinding.Name)
assert.Len(t, roleBinding.Subjects, 1)
assert.Equal(t, "test-runners-gha-runner-scale-set-kube-mode-service-account", roleBinding.Subjects[0].Name)
assert.Equal(t, "test-runners-gha-rs-kube-mode", roleBinding.Subjects[0].Name)
assert.Equal(t, namespaceName, roleBinding.Subjects[0].Namespace)
assert.Equal(t, "test-runners-gha-runner-scale-set-kube-mode-role", roleBinding.RoleRef.Name)
assert.Equal(t, "test-runners-gha-rs-kube-mode", roleBinding.RoleRef.Name)
assert.Equal(t, "Role", roleBinding.RoleRef.Kind)
assert.Equal(t, "actions.github.com/cleanup-protection", serviceAccount.Finalizers[0])
@@ -255,7 +263,7 @@ func TestTemplateRenderedSetServiceAccountToKubeMode(t *testing.T) {
var ars v1alpha1.AutoscalingRunnerSet
helm.UnmarshalK8SYaml(t, output, &ars)
expectedServiceAccountName := "test-runners-gha-runner-scale-set-kube-mode-service-account"
expectedServiceAccountName := "test-runners-gha-rs-kube-mode"
assert.Equal(t, expectedServiceAccountName, ars.Spec.Template.Spec.ServiceAccountName)
assert.Equal(t, expectedServiceAccountName, ars.Annotations[actionsgithubcom.AnnotationKeyKubernetesModeServiceAccountName])
}
@@ -271,6 +279,7 @@ func TestTemplateRenderedUserProvideSetServiceAccount(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret.github_token": "gh_token12345",
@@ -303,6 +312,7 @@ func TestTemplateRenderedAutoScalingRunnerSet(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret.github_token": "gh_token12345",
@@ -320,14 +330,14 @@ func TestTemplateRenderedAutoScalingRunnerSet(t *testing.T) {
assert.Equal(t, namespaceName, ars.Namespace)
assert.Equal(t, "test-runners", ars.Name)
assert.Equal(t, "gha-runner-scale-set", ars.Labels["app.kubernetes.io/name"])
assert.Equal(t, "gha-rs", ars.Labels["app.kubernetes.io/name"])
assert.Equal(t, "test-runners", ars.Labels["app.kubernetes.io/instance"])
assert.Equal(t, "gha-runner-scale-set", ars.Labels["app.kubernetes.io/part-of"])
assert.Equal(t, "gha-rs", ars.Labels["app.kubernetes.io/part-of"])
assert.Equal(t, "autoscaling-runner-set", ars.Labels["app.kubernetes.io/component"])
assert.NotEmpty(t, ars.Labels["app.kubernetes.io/version"])
assert.Equal(t, "https://github.com/actions", ars.Spec.GitHubConfigUrl)
assert.Equal(t, "test-runners-gha-runner-scale-set-github-secret", ars.Spec.GitHubConfigSecret)
assert.Equal(t, "test-runners-gha-rs-github-secret", ars.Spec.GitHubConfigSecret)
assert.Empty(t, ars.Spec.RunnerGroup, "RunnerGroup should be empty")
@@ -354,6 +364,7 @@ func TestTemplateRenderedAutoScalingRunnerSet_RunnerScaleSetName(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret.github_token": "gh_token12345",
@@ -372,10 +383,10 @@ func TestTemplateRenderedAutoScalingRunnerSet_RunnerScaleSetName(t *testing.T) {
assert.Equal(t, namespaceName, ars.Namespace)
assert.Equal(t, "test-runners", ars.Name)
assert.Equal(t, "gha-runner-scale-set", ars.Labels["app.kubernetes.io/name"])
assert.Equal(t, "gha-rs", ars.Labels["app.kubernetes.io/name"])
assert.Equal(t, "test-runners", ars.Labels["app.kubernetes.io/instance"])
assert.Equal(t, "https://github.com/actions", ars.Spec.GitHubConfigUrl)
assert.Equal(t, "test-runners-gha-runner-scale-set-github-secret", ars.Spec.GitHubConfigSecret)
assert.Equal(t, "test-runners-gha-rs-github-secret", ars.Spec.GitHubConfigSecret)
assert.Equal(t, "test-runner-scale-set-name", ars.Spec.RunnerScaleSetName)
assert.Empty(t, ars.Spec.RunnerGroup, "RunnerGroup should be empty")
@@ -403,6 +414,7 @@ func TestTemplateRenderedAutoScalingRunnerSet_ProvideMetadata(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret.github_token": "gh_token12345",
@@ -450,6 +462,7 @@ func TestTemplateRenderedAutoScalingRunnerSet_MaxRunnersValidationError(t *testi
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret.github_token": "gh_token12345",
@@ -477,6 +490,7 @@ func TestTemplateRenderedAutoScalingRunnerSet_MinRunnersValidationError(t *testi
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret.github_token": "gh_token12345",
@@ -505,6 +519,7 @@ func TestTemplateRenderedAutoScalingRunnerSet_MinMaxRunnersValidationError(t *te
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret.github_token": "gh_token12345",
@@ -533,6 +548,7 @@ func TestTemplateRenderedAutoScalingRunnerSet_MinMaxRunnersValidationSameValue(t
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret.github_token": "gh_token12345",
@@ -564,6 +580,7 @@ func TestTemplateRenderedAutoScalingRunnerSet_MinMaxRunnersValidation_OnlyMin(t
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret.github_token": "gh_token12345",
@@ -594,6 +611,7 @@ func TestTemplateRenderedAutoScalingRunnerSet_MinMaxRunnersValidation_OnlyMax(t
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret.github_token": "gh_token12345",
@@ -627,6 +645,7 @@ func TestTemplateRenderedAutoScalingRunnerSet_MinMaxRunners_FromValuesFile(t *te
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
ValuesFiles: []string{testValuesPath},
KubectlOptions: k8s.NewKubectlOptions("", "", namespaceName),
}
@@ -654,6 +673,7 @@ func TestTemplateRenderedAutoScalingRunnerSet_ExtraVolumes(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"controllerServiceAccount.name": "arc",
"controllerServiceAccount.namespace": "arc-system",
@@ -674,6 +694,50 @@ func TestTemplateRenderedAutoScalingRunnerSet_ExtraVolumes(t *testing.T) {
assert.Equal(t, "/data", ars.Spec.Template.Spec.Volumes[2].HostPath.Path, "Volume host path should be /data")
}
func TestTemplateRenderedAutoScalingRunnerSet_DinD_ExtraInitContainers(t *testing.T) {
t.Parallel()
// Path to the helm chart we will test
helmChartPath, err := filepath.Abs("../../gha-runner-scale-set")
require.NoError(t, err)
testValuesPath, err := filepath.Abs("../tests/values_dind_extra_init_containers.yaml")
require.NoError(t, err)
releaseName := "test-runners"
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"controllerServiceAccount.name": "arc",
"controllerServiceAccount.namespace": "arc-system",
},
ValuesFiles: []string{testValuesPath},
KubectlOptions: k8s.NewKubectlOptions("", "", namespaceName),
}
output := helm.RenderTemplate(t, options, helmChartPath, releaseName, []string{"templates/autoscalingrunnerset.yaml"})
var ars v1alpha1.AutoscalingRunnerSet
helm.UnmarshalK8SYaml(t, output, &ars)
assert.Len(t, ars.Spec.Template.Spec.InitContainers, 3, "InitContainers should be 3")
assert.Equal(t, "kube-init", ars.Spec.Template.Spec.InitContainers[1].Name, "InitContainers[1] Name should be kube-init")
assert.Equal(t, "runner-image:latest", ars.Spec.Template.Spec.InitContainers[1].Image, "InitContainers[1] Image should be runner-image:latest")
assert.Equal(t, "sudo", ars.Spec.Template.Spec.InitContainers[1].Command[0], "InitContainers[1] Command[0] should be sudo")
assert.Equal(t, "chown", ars.Spec.Template.Spec.InitContainers[1].Command[1], "InitContainers[1] Command[1] should be chown")
assert.Equal(t, "-R", ars.Spec.Template.Spec.InitContainers[1].Command[2], "InitContainers[1] Command[2] should be -R")
assert.Equal(t, "1001:123", ars.Spec.Template.Spec.InitContainers[1].Command[3], "InitContainers[1] Command[3] should be 1001:123")
assert.Equal(t, "/home/runner/_work", ars.Spec.Template.Spec.InitContainers[1].Command[4], "InitContainers[1] Command[4] should be /home/runner/_work")
assert.Equal(t, "work", ars.Spec.Template.Spec.InitContainers[1].VolumeMounts[0].Name, "InitContainers[1] VolumeMounts[0] Name should be work")
assert.Equal(t, "/home/runner/_work", ars.Spec.Template.Spec.InitContainers[1].VolumeMounts[0].MountPath, "InitContainers[1] VolumeMounts[0] MountPath should be /home/runner/_work")
assert.Equal(t, "ls", ars.Spec.Template.Spec.InitContainers[2].Name, "InitContainers[2] Name should be ls")
assert.Equal(t, "ubuntu:latest", ars.Spec.Template.Spec.InitContainers[2].Image, "InitContainers[2] Image should be ubuntu:latest")
assert.Equal(t, "ls", ars.Spec.Template.Spec.InitContainers[2].Command[0], "InitContainers[2] Command[0] should be ls")
}
func TestTemplateRenderedAutoScalingRunnerSet_DinD_ExtraVolumes(t *testing.T) {
t.Parallel()
@@ -688,6 +752,7 @@ func TestTemplateRenderedAutoScalingRunnerSet_DinD_ExtraVolumes(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"controllerServiceAccount.name": "arc",
"controllerServiceAccount.namespace": "arc-system",
@@ -724,6 +789,7 @@ func TestTemplateRenderedAutoScalingRunnerSet_K8S_ExtraVolumes(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"controllerServiceAccount.name": "arc",
"controllerServiceAccount.namespace": "arc-system",
@@ -755,6 +821,7 @@ func TestTemplateRenderedAutoScalingRunnerSet_EnableDinD(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret.github_token": "gh_token12345",
@@ -773,10 +840,10 @@ func TestTemplateRenderedAutoScalingRunnerSet_EnableDinD(t *testing.T) {
assert.Equal(t, namespaceName, ars.Namespace)
assert.Equal(t, "test-runners", ars.Name)
assert.Equal(t, "gha-runner-scale-set", ars.Labels["app.kubernetes.io/name"])
assert.Equal(t, "gha-rs", ars.Labels["app.kubernetes.io/name"])
assert.Equal(t, "test-runners", ars.Labels["app.kubernetes.io/instance"])
assert.Equal(t, "https://github.com/actions", ars.Spec.GitHubConfigUrl)
assert.Equal(t, "test-runners-gha-runner-scale-set-github-secret", ars.Spec.GitHubConfigSecret)
assert.Equal(t, "test-runners-gha-rs-github-secret", ars.Spec.GitHubConfigSecret)
assert.Empty(t, ars.Spec.RunnerGroup, "RunnerGroup should be empty")
@@ -846,6 +913,7 @@ func TestTemplateRenderedAutoScalingRunnerSet_EnableKubernetesMode(t *testing.T)
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret.github_token": "gh_token12345",
@@ -864,10 +932,10 @@ func TestTemplateRenderedAutoScalingRunnerSet_EnableKubernetesMode(t *testing.T)
assert.Equal(t, namespaceName, ars.Namespace)
assert.Equal(t, "test-runners", ars.Name)
assert.Equal(t, "gha-runner-scale-set", ars.Labels["app.kubernetes.io/name"])
assert.Equal(t, "gha-rs", ars.Labels["app.kubernetes.io/name"])
assert.Equal(t, "test-runners", ars.Labels["app.kubernetes.io/instance"])
assert.Equal(t, "https://github.com/actions", ars.Spec.GitHubConfigUrl)
assert.Equal(t, "test-runners-gha-runner-scale-set-github-secret", ars.Spec.GitHubConfigSecret)
assert.Equal(t, "test-runners-gha-rs-github-secret", ars.Spec.GitHubConfigSecret)
assert.Empty(t, ars.Spec.RunnerGroup, "RunnerGroup should be empty")
assert.Nil(t, ars.Spec.MinRunners, "MinRunners should be nil")
@@ -903,6 +971,7 @@ func TestTemplateRenderedAutoScalingRunnerSet_UsePredefinedSecret(t *testing.T)
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret": "pre-defined-secrets",
@@ -920,7 +989,7 @@ func TestTemplateRenderedAutoScalingRunnerSet_UsePredefinedSecret(t *testing.T)
assert.Equal(t, namespaceName, ars.Namespace)
assert.Equal(t, "test-runners", ars.Name)
assert.Equal(t, "gha-runner-scale-set", ars.Labels["app.kubernetes.io/name"])
assert.Equal(t, "gha-rs", ars.Labels["app.kubernetes.io/name"])
assert.Equal(t, "test-runners", ars.Labels["app.kubernetes.io/instance"])
assert.Equal(t, "https://github.com/actions", ars.Spec.GitHubConfigUrl)
assert.Equal(t, "pre-defined-secrets", ars.Spec.GitHubConfigSecret)
@@ -937,6 +1006,7 @@ func TestTemplateRenderedAutoScalingRunnerSet_ErrorOnEmptyPredefinedSecret(t *te
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret": "",
@@ -963,6 +1033,7 @@ func TestTemplateRenderedWithProxy(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret": "pre-defined-secrets",
@@ -1026,6 +1097,7 @@ func TestTemplateRenderedWithTLS(t *testing.T) {
t.Run("providing githubServerTLS.runnerMountPath", func(t *testing.T) {
t.Run("mode: default", func(t *testing.T) {
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret": "pre-defined-secrets",
@@ -1084,6 +1156,7 @@ func TestTemplateRenderedWithTLS(t *testing.T) {
t.Run("mode: dind", func(t *testing.T) {
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret": "pre-defined-secrets",
@@ -1143,6 +1216,7 @@ func TestTemplateRenderedWithTLS(t *testing.T) {
t.Run("mode: kubernetes", func(t *testing.T) {
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret": "pre-defined-secrets",
@@ -1204,6 +1278,7 @@ func TestTemplateRenderedWithTLS(t *testing.T) {
t.Run("without providing githubServerTLS.runnerMountPath", func(t *testing.T) {
t.Run("mode: default", func(t *testing.T) {
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret": "pre-defined-secrets",
@@ -1258,6 +1333,7 @@ func TestTemplateRenderedWithTLS(t *testing.T) {
t.Run("mode: dind", func(t *testing.T) {
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret": "pre-defined-secrets",
@@ -1313,6 +1389,7 @@ func TestTemplateRenderedWithTLS(t *testing.T) {
t.Run("mode: kubernetes", func(t *testing.T) {
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret": "pre-defined-secrets",
@@ -1402,6 +1479,7 @@ func TestTemplateNamingConstraints(t *testing.T) {
for name, tc := range tt {
t.Run(name, func(t *testing.T) {
options := &helm.Options{
Logger: logger.Discard,
SetValues: setValues,
KubectlOptions: k8s.NewKubectlOptions("", "", tc.namespaceName),
}
@@ -1423,6 +1501,7 @@ func TestTemplateRenderedGitHubConfigUrlEndsWIthSlash(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions/",
"githubConfigSecret.github_token": "gh_token12345",
@@ -1453,6 +1532,7 @@ func TestTemplate_CreateManagerRole(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret.github_token": "gh_token12345",
@@ -1468,7 +1548,7 @@ func TestTemplate_CreateManagerRole(t *testing.T) {
helm.UnmarshalK8SYaml(t, output, &managerRole)
assert.Equal(t, namespaceName, managerRole.Namespace, "namespace should match the namespace of the Helm release")
assert.Equal(t, "test-runners-gha-runner-scale-set-manager-role", managerRole.Name)
assert.Equal(t, "test-runners-gha-rs-manager", managerRole.Name)
assert.Equal(t, "actions.github.com/cleanup-protection", managerRole.Finalizers[0])
assert.Equal(t, 6, len(managerRole.Rules))
@@ -1487,6 +1567,7 @@ func TestTemplate_CreateManagerRole_UseConfigMaps(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret.github_token": "gh_token12345",
@@ -1503,7 +1584,7 @@ func TestTemplate_CreateManagerRole_UseConfigMaps(t *testing.T) {
helm.UnmarshalK8SYaml(t, output, &managerRole)
assert.Equal(t, namespaceName, managerRole.Namespace, "namespace should match the namespace of the Helm release")
assert.Equal(t, "test-runners-gha-runner-scale-set-manager-role", managerRole.Name)
assert.Equal(t, "test-runners-gha-rs-manager", managerRole.Name)
assert.Equal(t, "actions.github.com/cleanup-protection", managerRole.Finalizers[0])
assert.Equal(t, 7, len(managerRole.Rules))
assert.Equal(t, "configmaps", managerRole.Rules[6].Resources[0])
@@ -1520,6 +1601,7 @@ func TestTemplate_CreateManagerRoleBinding(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret.github_token": "gh_token12345",
@@ -1535,8 +1617,8 @@ func TestTemplate_CreateManagerRoleBinding(t *testing.T) {
helm.UnmarshalK8SYaml(t, output, &managerRoleBinding)
assert.Equal(t, namespaceName, managerRoleBinding.Namespace, "namespace should match the namespace of the Helm release")
assert.Equal(t, "test-runners-gha-runner-scale-set-manager-role-binding", managerRoleBinding.Name)
assert.Equal(t, "test-runners-gha-runner-scale-set-manager-role", managerRoleBinding.RoleRef.Name)
assert.Equal(t, "test-runners-gha-rs-manager", managerRoleBinding.Name)
assert.Equal(t, "test-runners-gha-rs-manager", managerRoleBinding.RoleRef.Name)
assert.Equal(t, "actions.github.com/cleanup-protection", managerRoleBinding.Finalizers[0])
assert.Equal(t, "arc", managerRoleBinding.Subjects[0].Name)
assert.Equal(t, "arc-system", managerRoleBinding.Subjects[0].Namespace)
@@ -1556,6 +1638,7 @@ func TestTemplateRenderedAutoScalingRunnerSet_ExtraContainers(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"controllerServiceAccount.name": "arc",
"controllerServiceAccount.namespace": "arc-system",
@@ -1603,6 +1686,7 @@ func TestTemplateRenderedAutoScalingRunnerSet_ExtraPodSpec(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"controllerServiceAccount.name": "arc",
"controllerServiceAccount.namespace": "arc-system",
@@ -1636,6 +1720,7 @@ func TestTemplateRenderedAutoScalingRunnerSet_DinDMergePodSpec(t *testing.T) {
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"controllerServiceAccount.name": "arc",
"controllerServiceAccount.namespace": "arc-system",
@@ -1681,6 +1766,7 @@ func TestTemplateRenderedAutoScalingRunnerSet_KubeModeMergePodSpec(t *testing.T)
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"controllerServiceAccount.name": "arc",
"controllerServiceAccount.namespace": "arc-system",
@@ -1722,6 +1808,7 @@ func TestTemplateRenderedAutoscalingRunnerSetAnnotation_GitHubSecret(t *testing.
annotationExpectedTests := map[string]*helm.Options{
"GitHub token": {
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret.github_token": "gh_token12345",
@@ -1731,6 +1818,7 @@ func TestTemplateRenderedAutoscalingRunnerSetAnnotation_GitHubSecret(t *testing.
KubectlOptions: k8s.NewKubectlOptions("", "", namespaceName),
},
"GitHub app": {
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret.github_app_id": "10",
@@ -1755,6 +1843,7 @@ func TestTemplateRenderedAutoscalingRunnerSetAnnotation_GitHubSecret(t *testing.
t.Run("Annotation should not be set", func(t *testing.T) {
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret": "pre-defined-secret",
@@ -1782,6 +1871,7 @@ func TestTemplateRenderedAutoscalingRunnerSetAnnotation_KubernetesModeCleanup(t
namespaceName := "test-" + strings.ToLower(random.UniqueId())
options := &helm.Options{
Logger: logger.Discard,
SetValues: map[string]string{
"githubConfigUrl": "https://github.com/actions",
"githubConfigSecret.github_token": "gh_token12345",
@@ -1797,12 +1887,12 @@ func TestTemplateRenderedAutoscalingRunnerSetAnnotation_KubernetesModeCleanup(t
helm.UnmarshalK8SYaml(t, output, &autoscalingRunnerSet)
annotationValues := map[string]string{
actionsgithubcom.AnnotationKeyGitHubSecretName: "test-runners-gha-runner-scale-set-github-secret",
actionsgithubcom.AnnotationKeyManagerRoleName: "test-runners-gha-runner-scale-set-manager-role",
actionsgithubcom.AnnotationKeyManagerRoleBindingName: "test-runners-gha-runner-scale-set-manager-role-binding",
actionsgithubcom.AnnotationKeyKubernetesModeServiceAccountName: "test-runners-gha-runner-scale-set-kube-mode-service-account",
actionsgithubcom.AnnotationKeyKubernetesModeRoleName: "test-runners-gha-runner-scale-set-kube-mode-role",
actionsgithubcom.AnnotationKeyKubernetesModeRoleBindingName: "test-runners-gha-runner-scale-set-kube-mode-role-binding",
actionsgithubcom.AnnotationKeyGitHubSecretName: "test-runners-gha-rs-github-secret",
actionsgithubcom.AnnotationKeyManagerRoleName: "test-runners-gha-rs-manager",
actionsgithubcom.AnnotationKeyManagerRoleBindingName: "test-runners-gha-rs-manager",
actionsgithubcom.AnnotationKeyKubernetesModeServiceAccountName: "test-runners-gha-rs-kube-mode",
actionsgithubcom.AnnotationKeyKubernetesModeRoleName: "test-runners-gha-rs-kube-mode",
actionsgithubcom.AnnotationKeyKubernetesModeRoleBindingName: "test-runners-gha-rs-kube-mode",
}
for annotation, value := range annotationValues {

View File

@@ -0,0 +1,17 @@
githubConfigUrl: https://github.com/actions/actions-runner-controller
githubConfigSecret:
github_token: test
template:
spec:
initContainers:
- name: kube-init
image: runner-image:latest
command: ["sudo", "chown", "-R", "1001:123", "/home/runner/_work"]
volumeMounts:
- name: work
mountPath: /home/runner/_work
- name: ls
image: ubuntu:latest
command: ["ls"]
containerMode:
type: dind

View File

@@ -68,6 +68,12 @@ githubConfigSecret:
# key: ca.crt
# runnerMountPath: /usr/local/share/ca-certificates/
## Container mode is is an object that provides out-of-box configuration
## for dind and kubernetes mode. Template will be modified as documented under the
## template object.
##
## If any customization is required for dind or kubernetes mode, containerMode should remain
## empty, and configuration should be applied to the template.
# containerMode:
# type: "dind" ## type can be set to dind or kubernetes
# ## the following is required when containerMode.type=kubernetes

View File

@@ -114,7 +114,14 @@ func createSession(ctx context.Context, logger *logr.Logger, client actions.Acti
return runnerScaleSetSession, initialMessage, nil
}
return runnerScaleSetSession, nil, nil
initialMessage := &actions.RunnerScaleSetMessage{
MessageId: 0,
MessageType: "RunnerScaleSetJobMessages",
Statistics: runnerScaleSetSession.Statistics,
Body: "",
}
return runnerScaleSetSession, initialMessage, nil
}
func (m *AutoScalerClient) Close() error {

View File

@@ -37,7 +37,7 @@ func TestCreateSession(t *testing.T) {
require.NoError(t, err, "Error creating autoscaler client")
assert.Equal(t, session, session, "Session is not correct")
assert.Nil(t, asClient.initialMessage, "Initial message should be nil")
assert.NotNil(t, asClient.initialMessage, "Initial message should not be nil")
assert.Equal(t, int64(0), asClient.lastMessageId, "Last message id should be 0")
assert.True(t, mockActionsClient.AssertExpectations(t), "All expectations should be met")
}
@@ -188,7 +188,7 @@ func TestCreateSession_RetrySessionConflict(t *testing.T) {
require.NoError(t, err, "Error creating autoscaler client")
assert.Equal(t, session, session, "Session is not correct")
assert.Nil(t, asClient.initialMessage, "Initial message should be nil")
assert.NotNil(t, asClient.initialMessage, "Initial message should not be nil")
assert.Equal(t, int64(0), asClient.lastMessageId, "Last message id should be 0")
assert.True(t, mockActionsClient.AssertExpectations(t), "All expectations should be met")
}
@@ -334,6 +334,14 @@ func TestGetRunnerScaleSetMessage(t *testing.T) {
return nil
})
assert.NoError(t, err, "Error getting message")
assert.Equal(t, int64(0), asClient.lastMessageId, "Initial message")
err = asClient.GetRunnerScaleSetMessage(ctx, func(msg *actions.RunnerScaleSetMessage) error {
logger.Info("Message received", "messageId", msg.MessageId, "messageType", msg.MessageType, "body", msg.Body)
return nil
})
assert.NoError(t, err, "Error getting message")
assert.Equal(t, int64(1), asClient.lastMessageId, "Last message id should be updated")
assert.True(t, mockActionsClient.AssertExpectations(t), "All expectations should be met")
@@ -371,13 +379,21 @@ func TestGetRunnerScaleSetMessage_HandleFailed(t *testing.T) {
})
require.NoError(t, err, "Error creating autoscaler client")
// read initial message
err = asClient.GetRunnerScaleSetMessage(ctx, func(msg *actions.RunnerScaleSetMessage) error {
logger.Info("Message received", "messageId", msg.MessageId, "messageType", msg.MessageType, "body", msg.Body)
return nil
})
assert.NoError(t, err, "Error getting message")
err = asClient.GetRunnerScaleSetMessage(ctx, func(msg *actions.RunnerScaleSetMessage) error {
logger.Info("Message received", "messageId", msg.MessageId, "messageType", msg.MessageType, "body", msg.Body)
return fmt.Errorf("error")
})
assert.ErrorContains(t, err, "handle message failed. error", "Error getting message")
assert.Equal(t, int64(0), asClient.lastMessageId, "Last message id should be updated")
assert.Equal(t, int64(0), asClient.lastMessageId, "Last message id should not be updated")
assert.True(t, mockActionsClient.AssertExpectations(t), "All expectations should be met")
assert.True(t, mockSessionClient.AssertExpectations(t), "All expectations should be met")
}
@@ -513,6 +529,12 @@ func TestGetRunnerScaleSetMessage_RetryUntilGetMessage(t *testing.T) {
})
require.NoError(t, err, "Error creating autoscaler client")
err = asClient.GetRunnerScaleSetMessage(ctx, func(msg *actions.RunnerScaleSetMessage) error {
logger.Info("Message received", "messageId", msg.MessageId, "messageType", msg.MessageType, "body", msg.Body)
return nil
})
assert.NoError(t, err, "Error getting initial message")
err = asClient.GetRunnerScaleSetMessage(ctx, func(msg *actions.RunnerScaleSetMessage) error {
logger.Info("Message received", "messageId", msg.MessageId, "messageType", msg.MessageType, "body", msg.Body)
return nil
@@ -550,6 +572,12 @@ func TestGetRunnerScaleSetMessage_ErrorOnGetMessage(t *testing.T) {
})
require.NoError(t, err, "Error creating autoscaler client")
// process initial message
err = asClient.GetRunnerScaleSetMessage(ctx, func(msg *actions.RunnerScaleSetMessage) error {
return nil
})
assert.NoError(t, err, "Error getting initial message")
err = asClient.GetRunnerScaleSetMessage(ctx, func(msg *actions.RunnerScaleSetMessage) error {
return fmt.Errorf("Should not be called")
})
@@ -592,6 +620,12 @@ func TestDeleteRunnerScaleSetMessage_Error(t *testing.T) {
})
require.NoError(t, err, "Error creating autoscaler client")
err = asClient.GetRunnerScaleSetMessage(ctx, func(msg *actions.RunnerScaleSetMessage) error {
logger.Info("Message received", "messageId", msg.MessageId, "messageType", msg.MessageType, "body", msg.Body)
return nil
})
assert.NoError(t, err, "Error getting initial message")
err = asClient.GetRunnerScaleSetMessage(ctx, func(msg *actions.RunnerScaleSetMessage) error {
logger.Info("Message received", "messageId", msg.MessageId, "messageType", msg.MessageType, "body", msg.Body)
return nil

View File

@@ -3,6 +3,7 @@ package main
import (
"context"
"encoding/json"
"errors"
"fmt"
"math"
"strings"
@@ -25,6 +26,31 @@ type Service struct {
kubeManager KubernetesManager
settings *ScaleSettings
currentRunnerCount int
metricsExporter metricsExporter
errs []error
}
func WithPrometheusMetrics(conf RunnerScaleSetListenerConfig) func(*Service) {
return func(svc *Service) {
parsedURL, err := actions.ParseGitHubConfigFromURL(conf.ConfigureUrl)
if err != nil {
svc.errs = append(svc.errs, err)
}
svc.metricsExporter.withBaseLabels(baseLabels{
scaleSetName: conf.EphemeralRunnerSetName,
scaleSetNamespace: conf.EphemeralRunnerSetNamespace,
enterprise: parsedURL.Enterprise,
organization: parsedURL.Organization,
repository: parsedURL.Repository,
})
}
}
func WithLogger(logger logr.Logger) func(*Service) {
return func(s *Service) {
s.logger = logger.WithName("service")
}
}
func NewService(
@@ -33,13 +59,13 @@ func NewService(
manager KubernetesManager,
settings *ScaleSettings,
options ...func(*Service),
) *Service {
) (*Service, error) {
s := &Service{
ctx: ctx,
rsClient: rsClient,
kubeManager: manager,
settings: settings,
currentRunnerCount: 0,
currentRunnerCount: -1, // force patch on startup
logger: logr.FromContextOrDiscard(ctx),
}
@@ -47,18 +73,14 @@ func NewService(
option(s)
}
return s
if len(s.errs) > 0 {
return nil, errors.Join(s.errs...)
}
return s, nil
}
func (s *Service) Start() error {
if s.settings.MinRunners > 0 {
s.logger.Info("scale to match minimal runners.")
err := s.scaleForAssignedJobCount(0)
if err != nil {
return fmt.Errorf("could not scale to match minimal runners. %w", err)
}
}
for {
s.logger.Info("waiting for message...")
select {
@@ -89,11 +111,17 @@ func (s *Service) processMessage(message *actions.RunnerScaleSetMessage) error {
"busy runners", message.Statistics.TotalBusyRunners,
"idle runners", message.Statistics.TotalIdleRunners)
s.metricsExporter.publishStatistics(message.Statistics)
if message.MessageType != "RunnerScaleSetJobMessages" {
s.logger.Info("skip message with unknown message type.", "messageType", message.MessageType)
return nil
}
if message.MessageId == 0 && message.Body == "" { // initial message with statistics only
return s.scaleForAssignedJobCount(message.Statistics.TotalAssignedJobs)
}
var batchedMessages []json.RawMessage
if err := json.NewDecoder(strings.NewReader(message.Body)).Decode(&batchedMessages); err != nil {
return fmt.Errorf("could not decode job messages. %w", err)
@@ -114,27 +142,54 @@ func (s *Service) processMessage(message *actions.RunnerScaleSetMessage) error {
if err := json.Unmarshal(message, &jobAvailable); err != nil {
return fmt.Errorf("could not decode job available message. %w", err)
}
s.logger.Info("job available message received.", "RequestId", jobAvailable.RunnerRequestId)
s.logger.Info(
"job available message received.",
"RequestId",
jobAvailable.RunnerRequestId,
)
availableJobs = append(availableJobs, jobAvailable.RunnerRequestId)
case "JobAssigned":
var jobAssigned actions.JobAssigned
if err := json.Unmarshal(message, &jobAssigned); err != nil {
return fmt.Errorf("could not decode job assigned message. %w", err)
}
s.logger.Info("job assigned message received.", "RequestId", jobAssigned.RunnerRequestId)
s.logger.Info(
"job assigned message received.",
"RequestId",
jobAssigned.RunnerRequestId,
)
// s.metricsExporter.publishJobAssigned(&jobAssigned)
case "JobStarted":
var jobStarted actions.JobStarted
if err := json.Unmarshal(message, &jobStarted); err != nil {
return fmt.Errorf("could not decode job started message. %w", err)
}
s.logger.Info("job started message received.", "RequestId", jobStarted.RunnerRequestId, "RunnerId", jobStarted.RunnerId)
s.logger.Info(
"job started message received.",
"RequestId",
jobStarted.RunnerRequestId,
"RunnerId",
jobStarted.RunnerId,
)
s.metricsExporter.publishJobStarted(&jobStarted)
s.updateJobInfoForRunner(jobStarted)
case "JobCompleted":
var jobCompleted actions.JobCompleted
if err := json.Unmarshal(message, &jobCompleted); err != nil {
return fmt.Errorf("could not decode job completed message. %w", err)
}
s.logger.Info("job completed message received.", "RequestId", jobCompleted.RunnerRequestId, "Result", jobCompleted.Result, "RunnerId", jobCompleted.RunnerId, "RunnerName", jobCompleted.RunnerName)
s.logger.Info(
"job completed message received.",
"RequestId",
jobCompleted.RunnerRequestId,
"Result",
jobCompleted.Result,
"RunnerId",
jobCompleted.RunnerId,
"RunnerName",
jobCompleted.RunnerName,
)
s.metricsExporter.publishJobCompleted(&jobCompleted)
default:
s.logger.Info("unknown job message type.", "messageType", messageType.MessageType)
}
@@ -150,13 +205,15 @@ func (s *Service) processMessage(message *actions.RunnerScaleSetMessage) error {
func (s *Service) scaleForAssignedJobCount(count int) error {
targetRunnerCount := int(math.Max(math.Min(float64(s.settings.MaxRunners), float64(count)), float64(s.settings.MinRunners)))
s.metricsExporter.publishDesiredRunners(targetRunnerCount)
if targetRunnerCount != s.currentRunnerCount {
s.logger.Info("try scale runner request up/down base on assigned job count",
"assigned job", count,
"decision", targetRunnerCount,
"min", s.settings.MinRunners,
"max", s.settings.MaxRunners,
"currentRunnerCount", s.currentRunnerCount)
"currentRunnerCount", s.currentRunnerCount,
)
err := s.kubeManager.ScaleEphemeralRunnerSet(s.ctx, s.settings.Namespace, s.settings.ResourceName, targetRunnerCount)
if err != nil {
return fmt.Errorf("could not scale ephemeral runner set (%s/%s). %w", s.settings.Namespace, s.settings.ResourceName, err)
@@ -177,7 +234,8 @@ func (s *Service) updateJobInfoForRunner(jobInfo actions.JobStarted) {
"workflowRef", jobInfo.JobWorkflowRef,
"workflowRunId", jobInfo.WorkflowRunId,
"jobDisplayName", jobInfo.JobDisplayName,
"requestId", jobInfo.RunnerRequestId)
"requestId", jobInfo.RunnerRequestId,
)
err := s.kubeManager.UpdateEphemeralRunnerWithJobInfo(s.ctx, s.settings.Namespace, jobInfo.RunnerName, jobInfo.OwnerName, jobInfo.RepositoryName, jobInfo.JobWorkflowRef, jobInfo.JobDisplayName, jobInfo.WorkflowRunId, jobInfo.RunnerRequestId)
if err != nil {
s.logger.Error(err, "could not update ephemeral runner with job info", "runnerName", jobInfo.RunnerName, "requestId", jobInfo.RunnerRequestId)

View File

@@ -21,7 +21,7 @@ func TestNewService(t *testing.T) {
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
service := NewService(
service, err := NewService(
ctx,
mockRsClient,
mockKubeManager,
@@ -36,6 +36,7 @@ func TestNewService(t *testing.T) {
},
)
require.NoError(t, err)
assert.Equal(t, logger, service.logger)
}
@@ -47,7 +48,7 @@ func TestStart(t *testing.T) {
require.NoError(t, log_err, "Error creating logger")
ctx, cancel := context.WithCancel(context.Background())
service := NewService(
service, err := NewService(
ctx,
mockRsClient,
mockKubeManager,
@@ -61,9 +62,11 @@ func TestStart(t *testing.T) {
s.logger = logger
},
)
require.NoError(t, err)
mockRsClient.On("GetRunnerScaleSetMessage", service.ctx, mock.Anything).Run(func(args mock.Arguments) { cancel() }).Return(nil).Once()
err := service.Start()
err = service.Start()
assert.NoError(t, err, "Unexpected error")
assert.True(t, mockRsClient.AssertExpectations(t), "All expectations should be met")
@@ -72,13 +75,14 @@ func TestStart(t *testing.T) {
func TestStart_ScaleToMinRunners(t *testing.T) {
mockRsClient := &MockRunnerScaleSetClient{}
mockKubeManager := &MockKubernetesManager{}
logger, log_err := logging.NewLogger(logging.LogLevelDebug, logging.LogFormatText)
logger = logger.WithName(t.Name())
require.NoError(t, log_err, "Error creating logger")
ctx, cancel := context.WithCancel(context.Background())
service := NewService(
service, err := NewService(
ctx,
mockRsClient,
mockKubeManager,
@@ -92,11 +96,17 @@ func TestStart_ScaleToMinRunners(t *testing.T) {
s.logger = logger
},
)
require.NoError(t, err)
mockRsClient.On("GetRunnerScaleSetMessage", ctx, mock.Anything).Run(func(args mock.Arguments) {
_ = service.scaleForAssignedJobCount(5)
}).Return(nil)
mockKubeManager.On("ScaleEphemeralRunnerSet", ctx, service.settings.Namespace, service.settings.ResourceName, 5).Run(func(args mock.Arguments) { cancel() }).Return(nil).Once()
err := service.Start()
err = service.Start()
assert.NoError(t, err, "Unexpected error")
assert.True(t, mockRsClient.AssertExpectations(t), "All expectations should be met")
assert.True(t, mockKubeManager.AssertExpectations(t), "All expectations should be met")
}
@@ -110,7 +120,7 @@ func TestStart_ScaleToMinRunnersFailed(t *testing.T) {
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
service := NewService(
service, err := NewService(
ctx,
mockRsClient,
mockKubeManager,
@@ -124,11 +134,16 @@ func TestStart_ScaleToMinRunnersFailed(t *testing.T) {
s.logger = logger
},
)
mockKubeManager.On("ScaleEphemeralRunnerSet", ctx, service.settings.Namespace, service.settings.ResourceName, 5).Return(fmt.Errorf("error")).Once()
require.NoError(t, err)
err := service.Start()
c := mockKubeManager.On("ScaleEphemeralRunnerSet", ctx, service.settings.Namespace, service.settings.ResourceName, 5).Return(fmt.Errorf("error")).Once()
mockRsClient.On("GetRunnerScaleSetMessage", ctx, mock.Anything).Run(func(args mock.Arguments) {
_ = service.scaleForAssignedJobCount(5)
}).Return(c.ReturnArguments.Get(0))
assert.ErrorContains(t, err, "could not scale to match minimal runners", "Unexpected error")
err = service.Start()
assert.ErrorContains(t, err, "could not get and process message", "Unexpected error")
assert.True(t, mockRsClient.AssertExpectations(t), "All expectations should be met")
assert.True(t, mockKubeManager.AssertExpectations(t), "All expectations should be met")
}
@@ -141,7 +156,7 @@ func TestStart_GetMultipleMessages(t *testing.T) {
require.NoError(t, log_err, "Error creating logger")
ctx, cancel := context.WithCancel(context.Background())
service := NewService(
service, err := NewService(
ctx,
mockRsClient,
mockKubeManager,
@@ -155,10 +170,12 @@ func TestStart_GetMultipleMessages(t *testing.T) {
s.logger = logger
},
)
require.NoError(t, err)
mockRsClient.On("GetRunnerScaleSetMessage", service.ctx, mock.Anything).Return(nil).Times(5)
mockRsClient.On("GetRunnerScaleSetMessage", service.ctx, mock.Anything).Run(func(args mock.Arguments) { cancel() }).Return(nil).Once()
err := service.Start()
err = service.Start()
assert.NoError(t, err, "Unexpected error")
assert.True(t, mockRsClient.AssertExpectations(t), "All expectations should be met")
@@ -174,7 +191,7 @@ func TestStart_ErrorOnMessage(t *testing.T) {
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
service := NewService(
service, err := NewService(
ctx,
mockRsClient,
mockKubeManager,
@@ -188,10 +205,12 @@ func TestStart_ErrorOnMessage(t *testing.T) {
s.logger = logger
},
)
require.NoError(t, err)
mockRsClient.On("GetRunnerScaleSetMessage", service.ctx, mock.Anything).Return(nil).Times(2)
mockRsClient.On("GetRunnerScaleSetMessage", service.ctx, mock.Anything).Return(fmt.Errorf("error")).Once()
err := service.Start()
err = service.Start()
assert.ErrorContains(t, err, "could not get and process message. error", "Unexpected error")
assert.True(t, mockRsClient.AssertExpectations(t), "All expectations should be met")
@@ -207,7 +226,7 @@ func TestProcessMessage_NoStatistic(t *testing.T) {
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
service := NewService(
service, err := NewService(
ctx,
mockRsClient,
mockKubeManager,
@@ -221,8 +240,9 @@ func TestProcessMessage_NoStatistic(t *testing.T) {
s.logger = logger
},
)
require.NoError(t, err)
err := service.processMessage(&actions.RunnerScaleSetMessage{
err = service.processMessage(&actions.RunnerScaleSetMessage{
MessageId: 1,
MessageType: "test",
Body: "test",
@@ -242,7 +262,7 @@ func TestProcessMessage_IgnoreUnknownMessageType(t *testing.T) {
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
service := NewService(
service, err := NewService(
ctx,
mockRsClient,
mockKubeManager,
@@ -256,8 +276,9 @@ func TestProcessMessage_IgnoreUnknownMessageType(t *testing.T) {
s.logger = logger
},
)
require.NoError(t, err)
err := service.processMessage(&actions.RunnerScaleSetMessage{
err = service.processMessage(&actions.RunnerScaleSetMessage{
MessageId: 1,
MessageType: "unknown",
Statistics: &actions.RunnerScaleSetStatistic{
@@ -280,7 +301,7 @@ func TestProcessMessage_InvalidBatchMessageJson(t *testing.T) {
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
service := NewService(
service, err := NewService(
ctx,
mockRsClient,
mockKubeManager,
@@ -295,7 +316,9 @@ func TestProcessMessage_InvalidBatchMessageJson(t *testing.T) {
},
)
err := service.processMessage(&actions.RunnerScaleSetMessage{
require.NoError(t, err)
err = service.processMessage(&actions.RunnerScaleSetMessage{
MessageId: 1,
MessageType: "RunnerScaleSetJobMessages",
Statistics: &actions.RunnerScaleSetStatistic{
@@ -318,7 +341,7 @@ func TestProcessMessage_InvalidJobMessageJson(t *testing.T) {
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
service := NewService(
service, err := NewService(
ctx,
mockRsClient,
mockKubeManager,
@@ -332,8 +355,9 @@ func TestProcessMessage_InvalidJobMessageJson(t *testing.T) {
s.logger = logger
},
)
require.NoError(t, err)
err := service.processMessage(&actions.RunnerScaleSetMessage{
err = service.processMessage(&actions.RunnerScaleSetMessage{
MessageId: 1,
MessageType: "RunnerScaleSetJobMessages",
Statistics: &actions.RunnerScaleSetStatistic{
@@ -356,7 +380,7 @@ func TestProcessMessage_MultipleMessages(t *testing.T) {
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
service := NewService(
service, err := NewService(
ctx,
mockRsClient,
mockKubeManager,
@@ -370,10 +394,12 @@ func TestProcessMessage_MultipleMessages(t *testing.T) {
s.logger = logger
},
)
require.NoError(t, err)
mockRsClient.On("AcquireJobsForRunnerScaleSet", ctx, mock.MatchedBy(func(ids []int64) bool { return ids[0] == 3 && ids[1] == 4 })).Return(nil).Once()
mockKubeManager.On("ScaleEphemeralRunnerSet", ctx, service.settings.Namespace, service.settings.ResourceName, 2).Run(func(args mock.Arguments) { cancel() }).Return(nil).Once()
err := service.processMessage(&actions.RunnerScaleSetMessage{
err = service.processMessage(&actions.RunnerScaleSetMessage{
MessageId: 1,
MessageType: "RunnerScaleSetJobMessages",
Statistics: &actions.RunnerScaleSetStatistic{
@@ -397,7 +423,7 @@ func TestProcessMessage_AcquireJobsFailed(t *testing.T) {
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
service := NewService(
service, err := NewService(
ctx,
mockRsClient,
mockKubeManager,
@@ -411,9 +437,11 @@ func TestProcessMessage_AcquireJobsFailed(t *testing.T) {
s.logger = logger
},
)
require.NoError(t, err)
mockRsClient.On("AcquireJobsForRunnerScaleSet", ctx, mock.MatchedBy(func(ids []int64) bool { return ids[0] == 1 })).Return(fmt.Errorf("error")).Once()
err := service.processMessage(&actions.RunnerScaleSetMessage{
err = service.processMessage(&actions.RunnerScaleSetMessage{
MessageId: 1,
MessageType: "RunnerScaleSetJobMessages",
Statistics: &actions.RunnerScaleSetStatistic{
@@ -437,7 +465,7 @@ func TestScaleForAssignedJobCount_DeDupScale(t *testing.T) {
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
service := NewService(
service, err := NewService(
ctx,
mockRsClient,
mockKubeManager,
@@ -451,9 +479,11 @@ func TestScaleForAssignedJobCount_DeDupScale(t *testing.T) {
s.logger = logger
},
)
require.NoError(t, err)
mockKubeManager.On("ScaleEphemeralRunnerSet", ctx, service.settings.Namespace, service.settings.ResourceName, 2).Return(nil).Once()
err := service.scaleForAssignedJobCount(2)
err = service.scaleForAssignedJobCount(2)
require.NoError(t, err, "Unexpected error")
err = service.scaleForAssignedJobCount(2)
require.NoError(t, err, "Unexpected error")
@@ -476,7 +506,7 @@ func TestScaleForAssignedJobCount_ScaleWithinMinMax(t *testing.T) {
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
service := NewService(
service, err := NewService(
ctx,
mockRsClient,
mockKubeManager,
@@ -490,13 +520,15 @@ func TestScaleForAssignedJobCount_ScaleWithinMinMax(t *testing.T) {
s.logger = logger
},
)
require.NoError(t, err)
mockKubeManager.On("ScaleEphemeralRunnerSet", ctx, service.settings.Namespace, service.settings.ResourceName, 1).Return(nil).Once()
mockKubeManager.On("ScaleEphemeralRunnerSet", ctx, service.settings.Namespace, service.settings.ResourceName, 3).Return(nil).Once()
mockKubeManager.On("ScaleEphemeralRunnerSet", ctx, service.settings.Namespace, service.settings.ResourceName, 5).Return(nil).Once()
mockKubeManager.On("ScaleEphemeralRunnerSet", ctx, service.settings.Namespace, service.settings.ResourceName, 1).Return(nil).Once()
mockKubeManager.On("ScaleEphemeralRunnerSet", ctx, service.settings.Namespace, service.settings.ResourceName, 5).Return(nil).Once()
err := service.scaleForAssignedJobCount(0)
err = service.scaleForAssignedJobCount(0)
require.NoError(t, err, "Unexpected error")
err = service.scaleForAssignedJobCount(3)
require.NoError(t, err, "Unexpected error")
@@ -521,7 +553,7 @@ func TestScaleForAssignedJobCount_ScaleFailed(t *testing.T) {
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
service := NewService(
service, err := NewService(
ctx,
mockRsClient,
mockKubeManager,
@@ -535,9 +567,11 @@ func TestScaleForAssignedJobCount_ScaleFailed(t *testing.T) {
s.logger = logger
},
)
require.NoError(t, err)
mockKubeManager.On("ScaleEphemeralRunnerSet", ctx, service.settings.Namespace, service.settings.ResourceName, 2).Return(fmt.Errorf("error"))
err := service.scaleForAssignedJobCount(2)
err = service.scaleForAssignedJobCount(2)
assert.ErrorContains(t, err, "could not scale ephemeral runner set (namespace/resource). error", "Unexpected error")
assert.True(t, mockRsClient.AssertExpectations(t), "All expectations should be met")
@@ -553,7 +587,7 @@ func TestProcessMessage_JobStartedMessage(t *testing.T) {
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
service := NewService(
service, err := NewService(
ctx,
mockRsClient,
mockKubeManager,
@@ -567,12 +601,14 @@ func TestProcessMessage_JobStartedMessage(t *testing.T) {
s.logger = logger
},
)
require.NoError(t, err)
service.currentRunnerCount = 1
mockKubeManager.On("UpdateEphemeralRunnerWithJobInfo", ctx, service.settings.Namespace, "runner1", "owner1", "repo1", ".github/workflows/ci.yaml", "job1", int64(100), int64(3)).Run(func(args mock.Arguments) { cancel() }).Return(nil).Once()
mockRsClient.On("AcquireJobsForRunnerScaleSet", ctx, mock.MatchedBy(func(ids []int64) bool { return len(ids) == 0 })).Return(nil).Once()
err := service.processMessage(&actions.RunnerScaleSetMessage{
err = service.processMessage(&actions.RunnerScaleSetMessage{
MessageId: 1,
MessageType: "RunnerScaleSetJobMessages",
Statistics: &actions.RunnerScaleSetStatistic{
@@ -596,7 +632,7 @@ func TestProcessMessage_JobStartedMessageIgnoreRunnerUpdateError(t *testing.T) {
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
service := NewService(
service, err := NewService(
ctx,
mockRsClient,
mockKubeManager,
@@ -610,12 +646,14 @@ func TestProcessMessage_JobStartedMessageIgnoreRunnerUpdateError(t *testing.T) {
s.logger = logger
},
)
require.NoError(t, err)
service.currentRunnerCount = 1
mockKubeManager.On("UpdateEphemeralRunnerWithJobInfo", ctx, service.settings.Namespace, "runner1", "owner1", "repo1", ".github/workflows/ci.yaml", "job1", int64(100), int64(3)).Run(func(args mock.Arguments) { cancel() }).Return(fmt.Errorf("error")).Once()
mockRsClient.On("AcquireJobsForRunnerScaleSet", ctx, mock.MatchedBy(func(ids []int64) bool { return len(ids) == 0 })).Return(nil).Once()
err := service.processMessage(&actions.RunnerScaleSetMessage{
err = service.processMessage(&actions.RunnerScaleSetMessage{
MessageId: 1,
MessageType: "RunnerScaleSetJobMessages",
Statistics: &actions.RunnerScaleSetStatistic{

View File

@@ -25,13 +25,17 @@ import (
"os"
"os/signal"
"syscall"
"time"
"github.com/actions/actions-runner-controller/build"
"github.com/actions/actions-runner-controller/github/actions"
"github.com/actions/actions-runner-controller/logging"
"github.com/go-logr/logr"
"github.com/kelseyhightower/envconfig"
"github.com/prometheus/client_golang/prometheus"
"github.com/prometheus/client_golang/prometheus/promhttp"
"golang.org/x/net/http/httpproxy"
"golang.org/x/sync/errgroup"
)
type RunnerScaleSetListenerConfig struct {
@@ -45,19 +49,34 @@ type RunnerScaleSetListenerConfig struct {
MaxRunners int `split_words:"true"`
MinRunners int `split_words:"true"`
RunnerScaleSetId int `split_words:"true"`
RunnerScaleSetName string `split_words:"true"`
ServerRootCA string `split_words:"true"`
LogLevel string `split_words:"true"`
LogFormat string `split_words:"true"`
MetricsAddr string `split_words:"true"`
MetricsEndpoint string `split_words:"true"`
}
func main() {
logger, err := logging.NewLogger(logging.LogLevelDebug, logging.LogFormatText)
if err != nil {
fmt.Fprintf(os.Stderr, "Error: creating logger: %v\n", err)
var rc RunnerScaleSetListenerConfig
if err := envconfig.Process("github", &rc); err != nil {
fmt.Fprintf(os.Stderr, "Error: processing environment variables for RunnerScaleSetListenerConfig: %v\n", err)
os.Exit(1)
}
var rc RunnerScaleSetListenerConfig
if err := envconfig.Process("github", &rc); err != nil {
logger.Error(err, "Error: processing environment variables for RunnerScaleSetListenerConfig")
logLevel := string(logging.LogLevelDebug)
if rc.LogLevel != "" {
logLevel = rc.LogLevel
}
logFormat := string(logging.LogFormatText)
if rc.LogFormat != "" {
logFormat = rc.LogFormat
}
logger, err := logging.NewLogger(logLevel, logFormat)
if err != nil {
fmt.Fprintf(os.Stderr, "Error: creating logger: %v\n", err)
os.Exit(1)
}
@@ -67,17 +86,95 @@ func main() {
os.Exit(1)
}
if err := run(rc, logger); err != nil {
logger.Error(err, "Run error")
ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGINT, syscall.SIGTERM)
defer stop()
g, ctx := errgroup.WithContext(ctx)
g.Go(func() error {
opts := runOptions{
serviceOptions: []func(*Service){
WithLogger(logger),
},
}
opts.serviceOptions = append(opts.serviceOptions, WithPrometheusMetrics(rc))
return run(ctx, rc, logger, opts)
})
if len(rc.MetricsAddr) != 0 {
g.Go(func() error {
metricsServer := metricsServer{
rc: rc,
logger: logger,
}
g.Go(func() error {
<-ctx.Done()
return metricsServer.shutdown()
})
return metricsServer.listenAndServe()
})
}
if err := g.Wait(); err != nil {
logger.Error(err, "Error encountered")
os.Exit(1)
}
}
func run(rc RunnerScaleSetListenerConfig, logger logr.Logger) error {
// Create root context and hook with sigint and sigterm
ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGINT, syscall.SIGTERM)
defer stop()
type metricsServer struct {
rc RunnerScaleSetListenerConfig
logger logr.Logger
srv *http.Server
}
func (s *metricsServer) shutdown() error {
ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
defer cancel()
return s.srv.Shutdown(ctx)
}
func (s *metricsServer) listenAndServe() error {
reg := prometheus.NewRegistry()
reg.MustRegister(
// availableJobs,
// acquiredJobs,
assignedJobs,
runningJobs,
registeredRunners,
busyRunners,
minRunners,
maxRunners,
desiredRunners,
idleRunners,
startedJobsTotal,
completedJobsTotal,
// jobQueueDurationSeconds,
jobStartupDurationSeconds,
jobExecutionDurationSeconds,
)
mux := http.NewServeMux()
mux.Handle(
s.rc.MetricsEndpoint,
promhttp.HandlerFor(reg, promhttp.HandlerOpts{Registry: reg}),
)
s.srv = &http.Server{
Addr: s.rc.MetricsAddr,
Handler: mux,
}
s.logger.Info("Starting metrics server", "address", s.srv.Addr)
return s.srv.ListenAndServe()
}
type runOptions struct {
serviceOptions []func(*Service)
}
func run(ctx context.Context, rc RunnerScaleSetListenerConfig, logger logr.Logger, opts runOptions) error {
// Create root context and hook with sigint and sigterm
creds := &actions.ActionsAuth{}
if rc.Token != "" {
creds.Token = rc.Token
@@ -119,9 +216,10 @@ func run(rc RunnerScaleSetListenerConfig, logger logr.Logger) error {
MinRunners: rc.MinRunners,
}
service := NewService(ctx, autoScalerClient, kubeManager, scaleSettings, func(s *Service) {
s.logger = logger.WithName("service")
})
service, err := NewService(ctx, autoScalerClient, kubeManager, scaleSettings, opts.serviceOptions...)
if err != nil {
return fmt.Errorf("failed to create new service: %v", err)
}
// Start listening for messages
if err = service.Start(); err != nil {

View File

@@ -0,0 +1,330 @@
package main
import (
"strconv"
"github.com/actions/actions-runner-controller/github/actions"
"github.com/prometheus/client_golang/prometheus"
)
// label names
const (
labelKeyRunnerScaleSetName = "name"
labelKeyRunnerScaleSetNamespace = "namespace"
labelKeyEnterprise = "enterprise"
labelKeyOrganization = "organization"
labelKeyRepository = "repository"
labelKeyJobName = "job_name"
labelKeyJobWorkflowRef = "job_workflow_ref"
labelKeyEventName = "event_name"
labelKeyJobResult = "job_result"
labelKeyRunnerID = "runner_id"
labelKeyRunnerName = "runner_name"
)
const githubScaleSetSubsystem = "gha"
// labels
var (
scaleSetLabels = []string{
labelKeyRunnerScaleSetName,
labelKeyRepository,
labelKeyOrganization,
labelKeyEnterprise,
labelKeyRunnerScaleSetNamespace,
}
jobLabels = []string{
labelKeyRepository,
labelKeyOrganization,
labelKeyEnterprise,
labelKeyJobName,
labelKeyJobWorkflowRef,
labelKeyEventName,
}
completedJobsTotalLabels = append(jobLabels, labelKeyJobResult, labelKeyRunnerID, labelKeyRunnerName)
jobExecutionDurationLabels = append(jobLabels, labelKeyJobResult, labelKeyRunnerID, labelKeyRunnerName)
startedJobsTotalLabels = append(jobLabels, labelKeyRunnerID, labelKeyRunnerName)
jobStartupDurationLabels = append(jobLabels, labelKeyRunnerID, labelKeyRunnerName)
)
// metrics
var (
// availableJobs = prometheus.NewGaugeVec(
// prometheus.GaugeOpts{
// Subsystem: githubScaleSetSubsystem,
// Name: "available_jobs",
// Help: "Number of jobs with `runs-on` matching the runner scale set name. Jobs are not yet assigned to the runner scale set.",
// },
// scaleSetLabels,
// )
//
// acquiredJobs = prometheus.NewGaugeVec(
// prometheus.GaugeOpts{
// Subsystem: githubScaleSetSubsystem,
// Name: "acquired_jobs",
// Help: "Number of jobs acquired by the scale set.",
// },
// scaleSetLabels,
// )
assignedJobs = prometheus.NewGaugeVec(
prometheus.GaugeOpts{
Subsystem: githubScaleSetSubsystem,
Name: "assigned_jobs",
Help: "Number of jobs assigned to this scale set.",
},
scaleSetLabels,
)
runningJobs = prometheus.NewGaugeVec(
prometheus.GaugeOpts{
Subsystem: githubScaleSetSubsystem,
Name: "running_jobs",
Help: "Number of jobs running (or about to be run).",
},
scaleSetLabels,
)
registeredRunners = prometheus.NewGaugeVec(
prometheus.GaugeOpts{
Subsystem: githubScaleSetSubsystem,
Name: "registered_runners",
Help: "Number of runners registered by the scale set.",
},
scaleSetLabels,
)
busyRunners = prometheus.NewGaugeVec(
prometheus.GaugeOpts{
Subsystem: githubScaleSetSubsystem,
Name: "busy_runners",
Help: "Number of registered runners running a job.",
},
scaleSetLabels,
)
minRunners = prometheus.NewGaugeVec(
prometheus.GaugeOpts{
Subsystem: githubScaleSetSubsystem,
Name: "min_runners",
Help: "Minimum number of runners.",
},
scaleSetLabels,
)
maxRunners = prometheus.NewGaugeVec(
prometheus.GaugeOpts{
Subsystem: githubScaleSetSubsystem,
Name: "max_runners",
Help: "Maximum number of runners.",
},
scaleSetLabels,
)
desiredRunners = prometheus.NewGaugeVec(
prometheus.GaugeOpts{
Subsystem: githubScaleSetSubsystem,
Name: "desired_runners",
Help: "Number of runners desired by the scale set.",
},
scaleSetLabels,
)
idleRunners = prometheus.NewGaugeVec(
prometheus.GaugeOpts{
Subsystem: githubScaleSetSubsystem,
Name: "idle_runners",
Help: "Number of registered runners not running a job.",
},
scaleSetLabels,
)
startedJobsTotal = prometheus.NewCounterVec(
prometheus.CounterOpts{
Subsystem: githubScaleSetSubsystem,
Name: "started_jobs_total",
Help: "Total number of jobs started.",
},
startedJobsTotalLabels,
)
completedJobsTotal = prometheus.NewCounterVec(
prometheus.CounterOpts{
Name: "completed_jobs_total",
Help: "Total number of jobs completed.",
Subsystem: githubScaleSetSubsystem,
},
completedJobsTotalLabels,
)
// jobQueueDurationSeconds = prometheus.NewHistogramVec(
// prometheus.HistogramOpts{
// Subsystem: githubScaleSetSubsystem,
// Name: "job_queue_duration_seconds",
// Help: "Time spent waiting for workflow jobs to get assigned to the scale set after queueing (in seconds).",
// Buckets: runtimeBuckets,
// },
// jobLabels,
// )
jobStartupDurationSeconds = prometheus.NewHistogramVec(
prometheus.HistogramOpts{
Subsystem: githubScaleSetSubsystem,
Name: "job_startup_duration_seconds",
Help: "Time spent waiting for workflow job to get started on the runner owned by the scale set (in seconds).",
Buckets: runtimeBuckets,
},
jobStartupDurationLabels,
)
jobExecutionDurationSeconds = prometheus.NewHistogramVec(
prometheus.HistogramOpts{
Subsystem: githubScaleSetSubsystem,
Name: "job_execution_duration_seconds",
Help: "Time spent executing workflow jobs by the scale set (in seconds).",
Buckets: runtimeBuckets,
},
jobExecutionDurationLabels,
)
)
var runtimeBuckets []float64 = []float64{
0.01,
0.05,
0.1,
0.5,
1,
2,
3,
4,
5,
6,
7,
8,
9,
10,
12,
15,
18,
20,
25,
30,
40,
50,
60,
70,
80,
90,
100,
110,
120,
150,
180,
210,
240,
300,
360,
420,
480,
540,
600,
900,
1200,
1800,
2400,
3000,
3600,
}
type metricsExporter struct {
// Initialized during creation.
baseLabels
}
type baseLabels struct {
scaleSetName string
scaleSetNamespace string
enterprise string
organization string
repository string
}
func (b *baseLabels) jobLabels(jobBase *actions.JobMessageBase) prometheus.Labels {
return prometheus.Labels{
labelKeyEnterprise: b.enterprise,
labelKeyOrganization: b.organization,
labelKeyRepository: b.repository,
labelKeyJobName: jobBase.JobDisplayName,
labelKeyJobWorkflowRef: jobBase.JobWorkflowRef,
labelKeyEventName: jobBase.EventName,
}
}
func (b *baseLabels) scaleSetLabels() prometheus.Labels {
return prometheus.Labels{
labelKeyRunnerScaleSetName: b.scaleSetName,
labelKeyRunnerScaleSetNamespace: b.scaleSetNamespace,
labelKeyEnterprise: b.enterprise,
labelKeyOrganization: b.organization,
labelKeyRepository: b.repository,
}
}
func (b *baseLabels) completedJobLabels(msg *actions.JobCompleted) prometheus.Labels {
l := b.jobLabels(&msg.JobMessageBase)
l[labelKeyRunnerID] = strconv.Itoa(msg.RunnerId)
l[labelKeyJobResult] = msg.Result
l[labelKeyRunnerName] = msg.RunnerName
return l
}
func (b *baseLabels) startedJobLabels(msg *actions.JobStarted) prometheus.Labels {
l := b.jobLabels(&msg.JobMessageBase)
l[labelKeyRunnerID] = strconv.Itoa(msg.RunnerId)
l[labelKeyRunnerName] = msg.RunnerName
return l
}
func (m *metricsExporter) withBaseLabels(base baseLabels) {
m.baseLabels = base
}
func (m *metricsExporter) publishStatistics(stats *actions.RunnerScaleSetStatistic) {
l := m.scaleSetLabels()
// availableJobs.With(l).Set(float64(stats.TotalAvailableJobs))
// acquiredJobs.With(l).Set(float64(stats.TotalAcquiredJobs))
assignedJobs.With(l).Set(float64(stats.TotalAssignedJobs))
runningJobs.With(l).Set(float64(stats.TotalRunningJobs))
registeredRunners.With(l).Set(float64(stats.TotalRegisteredRunners))
busyRunners.With(l).Set(float64(stats.TotalBusyRunners))
idleRunners.With(l).Set(float64(stats.TotalIdleRunners))
}
func (m *metricsExporter) publishJobStarted(msg *actions.JobStarted) {
l := m.startedJobLabels(msg)
startedJobsTotal.With(l).Inc()
startupDuration := msg.JobMessageBase.RunnerAssignTime.Unix() - msg.JobMessageBase.ScaleSetAssignTime.Unix()
jobStartupDurationSeconds.With(l).Observe(float64(startupDuration))
}
// func (m *metricsExporter) publishJobAssigned(msg *actions.JobAssigned) {
// l := m.jobLabels(&msg.JobMessageBase)
// queueDuration := msg.JobMessageBase.ScaleSetAssignTime.Unix() - msg.JobMessageBase.QueueTime.Unix()
// jobQueueDurationSeconds.With(l).Observe(float64(queueDuration))
// }
func (m *metricsExporter) publishJobCompleted(msg *actions.JobCompleted) {
l := m.completedJobLabels(msg)
completedJobsTotal.With(l).Inc()
executionDuration := msg.JobMessageBase.FinishTime.Unix() - msg.JobMessageBase.RunnerAssignTime.Unix()
jobExecutionDurationSeconds.With(l).Observe(float64(executionDuration))
}
func (m *metricsExporter) publishDesiredRunners(count int) {
desiredRunners.With(m.scaleSetLabels()).Set(float64(count))
}

View File

@@ -124,7 +124,7 @@ func main() {
if watchNamespace == "" {
logger.Info("-watch-namespace is empty. HorizontalRunnerAutoscalers in all the namespaces are watched, cached, and considered as scale targets.")
} else {
logger.Info("-watch-namespace is %q. Only HorizontalRunnerAutoscalers in %q are watched, cached, and considered as scale targets.", watchNamespace, watchNamespace)
logger.Info(fmt.Sprintf("-watch-namespace is %q. Only HorizontalRunnerAutoscalers in %q are watched, cached, and considered as scale targets.", watchNamespace, watchNamespace))
}
ctrl.SetLogger(logger)

View File

@@ -113,7 +113,7 @@ spec:
description: ScaleDownDelaySecondsAfterScaleUp is the approximate delay for a scale down followed by a scale up Used to prevent flapping (down->up->down->... loop)
type: integer
scaleTargetRef:
description: ScaleTargetRef sis the reference to scaled resource like RunnerDeployment
description: ScaleTargetRef is the reference to scaled resource like RunnerDeployment
properties:
kind:
description: Kind is the type of resource being referenced

View File

@@ -1497,6 +1497,12 @@ spec:
type: integer
dockerRegistryMirror:
type: string
dockerVarRunVolumeSizeLimit:
anyOf:
- type: integer
- type: string
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
dockerVolumeMounts:
items:
description: VolumeMount describes a mounting of a Volume within a container.

View File

@@ -1479,6 +1479,12 @@ spec:
type: integer
dockerRegistryMirror:
type: string
dockerVarRunVolumeSizeLimit:
anyOf:
- type: integer
- type: string
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
dockerVolumeMounts:
items:
description: VolumeMount describes a mounting of a Volume within a container.

View File

@@ -1432,6 +1432,12 @@ spec:
type: integer
dockerRegistryMirror:
type: string
dockerVarRunVolumeSizeLimit:
anyOf:
- type: integer
- type: string
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
dockerVolumeMounts:
items:
description: VolumeMount describes a mounting of a Volume within a container.

View File

@@ -55,6 +55,12 @@ spec:
type: integer
dockerRegistryMirror:
type: string
dockerVarRunVolumeSizeLimit:
anyOf:
- type: integer
- type: string
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
dockerdWithinRunnerContainer:
type: boolean
effectiveTime:

View File

@@ -5,7 +5,7 @@ kind: Kustomization
images:
- name: controller
newName: summerwind/actions-runner-controller
newTag: dev
newTag: latest
replacements:
- path: env-replacement.yaml

View File

@@ -31,6 +31,6 @@ All additional docs are kept in the `docs/` folder, this README is solely for do
| `autoscaler.enabled` | Enable the HorizontalRunnerAutoscaler, if its enabled then replica count will not be used | true |
| `autoscaler.minReplicas` | Minimum no of replicas | 1 |
| `autoscaler.maxReplicas` | Maximum no of replicas | 5 |
| `autoscaler.scaleDownDelaySecondsAfterScaleOut` | [Anti-Flapping Configuration](https://github.com/actions/actions-runner-controller#anti-flapping-configuration) | 120 |
| `autoscaler.metrics` | [Pull driven scaling](https://github.com/actions/actions-runner-controller#pull-driven-scaling) | default |
| `autoscaler.scaleUpTriggers` | [Webhook driven scaling](https://github.com/actions/actions-runner-controller#webhook-driven-scaling) | |
| `autoscaler.scaleDownDelaySecondsAfterScaleOut` | [Anti-Flapping Configuration](https://github.com/actions/actions-runner-controller/blob/master/docs/automatically-scaling-runners.md#anti-flapping-configuration) | 120 |
| `autoscaler.metrics` | [Pull driven scaling](https://github.com/actions/actions-runner-controller/blob/master/docs/automatically-scaling-runners.md#pull-driven-scaling) | default |
| `autoscaler.scaleUpTriggers` | [Webhook driven scaling](https://github.com/actions/actions-runner-controller/blob/master/docs/automatically-scaling-runners.md#webhook-driven-scaling) | |

View File

@@ -17,7 +17,7 @@ runnerLabels:
replicaCount: 1
# The Runner Group that the runner(s) should be associated with.
# See https://docs.github.com/en/github-ae@latest/actions/hosting-your-own-runners/managing-access-to-self-hosted-runners-using-groups.
# See https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners/managing-access-to-self-hosted-runners-using-groups.
group: Default
autoscaler:

View File

@@ -33,6 +33,8 @@ import (
"sigs.k8s.io/controller-runtime/pkg/source"
v1alpha1 "github.com/actions/actions-runner-controller/apis/actions.github.com/v1alpha1"
"github.com/actions/actions-runner-controller/controllers/actions.github.com/metrics"
"github.com/actions/actions-runner-controller/github/actions"
hash "github.com/actions/actions-runner-controller/hash"
corev1 "k8s.io/api/core/v1"
rbacv1 "k8s.io/api/rbac/v1"
@@ -49,6 +51,10 @@ type AutoscalingListenerReconciler struct {
client.Client
Log logr.Logger
Scheme *runtime.Scheme
// ListenerMetricsAddr is address that the metrics endpoint binds to.
// If it is set to "0", the metrics server is not started.
ListenerMetricsAddr string
ListenerMetricsEndpoint string
resourceBuilder resourceBuilder
}
@@ -227,6 +233,11 @@ func (r *AutoscalingListenerReconciler) Reconcile(ctx context.Context, req ctrl.
return ctrl.Result{}, err
}
if err := r.publishRunningListener(autoscalingListener, false); err != nil {
// If publish fails, URL is incorrect which means the listener pod would never be able to start
return ctrl.Result{}, nil
}
// Create a listener pod in the controller namespace
log.Info("Creating a listener pod")
return r.createListenerPod(ctx, &autoscalingRunnerSet, autoscalingListener, serviceAccount, mirrorSecret, log)
@@ -242,6 +253,16 @@ func (r *AutoscalingListenerReconciler) Reconcile(ctx context.Context, req ctrl.
}
}
if listenerPod.Status.Phase == corev1.PodRunning {
if err := r.publishRunningListener(autoscalingListener, true); err != nil {
log.Error(err, "Unable to publish running listener", "namespace", listenerPod.Namespace, "name", listenerPod.Name)
// stop reconciling. We should never get to this point but if we do,
// listener won't be able to start up, and the crash from the pod should
// notify the reconciler again.
return ctrl.Result{}, nil
}
}
return ctrl.Result{}, nil
}
@@ -260,6 +281,9 @@ func (r *AutoscalingListenerReconciler) cleanupResources(ctx context.Context, au
return false, nil
case err != nil && !kerrors.IsNotFound(err):
return false, fmt.Errorf("failed to get listener pods: %v", err)
default: // NOT FOUND
_ = r.publishRunningListener(autoscalingListener, false) // If error is returned, we never published metrics so it is safe to ignore
}
logger.Info("Listener pod is deleted")
@@ -371,9 +395,22 @@ func (r *AutoscalingListenerReconciler) createListenerPod(ctx context.Context, a
envs = append(envs, env)
}
newPod := r.resourceBuilder.newScaleSetListenerPod(autoscalingListener, serviceAccount, secret, envs...)
var metricsConfig *listenerMetricsServerConfig
if r.ListenerMetricsAddr != "0" {
metricsConfig = &listenerMetricsServerConfig{
addr: r.ListenerMetricsAddr,
endpoint: r.ListenerMetricsEndpoint,
}
}
newPod, err := r.resourceBuilder.newScaleSetListenerPod(autoscalingListener, serviceAccount, secret, metricsConfig, envs...)
if err != nil {
logger.Error(err, "Failed to build listener pod")
return ctrl.Result{}, err
}
if err := ctrl.SetControllerReference(autoscalingListener, newPod, r.Scheme); err != nil {
logger.Error(err, "Failed to set controller reference")
return ctrl.Result{}, err
}
@@ -556,6 +593,30 @@ func (r *AutoscalingListenerReconciler) createRoleBindingForListener(ctx context
return ctrl.Result{Requeue: true}, nil
}
func (r *AutoscalingListenerReconciler) publishRunningListener(autoscalingListener *v1alpha1.AutoscalingListener, isUp bool) error {
githubConfigURL := autoscalingListener.Spec.GitHubConfigUrl
parsedURL, err := actions.ParseGitHubConfigFromURL(githubConfigURL)
if err != nil {
return err
}
commonLabels := metrics.CommonLabels{
Name: autoscalingListener.Name,
Namespace: autoscalingListener.Namespace,
Repository: parsedURL.Repository,
Organization: parsedURL.Organization,
Enterprise: parsedURL.Enterprise,
}
if isUp {
metrics.AddRunningListener(commonLabels)
} else {
metrics.SubRunningListener(commonLabels)
}
return nil
}
// SetupWithManager sets up the controller with the Manager.
func (r *AutoscalingListenerReconciler) SetupWithManager(mgr ctrl.Manager) error {
groupVersionIndexer := func(rawObj client.Object) []string {

View File

@@ -24,6 +24,7 @@ import (
"strings"
"github.com/actions/actions-runner-controller/apis/actions.github.com/v1alpha1"
"github.com/actions/actions-runner-controller/build"
"github.com/actions/actions-runner-controller/github/actions"
"github.com/go-logr/logr"
corev1 "k8s.io/api/core/v1"
@@ -48,6 +49,24 @@ const (
runnerScaleSetNameAnnotationKey = "runner-scale-set-name"
)
type UpdateStrategy string
// Defines how the controller should handle upgrades while having running jobs.
const (
// "immediate": (default) The controller will immediately apply the change causing the
// recreation of the listener and ephemeral runner set. This can lead to an
// overprovisioning of runners, if there are pending / running jobs. This should not
// be a problem at a small scale, but it could lead to a significant increase of
// resources if you have a lot of jobs running concurrently.
UpdateStrategyImmediate = UpdateStrategy("immediate")
// "eventual": The controller will remove the listener and ephemeral runner set
// immediately, but will not recreate them (to apply changes) until all
// pending / running jobs have completed.
// This can lead to a longer time to apply the change but it will ensure
// that you don't have any overprovisioning of runners.
UpdateStrategyEventual = UpdateStrategy("eventual")
)
// AutoscalingRunnerSetReconciler reconciles a AutoscalingRunnerSet object
type AutoscalingRunnerSetReconciler struct {
client.Client
@@ -56,6 +75,7 @@ type AutoscalingRunnerSetReconciler struct {
ControllerNamespace string
DefaultRunnerScaleSetListenerImage string
DefaultRunnerScaleSetListenerImagePullSecrets []string
UpdateStrategy UpdateStrategy
ActionsClient actions.MultiClient
resourceBuilder resourceBuilder
@@ -136,6 +156,22 @@ func (r *AutoscalingRunnerSetReconciler) Reconcile(ctx context.Context, req ctrl
return ctrl.Result{}, nil
}
if autoscalingRunnerSet.Labels[LabelKeyKubernetesVersion] != build.Version {
if err := r.Delete(ctx, autoscalingRunnerSet); err != nil {
log.Error(err, "Failed to delete autoscaling runner set on version mismatch",
"targetVersion", build.Version,
"actualVersion", autoscalingRunnerSet.Labels[LabelKeyKubernetesVersion],
)
return ctrl.Result{}, nil
}
log.Info("Autoscaling runner set version doesn't match the build version. Deleting the resource.",
"targetVersion", build.Version,
"actualVersion", autoscalingRunnerSet.Labels[LabelKeyKubernetesVersion],
)
return ctrl.Result{}, nil
}
if !controllerutil.ContainsFinalizer(autoscalingRunnerSet, autoscalingRunnerSetFinalizerName) {
log.Info("Adding finalizer")
if err := patch(ctx, r.Client, autoscalingRunnerSet, func(obj *v1alpha1.AutoscalingRunnerSet) {
@@ -201,7 +237,48 @@ func (r *AutoscalingRunnerSetReconciler) Reconcile(ctx context.Context, req ctrl
log.Info("Find existing ephemeral runner set", "name", runnerSet.Name, "specHash", runnerSet.Labels[labelKeyRunnerSpecHash])
}
// Make sure the AutoscalingListener is up and running in the controller namespace
listener := new(v1alpha1.AutoscalingListener)
listenerFound := true
if err := r.Get(ctx, client.ObjectKey{Namespace: r.ControllerNamespace, Name: scaleSetListenerName(autoscalingRunnerSet)}, listener); err != nil {
if !kerrors.IsNotFound(err) {
log.Error(err, "Failed to get AutoscalingListener resource")
return ctrl.Result{}, err
}
listenerFound = false
log.Info("AutoscalingListener does not exist.")
}
// Our listener pod is out of date, so we need to delete it to get a new recreate.
if listenerFound && (listener.Labels[labelKeyRunnerSpecHash] != autoscalingRunnerSet.ListenerSpecHash()) {
log.Info("RunnerScaleSetListener is out of date. Deleting it so that it is recreated", "name", listener.Name)
if err := r.Delete(ctx, listener); err != nil {
if kerrors.IsNotFound(err) {
return ctrl.Result{}, nil
}
log.Error(err, "Failed to delete AutoscalingListener resource")
return ctrl.Result{}, err
}
log.Info("Deleted RunnerScaleSetListener since existing one is out of date")
return ctrl.Result{}, nil
}
if desiredSpecHash != latestRunnerSet.Labels[labelKeyRunnerSpecHash] {
if r.drainingJobs(&latestRunnerSet.Status) {
log.Info("Latest runner set spec hash does not match the current autoscaling runner set. Waiting for the running and pending runners to finish:", "running", latestRunnerSet.Status.RunningEphemeralRunners, "pending", latestRunnerSet.Status.PendingEphemeralRunners)
log.Info("Scaling down the number of desired replicas to 0")
// We are in the process of draining the jobs. The listener has been deleted and the ephemeral runner set replicas
// need to scale down to 0
err := patch(ctx, r.Client, latestRunnerSet, func(obj *v1alpha1.EphemeralRunnerSet) {
obj.Spec.Replicas = 0
})
if err != nil {
log.Error(err, "Failed to patch runner set to set desired count to 0")
}
return ctrl.Result{}, err
}
log.Info("Latest runner set spec hash does not match the current autoscaling runner set. Creating a new runner set")
return r.createEphemeralRunnerSet(ctx, autoscalingRunnerSet, log)
}
@@ -217,30 +294,13 @@ func (r *AutoscalingRunnerSetReconciler) Reconcile(ctx context.Context, req ctrl
}
// Make sure the AutoscalingListener is up and running in the controller namespace
listener := new(v1alpha1.AutoscalingListener)
if err := r.Get(ctx, client.ObjectKey{Namespace: r.ControllerNamespace, Name: scaleSetListenerName(autoscalingRunnerSet)}, listener); err != nil {
if kerrors.IsNotFound(err) {
// We don't have a listener
log.Info("Creating a new AutoscalingListener for the runner set", "ephemeralRunnerSetName", latestRunnerSet.Name)
return r.createAutoScalingListenerForRunnerSet(ctx, autoscalingRunnerSet, latestRunnerSet, log)
if !listenerFound {
if r.drainingJobs(&latestRunnerSet.Status) {
log.Info("Creating a new AutoscalingListener is waiting for the running and pending runners to finish. Waiting for the running and pending runners to finish:", "running", latestRunnerSet.Status.RunningEphemeralRunners, "pending", latestRunnerSet.Status.PendingEphemeralRunners)
return ctrl.Result{}, nil
}
log.Error(err, "Failed to get AutoscalingListener resource")
return ctrl.Result{}, err
}
// Our listener pod is out of date, so we need to delete it to get a new recreate.
if listener.Labels[labelKeyRunnerSpecHash] != autoscalingRunnerSet.ListenerSpecHash() {
log.Info("RunnerScaleSetListener is out of date. Deleting it so that it is recreated", "name", listener.Name)
if err := r.Delete(ctx, listener); err != nil {
if kerrors.IsNotFound(err) {
return ctrl.Result{}, nil
}
log.Error(err, "Failed to delete AutoscalingListener resource")
return ctrl.Result{}, err
}
log.Info("Deleted RunnerScaleSetListener since existing one is out of date")
return ctrl.Result{}, nil
log.Info("Creating a new AutoscalingListener for the runner set", "ephemeralRunnerSetName", latestRunnerSet.Name)
return r.createAutoScalingListenerForRunnerSet(ctx, autoscalingRunnerSet, latestRunnerSet, log)
}
// Update the status of autoscaling runner set.
@@ -259,6 +319,16 @@ func (r *AutoscalingRunnerSetReconciler) Reconcile(ctx context.Context, req ctrl
return ctrl.Result{}, nil
}
// Prevents overprovisioning of runners.
// We reach this code path when runner scale set has been patched with a new runner spec but there are still running ephemeral runners.
// The safest approach is to wait for the running ephemeral runners to finish before creating a new runner set.
func (r *AutoscalingRunnerSetReconciler) drainingJobs(latestRunnerSetStatus *v1alpha1.EphemeralRunnerSetStatus) bool {
if r.UpdateStrategy == UpdateStrategyEventual && ((latestRunnerSetStatus.RunningEphemeralRunners + latestRunnerSetStatus.PendingEphemeralRunners) > 0) {
return true
}
return false
}
func (r *AutoscalingRunnerSetReconciler) cleanupListener(ctx context.Context, autoscalingRunnerSet *v1alpha1.AutoscalingRunnerSet, logger logr.Logger) (done bool, err error) {
logger.Info("Cleaning up the listener")
var listener v1alpha1.AutoscalingListener

View File

@@ -27,6 +27,7 @@ import (
"k8s.io/apimachinery/pkg/types"
"github.com/actions/actions-runner-controller/apis/actions.github.com/v1alpha1"
"github.com/actions/actions-runner-controller/build"
"github.com/actions/actions-runner-controller/github/actions"
"github.com/actions/actions-runner-controller/github/actions/fake"
"github.com/actions/actions-runner-controller/github/actions/testserver"
@@ -38,19 +39,32 @@ const (
autoscalingRunnerSetTestGitHubToken = "gh_token"
)
var _ = Describe("Test AutoScalingRunnerSet controller", func() {
var _ = Describe("Test AutoScalingRunnerSet controller", Ordered, func() {
var ctx context.Context
var mgr ctrl.Manager
var controller *AutoscalingRunnerSetReconciler
var autoscalingNS *corev1.Namespace
var autoscalingRunnerSet *v1alpha1.AutoscalingRunnerSet
var configSecret *corev1.Secret
var originalBuildVersion string
buildVersion := "0.1.0"
BeforeAll(func() {
originalBuildVersion = build.Version
build.Version = buildVersion
})
AfterAll(func() {
build.Version = originalBuildVersion
})
BeforeEach(func() {
ctx = context.Background()
autoscalingNS, mgr = createNamespace(GinkgoT(), k8sClient)
configSecret = createDefaultSecret(GinkgoT(), k8sClient, autoscalingNS.Name)
controller := &AutoscalingRunnerSetReconciler{
controller = &AutoscalingRunnerSetReconciler{
Client: mgr.GetClient(),
Scheme: mgr.GetScheme(),
Log: logf.Log,
@@ -67,6 +81,9 @@ var _ = Describe("Test AutoScalingRunnerSet controller", func() {
ObjectMeta: metav1.ObjectMeta{
Name: "test-asrs",
Namespace: autoscalingNS.Name,
Labels: map[string]string{
LabelKeyKubernetesVersion: buildVersion,
},
},
Spec: v1alpha1.AutoscalingRunnerSetSpec{
GitHubConfigUrl: "https://github.com/owner/repo",
@@ -408,6 +425,110 @@ var _ = Describe("Test AutoScalingRunnerSet controller", func() {
})
})
Context("When updating an AutoscalingRunnerSet with running or pending jobs", func() {
It("It should wait for running and pending jobs to finish before applying the update. Update Strategy is set to eventual.", func() {
// Switch update strategy to eventual (drain jobs )
controller.UpdateStrategy = UpdateStrategyEventual
// Wait till the listener is created
listener := new(v1alpha1.AutoscalingListener)
Eventually(
func() error {
return k8sClient.Get(ctx, client.ObjectKey{Name: scaleSetListenerName(autoscalingRunnerSet), Namespace: autoscalingRunnerSet.Namespace}, listener)
},
autoscalingRunnerSetTestTimeout,
autoscalingRunnerSetTestInterval,
).Should(Succeed(), "Listener should be created")
// Wait till the ephemeral runner set is created
Eventually(
func() (int, error) {
runnerSetList := new(v1alpha1.EphemeralRunnerSetList)
err := k8sClient.List(ctx, runnerSetList, client.InNamespace(autoscalingRunnerSet.Namespace))
if err != nil {
return 0, err
}
return len(runnerSetList.Items), nil
},
autoscalingRunnerSetTestTimeout,
autoscalingRunnerSetTestInterval,
).Should(BeEquivalentTo(1), "Only one EphemeralRunnerSet should be created")
runnerSetList := new(v1alpha1.EphemeralRunnerSetList)
err := k8sClient.List(ctx, runnerSetList, client.InNamespace(autoscalingRunnerSet.Namespace))
Expect(err).NotTo(HaveOccurred(), "failed to list EphemeralRunnerSet")
// Emulate running and pending jobs
runnerSet := runnerSetList.Items[0]
activeRunnerSet := runnerSet.DeepCopy()
activeRunnerSet.Status.CurrentReplicas = 6
activeRunnerSet.Status.FailedEphemeralRunners = 1
activeRunnerSet.Status.RunningEphemeralRunners = 2
activeRunnerSet.Status.PendingEphemeralRunners = 3
desiredStatus := v1alpha1.AutoscalingRunnerSetStatus{
CurrentRunners: activeRunnerSet.Status.CurrentReplicas,
State: "",
PendingEphemeralRunners: activeRunnerSet.Status.PendingEphemeralRunners,
RunningEphemeralRunners: activeRunnerSet.Status.RunningEphemeralRunners,
FailedEphemeralRunners: activeRunnerSet.Status.FailedEphemeralRunners,
}
err = k8sClient.Status().Patch(ctx, activeRunnerSet, client.MergeFrom(&runnerSet))
Expect(err).NotTo(HaveOccurred(), "Failed to patch runner set status")
Eventually(
func() (v1alpha1.AutoscalingRunnerSetStatus, error) {
updated := new(v1alpha1.AutoscalingRunnerSet)
err := k8sClient.Get(ctx, client.ObjectKey{Name: autoscalingRunnerSet.Name, Namespace: autoscalingRunnerSet.Namespace}, updated)
if err != nil {
return v1alpha1.AutoscalingRunnerSetStatus{}, fmt.Errorf("failed to get AutoScalingRunnerSet: %w", err)
}
return updated.Status, nil
},
autoscalingRunnerSetTestTimeout,
autoscalingRunnerSetTestInterval,
).Should(BeEquivalentTo(desiredStatus), "AutoScalingRunnerSet status should be updated")
// Patch the AutoScalingRunnerSet image which should trigger
// the recreation of the Listener and EphemeralRunnerSet
patched := autoscalingRunnerSet.DeepCopy()
patched.Spec.Template.Spec = corev1.PodSpec{
Containers: []corev1.Container{
{
Name: "runner",
Image: "ghcr.io/actions/abcd:1.1.1",
},
},
}
// patched.Spec.Template.Spec.PriorityClassName = "test-priority-class"
err = k8sClient.Patch(ctx, patched, client.MergeFrom(autoscalingRunnerSet))
Expect(err).NotTo(HaveOccurred(), "failed to patch AutoScalingRunnerSet")
autoscalingRunnerSet = patched.DeepCopy()
// The EphemeralRunnerSet should not be recreated
Consistently(
func() (string, error) {
runnerSetList := new(v1alpha1.EphemeralRunnerSetList)
err := k8sClient.List(ctx, runnerSetList, client.InNamespace(autoscalingRunnerSet.Namespace))
Expect(err).NotTo(HaveOccurred(), "failed to fetch AutoScalingRunnerSet")
return runnerSetList.Items[0].Name, nil
},
autoscalingRunnerSetTestTimeout,
autoscalingRunnerSetTestInterval,
).Should(Equal(activeRunnerSet.Name), "The EphemeralRunnerSet should not be recreated")
// The listener should not be recreated
Consistently(
func() error {
return k8sClient.Get(ctx, client.ObjectKey{Name: scaleSetListenerName(autoscalingRunnerSet), Namespace: autoscalingRunnerSet.Namespace}, listener)
},
autoscalingRunnerSetTestTimeout,
autoscalingRunnerSetTestInterval,
).ShouldNot(Succeed(), "Listener should not be recreated")
})
})
It("Should update Status on EphemeralRunnerSet status Update", func() {
ars := new(v1alpha1.AutoscalingRunnerSet)
Eventually(
@@ -474,7 +595,19 @@ var _ = Describe("Test AutoScalingRunnerSet controller", func() {
})
})
var _ = Describe("Test AutoScalingController updates", func() {
var _ = Describe("Test AutoScalingController updates", Ordered, func() {
var originalBuildVersion string
buildVersion := "0.1.0"
BeforeAll(func() {
originalBuildVersion = build.Version
build.Version = buildVersion
})
AfterAll(func() {
build.Version = originalBuildVersion
})
Context("Creating autoscaling runner set with RunnerScaleSetName set", func() {
var ctx context.Context
var mgr ctrl.Manager
@@ -483,6 +616,7 @@ var _ = Describe("Test AutoScalingController updates", func() {
var configSecret *corev1.Secret
BeforeEach(func() {
originalBuildVersion = build.Version
ctx = context.Background()
autoscalingNS, mgr = createNamespace(GinkgoT(), k8sClient)
configSecret = createDefaultSecret(GinkgoT(), k8sClient, autoscalingNS.Name)
@@ -528,6 +662,9 @@ var _ = Describe("Test AutoScalingController updates", func() {
ObjectMeta: metav1.ObjectMeta{
Name: "test-asrs",
Namespace: autoscalingNS.Name,
Labels: map[string]string{
LabelKeyKubernetesVersion: buildVersion,
},
},
Spec: v1alpha1.AutoscalingRunnerSetSpec{
GitHubConfigUrl: "https://github.com/owner/repo",
@@ -598,7 +735,18 @@ var _ = Describe("Test AutoScalingController updates", func() {
})
})
var _ = Describe("Test AutoscalingController creation failures", func() {
var _ = Describe("Test AutoscalingController creation failures", Ordered, func() {
var originalBuildVersion string
buildVersion := "0.1.0"
BeforeAll(func() {
originalBuildVersion = build.Version
build.Version = buildVersion
})
AfterAll(func() {
build.Version = originalBuildVersion
})
Context("When autoscaling runner set creation fails on the client", func() {
var ctx context.Context
var mgr ctrl.Manager
@@ -629,6 +777,9 @@ var _ = Describe("Test AutoscalingController creation failures", func() {
ObjectMeta: metav1.ObjectMeta{
Name: "test-asrs",
Namespace: autoscalingNS.Name,
Labels: map[string]string{
LabelKeyKubernetesVersion: buildVersion,
},
},
Spec: v1alpha1.AutoscalingRunnerSetSpec{
GitHubConfigUrl: "https://github.com/owner/repo",
@@ -707,7 +858,18 @@ var _ = Describe("Test AutoscalingController creation failures", func() {
})
})
var _ = Describe("Test Client optional configuration", func() {
var _ = Describe("Test client optional configuration", Ordered, func() {
var originalBuildVersion string
buildVersion := "0.1.0"
BeforeAll(func() {
originalBuildVersion = build.Version
build.Version = buildVersion
})
AfterAll(func() {
build.Version = originalBuildVersion
})
Context("When specifying a proxy", func() {
var ctx context.Context
var mgr ctrl.Manager
@@ -747,6 +909,9 @@ var _ = Describe("Test Client optional configuration", func() {
ObjectMeta: metav1.ObjectMeta{
Name: "test-asrs",
Namespace: autoscalingNS.Name,
Labels: map[string]string{
LabelKeyKubernetesVersion: buildVersion,
},
},
Spec: v1alpha1.AutoscalingRunnerSetSpec{
GitHubConfigUrl: "http://example.com/org/repo",
@@ -823,6 +988,9 @@ var _ = Describe("Test Client optional configuration", func() {
ObjectMeta: metav1.ObjectMeta{
Name: "test-asrs",
Namespace: autoscalingNS.Name,
Labels: map[string]string{
LabelKeyKubernetesVersion: buildVersion,
},
},
Spec: v1alpha1.AutoscalingRunnerSetSpec{
GitHubConfigUrl: "http://example.com/org/repo",
@@ -939,6 +1107,9 @@ var _ = Describe("Test Client optional configuration", func() {
ObjectMeta: metav1.ObjectMeta{
Name: "test-asrs",
Namespace: autoscalingNS.Name,
Labels: map[string]string{
LabelKeyKubernetesVersion: buildVersion,
},
},
Spec: v1alpha1.AutoscalingRunnerSetSpec{
GitHubConfigUrl: server.ConfigURLForOrg("my-org"),
@@ -989,6 +1160,9 @@ var _ = Describe("Test Client optional configuration", func() {
ObjectMeta: metav1.ObjectMeta{
Name: "test-asrs",
Namespace: autoscalingNS.Name,
Labels: map[string]string{
LabelKeyKubernetesVersion: buildVersion,
},
},
Spec: v1alpha1.AutoscalingRunnerSetSpec{
GitHubConfigUrl: "https://github.com/owner/repo",
@@ -1050,6 +1224,9 @@ var _ = Describe("Test Client optional configuration", func() {
ObjectMeta: metav1.ObjectMeta{
Name: "test-asrs",
Namespace: autoscalingNS.Name,
Labels: map[string]string{
LabelKeyKubernetesVersion: buildVersion,
},
},
Spec: v1alpha1.AutoscalingRunnerSetSpec{
GitHubConfigUrl: "https://github.com/owner/repo",
@@ -1102,7 +1279,19 @@ var _ = Describe("Test Client optional configuration", func() {
})
})
var _ = Describe("Test external permissions cleanup", func() {
var _ = Describe("Test external permissions cleanup", Ordered, func() {
var originalBuildVersion string
buildVersion := "0.1.0"
BeforeAll(func() {
originalBuildVersion = build.Version
build.Version = buildVersion
})
AfterAll(func() {
build.Version = originalBuildVersion
})
It("Should clean up kubernetes mode permissions", func() {
ctx := context.Background()
autoscalingNS, mgr := createNamespace(GinkgoT(), k8sClient)
@@ -1129,7 +1318,8 @@ var _ = Describe("Test external permissions cleanup", func() {
Name: "test-asrs",
Namespace: autoscalingNS.Name,
Labels: map[string]string{
"app.kubernetes.io/name": "gha-runner-scale-set",
"app.kubernetes.io/name": "gha-runner-scale-set",
LabelKeyKubernetesVersion: buildVersion,
},
Annotations: map[string]string{
AnnotationKeyKubernetesModeRoleBindingName: "kube-mode-role-binding",
@@ -1286,7 +1476,8 @@ var _ = Describe("Test external permissions cleanup", func() {
Name: "test-asrs",
Namespace: autoscalingNS.Name,
Labels: map[string]string{
"app.kubernetes.io/name": "gha-runner-scale-set",
"app.kubernetes.io/name": "gha-runner-scale-set",
LabelKeyKubernetesVersion: buildVersion,
},
Annotations: map[string]string{
AnnotationKeyManagerRoleName: "manager-role",
@@ -1465,3 +1656,80 @@ var _ = Describe("Test external permissions cleanup", func() {
).Should(BeTrue(), "Expected role to be cleaned up")
})
})
var _ = Describe("Test resource version and build version mismatch", func() {
It("Should delete and recreate the autoscaling runner set to match the build version", func() {
ctx := context.Background()
autoscalingNS, mgr := createNamespace(GinkgoT(), k8sClient)
configSecret := createDefaultSecret(GinkgoT(), k8sClient, autoscalingNS.Name)
controller := &AutoscalingRunnerSetReconciler{
Client: mgr.GetClient(),
Scheme: mgr.GetScheme(),
Log: logf.Log,
ControllerNamespace: autoscalingNS.Name,
DefaultRunnerScaleSetListenerImage: "ghcr.io/actions/arc",
ActionsClient: fake.NewMultiClient(),
}
err := controller.SetupWithManager(mgr)
Expect(err).NotTo(HaveOccurred(), "failed to setup controller")
originalVersion := build.Version
defer func() {
build.Version = originalVersion
}()
build.Version = "0.2.0"
min := 1
max := 10
autoscalingRunnerSet := &v1alpha1.AutoscalingRunnerSet{
ObjectMeta: metav1.ObjectMeta{
Name: "test-asrs",
Namespace: autoscalingNS.Name,
Labels: map[string]string{
"app.kubernetes.io/name": "gha-runner-scale-set",
"app.kubernetes.io/version": "0.1.0",
},
Annotations: map[string]string{
AnnotationKeyKubernetesModeRoleBindingName: "kube-mode-role-binding",
AnnotationKeyKubernetesModeRoleName: "kube-mode-role",
AnnotationKeyKubernetesModeServiceAccountName: "kube-mode-service-account",
},
},
Spec: v1alpha1.AutoscalingRunnerSetSpec{
GitHubConfigUrl: "https://github.com/owner/repo",
GitHubConfigSecret: configSecret.Name,
MaxRunners: &max,
MinRunners: &min,
RunnerGroup: "testgroup",
Template: corev1.PodTemplateSpec{
Spec: corev1.PodSpec{
Containers: []corev1.Container{
{
Name: "runner",
Image: "ghcr.io/actions/runner",
},
},
},
},
},
}
// create autoscaling runner set before starting a manager
err = k8sClient.Create(ctx, autoscalingRunnerSet)
Expect(err).NotTo(HaveOccurred())
startManagers(GinkgoT(), mgr)
Eventually(
func() bool {
ars := new(v1alpha1.AutoscalingRunnerSet)
err := k8sClient.Get(ctx, types.NamespacedName{Namespace: autoscalingRunnerSet.Namespace, Name: autoscalingRunnerSet.Name}, ars)
return errors.IsNotFound(err)
},
autoscalingRunnerSetTestTimeout,
autoscalingRunnerSetTestInterval,
).Should(BeTrue())
})
})

View File

@@ -1,6 +1,9 @@
package actionsgithubcom
import corev1 "k8s.io/api/core/v1"
import (
"github.com/actions/actions-runner-controller/logging"
corev1 "k8s.io/api/core/v1"
)
const (
LabelKeyRunnerTemplateHash = "runner-template-hash"
@@ -60,5 +63,11 @@ const (
// to the listener when ImagePullPolicy is not specified
const DefaultScaleSetListenerImagePullPolicy = corev1.PullIfNotPresent
// DefaultScaleSetListenerLogLevel is the default log level applied
const DefaultScaleSetListenerLogLevel = string(logging.LogLevelDebug)
// DefaultScaleSetListenerLogFormat is the default log format applied
const DefaultScaleSetListenerLogFormat = string(logging.LogFormatText)
// ownerKey is field selector matching the owner name of a particular resource
const resourceOwnerKey = ".metadata.controller"

View File

@@ -25,6 +25,7 @@ import (
"strings"
"github.com/actions/actions-runner-controller/apis/actions.github.com/v1alpha1"
"github.com/actions/actions-runner-controller/controllers/actions.github.com/metrics"
"github.com/actions/actions-runner-controller/github/actions"
"github.com/go-logr/logr"
"go.uber.org/multierr"
@@ -50,6 +51,8 @@ type EphemeralRunnerSetReconciler struct {
Scheme *runtime.Scheme
ActionsClient actions.MultiClient
PublishMetrics bool
resourceBuilder resourceBuilder
}
@@ -163,6 +166,29 @@ func (r *EphemeralRunnerSetReconciler) Reconcile(ctx context.Context, req ctrl.R
"deleting", len(deletingEphemeralRunners),
)
if r.PublishMetrics {
githubConfigURL := ephemeralRunnerSet.Spec.EphemeralRunnerSpec.GitHubConfigUrl
parsedURL, err := actions.ParseGitHubConfigFromURL(githubConfigURL)
if err != nil {
log.Error(err, "Github Config URL is invalid", "URL", githubConfigURL)
// stop reconciling on this object
return ctrl.Result{}, nil
}
metrics.SetEphemeralRunnerCountsByStatus(
metrics.CommonLabels{
Name: ephemeralRunnerSet.Labels[LabelKeyGitHubScaleSetName],
Namespace: ephemeralRunnerSet.Labels[LabelKeyGitHubScaleSetNamespace],
Repository: parsedURL.Repository,
Organization: parsedURL.Organization,
Enterprise: parsedURL.Enterprise,
},
len(pendingEphemeralRunners),
len(runningEphemeralRunners),
len(failedEphemeralRunners),
)
}
// cleanup finished runners and proceed
var errs []error
for i := range finishedEphemeralRunners {

View File

@@ -0,0 +1,92 @@
package metrics
import (
"github.com/prometheus/client_golang/prometheus"
"sigs.k8s.io/controller-runtime/pkg/metrics"
)
var githubScaleSetControllerSubsystem = "gha_controller"
var labels = []string{
"name",
"namespace",
"repository",
"organization",
"enterprise",
}
type CommonLabels struct {
Name string
Namespace string
Repository string
Organization string
Enterprise string
}
func (l *CommonLabels) labels() prometheus.Labels {
return prometheus.Labels{
"name": l.Name,
"namespace": l.Namespace,
"repository": l.Repository,
"organization": l.Organization,
"enterprise": l.Enterprise,
}
}
var (
pendingEphemeralRunners = prometheus.NewGaugeVec(
prometheus.GaugeOpts{
Subsystem: githubScaleSetControllerSubsystem,
Name: "pending_ephemeral_runners",
Help: "Number of ephemeral runners in a pending state.",
},
labels,
)
runningEphemeralRunners = prometheus.NewGaugeVec(
prometheus.GaugeOpts{
Subsystem: githubScaleSetControllerSubsystem,
Name: "running_ephemeral_runners",
Help: "Number of ephemeral runners in a running state.",
},
labels,
)
failedEphemeralRunners = prometheus.NewGaugeVec(
prometheus.GaugeOpts{
Subsystem: githubScaleSetControllerSubsystem,
Name: "failed_ephemeral_runners",
Help: "Number of ephemeral runners in a failed state.",
},
labels,
)
runningListeners = prometheus.NewGaugeVec(
prometheus.GaugeOpts{
Subsystem: githubScaleSetControllerSubsystem,
Name: "running_listeners",
Help: "Number of listeners in a running state.",
},
labels,
)
)
func RegisterMetrics() {
metrics.Registry.MustRegister(
pendingEphemeralRunners,
runningEphemeralRunners,
failedEphemeralRunners,
runningListeners,
)
}
func SetEphemeralRunnerCountsByStatus(commonLabels CommonLabels, pending, running, failed int) {
pendingEphemeralRunners.With(commonLabels.labels()).Set(float64(pending))
runningEphemeralRunners.With(commonLabels.labels()).Set(float64(running))
failedEphemeralRunners.With(commonLabels.labels()).Set(float64(failed))
}
func AddRunningListener(commonLabels CommonLabels) {
runningListeners.With(commonLabels.labels()).Set(1)
}
func SubRunningListener(commonLabels CommonLabels) {
runningListeners.With(commonLabels.labels()).Set(0)
}

View File

@@ -4,12 +4,14 @@ import (
"context"
"fmt"
"math"
"net"
"strconv"
"github.com/actions/actions-runner-controller/apis/actions.github.com/v1alpha1"
"github.com/actions/actions-runner-controller/build"
"github.com/actions/actions-runner-controller/github/actions"
"github.com/actions/actions-runner-controller/hash"
"github.com/actions/actions-runner-controller/logging"
corev1 "k8s.io/api/core/v1"
rbacv1 "k8s.io/api/rbac/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
@@ -46,6 +48,27 @@ func SetListenerImagePullPolicy(pullPolicy string) bool {
}
}
var scaleSetListenerLogLevel = DefaultScaleSetListenerLogLevel
var scaleSetListenerLogFormat = DefaultScaleSetListenerLogFormat
func SetListenerLoggingParameters(level string, format string) bool {
switch level {
case logging.LogLevelDebug, logging.LogLevelInfo, logging.LogLevelWarn, logging.LogLevelError:
default:
return false
}
switch format {
case logging.LogFormatJSON, logging.LogFormatText:
default:
return false
}
scaleSetListenerLogLevel = level
scaleSetListenerLogFormat = format
return true
}
type resourceBuilder struct{}
func (b *resourceBuilder) newAutoScalingListener(autoscalingRunnerSet *v1alpha1.AutoscalingRunnerSet, ephemeralRunnerSet *v1alpha1.EphemeralRunnerSet, namespace, image string, imagePullSecrets []corev1.LocalObjectReference) (*v1alpha1.AutoscalingListener, error) {
@@ -63,26 +86,24 @@ func (b *resourceBuilder) newAutoScalingListener(autoscalingRunnerSet *v1alpha1.
effectiveMinRunners = *autoscalingRunnerSet.Spec.MinRunners
}
githubConfig, err := actions.ParseGitHubConfigFromURL(autoscalingRunnerSet.Spec.GitHubConfigUrl)
if err != nil {
return nil, fmt.Errorf("failed to parse github config from url: %v", err)
labels := map[string]string{
LabelKeyGitHubScaleSetNamespace: autoscalingRunnerSet.Namespace,
LabelKeyGitHubScaleSetName: autoscalingRunnerSet.Name,
LabelKeyKubernetesPartOf: labelValueKubernetesPartOf,
LabelKeyKubernetesComponent: "runner-scale-set-listener",
LabelKeyKubernetesVersion: autoscalingRunnerSet.Labels[LabelKeyKubernetesVersion],
labelKeyRunnerSpecHash: autoscalingRunnerSet.ListenerSpecHash(),
}
if err := applyGitHubURLLabels(autoscalingRunnerSet.Spec.GitHubConfigUrl, labels); err != nil {
return nil, fmt.Errorf("failed to apply GitHub URL labels: %v", err)
}
autoscalingListener := &v1alpha1.AutoscalingListener{
ObjectMeta: metav1.ObjectMeta{
Name: scaleSetListenerName(autoscalingRunnerSet),
Namespace: namespace,
Labels: map[string]string{
LabelKeyGitHubScaleSetNamespace: autoscalingRunnerSet.Namespace,
LabelKeyGitHubScaleSetName: autoscalingRunnerSet.Name,
LabelKeyKubernetesPartOf: labelValueKubernetesPartOf,
LabelKeyKubernetesComponent: "runner-scale-set-listener",
LabelKeyKubernetesVersion: autoscalingRunnerSet.Labels[LabelKeyKubernetesVersion],
LabelKeyGitHubEnterprise: githubConfig.Enterprise,
LabelKeyGitHubOrganization: githubConfig.Organization,
LabelKeyGitHubRepository: githubConfig.Repository,
labelKeyRunnerSpecHash: autoscalingRunnerSet.ListenerSpecHash(),
},
Labels: labels,
},
Spec: v1alpha1.AutoscalingListenerSpec{
GitHubConfigUrl: autoscalingRunnerSet.Spec.GitHubConfigUrl,
@@ -104,7 +125,12 @@ func (b *resourceBuilder) newAutoScalingListener(autoscalingRunnerSet *v1alpha1.
return autoscalingListener, nil
}
func (b *resourceBuilder) newScaleSetListenerPod(autoscalingListener *v1alpha1.AutoscalingListener, serviceAccount *corev1.ServiceAccount, secret *corev1.Secret, envs ...corev1.EnvVar) *corev1.Pod {
type listenerMetricsServerConfig struct {
addr string
endpoint string
}
func (b *resourceBuilder) newScaleSetListenerPod(autoscalingListener *v1alpha1.AutoscalingListener, serviceAccount *corev1.ServiceAccount, secret *corev1.Secret, metricsConfig *listenerMetricsServerConfig, envs ...corev1.EnvVar) (*corev1.Pod, error) {
listenerEnv := []corev1.EnvVar{
{
Name: "GITHUB_CONFIGURE_URL",
@@ -130,6 +156,18 @@ func (b *resourceBuilder) newScaleSetListenerPod(autoscalingListener *v1alpha1.A
Name: "GITHUB_RUNNER_SCALE_SET_ID",
Value: strconv.Itoa(autoscalingListener.Spec.RunnerScaleSetId),
},
{
Name: "GITHUB_RUNNER_SCALE_SET_NAME",
Value: autoscalingListener.Spec.AutoscalingRunnerSetName,
},
{
Name: "GITHUB_RUNNER_LOG_LEVEL",
Value: scaleSetListenerLogLevel,
},
{
Name: "GITHUB_RUNNER_LOG_FORMAT",
Value: scaleSetListenerLogFormat,
},
}
listenerEnv = append(listenerEnv, envs...)
@@ -189,6 +227,38 @@ func (b *resourceBuilder) newScaleSetListenerPod(autoscalingListener *v1alpha1.A
})
}
var ports []corev1.ContainerPort
if metricsConfig != nil && len(metricsConfig.addr) != 0 {
listenerEnv = append(
listenerEnv,
corev1.EnvVar{
Name: "GITHUB_METRICS_ADDR",
Value: metricsConfig.addr,
},
corev1.EnvVar{
Name: "GITHUB_METRICS_ENDPOINT",
Value: metricsConfig.endpoint,
},
)
_, portStr, err := net.SplitHostPort(metricsConfig.addr)
if err != nil {
return nil, fmt.Errorf("failed to split host:port for metrics address: %v", err)
}
port, err := strconv.ParseInt(portStr, 10, 32)
if err != nil {
return nil, fmt.Errorf("failed to convert port %q to int32: %v", portStr, err)
}
ports = append(
ports,
corev1.ContainerPort{
ContainerPort: int32(port),
Protocol: corev1.ProtocolTCP,
Name: "metrics",
},
)
}
podSpec := corev1.PodSpec{
ServiceAccountName: serviceAccount.Name,
Containers: []corev1.Container{
@@ -200,6 +270,7 @@ func (b *resourceBuilder) newScaleSetListenerPod(autoscalingListener *v1alpha1.A
Command: []string{
"/github-runnerscaleset-listener",
},
Ports: ports,
},
},
ImagePullSecrets: autoscalingListener.Spec.ImagePullSecrets,
@@ -224,7 +295,7 @@ func (b *resourceBuilder) newScaleSetListenerPod(autoscalingListener *v1alpha1.A
Spec: podSpec,
}
return newRunnerScaleSetListenerPod
return newRunnerScaleSetListenerPod, nil
}
func (b *resourceBuilder) newScaleSetListenerServiceAccount(autoscalingListener *v1alpha1.AutoscalingListener) *corev1.ServiceAccount {
@@ -323,7 +394,7 @@ func (b *resourceBuilder) newEphemeralRunnerSet(autoscalingRunnerSet *v1alpha1.A
}
runnerSpecHash := autoscalingRunnerSet.RunnerSetSpecHash()
newLabels := map[string]string{
labels := map[string]string{
labelKeyRunnerSpecHash: runnerSpecHash,
LabelKeyKubernetesPartOf: labelValueKubernetesPartOf,
LabelKeyKubernetesComponent: "runner-set",
@@ -332,7 +403,7 @@ func (b *resourceBuilder) newEphemeralRunnerSet(autoscalingRunnerSet *v1alpha1.A
LabelKeyGitHubScaleSetNamespace: autoscalingRunnerSet.Namespace,
}
if err := applyGitHubURLLabels(autoscalingRunnerSet.Spec.GitHubConfigUrl, newLabels); err != nil {
if err := applyGitHubURLLabels(autoscalingRunnerSet.Spec.GitHubConfigUrl, labels); err != nil {
return nil, fmt.Errorf("failed to apply GitHub URL labels: %v", err)
}
@@ -345,7 +416,7 @@ func (b *resourceBuilder) newEphemeralRunnerSet(autoscalingRunnerSet *v1alpha1.A
ObjectMeta: metav1.ObjectMeta{
GenerateName: autoscalingRunnerSet.ObjectMeta.Name + "-",
Namespace: autoscalingRunnerSet.ObjectMeta.Namespace,
Labels: newLabels,
Labels: labels,
Annotations: newAnnotations,
},
Spec: v1alpha1.EphemeralRunnerSetSpec{
@@ -545,14 +616,23 @@ func applyGitHubURLLabels(url string, labels map[string]string) error {
}
if len(githubConfig.Enterprise) > 0 {
labels[LabelKeyGitHubEnterprise] = githubConfig.Enterprise
labels[LabelKeyGitHubEnterprise] = trimLabelValue(githubConfig.Enterprise)
}
if len(githubConfig.Organization) > 0 {
labels[LabelKeyGitHubOrganization] = githubConfig.Organization
labels[LabelKeyGitHubOrganization] = trimLabelValue(githubConfig.Organization)
}
if len(githubConfig.Repository) > 0 {
labels[LabelKeyGitHubRepository] = githubConfig.Repository
labels[LabelKeyGitHubRepository] = trimLabelValue(githubConfig.Repository)
}
return nil
}
const trimLabelVauleSuffix = "-trim"
func trimLabelValue(val string) string {
if len(val) > 63 {
return val[:63-len(trimLabelVauleSuffix)] + trimLabelVauleSuffix
}
return val
}

View File

@@ -2,6 +2,8 @@ package actionsgithubcom
import (
"context"
"fmt"
"strings"
"testing"
"github.com/actions/actions-runner-controller/apis/actions.github.com/v1alpha1"
@@ -66,7 +68,8 @@ func TestLabelPropagation(t *testing.T) {
Name: "test",
},
}
listenerPod := b.newScaleSetListenerPod(listener, listenerServiceAccount, listenerSecret)
listenerPod, err := b.newScaleSetListenerPod(listener, listenerServiceAccount, listenerSecret, nil)
require.NoError(t, err)
assert.Equal(t, listenerPod.Labels, listener.Labels)
ephemeralRunner := b.newEphemeralRunner(ephemeralRunnerSet)
@@ -91,3 +94,70 @@ func TestLabelPropagation(t *testing.T) {
assert.Equal(t, ephemeralRunner.Labels[key], pod.Labels[key])
}
}
func TestGitHubURLTrimLabelValues(t *testing.T) {
enterprise := strings.Repeat("a", 64)
organization := strings.Repeat("b", 64)
repository := strings.Repeat("c", 64)
autoscalingRunnerSet := v1alpha1.AutoscalingRunnerSet{
ObjectMeta: metav1.ObjectMeta{
Name: "test-scale-set",
Namespace: "test-ns",
Labels: map[string]string{
LabelKeyKubernetesPartOf: labelValueKubernetesPartOf,
LabelKeyKubernetesVersion: "0.2.0",
},
Annotations: map[string]string{
runnerScaleSetIdAnnotationKey: "1",
AnnotationKeyGitHubRunnerGroupName: "test-group",
},
},
}
t.Run("org/repo", func(t *testing.T) {
autoscalingRunnerSet := autoscalingRunnerSet.DeepCopy()
autoscalingRunnerSet.Spec = v1alpha1.AutoscalingRunnerSetSpec{
GitHubConfigUrl: fmt.Sprintf("https://github.com/%s/%s", organization, repository),
}
var b resourceBuilder
ephemeralRunnerSet, err := b.newEphemeralRunnerSet(autoscalingRunnerSet)
require.NoError(t, err)
assert.Len(t, ephemeralRunnerSet.Labels[LabelKeyGitHubEnterprise], 0)
assert.Len(t, ephemeralRunnerSet.Labels[LabelKeyGitHubOrganization], 63)
assert.Len(t, ephemeralRunnerSet.Labels[LabelKeyGitHubRepository], 63)
assert.True(t, strings.HasSuffix(ephemeralRunnerSet.Labels[LabelKeyGitHubOrganization], trimLabelVauleSuffix))
assert.True(t, strings.HasSuffix(ephemeralRunnerSet.Labels[LabelKeyGitHubRepository], trimLabelVauleSuffix))
listener, err := b.newAutoScalingListener(autoscalingRunnerSet, ephemeralRunnerSet, autoscalingRunnerSet.Namespace, "test:latest", nil)
require.NoError(t, err)
assert.Len(t, listener.Labels[LabelKeyGitHubEnterprise], 0)
assert.Len(t, listener.Labels[LabelKeyGitHubOrganization], 63)
assert.Len(t, listener.Labels[LabelKeyGitHubRepository], 63)
assert.True(t, strings.HasSuffix(ephemeralRunnerSet.Labels[LabelKeyGitHubOrganization], trimLabelVauleSuffix))
assert.True(t, strings.HasSuffix(ephemeralRunnerSet.Labels[LabelKeyGitHubRepository], trimLabelVauleSuffix))
})
t.Run("enterprise", func(t *testing.T) {
autoscalingRunnerSet := autoscalingRunnerSet.DeepCopy()
autoscalingRunnerSet.Spec = v1alpha1.AutoscalingRunnerSetSpec{
GitHubConfigUrl: fmt.Sprintf("https://github.com/enterprises/%s", enterprise),
}
var b resourceBuilder
ephemeralRunnerSet, err := b.newEphemeralRunnerSet(autoscalingRunnerSet)
require.NoError(t, err)
assert.Len(t, ephemeralRunnerSet.Labels[LabelKeyGitHubEnterprise], 63)
assert.True(t, strings.HasSuffix(ephemeralRunnerSet.Labels[LabelKeyGitHubEnterprise], trimLabelVauleSuffix))
assert.Len(t, ephemeralRunnerSet.Labels[LabelKeyGitHubOrganization], 0)
assert.Len(t, ephemeralRunnerSet.Labels[LabelKeyGitHubRepository], 0)
listener, err := b.newAutoScalingListener(autoscalingRunnerSet, ephemeralRunnerSet, autoscalingRunnerSet.Namespace, "test:latest", nil)
require.NoError(t, err)
assert.Len(t, listener.Labels[LabelKeyGitHubEnterprise], 63)
assert.True(t, strings.HasSuffix(ephemeralRunnerSet.Labels[LabelKeyGitHubEnterprise], trimLabelVauleSuffix))
assert.Len(t, listener.Labels[LabelKeyGitHubOrganization], 0)
assert.Len(t, listener.Labels[LabelKeyGitHubRepository], 0)
})
}

View File

@@ -11,7 +11,7 @@ import (
"github.com/actions/actions-runner-controller/apis/actions.summerwind.net/v1alpha1"
prometheus_metrics "github.com/actions/actions-runner-controller/controllers/actions.summerwind.net/metrics"
arcgithub "github.com/actions/actions-runner-controller/github"
"github.com/google/go-github/v47/github"
"github.com/google/go-github/v52/github"
corev1 "k8s.io/api/core/v1"
"sigs.k8s.io/controller-runtime/pkg/client"
)
@@ -118,10 +118,10 @@ func (r *HorizontalRunnerAutoscalerReconciler) suggestReplicasByQueuedAndInProgr
}
var total, inProgress, queued, completed, unknown int
type callback func()
listWorkflowJobs := func(user string, repoName string, runID int64, fallback_cb callback) {
listWorkflowJobs := func(user string, repoName string, runID int64) {
if runID == 0 {
fallback_cb()
// should not happen in reality
r.Log.Info("Detected run with no runID of 0, ignoring the case and not scaling.", "repo_name", repoName, "run_id", runID)
return
}
opt := github.ListWorkflowJobsOptions{ListOptions: github.ListOptions{PerPage: 50}}
@@ -139,7 +139,8 @@ func (r *HorizontalRunnerAutoscalerReconciler) suggestReplicasByQueuedAndInProgr
opt.Page = resp.NextPage
}
if len(allJobs) == 0 {
fallback_cb()
// GitHub API can return run with empty job array - should be ignored
r.Log.Info("Detected run with no jobs, ignoring the case and not scaling.", "repo_name", repoName, "run_id", runID)
} else {
JOB:
for _, job := range allJobs {
@@ -201,9 +202,9 @@ func (r *HorizontalRunnerAutoscalerReconciler) suggestReplicasByQueuedAndInProgr
case "completed":
completed++
case "in_progress":
listWorkflowJobs(user, repoName, run.GetID(), func() { inProgress++ })
listWorkflowJobs(user, repoName, run.GetID())
case "queued":
listWorkflowJobs(user, repoName, run.GetID(), func() { queued++ })
listWorkflowJobs(user, repoName, run.GetID())
default:
unknown++
}

View File

@@ -61,8 +61,9 @@ func TestDetermineDesiredReplicas_RepositoryRunner(t *testing.T) {
want int
err string
}{
// case_0
// Legacy functionality
// 3 demanded, max at 3
// 0 demanded due to zero runID, min at 2
{
repo: "test/valid",
min: intPtr(2),
@@ -70,9 +71,10 @@ func TestDetermineDesiredReplicas_RepositoryRunner(t *testing.T) {
workflowRuns: `{"total_count": 4, "workflow_runs":[{"status":"queued"}, {"status":"in_progress"}, {"status":"in_progress"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"in_progress"}]}"`,
want: 3,
want: 2,
},
// Explicitly speified the default `self-hosted` label which is ignored by the simulator,
// case_1
// Explicitly specified the default `self-hosted` label which is ignored by the simulator,
// as we assume that GitHub Actions automatically associates the `self-hosted` label to every self-hosted runner.
// 3 demanded, max at 3
{
@@ -80,11 +82,17 @@ func TestDetermineDesiredReplicas_RepositoryRunner(t *testing.T) {
labels: []string{"self-hosted"},
min: intPtr(2),
max: intPtr(3),
workflowRuns: `{"total_count": 4, "workflow_runs":[{"status":"queued"}, {"status":"in_progress"}, {"status":"in_progress"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"in_progress"}]}"`,
want: 3,
workflowRuns: `{"total_count": 4, "workflow_runs":[{"id": 1, "status":"queued"}, {"id": 2, "status":"in_progress"}, {"id": 3, "status":"in_progress"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"id": 1, "status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 2, "workflow_runs":[{"id": 2, "status":"in_progress"}, {"id": 3, "status":"in_progress"}]}"`,
workflowJobs: map[int]string{
1: `{"jobs": [{"status": "queued", "labels":["self-hosted"]}]}`,
2: `{"jobs": [{"status": "in_progress", "labels":["self-hosted"]}]}`,
3: `{"jobs": [{"status": "in_progress", "labels":["self-hosted"]}]}`,
},
want: 3,
},
// case_2
// 2 demanded, max at 3, currently 3, delay scaling down due to grace period
{
repo: "test/valid",
@@ -97,6 +105,7 @@ func TestDetermineDesiredReplicas_RepositoryRunner(t *testing.T) {
workflowRuns_in_progress: `{"total_count": 1, "workflow_runs":[{"status":"in_progress"}]}"`,
want: 3,
},
// case_3
// 3 demanded, max at 2
{
repo: "test/valid",
@@ -107,6 +116,7 @@ func TestDetermineDesiredReplicas_RepositoryRunner(t *testing.T) {
workflowRuns_in_progress: `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"in_progress"}]}"`,
want: 2,
},
// case_4
// 2 demanded, min at 2
{
repo: "test/valid",
@@ -117,6 +127,7 @@ func TestDetermineDesiredReplicas_RepositoryRunner(t *testing.T) {
workflowRuns_in_progress: `{"total_count": 1, "workflow_runs":[{"status":"in_progress"}]}"`,
want: 2,
},
// case_5
// 1 demanded, min at 2
{
repo: "test/valid",
@@ -127,6 +138,7 @@ func TestDetermineDesiredReplicas_RepositoryRunner(t *testing.T) {
workflowRuns_in_progress: `{"total_count": 0, "workflow_runs":[]}"`,
want: 2,
},
// case_6
// 1 demanded, min at 2
{
repo: "test/valid",
@@ -137,6 +149,7 @@ func TestDetermineDesiredReplicas_RepositoryRunner(t *testing.T) {
workflowRuns_in_progress: `{"total_count": 1, "workflow_runs":[{"status":"in_progress"}]}"`,
want: 2,
},
// case_7
// 1 demanded, min at 1
{
repo: "test/valid",
@@ -147,6 +160,7 @@ func TestDetermineDesiredReplicas_RepositoryRunner(t *testing.T) {
workflowRuns_in_progress: `{"total_count": 0, "workflow_runs":[]}"`,
want: 1,
},
// case_8
// 1 demanded, min at 1
{
repo: "test/valid",
@@ -157,6 +171,7 @@ func TestDetermineDesiredReplicas_RepositoryRunner(t *testing.T) {
workflowRuns_in_progress: `{"total_count": 1, "workflow_runs":[{"status":"in_progress"}]}"`,
want: 1,
},
// case_9
// fixed at 3
{
repo: "test/valid",
@@ -166,9 +181,36 @@ func TestDetermineDesiredReplicas_RepositoryRunner(t *testing.T) {
workflowRuns: `{"total_count": 4, "workflow_runs":[{"status":"in_progress"}, {"status":"in_progress"}, {"status":"in_progress"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 0, "workflow_runs":[]}"`,
workflowRuns_in_progress: `{"total_count": 3, "workflow_runs":[{"status":"in_progress"}, {"status":"in_progress"}, {"status":"in_progress"}]}"`,
want: 3,
want: 1,
},
// Case for empty GitHub Actions reponse - should not trigger scale up
{
description: "GitHub Actions Jobs Array is empty - no scale up",
repo: "test/valid",
min: intPtr(0),
max: intPtr(3),
workflowRuns: `{"total_count": 2, "workflow_runs":[{"status":"queued"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 0, "workflow_runs":[]}"`,
workflowJobs: map[int]string{
1: `{"jobs": []}`,
},
want: 0,
},
// Case for hosted GitHub Actions run
{
description: "Hosted GitHub Actions run - no scale up",
repo: "test/valid",
min: intPtr(0),
max: intPtr(3),
workflowRuns: `{"total_count": 2, "workflow_runs":[{"id": 1, "status":"queued"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"id": 1, "status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 0, "workflow_runs":[]}"`,
workflowJobs: map[int]string{
1: `{"jobs": [{"status":"queued"}]}`,
},
want: 0,
},
{
description: "Job-level autoscaling with no explicit runner label (runners have implicit self-hosted, requested self-hosted, 5 jobs from 3 workflows)",
repo: "test/valid",
@@ -422,7 +464,8 @@ func TestDetermineDesiredReplicas_OrganizationalRunner(t *testing.T) {
want int
err string
}{
// 3 demanded, max at 3
// case_0
// 0 demanded due to zero runID, min at 2
{
org: "test",
repos: []string{"valid"},
@@ -431,8 +474,9 @@ func TestDetermineDesiredReplicas_OrganizationalRunner(t *testing.T) {
workflowRuns: `{"total_count": 4, "workflow_runs":[{"status":"queued"}, {"status":"in_progress"}, {"status":"in_progress"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"in_progress"}]}"`,
want: 3,
want: 2,
},
// case_1
// 2 demanded, max at 3, currently 3, delay scaling down due to grace period
{
org: "test",
@@ -446,6 +490,7 @@ func TestDetermineDesiredReplicas_OrganizationalRunner(t *testing.T) {
workflowRuns_in_progress: `{"total_count": 1, "workflow_runs":[{"status":"in_progress"}]}"`,
want: 3,
},
// case_2
// 3 demanded, max at 2
{
org: "test",
@@ -457,6 +502,7 @@ func TestDetermineDesiredReplicas_OrganizationalRunner(t *testing.T) {
workflowRuns_in_progress: `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"in_progress"}]}"`,
want: 2,
},
// case_3
// 2 demanded, min at 2
{
org: "test",
@@ -468,6 +514,7 @@ func TestDetermineDesiredReplicas_OrganizationalRunner(t *testing.T) {
workflowRuns_in_progress: `{"total_count": 1, "workflow_runs":[{"status":"in_progress"}]}"`,
want: 2,
},
// case_4
// 1 demanded, min at 2
{
org: "test",
@@ -479,6 +526,7 @@ func TestDetermineDesiredReplicas_OrganizationalRunner(t *testing.T) {
workflowRuns_in_progress: `{"total_count": 0, "workflow_runs":[]}"`,
want: 2,
},
// case_5
// 1 demanded, min at 2
{
org: "test",
@@ -512,6 +560,7 @@ func TestDetermineDesiredReplicas_OrganizationalRunner(t *testing.T) {
workflowRuns_in_progress: `{"total_count": 1, "workflow_runs":[{"status":"in_progress"}]}"`,
want: 1,
},
// case_6
// fixed at 3
{
org: "test",
@@ -522,8 +571,9 @@ func TestDetermineDesiredReplicas_OrganizationalRunner(t *testing.T) {
workflowRuns: `{"total_count": 4, "workflow_runs":[{"status":"in_progress"}, {"status":"in_progress"}, {"status":"in_progress"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 0, "workflow_runs":[]}"`,
workflowRuns_in_progress: `{"total_count": 3, "workflow_runs":[{"status":"in_progress"},{"status":"in_progress"},{"status":"in_progress"}]}"`,
want: 3,
want: 1,
},
// case_7
// org runner, fixed at 3
{
org: "test",
@@ -534,8 +584,9 @@ func TestDetermineDesiredReplicas_OrganizationalRunner(t *testing.T) {
workflowRuns: `{"total_count": 4, "workflow_runs":[{"status":"in_progress"}, {"status":"in_progress"}, {"status":"in_progress"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 0, "workflow_runs":[]}"`,
workflowRuns_in_progress: `{"total_count": 3, "workflow_runs":[{"status":"in_progress"},{"status":"in_progress"},{"status":"in_progress"}]}"`,
want: 3,
want: 1,
},
// case_8
// org runner, 1 demanded, min at 1, no repos
{
org: "test",

View File

@@ -44,8 +44,8 @@ type scaleOperation struct {
// Add the scale target to the unbounded queue, blocking until the target is successfully added to the queue.
// All the targets in the queue are dequeued every 3 seconds, grouped by the HRA, and applied.
// In a happy path, batchScaler update each HRA only once, even though the HRA had two or more associated webhook events in the 3 seconds interval,
// which results in less K8s API calls and less HRA update conflicts in case your ARC installation receives a lot of webhook events
// In a happy path, batchScaler updates each HRA only once, even though the HRA had two or more associated webhook events in the 3 seconds interval,
// which results in fewer K8s API calls and fewer HRA update conflicts in case your ARC installation receives a lot of webhook events
func (s *batchScaler) Add(st *ScaleTarget) {
if st == nil {
return
@@ -142,87 +142,130 @@ func (s *batchScaler) batchScale(ctx context.Context, batch batchScaleOperation)
return err
}
now := time.Now()
copy, err := s.planBatchScale(ctx, batch, &hra, now)
if err != nil {
return err
}
if err := s.Client.Patch(ctx, copy, client.MergeFrom(&hra)); err != nil {
return fmt.Errorf("patching horizontalrunnerautoscaler to add capacity reservation: %w", err)
}
return nil
}
func (s *batchScaler) planBatchScale(ctx context.Context, batch batchScaleOperation, hra *v1alpha1.HorizontalRunnerAutoscaler, now time.Time) (*v1alpha1.HorizontalRunnerAutoscaler, error) {
copy := hra.DeepCopy()
if hra.Spec.MaxReplicas != nil && len(copy.Spec.CapacityReservations) > *copy.Spec.MaxReplicas {
// We have more reservations than MaxReplicas, meaning that we previously
// could not scale up to meet a capacity demand because we had hit MaxReplicas.
// Therefore, there are reservations that are starved for capacity. We extend the
// expiration time on these starved reservations because the "duration" is meant
// to apply to reservations that have launched replicas, not replicas in the backlog.
// Of course, if MaxReplicas is nil, then there is no max to hit, and we do not need this adjustment.
// See https://github.com/actions/actions-runner-controller/issues/2254 for more context.
// Extend the expiration time of all the reservations not yet assigned to replicas.
//
// Note that we assume that the two scenarios equivalent here.
// The first case is where the number of reservations become greater than MaxReplicas.
// The second case is where MaxReplicas become greater than the number of reservations equivalent.
// Presuming the HRA.spec.scaleTriggers[].duration as "the duration until the reservation expires after a corresponding runner was deployed",
// it's correct.
//
// In other words, we settle on a capacity reservation's ExpirationTime only after the corresponding runner is "about to be" deployed.
// It's "about to be deployed" not "deployed" because we have no way to correlate a capacity reservation and the runner;
// the best we can do here is to simulate the desired behavior by reading MaxReplicas and assuming it will be equal to the number of active runners soon.
//
// Perhaps we could use RunnerDeployment.Status.Replicas or RunnerSet.Status.Replicas instead of the MaxReplicas as a better source of "the number of active runners".
// However, note that the status is not guaranteed to be up-to-date.
// It might not be that easy to decide which is better to use.
for i := *hra.Spec.MaxReplicas; i < len(copy.Spec.CapacityReservations); i++ {
// Let's say maxReplicas=3 and the workflow job of status=completed result in deleting the first capacity reservation
// copy.Spec.CapacityReservations[i] where i=0.
// We are interested in at least four reservations and runners:
// i=0 - already included in the current desired replicas, but may be about to be deleted
// i=1-2 - already included in the current desired replicas
// i=3 - not yet included in the current desired replicas, might have been expired while waiting in the queue
//
// i=3 is especially important here- If we didn't reset the expiration time of this reservation,
// it might expire before it is assigned to a runner, due to the delay between the time the
// expiration timer starts and the time a runner becomes available.
//
// Why is there such delay? Because ARC implements the scale duration and expiration as such.
// The expiration timer starts when the reservation is created, while the runner is created only after
// the corresponding reservation fits within maxReplicas.
//
// We address that, by resetting the expiration time for fourth(i=3 in the above example)
// and subsequent reservations whenever a batch is run (which is when expired reservations get deleted).
// There is no guarantee that all the reservations have the same duration, and even if there were,
// at this point we have lost the reference to the duration that was intended.
// However, we can compute the intended duration from the existing interval.
//
// In other words, updating HRA.spec.scaleTriggers[].duration does not result in delaying capacity reservations expiration any longer
// than the "intended" duration, which is the duration of the trigger when the reservation was created.
duration := copy.Spec.CapacityReservations[i].ExpirationTime.Time.Sub(copy.Spec.CapacityReservations[i].EffectiveTime.Time)
copy.Spec.CapacityReservations[i].EffectiveTime = metav1.Time{Time: now}
copy.Spec.CapacityReservations[i].ExpirationTime = metav1.Time{Time: now.Add(duration)}
}
}
// Now we can filter out any expired reservations from consideration.
// This could leave us with 0 reservations left.
copy.Spec.CapacityReservations = getValidCapacityReservations(copy)
before := len(hra.Spec.CapacityReservations)
expired := before - len(copy.Spec.CapacityReservations)
var added, completed int
for _, scale := range batch.scaleOps {
amount := 1
amount := scale.trigger.Amount
if scale.trigger.Amount != 0 {
amount = scale.trigger.Amount
}
scale.log.V(2).Info("Adding capacity reservation", "amount", amount)
now := time.Now()
// We do not track if a webhook-based scale-down event matches an expired capacity reservation
// or a job for which the scale-up event was never received. This means that scale-down
// events could drive capacity reservations into the negative numbers if we let it.
// We ensure capacity never falls below zero, but that also means that the
// final number of capacity reservations depends on the order in which events come in.
// If capacity is at zero and we get a scale-down followed by a scale-up,
// the scale-down will be ignored and we will end up with a desired capacity of 1.
// However, if we get the scale-up first, the scale-down will drive desired capacity back to zero.
// This could be fixed by matching events' `workflow_job.run_id` with capacity reservations,
// but that would be a lot of work. So for now we allow for some slop, and hope that
// GitHub provides a better autoscaling solution soon.
if amount > 0 {
copy.Spec.CapacityReservations = append(copy.Spec.CapacityReservations, v1alpha1.CapacityReservation{
EffectiveTime: metav1.Time{Time: now},
ExpirationTime: metav1.Time{Time: now.Add(scale.trigger.Duration.Duration)},
Replicas: amount,
})
scale.log.V(2).Info("Adding capacity reservation", "amount", amount)
// Parts of this function require that Spec.CapacityReservations.Replicas always equals 1.
// Enforce that rule no matter what the `amount` value is
for i := 0; i < amount; i++ {
copy.Spec.CapacityReservations = append(copy.Spec.CapacityReservations, v1alpha1.CapacityReservation{
EffectiveTime: metav1.Time{Time: now},
ExpirationTime: metav1.Time{Time: now.Add(scale.trigger.Duration.Duration)},
Replicas: 1,
})
}
added += amount
} else if amount < 0 {
var reservations []v1alpha1.CapacityReservation
scale.log.V(2).Info("Removing capacity reservation", "amount", -amount)
var (
found bool
foundIdx int
)
for i, r := range copy.Spec.CapacityReservations {
r := r
if !found && r.Replicas+amount == 0 {
found = true
foundIdx = i
} else {
// Note that we nil-check max replicas because this "fix" is needed only when there is the upper limit of runners.
// In other words, you don't need to reset effective time and expiration time when there is no max replicas.
// That's because the desired replicas would already contain the reservation since it's creation.
if found && copy.Spec.MaxReplicas != nil && i > foundIdx+*copy.Spec.MaxReplicas {
// Update newer CapacityReservations' time to now to trigger reconcile
// Without this, we might stuck in minReplicas unnecessarily long.
// That is, we might not scale up after an ephemeral runner has been deleted
// until a new scale up, all runners finish, or after DefaultRunnerPodRecreationDelayAfterWebhookScale
// See https://github.com/actions/actions-runner-controller/issues/2254 for more context.
r.EffectiveTime = metav1.Time{Time: now}
// We also reset the scale trigger expiration time, so that you don't need to tweak
// scale trigger duratoin depending on maxReplicas.
// A detailed explanation follows.
//
// Let's say maxReplicas=3 and the workflow job of status=canceled result in deleting the first capacity reservation hence i=0.
// We are interested in at least four reservations and runners:
// i=0 - already included in the current desired replicas, but just got deleted
// i=1-2 - already included in the current desired replicas
// i=3 - not yet included in the current desired replicas, might have been expired while waiting in the queue
//
// i=3 is especially important here- If we didn't reset the expiration time of 3rd reservation,
// it might expire before a corresponding runner is created, due to the delay between the expiration timer starts and the runner is created.
//
// Why is there such delay? Because ARC implements the scale duration and expiration as such...
// The expiration timer starts when the reservation is created, while the runner is created only after the corresponding reservation fits within maxReplicas.
//
// We address that, by resetting the expiration time for fourth(i=3 in the above example) and subsequent reservations when the first reservation gets cancelled.
r.ExpirationTime = metav1.Time{Time: now.Add(scale.trigger.Duration.Duration)}
}
reservations = append(reservations, r)
}
// Remove the requested number of reservations unless there are not that many left
if len(copy.Spec.CapacityReservations) > -amount {
copy.Spec.CapacityReservations = copy.Spec.CapacityReservations[-amount:]
} else {
copy.Spec.CapacityReservations = nil
}
copy.Spec.CapacityReservations = reservations
completed += amount
// This "completed" represents the number of completed and therefore removed runners in this batch,
// which is logged later.
// As the amount is negative for a scale-down trigger, we make the "completed" amount positive by negating the amount.
// That way, the user can see the number of removed runners(like 3), rather than the delta (like -3) in the number of runners.
completed -= amount
}
}
before := len(hra.Spec.CapacityReservations)
expired := before - len(copy.Spec.CapacityReservations)
after := len(copy.Spec.CapacityReservations)
s.Log.V(1).Info(
@@ -234,9 +277,5 @@ func (s *batchScaler) batchScale(ctx context.Context, batch batchScaleOperation)
"after", after,
)
if err := s.Client.Patch(ctx, copy, client.MergeFrom(&hra)); err != nil {
return fmt.Errorf("patching horizontalrunnerautoscaler to add capacity reservation: %w", err)
}
return nil
return copy, nil
}

View File

@@ -0,0 +1,166 @@
package actionssummerwindnet
import (
"context"
"testing"
"time"
"github.com/actions/actions-runner-controller/apis/actions.summerwind.net/v1alpha1"
"github.com/go-logr/logr"
"github.com/stretchr/testify/require"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)
func TestPlanBatchScale(t *testing.T) {
s := &batchScaler{Log: logr.Discard()}
var (
expiry = 10 * time.Second
interval = 3 * time.Second
t0 = time.Now()
t1 = t0.Add(interval)
t2 = t1.Add(interval)
)
check := func(t *testing.T, amount int, newExpiry time.Duration, wantReservations []v1alpha1.CapacityReservation) {
t.Helper()
var (
op = batchScaleOperation{
scaleOps: []scaleOperation{
{
log: logr.Discard(),
trigger: v1alpha1.ScaleUpTrigger{
Amount: amount,
Duration: metav1.Duration{Duration: newExpiry},
},
},
},
}
hra = &v1alpha1.HorizontalRunnerAutoscaler{
Spec: v1alpha1.HorizontalRunnerAutoscalerSpec{
MaxReplicas: intPtr(1),
ScaleUpTriggers: []v1alpha1.ScaleUpTrigger{
{
Amount: 1,
Duration: metav1.Duration{Duration: newExpiry},
},
},
CapacityReservations: []v1alpha1.CapacityReservation{
{
EffectiveTime: metav1.NewTime(t0),
ExpirationTime: metav1.NewTime(t0.Add(expiry)),
Replicas: 1,
},
{
EffectiveTime: metav1.NewTime(t1),
ExpirationTime: metav1.NewTime(t1.Add(expiry)),
Replicas: 1,
},
},
},
}
)
want := hra.DeepCopy()
want.Spec.CapacityReservations = wantReservations
got, err := s.planBatchScale(context.Background(), op, hra, t2)
require.NoError(t, err)
require.Equal(t, want, got)
}
t.Run("scale up", func(t *testing.T) {
check(t, 1, expiry, []v1alpha1.CapacityReservation{
{
// This is kept based on t0 because it falls within maxReplicas
// i.e. the corresponding runner has assumbed to be already deployed.
EffectiveTime: metav1.NewTime(t0),
ExpirationTime: metav1.NewTime(t0.Add(expiry)),
Replicas: 1,
},
{
// Updated from t1 to t2 due to this exceeded maxReplicas
EffectiveTime: metav1.NewTime(t2),
ExpirationTime: metav1.NewTime(t2.Add(expiry)),
Replicas: 1,
},
{
// This is based on t2(=now) because it has been added just now.
EffectiveTime: metav1.NewTime(t2),
ExpirationTime: metav1.NewTime(t2.Add(expiry)),
Replicas: 1,
},
})
})
t.Run("scale up reuses previous scale trigger duration for extension", func(t *testing.T) {
newExpiry := expiry + time.Second
check(t, 1, newExpiry, []v1alpha1.CapacityReservation{
{
// This is kept based on t0 because it falls within maxReplicas
// i.e. the corresponding runner has assumbed to be already deployed.
EffectiveTime: metav1.NewTime(t0),
ExpirationTime: metav1.NewTime(t0.Add(expiry)),
Replicas: 1,
},
{
// Updated from t1 to t2 due to this exceeded maxReplicas
EffectiveTime: metav1.NewTime(t2),
ExpirationTime: metav1.NewTime(t2.Add(expiry)),
Replicas: 1,
},
{
// This is based on t2(=now) because it has been added just now.
EffectiveTime: metav1.NewTime(t2),
ExpirationTime: metav1.NewTime(t2.Add(newExpiry)),
Replicas: 1,
},
})
})
t.Run("scale down", func(t *testing.T) {
check(t, -1, expiry, []v1alpha1.CapacityReservation{
{
// Updated from t1 to t2 due to this exceeded maxReplicas
EffectiveTime: metav1.NewTime(t2),
ExpirationTime: metav1.NewTime(t2.Add(expiry)),
Replicas: 1,
},
})
})
t.Run("scale down is not affected by new scale trigger duration", func(t *testing.T) {
check(t, -1, expiry+time.Second, []v1alpha1.CapacityReservation{
{
// Updated from t1 to t2 due to this exceeded maxReplicas
EffectiveTime: metav1.NewTime(t2),
ExpirationTime: metav1.NewTime(t2.Add(expiry)),
Replicas: 1,
},
})
})
// TODO: Keep refreshing the expiry date even when there are no other scale down/up triggers before the expiration
t.Run("extension", func(t *testing.T) {
check(t, 0, expiry, []v1alpha1.CapacityReservation{
{
// This is kept based on t0 because it falls within maxReplicas
// i.e. the corresponding runner has assumbed to be already deployed.
EffectiveTime: metav1.NewTime(t0),
ExpirationTime: metav1.NewTime(t0.Add(expiry)),
Replicas: 1,
},
{
// Updated from t1 to t2 due to this exceeded maxReplicas
EffectiveTime: metav1.NewTime(t2),
ExpirationTime: metav1.NewTime(t2.Add(expiry)),
Replicas: 1,
},
})
})
}

View File

@@ -30,7 +30,7 @@ import (
"sigs.k8s.io/controller-runtime/pkg/reconcile"
"github.com/go-logr/logr"
gogithub "github.com/google/go-github/v47/github"
gogithub "github.com/google/go-github/v52/github"
"k8s.io/apimachinery/pkg/runtime"
"k8s.io/client-go/tools/record"
ctrl "sigs.k8s.io/controller-runtime"
@@ -115,7 +115,7 @@ func (autoscaler *HorizontalRunnerAutoscalerGitHubWebhook) Handle(w http.Respons
}()
// respond ok to GET / e.g. for health check
if r.Method == http.MethodGet {
if strings.ToUpper(r.Method) == http.MethodGet {
ok = true
fmt.Fprintln(w, "webhook server is running")
return
@@ -210,13 +210,23 @@ func (autoscaler *HorizontalRunnerAutoscalerGitHubWebhook) Handle(w http.Respons
if e.GetAction() == "queued" {
target.Amount = 1
break
} else if e.GetAction() == "completed" && e.GetWorkflowJob().GetConclusion() != "skipped" && e.GetWorkflowJob().GetRunnerID() > 0 {
// A negative amount is processed in the tryScale func as a scale-down request,
// that erases the oldest CapacityReservation with the same amount.
// If the first CapacityReservation was with Replicas=1, this negative scale target erases that,
// so that the resulting desired replicas decreases by 1.
target.Amount = -1
break
} else if e.GetAction() == "completed" && e.GetWorkflowJob().GetConclusion() != "skipped" {
// We want to filter out "completed" events sent by check runs.
// See https://github.com/actions/actions-runner-controller/issues/2118
// and https://github.com/actions/actions-runner-controller/pull/2119
// But canceled events have runner_id == 0 and GetRunnerID() returns 0 when RunnerID == nil,
// so we need to be more specific in filtering out the check runs.
// See example check run completion at https://gist.github.com/nathanklick/268fea6496a4d7b14cecb2999747ef84
if e.GetWorkflowJob().GetConclusion() == "success" && e.GetWorkflowJob().RunnerID == nil {
log.V(1).Info("Ignoring workflow_job event because it does not relate to a self-hosted runner")
} else {
// A negative amount is processed in the tryScale func as a scale-down request,
// that erases the oldest CapacityReservation with the same amount.
// If the first CapacityReservation was with Replicas=1, this negative scale target erases that,
// so that the resulting desired replicas decreases by 1.
target.Amount = -1
break
}
}
// If the conclusion is "skipped", we will ignore it and fallthrough to the default case.
fallthrough

View File

@@ -14,7 +14,7 @@ import (
actionsv1alpha1 "github.com/actions/actions-runner-controller/apis/actions.summerwind.net/v1alpha1"
"github.com/go-logr/logr"
"github.com/google/go-github/v47/github"
"github.com/google/go-github/v52/github"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/runtime"
clientgoscheme "k8s.io/client-go/kubernetes/scheme"
@@ -376,19 +376,24 @@ func TestGetRequest(t *testing.T) {
func TestGetValidCapacityReservations(t *testing.T) {
now := time.Now()
duration, _ := time.ParseDuration("10m")
effectiveTime := now.Add(-duration)
hra := &actionsv1alpha1.HorizontalRunnerAutoscaler{
Spec: actionsv1alpha1.HorizontalRunnerAutoscalerSpec{
CapacityReservations: []actionsv1alpha1.CapacityReservation{
{
EffectiveTime: metav1.Time{Time: effectiveTime.Add(-time.Second)},
ExpirationTime: metav1.Time{Time: now.Add(-time.Second)},
Replicas: 1,
},
{
EffectiveTime: metav1.Time{Time: effectiveTime},
ExpirationTime: metav1.Time{Time: now},
Replicas: 2,
},
{
EffectiveTime: metav1.Time{Time: effectiveTime.Add(time.Second)},
ExpirationTime: metav1.Time{Time: now.Add(time.Second)},
Replicas: 3,
},

View File

@@ -8,7 +8,7 @@ import (
"time"
github2 "github.com/actions/actions-runner-controller/github"
"github.com/google/go-github/v47/github"
"github.com/google/go-github/v52/github"
"github.com/actions/actions-runner-controller/github/fake"

View File

@@ -91,7 +91,7 @@ func TestNewRunnerPod(t *testing.T) {
},
},
{
Name: "docker-sock",
Name: "var-run",
VolumeSource: corev1.VolumeSource{
EmptyDir: &corev1.EmptyDirVolumeSource{
Medium: corev1.StorageMediumMemory,
@@ -155,7 +155,7 @@ func TestNewRunnerPod(t *testing.T) {
},
{
Name: "DOCKER_HOST",
Value: "unix:///run/docker/docker.sock",
Value: "unix:///run/docker.sock",
},
},
VolumeMounts: []corev1.VolumeMount{
@@ -168,8 +168,8 @@ func TestNewRunnerPod(t *testing.T) {
MountPath: "/runner/_work",
},
{
Name: "docker-sock",
MountPath: "/run/docker",
Name: "var-run",
MountPath: "/run",
},
},
ImagePullPolicy: corev1.PullAlways,
@@ -180,7 +180,7 @@ func TestNewRunnerPod(t *testing.T) {
Image: "default-docker-image",
Args: []string{
"dockerd",
"--host=unix:///run/docker/docker.sock",
"--host=unix:///run/docker.sock",
"--group=$(DOCKER_GROUP_GID)",
},
Env: []corev1.EnvVar{
@@ -195,8 +195,8 @@ func TestNewRunnerPod(t *testing.T) {
MountPath: "/runner",
},
{
Name: "docker-sock",
MountPath: "/run/docker",
Name: "var-run",
MountPath: "/run",
},
{
Name: "work",
@@ -543,7 +543,7 @@ func TestNewRunnerPod(t *testing.T) {
},
},
{
Name: "docker-sock",
Name: "var-run",
VolumeSource: corev1.VolumeSource{
EmptyDir: &corev1.EmptyDirVolumeSource{
Medium: corev1.StorageMediumMemory,
@@ -562,8 +562,8 @@ func TestNewRunnerPod(t *testing.T) {
MountPath: "/runner",
},
{
Name: "docker-sock",
MountPath: "/run/docker",
Name: "var-run",
MountPath: "/run",
},
}
}),
@@ -587,7 +587,7 @@ func TestNewRunnerPod(t *testing.T) {
},
},
{
Name: "docker-sock",
Name: "var-run",
VolumeSource: corev1.VolumeSource{
EmptyDir: &corev1.EmptyDirVolumeSource{
Medium: corev1.StorageMediumMemory,
@@ -676,7 +676,7 @@ func TestNewRunnerPodFromRunnerController(t *testing.T) {
},
},
{
Name: "docker-sock",
Name: "var-run",
VolumeSource: corev1.VolumeSource{
EmptyDir: &corev1.EmptyDirVolumeSource{
Medium: corev1.StorageMediumMemory,
@@ -740,7 +740,7 @@ func TestNewRunnerPodFromRunnerController(t *testing.T) {
},
{
Name: "DOCKER_HOST",
Value: "unix:///run/docker/docker.sock",
Value: "unix:///run/docker.sock",
},
{
Name: "RUNNER_NAME",
@@ -761,8 +761,8 @@ func TestNewRunnerPodFromRunnerController(t *testing.T) {
MountPath: "/runner/_work",
},
{
Name: "docker-sock",
MountPath: "/run/docker",
Name: "var-run",
MountPath: "/run",
},
},
ImagePullPolicy: corev1.PullAlways,
@@ -773,7 +773,7 @@ func TestNewRunnerPodFromRunnerController(t *testing.T) {
Image: "default-docker-image",
Args: []string{
"dockerd",
"--host=unix:///run/docker/docker.sock",
"--host=unix:///run/docker.sock",
"--group=$(DOCKER_GROUP_GID)",
},
Env: []corev1.EnvVar{
@@ -788,8 +788,8 @@ func TestNewRunnerPodFromRunnerController(t *testing.T) {
MountPath: "/runner",
},
{
Name: "docker-sock",
MountPath: "/run/docker",
Name: "var-run",
MountPath: "/run",
},
{
Name: "work",
@@ -1149,8 +1149,8 @@ func TestNewRunnerPodFromRunnerController(t *testing.T) {
MountPath: "/runner/_work",
},
{
Name: "docker-sock",
MountPath: "/run/docker",
Name: "var-run",
MountPath: "/run",
},
},
},
@@ -1170,7 +1170,7 @@ func TestNewRunnerPodFromRunnerController(t *testing.T) {
},
},
{
Name: "docker-sock",
Name: "var-run",
VolumeSource: corev1.VolumeSource{
EmptyDir: &corev1.EmptyDirVolumeSource{
Medium: corev1.StorageMediumMemory,
@@ -1186,8 +1186,8 @@ func TestNewRunnerPodFromRunnerController(t *testing.T) {
MountPath: "/runner/_work",
},
{
Name: "docker-sock",
MountPath: "/run/docker",
Name: "var-run",
MountPath: "/run",
},
{
Name: "runner",
@@ -1219,7 +1219,7 @@ func TestNewRunnerPodFromRunnerController(t *testing.T) {
},
},
{
Name: "docker-sock",
Name: "var-run",
VolumeSource: corev1.VolumeSource{
EmptyDir: &corev1.EmptyDirVolumeSource{
Medium: corev1.StorageMediumMemory,

View File

@@ -20,6 +20,7 @@ import (
"context"
"errors"
"fmt"
"k8s.io/apimachinery/pkg/api/resource"
"reflect"
"strconv"
"strings"
@@ -30,7 +31,6 @@ import (
"github.com/go-logr/logr"
kerrors "k8s.io/apimachinery/pkg/api/errors"
"k8s.io/apimachinery/pkg/api/resource"
"k8s.io/apimachinery/pkg/runtime"
"k8s.io/client-go/tools/record"
ctrl "sigs.k8s.io/controller-runtime"
@@ -607,10 +607,13 @@ func (r *RunnerReconciler) newPod(runner v1alpha1.Runner) (corev1.Pod, error) {
if runnerSpec.ContainerMode == "kubernetes" {
return pod, errors.New("volume mount \"work\" should be specified by workVolumeClaimTemplate in container mode kubernetes")
}
// remove work volume since it will be provided from runnerSpec.Volumes
// if we don't remove it here we would get a duplicate key error, i.e. two volumes named work
_, index := workVolumeMountPresent(pod.Spec.Containers[0].VolumeMounts)
pod.Spec.Containers[0].VolumeMounts = append(pod.Spec.Containers[0].VolumeMounts[:index], pod.Spec.Containers[0].VolumeMounts[index+1:]...)
podSpecIsPresent, index := workVolumeMountPresent(pod.Spec.Containers[0].VolumeMounts)
if podSpecIsPresent {
// remove work volume since it will be provided from runnerSpec.Volumes
// if we don't remove it here we would get a duplicate key error, i.e. two volumes named work
pod.Spec.Containers[0].VolumeMounts = append(pod.Spec.Containers[0].VolumeMounts[:index], pod.Spec.Containers[0].VolumeMounts[index+1:]...)
}
}
pod.Spec.Containers[0].VolumeMounts = append(pod.Spec.Containers[0].VolumeMounts, runnerSpec.VolumeMounts...)
@@ -623,11 +626,13 @@ func (r *RunnerReconciler) newPod(runner v1alpha1.Runner) (corev1.Pod, error) {
if runnerSpec.ContainerMode == "kubernetes" {
return pod, errors.New("volume \"work\" should be specified by workVolumeClaimTemplate in container mode kubernetes")
}
_, index := workVolumePresent(pod.Spec.Volumes)
// remove work volume since it will be provided from runnerSpec.Volumes
// if we don't remove it here we would get a duplicate key error, i.e. two volumes named work
pod.Spec.Volumes = append(pod.Spec.Volumes[:index], pod.Spec.Volumes[index+1:]...)
podSpecIsPresent, index := workVolumePresent(pod.Spec.Volumes)
if podSpecIsPresent {
// remove work volume since it will be provided from runnerSpec.Volumes
// if we don't remove it here we would get a duplicate key error, i.e. two volumes named work
pod.Spec.Volumes = append(pod.Spec.Volumes[:index], pod.Spec.Volumes[index+1:]...)
}
}
pod.Spec.Volumes = append(pod.Spec.Volumes, runnerSpec.Volumes...)
@@ -778,6 +783,11 @@ func newRunnerPodWithContainerMode(containerMode string, template corev1.Pod, ru
useRunnerStatusUpdateHook = d.UseRunnerStatusUpdateHook
)
const (
varRunVolumeName = "var-run"
varRunVolumeMountPath = "/run"
)
if containerMode == "kubernetes" {
dockerdInRunner = false
dockerEnabled = false
@@ -805,6 +815,11 @@ func newRunnerPodWithContainerMode(containerMode string, template corev1.Pod, ru
dockerRegistryMirror = *runnerSpec.DockerRegistryMirror
}
if runnerSpec.DockerVarRunVolumeSizeLimit == nil {
runnerSpec.DockerVarRunVolumeSizeLimit = resource.NewScaledQuantity(1, resource.Mega)
}
// Be aware some of the environment variables are used
// in the runner entrypoint script
env := []corev1.EnvVar{
@@ -1020,7 +1035,7 @@ func newRunnerPodWithContainerMode(containerMode string, template corev1.Pod, ru
// explicitly invoke `dockerd` to avoid automatic TLS / TCP binding
dockerdContainer.Args = append([]string{
"dockerd",
"--host=unix:///run/docker/docker.sock",
"--host=unix:///run/docker.sock",
}, dockerdContainer.Args...)
// this must match a GID for the user in the runner image
@@ -1054,7 +1069,7 @@ func newRunnerPodWithContainerMode(containerMode string, template corev1.Pod, ru
runnerContainer.Env = append(runnerContainer.Env,
corev1.EnvVar{
Name: "DOCKER_HOST",
Value: "unix:///run/docker/docker.sock",
Value: "unix:///run/docker.sock",
},
)
@@ -1071,11 +1086,11 @@ func newRunnerPodWithContainerMode(containerMode string, template corev1.Pod, ru
pod.Spec.Volumes = append(pod.Spec.Volumes,
corev1.Volume{
Name: "docker-sock",
Name: varRunVolumeName,
VolumeSource: corev1.VolumeSource{
EmptyDir: &corev1.EmptyDirVolumeSource{
Medium: corev1.StorageMediumMemory,
SizeLimit: resource.NewScaledQuantity(1, resource.Mega),
SizeLimit: runnerSpec.DockerVarRunVolumeSizeLimit,
},
},
},
@@ -1090,11 +1105,11 @@ func newRunnerPodWithContainerMode(containerMode string, template corev1.Pod, ru
)
}
if ok, _ := volumeMountPresent("docker-sock", runnerContainer.VolumeMounts); !ok {
if ok, _ := volumeMountPresent(varRunVolumeName, runnerContainer.VolumeMounts); !ok {
runnerContainer.VolumeMounts = append(runnerContainer.VolumeMounts,
corev1.VolumeMount{
Name: "docker-sock",
MountPath: "/run/docker",
Name: varRunVolumeName,
MountPath: varRunVolumeMountPath,
},
)
}
@@ -1108,10 +1123,10 @@ func newRunnerPodWithContainerMode(containerMode string, template corev1.Pod, ru
},
}
if p, _ := volumeMountPresent("docker-sock", dockerdContainer.VolumeMounts); !p {
if p, _ := volumeMountPresent(varRunVolumeName, dockerdContainer.VolumeMounts); !p {
dockerVolumeMounts = append(dockerVolumeMounts, corev1.VolumeMount{
Name: "docker-sock",
MountPath: "/run/docker",
Name: varRunVolumeName,
MountPath: varRunVolumeMountPath,
})
}

View File

@@ -9,7 +9,7 @@ import (
"github.com/actions/actions-runner-controller/github"
"github.com/go-logr/logr"
gogithub "github.com/google/go-github/v47/github"
gogithub "github.com/google/go-github/v52/github"
corev1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
ctrl "sigs.k8s.io/controller-runtime"

View File

@@ -14,7 +14,7 @@ You can create workflows that build and test every pull request to your reposito
Runners execute the job that is assigned to them by Github Actions workflow. There are two types of Runners:
- [Github-hosted runners](https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners) - GitHub provides Linux, Windows, and macOS virtual machines to run your workflows. These virtual machines are hosted in the cloud by Github.
- [Self-hosted runners](https://docs.github.com/en/actions/hosting-your-own-runners/about-self-hosted-runners) - you can host your own self-hosted runners in your own data center or cloud infrastructure. ARC deploys self-hosted runners.
- [Self-hosted runners](https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners/about-self-hosted-runners) - you can host your own self-hosted runners in your own data center or cloud infrastructure. ARC deploys self-hosted runners.
## Self hosted runners
Self-hosted runners offer more control of hardware, operating system, and software tools than GitHub-hosted runners. With self-hosted runners, you can create custom hardware configurations that meet your needs with processing power or memory to run larger jobs, install software available on your local network, and choose an operating system not offered by GitHub-hosted runners.
@@ -83,7 +83,7 @@ The GitHub hosted runners include a large amount of pre-installed software packa
ARC maintains a few runner images with `latest` aligning with GitHub's Ubuntu version. These images do not contain all of the software installed on the GitHub runners. They contain subset of packages from the GitHub runners: Basic CLI packages, git, docker and build-essentials. To install additional software, it is recommended to use the corresponding setup actions. For instance, `actions/setup-java` for Java or `actions/setup-node` for Node.
## Executing workflows
Now, all the setup and configuration is done. A workflow can be created in the same repository that could target the self hosted runner created from ARC. The workflow needs to have `runs-on: self-hosted` so it can target the self host pool. For more information on targeting workflows to run on self hosted runners, see "[Using Self-hosted runners](https://docs.github.com/en/actions/hosting-your-own-runners/using-self-hosted-runners-in-a-workflow)."
Now, all the setup and configuration is done. A workflow can be created in the same repository that could target the self hosted runner created from ARC. The workflow needs to have `runs-on: self-hosted` so it can target the self host pool. For more information on targeting workflows to run on self hosted runners, see "[Using Self-hosted runners](https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners/using-self-hosted-runners-in-a-workflow)."
## Scaling runners - statically with replicas count
With a small tweak to the replicas count (for eg - `replicas: 2`) in the `runnerdeployment.yaml` file, more runners can be created. Depending on the count of replicas, those many sets of pods would be created. As before, Each pod contains the two containers.

View File

@@ -10,7 +10,7 @@ When a new [runner](https://github.com/actions/runner) version is released, new
images need to be built in
[actions-runner-controller/releases](https://github.com/actions-runner-controller/releases).
This is currently started by the
[release-runners](https://github.com/actions/actions-runner-controller/blob/master/.github/workflows/release-runners.yaml)
[release-runners](https://github.com/actions/actions-runner-controller/blob/master/.github/workflows/arc-release-runners.yaml)
workflow, although this only starts when the set of file containing the runner
version is updated (and this is currently done manually).
@@ -19,7 +19,7 @@ version is updated (and this is currently done manually).
We can have another workflow running on a cadence (hourly seems sensible) and checking for new runner
releases, creating a PR updating `RUNNER_VERSION` in:
- `.github/workflows/release-runners.yaml`
- `.github/workflows/arc-release-runners.yaml`
- `Makefile`
- `runner/Makefile`
- `test/e2e/e2e_test.go`

View File

@@ -26,7 +26,7 @@ At the moment we have three workflows that validate Go code:
- [Validate ARC](https://github.com/actions/actions-runner-controller/blob/01e9dd3/.github/workflows/validate-arc.yaml):
this is a bit of a catch-all workflow, other than Go tests this also validates
Kubernetes manifests, runs `go generate`, `go fmt` and `go vet`
- [Run CodeQL](https://github.com/actions/actions-runner-controller/blob/a095f0b66aad5fbc8aa8d7032f3299233e4c84d2/.github/workflows/run-codeql.yaml)
- [Run CodeQL](https://github.com/actions/actions-runner-controller/blob/master/.github/workflows/global-run-codeql.yaml)
### Proposal

View File

@@ -0,0 +1,213 @@
# Exposing metrics
Date: 2023-05-08
**Status**: Proposed
## Context
Prometheus metrics are a common way to monitor the cluster. Providing metrics
can be a helpful way to monitor scale sets and the health of the ephemeral runners.
## Proposal
Two main components are driving the behavior of the scale set:
1. ARC controllers responsible for managing Kubernetes resources.
2. The `AutoscalingListener`, driver of the autoscaling solution responsible for
describing the desired state.
We can approach publishing those metrics in 3 different ways
### Option 1: Expose a metrics endpoint for the controller-manager and every instance of the listener
To expose metrics, we would need to create 3 additional resources:
1. `ServiceMonitor` - a resource used by Prometheus to match namespaces and
services from where it needs to gather metrics
2. `Service` for the `gha-runner-scale-set-controller` - service that will
target ARC controller `Deployment`
3. `Service` for each `gha-runner-scale-set` listener - service that will target
a single listener pod for each `AutoscalingRunnerSet`
#### Pros
- Easy to control which scale set exposes metrics and which does not.
- Easy to implement using helm charts in case they are enabled per chart
installation.
#### Cons
- With a cluster running many scale sets, we are going to create a lot of
resources.
- In case metrics are enabled on the controller manager level, and they should
be applied across all `AutoscalingRunnerSets`, it is difficult to inherit this
configuration by applying helm charts.
### Option 2: Create a single metrics aggregator service
To create an aggregator service, we can create a simple web application
responsible for publishing and gathering metrics. All listeners would be
responsible to communicate the metrics on each message, and controllers are
responsible to communicate the metrics on each reconciliation.
The application can be executed as a single pod, or as a side container next to
the manager.
#### Running the aggregator as a container in the controller-manager pod
**Pros:**
- It exists side by side and is following the life cycle of the controller
manager
- We don't need to introduce another controller managing the state of the pod
**Cons**
- Crashes of the aggregator can influence the controller manager execution
- The controller manager pod needs more resources to run
#### Running the aggregator in a separate pod
**Pros**
- Does not influence the controller manager pod
- The life cycle of the metric can be controlled by the controller manager (by
implementing another controller)
**Cons**
- We need to implement the controller that can spin up the aggregator in case of
the crash.
- If we choose not to implement the controller, the resource like `Deployment`
can be used to manage the aggregator, but we lose control over its life cycle.
#### Metrics webserver requirements
1. Create a web server with a single `/metrics` endpoint. The endpoint will have
`POST` and `GET` methods registered. The `GET` is used by Prometheus to
fetch the metrics, while the `POST` is going to be used by controllers and
listeners to publish their metrics.
2. `ServiceMonitor` - to target the metrics aggregator service
3. `Service` sitting in front of the web server.
**Pros**
- This implementation requires a few additional resources to be created
in a cluster.
- Web server is easy to implement and easy to document - all metrics are aggregated in a
single package, and the web server only needs to apply them to its state on
`POST`. The `GET` handler is simple.
- We can avoid Pushgateway from Prometheus.
**Cons**
- Another image that we need to publish on release.
- Change in metric configuration (on manager update) would require re-creation
of all listeners. This is not a big problem but is something to point out.
- Managing requests/limits can be tricky.
### Option 3: Use a Prometheus Pushgateway
#### Pros
- Using a supported way of pushing the metrics.
- Easy to implement using their library.
#### Cons
- In the Prometheus docs, they specify that: "Usually, the only valid use case
for Pushgateway is for capturing the outcome of a service-level batch job".
The listener does not really fit this criteria.
- Pushgateway is a single point of failure and potential bottleneck.
- You lose Prometheus's automatic instance health monitoring via the up metric (generated on every scrape).
- The Pushgateway never forgets series pushed to it and will expose them to Prometheus forever unless those series are manually deleted via the Pushgateway's API.
## Decision
Since there are many ways in which you can collect metrics, we have decided not
to apply `prometheus-operator` resources nor `Service`.
The responsibility of the controller and the autoscaling listener is
only to expose metrics. It is up to the user to decide how to collect them.
When installing the ARC, the configuration for both the controller manager
and autoscaling listeners' metric servers is established.
### Controller metrics
By default, metrics server is listening on `0.0.0.0:8080`.
You can control the port of the metrics server using the `--metrics-addr` flag.
Metrics can be collected from `/metrics` endpoint
If the value of `--metrics-addr` is an empty string, metrics server won't be
started.
### Autoscaling listeners
By default, metrics server is listening on `0.0.0.0:8080`.
The endpoint used to expose metrics is `/metrics`.
You can control both the address and the endpoint using `--listener-metrics-addr` and `--listener-metrics-endpoint` flags.
If the value of `--listener-metrics-addr` is an empty string, metrics server won't be
started.
### Metrics exposed by the controller
To get a better understanding of health and workings of the cluster
resources, we need to expose the following metrics:
- `pending_ephemeral_runners` - Number of ephemeral runners in a pending state.
This information can show the latency between creating an `EphemeralRunner`
resource, and having an ephemeral runner pod started and ready to receive a
job.
- `running_ephemeral_runners` - Number of ephemeral runners currently running.
This information is helpful to see how many ephemeral runner pods are running
at any given time.
- `failed_ephemeral_runners` - Number of ephemeral runners in a `Failed` state.
This information is helpful to catch the faulty image, or some underlying
problem. When the ephemeral runner controller is not able to start the
ephemeral runner pod after multiple retries, it will set the state of the
`EphemeralRunner` to failed. Since the controller can not recover from this
state, it can be useful to set Prometheus alerts to catch this issue quickly.
### Metrics exposed by the `AutoscalingListener`
Since the listener is responsible for communicating the state with the actions
service, it can expose actions service related data through metrics. In
particular:
- `available_jobs` - Number of jobs with `runs-on` matching the runner scale set name. Jobs are not yet assigned but are acquired by the runner scale set.
- `acquired_jobs`- Number of jobs acquired by the scale set.
- `assigned_jobs` - Number of jobs assigned to this scale set.
- `running_jobs` - Number of jobs running (or about to be run).
- `registered_runners` - Number of registered runners.
- `busy_runners` - Number of registered runners running a job.
- `min_runners` - Number of runners desired by the scale set.
- `max_runners` - Number of runners desired by the scale set.
- `desired_runners` - Number of runners desired by the scale set.
- `idle_runners` - Number of registered runners not running a job.
- `available_jobs_total` - Total number of jobs available for the scale set (runs-on matches and scale set passes all the runner group permission checks).
- `acquired_jobs_total` - Total number of jobs acquired by the scale set.
- `assigned_jobs_total` - Total number of jobs assigned to the scale set.
- `started_jobs_total` - Total number of jobs started.
- `completed_jobs_total` - Total number of jobs completed.
- `job_queue_duration_seconds` - Time spent waiting for workflow jobs to get assigned to the scale set after queueing (in seconds).
- `job_startup_duration_seconds` - Time spent waiting for a workflow job to get started on the runner owned by the scale set (in seconds).
- `job_execution_duration_seconds` - Time spent executing workflow jobs by the scale set (in seconds).
### Metric names
Listener metrics belong to the `github_runner_scale_set` subsystem, so the names
are going to have the `github_runner_scale_set_` prefix.
Controller metrics belong to the `github_runner_scale_set_controller` subsystem,
so the names are going to have `github_runner_scale_set_controller` prefix.
## Consequences
Users can define alerts, monitor the behavior of both the actions-based metrics
(gathered from the listener) and the Kubernetes resource-based metrics
(gathered from the controller manager).

View File

@@ -0,0 +1,54 @@
# Customize listener pod
**Status**: Proposed
## Context
The Autoscaling listener is a critical component of the autoscaling solution, as it monitors the workload and communicates with the ephemeral runner set to adjust the number of runners as needed.
The problem can arise when cluster policies are configured to disallow pods with (or without) certain fields. Since the Autoscaling listener pod is an internal detail that is not currently customizable, it can be a blocker for users with Kubernetes clusters that enforce such policies.
## Decision
Expose field on the `AutoscalingRunnerSetSpec` resource called `ListenerTemplate` of type `PodTemplateSpec`.
Expose field on the `AutoscalingListenerSpec` resource called `Template` of type `PodTemplateSpec`.
The `AutoscalingRunnerSetController` is responsible for creating the `AutoscalingListener` with the `ListenerTemplate`.
The `AutoscalingListenerController` then creates the listener pod based on the default spec, and the customized spec.
List of fields that are going to be ignored by the merge:
- `spec.serviceAccountName`: Created by the AutoscalingListener.
- reserved `metadata.labels`: Labels that collide with reserved labels used by the system are ignored.
- reserved `spec.containers[0].env`: Environment variables used by the listener application
- `metadata.name`: Name of the listener pod
- `metadata.namespace`: Namespace of the listener pod
The change is extending the gha-runner-scale-set template. We will extend `values.yaml` file to add `listenerTemplate` object that is optional.
If not provided, the listener will be created the same way it was created before. Otherwise, the `listenerTemplate.metadata` and the `listenerTemplate.spec` are going to be merged with the default listener specification.
The way the merge will work is:
1. Create a default spec used for the listener
2. All non-reserved fields are going to be applied from the provided `listenerTemplate` if they are not empty. If empty, the default configuration is used.
3. For the container:
1. If the container name is "listener", values specified for that container are going to be merged with the default listener container spec. The name "listener" serves just as an indicator that the container spec should be merged with the listener container. Name will be overwritten by the controller. All fields are optional, and non-null fields are merged as described above.
2. If the container name is **not** "listener", the spec provided for that container will be appended to the `pod.spec.containers` without any modifications. Fields that must be specified are the required fields for the kubernetes container spec.
### Pros:
- Env `CONTROLLER_MANAGER_LISTENER_IMAGE_PULL_POLICY` can be removed as a global configuration for building Autoscaling listener resources
- Ability to customize securityContext, requests, limits, and other fields.
- Avoid re-creating CRDs whenever a new field requirement occurs. Fields that are not reserved by the controller are applied if specified.
### Cons:
- Keep the documentation updated when new reserved fields are introduced.
- Since the listener spec can be customized, debugging possible problems with customized spec can be harder.
## Consequences
With the listener pod spec exposed, we can provide a way to run ARC for users with policies prohibiting them to do so at this moment.

View File

@@ -22,7 +22,7 @@ _Note: Links are provided further down to create an app for your logged in user
* Actions (read)
* Administration (read / write)
* Checks (read) (if you are going to use [Webhook Driven Scaling](#webhook-driven-scaling))
* Checks (read) (if you are going to use [Webhook Driven Scaling](automatically-scaling-runners.md#webhook-driven-scaling))
* Metadata (read)
**Required Permissions for Organization Runners:**<br />
@@ -39,7 +39,7 @@ _Note: All API routes mapped to their permissions can be found [here](https://do
**Subscribe to events**
At this point you have a choice of configuring a webhook, a webhook is needed if you are going to use [webhook driven scaling](#webhook-driven-scaling). The webhook can be configured centrally in the GitHub app itself or separately. In either case you need to subscribe to the `Workflow Job` event.
At this point you have a choice of configuring a webhook, a webhook is needed if you are going to use [webhook driven scaling](automatically-scaling-runners.md#webhook-driven-scaling). The webhook can be configured centrally in the GitHub app itself or separately. In either case you need to subscribe to the `Workflow Job` event.
---

View File

@@ -17,7 +17,7 @@ This anti-flap configuration also has the final say on if a runner can be scaled
This delay is configurable via 2 methods:
1. By setting a new default via the controller's `--default-scale-down-delay` flag
2. By setting by setting the attribute `scaleDownDelaySecondsAfterScaleOut:` in a `HorizontalRunnerAutoscaler` kind's `spec:`.
2. By setting the attribute `scaleDownDelaySecondsAfterScaleOut:` in a `HorizontalRunnerAutoscaler` kind's `spec:`.
Below is a complete basic example of one of the pull driven scaling metrics.
@@ -133,7 +133,7 @@ The `HorizontalRunnerAutoscaler` will poll GitHub for the number of runners in t
**Benefits of this metric**
1. Supports named repositories server-side the same as the `TotalNumberOfQueuedAndInProgressWorkflowRuns` metric [#313](https://github.com/actions/actions-runner-controller/pull/313)
2. Supports GitHub organization wide scaling without maintaining an explicit list of repositories, this is especially useful for those that are working at a larger scale. [#223](https://github.com/actions/actions-runner-controller/pull/223)
3. Like all scaling metrics, you can manage workflow allocation to the RunnerDeployment through the use of [GitHub labels](#runner-labels)
3. Like all scaling metrics, you can manage workflow allocation to the RunnerDeployment through the use of [GitHub labels](using-arc-runners-in-a-workflow.md#runner-labels)
4. Supports scaling desired runner count on both a percentage increase / decrease basis as well as on a fixed increase / decrease count basis [#223](https://github.com/actions/actions-runner-controller/pull/223) [#315](https://github.com/actions/actions-runner-controller/pull/315)
**Drawbacks of this metric**
@@ -186,6 +186,38 @@ spec:
scaleDownAdjustment: 1 # The scale down runner count subtracted from the desired count
```
**Combining Pull Driven Scaling Metrics**
If a HorizontalRunnerAutoscaler is configured with a secondary metric of `TotalNumberOfQueuedAndInProgressWorkflowRuns`, then be aware that the controller will check the primary metric of `PercentageRunnersBusy` first and will only use the secondary metric to calculate the desired replica count if the primary metric returns 0 desired replicas.
`PercentageRunnersBusy` metrics must appear before `TotalNumberOfQueuedAndInProgressWorkflowRuns`; otherwise, the controller will fail to process the `HorizontalRunnerAutoscaler`. A valid configuration follows.
```yaml
---
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
name: example-runner-deployment-autoscaler
spec:
scaleTargetRef:
kind: RunnerDeployment
# # In case the scale target is RunnerSet:
# kind: RunnerSet
name: example-runner-deployment
minReplicas: 1
maxReplicas: 5
metrics:
- type: PercentageRunnersBusy
scaleUpThreshold: '0.75' # The percentage of busy runners at which the number of desired runners are re-evaluated to scale up
scaleDownThreshold: '0.3' # The percentage of busy runners at which the number of desired runners are re-evaluated to scale down
scaleUpAdjustment: 2 # The scale up runner count added to desired count
scaleDownAdjustment: 1 # The scale down runner count subtracted from the desired count
- type: TotalNumberOfQueuedAndInProgressWorkflowRuns
repositoryNames:
# A repository name is the REPO part of `github.com/OWNER/REPO`
- myrepo
```
## Webhook Driven Scaling
> This feature requires controller version => [v0.20.0](https://github.com/actions/actions-runner-controller/releases/tag/v0.20.0)
@@ -224,34 +256,54 @@ spec:
duration: "30m"
```
The lifecycle of a runner provisioned from a webhook is different to a runner provisioned from the pull based scaling method:
With the `workflowJob` trigger, each event adds or subtracts a single runner. the `scaleUpTriggers.amount` field is ignored.
The `duration` field is there because event delivery is not guaranteed. If a scale-up event is received, but the corresponding
scale-down event is not, then the extra runner would be left running forever if there were not some clean-up mechanism.
The `duration` field sets the maximum amount of time to wait for a scale-down event. Scale-down happens at the
earlier of receiving the scale-down event or the expiration of `duration` after the scale-up event is processed and
the scale-up itself is initiated.
The lifecycle of a runner provisioned from a webhook is different from that of a runner provisioned from the pull based scaling method:
1. GitHub sends a `workflow_job` event to ARC with `status=queued`
2. ARC finds a HRA with a `workflow_job` webhook scale trigger that backs a RunnerDeployment / RunnerSet with matching runner labels
3. The matched HRA adds a unit to its `capacityReservations` list
4. ARC adds a replica and sets the EffectiveTime of that replica to current + `HRA.spec.scaleUpTriggers[].duration`
2. ARC finds the HRA with a `workflow_job` webhook scale trigger that backs a RunnerDeployment / RunnerSet with matching runner labels. (If it finds more than one match, the event is ignored.)
3. The matched HRA adds a `capacityReservation` to its list and sets it to expire at current time + `HRA.spec.scaleUpTriggers[].duration`
4. If there are fewer replicas running than `maxReplicas`, HRA adds a replica and sets the EffectiveTime of that replica to the current time
At this point there are a few things that can happen, either the job gets allocated to the runner or the runner is left dangling due to it not being used, if the runner gets assigned the job that triggered the scale up the lifecycle looks like this:
At this point there are a few things that can happen:
1. Due to idle runners already being available, the job is assigned to one of them and the new runner is left dangling due to it not being used
2. The job gets allocated to the runner just launched
3. If there are already `maxReplicas` replicas running, the job waits for its `capacityReservation` to be assigned to one of them
If the runner gets assigned the job that triggered the scale up, the lifecycle looks like this:
1. The new runner gets allocated the job and processes it
2. Upon the job ending GitHub sends another `workflow_job` event to ARC but with `status=completed`
3. The HRA removes the oldest capacity reservation from its `capacityReservations` and picks a runner to terminate ensuring it isn't busy via the GitHub API beforehand
If the job has to wait for a runner because there are already `maxReplicas` replicas running, the lifecycle looks like this:
1. A `capacityReservation` is added to the list, but no scale-up happens because that would exceed `maxReplicas`
2. When one of the existing runners finishes a job, GitHub sends another `workflow_job` event to ARC but with `status=completed` (or `status=canceled` if the job was cancelled)
3. The HRA removes the oldest capacity reservation from its `capacityReservations`, the oldest waiting `capacityReservation` becomes active, and its `duration` timer starts
4. GitHub assigns a waiting job to the newly available runner
If the job is cancelled before it is allocated to a runner then the lifecycle looks like this:
1. Upon the job cancellation GitHub sends another `workflow_job` event to ARC but with `status=cancelled`
2. The HRA removes the oldest capacity reservation from its `capacityReservations` and picks a runner to terminate ensuring it isn't busy via the GitHub API beforehand
If runner is never used due to other runners matching needed runner group and required runner labels are allocated the job then the lifecycle looks like this:
If the `status=completed` or `status=cancelled` is never delivered to ARC (which happens occasionally) then the lifecycle looks like this:
1. The scale trigger duration specified via `HRA.spec.scaleUpTriggers[].duration` elapses
2. The HRA thinks the capacity reservation is expired, removes it from HRA's `capacityReservations` and terminates the expired runner ensuring it isn't busy via the GitHub API beforehand
2. The HRA notices that the capacity reservation has expired, removes it from HRA's `capacityReservation` list and (unless there are `maxReplicas` running and jobs waiting) terminates the expired runner ensuring it isn't busy via the GitHub API beforehand
Your `HRA.spec.scaleUpTriggers[].duration` value should be set long enough to account for the following things:
1. the potential amount of time it could take for a pod to become `Running` e.g. you need to scale horizontally because there isn't a node avaliable
2. the amount of time it takes for GitHub to allocate a job to that runner
3. the amount of time it takes for the runner to notice the allocated job and starts running it
1. The potential amount of time it could take for a pod to become `Running` e.g. you need to scale horizontally because there isn't a node available +
2. The amount of time it takes for GitHub to allocate a job to that runner +
3. The amount of time it takes for the runner to notice the allocated job and starts running it +
4. The length of time it takes for the runner to complete the job
### Install with Helm
@@ -411,8 +463,6 @@ The main use case for scaling from 0 is with the `HorizontalRunnerAutoscaler` ki
`PercentageRunnersBusy` can't be used alone for scale-from-zero as, by its definition, it needs one or more GitHub runners to become `busy` to be able to scale. If there isn't a runner to pick up a job and enter a `busy` state then the controller will never know to provision a runner to begin with as this metric has no knowledge of the job queue and is relying on using the number of busy runners as a means for calculating the desired replica count.
If a HorizontalRunnerAutoscaler is configured with a secondary metric of `TotalNumberOfQueuedAndInProgressWorkflowRuns` then be aware that the controller will check the primary metric of `PercentageRunnersBusy` first and will only use the secondary metric to calculate the desired replica count if the primary metric returns 0 desired replicas.
Webhook-based autoscaling is the best option as it is relatively easy to configure and also it can scale quickly.
## Scheduled Overrides

View File

@@ -2,7 +2,7 @@
## Usage
[GitHub self-hosted runners can be deployed at various levels in a management hierarchy](https://docs.github.com/en/actions/hosting-your-own-runners/about-self-hosted-runners#about-self-hosted-runners):
[GitHub self-hosted runners can be deployed at various levels in a management hierarchy](https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners/about-self-hosted-runners#about-self-hosted-runners):
- The repository level
- The organization level
- The enterprise level

Some files were not shown because too many files have changed in this diff Show More