Compare commits

..

34 Commits

Author SHA1 Message Date
callum-tait-pbx
0fce761686 fix: add trunate to ensure service kinds have valid names (#325)
* fix: adding truncate for service kinds

* chore : bumping chart version
2021-02-18 08:43:48 +09:00
Yusuke Kuoka
c88ff44518 Fix wip.yml workflow for building controller canary tags (#323)
In #306 we seem to have accidentally updated a wrong workflow, which was for runner builds. This updates the one for the controller.

Resolves #302
2021-02-18 08:42:24 +09:00
Yusuke Kuoka
2fdf35ac9d Refactor integration test to use helpers (#320)
This should make the test code a bit more DRY and readable.
2021-02-17 10:23:35 +09:00
Johannes Nicolai
6cce3fefc5 Add project to awesome-runners list (#319) 2021-02-17 09:14:42 +09:00
Yusuke Kuoka
eb2eaf8130 Fix TotalNumberOfQueuedAndInProgressWorkflowRuns to work with a lot of remaining completed jobs (#316)
I have heard from some user that they have hundred thousands of `status=completed` workflow runs in their repository which effectively blocked TotalNumberOfQueuedAndInProgressWorkflowRuns from working because of GitHub API rate limit due to excessive paginated requests.

This fixes that by separating list-workflow-runs calls to two - one for `queued` and one for `in_progress`, which can make the minimum API call from 1 to 2, but allows it to work regardless of number of remaining `completed` workflow runs.
2021-02-16 18:55:55 +09:00
callum-tait-pbx
7bf712d0d4 fix: duplicate name attribute (#318) 2021-02-16 18:52:08 +09:00
Yusuke Kuoka
7d024a6c05 Fix "duplicate metrics collector registration attempted" errors at startup (#317)
I have seen this error a lot in our integration test. It turned out due to https://github.com/kubernetes-sigs/controller-runtime/issues/484 and is being fixed with this change.
2021-02-16 18:51:33 +09:00
Yusuke Kuoka
434823bcb3 scale{Up,Down}Adjustment to add/remove constant number of replicas on scaling (#315)
* `scale{Up,Down}Adjustment` to add/remove constant number of replicas on scaling

Ref #305

* Bump chart version
2021-02-16 17:16:26 +09:00
Yusuke Kuoka
35d047db01 Fix enterprise runners misusing cached token (#314)
Follow-up for #290
2021-02-16 12:56:52 +09:00
Yusuke Kuoka
f1db6af1c5 Add repository runners support for PercentageRunnersBusy-based autoscaling (#313)
Resolves #258
2021-02-16 12:44:51 +09:00
Hidetake Iwata
4f3f2fb60d Add metrics for GitHub API rate limit (#312) 2021-02-16 09:58:09 +09:00
Johannes Nicolai
2623140c9a Make log message less scary (#311)
* the reconciliation loop is often much faster than the runner startup, 
so changing runner not found related messages to debug and also add the 
possibility that the runner just needs more time
2021-02-16 09:55:55 +09:00
Johannes Nicolai
1db9d9d574 Use ARM64 compatible kube-rbac-proxy from upstream (#310)
* as pointed out in #281 the currently used image for the 
kube-rbac-proxy - gcr.io/kubebuilder/kube-rbac-proxy:v0.4.1" - does not 
have an ARM64 image
* hence, trying to use the standard deployment manifest / helm char will 
fail on ARM64 systems
* replaced image with quay.io/brancz/kube-rbac-proxy:v0.8.0 which is the 
latest version from the upstream maintainer 
(https://github.com/brancz/kube-rbac-proxy/blob/master/Makefile#L13)
* successfully tested on both AMD64 and ARM64 clusters
* fixes #281
2021-02-16 09:55:03 +09:00
callum-tait-pbx
d046350240 chore: bumping helm chart sematically (#296)
* chore: bumping helm chart sematically

* chore: removing the app version config
2021-02-16 09:45:56 +09:00
callum-tait-pbx
cca4d249e9 feat: create workflow for runner releases (#306) 2021-02-16 09:42:28 +09:00
Johannes Nicolai
bc8bc70f69 Fix rate limit and runner registration logic (#309)
* errors.Is compares all members of a struct to return true which never 
happened
* switched to type check instead of exact value check
* notRegistered was using double negation in if statement which lead to 
unregistering runners after the registration timeout
2021-02-15 09:36:49 +09:00
Johannes Nicolai
34c6c3d9cd Pod eviction policy examples (crashed nodes) (#308)
* ... otherwise it will take 40 seconds (until a node is detected as unreachable) + 5 minutes (until pods are evicted from unreachable/crashed nodes)
* pods stuck in "Terminating" status on unreachable nodes will only be freed once #307 gets merged
2021-02-15 09:33:01 +09:00
Johannes Nicolai
9c8d7305f1 Introduce pod deletion timeout and forcefully delete stuck pods (#307)
* if a k8s node becomes unresponsive, the kube controller will soft
delete all pods after the eviction time (default 5 mins)
* as long as the node stays unresponsive, the pod will never leave the
last status and hence the runner controller will assume that everything
is fine with the pod and will not try to create new pods
* this can result in a situation where a horizontal autoscaler thinks
that none of its runners are currently busy and will not schedule any
further runners / pods, resulting in a broken  runner deployment until the
runnerreplicaset is deleted or the node comes back online
* introducing a pod deletion timeout (1 minute) after which the runner
controller will try to reboot the runner and create a pod on a working
node
* use forceful deletion and requeue for pods that have been stuck for
more than one minute in terminating state
* gracefully handling race conditions if pod gets finally forcefully deleted within
2021-02-15 09:32:28 +09:00
Yusuke Kuoka
addcbfa7ee Fix runner registration timeout (#301)
Fixes #300
2021-02-12 10:00:20 +09:00
Yusuke Kuoka
bbb036e732 feat: Prevent blocking on transient runner registration failure (#297)
This enhances the controller to recreate the runner pod if the corresponding runner has failed to register itself to GitHub within 10 minutes(currently hard-coded).

It should alleviate #288 in case the root cause is some kind of transient failures(network unreliability, GitHub down, temporarly compute resource shortage, etc).

Formerly you had to manually detect and delete such pods or even force-delete corresponding runners to unblock the controller.

Since this enhancement, the controller does the pod deletion automatically after 10 minutes after pod creation, which result in the controller create another pod that might work.

Ref #288
2021-02-09 10:17:52 +09:00
Yusuke Kuoka
9301409aec fix: Paginate ListRepositoryWorkflowRuns (#295)
When we used `QueuedAndInProgressWorkflowRuns`-based autoscaling, it only fetched and considered only the first 30 workflow runs at the reconcilation time. This may have resulted in unreliable scaling behaviour, like scale-in/out not happening when it was expected.
2021-02-09 10:13:53 +09:00
Yusuke Kuoka
ab1c39de57 feat: HorizontalRunnerAutoscaler Webhook server (#282)
* feat: HorizontalRunnerAutoscaler Webhook server

This introduces a Webhook server that responds GitHub `check_run`, `pull_request`, and `push` events by scaling up matched HorizontalRunnerAutoscaler by 1 replica. This allows you to immediately add "resource slack" for future GitHub Actions job runs, without waiting next sync period to add insufficient runners.

This feature is highly inspired by https://github.com/philips-labs/terraform-aws-github-runner. terraform-aws-github-runner can manage one set of runners per deployment, where actions-runner-controller with this feature can manage as many sets of runners as you declare with HorizontalRunnerAutoscaler and RunnerDeployment pairs.

On each GitHub event received, the webhook server queries repository-wide and organizational runners from the cluster and searches for the single target to scale up. The webhook server tries to match HorizontalRunnerAutoscaler.Spec.ScaleUpTriggers[].GitHubEvent.[CheckRun|Push|PullRequest] against the event and if it finds only one HRA, it is the scale target. If none or two or more targets are found for repository-wide runners, it does the same on organizational runners.

Changes:

* Fix integration test
* Update manifests
* chart: Add support for github webhook server
* dockerfile: Include github-webhook-server binary
* Do not import unversioned go-github
* Update README
2021-02-07 17:37:27 +09:00
alex-mozejko
a4350d0fc2 bug-fix: patched dir owned by runner (#284)
* bug-fix: patched dir owned by runner

* always build with latest runner version

* Revert "always build with latest runner version"

This reverts commit e719724ae9fe92a12d4a087185cf2a2ff543a0dd.

* Also patch dindrunner.Dockerfile

Co-authored-by: Yusuke Kuoka <ykuoka@gmail.com>
2021-02-07 17:21:10 +09:00
callum-tait-pbx
2146c62c9e chore: bumping Helm chart version (#294)
* chore: adding mac rubbish to gitignore

* chore: bumping chart version
2021-02-07 16:46:19 +09:00
Jesse Haka
28e80a2d28 Add support for enterprise runners (#290)
* Add support for enterprise runners

* update docs
2021-02-05 09:31:06 +09:00
Tom Bamford
831db9ee2a Added github.sha to DockerHub push (#286)
* Added GITHUB.RUN_NUMBER to DockerHub push

* switch run_number to sha on docker tag

* re-add mutable tags for backwards compatability

* truncate to short SHA (7 chars)

* behaviour workaround

* use ENV to define sha_short

* use ::set-output to define sha_short

* bump action
2021-02-04 09:29:32 +09:00
Donovan Muller
4d69e0806e Update GitHub runner version (#280) 2021-02-02 14:06:08 +09:00
Donovan Muller
d37cd69e9b feat/helm: Bump appVersion to 0.6.1 release (#272)
* feat/helm: Bump appVersion to 0.6.1 release

* Also bump chart version to trigger a new chart release

Co-authored-by: Yusuke Kuoka <c-ykuoka@zlab.co.jp>
2021-01-29 09:29:43 +09:00
Yusuke Kuoka
a2690aa5cb Update README.md
Follow-up for #275
2021-01-29 09:29:26 +09:00
Clément
da020df0fd docs: fix install installation method (#275) 2021-01-29 09:28:34 +09:00
Jonas Lergell
6c64ae6a01 Actually use 'dockerdContainerResources' to set resources on the dind container (#273) 2021-01-29 09:18:28 +09:00
Yusuke Kuoka
42c7d0489d chart: Bump to 0.2.0 2021-01-25 09:14:49 +09:00
Donovan Muller
b3bef6404c Add support for additional environment variables (#271) 2021-01-25 09:00:03 +09:00
David Young
1127c447c4 Add GitHub Actions to publish helm chart (#257)
* Add chart workflows (#1)

* Add chart workflows

* Fix publishing step in CI

Signed-off-by: David Young <davidy@funkypenguin.co.nz>

* Update CI on push-to-master (#3)

* Put helm installation step in the correct CI job

Signed-off-by: David Young <davidy@funkypenguin.co.nz>

* Put helm installation step in the correct CI job (#4)

* Update on-push-master-publish-chart.yml

* Remove references to certmanager dependency

Signed-off-by: David Young <davidy@funkypenguin.co.nz>

* Add ability to customize kube-rbac-proxy image

Signed-off-by: David Young <davidy@funkypenguin.co.nz>

* Only install cert-manager if we're going to spin up KinD

Signed-off-by: David Young <davidy@funkypenguin.co.nz>
2021-01-24 15:37:01 +09:00
64 changed files with 3235 additions and 1586 deletions

View File

@@ -1,3 +1,5 @@
name: Build and Release Runners
on:
pull_request:
branches:
@@ -14,7 +16,7 @@ on:
- runner/dindrunner.Dockerfile
- runner/entrypoint.sh
- .github/workflows/build-runner.yml
name: Runner
jobs:
build:
runs-on: ubuntu-latest
@@ -27,10 +29,14 @@ jobs:
- name: actions-runner-dind
dockerfile: dindrunner.Dockerfile
env:
RUNNER_VERSION: 2.275.1
RUNNER_VERSION: 2.276.1
DOCKER_VERSION: 19.03.12
DOCKERHUB_USERNAME: ${{ github.repository_owner }}
steps:
- name: Set outputs
id: vars
run: echo ::set-output name=sha_short::${GITHUB_SHA::7}
- name: Checkout
uses: actions/checkout@v2
@@ -49,16 +55,16 @@ jobs:
username: ${{ github.repository_owner }}
password: ${{ secrets.DOCKER_ACCESS_TOKEN }}
- name: Build [and Push]
- name: Build and Push
uses: docker/build-push-action@v2
with:
context: ./runner
file: ./runner/${{ matrix.dockerfile }}
platforms: linux/amd64,linux/arm64
push: ${{ github.event_name != 'pull_request' }}
build-args: |
RUNNER_VERSION=${{ env.RUNNER_VERSION }}
DOCKER_VERSION=${{ env.DOCKER_VERSION }}
tags: |
${{ env.DOCKERHUB_USERNAME }}/${{ matrix.name }}:v${{ env.RUNNER_VERSION }}
${{ env.DOCKERHUB_USERNAME }}/${{ matrix.name }}:v${{ env.RUNNER_VERSION }}-${{ steps.vars.outputs.sha_short }}
${{ env.DOCKERHUB_USERNAME }}/${{ matrix.name }}:latest

View File

@@ -0,0 +1,75 @@
name: Lint and Test Charts
on:
push:
paths:
- 'charts/**'
- '.github/**'
workflow_dispatch:
env:
KUBE_SCORE_VERSION: 1.10.0
HELM_VERSION: v3.4.1
jobs:
lint-test:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v2
with:
fetch-depth: 0
- name: Set up Helm
uses: azure/setup-helm@v1
with:
version: ${{ env.HELM_VERSION }}
- name: Set up kube-score
run: |
wget https://github.com/zegl/kube-score/releases/download/v${{ env.KUBE_SCORE_VERSION }}/kube-score_${{ env.KUBE_SCORE_VERSION }}_linux_amd64 -O kube-score
chmod 755 kube-score
- name: Kube-score generated manifests
run: helm template --values charts/.ci/values-kube-score.yaml charts/* | ./kube-score score -
--ignore-test pod-networkpolicy
--ignore-test deployment-has-poddisruptionbudget
--ignore-test deployment-has-host-podantiaffinity
--ignore-test container-security-context
--ignore-test pod-probes
--ignore-test container-image-tag
--enable-optional-test container-security-context-privileged
--enable-optional-test container-security-context-readonlyrootfilesystem
# python is a requirement for the chart-testing action below (supports yamllint among other tests)
- uses: actions/setup-python@v2
with:
python-version: 3.7
- name: Set up chart-testing
uses: helm/chart-testing-action@v2.0.1
- name: Run chart-testing (list-changed)
id: list-changed
run: |
changed=$(ct list-changed --config charts/.ci/ct-config.yaml)
if [[ -n "$changed" ]]; then
echo "::set-output name=changed::true"
fi
- name: Run chart-testing (lint)
run: ct lint --config charts/.ci/ct-config.yaml
- name: Create kind cluster
uses: helm/kind-action@v1.0.0
if: steps.list-changed.outputs.changed == 'true'
# We need cert-manager already installed in the cluster because we assume the CRDs exist
- name: Install cert-manager
run: |
helm repo add jetstack https://charts.jetstack.io --force-update
helm install cert-manager jetstack/cert-manager --set installCRDs=true --wait
if: steps.list-changed.outputs.changed == 'true'
- name: Run chart-testing (install)
run: ct install --config charts/.ci/ct-config.yaml

View File

@@ -0,0 +1,101 @@
name: Publish helm chart
on:
push:
branches:
- master
- main # assume that the branch name may change in future
paths:
- 'charts/**'
- '.github/**'
workflow_dispatch:
env:
KUBE_SCORE_VERSION: 1.10.0
HELM_VERSION: v3.4.1
jobs:
lint-chart:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v2
with:
fetch-depth: 0
- name: Set up Helm
uses: azure/setup-helm@v1
with:
version: ${{ env.HELM_VERSION }}
- name: Set up kube-score
run: |
wget https://github.com/zegl/kube-score/releases/download/v${{ env.KUBE_SCORE_VERSION }}/kube-score_${{ env.KUBE_SCORE_VERSION }}_linux_amd64 -O kube-score
chmod 755 kube-score
- name: Kube-score generated manifests
run: helm template --values charts/.ci/values-kube-score.yaml charts/* | ./kube-score score -
--ignore-test pod-networkpolicy
--ignore-test deployment-has-poddisruptionbudget
--ignore-test deployment-has-host-podantiaffinity
--ignore-test container-security-context
--ignore-test pod-probes
--ignore-test container-image-tag
--enable-optional-test container-security-context-privileged
--enable-optional-test container-security-context-readonlyrootfilesystem
# python is a requirement for the chart-testing action below (supports yamllint among other tests)
- uses: actions/setup-python@v2
with:
python-version: 3.7
- name: Set up chart-testing
uses: helm/chart-testing-action@v2.0.1
- name: Run chart-testing (list-changed)
id: list-changed
run: |
changed=$(ct list-changed --config charts/.ci/ct-config.yaml)
if [[ -n "$changed" ]]; then
echo "::set-output name=changed::true"
fi
- name: Run chart-testing (lint)
run: ct lint --config charts/.ci/ct-config.yaml
- name: Create kind cluster
uses: helm/kind-action@v1.0.0
if: steps.list-changed.outputs.changed == 'true'
# We need cert-manager already installed in the cluster because we assume the CRDs exist
- name: Install cert-manager
run: |
helm repo add jetstack https://charts.jetstack.io --force-update
helm install cert-manager jetstack/cert-manager --set installCRDs=true --wait
if: steps.list-changed.outputs.changed == 'true'
- name: Run chart-testing (install)
run: ct install --config charts/.ci/ct-config.yaml
if: steps.list-changed.outputs.changed == 'true'
publish-chart:
runs-on: ubuntu-latest
needs: lint-chart
steps:
- name: Checkout
uses: actions/checkout@v2
with:
fetch-depth: 0
- name: Configure Git
run: |
git config user.name "$GITHUB_ACTOR"
git config user.email "$GITHUB_ACTOR@users.noreply.github.com"
- name: Run chart-releaser
uses: helm/chart-releaser-action@v1.1.0
env:
CR_TOKEN: "${{ secrets.GITHUB_TOKEN }}"

View File

@@ -9,6 +9,10 @@ jobs:
env:
DOCKERHUB_USERNAME: ${{ github.repository_owner }}
steps:
- name: Set outputs
id: vars
run: echo ::set-output name=sha_short::${GITHUB_SHA::7}
- name: Checkout
uses: actions/checkout@v2
@@ -52,5 +56,8 @@ jobs:
file: Dockerfile
platforms: linux/amd64,linux/arm64
push: true
tags: ${{ env.DOCKERHUB_USERNAME }}/actions-runner-controller:${{ env.VERSION }}
tags: |
${{ env.DOCKERHUB_USERNAME }}/actions-runner-controller:latest
${{ env.DOCKERHUB_USERNAME }}/actions-runner-controller:${{ env.VERSION }}
${{ env.DOCKERHUB_USERNAME }}/actions-runner-controller:${{ env.VERSION }}-${{ steps.vars.outputs.sha_short }}

View File

@@ -30,11 +30,13 @@ jobs:
username: ${{ github.repository_owner }}
password: ${{ secrets.DOCKER_ACCESS_TOKEN }}
# Considered unstable builds
# See Issue #285, PR #286, and PR #323 for more information
- name: Build and Push
uses: docker/build-push-action@v2
with:
file: Dockerfile
platforms: linux/amd64,linux/arm64
push: true
tags: ${{ env.DOCKERHUB_USERNAME }}/actions-runner-controller:latest
tags: |
${{ env.DOCKERHUB_USERNAME }}/actions-runner-controller:canary

3
.gitignore vendored
View File

@@ -26,3 +26,6 @@ bin
.envrc
*.pem
# OS
.DS_STORE

View File

@@ -22,7 +22,8 @@ COPY . .
RUN export GOOS=$(echo ${TARGETPLATFORM} | cut -d / -f1) && \
export GOARCH=$(echo ${TARGETPLATFORM} | cut -d / -f2) && \
GOARM=$(echo ${TARGETPLATFORM} | cut -d / -f3 | cut -c2-) && \
go build -a -o manager main.go
go build -a -o manager main.go && \
go build -a -o github-webhook-server ./cmd/githubwebhookserver
# Use distroless as minimal base image to package the manager binary
# Refer to https://github.com/GoogleContainerTools/distroless for more details
@@ -31,6 +32,7 @@ FROM gcr.io/distroless/static:nonroot
WORKDIR /
COPY --from=builder /workspace/manager .
COPY --from=builder /workspace/github-webhook-server .
USER nonroot:nonroot

224
README.md
View File

@@ -1,7 +1,33 @@
# actions-runner-controller
[![awesome-runners](https://img.shields.io/badge/listed%20on-awesome--runners-blue.svg)](https://github.com/jonico/awesome-runners)
This controller operates self-hosted runners for GitHub Actions on your Kubernetes cluster.
ToC:
- [Motivation](#motivation)
- [Installation](#installation)
- [GitHub Enterprise support](#github-enterprise-support)
- [Setting up authentication with GitHub API](#setting-up-authentication-with-github-api)
- [Using GitHub App](#using-github-app)
- [Using Personal AccessToken ](#using-personal-access-token)
- [Usage](#usage)
- [Repository Runners](#repository-runners)
- [Organization Runners](#organization-runners)
- [Runner Deployments](#runnerdeployments)
- [Autoscaling](#autoscaling)
- [Faster Autoscaling with GitHub Webhook](#faster-autoscaling-with-github-webhook)
- [Runner with DinD](#runner-with-dind)
- [Additional tweaks](#additional-tweaks)
- [Runner labels](#runner-labels)
- [Runer groups](#runner-groups)
- [Using EKS IAM role for service accounts](#using-eks-iam-role-for-service-accounts)
- [Software installed in the runner image](#software-installed-in-the-runner-image)
- [Common errors](#common-errors)
- [Developing](#developing)
- [Alternatives](#alternatives)
## Motivation
[GitHub Actions](https://github.com/features/actions) is a very useful tool for automating development. GitHub Actions jobs are run in the cloud by default, but you may want to run your jobs in your environment. [Self-hosted runner](https://github.com/actions/runner) can be used for such use cases, but requires the provisioning and configuration of a virtual machine instance. Instead if you already have a Kubernetes cluster, it makes more sense to run the self-hosted runner on top of it.
@@ -14,21 +40,71 @@ actions-runner-controller uses [cert-manager](https://cert-manager.io/docs/insta
- [Installing cert-manager on Kubernetes](https://cert-manager.io/docs/installation/kubernetes/)
Install the custom resource and actions-runner-controller itself. This will create actions-runner-system namespace in your Kubernetes and deploy the required resources.
Install the custom resource and actions-runner-controller with `kubectl` or `helm`. This will create actions-runner-system namespace in your Kubernetes and deploy the required resources.
`kubectl`:
```
kubectl apply -f https://github.com/summerwind/actions-runner-controller/releases/latest/download/actions-runner-controller.yaml
# REPLACE "v0.16.1" with the latest release
kubectl apply -f https://github.com/summerwind/actions-runner-controller/releases/download/v0.16.1/actions-runner-controller.yaml
```
`helm`:
```
helm repo add actions-runner-controller https://summerwind.github.io/actions-runner-controller
helm upgrade --install -n actions-runner-system actions-runner-controller/actions-runner-controller
```
### Github Enterprise support
If you use either Github Enterprise Cloud or Server (and have recent enought version supporting Actions), you can use **actions-runner-controller** with those, too. Authentication works same way as with public Github (repo and organization level).
If you use either Github Enterprise Cloud or Server, you can use **actions-runner-controller** with those, too.
Authentication works same way as with public Github (repo and organization level).
The minimum version of Github Enterprise Server is 3.0.0 (or rc1/rc2).
In most cases maintainers do not have environment where to test changes and are reliant on the community for testing.
```shell
kubectl set env deploy controller-manager -c manager GITHUB_ENTERPRISE_URL=<GHEC/S URL> --namespace actions-runner-system
```
[Enterprise level](https://docs.github.com/en/enterprise-server@2.22/actions/hosting-your-own-runners/adding-self-hosted-runners#adding-a-self-hosted-runner-to-an-enterprise) runners are not working yet as there's no API definition for those.
#### Enterprise runners usage
In order to use enterprise runners you must have Admin access to Github Enterprise and you should do Personal Access Token (PAT)
with `enterprise:admin` access. Enterprise runners are not possible to run with Github APP or any other permission.
When you use enterprise runners those will get access to Github Organisations. However, access to the repositories is **NOT**
allowed by default. Each Github Organisation must allow Enterprise runner groups to be used in repositories.
This is needed only one time and is permanent after that.
Example:
```yaml
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
name: ghe-runner-deployment
spec:
replicas: 2
template:
spec:
enterprise: your-enterprise-name
dockerdWithinRunnerContainer: true
resources:
limits:
cpu: "4000m"
memory: "2Gi"
requests:
cpu: "200m"
memory: "200Mi"
volumeMounts:
- mountPath: /runner
name: runner
volumes:
- name: runner
emptyDir: {}
```
## Setting up authentication with GitHub API
@@ -289,7 +365,119 @@ spec:
scaleDownFactor: '0.7'
```
## Runner with DinD
#### Faster Autoscaling with GitHub Webhook
> This feature is an ADVANCED feature which may require more work to set up.
> Please get prepared to put some time and effort to learn and leverage this feature!
`actions-runner-controller` has an optional Webhook server that receives GitHub Webhook events and scale
[`RunnerDeployment`s](#runnerdeployments) by updating corresponding [`HorizontalRunnerAutoscaler`s](#autoscaling).
Today, the Webhook server can be configured to respond GitHub `check_run`, `pull_request`, and `push` events
by scaling up the matching `HorizontalRunnerAutoscaler` by N replica(s), where `N` is configurable within
`HorizontalRunerAutoscaler`'s `Spec`.
More concretely, you can configure the targeted GitHub event types and the `N` in
`scaleUpTriggers`:
```yaml
kind: HorizontalRunnerAutoscaler
spec:
scaleTargetRef:
name: myrunners
scaleUpTrigggers:
- githubEvent:
checkRun:
types: ["created"]
status: "queued"
amount: 1
duration: "5m"
```
With the above example, the webhook server scales `myrunners` by `1` replica for 5 minutes on each `check_run` event
with the type of `created` and the status of `queued` received.
The primary benefit of autoscaling on Webhook compared to the standard autoscaling is that this one allows you to
immediately add "resource slack" for future GitHub Actions job runs.
In contrast, the standard autoscaling requires you to wait next sync period to add
insufficient runners. You can definitely shorten the sync period to make the standard autoscaling more responsive.
But doing so eventually result in the controller not functional due to GitHub API rate limit.
> You can learn the implementation details in #282
To enable this feature, you firstly need to install the webhook server.
Currently, only our Helm chart has the ability install it.
```console
$ helm --upgrade install actions-runner-controller/actions-runner-controller \
githubWebhookServer.enabled=true \
githubWebhookServer.ports[0].nodePort=33080
```
The above command will result in exposing the node port 33080 for Webhook events. Usually, you need to create an
external loadbalancer targeted to the node port, and register the hostname or the IP address of the external loadbalancer
to the GitHub Webhook.
Once you were able to confirm that the Webhook server is ready and running from GitHub - this is usually verified by the
GitHub sending PING events to the Webhook server - create or update your `HorizontalRunnerAutoscaler` resources
by learning the following configuration examples.
- [Example 1: Scale up on each `check_run` event](#example-1-scale-up-on-each-check_run-event)
- [Example 2: Scale on each `pull_request` event against `develop` or `main` branches](#example-2-scale-on-each-pull_request-event-against-develop-or-main-branches)
##### Example 1: Scale up on each `check_run` event
> Note: This should work almost like https://github.com/philips-labs/terraform-aws-github-runner
To scale up replicas of the runners for `example/myrepo` by 1 for 5 minutes on each `check_run`, you write manifests like the below:
```yaml
kind: RunnerDeployment
metadata:
name: myrunners
spec:
repository: example/myrepo
---
kind: HorizontalRunnerAutoscaler
spec:
scaleTargetRef:
name: myrunners
scaleUpTrigggers:
- githubEvent:
checkRun:
types: ["created"]
status: "queued"
amount: 1
duration: "5m"
```
###### Example 2: Scale on each `pull_request` event against `develop` or `main` branches
```yaml
kind: RunnerDeployment:
metadata:
name: myrunners
spec:
repository: example/myrepo
---
kind: HorizontalRunnerAutoscaler
spec:
scaleTargetRef:
name: myrunners
scaleUpTrigggers:
- githubEvent:
pullRequest:
types: ["synchronize"]
branches: ["main", "develop"]
amount: 1
duration: "5m"
```
See ["activity types"](https://docs.github.com/en/actions/reference/events-that-trigger-workflows#pull_request) for the list of valid values for `scaleUpTriggers[].githubEvent.pullRequest.types`.
### Runner with DinD
When using default runner, runner pod starts up 2 containers: runner and DinD (Docker-in-Docker). This might create issues if there's `LimitRange` set to namespace.
@@ -311,7 +499,7 @@ spec:
This also helps with resources, as you don't need to give resources separately to docker and runner.
## Additional tweaks
### Additional tweaks
You can pass details through the spec selector. Here's an eg. of what you may like to do:
@@ -343,6 +531,14 @@ spec:
requests:
cpu: "2.0"
memory: "4Gi"
# Timeout after a node crashed or became unreachable to evict your pods somewhere else (default 5mins)
tolerations:
- key: "node.kubernetes.io/unreachable"
operator: "Exists"
effect: "NoExecute"
tolerationSeconds: 10
# If set to false, there are no privileged container and you cannot use docker.
dockerEnabled: false
# If set to true, runner pod container only 1 container that's expected to be able to run docker, too.
@@ -370,7 +566,7 @@ spec:
workDir: /home/runner/work
```
## Runner labels
### Runner labels
To run a workflow job on a self-hosted runner, you can use the following syntax in your workflow:
@@ -407,7 +603,7 @@ jobs:
Note that if you specify `self-hosted` in your workflow, then this will run your job on _any_ self-hosted runner, regardless of the labels that they have.
## Runner Groups
### Runner Groups
Runner groups can be used to limit which repositories are able to use the GitHub Runner at an Organisation level. Runner groups have to be [created in GitHub first](https://docs.github.com/en/actions/hosting-your-own-runners/managing-access-to-self-hosted-runners-using-groups) before they can be referenced.
@@ -426,7 +622,7 @@ spec:
group: NewGroup
```
## Using EKS IAM role for service accounts
### Using EKS IAM role for service accounts
`actions-runner-controller` v0.15.0 or later has support for EKS IAM role for service accounts.
@@ -452,7 +648,7 @@ spec:
fsGroup: 1447
```
## Software installed in the runner image
### Software installed in the runner image
The GitHub hosted runners include a large amount of pre-installed software packages. For Ubuntu 18.04, this list can be found at <https://github.com/actions/virtual-environments/blob/master/images/linux/Ubuntu1804-README.md>
@@ -487,9 +683,9 @@ spec:
image: YOUR_CUSTOM_DOCKER_IMAGE
```
## Common Errors
### Common Errors
### invalid header field value
#### invalid header field value
```json
2020-11-12T22:17:30.693Z ERROR controller-runtime.controller Reconciler error {"controller": "runner", "request": "actions-runner-system/runner-deployment-dk7q8-dk5c9", "error": "failed to create registration token: Post \"https://api.github.com/orgs/$YOUR_ORG_HERE/actions/runners/registration-token\": net/http: invalid header field value \"Bearer $YOUR_TOKEN_HERE\\n\" for key Authorization"}
@@ -519,7 +715,7 @@ NAME=$DOCKER_USER/actions-runner-controller \
Please follow the instructions explained in [Using Personal Access Token](#using-personal-access-token) to obtain
`GITHUB_TOKEN`, and those in [Using GitHub App](#using-github-app) to obtain `APP_ID`, `INSTALLATION_ID`, and
`PRIVATE_KEY_FILE_PATH`.
`PRIAVTE_KEY_FILE_PATH`.
The test creates a one-off `kind` cluster, deploys `cert-manager` and `actions-runner-controller`,
creates a `RunnerDeployment` custom resource for a public Git repository to confirm that the
@@ -527,7 +723,7 @@ controller is able to bring up a runner pod with the actions runner registration
If you prefer to test in a non-kind cluster, you can instead run:
```shell
```shell script
KUBECONFIG=path/to/kubeconfig \
NAME=$DOCKER_USER/actions-runner-controller \
GITHUB_TOKEN=*** \

View File

@@ -41,6 +41,56 @@ type HorizontalRunnerAutoscalerSpec struct {
// Metrics is the collection of various metric targets to calculate desired number of runners
// +optional
Metrics []MetricSpec `json:"metrics,omitempty"`
// ScaleUpTriggers is an experimental feature to increase the desired replicas by 1
// on each webhook requested received by the webhookBasedAutoscaler.
//
// This feature requires you to also enable and deploy the webhookBasedAutoscaler onto your cluster.
//
// Note that the added runners remain until the next sync period at least,
// and they may or may not be used by GitHub Actions depending on the timing.
// They are intended to be used to gain "resource slack" immediately after you
// receive a webhook from GitHub, so that you can loosely expect MinReplicas runners to be always available.
ScaleUpTriggers []ScaleUpTrigger `json:"scaleUpTriggers,omitempty"`
CapacityReservations []CapacityReservation `json:"capacityReservations,omitempty" patchStrategy:"merge" patchMergeKey:"name"`
}
type ScaleUpTrigger struct {
GitHubEvent *GitHubEventScaleUpTriggerSpec `json:"githubEvent,omitempty"`
Amount int `json:"amount,omitempty"`
Duration metav1.Duration `json:"duration,omitempty"`
}
type GitHubEventScaleUpTriggerSpec struct {
CheckRun *CheckRunSpec `json:"checkRun,omitempty"`
PullRequest *PullRequestSpec `json:"pullRequest,omitempty"`
Push *PushSpec `json:"push,omitempty"`
}
// https://docs.github.com/en/actions/reference/events-that-trigger-workflows#check_run
type CheckRunSpec struct {
Types []string `json:"types,omitempty"`
Status string `json:"status,omitempty"`
}
// https://docs.github.com/en/actions/reference/events-that-trigger-workflows#pull_request
type PullRequestSpec struct {
Types []string `json:"types,omitempty"`
Branches []string `json:"branches,omitempty"`
}
// PushSpec is the condition for triggering scale-up on push event
// Also see https://docs.github.com/en/actions/reference/events-that-trigger-workflows#push
type PushSpec struct {
}
// CapacityReservation specifies the number of replicas temporarily added
// to the scale target until ExpirationTime.
type CapacityReservation struct {
Name string `json:"name,omitempty"`
ExpirationTime metav1.Time `json:"expirationTime,omitempty"`
Replicas int `json:"replicas,omitempty"`
}
type ScaleTargetRef struct {
@@ -76,6 +126,16 @@ type MetricSpec struct {
// to determine how many pods should be removed.
// +optional
ScaleDownFactor string `json:"scaleDownFactor,omitempty"`
// ScaleUpAdjustment is the number of runners added on scale-up.
// You can only specify either ScaleUpFactor or ScaleUpAdjustment.
// +optional
ScaleUpAdjustment int `json:"scaleUpAdjustment,omitempty"`
// ScaleDownAdjustment is the number of runners removed on scale-down.
// You can only specify either ScaleDownFactor or ScaleDownAdjustment.
// +optional
ScaleDownAdjustment int `json:"scaleDownAdjustment,omitempty"`
}
type HorizontalRunnerAutoscalerStatus struct {
@@ -91,6 +151,17 @@ type HorizontalRunnerAutoscalerStatus struct {
// +optional
LastSuccessfulScaleOutTime *metav1.Time `json:"lastSuccessfulScaleOutTime,omitempty"`
// +optional
CacheEntries []CacheEntry `json:"cacheEntries,omitempty"`
}
const CacheEntryKeyDesiredReplicas = "desiredReplicas"
type CacheEntry struct {
Key string `json:"key,omitempty"`
Value int `json:"value,omitempty"`
ExpirationTime metav1.Time `json:"expirationTime,omitempty"`
}
// +kubebuilder:object:root=true

View File

@@ -25,6 +25,10 @@ import (
// RunnerSpec defines the desired state of Runner
type RunnerSpec struct {
// +optional
// +kubebuilder:validation:Pattern=`^[^/]+$`
Enterprise string `json:"enterprise,omitempty"`
// +optional
// +kubebuilder:validation:Pattern=`^[^/]+$`
Organization string `json:"organization,omitempty"`
@@ -92,12 +96,22 @@ type RunnerSpec struct {
// ValidateRepository validates repository field.
func (rs *RunnerSpec) ValidateRepository() error {
// Organization and repository are both exclusive.
if len(rs.Organization) == 0 && len(rs.Repository) == 0 {
return errors.New("Spec needs organization or repository")
// Enterprise, Organization and repository are both exclusive.
foundCount := 0
if len(rs.Organization) > 0 {
foundCount += 1
}
if len(rs.Organization) > 0 && len(rs.Repository) > 0 {
return errors.New("Spec cannot have both organization and repository")
if len(rs.Repository) > 0 {
foundCount += 1
}
if len(rs.Enterprise) > 0 {
foundCount += 1
}
if foundCount == 0 {
return errors.New("Spec needs enterprise, organization or repository")
}
if foundCount > 1 {
return errors.New("Spec cannot have many fields defined enterprise, organization and repository")
}
return nil
@@ -113,6 +127,7 @@ type RunnerStatus struct {
// RunnerStatusRegistration contains runner registration status
type RunnerStatusRegistration struct {
Enterprise string `json:"enterprise,omitempty"`
Organization string `json:"organization,omitempty"`
Repository string `json:"repository,omitempty"`
Labels []string `json:"labels,omitempty"`
@@ -122,6 +137,7 @@ type RunnerStatusRegistration struct {
// +kubebuilder:object:root=true
// +kubebuilder:subresource:status
// +kubebuilder:printcolumn:JSONPath=".spec.enterprise",name=Enterprise,type=string
// +kubebuilder:printcolumn:JSONPath=".spec.organization",name=Organization,type=string
// +kubebuilder:printcolumn:JSONPath=".spec.repository",name=Repository,type=string
// +kubebuilder:printcolumn:JSONPath=".spec.labels",name=Labels,type=string

View File

@@ -25,6 +25,88 @@ import (
"k8s.io/apimachinery/pkg/runtime"
)
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *CacheEntry) DeepCopyInto(out *CacheEntry) {
*out = *in
in.ExpirationTime.DeepCopyInto(&out.ExpirationTime)
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new CacheEntry.
func (in *CacheEntry) DeepCopy() *CacheEntry {
if in == nil {
return nil
}
out := new(CacheEntry)
in.DeepCopyInto(out)
return out
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *CapacityReservation) DeepCopyInto(out *CapacityReservation) {
*out = *in
in.ExpirationTime.DeepCopyInto(&out.ExpirationTime)
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new CapacityReservation.
func (in *CapacityReservation) DeepCopy() *CapacityReservation {
if in == nil {
return nil
}
out := new(CapacityReservation)
in.DeepCopyInto(out)
return out
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *CheckRunSpec) DeepCopyInto(out *CheckRunSpec) {
*out = *in
if in.Types != nil {
in, out := &in.Types, &out.Types
*out = make([]string, len(*in))
copy(*out, *in)
}
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new CheckRunSpec.
func (in *CheckRunSpec) DeepCopy() *CheckRunSpec {
if in == nil {
return nil
}
out := new(CheckRunSpec)
in.DeepCopyInto(out)
return out
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *GitHubEventScaleUpTriggerSpec) DeepCopyInto(out *GitHubEventScaleUpTriggerSpec) {
*out = *in
if in.CheckRun != nil {
in, out := &in.CheckRun, &out.CheckRun
*out = new(CheckRunSpec)
(*in).DeepCopyInto(*out)
}
if in.PullRequest != nil {
in, out := &in.PullRequest, &out.PullRequest
*out = new(PullRequestSpec)
(*in).DeepCopyInto(*out)
}
if in.Push != nil {
in, out := &in.Push, &out.Push
*out = new(PushSpec)
**out = **in
}
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new GitHubEventScaleUpTriggerSpec.
func (in *GitHubEventScaleUpTriggerSpec) DeepCopy() *GitHubEventScaleUpTriggerSpec {
if in == nil {
return nil
}
out := new(GitHubEventScaleUpTriggerSpec)
in.DeepCopyInto(out)
return out
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *HorizontalRunnerAutoscaler) DeepCopyInto(out *HorizontalRunnerAutoscaler) {
*out = *in
@@ -110,6 +192,20 @@ func (in *HorizontalRunnerAutoscalerSpec) DeepCopyInto(out *HorizontalRunnerAuto
(*in)[i].DeepCopyInto(&(*out)[i])
}
}
if in.ScaleUpTriggers != nil {
in, out := &in.ScaleUpTriggers, &out.ScaleUpTriggers
*out = make([]ScaleUpTrigger, len(*in))
for i := range *in {
(*in)[i].DeepCopyInto(&(*out)[i])
}
}
if in.CapacityReservations != nil {
in, out := &in.CapacityReservations, &out.CapacityReservations
*out = make([]CapacityReservation, len(*in))
for i := range *in {
(*in)[i].DeepCopyInto(&(*out)[i])
}
}
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new HorizontalRunnerAutoscalerSpec.
@@ -134,6 +230,13 @@ func (in *HorizontalRunnerAutoscalerStatus) DeepCopyInto(out *HorizontalRunnerAu
in, out := &in.LastSuccessfulScaleOutTime, &out.LastSuccessfulScaleOutTime
*out = (*in).DeepCopy()
}
if in.CacheEntries != nil {
in, out := &in.CacheEntries, &out.CacheEntries
*out = make([]CacheEntry, len(*in))
for i := range *in {
(*in)[i].DeepCopyInto(&(*out)[i])
}
}
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new HorizontalRunnerAutoscalerStatus.
@@ -166,6 +269,46 @@ func (in *MetricSpec) DeepCopy() *MetricSpec {
return out
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *PullRequestSpec) DeepCopyInto(out *PullRequestSpec) {
*out = *in
if in.Types != nil {
in, out := &in.Types, &out.Types
*out = make([]string, len(*in))
copy(*out, *in)
}
if in.Branches != nil {
in, out := &in.Branches, &out.Branches
*out = make([]string, len(*in))
copy(*out, *in)
}
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new PullRequestSpec.
func (in *PullRequestSpec) DeepCopy() *PullRequestSpec {
if in == nil {
return nil
}
out := new(PullRequestSpec)
in.DeepCopyInto(out)
return out
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *PushSpec) DeepCopyInto(out *PushSpec) {
*out = *in
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new PushSpec.
func (in *PushSpec) DeepCopy() *PushSpec {
if in == nil {
return nil
}
out := new(PushSpec)
in.DeepCopyInto(out)
return out
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *Runner) DeepCopyInto(out *Runner) {
*out = *in
@@ -615,3 +758,24 @@ func (in *ScaleTargetRef) DeepCopy() *ScaleTargetRef {
in.DeepCopyInto(out)
return out
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *ScaleUpTrigger) DeepCopyInto(out *ScaleUpTrigger) {
*out = *in
if in.GitHubEvent != nil {
in, out := &in.GitHubEvent, &out.GitHubEvent
*out = new(GitHubEventScaleUpTriggerSpec)
(*in).DeepCopyInto(*out)
}
out.Duration = in.Duration
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ScaleUpTrigger.
func (in *ScaleUpTrigger) DeepCopy() *ScaleUpTrigger {
if in == nil {
return nil
}
out := new(ScaleUpTrigger)
in.DeepCopyInto(out)
return out
}

View File

@@ -1,4 +0,0 @@
repositoryID: 6e120248-b034-45e5-b16c-6015ecfa7c6c
owners:
- name: mumoshu
email: ykuoka@gmail.com

View File

@@ -0,0 +1,4 @@
# This file defines the config for "ct" (chart tester) used by the helm linting GitHub workflow
lint-conf: charts/.ci/lint-config.yaml
chart-repos:
- jetstack=https://charts.jetstack.io

View File

@@ -0,0 +1,6 @@
rules:
# One blank line is OK
empty-lines:
max-start: 1
max-end: 1
max: 1

View File

@@ -0,0 +1,3 @@
#!/bin/bash
docker run --rm -it -w /repo -v $(pwd):/repo quay.io/helmpack/chart-testing ct lint --all --config charts/.ci/ct-config.yaml

View File

@@ -0,0 +1,15 @@
#!/bin/bash
for chart in `ls charts`;
do
helm template --values charts/$chart/ci/ci-values.yaml charts/$chart | kube-score score - \
--ignore-test pod-networkpolicy \
--ignore-test deployment-has-poddisruptionbudget \
--ignore-test deployment-has-host-podantiaffinity \
--ignore-test pod-probes \
--ignore-test container-image-tag \
--enable-optional-test container-security-context-privileged \
--enable-optional-test container-security-context-readonlyrootfilesystem \
--ignore-test container-security-context
done

View File

@@ -15,9 +15,17 @@ type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.1.0
version: 0.5.1
# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to
# follow Semantic Versioning. They should reflect the version the application is using.
appVersion: 0.11.2
home: https://github.com/summerwind/actions-runner-controller
sources:
- https://github.com/summerwind/actions-runner-controller
maintainers:
- name: summerwind
email: contact@summerwind.jp
url: https://github.com/summerwind
- name: funkypenguin
email: davidy@funkypenguin.co.nz
url: https://www.funkypenguin.co.nz

View File

@@ -0,0 +1,27 @@
# This file sets some opinionated values for kube-score to use
# when parsing the chart
image:
pullPolicy: Always
podSecurityContext:
fsGroup: 2000
securityContext:
capabilities:
drop:
- ALL
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 2000
resources:
limits:
cpu: 100m
memory: 128Mi
requests:
cpu: 100m
memory: 128Mi
# Set the following to true to create a dummy secret, allowing the manager pod to start
# This is only useful in CI
createDummySecret: true

View File

@@ -48,6 +48,20 @@ spec:
description: HorizontalRunnerAutoscalerSpec defines the desired state of
HorizontalRunnerAutoscaler
properties:
capacityReservations:
items:
description: CapacityReservation specifies the number of replicas
temporarily added to the scale target until ExpirationTime.
properties:
expirationTime:
format: date-time
type: string
name:
type: string
replicas:
type: integer
type: object
type: array
maxReplicas:
description: MinReplicas is the maximum number of replicas the deployment
is allowed to scale
@@ -64,6 +78,11 @@ spec:
items:
type: string
type: array
scaleDownAdjustment:
description: ScaleDownAdjustment is the number of runners removed
on scale-down. You can only specify either ScaleDownFactor or
ScaleDownAdjustment.
type: integer
scaleDownFactor:
description: ScaleDownFactor is the multiplicative factor applied
to the current number of runners used to determine how many
@@ -73,6 +92,10 @@ spec:
description: ScaleDownThreshold is the percentage of busy runners
less than which will trigger the hpa to scale the runners down.
type: string
scaleUpAdjustment:
description: ScaleUpAdjustment is the number of runners added
on scale-up. You can only specify either ScaleUpFactor or ScaleUpAdjustment.
type: integer
scaleUpFactor:
description: ScaleUpFactor is the multiplicative factor applied
to the current number of runners used to determine how many
@@ -104,9 +127,68 @@ spec:
name:
type: string
type: object
scaleUpTriggers:
description: "ScaleUpTriggers is an experimental feature to increase
the desired replicas by 1 on each webhook requested received by the
webhookBasedAutoscaler. \n This feature requires you to also enable
and deploy the webhookBasedAutoscaler onto your cluster. \n Note that
the added runners remain until the next sync period at least, and
they may or may not be used by GitHub Actions depending on the timing.
They are intended to be used to gain \"resource slack\" immediately
after you receive a webhook from GitHub, so that you can loosely expect
MinReplicas runners to be always available."
items:
properties:
amount:
type: integer
duration:
type: string
githubEvent:
properties:
checkRun:
description: https://docs.github.com/en/actions/reference/events-that-trigger-workflows#check_run
properties:
status:
type: string
types:
items:
type: string
type: array
type: object
pullRequest:
description: https://docs.github.com/en/actions/reference/events-that-trigger-workflows#pull_request
properties:
branches:
items:
type: string
type: array
types:
items:
type: string
type: array
type: object
push:
description: PushSpec is the condition for triggering scale-up
on push event Also see https://docs.github.com/en/actions/reference/events-that-trigger-workflows#push
type: object
type: object
type: object
type: array
type: object
status:
properties:
cacheEntries:
items:
properties:
expirationTime:
format: date-time
type: string
key:
type: string
value:
type: integer
type: object
type: array
desiredReplicas:
description: DesiredReplicas is the total number of desired, non-terminated
and latest pods to be set for the primary RunnerSet This doesn't include

View File

@@ -426,6 +426,9 @@ spec:
type: object
dockerdWithinRunnerContainer:
type: boolean
enterprise:
pattern: ^[^/]+$
type: string
env:
items:
description: EnvVar represents an environment variable present in a Container.

View File

@@ -426,6 +426,9 @@ spec:
type: object
dockerdWithinRunnerContainer:
type: boolean
enterprise:
pattern: ^[^/]+$
type: string
env:
items:
description: EnvVar represents an environment variable present in a Container.

View File

@@ -7,6 +7,9 @@ metadata:
name: runners.actions.summerwind.dev
spec:
additionalPrinterColumns:
- JSONPath: .spec.enterprise
name: Enterprise
type: string
- JSONPath: .spec.organization
name: Organization
type: string
@@ -419,6 +422,9 @@ spec:
type: object
dockerdWithinRunnerContainer:
type: boolean
enterprise:
pattern: ^[^/]+$
type: string
env:
items:
description: EnvVar represents an environment variable present in a Container.
@@ -1541,6 +1547,8 @@ spec:
registration:
description: RunnerStatusRegistration contains runner registration status
properties:
enterprise:
type: string
expiresAt:
format: date-time
type: string

View File

@@ -0,0 +1,52 @@
{{/*
Expand the name of the chart.
*/}}
{{- define "actions-runner-controller-github-webhook-server.name" -}}
{{- default .Chart.Name .Values.githubWebhookServer.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}
{{- define "actions-runner-controller-github-webhook-server.instance" -}}
{{- printf "%s-%s" .Release.Name "github-webhook-server" }}
{{- end }}
{{/*
Create a default fully qualified app name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
If release name contains chart name it will be used as a full name.
*/}}
{{- define "actions-runner-controller-github-webhook-server.fullname" -}}
{{- if .Values.githubWebhookServer.fullnameOverride }}
{{- .Values.githubWebhookServer.fullnameOverride | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- $name := default .Chart.Name .Values.githubWebhookServer.nameOverride }}
{{- $instance := include "actions-runner-controller-github-webhook-server.instance" . }}
{{- if contains $name $instance }}
{{- $instance | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- printf "%s-%s-%s" .Release.Name $name "github-webhook-server" | trunc 63 | trimSuffix "-" }}
{{- end }}
{{- end }}
{{- end }}
{{/*
Selector labels
*/}}
{{- define "actions-runner-controller-github-webhook-server.selectorLabels" -}}
app.kubernetes.io/name: {{ include "actions-runner-controller-github-webhook-server.name" . }}
app.kubernetes.io/instance: {{ include "actions-runner-controller-github-webhook-server.instance" . }}
{{- end }}
{{/*
Create the name of the service account to use
*/}}
{{- define "actions-runner-controller-github-webhook-server.serviceAccountName" -}}
{{- if .Values.githubWebhookServer.serviceAccount.create }}
{{- default (include "actions-runner-controller-github-webhook-server.fullname" .) .Values.githubWebhookServer.serviceAccount.name }}
{{- else }}
{{- default "default" .Values.githubWebhookServer.serviceAccount.name }}
{{- end }}
{{- end }}
{{- define "actions-runner-controller-github-webhook-server.roleName" -}}
{{- include "actions-runner-controller-github-webhook-server.fullname" . }}
{{- end }}

View File

@@ -85,11 +85,11 @@ Create the name of the service account to use
{{- end }}
{{- define "actions-runner-controller.webhookServiceName" -}}
{{- include "actions-runner-controller.fullname" . }}-webhook
{{- include "actions-runner-controller.fullname" . | trunc 55 }}-webhook
{{- end }}
{{- define "actions-runner-controller.authProxyServiceName" -}}
{{- include "actions-runner-controller.fullname" . }}-controller-manager-metrics-service
{{- include "actions-runner-controller.fullname" . | trunc 47 }}-metrics-service
{{- end }}
{{- define "actions-runner-controller.selfsignedIssuerName" -}}

View File

@@ -0,0 +1,10 @@
# This template only exists to facilitate CI testing of the chart, since
# a secret is expected to be found in the namespace by the controller manager
{{ if .Values.createDummySecret -}}
apiVersion: v1
data:
github_token: dGVzdA==
kind: Secret
metadata:
name: controller-manager
{{- end }}

View File

@@ -57,6 +57,10 @@ spec:
optional: true
- name: GITHUB_APP_PRIVATE_KEY
value: /etc/actions-runner-controller/github_app_private_key
{{- range $key, $val := .Values.env }}
- name: {{ $key }}
value: {{ $val | quote }}
{{- end }}
image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default (cat "v" .Chart.AppVersion | replace " " "") }}"
name: manager
imagePullPolicy: {{ .Values.image.pullPolicy }}
@@ -66,10 +70,14 @@ spec:
protocol: TCP
resources:
{{- toYaml .Values.resources | nindent 12 }}
securityContext:
{{- toYaml .Values.securityContext | nindent 12 }}
volumeMounts:
- mountPath: "/etc/actions-runner-controller"
name: controller-manager
readOnly: true
- mountPath: /tmp
name: tmp
- mountPath: /tmp/k8s-webhook-server/serving-certs
name: cert
readOnly: true
@@ -78,11 +86,16 @@ spec:
- "--upstream=http://127.0.0.1:8080/"
- "--logtostderr=true"
- "--v=10"
image: gcr.io/kubebuilder/kube-rbac-proxy:v0.4.1
image: "{{ .Values.kube_rbac_proxy.image.repository }}:{{ .Values.kube_rbac_proxy.image.tag }}"
name: kube-rbac-proxy
imagePullPolicy: {{ .Values.image.pullPolicy }}
ports:
- containerPort: 8443
name: https
resources:
{{- toYaml .Values.resources | nindent 12 }}
securityContext:
{{- toYaml .Values.securityContext | nindent 12 }}
terminationGracePeriodSeconds: 10
volumes:
- name: controller-manager
@@ -92,6 +105,8 @@ spec:
secret:
defaultMode: 420
secretName: webhook-server-cert
- name: tmp
emptyDir: {}
{{- with .Values.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}

View File

@@ -0,0 +1,93 @@
{{- if .Values.githubWebhookServer.enabled }}
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "actions-runner-controller-github-webhook-server.fullname" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "actions-runner-controller.labels" . | nindent 4 }}
spec:
selector:
matchLabels:
{{- include "actions-runner-controller-github-webhook-server.selectorLabels" . | nindent 6 }}
template:
metadata:
{{- with .Values.githubWebhookServer.podAnnotations }}
annotations:
{{- toYaml . | nindent 8 }}
{{- end }}
labels:
{{- include "actions-runner-controller-github-webhook-server.selectorLabels" . | nindent 8 }}
spec:
{{- with .Values.githubWebhookServer.imagePullSecrets }}
imagePullSecrets:
{{- toYaml . | nindent 8 }}
{{- end }}
serviceAccountName: {{ include "actions-runner-controller-github-webhook-server.serviceAccountName" . }}
securityContext:
{{- toYaml .Values.githubWebhookServer.podSecurityContext | nindent 8 }}
{{- with .Values.githubWebhookServer.priorityClassName }}
priorityClassName: "{{ . }}"
{{- end }}
containers:
- args:
- "--metrics-addr=127.0.0.1:8080"
- "--enable-leader-election"
- "--sync-period={{ .Values.githubWebhookServer.syncPeriod }}"
command:
- "/github-webhook-server"
env:
- name: GITHUB_WEBHOOK_SECRET_TOKEN
valueFrom:
secretKeyRef:
key: github_webhook_secret_token
name: github-webhook-server
optional: true
{{- range $key, $val := .Values.githubWebhookServer.env }}
- name: {{ $key }}
value: {{ $val | quote }}
{{- end }}
image: "{{ .Values.githubWebhookServer.image.repository }}:{{ .Values.githubWebhookServer.image.tag | default (cat "v" .Chart.AppVersion | replace " " "") }}"
name: github-webhook-server
imagePullPolicy: {{ .Values.image.pullPolicy }}
ports:
- containerPort: 8000
name: github-webhook-server
protocol: TCP
resources:
{{- toYaml .Values.githubWebhookServer.resources | nindent 12 }}
securityContext:
{{- toYaml .Values.githubWebhookServer.securityContext | nindent 12 }}
- args:
- "--secure-listen-address=0.0.0.0:8443"
- "--upstream=http://127.0.0.1:8080/"
- "--logtostderr=true"
- "--v=10"
image: "{{ .Values.kube_rbac_proxy.image.repository }}:{{ .Values.kube_rbac_proxy.image.tag }}"
name: kube-rbac-proxy
imagePullPolicy: {{ .Values.image.pullPolicy }}
ports:
- containerPort: 8443
name: https
resources:
{{- toYaml .Values.resources | nindent 12 }}
securityContext:
{{- toYaml .Values.securityContext | nindent 12 }}
terminationGracePeriodSeconds: 10
volumes:
- name: github-webhook-server
secret:
secretName: github-webhook-server
{{- with .Values.githubWebhookServer.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.githubWebhookServer.affinity }}
affinity:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.githubWebhookServer.tolerations }}
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,70 @@
{{- if .Values.githubWebhookServer.enabled }}
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
creationTimestamp: null
name: {{ include "actions-runner-controller-github-webhook-server.roleName" . }}
rules:
- apiGroups:
- actions.summerwind.dev
resources:
- horizontalrunnerautoscalers
verbs:
- get
- list
- patch
- update
- watch
- apiGroups:
- actions.summerwind.dev
resources:
- horizontalrunnerautoscalers/finalizers
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- actions.summerwind.dev
resources:
- horizontalrunnerautoscalers/status
verbs:
- get
- patch
- update
- apiGroups:
- actions.summerwind.dev
resources:
- runnerdeployments
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- actions.summerwind.dev
resources:
- runnerdeployments/finalizers
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- actions.summerwind.dev
resources:
- runnerdeployments/status
verbs:
- get
- patch
- update
{{- end }}

View File

@@ -0,0 +1,16 @@
{{- if .Values.githubWebhookServer.enabled }}
{{- if .Values.githubWebhookServer.secret.enabled }}
apiVersion: v1
kind: Secret
metadata:
name: github-webhook-server
namespace: {{ .Release.Namespace }}
labels:
{{- include "actions-runner-controller.labels" . | nindent 4 }}
type: Opaque
data:
{{- range $k, $v := .Values.githubWebhookServer.secret }}
{{ $k }}: {{ $v | toString | b64enc }}
{{- end }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,17 @@
{{- if .Values.githubWebhookServer.enabled }}
apiVersion: v1
kind: Service
metadata:
name: {{ include "actions-runner-controller-github-webhook-server.fullname" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "actions-runner-controller.labels" . | nindent 4 }}
spec:
type: {{ .Values.githubWebhookServer.service.type }}
ports:
{{ range $_, $port := .Values.githubWebhookServer.service.ports -}}
- {{ $port | toYaml | nindent 6 }}
{{- end }}
selector:
{{- include "actions-runner-controller-github-webhook-server.selectorLabels" . | nindent 4 }}
{{- end }}

View File

@@ -0,0 +1,15 @@
{{- if .Values.githubWebhookServer.enabled -}}
{{- if .Values.githubWebhookServer.serviceAccount.create -}}
apiVersion: v1
kind: ServiceAccount
metadata:
name: {{ include "actions-runner-controller-github-webhook-server.serviceAccountName" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "actions-runner-controller.labels" . | nindent 4 }}
{{- with .Values.githubWebhookServer.serviceAccount.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
{{- end }}
{{- end }}

View File

@@ -26,6 +26,11 @@ image:
dindSidecarRepositoryAndTag: "docker:dind"
pullPolicy: IfNotPresent
kube_rbac_proxy:
image:
repository: quay.io/brancz/kube-rbac-proxy
tag: v0.8.0
imagePullSecrets: []
nameOverride: ""
fullnameOverride: ""
@@ -98,3 +103,51 @@ affinity: {}
# ref: https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/
# PriorityClass: system-cluster-critical
priorityClassName: ""
env: {}
# http_proxy: "proxy.com:8080"
# https_proxy: "proxy.com:8080"
# no_proxy: ""
githubWebhookServer:
enabled: false
labels: {}
replicaCount: 1
syncPeriod: 10m
secret:
enabled: false
### GitHub Webhook Configuration
#github_webhook_secret_token: ""
image:
repository: summerwind/actions-runner-controller
# Overrides the manager image tag whose default is the chart appVersion if the tag key is commented out
tag: "latest"
pullPolicy: IfNotPresent
imagePullSecrets: []
nameOverride: ""
fullnameOverride: ""
serviceAccount:
# Specifies whether a service account should be created
create: true
# Annotations to add to the service account
annotations: {}
# The name of the service account to use.
# If not set and create is true, a name is generated using the fullname template
name: ""
podAnnotations: {}
podSecurityContext: {}
# fsGroup: 2000
securityContext: {}
resources: {}
nodeSelector: {}
tolerations: []
affinity: {}
priorityClassName: ""
service:
type: NodePort
ports:
- port: 80
targetPort: 8000
protocol: TCP
name: http
#nodePort: someFixedPortForUseWithTerraformCdkCfnEtc

View File

@@ -0,0 +1,169 @@
/*
Copyright 2021 The actions-runner-controller authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package main
import (
"context"
"errors"
"flag"
"net/http"
"os"
"sync"
"time"
actionsv1alpha1 "github.com/summerwind/actions-runner-controller/api/v1alpha1"
"github.com/summerwind/actions-runner-controller/controllers"
"k8s.io/apimachinery/pkg/runtime"
clientgoscheme "k8s.io/client-go/kubernetes/scheme"
_ "k8s.io/client-go/plugin/pkg/client/auth/exec"
_ "k8s.io/client-go/plugin/pkg/client/auth/gcp"
_ "k8s.io/client-go/plugin/pkg/client/auth/oidc"
ctrl "sigs.k8s.io/controller-runtime"
"sigs.k8s.io/controller-runtime/pkg/log/zap"
// +kubebuilder:scaffold:imports
)
var (
scheme = runtime.NewScheme()
setupLog = ctrl.Log.WithName("setup")
)
func init() {
_ = clientgoscheme.AddToScheme(scheme)
_ = actionsv1alpha1.AddToScheme(scheme)
// +kubebuilder:scaffold:scheme
}
func main() {
var (
err error
webhookAddr string
metricsAddr string
// The secret token of the GitHub Webhook. See https://docs.github.com/en/developers/webhooks-and-events/securing-your-webhooks
webhookSecretToken string
watchNamespace string
enableLeaderElection bool
syncPeriod time.Duration
)
webhookSecretToken = os.Getenv("GITHUB_WEBHOOK_SECRET_TOKEN")
flag.StringVar(&webhookAddr, "webhook-addr", ":8000", "The address the metric endpoint binds to.")
flag.StringVar(&metricsAddr, "metrics-addr", ":8080", "The address the metric endpoint binds to.")
flag.StringVar(&watchNamespace, "watch-namespace", "", "The namespace to watch for HorizontalRunnerAutoscaler's to scale on Webhook. Set to empty for letting it watch for all namespaces.")
flag.BoolVar(&enableLeaderElection, "enable-leader-election", false,
"Enable leader election for controller manager. Enabling this will ensure there is only one active controller manager.")
flag.DurationVar(&syncPeriod, "sync-period", 10*time.Minute, "Determines the minimum frequency at which K8s resources managed by this controller are reconciled. When you use autoscaling, set to a lower value like 10 minute, because this corresponds to the minimum time to react on demand change")
flag.Parse()
if webhookSecretToken == "" {
setupLog.Info("-webhook-secret-token is missing or empty. Create one following https://docs.github.com/en/developers/webhooks-and-events/securing-your-webhooks")
}
if watchNamespace == "" {
setupLog.Info("-watch-namespace is empty. HorizontalRunnerAutoscalers in all the namespaces are watched, cached, and considered as scale targets.")
} else {
setupLog.Info("-watch-namespace is %q. Only HorizontalRunnerAutoscalers in %q are watched, cached, and considered as scale targets.")
}
logger := zap.New(func(o *zap.Options) {
o.Development = true
})
ctrl.SetLogger(logger)
mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
Scheme: scheme,
SyncPeriod: &syncPeriod,
LeaderElection: enableLeaderElection,
Namespace: watchNamespace,
MetricsBindAddress: metricsAddr,
Port: 9443,
})
if err != nil {
setupLog.Error(err, "unable to start manager")
os.Exit(1)
}
hraGitHubWebhook := &controllers.HorizontalRunnerAutoscalerGitHubWebhook{
Client: mgr.GetClient(),
Log: ctrl.Log.WithName("controllers").WithName("Runner"),
Recorder: nil,
Scheme: mgr.GetScheme(),
SecretKeyBytes: []byte(webhookSecretToken),
WatchNamespace: watchNamespace,
}
if err = hraGitHubWebhook.SetupWithManager(mgr); err != nil {
setupLog.Error(err, "unable to create controller", "controller", "Runner")
os.Exit(1)
}
var wg sync.WaitGroup
ctx, cancel := context.WithCancel(context.Background())
wg.Add(1)
go func() {
defer cancel()
defer wg.Done()
setupLog.Info("starting webhook server")
if err := mgr.Start(ctx.Done()); err != nil {
setupLog.Error(err, "problem running manager")
os.Exit(1)
}
}()
mux := http.NewServeMux()
mux.HandleFunc("/", hraGitHubWebhook.Handle)
srv := http.Server{
Addr: webhookAddr,
Handler: mux,
}
wg.Add(1)
go func() {
defer cancel()
defer wg.Done()
go func() {
<-ctx.Done()
srv.Shutdown(context.Background())
}()
if err := srv.ListenAndServe(); err != nil {
if !errors.Is(err, http.ErrServerClosed) {
setupLog.Error(err, "problem running http server")
}
}
}()
go func() {
<-ctrl.SetupSignalHandler()
cancel()
}()
wg.Wait()
}

View File

@@ -48,6 +48,20 @@ spec:
description: HorizontalRunnerAutoscalerSpec defines the desired state of
HorizontalRunnerAutoscaler
properties:
capacityReservations:
items:
description: CapacityReservation specifies the number of replicas
temporarily added to the scale target until ExpirationTime.
properties:
expirationTime:
format: date-time
type: string
name:
type: string
replicas:
type: integer
type: object
type: array
maxReplicas:
description: MinReplicas is the maximum number of replicas the deployment
is allowed to scale
@@ -64,6 +78,11 @@ spec:
items:
type: string
type: array
scaleDownAdjustment:
description: ScaleDownAdjustment is the number of runners removed
on scale-down. You can only specify either ScaleDownFactor or
ScaleDownAdjustment.
type: integer
scaleDownFactor:
description: ScaleDownFactor is the multiplicative factor applied
to the current number of runners used to determine how many
@@ -73,6 +92,10 @@ spec:
description: ScaleDownThreshold is the percentage of busy runners
less than which will trigger the hpa to scale the runners down.
type: string
scaleUpAdjustment:
description: ScaleUpAdjustment is the number of runners added
on scale-up. You can only specify either ScaleUpFactor or ScaleUpAdjustment.
type: integer
scaleUpFactor:
description: ScaleUpFactor is the multiplicative factor applied
to the current number of runners used to determine how many
@@ -104,9 +127,68 @@ spec:
name:
type: string
type: object
scaleUpTriggers:
description: "ScaleUpTriggers is an experimental feature to increase
the desired replicas by 1 on each webhook requested received by the
webhookBasedAutoscaler. \n This feature requires you to also enable
and deploy the webhookBasedAutoscaler onto your cluster. \n Note that
the added runners remain until the next sync period at least, and
they may or may not be used by GitHub Actions depending on the timing.
They are intended to be used to gain \"resource slack\" immediately
after you receive a webhook from GitHub, so that you can loosely expect
MinReplicas runners to be always available."
items:
properties:
amount:
type: integer
duration:
type: string
githubEvent:
properties:
checkRun:
description: https://docs.github.com/en/actions/reference/events-that-trigger-workflows#check_run
properties:
status:
type: string
types:
items:
type: string
type: array
type: object
pullRequest:
description: https://docs.github.com/en/actions/reference/events-that-trigger-workflows#pull_request
properties:
branches:
items:
type: string
type: array
types:
items:
type: string
type: array
type: object
push:
description: PushSpec is the condition for triggering scale-up
on push event Also see https://docs.github.com/en/actions/reference/events-that-trigger-workflows#push
type: object
type: object
type: object
type: array
type: object
status:
properties:
cacheEntries:
items:
properties:
expirationTime:
format: date-time
type: string
key:
type: string
value:
type: integer
type: object
type: array
desiredReplicas:
description: DesiredReplicas is the total number of desired, non-terminated
and latest pods to be set for the primary RunnerSet This doesn't include

View File

@@ -426,6 +426,9 @@ spec:
type: object
dockerdWithinRunnerContainer:
type: boolean
enterprise:
pattern: ^[^/]+$
type: string
env:
items:
description: EnvVar represents an environment variable present in a Container.

View File

@@ -426,6 +426,9 @@ spec:
type: object
dockerdWithinRunnerContainer:
type: boolean
enterprise:
pattern: ^[^/]+$
type: string
env:
items:
description: EnvVar represents an environment variable present in a Container.

View File

@@ -7,6 +7,9 @@ metadata:
name: runners.actions.summerwind.dev
spec:
additionalPrinterColumns:
- JSONPath: .spec.enterprise
name: Enterprise
type: string
- JSONPath: .spec.organization
name: Organization
type: string
@@ -419,6 +422,9 @@ spec:
type: object
dockerdWithinRunnerContainer:
type: boolean
enterprise:
pattern: ^[^/]+$
type: string
env:
items:
description: EnvVar represents an environment variable present in a Container.
@@ -1541,6 +1547,8 @@ spec:
registration:
description: RunnerStatusRegistration contains runner registration status
properties:
enterprise:
type: string
expiresAt:
format: date-time
type: string

View File

@@ -10,7 +10,7 @@ spec:
spec:
containers:
- name: kube-rbac-proxy
image: gcr.io/kubebuilder/kube-rbac-proxy:v0.4.1
image: quay.io/brancz/kube-rbac-proxy:v0.8.0
args:
- "--secure-listen-address=0.0.0.0:8443"
- "--upstream=http://127.0.0.1:8080/"

View File

@@ -7,6 +7,7 @@ import (
"math"
"strconv"
"strings"
"time"
"github.com/summerwind/actions-runner-controller/api/v1alpha1"
"sigs.k8s.io/controller-runtime/pkg/client"
@@ -19,6 +20,47 @@ const (
defaultScaleDownFactor = 0.7
)
func getValueAvailableAt(now time.Time, from, to *time.Time, reservedValue int) *int {
if to != nil && now.After(*to) {
return nil
}
if from != nil && now.Before(*from) {
return nil
}
return &reservedValue
}
func (r *HorizontalRunnerAutoscalerReconciler) getDesiredReplicasFromCache(hra v1alpha1.HorizontalRunnerAutoscaler) *int {
var entry *v1alpha1.CacheEntry
for i := range hra.Status.CacheEntries {
ent := hra.Status.CacheEntries[i]
if ent.Key != v1alpha1.CacheEntryKeyDesiredReplicas {
continue
}
if !time.Now().Before(ent.ExpirationTime.Time) {
continue
}
entry = &ent
break
}
if entry != nil {
v := getValueAvailableAt(time.Now(), nil, &entry.ExpirationTime.Time, entry.Value)
if v != nil {
return v
}
}
return nil
}
func (r *HorizontalRunnerAutoscalerReconciler) determineDesiredReplicas(rd v1alpha1.RunnerDeployment, hra v1alpha1.HorizontalRunnerAutoscaler) (*int, error) {
if hra.Spec.MinReplicas == nil {
return nil, fmt.Errorf("horizontalrunnerautoscaler %s/%s is missing minReplicas", hra.Namespace, hra.Name)
@@ -96,12 +138,12 @@ func (r *HorizontalRunnerAutoscalerReconciler) calculateReplicasByQueuedAndInPro
for _, repo := range repos {
user, repoName := repo[0], repo[1]
list, _, err := r.GitHubClient.Actions.ListRepositoryWorkflowRuns(context.TODO(), user, repoName, nil)
workflowRuns, err := r.GitHubClient.ListRepositoryWorkflowRuns(context.TODO(), user, repoName)
if err != nil {
return nil, err
}
for _, run := range list.WorkflowRuns {
for _, run := range workflowRuns {
total++
// In May 2020, there are only 3 statuses.
@@ -147,6 +189,9 @@ func (r *HorizontalRunnerAutoscalerReconciler) calculateReplicasByQueuedAndInPro
"workflow_runs_in_progress", inProgress,
"workflow_runs_queued", queued,
"workflow_runs_unknown", unknown,
"namespace", hra.Namespace,
"runner_deployment", rd.Name,
"horizontal_runner_autoscaler", hra.Name,
)
return &replicas, nil
@@ -154,7 +199,6 @@ func (r *HorizontalRunnerAutoscalerReconciler) calculateReplicasByQueuedAndInPro
func (r *HorizontalRunnerAutoscalerReconciler) calculateReplicasByPercentageRunnersBusy(rd v1alpha1.RunnerDeployment, hra v1alpha1.HorizontalRunnerAutoscaler) (*int, error) {
ctx := context.Background()
orgName := rd.Spec.Template.Spec.Organization
minReplicas := *hra.Spec.MinReplicas
maxReplicas := *hra.Spec.MaxReplicas
metrics := hra.Spec.Metrics[0]
@@ -178,14 +222,34 @@ func (r *HorizontalRunnerAutoscalerReconciler) calculateReplicasByPercentageRunn
scaleDownThreshold = sdt
}
scaleUpAdjustment := metrics.ScaleUpAdjustment
if scaleUpAdjustment != 0 {
if metrics.ScaleUpAdjustment < 0 {
return nil, errors.New("validating autoscaling metrics: spec.autoscaling.metrics[].scaleUpAdjustment cannot be lower than 0")
}
if metrics.ScaleUpFactor != "" {
return nil, errors.New("validating autoscaling metrics: spec.autoscaling.metrics[]: scaleUpAdjustment and scaleUpFactor cannot be specified together")
}
} else if metrics.ScaleUpFactor != "" {
suf, err := strconv.ParseFloat(metrics.ScaleUpFactor, 64)
if err != nil {
return nil, errors.New("validating autoscaling metrics: spec.autoscaling.metrics[].scaleUpFactor cannot be parsed into a float64")
}
scaleUpFactor = suf
}
scaleDownAdjustment := metrics.ScaleDownAdjustment
if scaleDownAdjustment != 0 {
if metrics.ScaleDownAdjustment < 0 {
return nil, errors.New("validating autoscaling metrics: spec.autoscaling.metrics[].scaleDownAdjustment cannot be lower than 0")
}
if metrics.ScaleDownFactor != "" {
return nil, errors.New("validating autoscaling metrics: spec.autoscaling.metrics[]: scaleDownAdjustment and scaleDownFactor cannot be specified together")
}
} else if metrics.ScaleDownFactor != "" {
sdf, err := strconv.ParseFloat(metrics.ScaleDownFactor, 64)
if err != nil {
return nil, errors.New("validating autoscaling metrics: spec.autoscaling.metrics[].scaleDownFactor cannot be parsed into a float64")
@@ -203,8 +267,18 @@ func (r *HorizontalRunnerAutoscalerReconciler) calculateReplicasByPercentageRunn
runnerMap[items.Name] = struct{}{}
}
var (
enterprise = rd.Spec.Template.Spec.Enterprise
organization = rd.Spec.Template.Spec.Organization
repository = rd.Spec.Template.Spec.Repository
)
// ListRunners will return all runners managed by GitHub - not restricted to ns
runners, err := r.GitHubClient.ListRunners(ctx, orgName, "")
runners, err := r.GitHubClient.ListRunners(
ctx,
enterprise,
organization,
repository)
if err != nil {
return nil, err
}
@@ -219,9 +293,17 @@ func (r *HorizontalRunnerAutoscalerReconciler) calculateReplicasByPercentageRunn
var desiredReplicas int
fractionBusy := float64(numRunnersBusy) / float64(numRunners)
if fractionBusy >= scaleUpThreshold {
if scaleUpAdjustment > 0 {
desiredReplicas = numRunners + scaleUpAdjustment
} else {
desiredReplicas = int(math.Ceil(float64(numRunners) * scaleUpFactor))
}
} else if fractionBusy < scaleDownThreshold {
if scaleDownAdjustment > 0 {
desiredReplicas = numRunners - scaleDownAdjustment
} else {
desiredReplicas = int(float64(numRunners) * scaleDownFactor)
}
} else {
desiredReplicas = *rd.Spec.Replicas
}
@@ -240,6 +322,12 @@ func (r *HorizontalRunnerAutoscalerReconciler) calculateReplicasByPercentageRunn
"current_replicas", rd.Spec.Replicas,
"num_runners", numRunners,
"num_runners_busy", numRunnersBusy,
"namespace", hra.Namespace,
"runner_deployment", rd.Name,
"horizontal_runner_autoscaler", hra.Name,
"enterprise", enterprise,
"organization", organization,
"repository", repository,
)
rd.Status.Replicas = &desiredReplicas

View File

@@ -47,7 +47,11 @@ func TestDetermineDesiredReplicas_RepositoryRunner(t *testing.T) {
min *int
sReplicas *int
sTime *metav1.Time
workflowRuns string
workflowRuns_queued string
workflowRuns_in_progress string
workflowJobs map[int]string
want int
err string
@@ -59,6 +63,8 @@ func TestDetermineDesiredReplicas_RepositoryRunner(t *testing.T) {
min: intPtr(2),
max: intPtr(3),
workflowRuns: `{"total_count": 4, "workflow_runs":[{"status":"queued"}, {"status":"in_progress"}, {"status":"in_progress"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"in_progress"}]}"`,
want: 3,
},
// 2 demanded, max at 3, currently 3, delay scaling down due to grace period
@@ -68,7 +74,9 @@ func TestDetermineDesiredReplicas_RepositoryRunner(t *testing.T) {
max: intPtr(3),
sReplicas: intPtr(3),
sTime: &metav1Now,
workflowRuns: `{"total_count": 4, "workflow_runs":[{"status":"queued"}, {"status":"in_progress"}, {"status":"completed"}]}"`,
workflowRuns: `{"total_count": 3, "workflow_runs":[{"status":"queued"}, {"status":"in_progress"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 1, "workflow_runs":[{"status":"in_progress"}]}"`,
want: 3,
},
// 3 demanded, max at 2
@@ -77,6 +85,8 @@ func TestDetermineDesiredReplicas_RepositoryRunner(t *testing.T) {
min: intPtr(2),
max: intPtr(2),
workflowRuns: `{"total_count": 4, "workflow_runs":[{"status":"queued"}, {"status":"in_progress"}, {"status":"in_progress"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"in_progress"}]}"`,
want: 2,
},
// 2 demanded, min at 2
@@ -85,6 +95,8 @@ func TestDetermineDesiredReplicas_RepositoryRunner(t *testing.T) {
min: intPtr(2),
max: intPtr(3),
workflowRuns: `{"total_count": 3, "workflow_runs":[{"status":"queued"}, {"status":"in_progress"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 1, "workflow_runs":[{"status":"in_progress"}]}"`,
want: 2,
},
// 1 demanded, min at 2
@@ -93,6 +105,8 @@ func TestDetermineDesiredReplicas_RepositoryRunner(t *testing.T) {
min: intPtr(2),
max: intPtr(3),
workflowRuns: `{"total_count": 2, "workflow_runs":[{"status":"queued"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 0, "workflow_runs":[]}"`,
want: 2,
},
// 1 demanded, min at 2
@@ -101,6 +115,8 @@ func TestDetermineDesiredReplicas_RepositoryRunner(t *testing.T) {
min: intPtr(2),
max: intPtr(3),
workflowRuns: `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 0, "workflow_runs":[]}"`,
workflowRuns_in_progress: `{"total_count": 1, "workflow_runs":[{"status":"in_progress"}]}"`,
want: 2,
},
// 1 demanded, min at 1
@@ -109,6 +125,8 @@ func TestDetermineDesiredReplicas_RepositoryRunner(t *testing.T) {
min: intPtr(1),
max: intPtr(3),
workflowRuns: `{"total_count": 2, "workflow_runs":[{"status":"queued"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 0, "workflow_runs":[]}"`,
want: 1,
},
// 1 demanded, min at 1
@@ -117,6 +135,8 @@ func TestDetermineDesiredReplicas_RepositoryRunner(t *testing.T) {
min: intPtr(1),
max: intPtr(3),
workflowRuns: `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 0, "workflow_runs":[]}"`,
workflowRuns_in_progress: `{"total_count": 1, "workflow_runs":[{"status":"in_progress"}]}"`,
want: 1,
},
// fixed at 3
@@ -126,6 +146,8 @@ func TestDetermineDesiredReplicas_RepositoryRunner(t *testing.T) {
max: intPtr(3),
fixed: intPtr(3),
workflowRuns: `{"total_count": 4, "workflow_runs":[{"status":"in_progress"}, {"status":"in_progress"}, {"status":"in_progress"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 0, "workflow_runs":[]}"`,
workflowRuns_in_progress: `{"total_count": 3, "workflow_runs":[{"status":"in_progress"}, {"status":"in_progress"}, {"status":"in_progress"}]}"`,
want: 3,
},
@@ -136,6 +158,8 @@ func TestDetermineDesiredReplicas_RepositoryRunner(t *testing.T) {
min: intPtr(2),
max: intPtr(10),
workflowRuns: `{"total_count": 4, "workflow_runs":[{"id": 1, "status":"queued"}, {"id": 2, "status":"in_progress"}, {"id": 3, "status":"in_progress"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"id": 1, "status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 2, "workflow_runs":[{"id": 2, "status":"in_progress"}, {"id": 3, "status":"in_progress"}]}"`,
workflowJobs: map[int]string{
1: `{"jobs": [{"status":"queued"}, {"status":"queued"}]}`,
2: `{"jobs": [{"status": "in_progress"}, {"status":"completed"}]}`,
@@ -157,7 +181,11 @@ func TestDetermineDesiredReplicas_RepositoryRunner(t *testing.T) {
_ = v1alpha1.AddToScheme(scheme)
t.Run(fmt.Sprintf("case %d", i), func(t *testing.T) {
server := fake.NewServer(fake.WithListRepositoryWorkflowRunsResponse(200, tc.workflowRuns), fake.WithListWorkflowJobsResponse(200, tc.workflowJobs))
server := fake.NewServer(
fake.WithListRepositoryWorkflowRunsResponse(200, tc.workflowRuns, tc.workflowRuns_queued, tc.workflowRuns_in_progress),
fake.WithListWorkflowJobsResponse(200, tc.workflowJobs),
fake.WithListRunnersResponse(200, fake.RunnersListBody),
)
defer server.Close()
client := newGithubClient(server)
@@ -231,7 +259,11 @@ func TestDetermineDesiredReplicas_OrganizationalRunner(t *testing.T) {
min *int
sReplicas *int
sTime *metav1.Time
workflowRuns string
workflowRuns_queued string
workflowRuns_in_progress string
workflowJobs map[int]string
want int
err string
@@ -243,6 +275,8 @@ func TestDetermineDesiredReplicas_OrganizationalRunner(t *testing.T) {
min: intPtr(2),
max: intPtr(3),
workflowRuns: `{"total_count": 4, "workflow_runs":[{"status":"queued"}, {"status":"in_progress"}, {"status":"in_progress"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"in_progress"}]}"`,
want: 3,
},
// 2 demanded, max at 3, currently 3, delay scaling down due to grace period
@@ -254,6 +288,8 @@ func TestDetermineDesiredReplicas_OrganizationalRunner(t *testing.T) {
sReplicas: intPtr(3),
sTime: &metav1Now,
workflowRuns: `{"total_count": 4, "workflow_runs":[{"status":"queued"}, {"status":"in_progress"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 1, "workflow_runs":[{"status":"in_progress"}]}"`,
want: 3,
},
// 3 demanded, max at 2
@@ -263,6 +299,8 @@ func TestDetermineDesiredReplicas_OrganizationalRunner(t *testing.T) {
min: intPtr(2),
max: intPtr(2),
workflowRuns: `{"total_count": 4, "workflow_runs":[{"status":"queued"}, {"status":"in_progress"}, {"status":"in_progress"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"in_progress"}]}"`,
want: 2,
},
// 2 demanded, min at 2
@@ -272,6 +310,8 @@ func TestDetermineDesiredReplicas_OrganizationalRunner(t *testing.T) {
min: intPtr(2),
max: intPtr(3),
workflowRuns: `{"total_count": 3, "workflow_runs":[{"status":"queued"}, {"status":"in_progress"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 1, "workflow_runs":[{"status":"in_progress"}]}"`,
want: 2,
},
// 1 demanded, min at 2
@@ -281,6 +321,8 @@ func TestDetermineDesiredReplicas_OrganizationalRunner(t *testing.T) {
min: intPtr(2),
max: intPtr(3),
workflowRuns: `{"total_count": 2, "workflow_runs":[{"status":"queued"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 0, "workflow_runs":[]}"`,
want: 2,
},
// 1 demanded, min at 2
@@ -290,6 +332,8 @@ func TestDetermineDesiredReplicas_OrganizationalRunner(t *testing.T) {
min: intPtr(2),
max: intPtr(3),
workflowRuns: `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 0, "workflow_runs":[]}"`,
workflowRuns_in_progress: `{"total_count": 1, "workflow_runs":[{"status":"in_progress"}]}"`,
want: 2,
},
// 1 demanded, min at 1
@@ -299,6 +343,8 @@ func TestDetermineDesiredReplicas_OrganizationalRunner(t *testing.T) {
min: intPtr(1),
max: intPtr(3),
workflowRuns: `{"total_count": 2, "workflow_runs":[{"status":"queued"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 0, "workflow_runs":[]}"`,
want: 1,
},
// 1 demanded, min at 1
@@ -308,6 +354,8 @@ func TestDetermineDesiredReplicas_OrganizationalRunner(t *testing.T) {
min: intPtr(1),
max: intPtr(3),
workflowRuns: `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 0, "workflow_runs":[]}"`,
workflowRuns_in_progress: `{"total_count": 1, "workflow_runs":[{"status":"in_progress"}]}"`,
want: 1,
},
// fixed at 3
@@ -317,7 +365,9 @@ func TestDetermineDesiredReplicas_OrganizationalRunner(t *testing.T) {
fixed: intPtr(1),
min: intPtr(1),
max: intPtr(3),
workflowRuns: `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"in_progress"}, {"status":"in_progress"}, {"status":"completed"}]}"`,
workflowRuns: `{"total_count": 4, "workflow_runs":[{"status":"in_progress"}, {"status":"in_progress"}, {"status":"in_progress"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 0, "workflow_runs":[]}"`,
workflowRuns_in_progress: `{"total_count": 3, "workflow_runs":[{"status":"in_progress"},{"status":"in_progress"},{"status":"in_progress"}]}"`,
want: 3,
},
// org runner, fixed at 3
@@ -327,7 +377,9 @@ func TestDetermineDesiredReplicas_OrganizationalRunner(t *testing.T) {
fixed: intPtr(1),
min: intPtr(1),
max: intPtr(3),
workflowRuns: `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"in_progress"}, {"status":"in_progress"}, {"status":"completed"}]}"`,
workflowRuns: `{"total_count": 4, "workflow_runs":[{"status":"in_progress"}, {"status":"in_progress"}, {"status":"in_progress"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 0, "workflow_runs":[]}"`,
workflowRuns_in_progress: `{"total_count": 3, "workflow_runs":[{"status":"in_progress"},{"status":"in_progress"},{"status":"in_progress"}]}"`,
want: 3,
},
// org runner, 1 demanded, min at 1, no repos
@@ -336,6 +388,8 @@ func TestDetermineDesiredReplicas_OrganizationalRunner(t *testing.T) {
min: intPtr(1),
max: intPtr(3),
workflowRuns: `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 0, "workflow_runs":[]}"`,
workflowRuns_in_progress: `{"total_count": 1, "workflow_runs":[{"status":"in_progress"}]}"`,
err: "validating autoscaling metrics: spec.autoscaling.metrics[].repositoryNames is required and must have one more more entries for organizational runner deployment",
},
@@ -347,6 +401,8 @@ func TestDetermineDesiredReplicas_OrganizationalRunner(t *testing.T) {
min: intPtr(2),
max: intPtr(10),
workflowRuns: `{"total_count": 4, "workflow_runs":[{"id": 1, "status":"queued"}, {"id": 2, "status":"in_progress"}, {"id": 3, "status":"in_progress"}, {"status":"completed"}]}"`,
workflowRuns_queued: `{"total_count": 1, "workflow_runs":[{"id": 1, "status":"queued"}]}"`,
workflowRuns_in_progress: `{"total_count": 2, "workflow_runs":[{"id": 2, "status":"in_progress"}, {"id": 3, "status":"in_progress"}, {"status":"completed"}]}"`,
workflowJobs: map[int]string{
1: `{"jobs": [{"status":"queued"}, {"status":"queued"}]}`,
2: `{"jobs": [{"status": "in_progress"}, {"status":"completed"}]}`,
@@ -368,7 +424,11 @@ func TestDetermineDesiredReplicas_OrganizationalRunner(t *testing.T) {
_ = v1alpha1.AddToScheme(scheme)
t.Run(fmt.Sprintf("case %d", i), func(t *testing.T) {
server := fake.NewServer(fake.WithListRepositoryWorkflowRunsResponse(200, tc.workflowRuns), fake.WithListWorkflowJobsResponse(200, tc.workflowJobs))
server := fake.NewServer(
fake.WithListRepositoryWorkflowRunsResponse(200, tc.workflowRuns, tc.workflowRuns_queued, tc.workflowRuns_in_progress),
fake.WithListWorkflowJobsResponse(200, tc.workflowJobs),
fake.WithListRunnersResponse(200, fake.RunnersListBody),
)
defer server.Close()
client := newGithubClient(server)

View File

@@ -0,0 +1,377 @@
/*
Copyright 2020 The actions-runner-controller authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package controllers
import (
"context"
"fmt"
"io/ioutil"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/types"
"net/http"
"sigs.k8s.io/controller-runtime/pkg/reconcile"
"time"
"github.com/go-logr/logr"
gogithub "github.com/google/go-github/v33/github"
"k8s.io/apimachinery/pkg/runtime"
"k8s.io/client-go/tools/record"
ctrl "sigs.k8s.io/controller-runtime"
"sigs.k8s.io/controller-runtime/pkg/client"
"github.com/summerwind/actions-runner-controller/api/v1alpha1"
)
const (
scaleTargetKey = "scaleTarget"
)
// HorizontalRunnerAutoscalerGitHubWebhook autoscales a HorizontalRunnerAutoscaler and the RunnerDeployment on each
// GitHub Webhook received
type HorizontalRunnerAutoscalerGitHubWebhook struct {
client.Client
Log logr.Logger
Recorder record.EventRecorder
Scheme *runtime.Scheme
// SecretKeyBytes is the byte representation of the Webhook secret token
// the administrator is generated and specified in GitHub Web UI.
SecretKeyBytes []byte
// WatchNamespace is the namespace to watch for HorizontalRunnerAutoscaler's to be
// scaled on Webhook.
// Set to empty for letting it watch for all namespaces.
WatchNamespace string
}
func (autoscaler *HorizontalRunnerAutoscalerGitHubWebhook) Reconcile(request reconcile.Request) (reconcile.Result, error) {
return ctrl.Result{}, nil
}
// +kubebuilder:rbac:groups=actions.summerwind.dev,resources=horizontalrunnerautoscalers,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=actions.summerwind.dev,resources=horizontalrunnerautoscalers/finalizers,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=actions.summerwind.dev,resources=horizontalrunnerautoscalers/status,verbs=get;update;patch
// +kubebuilder:rbac:groups=core,resources=events,verbs=create;patch
func (autoscaler *HorizontalRunnerAutoscalerGitHubWebhook) Handle(w http.ResponseWriter, r *http.Request) {
var (
ok bool
err error
)
defer func() {
if !ok {
w.WriteHeader(http.StatusInternalServerError)
if err != nil {
msg := err.Error()
if written, err := w.Write([]byte(msg)); err != nil {
autoscaler.Log.Error(err, "failed writing http error response", "msg", msg, "written", written)
}
}
}
}()
defer func() {
if r.Body != nil {
r.Body.Close()
}
}()
var payload []byte
if len(autoscaler.SecretKeyBytes) > 0 {
payload, err = gogithub.ValidatePayload(r, autoscaler.SecretKeyBytes)
if err != nil {
autoscaler.Log.Error(err, "error validating request body")
return
}
} else {
payload, err = ioutil.ReadAll(r.Body)
if err != nil {
autoscaler.Log.Error(err, "error reading request body")
return
}
}
webhookType := gogithub.WebHookType(r)
event, err := gogithub.ParseWebHook(webhookType, payload)
if err != nil {
var s string
if payload != nil {
s = string(payload)
}
autoscaler.Log.Error(err, "could not parse webhook", "webhookType", webhookType, "payload", s)
return
}
var target *ScaleTarget
autoscaler.Log.Info("processing webhook event", "eventType", webhookType)
switch e := event.(type) {
case *gogithub.PushEvent:
target, err = autoscaler.getScaleUpTarget(
context.TODO(),
*e.Repo.Name,
*e.Repo.Organization,
autoscaler.MatchPushEvent(e),
)
case *gogithub.PullRequestEvent:
target, err = autoscaler.getScaleUpTarget(
context.TODO(),
*e.Repo.Name,
*e.Repo.Organization.Name,
autoscaler.MatchPullRequestEvent(e),
)
case *gogithub.CheckRunEvent:
target, err = autoscaler.getScaleUpTarget(
context.TODO(),
*e.Repo.Name,
*e.Org.Name,
autoscaler.MatchCheckRunEvent(e),
)
case *gogithub.PingEvent:
ok = true
w.WriteHeader(http.StatusOK)
msg := "pong"
if written, err := w.Write([]byte(msg)); err != nil {
autoscaler.Log.Error(err, "failed writing http response", "msg", msg, "written", written)
}
autoscaler.Log.Info("received ping event")
return
default:
autoscaler.Log.Info("unknown event type", "eventType", webhookType)
return
}
if err != nil {
autoscaler.Log.Error(err, "handling check_run event")
return
}
if target == nil {
msg := "no horizontalrunnerautoscaler to scale for this github event"
autoscaler.Log.Info(msg, "eventType", webhookType)
ok = true
w.WriteHeader(http.StatusOK)
if written, err := w.Write([]byte(msg)); err != nil {
autoscaler.Log.Error(err, "failed writing http response", "msg", msg, "written", written)
}
return
}
if err := autoscaler.tryScaleUp(context.TODO(), target); err != nil {
autoscaler.Log.Error(err, "could not scale up")
return
}
ok = true
w.WriteHeader(http.StatusOK)
msg := fmt.Sprintf("scaled %s by 1", target.Name)
autoscaler.Log.Info(msg)
if written, err := w.Write([]byte(msg)); err != nil {
autoscaler.Log.Error(err, "failed writing http response", "msg", msg, "written", written)
}
}
func (autoscaler *HorizontalRunnerAutoscalerGitHubWebhook) findHRAsByKey(ctx context.Context, value string) ([]v1alpha1.HorizontalRunnerAutoscaler, error) {
ns := autoscaler.WatchNamespace
var defaultListOpts []client.ListOption
if ns != "" {
defaultListOpts = append(defaultListOpts, client.InNamespace(ns))
}
var hras []v1alpha1.HorizontalRunnerAutoscaler
if value != "" {
opts := append([]client.ListOption{}, defaultListOpts...)
opts = append(opts, client.MatchingFields{scaleTargetKey: value})
var hraList v1alpha1.HorizontalRunnerAutoscalerList
if err := autoscaler.List(ctx, &hraList, opts...); err != nil {
return nil, err
}
for _, d := range hraList.Items {
hras = append(hras, d)
}
}
return hras, nil
}
func matchTriggerConditionAgainstEvent(types []string, eventAction *string) bool {
if len(types) == 0 {
return true
}
if eventAction == nil {
return false
}
for _, tpe := range types {
if tpe == *eventAction {
return true
}
}
return false
}
type ScaleTarget struct {
v1alpha1.HorizontalRunnerAutoscaler
v1alpha1.ScaleUpTrigger
}
func (autoscaler *HorizontalRunnerAutoscalerGitHubWebhook) searchScaleTargets(hras []v1alpha1.HorizontalRunnerAutoscaler, f func(v1alpha1.ScaleUpTrigger) bool) []ScaleTarget {
var matched []ScaleTarget
for _, hra := range hras {
if !hra.ObjectMeta.DeletionTimestamp.IsZero() {
continue
}
for _, scaleUpTrigger := range hra.Spec.ScaleUpTriggers {
if !f(scaleUpTrigger) {
continue
}
matched = append(matched, ScaleTarget{
HorizontalRunnerAutoscaler: hra,
ScaleUpTrigger: scaleUpTrigger,
})
}
}
return matched
}
func (autoscaler *HorizontalRunnerAutoscalerGitHubWebhook) getScaleTarget(ctx context.Context, name string, f func(v1alpha1.ScaleUpTrigger) bool) (*ScaleTarget, error) {
hras, err := autoscaler.findHRAsByKey(ctx, name)
if err != nil {
return nil, err
}
targets := autoscaler.searchScaleTargets(hras, f)
if len(targets) != 1 {
return nil, nil
}
return &targets[0], nil
}
func (autoscaler *HorizontalRunnerAutoscalerGitHubWebhook) getScaleUpTarget(ctx context.Context, repoNameFromWebhook, orgNameFromWebhook string, f func(v1alpha1.ScaleUpTrigger) bool) (*ScaleTarget, error) {
if target, err := autoscaler.getScaleTarget(ctx, repoNameFromWebhook, f); err != nil {
return nil, err
} else if target != nil {
autoscaler.Log.Info("scale up target is repository-wide runners", "repository", repoNameFromWebhook)
return target, nil
}
if target, err := autoscaler.getScaleTarget(ctx, orgNameFromWebhook, f); err != nil {
return nil, err
} else if target != nil {
autoscaler.Log.Info("scale up target is organizational runners", "repository", orgNameFromWebhook)
return target, nil
}
return nil, nil
}
func (autoscaler *HorizontalRunnerAutoscalerGitHubWebhook) tryScaleUp(ctx context.Context, target *ScaleTarget) error {
if target == nil {
return nil
}
log := autoscaler.Log.WithValues("horizontalrunnerautoscaler", target.HorizontalRunnerAutoscaler.Name)
copy := target.HorizontalRunnerAutoscaler.DeepCopy()
amount := 1
if target.ScaleUpTrigger.Amount > 0 {
amount = target.ScaleUpTrigger.Amount
}
copy.Spec.CapacityReservations = append(copy.Spec.CapacityReservations, v1alpha1.CapacityReservation{
ExpirationTime: metav1.Time{Time: time.Now().Add(target.ScaleUpTrigger.Duration.Duration)},
Replicas: amount,
})
if err := autoscaler.Client.Update(ctx, copy); err != nil {
log.Error(err, "Failed to update horizontalrunnerautoscaler resource")
return err
}
return nil
}
func (autoscaler *HorizontalRunnerAutoscalerGitHubWebhook) SetupWithManager(mgr ctrl.Manager) error {
name := "webhookbasedautoscaler"
autoscaler.Recorder = mgr.GetEventRecorderFor(name)
if err := mgr.GetFieldIndexer().IndexField(&v1alpha1.HorizontalRunnerAutoscaler{}, scaleTargetKey, func(rawObj runtime.Object) []string {
hra := rawObj.(*v1alpha1.HorizontalRunnerAutoscaler)
if hra.Spec.ScaleTargetRef.Name == "" {
return nil
}
var rd v1alpha1.RunnerDeployment
if err := autoscaler.Client.Get(context.Background(), types.NamespacedName{Namespace: hra.Namespace, Name: hra.Spec.ScaleTargetRef.Name}, &rd); err != nil {
return nil
}
return []string{rd.Spec.Template.Spec.Repository, rd.Spec.Template.Spec.Organization}
}); err != nil {
return err
}
return ctrl.NewControllerManagedBy(mgr).
For(&v1alpha1.HorizontalRunnerAutoscaler{}).
Named(name).
Complete(autoscaler)
}

View File

@@ -0,0 +1,32 @@
package controllers
import (
"github.com/google/go-github/v33/github"
"github.com/summerwind/actions-runner-controller/api/v1alpha1"
)
func (autoscaler *HorizontalRunnerAutoscalerGitHubWebhook) MatchCheckRunEvent(event *github.CheckRunEvent) func(scaleUpTrigger v1alpha1.ScaleUpTrigger) bool {
return func(scaleUpTrigger v1alpha1.ScaleUpTrigger) bool {
g := scaleUpTrigger.GitHubEvent
if g == nil {
return false
}
cr := g.CheckRun
if cr == nil {
return false
}
if !matchTriggerConditionAgainstEvent(cr.Types, event.Action) {
return false
}
if cr.Status != "" && (event.CheckRun == nil || event.CheckRun.Status == nil || *event.CheckRun.Status != cr.Status) {
return false
}
return true
}
}

View File

@@ -0,0 +1,32 @@
package controllers
import (
"github.com/google/go-github/v33/github"
"github.com/summerwind/actions-runner-controller/api/v1alpha1"
)
func (autoscaler *HorizontalRunnerAutoscalerGitHubWebhook) MatchPullRequestEvent(event *github.PullRequestEvent) func(scaleUpTrigger v1alpha1.ScaleUpTrigger) bool {
return func(scaleUpTrigger v1alpha1.ScaleUpTrigger) bool {
g := scaleUpTrigger.GitHubEvent
if g == nil {
return false
}
pr := g.PullRequest
if pr == nil {
return false
}
if !matchTriggerConditionAgainstEvent(pr.Types, event.Action) {
return false
}
if !matchTriggerConditionAgainstEvent(pr.Branches, event.PullRequest.Base.Ref) {
return false
}
return true
}
}

View File

@@ -0,0 +1,24 @@
package controllers
import (
"github.com/google/go-github/v33/github"
"github.com/summerwind/actions-runner-controller/api/v1alpha1"
)
func (autoscaler *HorizontalRunnerAutoscalerGitHubWebhook) MatchPushEvent(event *github.PushEvent) func(scaleUpTrigger v1alpha1.ScaleUpTrigger) bool {
return func(scaleUpTrigger v1alpha1.ScaleUpTrigger) bool {
g := scaleUpTrigger.GitHubEvent
if g == nil {
return false
}
push := g.Push
if push == nil {
return false
}
return true
}
}

View File

@@ -0,0 +1,245 @@
package controllers
import (
"bytes"
"encoding/json"
"fmt"
"github.com/go-logr/logr"
"github.com/google/go-github/v33/github"
actionsv1alpha1 "github.com/summerwind/actions-runner-controller/api/v1alpha1"
"io"
"io/ioutil"
"k8s.io/apimachinery/pkg/runtime"
clientgoscheme "k8s.io/client-go/kubernetes/scheme"
"net/http"
"net/http/httptest"
"net/url"
"sigs.k8s.io/controller-runtime/pkg/client/fake"
"testing"
)
var (
sc = runtime.NewScheme()
)
func init() {
_ = clientgoscheme.AddToScheme(sc)
_ = actionsv1alpha1.AddToScheme(sc)
}
func TestWebhookCheckRun(t *testing.T) {
testServer(t,
"check_run",
&github.CheckRunEvent{
CheckRun: &github.CheckRun{
Status: github.String("queued"),
},
Repo: &github.Repository{
Name: github.String("myorg/myrepo"),
},
Org: &github.Organization{
Name: github.String("myorg"),
},
Action: github.String("created"),
},
200,
"no horizontalrunnerautoscaler to scale for this github event",
)
}
func TestWebhookPullRequest(t *testing.T) {
testServer(t,
"pull_request",
&github.PullRequestEvent{
PullRequest: &github.PullRequest{
Base: &github.PullRequestBranch{
Ref: github.String("main"),
},
},
Repo: &github.Repository{
Name: github.String("myorg/myrepo"),
Organization: &github.Organization{
Name: github.String("myorg"),
},
},
Action: github.String("created"),
},
200,
"no horizontalrunnerautoscaler to scale for this github event",
)
}
func TestWebhookPush(t *testing.T) {
testServer(t,
"push",
&github.PushEvent{
Repo: &github.PushEventRepository{
Name: github.String("myrepo"),
Organization: github.String("myorg"),
},
},
200,
"no horizontalrunnerautoscaler to scale for this github event",
)
}
func TestWebhookPing(t *testing.T) {
testServer(t,
"ping",
&github.PingEvent{
Zen: github.String("zen"),
},
200,
"pong",
)
}
func installTestLogger(webhook *HorizontalRunnerAutoscalerGitHubWebhook) *bytes.Buffer {
logs := &bytes.Buffer{}
log := testLogger{
name: "testlog",
writer: logs,
}
webhook.Log = &log
return logs
}
func testServer(t *testing.T, eventType string, event interface{}, wantCode int, wantBody string) {
t.Helper()
hraWebhook := &HorizontalRunnerAutoscalerGitHubWebhook{}
var initObjs []runtime.Object
client := fake.NewFakeClientWithScheme(sc, initObjs...)
logs := installTestLogger(hraWebhook)
defer func() {
if t.Failed() {
t.Logf("diagnostics: %s", logs.String())
}
}()
hraWebhook.Client = client
mux := http.NewServeMux()
mux.HandleFunc("/", hraWebhook.Handle)
server := httptest.NewServer(mux)
defer server.Close()
resp, err := sendWebhook(server, eventType, event)
if err != nil {
t.Fatal(err)
}
defer func() {
if resp != nil {
resp.Body.Close()
}
}()
if resp.StatusCode != wantCode {
t.Error("status:", resp.StatusCode)
}
respBody, err := ioutil.ReadAll(resp.Body)
if err != nil {
t.Fatal(err)
}
if string(respBody) != wantBody {
t.Fatal("body:", string(respBody))
}
}
func sendWebhook(server *httptest.Server, eventType string, event interface{}) (*http.Response, error) {
jsonBuf := &bytes.Buffer{}
enc := json.NewEncoder(jsonBuf)
enc.SetIndent(" ", "")
err := enc.Encode(event)
if err != nil {
return nil, fmt.Errorf("[bug in test] encoding event to json: %+v", err)
}
reqBody := jsonBuf.Bytes()
u, err := url.Parse(server.URL)
if err != nil {
return nil, fmt.Errorf("parsing server url: %v", err)
}
req := &http.Request{
Method: http.MethodPost,
URL: u,
Header: map[string][]string{
"X-GitHub-Event": {eventType},
"Content-Type": {"application/json"},
},
Body: ioutil.NopCloser(bytes.NewBuffer(reqBody)),
}
return http.DefaultClient.Do(req)
}
// testLogger is a sample logr.Logger that logs in-memory.
// It's only for testing log outputs.
type testLogger struct {
name string
keyValues map[string]interface{}
writer io.Writer
}
var _ logr.Logger = &testLogger{}
func (l *testLogger) Info(msg string, kvs ...interface{}) {
fmt.Fprintf(l.writer, "%s] %s\t", l.name, msg)
for k, v := range l.keyValues {
fmt.Fprintf(l.writer, "%s=%+v ", k, v)
}
for i := 0; i < len(kvs); i += 2 {
fmt.Fprintf(l.writer, "%s=%+v ", kvs[i], kvs[i+1])
}
fmt.Fprintf(l.writer, "\n")
}
func (_ *testLogger) Enabled() bool {
return true
}
func (l *testLogger) Error(err error, msg string, kvs ...interface{}) {
kvs = append(kvs, "error", err)
l.Info(msg, kvs...)
}
func (l *testLogger) V(_ int) logr.InfoLogger {
return l
}
func (l *testLogger) WithName(name string) logr.Logger {
return &testLogger{
name: l.name + "." + name,
keyValues: l.keyValues,
writer: l.writer,
}
}
func (l *testLogger) WithValues(kvs ...interface{}) logr.Logger {
newMap := make(map[string]interface{}, len(l.keyValues)+len(kvs)/2)
for k, v := range l.keyValues {
newMap[k] = v
}
for i := 0; i < len(kvs); i += 2 {
newMap[kvs[i].(string)] = kvs[i+1]
}
return &testLogger{
name: l.name,
keyValues: newMap,
writer: l.writer,
}
}

View File

@@ -46,6 +46,8 @@ type HorizontalRunnerAutoscalerReconciler struct {
Log logr.Logger
Recorder record.EventRecorder
Scheme *runtime.Scheme
CacheDuration time.Duration
}
// +kubebuilder:rbac:groups=actions.summerwind.dev,resources=runnerdeployments,verbs=get;list;watch;update;patch
@@ -79,7 +81,16 @@ func (r *HorizontalRunnerAutoscalerReconciler) Reconcile(req ctrl.Request) (ctrl
return ctrl.Result{}, nil
}
replicas, err := r.computeReplicas(rd, hra)
var replicas *int
replicasFromCache := r.getDesiredReplicasFromCache(hra)
if replicasFromCache != nil {
replicas = replicasFromCache
} else {
var err error
replicas, err = r.computeReplicas(rd, hra)
if err != nil {
r.Recorder.Event(&hra, corev1.EventTypeNormal, "RunnerAutoscalingFailure", err.Error())
@@ -87,12 +98,25 @@ func (r *HorizontalRunnerAutoscalerReconciler) Reconcile(req ctrl.Request) (ctrl
return ctrl.Result{}, err
}
}
const defaultReplicas = 1
currentDesiredReplicas := getIntOrDefault(rd.Spec.Replicas, defaultReplicas)
newDesiredReplicas := getIntOrDefault(replicas, defaultReplicas)
now := time.Now()
for _, reservation := range hra.Spec.CapacityReservations {
if reservation.ExpirationTime.Time.After(now) {
newDesiredReplicas += reservation.Replicas
}
}
if hra.Spec.MaxReplicas != nil && *hra.Spec.MaxReplicas < newDesiredReplicas {
newDesiredReplicas = *hra.Spec.MaxReplicas
}
// Please add more conditions that we can in-place update the newest runnerreplicaset without disruption
if currentDesiredReplicas != newDesiredReplicas {
copy := rd.DeepCopy()
@@ -103,12 +127,12 @@ func (r *HorizontalRunnerAutoscalerReconciler) Reconcile(req ctrl.Request) (ctrl
return ctrl.Result{}, err
}
return ctrl.Result{}, err
}
var updated *v1alpha1.HorizontalRunnerAutoscaler
if hra.Status.DesiredReplicas == nil || *hra.Status.DesiredReplicas != *replicas {
updated := hra.DeepCopy()
updated = hra.DeepCopy()
if (hra.Status.DesiredReplicas == nil && *replicas > 1) ||
(hra.Status.DesiredReplicas != nil && *replicas > *hra.Status.DesiredReplicas) {
@@ -117,7 +141,37 @@ func (r *HorizontalRunnerAutoscalerReconciler) Reconcile(req ctrl.Request) (ctrl
}
updated.Status.DesiredReplicas = replicas
}
if replicasFromCache == nil {
if updated == nil {
updated = hra.DeepCopy()
}
var cacheEntries []v1alpha1.CacheEntry
for _, ent := range updated.Status.CacheEntries {
if ent.ExpirationTime.Before(&metav1.Time{Time: now}) {
cacheEntries = append(cacheEntries, ent)
}
}
var cacheDuration time.Duration
if r.CacheDuration > 0 {
cacheDuration = r.CacheDuration
} else {
cacheDuration = 10 * time.Minute
}
updated.Status.CacheEntries = append(updated.Status.CacheEntries, v1alpha1.CacheEntry{
Key: v1alpha1.CacheEntryKeyDesiredReplicas,
Value: *replicas,
ExpirationTime: metav1.Time{Time: time.Now().Add(cacheDuration)},
})
}
if updated != nil {
if err := r.Status().Update(ctx, updated); err != nil {
log.Error(err, "Failed to update horizontalrunnerautoscaler status")
@@ -129,10 +183,12 @@ func (r *HorizontalRunnerAutoscalerReconciler) Reconcile(req ctrl.Request) (ctrl
}
func (r *HorizontalRunnerAutoscalerReconciler) SetupWithManager(mgr ctrl.Manager) error {
r.Recorder = mgr.GetEventRecorderFor("horizontalrunnerautoscaler-controller")
name := "horizontalrunnerautoscaler-controller"
r.Recorder = mgr.GetEventRecorderFor(name)
return ctrl.NewControllerManagedBy(mgr).
For(&v1alpha1.HorizontalRunnerAutoscaler{}).
Named(name).
Complete(r)
}

View File

@@ -2,6 +2,14 @@ package controllers
import (
"context"
"fmt"
"github.com/google/go-github/v33/github"
github3 "github.com/google/go-github/v33/github"
github2 "github.com/summerwind/actions-runner-controller/github"
"k8s.io/apimachinery/pkg/runtime"
"net/http"
"net/http/httptest"
"strings"
"time"
"github.com/summerwind/actions-runner-controller/github/fake"
@@ -27,9 +35,19 @@ type testEnvironment struct {
var (
workflowRunsFor3Replicas = `{"total_count": 5, "workflow_runs":[{"status":"queued"}, {"status":"queued"}, {"status":"in_progress"}, {"status":"in_progress"}, {"status":"completed"}]}"`
workflowRunsFor3Replicas_queued = `{"total_count": 2, "workflow_runs":[{"status":"queued"}, {"status":"queued"}]}"`
workflowRunsFor3Replicas_in_progress = `{"total_count": 2, "workflow_runs":[{"status":"in_progress"}, {"status":"in_progress"}]}"`
workflowRunsFor1Replicas = `{"total_count": 6, "workflow_runs":[{"status":"queued"}, {"status":"completed"}, {"status":"completed"}, {"status":"completed"}, {"status":"completed"}]}"`
workflowRunsFor1Replicas_queued = `{"total_count": 1, "workflow_runs":[{"status":"queued"}]}"`
workflowRunsFor1Replicas_in_progress = `{"total_count": 0, "workflow_runs":[]}"`
)
var webhookServer *httptest.Server
var ghClient *github2.Client
var fakeRunnerList *fake.RunnersList
// SetupIntegrationTest will set up a testing environment.
// This includes:
// * creating a Namespace to be used during the test
@@ -41,10 +59,17 @@ func SetupIntegrationTest(ctx context.Context) *testEnvironment {
ns := &corev1.Namespace{}
responses := &fake.FixedResponses{}
responses.ListRunners = fake.DefaultListRunnersHandler()
responses.ListRepositoryWorkflowRuns = &fake.Handler{
Status: 200,
Body: workflowRunsFor3Replicas,
Statuses: map[string]string{
"queued": workflowRunsFor3Replicas_queued,
"in_progress": workflowRunsFor3Replicas_in_progress,
},
}
fakeRunnerList = fake.NewRunnersList()
responses.ListRunners = fakeRunnerList.HandleList()
fakeGithubServer := fake.NewServer(fake.WithFixedResponses(responses))
BeforeEach(func() {
@@ -59,9 +84,7 @@ func SetupIntegrationTest(ctx context.Context) *testEnvironment {
mgr, err := ctrl.NewManager(cfg, ctrl.Options{})
Expect(err).NotTo(HaveOccurred(), "failed to create manager")
runnersList = fake.NewRunnersList()
server = runnersList.GetServer()
ghClient := newGithubClient(server)
ghClient = newGithubClient(fakeGithubServer)
replicasetController := &RunnerReplicaSetReconciler{
Client: mgr.GetClient(),
@@ -90,10 +113,25 @@ func SetupIntegrationTest(ctx context.Context) *testEnvironment {
Log: logf.Log,
GitHubClient: client,
Recorder: mgr.GetEventRecorderFor("horizontalrunnerautoscaler-controller"),
CacheDuration: 1 * time.Second,
}
err = autoscalerController.SetupWithManager(mgr)
Expect(err).NotTo(HaveOccurred(), "failed to setup controller")
autoscalerWebhook := &HorizontalRunnerAutoscalerGitHubWebhook{
Client: mgr.GetClient(),
Scheme: scheme.Scheme,
Log: logf.Log,
Recorder: mgr.GetEventRecorderFor("horizontalrunnerautoscaler-controller"),
}
err = autoscalerWebhook.SetupWithManager(mgr)
Expect(err).NotTo(HaveOccurred(), "failed to setup autoscaler webhook")
mux := http.NewServeMux()
mux.HandleFunc("/", autoscalerWebhook.Handle)
webhookServer = httptest.NewServer(mux)
go func() {
defer GinkgoRecover()
@@ -106,6 +144,7 @@ func SetupIntegrationTest(ctx context.Context) *testEnvironment {
close(stopCh)
fakeGithubServer.Close()
webhookServer.Close()
err := k8sClient.Delete(ctx, ns)
Expect(err).NotTo(HaveOccurred(), "failed to delete test namespace")
@@ -114,7 +153,7 @@ func SetupIntegrationTest(ctx context.Context) *testEnvironment {
return &testEnvironment{Namespace: ns, Responses: responses}
}
var _ = Context("Inside of a new namespace", func() {
var _ = Context("INTEGRATION: Inside of a new namespace", func() {
ctx := context.TODO()
env := SetupIntegrationTest(ctx)
ns := env.Namespace
@@ -126,7 +165,7 @@ var _ = Context("Inside of a new namespace", func() {
name := "example-runnerdeploy"
{
rs := &actionsv1alpha1.RunnerDeployment{
rd := &actionsv1alpha1.RunnerDeployment{
ObjectMeta: metav1.ObjectMeta{
Name: name,
Namespace: ns.Name,
@@ -146,80 +185,17 @@ var _ = Context("Inside of a new namespace", func() {
},
}
err := k8sClient.Create(ctx, rs)
Expect(err).NotTo(HaveOccurred(), "failed to create test RunnerDeployment resource")
runnerSets := actionsv1alpha1.RunnerReplicaSetList{Items: []actionsv1alpha1.RunnerReplicaSet{}}
Eventually(
func() int {
err := k8sClient.List(ctx, &runnerSets, client.InNamespace(ns.Name))
if err != nil {
logf.Log.Error(err, "list runner sets")
}
return len(runnerSets.Items)
},
time.Second*5, time.Millisecond*500).Should(BeEquivalentTo(1))
Eventually(
func() int {
err := k8sClient.List(ctx, &runnerSets, client.InNamespace(ns.Name))
if err != nil {
logf.Log.Error(err, "list runner sets")
}
if len(runnerSets.Items) == 0 {
logf.Log.Info("No runnerreplicasets exist yet")
return -1
}
return *runnerSets.Items[0].Spec.Replicas
},
time.Second*5, time.Millisecond*500).Should(BeEquivalentTo(1))
ExpectCreate(ctx, rd, "test RunnerDeployment")
ExpectRunnerSetsCountEventuallyEquals(ctx, ns.Name, 1)
ExpectRunnerSetsManagedReplicasCountEventuallyEquals(ctx, ns.Name, 1)
}
{
// We wrap the update in the Eventually block to avoid the below error that occurs due to concurrent modification
// made by the controller to update .Status.AvailableReplicas and .Status.ReadyReplicas
// Operation cannot be fulfilled on runnersets.actions.summerwind.dev "example-runnerset": the object has been modified; please apply your changes to the latest version and try again
Eventually(func() error {
var rd actionsv1alpha1.RunnerDeployment
err := k8sClient.Get(ctx, types.NamespacedName{Namespace: ns.Name, Name: name}, &rd)
Expect(err).NotTo(HaveOccurred(), "failed to get test RunnerDeployment resource")
ExpectRunnerDeploymentEventuallyUpdates(ctx, ns.Name, name, func(rd *actionsv1alpha1.RunnerDeployment) {
rd.Spec.Replicas = intPtr(2)
return k8sClient.Update(ctx, &rd)
},
time.Second*1, time.Millisecond*500).Should(BeNil())
runnerSets := actionsv1alpha1.RunnerReplicaSetList{Items: []actionsv1alpha1.RunnerReplicaSet{}}
Eventually(
func() int {
err := k8sClient.List(ctx, &runnerSets, client.InNamespace(ns.Name))
if err != nil {
logf.Log.Error(err, "list runner sets")
}
return len(runnerSets.Items)
},
time.Second*5, time.Millisecond*500).Should(BeEquivalentTo(1))
Eventually(
func() int {
err := k8sClient.List(ctx, &runnerSets, client.InNamespace(ns.Name))
if err != nil {
logf.Log.Error(err, "list runner sets")
}
return *runnerSets.Items[0].Spec.Replicas
},
time.Second*5, time.Millisecond*500).Should(BeEquivalentTo(2))
})
ExpectRunnerSetsCountEventuallyEquals(ctx, ns.Name, 1)
ExpectRunnerSetsManagedReplicasCountEventuallyEquals(ctx, ns.Namespace, 2)
}
// Scale-up to 3 replicas
@@ -235,48 +211,59 @@ var _ = Context("Inside of a new namespace", func() {
},
MinReplicas: intPtr(1),
MaxReplicas: intPtr(3),
ScaleDownDelaySecondsAfterScaleUp: nil,
ScaleDownDelaySecondsAfterScaleUp: intPtr(1),
Metrics: nil,
ScaleUpTriggers: []actionsv1alpha1.ScaleUpTrigger{
{
GitHubEvent: &actionsv1alpha1.GitHubEventScaleUpTriggerSpec{
PullRequest: &actionsv1alpha1.PullRequestSpec{
Types: []string{"created"},
Branches: []string{"main"},
},
},
Amount: 1,
Duration: metav1.Duration{Duration: time.Minute},
},
},
},
}
err := k8sClient.Create(ctx, hra)
ExpectCreate(ctx, hra, "test HorizontalRunnerAutoscaler")
Expect(err).NotTo(HaveOccurred(), "failed to create test HorizontalRunnerAutoscaler resource")
ExpectRunnerSetsCountEventuallyEquals(ctx, ns.Name, 1)
ExpectRunnerSetsManagedReplicasCountEventuallyEquals(ctx, ns.Name, 3)
}
runnerSets := actionsv1alpha1.RunnerReplicaSetList{Items: []actionsv1alpha1.RunnerReplicaSet{}}
{
var runnerList actionsv1alpha1.RunnerList
Eventually(
func() int {
err := k8sClient.List(ctx, &runnerSets, client.InNamespace(ns.Name))
err := k8sClient.List(ctx, &runnerList, client.InNamespace(ns.Name))
if err != nil {
logf.Log.Error(err, "list runner sets")
logf.Log.Error(err, "list runners")
}
return len(runnerSets.Items)
},
time.Second*5, time.Millisecond*500).Should(BeEquivalentTo(1))
Eventually(
func() int {
err := k8sClient.List(ctx, &runnerSets, client.InNamespace(ns.Name))
if err != nil {
logf.Log.Error(err, "list runner sets")
for i, r := range runnerList.Items {
fakeRunnerList.Add(&github3.Runner{
ID: github.Int64(int64(i)),
Name: github.String(r.Name),
OS: github.String("linux"),
Status: github.String("online"),
Busy: github.Bool(false),
})
}
if len(runnerSets.Items) == 0 {
logf.Log.Info("No runnerreplicasets exist yet")
return -1
}
return *runnerSets.Items[0].Spec.Replicas
},
time.Second*5, time.Millisecond*500).Should(BeEquivalentTo(3))
rs, err := ghClient.ListRunners(context.Background(), "", "", "test/valid")
Expect(err).NotTo(HaveOccurred(), "verifying list fake runners response")
Expect(len(rs)).To(Equal(3), "count of fake list runners")
}
// Scale-down to 1 replica
{
time.Sleep(time.Second)
responses.ListRepositoryWorkflowRuns.Body = workflowRunsFor1Replicas
responses.ListRepositoryWorkflowRuns.Statuses["queued"] = workflowRunsFor1Replicas_queued
responses.ListRepositoryWorkflowRuns.Statuses["in_progress"] = workflowRunsFor1Replicas_in_progress
var hra actionsv1alpha1.HorizontalRunnerAutoscaler
@@ -292,11 +279,97 @@ var _ = Context("Inside of a new namespace", func() {
Expect(err).NotTo(HaveOccurred(), "failed to get test HorizontalRunnerAutoscaler resource")
Eventually(
func() int {
var runnerSets actionsv1alpha1.RunnerReplicaSetList
ExpectRunnerSetsManagedReplicasCountEventuallyEquals(ctx, ns.Name, 1, "runners after HRA force update for scale-down")
}
err := k8sClient.List(ctx, &runnerSets, client.InNamespace(ns.Name))
// Scale-up to 2 replicas on first pull_request create webhook event
{
SendPullRequestEvent("test/valid", "main", "created")
ExpectRunnerSetsCountEventuallyEquals(ctx, ns.Name, 1, "runner sets after webhook")
ExpectRunnerSetsManagedReplicasCountEventuallyEquals(ctx, ns.Name, 2, "runners after first webhook event")
}
// Scale-up to 3 replicas on second pull_request create webhook event
{
SendPullRequestEvent("test/valid", "main", "created")
ExpectRunnerSetsManagedReplicasCountEventuallyEquals(ctx, ns.Name, 3, "runners after second webhook event")
}
})
})
})
func SendPullRequestEvent(repo string, branch string, action string) {
org := strings.Split(repo, "/")[0]
resp, err := sendWebhook(webhookServer, "pull_request", &github.PullRequestEvent{
PullRequest: &github.PullRequest{
Base: &github.PullRequestBranch{
Ref: github.String(branch),
},
},
Repo: &github.Repository{
Name: github.String(repo),
Organization: &github.Organization{
Name: github.String(org),
},
},
Action: github.String(action),
})
ExpectWithOffset(1, err).NotTo(HaveOccurred(), "failed to send pull_request event")
ExpectWithOffset(1, resp.StatusCode).To(Equal(200))
}
func ExpectCreate(ctx context.Context, rd runtime.Object, s string) {
err := k8sClient.Create(ctx, rd)
ExpectWithOffset(1, err).NotTo(HaveOccurred(), fmt.Sprintf("failed to create %s resource", s))
}
func ExpectRunnerDeploymentEventuallyUpdates(ctx context.Context, ns string, name string, f func(rd *actionsv1alpha1.RunnerDeployment)) {
// We wrap the update in the Eventually block to avoid the below error that occurs due to concurrent modification
// made by the controller to update .Status.AvailableReplicas and .Status.ReadyReplicas
// Operation cannot be fulfilled on runnersets.actions.summerwind.dev "example-runnerset": the object has been modified; please apply your changes to the latest version and try again
EventuallyWithOffset(
1,
func() error {
var rd actionsv1alpha1.RunnerDeployment
err := k8sClient.Get(ctx, types.NamespacedName{Namespace: ns, Name: name}, &rd)
Expect(err).NotTo(HaveOccurred(), "failed to get test RunnerDeployment resource")
f(&rd)
return k8sClient.Update(ctx, &rd)
},
time.Second*1, time.Millisecond*500).Should(BeNil())
}
func ExpectRunnerSetsCountEventuallyEquals(ctx context.Context, ns string, count int, optionalDescription ...interface{}) {
runnerSets := actionsv1alpha1.RunnerReplicaSetList{Items: []actionsv1alpha1.RunnerReplicaSet{}}
EventuallyWithOffset(
1,
func() int {
err := k8sClient.List(ctx, &runnerSets, client.InNamespace(ns))
if err != nil {
logf.Log.Error(err, "list runner sets")
}
return len(runnerSets.Items)
},
time.Second*5, time.Millisecond*500).Should(BeEquivalentTo(count), optionalDescription...)
}
func ExpectRunnerSetsManagedReplicasCountEventuallyEquals(ctx context.Context, ns string, count int, optionalDescription ...interface{}) {
runnerSets := actionsv1alpha1.RunnerReplicaSetList{Items: []actionsv1alpha1.RunnerReplicaSet{}}
EventuallyWithOffset(
1,
func() int {
err := k8sClient.List(ctx, &runnerSets, client.InNamespace(ns))
if err != nil {
logf.Log.Error(err, "list runner sets")
}
@@ -308,8 +381,5 @@ var _ = Context("Inside of a new namespace", func() {
return *runnerSets.Items[0].Spec.Replicas
},
time.Second*5, time.Millisecond*500).Should(BeEquivalentTo(1))
time.Second*5, time.Millisecond*500).Should(BeEquivalentTo(count), optionalDescription...)
}
})
})
})

View File

@@ -18,12 +18,15 @@ package controllers
import (
"context"
"errors"
"fmt"
gogithub "github.com/google/go-github/v33/github"
"github.com/summerwind/actions-runner-controller/hash"
"strings"
"time"
"github.com/go-logr/logr"
"k8s.io/apimachinery/pkg/api/errors"
kerrors "k8s.io/apimachinery/pkg/api/errors"
"k8s.io/apimachinery/pkg/runtime"
"k8s.io/client-go/tools/record"
ctrl "sigs.k8s.io/controller-runtime"
@@ -41,6 +44,8 @@ const (
finalizerName = "runner.actions.summerwind.dev"
LabelKeyPodTemplateHash = "pod-template-hash"
retryDelayOnGitHubAPIRateLimitError = 30 * time.Second
)
// RunnerReconciler reconciles a Runner object
@@ -95,9 +100,22 @@ func (r *RunnerReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
if removed {
if len(runner.Status.Registration.Token) > 0 {
ok, err := r.unregisterRunner(ctx, runner.Spec.Organization, runner.Spec.Repository, runner.Name)
ok, err := r.unregisterRunner(ctx, runner.Spec.Enterprise, runner.Spec.Organization, runner.Spec.Repository, runner.Name)
if err != nil {
log.Error(err, "Failed to unregister runner")
if errors.Is(err, &gogithub.RateLimitError{}) {
// We log the underlying error when we failed calling GitHub API to list or unregisters,
// or the runner is still busy.
log.Error(
err,
fmt.Sprintf(
"Failed to unregister runner due to GitHub API rate limits. Delaying retry for %s to avoid excessive GitHub API calls",
retryDelayOnGitHubAPIRateLimitError,
),
)
return ctrl.Result{RequeueAfter: retryDelayOnGitHubAPIRateLimitError}, err
}
return ctrl.Result{}, err
}
@@ -124,7 +142,7 @@ func (r *RunnerReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
var pod corev1.Pod
if err := r.Get(ctx, req.NamespacedName, &pod); err != nil {
if !errors.IsNotFound(err) {
if !kerrors.IsNotFound(err) {
return ctrl.Result{}, err
}
@@ -167,8 +185,40 @@ func (r *RunnerReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
}
if !pod.ObjectMeta.DeletionTimestamp.IsZero() {
deletionTimeout := 1 * time.Minute
currentTime := time.Now()
deletionDidTimeout := currentTime.Sub(pod.DeletionTimestamp.Add(deletionTimeout)) > 0
if deletionDidTimeout {
log.Info(
"Pod failed to delete itself in a timely manner. "+
"This is typically the case when a Kubernetes node became unreachable "+
"and the kube controller started evicting nodes. Forcefully deleting the pod to not get stuck.",
"podDeletionTimestamp", pod.DeletionTimestamp,
"currentTime", currentTime,
"configuredDeletionTimeout", deletionTimeout,
)
var force int64 = 0
// forcefully delete runner as we would otherwise get stuck if the node stays unreachable
if err := r.Delete(ctx, &pod, &client.DeleteOptions{GracePeriodSeconds: &force}); err != nil {
// probably
if !kerrors.IsNotFound(err) {
log.Error(err, "Failed to forcefully delete pod resource ...")
return ctrl.Result{}, err
}
// forceful deletion finally succeeded
return ctrl.Result{Requeue: true}, nil
}
r.Recorder.Event(&runner, corev1.EventTypeNormal, "PodDeleted", fmt.Sprintf("Forcefully deleted pod '%s'", pod.Name))
log.Info("Forcefully deleted runner pod", "repository", runner.Spec.Repository)
// give kube manager a little time to forcefully delete the stuck pod
return ctrl.Result{RequeueAfter: 3 * time.Second}, err
} else {
return ctrl.Result{}, err
}
}
if pod.Status.Phase == corev1.PodRunning {
for _, status := range pod.Status.ContainerStatuses {
@@ -194,10 +244,33 @@ func (r *RunnerReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
return ctrl.Result{}, err
}
runnerBusy, err := r.isRunnerBusy(ctx, runner.Spec.Organization, runner.Spec.Repository, runner.Name)
notRegistered := false
runnerBusy, err := r.GitHubClient.IsRunnerBusy(ctx, runner.Spec.Enterprise, runner.Spec.Organization, runner.Spec.Repository, runner.Name)
if err != nil {
log.Error(err, "Failed to check if runner is busy")
return ctrl.Result{}, nil
var e *github.RunnerNotFound
if errors.As(err, &e) {
log.V(1).Info("Failed to check if runner is busy. Either this runner has never been successfully registered to GitHub or it still needs more time.", "runnerName", runner.Name)
notRegistered = true
} else {
var e *gogithub.RateLimitError
if errors.As(err, &e) {
// We log the underlying error when we failed calling GitHub API to list or unregisters,
// or the runner is still busy.
log.Error(
err,
fmt.Sprintf(
"Failed to check if runner is busy due to Github API rate limit. Retrying in %s to avoid excessive GitHub API calls",
retryDelayOnGitHubAPIRateLimitError,
),
)
return ctrl.Result{RequeueAfter: retryDelayOnGitHubAPIRateLimitError}, err
}
return ctrl.Result{}, err
}
}
// See the `newPod` function called above for more information
@@ -209,9 +282,27 @@ func (r *RunnerReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
restart = true
}
registrationTimeout := 10 * time.Minute
currentTime := time.Now()
registrationDidTimeout := currentTime.Sub(pod.CreationTimestamp.Add(registrationTimeout)) > 0
if notRegistered && registrationDidTimeout {
log.Info(
"Runner failed to register itself to GitHub in timely manner. "+
"Recreating the pod to see if it resolves the issue. "+
"CAUTION: If you see this a lot, you should investigate the root cause. "+
"See https://github.com/summerwind/actions-runner-controller/issues/288",
"podCreationTimestamp", pod.CreationTimestamp,
"currentTime", currentTime,
"configuredRegistrationTimeout", registrationTimeout,
)
restart = true
}
// Don't do anything if there's no need to restart the runner
if !restart {
return ctrl.Result{}, err
return ctrl.Result{}, nil
}
// Delete current pod if recreation is needed
@@ -227,23 +318,8 @@ func (r *RunnerReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
return ctrl.Result{}, nil
}
func (r *RunnerReconciler) isRunnerBusy(ctx context.Context, org, repo, name string) (bool, error) {
runners, err := r.GitHubClient.ListRunners(ctx, org, repo)
if err != nil {
return false, err
}
for _, runner := range runners {
if runner.GetName() == name {
return runner.GetBusy(), nil
}
}
return false, fmt.Errorf("runner not found")
}
func (r *RunnerReconciler) unregisterRunner(ctx context.Context, org, repo, name string) (bool, error) {
runners, err := r.GitHubClient.ListRunners(ctx, org, repo)
func (r *RunnerReconciler) unregisterRunner(ctx context.Context, enterprise, org, repo, name string) (bool, error) {
runners, err := r.GitHubClient.ListRunners(ctx, enterprise, org, repo)
if err != nil {
return false, err
}
@@ -263,7 +339,7 @@ func (r *RunnerReconciler) unregisterRunner(ctx context.Context, org, repo, name
return false, nil
}
if err := r.GitHubClient.RemoveRunner(ctx, org, repo, id); err != nil {
if err := r.GitHubClient.RemoveRunner(ctx, enterprise, org, repo, id); err != nil {
return false, err
}
@@ -277,7 +353,7 @@ func (r *RunnerReconciler) updateRegistrationToken(ctx context.Context, runner v
log := r.Log.WithValues("runner", runner.Name)
rt, err := r.GitHubClient.GetRegistrationToken(ctx, runner.Spec.Organization, runner.Spec.Repository, runner.Name)
rt, err := r.GitHubClient.GetRegistrationToken(ctx, runner.Spec.Enterprise, runner.Spec.Organization, runner.Spec.Repository, runner.Name)
if err != nil {
r.Recorder.Event(&runner, corev1.EventTypeWarning, "FailedUpdateRegistrationToken", "Updating registration token failed")
log.Error(err, "Failed to get new registration token")
@@ -339,6 +415,10 @@ func (r *RunnerReconciler) newPod(runner v1alpha1.Runner) (corev1.Pod, error) {
Name: "RUNNER_REPO",
Value: runner.Spec.Repository,
},
{
Name: "RUNNER_ENTERPRISE",
Value: runner.Spec.Enterprise,
},
{
Name: "RUNNER_LABELS",
Value: strings.Join(runner.Spec.Labels, ","),
@@ -504,6 +584,7 @@ func (r *RunnerReconciler) newPod(runner v1alpha1.Runner) (corev1.Pod, error) {
SecurityContext: &corev1.SecurityContext{
Privileged: &privileged,
},
Resources: runner.Spec.DockerdContainerResources,
})
}
@@ -574,11 +655,14 @@ func (r *RunnerReconciler) newPod(runner v1alpha1.Runner) (corev1.Pod, error) {
}
func (r *RunnerReconciler) SetupWithManager(mgr ctrl.Manager) error {
r.Recorder = mgr.GetEventRecorderFor("runner-controller")
name := "runner-controller"
r.Recorder = mgr.GetEventRecorderFor(name)
return ctrl.NewControllerManagedBy(mgr).
For(&v1alpha1.Runner{}).
Owns(&corev1.Pod{}).
Named(name).
Complete(r)
}

View File

@@ -177,7 +177,7 @@ func (r *RunnerDeploymentReconciler) Reconcile(req ctrl.Request) (ctrl.Result, e
rs := oldSets[i]
if err := r.Client.Delete(ctx, &rs); err != nil {
log.Error(err, "Failed to delete runner resource")
log.Error(err, "Failed to delete runnerreplicaset resource")
return ctrl.Result{}, err
}
@@ -285,7 +285,8 @@ func (r *RunnerDeploymentReconciler) newRunnerReplicaSet(rd v1alpha1.RunnerDeplo
}
func (r *RunnerDeploymentReconciler) SetupWithManager(mgr ctrl.Manager) error {
r.Recorder = mgr.GetEventRecorderFor("runnerdeployment-controller")
name := "runnerdeployment-controller"
r.Recorder = mgr.GetEventRecorderFor(name)
if err := mgr.GetFieldIndexer().IndexField(&v1alpha1.RunnerReplicaSet{}, runnerSetOwnerKey, func(rawObj runtime.Object) []string {
runnerSet := rawObj.(*v1alpha1.RunnerReplicaSet)
@@ -306,5 +307,6 @@ func (r *RunnerDeploymentReconciler) SetupWithManager(mgr ctrl.Manager) error {
return ctrl.NewControllerManagedBy(mgr).
For(&v1alpha1.RunnerDeployment{}).
Owns(&v1alpha1.RunnerReplicaSet{}).
Named(name).
Complete(r)
}

View File

@@ -18,10 +18,13 @@ package controllers
import (
"context"
"errors"
"fmt"
gogithub "github.com/google/go-github/v33/github"
"time"
"github.com/go-logr/logr"
"k8s.io/apimachinery/pkg/api/errors"
kerrors "k8s.io/apimachinery/pkg/api/errors"
"k8s.io/apimachinery/pkg/runtime"
"k8s.io/client-go/tools/record"
ctrl "sigs.k8s.io/controller-runtime"
@@ -65,7 +68,7 @@ func (r *RunnerReplicaSetReconciler) Reconcile(req ctrl.Request) (ctrl.Result, e
var allRunners v1alpha1.RunnerList
if err := r.List(ctx, &allRunners, client.InNamespace(req.Namespace)); err != nil {
if !errors.IsNotFound(err) {
if !kerrors.IsNotFound(err) {
return ctrl.Result{}, err
}
}
@@ -102,12 +105,51 @@ func (r *RunnerReplicaSetReconciler) Reconcile(req ctrl.Request) (ctrl.Result, e
// get runners that are currently not busy
var notBusy []v1alpha1.Runner
for _, runner := range myRunners {
busy, err := r.isRunnerBusy(ctx, runner.Spec.Organization, runner.Spec.Repository, runner.Name)
busy, err := r.GitHubClient.IsRunnerBusy(ctx, runner.Spec.Enterprise, runner.Spec.Organization, runner.Spec.Repository, runner.Name)
if err != nil {
log.Error(err, "Failed to check if runner is busy")
notRegistered := false
var e *github.RunnerNotFound
if errors.As(err, &e) {
log.V(1).Info("Failed to check if runner is busy. Either this runner has never been successfully registered to GitHub or has not managed yet to, and therefore we prioritize it for deletion", "runnerName", runner.Name)
notRegistered = true
} else {
var e *gogithub.RateLimitError
if errors.As(err, &e) {
// We log the underlying error when we failed calling GitHub API to list or unregisters,
// or the runner is still busy.
log.Error(
err,
fmt.Sprintf(
"Failed to check if runner is busy due to GitHub API rate limit. Retrying in %s to avoid excessive GitHub API calls",
retryDelayOnGitHubAPIRateLimitError,
),
)
return ctrl.Result{RequeueAfter: retryDelayOnGitHubAPIRateLimitError}, err
}
return ctrl.Result{}, err
}
if !busy {
registrationTimeout := 15 * time.Minute
currentTime := time.Now()
registrationDidTimeout := currentTime.Sub(runner.CreationTimestamp.Add(registrationTimeout)) > 0
if notRegistered && registrationDidTimeout {
log.Info(
"Runner failed to register itself to GitHub in timely manner. "+
"Recreating the pod to see if it resolves the issue. "+
"CAUTION: If you see this a lot, you should investigate the root cause. "+
"See https://github.com/summerwind/actions-runner-controller/issues/288",
"runnerCreationTimestamp", runner.CreationTimestamp,
"currentTime", currentTime,
"configuredRegistrationTimeout", registrationTimeout,
)
notBusy = append(notBusy, runner)
}
} else if !busy {
notBusy = append(notBusy, runner)
}
}
@@ -117,7 +159,7 @@ func (r *RunnerReplicaSetReconciler) Reconcile(req ctrl.Request) (ctrl.Result, e
}
for i := 0; i < n; i++ {
if err := r.Client.Delete(ctx, &notBusy[i]); err != nil {
if err := r.Client.Delete(ctx, &notBusy[i]); client.IgnoreNotFound(err) != nil {
log.Error(err, "Failed to delete runner resource")
return ctrl.Result{}, err
@@ -179,26 +221,12 @@ func (r *RunnerReplicaSetReconciler) newRunner(rs v1alpha1.RunnerReplicaSet) (v1
}
func (r *RunnerReplicaSetReconciler) SetupWithManager(mgr ctrl.Manager) error {
r.Recorder = mgr.GetEventRecorderFor("runnerreplicaset-controller")
name := "runnerreplicaset-controller"
r.Recorder = mgr.GetEventRecorderFor(name)
return ctrl.NewControllerManagedBy(mgr).
For(&v1alpha1.RunnerReplicaSet{}).
Owns(&v1alpha1.Runner{}).
Named(name).
Complete(r)
}
func (r *RunnerReplicaSetReconciler) isRunnerBusy(ctx context.Context, org, repo, name string) (bool, error) {
runners, err := r.GitHubClient.ListRunners(ctx, org, repo)
r.Log.Info("runners", "github", runners)
if err != nil {
return false, err
}
for _, runner := range runners {
if runner.GetName() == name {
return runner.GetBusy(), nil
}
}
return false, fmt.Errorf("runner not found")
}

View File

@@ -17,6 +17,8 @@ limitations under the License.
package controllers
import (
"github.com/onsi/ginkgo/config"
"os"
"path/filepath"
"testing"
@@ -43,6 +45,8 @@ var testEnv *envtest.Environment
func TestAPIs(t *testing.T) {
RegisterFailHandler(Fail)
config.GinkgoConfig.FocusString = os.Getenv("GINKGO_FOCUS")
RunSpecsWithDefaultAndCustomReporters(t,
"Controller Suite",
[]Reporter{envtest.NewlineReporter{}})

View File

@@ -24,13 +24,34 @@ const (
`
)
type Handler struct {
type ListRunnersHandler struct {
Status int
Body string
}
func (h *ListRunnersHandler) ServeHTTP(w http.ResponseWriter, req *http.Request) {
w.WriteHeader(h.Status)
fmt.Fprintf(w, h.Body)
}
type Handler struct {
Status int
Body string
Statuses map[string]string
}
func (h *Handler) ServeHTTP(w http.ResponseWriter, req *http.Request) {
w.WriteHeader(h.Status)
status := req.URL.Query().Get("status")
if h.Statuses != nil {
if body, ok := h.Statuses[status]; ok {
fmt.Fprintf(w, body)
return
}
}
fmt.Fprintf(w, h.Body)
}
@@ -92,12 +113,21 @@ func NewServer(opts ...Option) *httptest.Server {
Status: http.StatusBadRequest,
Body: "",
},
"/enterprises/test/actions/runners/registration-token": &Handler{
Status: http.StatusCreated,
Body: fmt.Sprintf("{\"token\": \"%s\", \"expires_at\": \"%s\"}", RegistrationToken, time.Now().Add(time.Hour*1).Format(time.RFC3339)),
},
"/enterprises/invalid/actions/runners/registration-token": &Handler{
Status: http.StatusOK,
Body: fmt.Sprintf("{\"token\": \"%s\", \"expires_at\": \"%s\"}", RegistrationToken, time.Now().Add(time.Hour*1).Format(time.RFC3339)),
},
"/enterprises/error/actions/runners/registration-token": &Handler{
Status: http.StatusBadRequest,
Body: "",
},
// For ListRunners
"/repos/test/valid/actions/runners": &Handler{
Status: http.StatusOK,
Body: RunnersListBody,
},
"/repos/test/valid/actions/runners": config.FixedResponses.ListRunners,
"/repos/test/invalid/actions/runners": &Handler{
Status: http.StatusNoContent,
Body: "",
@@ -118,6 +148,18 @@ func NewServer(opts ...Option) *httptest.Server {
Status: http.StatusBadRequest,
Body: "",
},
"/enterprises/test/actions/runners": &Handler{
Status: http.StatusOK,
Body: RunnersListBody,
},
"/enterprises/invalid/actions/runners": &Handler{
Status: http.StatusNoContent,
Body: "",
},
"/enterprises/error/actions/runners": &Handler{
Status: http.StatusBadRequest,
Body: "",
},
// For RemoveRunner
"/repos/test/valid/actions/runners/1": &Handler{
@@ -144,6 +186,18 @@ func NewServer(opts ...Option) *httptest.Server {
Status: http.StatusBadRequest,
Body: "",
},
"/enterprises/test/actions/runners/1": &Handler{
Status: http.StatusNoContent,
Body: "",
},
"/enterprises/invalid/actions/runners/1": &Handler{
Status: http.StatusOK,
Body: "",
},
"/enterprises/error/actions/runners/1": &Handler{
Status: http.StatusBadRequest,
Body: "",
},
// For auto-scaling based on the number of queued(pending) workflow runs
"/repos/test/valid/actions/runs": config.FixedResponses.ListRepositoryWorkflowRuns,
@@ -159,3 +213,10 @@ func NewServer(opts ...Option) *httptest.Server {
return httptest.NewServer(mux)
}
func DefaultListRunnersHandler() *ListRunnersHandler {
return &ListRunnersHandler{
Status: http.StatusOK,
Body: RunnersListBody,
}
}

View File

@@ -1,17 +1,24 @@
package fake
import "net/http"
type FixedResponses struct {
ListRepositoryWorkflowRuns *Handler
ListWorkflowJobs *MapHandler
ListRunners http.Handler
}
type Option func(*ServerConfig)
func WithListRepositoryWorkflowRunsResponse(status int, body string) Option {
func WithListRepositoryWorkflowRunsResponse(status int, body, queued, in_progress string) Option {
return func(c *ServerConfig) {
c.FixedResponses.ListRepositoryWorkflowRuns = &Handler{
Status: status,
Body: body,
Statuses: map[string]string{
"queued": queued,
"in_progress": in_progress,
},
}
}
}
@@ -25,6 +32,15 @@ func WithListWorkflowJobsResponse(status int, bodies map[int]string) Option {
}
}
func WithListRunnersResponse(status int, body string) Option {
return func(c *ServerConfig) {
c.FixedResponses.ListRunners = &ListRunnersHandler{
Status: status,
Body: body,
}
}
}
func WithFixedResponses(responses *FixedResponses) Option {
return func(c *ServerConfig) {
c.FixedResponses = responses

View File

@@ -29,15 +29,15 @@ func (r *RunnersList) Add(runner *github.Runner) {
func (r *RunnersList) GetServer() *httptest.Server {
router := mux.NewRouter()
router.Handle("/repos/{owner}/{repo}/actions/runners", r.handleList())
router.Handle("/repos/{owner}/{repo}/actions/runners", r.HandleList())
router.Handle("/repos/{owner}/{repo}/actions/runners/{id}", r.handleRemove())
router.Handle("/orgs/{org}/actions/runners", r.handleList())
router.Handle("/orgs/{org}/actions/runners", r.HandleList())
router.Handle("/orgs/{org}/actions/runners/{id}", r.handleRemove())
return httptest.NewServer(router)
}
func (r *RunnersList) handleList() http.HandlerFunc {
func (r *RunnersList) HandleList() http.HandlerFunc {
return func(w http.ResponseWriter, res *http.Request) {
j, err := json.Marshal(github.Runners{
TotalCount: len(r.runners),

View File

@@ -11,6 +11,7 @@ import (
"github.com/bradleyfalzon/ghinstallation"
"github.com/google/go-github/v33/github"
"github.com/summerwind/actions-runner-controller/github/metrics"
"golang.org/x/oauth2"
)
@@ -34,15 +35,9 @@ type Client struct {
// NewClient creates a Github Client
func (c *Config) NewClient() (*Client, error) {
var (
httpClient *http.Client
client *github.Client
)
githubBaseURL := "https://github.com/"
var transport http.RoundTripper
if len(c.Token) > 0 {
httpClient = oauth2.NewClient(context.Background(), oauth2.StaticTokenSource(
&oauth2.Token{AccessToken: c.Token},
))
transport = oauth2.NewClient(context.Background(), oauth2.StaticTokenSource(&oauth2.Token{AccessToken: c.Token})).Transport
} else {
tr, err := ghinstallation.NewKeyFromFile(http.DefaultTransport, c.AppID, c.AppInstallationID, c.AppPrivateKey)
if err != nil {
@@ -55,9 +50,13 @@ func (c *Config) NewClient() (*Client, error) {
}
tr.BaseURL = githubAPIURL
}
httpClient = &http.Client{Transport: tr}
transport = tr
}
transport = metrics.Transport{Transport: transport}
httpClient := &http.Client{Transport: transport}
var client *github.Client
var githubBaseURL string
if len(c.EnterpriseURL) > 0 {
var err error
client, err = github.NewEnterpriseClient(c.EnterpriseURL, c.EnterpriseURL, httpClient)
@@ -67,6 +66,7 @@ func (c *Config) NewClient() (*Client, error) {
githubBaseURL = fmt.Sprintf("%s://%s%s", client.BaseURL.Scheme, client.BaseURL.Host, strings.TrimSuffix(client.BaseURL.Path, "api/v3/"))
} else {
client = github.NewClient(httpClient)
githubBaseURL = "https://github.com/"
}
return &Client{
@@ -78,24 +78,24 @@ func (c *Config) NewClient() (*Client, error) {
}
// GetRegistrationToken returns a registration token tied with the name of repository and runner.
func (c *Client) GetRegistrationToken(ctx context.Context, org, repo, name string) (*github.RegistrationToken, error) {
func (c *Client) GetRegistrationToken(ctx context.Context, enterprise, org, repo, name string) (*github.RegistrationToken, error) {
c.mu.Lock()
defer c.mu.Unlock()
key := getRegistrationKey(org, repo)
key := getRegistrationKey(org, repo, enterprise)
rt, ok := c.regTokens[key]
if ok && rt.GetExpiresAt().After(time.Now()) {
return rt, nil
}
owner, repo, err := getOwnerAndRepo(org, repo)
enterprise, owner, repo, err := getEnterpriseOrganisationAndRepo(enterprise, org, repo)
if err != nil {
return rt, err
}
rt, res, err := c.createRegistrationToken(ctx, owner, repo)
rt, res, err := c.createRegistrationToken(ctx, enterprise, owner, repo)
if err != nil {
return nil, fmt.Errorf("failed to create registration token: %v", err)
@@ -114,17 +114,17 @@ func (c *Client) GetRegistrationToken(ctx context.Context, org, repo, name strin
}
// RemoveRunner removes a runner with specified runner ID from repository.
func (c *Client) RemoveRunner(ctx context.Context, org, repo string, runnerID int64) error {
owner, repo, err := getOwnerAndRepo(org, repo)
func (c *Client) RemoveRunner(ctx context.Context, enterprise, org, repo string, runnerID int64) error {
enterprise, owner, repo, err := getEnterpriseOrganisationAndRepo(enterprise, org, repo)
if err != nil {
return err
}
res, err := c.removeRunner(ctx, owner, repo, runnerID)
res, err := c.removeRunner(ctx, enterprise, owner, repo, runnerID)
if err != nil {
return fmt.Errorf("failed to remove runner: %v", err)
return fmt.Errorf("failed to remove runner: %w", err)
}
if res.StatusCode != 204 {
@@ -135,8 +135,8 @@ func (c *Client) RemoveRunner(ctx context.Context, org, repo string, runnerID in
}
// ListRunners returns a list of runners of specified owner/repository name.
func (c *Client) ListRunners(ctx context.Context, org, repo string) ([]*github.Runner, error) {
owner, repo, err := getOwnerAndRepo(org, repo)
func (c *Client) ListRunners(ctx context.Context, enterprise, org, repo string) ([]*github.Runner, error) {
enterprise, owner, repo, err := getEnterpriseOrganisationAndRepo(enterprise, org, repo)
if err != nil {
return nil, err
@@ -146,10 +146,10 @@ func (c *Client) ListRunners(ctx context.Context, org, repo string) ([]*github.R
opts := github.ListOptions{PerPage: 10}
for {
list, res, err := c.listRunners(ctx, owner, repo, &opts)
list, res, err := c.listRunners(ctx, enterprise, owner, repo, &opts)
if err != nil {
return runners, fmt.Errorf("failed to list runners: %v", err)
return runners, fmt.Errorf("failed to list runners: %w", err)
}
runners = append(runners, list.Runners...)
@@ -174,49 +174,106 @@ func (c *Client) cleanup() {
}
}
// wrappers for github functions (switch between organization/repository mode)
// wrappers for github functions (switch between enterprise/organization/repository mode)
// so the calling functions don't need to switch and their code is a bit cleaner
func (c *Client) createRegistrationToken(ctx context.Context, owner, repo string) (*github.RegistrationToken, *github.Response, error) {
func (c *Client) createRegistrationToken(ctx context.Context, enterprise, org, repo string) (*github.RegistrationToken, *github.Response, error) {
if len(repo) > 0 {
return c.Client.Actions.CreateRegistrationToken(ctx, owner, repo)
}
return c.Client.Actions.CreateOrganizationRegistrationToken(ctx, owner)
}
func (c *Client) removeRunner(ctx context.Context, owner, repo string, runnerID int64) (*github.Response, error) {
if len(repo) > 0 {
return c.Client.Actions.RemoveRunner(ctx, owner, repo, runnerID)
}
return c.Client.Actions.RemoveOrganizationRunner(ctx, owner, runnerID)
}
func (c *Client) listRunners(ctx context.Context, owner, repo string, opts *github.ListOptions) (*github.Runners, *github.Response, error) {
if len(repo) > 0 {
return c.Client.Actions.ListRunners(ctx, owner, repo, opts)
}
return c.Client.Actions.ListOrganizationRunners(ctx, owner, opts)
}
// Validates owner and repo arguments. Both are optional, but at least one should be specified
func getOwnerAndRepo(org, repo string) (string, string, error) {
if len(repo) > 0 {
return splitOwnerAndRepo(repo)
return c.Client.Actions.CreateRegistrationToken(ctx, org, repo)
}
if len(org) > 0 {
return org, "", nil
return c.Client.Actions.CreateOrganizationRegistrationToken(ctx, org)
}
return "", "", fmt.Errorf("organization and repository are both empty")
return c.Client.Enterprise.CreateRegistrationToken(ctx, enterprise)
}
func getRegistrationKey(org, repo string) string {
if len(org) > 0 {
return org
func (c *Client) removeRunner(ctx context.Context, enterprise, org, repo string, runnerID int64) (*github.Response, error) {
if len(repo) > 0 {
return c.Client.Actions.RemoveRunner(ctx, org, repo, runnerID)
}
return repo
if len(org) > 0 {
return c.Client.Actions.RemoveOrganizationRunner(ctx, org, runnerID)
}
return c.Client.Enterprise.RemoveRunner(ctx, enterprise, runnerID)
}
func (c *Client) listRunners(ctx context.Context, enterprise, org, repo string, opts *github.ListOptions) (*github.Runners, *github.Response, error) {
if len(repo) > 0 {
return c.Client.Actions.ListRunners(ctx, org, repo, opts)
}
if len(org) > 0 {
return c.Client.Actions.ListOrganizationRunners(ctx, org, opts)
}
return c.Client.Enterprise.ListRunners(ctx, enterprise, opts)
}
func (c *Client) ListRepositoryWorkflowRuns(ctx context.Context, user string, repoName string) ([]*github.WorkflowRun, error) {
c.Client.Actions.ListRepositoryWorkflowRuns(ctx, user, repoName, nil)
queued, err := c.listRepositoryWorkflowRuns(ctx, user, repoName, "queued")
if err != nil {
return nil, fmt.Errorf("listing queued workflow runs: %w", err)
}
inProgress, err := c.listRepositoryWorkflowRuns(ctx, user, repoName, "in_progress")
if err != nil {
return nil, fmt.Errorf("listing in_progress workflow runs: %w", err)
}
var workflowRuns []*github.WorkflowRun
workflowRuns = append(workflowRuns, queued...)
workflowRuns = append(workflowRuns, inProgress...)
return workflowRuns, nil
}
func (c *Client) listRepositoryWorkflowRuns(ctx context.Context, user string, repoName, status string) ([]*github.WorkflowRun, error) {
c.Client.Actions.ListRepositoryWorkflowRuns(ctx, user, repoName, nil)
var workflowRuns []*github.WorkflowRun
opts := github.ListWorkflowRunsOptions{
ListOptions: github.ListOptions{
PerPage: 100,
},
Status: status,
}
for {
list, res, err := c.Client.Actions.ListRepositoryWorkflowRuns(ctx, user, repoName, &opts)
if err != nil {
return workflowRuns, fmt.Errorf("failed to list workflow runs: %v", err)
}
workflowRuns = append(workflowRuns, list.WorkflowRuns...)
if res.NextPage == 0 {
break
}
opts.Page = res.NextPage
}
return workflowRuns, nil
}
// Validates enterprise, organisation and repo arguments. Both are optional, but at least one should be specified
func getEnterpriseOrganisationAndRepo(enterprise, org, repo string) (string, string, string, error) {
if len(repo) > 0 {
owner, repository, err := splitOwnerAndRepo(repo)
return "", owner, repository, err
}
if len(org) > 0 {
return "", org, "", nil
}
if len(enterprise) > 0 {
return enterprise, "", "", nil
}
return "", "", "", fmt.Errorf("enterprise, organization and repository are all empty")
}
func getRegistrationKey(org, repo, enterprise string) string {
return fmt.Sprintf("org=%s,repo=%s,enterprise=%s", org, repo, enterprise)
}
func splitOwnerAndRepo(repo string) (string, string, error) {
@@ -244,3 +301,26 @@ func getEnterpriseApiUrl(baseURL string) (string, error) {
// Trim trailing slash, otherwise there's double slash added to token endpoint
return fmt.Sprintf("%s://%s%s", baseEndpoint.Scheme, baseEndpoint.Host, strings.TrimSuffix(baseEndpoint.Path, "/")), nil
}
type RunnerNotFound struct {
runnerName string
}
func (e *RunnerNotFound) Error() string {
return fmt.Sprintf("runner %q not found", e.runnerName)
}
func (r *Client) IsRunnerBusy(ctx context.Context, enterprise, org, repo, name string) (bool, error) {
runners, err := r.ListRunners(ctx, enterprise, org, repo)
if err != nil {
return false, err
}
for _, runner := range runners {
if runner.GetName() == name {
return runner.GetBusy(), nil
}
}
return false, &RunnerNotFound{runnerName: name}
}

View File

@@ -32,29 +32,36 @@ func newTestClient() *Client {
}
func TestMain(m *testing.M) {
server = fake.NewServer()
res := &fake.FixedResponses{
ListRunners: fake.DefaultListRunnersHandler(),
}
server = fake.NewServer(fake.WithFixedResponses(res))
defer server.Close()
m.Run()
}
func TestGetRegistrationToken(t *testing.T) {
tests := []struct {
enterprise string
org string
repo string
token string
err bool
}{
{org: "", repo: "test/valid", token: fake.RegistrationToken, err: false},
{org: "", repo: "test/invalid", token: "", err: true},
{org: "", repo: "test/error", token: "", err: true},
{org: "test", repo: "", token: fake.RegistrationToken, err: false},
{org: "invalid", repo: "", token: "", err: true},
{org: "error", repo: "", token: "", err: true},
{enterprise: "", org: "", repo: "test/valid", token: fake.RegistrationToken, err: false},
{enterprise: "", org: "", repo: "test/invalid", token: "", err: true},
{enterprise: "", org: "", repo: "test/error", token: "", err: true},
{enterprise: "", org: "test", repo: "", token: fake.RegistrationToken, err: false},
{enterprise: "", org: "invalid", repo: "", token: "", err: true},
{enterprise: "", org: "error", repo: "", token: "", err: true},
{enterprise: "test", org: "", repo: "", token: fake.RegistrationToken, err: false},
{enterprise: "invalid", org: "", repo: "", token: "", err: true},
{enterprise: "error", org: "", repo: "", token: "", err: true},
}
client := newTestClient()
for i, tt := range tests {
rt, err := client.GetRegistrationToken(context.Background(), tt.org, tt.repo, "test")
rt, err := client.GetRegistrationToken(context.Background(), tt.enterprise, tt.org, tt.repo, "test")
if !tt.err && err != nil {
t.Errorf("[%d] unexpected error: %v", i, err)
}
@@ -66,22 +73,26 @@ func TestGetRegistrationToken(t *testing.T) {
func TestListRunners(t *testing.T) {
tests := []struct {
enterprise string
org string
repo string
length int
err bool
}{
{org: "", repo: "test/valid", length: 2, err: false},
{org: "", repo: "test/invalid", length: 0, err: true},
{org: "", repo: "test/error", length: 0, err: true},
{org: "test", repo: "", length: 2, err: false},
{org: "invalid", repo: "", length: 0, err: true},
{org: "error", repo: "", length: 0, err: true},
{enterprise: "", org: "", repo: "test/valid", length: 2, err: false},
{enterprise: "", org: "", repo: "test/invalid", length: 0, err: true},
{enterprise: "", org: "", repo: "test/error", length: 0, err: true},
{enterprise: "", org: "test", repo: "", length: 2, err: false},
{enterprise: "", org: "invalid", repo: "", length: 0, err: true},
{enterprise: "", org: "error", repo: "", length: 0, err: true},
{enterprise: "test", org: "", repo: "", length: 2, err: false},
{enterprise: "invalid", org: "", repo: "", length: 0, err: true},
{enterprise: "error", org: "", repo: "", length: 0, err: true},
}
client := newTestClient()
for i, tt := range tests {
runners, err := client.ListRunners(context.Background(), tt.org, tt.repo)
runners, err := client.ListRunners(context.Background(), tt.enterprise, tt.org, tt.repo)
if !tt.err && err != nil {
t.Errorf("[%d] unexpected error: %v", i, err)
}
@@ -93,21 +104,25 @@ func TestListRunners(t *testing.T) {
func TestRemoveRunner(t *testing.T) {
tests := []struct {
enterprise string
org string
repo string
err bool
}{
{org: "", repo: "test/valid", err: false},
{org: "", repo: "test/invalid", err: true},
{org: "", repo: "test/error", err: true},
{org: "test", repo: "", err: false},
{org: "invalid", repo: "", err: true},
{org: "error", repo: "", err: true},
{enterprise: "", org: "", repo: "test/valid", err: false},
{enterprise: "", org: "", repo: "test/invalid", err: true},
{enterprise: "", org: "", repo: "test/error", err: true},
{enterprise: "", org: "test", repo: "", err: false},
{enterprise: "", org: "invalid", repo: "", err: true},
{enterprise: "", org: "error", repo: "", err: true},
{enterprise: "test", org: "", repo: "", err: false},
{enterprise: "invalid", org: "", repo: "", err: true},
{enterprise: "error", org: "", repo: "", err: true},
}
client := newTestClient()
for i, tt := range tests {
err := client.RemoveRunner(context.Background(), tt.org, tt.repo, int64(1))
err := client.RemoveRunner(context.Background(), tt.enterprise, tt.org, tt.repo, int64(1))
if !tt.err && err != nil {
t.Errorf("[%d] unexpected error: %v", i, err)
}

View File

@@ -0,0 +1,63 @@
// Package metrics provides monitoring of the GitHub related metrics.
//
// This depends on the metrics exporter of kubebuilder.
// See https://book.kubebuilder.io/reference/metrics.html for details.
package metrics
import (
"net/http"
"strconv"
"github.com/prometheus/client_golang/prometheus"
"sigs.k8s.io/controller-runtime/pkg/metrics"
)
func init() {
metrics.Registry.MustRegister(metricRateLimit, metricRateLimitRemaining)
}
var (
// https://docs.github.com/en/rest/overview/resources-in-the-rest-api#rate-limiting
metricRateLimit = prometheus.NewGauge(
prometheus.GaugeOpts{
Name: "github_rate_limit",
Help: "The maximum number of requests you're permitted to make per hour",
},
)
metricRateLimitRemaining = prometheus.NewGauge(
prometheus.GaugeOpts{
Name: "github_rate_limit_remaining",
Help: "The number of requests remaining in the current rate limit window",
},
)
)
const (
// https://docs.github.com/en/rest/overview/resources-in-the-rest-api#rate-limiting
headerRateLimit = "X-RateLimit-Limit"
headerRateLimitRemaining = "X-RateLimit-Remaining"
)
// Transport wraps a transport with metrics monitoring
type Transport struct {
Transport http.RoundTripper
}
func (t Transport) RoundTrip(req *http.Request) (*http.Response, error) {
resp, err := t.Transport.RoundTrip(req)
if resp != nil {
parseResponse(resp)
}
return resp, err
}
func parseResponse(resp *http.Response) {
rateLimit, err := strconv.Atoi(resp.Header.Get(headerRateLimit))
if err == nil {
metricRateLimit.Set(float64(rateLimit))
}
rateLimitRemaining, err := strconv.Atoi(resp.Header.Get(headerRateLimitRemaining))
if err == nil {
metricRateLimitRemaining.Set(float64(rateLimitRemaining))
}
}

6
go.mod
View File

@@ -6,14 +6,12 @@ require (
github.com/bradleyfalzon/ghinstallation v1.1.1
github.com/davecgh/go-spew v1.1.1
github.com/go-logr/logr v0.1.0
github.com/google/go-github v17.0.0+incompatible // indirect
github.com/google/go-github/v32 v32.1.1-0.20200822031813-d57a3a84ba04
github.com/google/go-github/v33 v33.0.0
github.com/google/go-querystring v1.0.0
github.com/google/go-github/v33 v33.0.1-0.20210204004227-319dcffb518a
github.com/gorilla/mux v1.8.0
github.com/kelseyhightower/envconfig v1.4.0
github.com/onsi/ginkgo v1.8.0
github.com/onsi/gomega v1.5.0
github.com/prometheus/client_golang v0.9.2
github.com/stretchr/testify v1.4.0 // indirect
golang.org/x/oauth2 v0.0.0-20190604053449-0f29369cfe45
k8s.io/api v0.0.0-20190918155943-95b840bb6a1f

8
go.sum
View File

@@ -116,14 +116,10 @@ github.com/google/go-cmp v0.3.0 h1:crn/baboCvb5fXaQ0IJ1SGTsTVrWpDsCWC8EGETZijY=
github.com/google/go-cmp v0.3.0/go.mod h1:8QqcDgzrUqlUb/G2PQTWiueGozuR1884gddMywk6iLU=
github.com/google/go-cmp v0.3.1 h1:Xye71clBPdm5HgqGwUkwhbynsUJZhDbS20FvLhQ2izg=
github.com/google/go-cmp v0.3.1/go.mod h1:8QqcDgzrUqlUb/G2PQTWiueGozuR1884gddMywk6iLU=
github.com/google/go-github v17.0.0+incompatible h1:N0LgJ1j65A7kfXrZnUDaYCs/Sf4rEjNlfyDHW9dolSY=
github.com/google/go-github v17.0.0+incompatible/go.mod h1:zLgOLi98H3fifZn+44m+umXrS52loVEgC2AApnigrVQ=
github.com/google/go-github/v29 v29.0.2 h1:opYN6Wc7DOz7Ku3Oh4l7prmkOMwEcQxpFtxdU8N8Pts=
github.com/google/go-github/v29 v29.0.2/go.mod h1:CHKiKKPHJ0REzfwc14QMklvtHwCveD0PxlMjLlzAM5E=
github.com/google/go-github/v32 v32.1.1-0.20200822031813-d57a3a84ba04 h1:wEYk2h/GwOhImcVjiTIceP88WxVbXw2F+ARYUQMEsfg=
github.com/google/go-github/v32 v32.1.1-0.20200822031813-d57a3a84ba04/go.mod h1:rIEpZD9CTDQwDK9GDrtMTycQNA4JU3qBsCizh3q2WCI=
github.com/google/go-github/v33 v33.0.0 h1:qAf9yP0qc54ufQxzwv+u9H0tiVOnPJxo0lI/JXqw3ZM=
github.com/google/go-github/v33 v33.0.0/go.mod h1:GMdDnVZY/2TsWgp/lkYnpSAh6TrzhANBBwm6k6TTEXg=
github.com/google/go-github/v33 v33.0.1-0.20210204004227-319dcffb518a h1:Z9Nzq8ntvvXCLnFGOkzzcD8HDOzOo+obuwE5oK85vNQ=
github.com/google/go-github/v33 v33.0.1-0.20210204004227-319dcffb518a/go.mod h1:GMdDnVZY/2TsWgp/lkYnpSAh6TrzhANBBwm6k6TTEXg=
github.com/google/go-querystring v1.0.0 h1:Xkwi/a1rcvNg1PPYe5vI8GbeBY/jrVuDX5ASuANWTrk=
github.com/google/go-querystring v1.0.0/go.mod h1:odCYkC5MyYFN7vkCjXpyrEuKhc/BUO6wN/zVPAxq5ck=
github.com/google/gofuzz v0.0.0-20161122191042-44d81051d367/go.mod h1:HP5RmnzzSNb993RKQDq4+1A4ia9nllfqcQFTQJedwGI=

1089
index.yaml

File diff suppressed because it is too large Load Diff

View File

@@ -148,6 +148,7 @@ func main() {
Log: ctrl.Log.WithName("controllers").WithName("HorizontalRunnerAutoscaler"),
Scheme: mgr.GetScheme(),
GitHubClient: ghClient,
CacheDuration: syncPeriod - 10*time.Second,
}
if err = horizontalRunnerAutoscaler.SetupWithManager(mgr); err != nil {

View File

@@ -83,7 +83,7 @@ RUN echo AGENT_TOOLSDIRECTORY=/opt/hostedtoolcache > .env \
&& chmod g+rwx /opt/hostedtoolcache
COPY entrypoint.sh /
COPY patched $RUNNER_ASSETS_DIR/patched
COPY --chown=runner:docker patched $RUNNER_ASSETS_DIR/patched
USER runner
ENTRYPOINT ["/usr/local/bin/dumb-init", "--"]

View File

@@ -104,7 +104,7 @@ RUN export ARCH=$(echo ${TARGETPLATFORM} | cut -d / -f2) \
VOLUME /var/lib/docker
COPY patched $RUNNER_ASSETS_DIR/patched
COPY --chown=runner:docker patched $RUNNER_ASSETS_DIR/patched
# No group definition, as that makes it harder to run docker.
USER runner

View File

@@ -16,14 +16,16 @@ if [ -z "${RUNNER_NAME}" ]; then
exit 1
fi
if [ -n "${RUNNER_ORG}" ] && [ -n "${RUNNER_REPO}" ]; then
if [ -n "${RUNNER_ORG}" ] && [ -n "${RUNNER_REPO}" ] && [ -n "${RUNNER_ENTERPRISE}" ]; then
ATTACH="${RUNNER_ORG}/${RUNNER_REPO}"
elif [ -n "${RUNNER_ORG}" ]; then
ATTACH="${RUNNER_ORG}"
elif [ -n "${RUNNER_REPO}" ]; then
ATTACH="${RUNNER_REPO}"
elif [ -n "${RUNNER_ENTERPRISE}" ]; then
ATTACH="enterprises/${RUNNER_ENTERPRISE}"
else
echo "At least one of RUNNER_ORG or RUNNER_REPO must be set" 1>&2
echo "At least one of RUNNER_ORG or RUNNER_REPO or RUNNER_ENTERPRISE must be set" 1>&2
exit 1
fi