(Configuring DinD) Unable to set MTU in gha-runner-scale-set #2993
Controller Version: 0.5.0
Helm Chart Version: 0.5.0
CertManager Version: No response
Deployment Method: Helm
cert-manager installation: N/A
Resource Definitions

apiVersion: actions.github.com/v1alpha1
kind: EphemeralRunnerSet
metadata:
  annotations:
    actions.github.com/runner-group-name: Default
  creationTimestamp: '2023-09-26T15:02:59Z'
  finalizers:
    - ephemeralrunner.actions.github.com/finalizer
  generateName: REDACTED-github-runner-
  generation: 60
  labels:
    actions.github.com/organization: REDACTED
    actions.github.com/scale-set-name: REDACTED-github-runner
    actions.github.com/scale-set-namespace: arc-runner
    app.kubernetes.io/component: runner-set
    app.kubernetes.io/part-of: gha-runner-scale-set
    app.kubernetes.io/version: 0.5.0
    k8slens-edit-resource-version: v1alpha1
    runner-spec-hash: 6d7bc7bccb
  name: REDACTED-github-runner-mpsxl
  namespace: arc-runner
  resourceVersion: '526935102'
  uid: dcb3f04c-e839-4a98-8db5-a8b82af479ea
  selfLink: >-
    /apis/actions.github.com/v1alpha1/namespaces/arc-runner/ephemeralrunnersets/REDACTED-github-runner-mpsxl
status:
  currentReplicas: 0
  pendingEphemeralRunners: 0
  runningEphemeralRunners: 0
spec:
  ephemeralRunnerSpec:
    githubConfigSecret: github-runner
    githubConfigUrl: https://github.com/REDACTED
    metadata: {}
    runnerScaleSetId: 1
    spec:
      containers:
        - command:
            - /home/runner/run.sh
          env:
            - name: DOCKER_HOST
              value: tcp://localhost:2376
            - name: DOCKER_TLS_VERIFY
              value: '1'
            - name: DOCKER_CERT_PATH
              value: /certs/client
          image: >-
            us-docker.pkg.dev/REGISTRY/CUSTOM_IMAGE:latest
          name: runner
          resources:
            limits:
              memory: 4Gi
            requests:
              cpu: '1'
              memory: 2Gi
          volumeMounts:
            - mountPath: /home/runner/_work
              name: work
            - mountPath: /certs/client
              name: dind-cert
              readOnly: true
        - image: docker:dind
          name: dind
          resources: {}
          securityContext:
            privileged: true
          volumeMounts:
            - mountPath: /home/runner/_work
              name: work
            - mountPath: /certs/client
              name: dind-cert
            - mountPath: /home/runner/externals
              name: dind-externals
      initContainers:
        - command:
            - cp
            - '-r'
            - '-v'
            - /home/runner/externals/.
            - /home/runner/tmpDir/
          image: ghcr.io/actions/actions-runner:latest
          name: init-dind-externals
          resources: {}
          volumeMounts:
            - mountPath: /home/runner/tmpDir
              name: dind-externals
      serviceAccountName: REDACTED-github-runner-gha-rs-no-permission
      volumes:
        - emptyDir: {}
          name: work
        - emptyDir: {}
          name: dind-cert
        - emptyDir: {}
          name: dind-externals

To Reproduce

1. Deploy the gha-runner-scale-set-controller helm chart with default values on a GKE cluster
2. Deploy the gha-runner-scale-set chart using `containerMode: dind`
3. Use the `docker/build-push-action` to build a docker image that downloads go modules. e.g.
FROM golang:1.19.10
RUN go get github.com/gin-gonic/[email protected]
Whole Runner Pod Logs: NA
Additional Context: No response
Replies: 6 comments 8 replies
-
I was able to get this working by creating a configmap and mounting it into the dind container in the EphemeralRunnerSet, similar to #1652 (comment).

It's still not clear whether this is intended to be configurable in the CRDs/Helm chart, though. I think some additional documentation could be helpful here.
-
Hey @bkonicek-calm, the way you managed to solve the issue is actually the intended way.
-
@nikola-jokic Is it just me, or is it more a matter of the documentation simply not existing? The legacy troubleshooting guide mentions this (but without an updated solution), while the "new" docs don't seem to mention it at all.

This seems like a problem that people are extremely likely to hit (non-standard MTUs are common in cloud environments and with VLANs), and fixing it requires quite a lot of digging. First you have to figure out that you have this problem and debug it, and the new troubleshooting guide does not describe the issue.

I am attempting to piece together the steps based on the above, but so far it has not worked for me :( Could you help clarify the steps needed? (Then we almost have the documentation :)

My current ephemeral runner set looks like this after being patched, but container jobs are still starting with the wrong MTU:

apiVersion: actions.github.com/v1alpha1
kind: EphemeralRunnerSet
metadata:
  annotations:
    actions.github.com/runner-group-name: Default
  creationTimestamp: "2023-10-28T13:29:59Z"
  finalizers:
    - ephemeralrunner.actions.github.com/finalizer
  generateName: arc-runner-set-
  generation: 28
  labels:
    actions.github.com/organization: ********
    actions.github.com/scale-set-name: arc-runner-set
    actions.github.com/scale-set-namespace: arc-runners
    app.kubernetes.io/component: runner-set
    app.kubernetes.io/part-of: gha-runner-scale-set
    app.kubernetes.io/version: 0.6.1
    runner-spec-hash: 776854fcb
  name: arc-runner-set-2mfg9
  namespace: arc-runners
  ownerReferences:
    - apiVersion: actions.github.com/v1alpha1
      blockOwnerDeletion: true
      controller: true
      kind: AutoscalingRunnerSet
      name: arc-runner-set
      uid: fcbf4763-7a92-4012-af91-5abba02ff7cb
  resourceVersion: "564950"
  uid: f545bdf5-6b5f-48c0-994a-af36321b5a2f
spec:
  ephemeralRunnerSpec:
    githubConfigSecret: github-app-secret
    githubConfigUrl: **********
    metadata: {}
    runnerScaleSetId: 9
    spec:
      containers:
        - command:
            - /home/runner/run.sh
          env:
            - name: DOCKER_HOST
              value: unix:///run/docker/docker.sock
            - name: RUNNER_WAIT_FOR_DOCKER_IN_SECONDS
              value: "120"
          image: ghcr.io/actions/actions-runner:latest
          name: runner
          resources: {}
          volumeMounts:
            - mountPath: /home/runner/_work
              name: work
            - mountPath: /run/docker
              name: dind-sock
              readOnly: true
        - args:
            - dockerd
            - --host=unix:///run/docker/docker.sock
            - --group=$(DOCKER_GROUP_GID)
          env:
            - name: DOCKER_GROUP_GID
              value: "123"
          image: docker:dind
          name: dind
          resources: {}
          securityContext:
            privileged: true
          volumeMounts:
            - mountPath: /home/runner/_work
              name: work
            - mountPath: /run/docker
              name: dind-sock
            - mountPath: /home/runner/externals
              name: dind-externals
            - mountPath: /etc/docker/daemon.json
              name: daemon-json
              readOnly: true
              subPath: daemon.json
      initContainers:
        - args:
            - -r
            - -v
            - /home/runner/externals/.
            - /home/runner/tmpDir/
          command:
            - cp
          image: ghcr.io/actions/actions-runner:latest
          name: init-dind-externals
          resources: {}
          volumeMounts:
            - mountPath: /home/runner/tmpDir
              name: dind-externals
      restartPolicy: Never
      serviceAccountName: arc-runner-set-gha-rs-no-permission
      volumes:
        - emptyDir: {}
          name: dind-sock
        - emptyDir: {}
          name: dind-externals
        - emptyDir: {}
          name: work
        - configMap:
            name: docker-daemon-config
          name: daemon-json
status:
  currentReplicas: 0
  pendingEphemeralRunners: 0
  runningEphemeralRunners: 0

and the config map:

apiVersion: v1
data:
  daemon.json: |
    {
      "mtu": 1400
    }
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"daemon.json":"{\n \"mtu\": 1400\n}\n"},"kind":"ConfigMap","metadata":{"annotations":{},"name":"docker-daemon-config","namespace":"arc-runners"}}
  creationTimestamp: "2023-10-30T19:15:04Z"
  name: docker-daemon-config
  namespace: arc-runners
  resourceVersion: "539728"
  uid: 55b7992d-82be-4e35-b54c-b37c72c16a4a
-
Okay, I have debugged this in more detail. My configmap does apply, and the bridge network does get MTU 1400. However, my job container is not attached to the bridge; it's attached to a different network.
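One way to check which MTU a job container actually sees is a throwaway workflow along these lines. This is only a sketch: the `runs-on` label assumes the scale set is installed as `arc-runner-set` (the name used in the manifest above), the workflow name and the `ubuntu:22.04` image are arbitrary choices, and `eth0` is assumed to be the interface name inside the job container.

```yaml
# Hypothetical diagnostic workflow; names and image are illustrative only.
name: mtu-check
on: workflow_dispatch
jobs:
  mtu-check:
    # Assumed to match the runner scale set installation name.
    runs-on: arc-runner-set
    # Run the steps in a job container, since that is where the wrong MTU shows up.
    container:
      image: ubuntu:22.04
    steps:
      - name: Print the MTU seen inside the job container
        # eth0 is an assumption about the container's interface name.
        run: cat /sys/class/net/eth0/mtu
```

If the printed value is larger than what the underlying pod network supports, the daemon-level `mtu` setting is probably not reaching the network the job container is attached to.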
-
You need two separate fixes:
My configmap looks like this:

apiVersion: v1
kind: ConfigMap
metadata:
  name: docker-daemon-config
  namespace: arc-runners
data:
  daemon.json: |
    {
      "mtu": 1400,
      "registry-mirrors": ["http://registry.arc-runners.svc.local.configit:5000"]
    }

(The registry mirror is unrelated, but I would recommend running one to avoid rate limiting on Docker Hub.)

The template section of my values.yaml file for the arc template looks like this:

template:
  spec:
    initContainers:
      - name: init-dind-externals
        image: ghcr.io/actions/actions-runner:latest
        command: ["cp", "-r", "-v", "/home/runner/externals/.", "/home/runner/tmpDir/"]
        volumeMounts:
          - name: dind-externals
            mountPath: /home/runner/tmpDir
    containers:
      - name: runner
        #image: ghcr.io/actions/actions-runner:latest
        image: configit/gha-runner-shim:latest
        command: ["/home/runner/run.sh"]
        env:
          - name: DOCKER_HOST
            value: unix:///run/docker/docker.sock
        volumeMounts:
          - name: work
            mountPath: /home/runner/_work
          - name: dind-sock
            mountPath: /run/docker
            readOnly: true
      - name: dind
        image: docker:dind
        args:
          - dockerd
          - --host=unix:///run/docker/docker.sock
          - --group=$(DOCKER_GROUP_GID)
        resources:
          requests:
            memory: 4Gi
          limits:
            memory: 16Gi
        env:
          - name: DOCKER_GROUP_GID
            value: "123"
        securityContext:
          privileged: true
        volumeMounts:
          - name: work
            mountPath: /home/runner/_work
          - name: dind-sock
            mountPath: /run/docker
          - name: dind-externals
            mountPath: /home/runner/externals
          - name: daemon-json
            mountPath: /etc/docker/daemon.json
            readOnly: true
            subPath: daemon.json
    volumes:
      - name: work
        emptyDir: {}
      - name: dind-sock
        emptyDir: {}
      - name: dind-externals
        emptyDir: {}
      - name: daemon-json
        configMap:
          name: docker-daemon-config

The first interesting part here is the shimmed runner. Secondly, there is the configmap setup.
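As a possible alternative to shimming the runner image (this is not what was done above, and it is untested here): Docker Engine 24.0 and later accept a `default-network-opts` key in `daemon.json`, which sets default driver options for newly created networks. Assuming the `docker:dind` image in use is at least that version, a ConfigMap along these lines might apply the MTU to user-defined bridge networks as well as to the default bridge. The value 1400 and the resource names simply mirror the ConfigMap above.

```yaml
# Sketch only: assumes Docker Engine >= 24.0 inside the dind container.
apiVersion: v1
kind: ConfigMap
metadata:
  name: docker-daemon-config
  namespace: arc-runners
data:
  daemon.json: |
    {
      "mtu": 1400,
      "default-network-opts": {
        "bridge": {
          "com.docker.network.driver.mtu": "1400"
        }
      }
    }
```

The top-level `mtu` key covers the default bridge, while the `default-network-opts` entry is meant to cover bridge networks created afterwards without an explicit MTU option. The ConfigMap would be mounted at `/etc/docker/daemon.json` in the dind container exactly as in the values.yaml shown above.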
-
I just came across this discussion which helped me fix this same issue. Thank you!
and as some other people mentioned, I also had to uncomment/ignore the