Add HIP for obtaining output from hook jobs and pods #301
Open: z4ce wants to merge 3 commits into helm:main from z4ce:joboutput_hip
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,254 @@ | ||
--- | ||
hip: 9999 | ||
title: "New annotations for displaying hook output" | ||
authors: [ "Ian Zink <[email protected]>" ] | ||
created: "2023-01-26" | ||
type: "feature" | ||
status: "draft" | ||
--- | ||
|
||
## Abstract | ||
|
||
This HIP proposes a new annotation indicating that a hook's output should be displayed to the user. | ||
|
||
## Motivation | ||
|
||
The primary motivation for this HIP is the ability to run preflight checks and display their results before the main chart installs. This gives the user vital feedback about why a Helm chart cannot be installed successfully. | ||
|
||
Often it is important to verify that the Kubernetes cluster you are deploying a Helm chart into has certain properties. You might need to know that it has an Ingress controller or a certain amount of ephemeral storage, memory, or CPU available. You might want to validate that the service key the user provided is correct or that the database they entered is reachable. Letting a Helm install or upgrade fail and then making the user debug the failure is a poor user experience. All of these checks can be performed by hooks whose output is displayed as proposed in this HIP. | ||
|
||
Another common motivation is to make the output of a database migration that runs as a hook visible to the user when the migration fails. | ||
|
||
In general, allowing chart developers to run jobs and present their output directly to users could open up use cases beyond the preflight checks that motivated this HIP. For example, a chart could present CVE warnings or specific upgrade feedback instead of just a Helm install failure. | ||
|
||
## Rationale | ||
|
||
There are other ways this could be implemented. For example, we could add a separate preflight hook type. However, a new hook type would not be handled at all by previous versions of Helm. The annotation-based design requires minimal changes to Helm and preserves backwards compatibility. | ||
|
||
Another strategy would be for Helm to include Troubleshoot.sh as a library dependency, but this could couple the projects too tightly and reduce overall flexibility and adaptability. | ||
|
||
## Specification | ||
|
||
Templates may include the following annotations on Jobs or Pods: | ||
|
||
```yaml | ||
"helm.sh/hook": pre-install, pre-upgrade | ||
"helm.sh/hook-output-log-policy": hook-failed # or hook-succeeded or hook-failed,hook-succeeded | ||
``` | ||
|
||
|
||
`helm.sh/hook-output-log-policy` would tell Helm to display the logs of the hook's Job or Pod to the user when the hook ends in one of the listed states (`hook-failed`, `hook-succeeded`, or both). | ||
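
One way to read the annotation is as a comma-separated set of hook outcomes. The sketch below is only an illustration of that reading, not the reference implementation: the helper names `parseLogPolicy` and `shouldShowLogs` are hypothetical, and the whitespace trimming is an assumption modelled on the `hook-failed, hook-succeeded` form used in the examples.

```go
package main

import (
	"fmt"
	"strings"
)

// HookOutputLogPolicyAnnotation is the annotation key proposed in this HIP.
const HookOutputLogPolicyAnnotation = "helm.sh/hook-output-log-policy"

// parseLogPolicy splits the comma-separated annotation value into a set of
// hook outcomes such as "hook-failed" and "hook-succeeded".
// Hypothetical helper: the name and behaviour are assumptions.
func parseLogPolicy(annotations map[string]string) map[string]bool {
	policies := map[string]bool{}
	for _, p := range strings.Split(annotations[HookOutputLogPolicyAnnotation], ",") {
		// Tolerate spaces after commas, e.g. "hook-failed, hook-succeeded".
		if p = strings.TrimSpace(p); p != "" {
			policies[p] = true
		}
	}
	return policies
}

// shouldShowLogs reports whether logs should be displayed for a hook that
// finished with the given outcome. A missing annotation means "never".
func shouldShowLogs(annotations map[string]string, outcome string) bool {
	return parseLogPolicy(annotations)[outcome]
}

func main() {
	both := map[string]string{HookOutputLogPolicyAnnotation: "hook-failed, hook-succeeded"}
	fmt.Println(shouldShowLogs(both, "hook-failed"))                // true
	fmt.Println(shouldShowLogs(map[string]string{}, "hook-failed")) // false
}
```

Under this reading, a hook without the annotation behaves exactly as it does today, which matches the backwards-compatibility goal stated in the Rationale.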
|
||
Additionally, a new flag, `--no-log-output`, should be added that suppresses the display of hook logs. | ||
|
||
Additionally, a new field will be added to the action SDK configuration so that SDK consumers can obtain the output. | ||
By default this output will be written to stdout, but an SDK consumer can override `HookOutputFunc` to provide a custom writer. | ||
|
||
|
||
|
||
```go | ||
type Configuration struct { | ||
    ... | ||
    // HookOutputFunc is called with the namespace, pod, and container name and | ||
    // returns the writer that will receive that container's log output. | ||
    HookOutputFunc func(namespace, pod, container string) io.Writer | ||
} | ||
``` | ||
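
To illustrate how an SDK consumer might use this field, here is a minimal sketch that captures hook output in memory instead of printing it. It does not import Helm: the `Configuration` type below is a local stand-in reduced to the one proposed field, and `logCapture` with its methods is a hypothetical helper invented for this example.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// Configuration is a local stand-in for the SDK's action configuration,
// reduced to the single field proposed above. It is not the real Helm type.
type Configuration struct {
	HookOutputFunc func(namespace, pod, container string) io.Writer
}

// logCapture collects hook output per container. Hypothetical helper;
// it is not safe for concurrent use without additional locking.
type logCapture struct {
	buffers map[string]*bytes.Buffer
}

func newLogCapture() *logCapture {
	return &logCapture{buffers: map[string]*bytes.Buffer{}}
}

// writerFor has the HookOutputFunc signature. It returns one buffer per
// namespace/pod/container combination, creating it on first use.
func (c *logCapture) writerFor(namespace, pod, container string) io.Writer {
	key := namespace + "/" + pod + "/" + container
	if _, ok := c.buffers[key]; !ok {
		c.buffers[key] = &bytes.Buffer{}
	}
	return c.buffers[key]
}

func main() {
	capture := newLogCapture()
	cfg := Configuration{HookOutputFunc: capture.writerFor}

	// Simulate what Helm would do when copying a hook container's logs.
	w := cfg.HookOutputFunc("default", "my-release-false-job-bgbz6", "pre-install-job")
	fmt.Fprintln(w, "foo")

	for key, buf := range capture.buffers {
		fmt.Printf("%s: %s", key, buf.String())
	}
}
```

Because the callback receives the namespace, pod, and container, a consumer can keep output from different hook containers separate rather than interleaving it in one stream.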
|
||
## Backwards compatibility | ||
|
||
The only backwards compatibility concern is that scripts parsing `helm install` output would see additional text when logs are displayed. The fact that notes already make the output unstructured should mitigate this concern. Since we already trust chart developers to provide output in the form of notes, this is a logical extension that lets the developer provide more dynamic output. | ||
What if the Helm client wrote them to StdErr instead? This is where logs like this are usually written. You can even see it in the default Go logger.
|
||
|
||
## Security implications | ||
|
||
Preflight checks could detect security misconfigurations before installation, which could improve the security of the chart deployment. | ||
|
||
## How to teach this | ||
|
||
In the first instance, documentation plus the help text for `helm install` would explain the feature. | ||
|
||
An example template could be provided in documentation showing how to use this feature with a generic command used in a hook. | ||
|
||
A more advanced example showing how to use the new feature with Troubleshoot.sh to provide preflight checks could be linked in the documentation, provided directly in the documentation, or provided on the Troubleshoot.sh documentation site independently. | ||
|
||
## Reference implementation | ||
|
||
[Pull Request for Documentation](https://github.com/helm/helm-www/pull/1242) | ||
|
||
[Pull Request for Helm](https://github.com/helm/helm/pull/10309) - most upvoted open PR | ||
|
||
|
||
## Rejected ideas | ||
N/A | ||
|
||
## Open issues | ||
N/A | ||
|
||
## References | ||
|
||
[Troubleshoot.sh](https://troubleshoot.sh/) - the tool that is the motivation for this HIP. | ||
|
||
[safe-install plugin](https://github.com/z4ce/helm-safe-install) - Plugin that provides a similar experience to what I hope this HIP will provide natively. | ||
|
||
## Reference - Example Usage | ||
|
||
### Example using `false` | ||
|
||
Template: | ||
```yaml | ||
apiVersion: batch/v1 | ||
kind: Job | ||
metadata: | ||
  name: "{{ .Release.Name }}-false-job" | ||
  labels: | ||
    app.kubernetes.io/managed-by: {{ .Release.Service | quote }} | ||
    app.kubernetes.io/instance: {{ .Release.Name | quote }} | ||
    app.kubernetes.io/version: {{ .Chart.AppVersion }} | ||
    helm.sh/chart: "{{ .Chart.Name }}-{{ .Chart.Version }}" | ||
  annotations: | ||
    "helm.sh/hook": pre-install, pre-upgrade | ||
    "helm.sh/hook-output-log-policy": hook-failed, hook-succeeded | ||
    "helm.sh/hook-weight": "-5" | ||
    "helm.sh/hook-delete-policy": hook-succeeded, hook-failed | ||
|
||
spec: | ||
  backoffLimit: 0 | ||
  template: | ||
    metadata: | ||
      name: "{{ .Release.Name }}" | ||
      labels: | ||
        app.kubernetes.io/managed-by: {{ .Release.Service | quote }} | ||
        app.kubernetes.io/instance: {{ .Release.Name | quote }} | ||
        helm.sh/chart: "{{ .Chart.Name }}-{{ .Chart.Version }}" | ||
    spec: | ||
      restartPolicy: Never | ||
      containers: | ||
        - name: pre-install-job | ||
          image: "alpine:3.18" | ||
          command: ["sh", "-c", "echo foo ; false"] | ||
``` | ||
|
||
What it should look like when running: | ||
|
||
```text | ||
$ helm install my-release ./ | ||
Logs for pod: my-release-false-job-bgbz6, container: pre-install-job | ||
foo | ||
Error: INSTALLATION FAILED: failed pre-install: job failed: BackoffLimitExceeded | ||
``` | ||
|
||
### Example using Troubleshoot Preflight Checks | ||
|
||
```yaml | ||
apiVersion: batch/v1 | ||
kind: Job | ||
metadata: | ||
  name: "{{ .Release.Name }}-preflight-job" | ||
  labels: | ||
    app.kubernetes.io/managed-by: {{ .Release.Service | quote }} | ||
    app.kubernetes.io/instance: {{ .Release.Name | quote }} | ||
    app.kubernetes.io/version: {{ .Chart.AppVersion }} | ||
    helm.sh/chart: "{{ .Chart.Name }}-{{ .Chart.Version }}" | ||
  annotations: | ||
    "helm.sh/hook": pre-install, pre-upgrade | ||
    "helm.sh/hook-output-log-policy": hook-failed | ||
    "helm.sh/hook-weight": "-5" | ||
    "helm.sh/hook-delete-policy": hook-succeeded, hook-failed | ||
|
||
spec: | ||
  backoffLimit: 0 | ||
  template: | ||
    metadata: | ||
      name: "{{ .Release.Name }}" | ||
      labels: | ||
        app.kubernetes.io/managed-by: {{ .Release.Service | quote }} | ||
        app.kubernetes.io/instance: {{ .Release.Name | quote }} | ||
        helm.sh/chart: "{{ .Chart.Name }}-{{ .Chart.Version }}" | ||
    spec: | ||
      restartPolicy: Never | ||
      volumes: | ||
        - name: preflights | ||
          configMap: | ||
            name: "{{ .Release.Name }}-preflight-config" | ||
      containers: | ||
        - name: pre-install-job | ||
          image: "replicated/preflight:latest" | ||
          command: ["preflight", "--interactive=false", "/preflights/preflights.yaml"] | ||
          volumeMounts: | ||
            - name: preflights | ||
              mountPath: /preflights | ||
|
||
--- | ||
apiVersion: v1 | ||
kind: ConfigMap | ||
metadata: | ||
  annotations: | ||
    "helm.sh/hook": pre-install, pre-upgrade | ||
    "helm.sh/hook-weight": "-6" | ||
    "helm.sh/hook-delete-policy": hook-succeeded, hook-failed | ||
  labels: | ||
    app.kubernetes.io/managed-by: {{ .Release.Service | quote }} | ||
    app.kubernetes.io/instance: {{ .Release.Name | quote }} | ||
    app.kubernetes.io/version: {{ .Chart.AppVersion }} | ||
    helm.sh/chart: "{{ .Chart.Name }}-{{ .Chart.Version }}" | ||
  name: "{{ .Release.Name }}-preflight-config" | ||
data: | ||
  preflights.yaml: | | ||
    apiVersion: troubleshoot.sh/v1beta2 | ||
    kind: Preflight | ||
    metadata: | ||
      name: preflight-tutorial | ||
    spec: | ||
      collectors: | ||
        {{ if eq .Values.mariadb.enabled false }} | ||
        - mysql: | ||
            collectorName: mysql | ||
            uri: '{{ .Values.externalDatabase.user }}:{{ .Values.externalDatabase.password }}@tcp({{ .Values.externalDatabase.host }}:{{ .Values.externalDatabase.port }})/{{ .Values.externalDatabase.database }}?tls=false' | ||
        {{ end }} | ||
      analyzers: | ||
        - clusterVersion: | ||
            outcomes: | ||
              - fail: | ||
                  when: "< 1.16.0" | ||
                  message: The application requires at least Kubernetes 1.16.0, and recommends 1.18.0. | ||
                  uri: https://kubernetes.io | ||
              - warn: | ||
                  when: "< 1.18.0" | ||
                  message: Your cluster meets the minimum version of Kubernetes, but we recommend you update to 1.18.0 or later. | ||
                  uri: https://kubernetes.io | ||
              - pass: | ||
                  message: Your cluster meets the recommended and required versions of Kubernetes. | ||
        {{ if eq .Values.mariadb.enabled false }} | ||
        - mysql: | ||
            checkName: Must be MySQL 8.x or later | ||
            collectorName: mysql | ||
            outcomes: | ||
              - fail: | ||
                  when: connected == false | ||
                  message: Cannot connect to MySQL server | ||
              - fail: | ||
                  when: version < 8.x | ||
                  message: The MySQL server must be at least version 8 | ||
              - pass: | ||
                  message: The MySQL server is ready | ||
        {{ end }} | ||
``` | ||
|
||
Which should yield the following output to stdout: | ||
|
||
```text | ||
$ helm install my-release ./ | ||
Logs for pod: my-release-preflight-job-bgbz6, container: pre-install-job | ||
--- FAIL: Required Kubernetes Version | ||
--- The application requires at least Kubernetes 1.16.0, and recommends 1.18.0. | ||
--- FAIL: Must be MySQL 8.x or later | ||
--- Cannot connect to MySQL server | ||
--- FAIL preflight-tutorial | ||
FAILED | ||
name: cluster-resources status: completed completed: 1 total: 3 | ||
name: mysql/mysql status: running completed: 1 total: 3 | ||
name: mysql/mysql status: completed completed: 2 total: 3 | ||
name: cluster-info status: running completed: 2 total: 3 | ||
name: cluster-info status: completed completed: 3 total: 3 | ||
|
||
Error: INSTALLATION FAILED: failed pre-install: job failed: BackoffLimitExceeded | ||
|
||
``` |
I don't think the SDK should write to StdOut. StdErr is where logs usually go. Consider a service, like Flux, running this update. How can things be low impact on them? What about writing to `io.Discard` by default? They set a writer if they want to capture the output. This is how they can opt in to getting that output.