Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(manifests): fix full CRDs and update tests to use them. Fixes #8532 #14044

Merged
merged 13 commits into from
Jan 16, 2025
Merged
Show file tree
Hide file tree
Changes from 8 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 0 additions & 8 deletions api/jsonschema/schema.json

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

8 changes: 0 additions & 8 deletions api/openapi-spec/swagger.json

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

8 changes: 2 additions & 6 deletions docs/executor_swagger.md
Original file line number Diff line number Diff line change
Expand Up @@ -983,7 +983,7 @@ ConfigMap volumes support ownership management and SELinux relabeling.
| template | string| `string` | | | Name of template to execute | |
| templateRef | [TemplateRef](#template-ref)| `TemplateRef` | | | | |
| when | string| `string` | | | When is an expression in which the task should conditionally execute | |
| withItems | [][Item](#item)| `[]Item` | | | WithItems expands a task into multiple parallel tasks from the items in the list | |
| withItems | [][Item](#item)| `[]Item` | | | WithItems expands a task into multiple parallel tasks from the items in the list</br>+kubebuilder:validation:Schemaless</br>+kubebuilder:pruning:PreserveUnknownFields | |
| withParam | string| `string` | | | WithParam expands a task into multiple parallel tasks from the value in the parameter,</br>which is expected to be a JSON list. | |
| withSequence | [Sequence](#sequence)| `Sequence` | | | | |

Expand Down Expand Up @@ -1856,7 +1856,6 @@ ISCSI volumes support ownership management and SELinux relabeling.


> +protobuf.options.(gogoproto.goproto_stringer)=false
+kubebuilder:validation:Type=object


Expand Down Expand Up @@ -2511,11 +2510,8 @@ be cluster-scoped, so there is no namespace field.
### <span id="parallel-steps"></span> ParallelSteps


> +kubebuilder:validation:Type=array




[interface{}](#interface)

### <span id="parameter"></span> Parameter
Expand Down Expand Up @@ -3391,7 +3387,7 @@ cause implementors to also use a fixed point implementation.
| resources | [ResourceRequirements](#resource-requirements)| `ResourceRequirements` | | | | |
| restartPolicy | [ContainerRestartPolicy](#container-restart-policy)| `ContainerRestartPolicy` | | | | |
| securityContext | [SecurityContext](#security-context)| `SecurityContext` | | | | |
| source | string| `string` | | | Source contains the source code of the script to execute | |
| source | string| `string` | | | Source contains the source code of the script to execute</br>+optional | |
| startupProbe | [Probe](#probe)| `Probe` | | | | |
| stdin | boolean| `bool` | | | Whether this container should allocate a buffer for stdin in the container runtime. If this</br>is not set, reads from stdin in the container will always result in EOF.</br>Default is false.</br>+optional | |
| stdinOnce | boolean| `bool` | | | Whether the container runtime should close the stdin channel after it has been opened by</br>a single attach. When stdin is true the stdin stream will remain open across multiple attach</br>sessions. If stdinOnce is set to true, stdin is opened on container start, is empty until the</br>first client attaches to stdin, and then remains open and accepts data until the client disconnects,</br>at which time stdin is closed and remains closed until the container is restarted. If this</br>flag is false, a container processes that reads from stdin will never receive an EOF.</br>Default is false</br>+optional | |
Expand Down
10 changes: 10 additions & 0 deletions docs/installation.md
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,16 @@ You can use Kustomize to patch your preferred [configurations](managed-namespace

You can install Argo Workflows using the community maintained [Helm charts](https://github.com/argoproj/argo-helm).

### Full CRDs

The official release manifests come with stripped-down CRDs that omit validation information.
This is a workaround for [Kubernetes size limitations](https://github.com/kubernetes/kubernetes/issues/82292) when using client-side apply.
As of version 3.7, the full CRDs can be installed using [server-side apply](https://kubernetes.io/docs/reference/using-api/server-side-apply/) via the following command:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this doesn't go out as part of 3.7, I'll remove this before it's merged

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍


```bash
kubectl apply --server-side --kustomize https://github.com/argoproj/argo-workflows/manifests/base/crds/full?ref=v3.7.0
```

## Installation options

Determine your base installation option.
Expand Down
2 changes: 1 addition & 1 deletion hack/manifests/crdgen.sh
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ add_header() {
mv tmp "$1"
}

controller-gen crd:maxDescLen=0 paths=./pkg/apis/... output:dir=manifests/base/crds/full
controller-gen crd:maxDescLen=0,generateEmbeddedObjectMeta=true paths=./pkg/apis/... output:dir=manifests/base/crds/full

find manifests/base/crds/full -name 'argoproj.io*.yaml' | while read -r file; do
# remove junk fields
Expand Down
173 changes: 86 additions & 87 deletions hack/manifests/crds.go
Original file line number Diff line number Diff line change
@@ -1,108 +1,107 @@
package main

import (
"fmt"
"os"
"path/filepath"

"sigs.k8s.io/yaml"
)
import "fmt"

// cleanCRD does post-processing of the CRDs generated by kubebuilder to workaround limitations
// (e.g. https://github.com/kubernetes-sigs/controller-tools/issues/461)
func cleanCRD(filename string) {
data, err := os.ReadFile(filepath.Clean(filename))
if err != nil {
panic(err)
}
crd := make(obj)
err = yaml.Unmarshal(data, &crd)
if err != nil {
panic(err)
}
delete(crd, "status")
metadata := crd["metadata"].(obj)
delete(metadata, "annotations")
delete(metadata, "creationTimestamp")
spec := crd["spec"].(obj)
versions := spec["versions"].([]interface{})
version := versions[0].(obj)
schema := version["schema"].(obj)["openAPIV3Schema"].(obj)
name := crd["metadata"].(obj)["name"].(string)
switch name {
crd := ParseYaml(Read(filename))
crd.RemoveNestedField("status")
crd.RemoveNestedField("metadata", "annotations")
crd.RemoveNestedField("metadata", "creationTimestamp")
schema := crd.OpenAPIV3Schema()
switch crd.Name() {
case "cronworkflows.argoproj.io":
properties := schema["properties"].(obj)["spec"].(obj)["properties"].(obj)["workflowSpec"].(obj)["properties"].(obj)["templates"].(obj)["items"].(obj)["properties"]
properties.(obj)["container"].(obj)["required"] = []string{"image"}
properties.(obj)["script"].(obj)["required"] = []string{"image", "source"}
case "clusterworkflowtemplates.argoproj.io", "workflows.argoproj.io", "workflowtemplates.argoproj.io":
properties := schema["properties"].(obj)["spec"].(obj)["properties"].(obj)["templates"].(obj)["items"].(obj)["properties"]
properties.(obj)["container"].(obj)["required"] = []string{"image"}
properties.(obj)["script"].(obj)["required"] = []string{"image", "source"}
}
data, err = yaml.Marshal(crd)
if err != nil {
panic(err)
}
err = os.WriteFile(filename, data, 0o600)
if err != nil {
panic(err)
patchMetadata(&schema, "spec", "properties", "workflowMetadata", "properties")
patchWorkflowSpecTemplateFields(&schema, "spec", "properties", "workflowSpec", "properties")
case "clusterworkflowtemplates.argoproj.io", "workflowtemplates.argoproj.io":
patchWorkflowSpecTemplateFields(&schema, "spec", "properties")
case "workflows.argoproj.io":
patchWorkflowSpecTemplateFields(&schema, "spec", "properties")
patchWorkflowSpecTemplateFields(&schema, "status", "properties", "storedWorkflowTemplateSpec", "properties")
patchTemplateFields(&schema, "status", "properties", "storedTemplates", "additionalProperties", "properties")
patchEphemeralVolumeFields(&schema, "status", "properties", "persistentVolumeClaims", "items", "properties")
case "workfloweventbindings.argoproj.io":
patchMetadata(&schema, "spec", "properties", "submit", "properties", "metadata", "properties")
case "workflowtasksets.argoproj.io":
patchVolumeFields(&schema, "spec", "properties", "tasks", "additionalProperties", "properties")

}
crd.WriteYaml(filename)
}

// minimizeCRD is a workaround for two separate issues:
// 1. "Request entity too large: limit is 3145728" errors due to https://github.com/kubernetes/kubernetes/issues/82292
// 2. "spec.validation.openAPIV3Schema.properties[spec].properties[tasks].additionalProperties.properties[steps].items.items: Required value: must be specified" due to kubebuilder issues (https://github.com/argoproj/argo-workflows/pull/3809#discussion_r472383090)
func minimizeCRD(filename string) {
data, err := os.ReadFile(filepath.Clean(filename))
if err != nil {
panic(err)
}
func patchWorkflowSpecTemplateFields(schema *obj, baseFields ...string) {
patchTemplateFields(schema, append(baseFields, "templateDefaults", "properties")...)
patchTemplateFields(schema, append(baseFields, "templates", "items", "properties")...)
patchVolumeFields(schema, baseFields...)
patchMetadata(schema, append(baseFields, "volumeClaimTemplates", "items", "properties", "metadata", "properties")...)
}

shouldMinimize := false
if len(data) > 1024*1024 {
fmt.Printf("Minimizing %s due to CRD size (%d) exceeding 1MB\n", filename, len(data))
shouldMinimize = true
}
// Workaround for missing generateName (see https://github.com/kubernetes-sigs/controller-tools/pull/640)
func patchMetadata(schema *obj, baseFields ...string) {
schema.SetNestedField(map[string]string{"type": "string"}, append(baseFields, "generateName")...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yuck. Nice fix.

}

crd := make(obj)
err = yaml.Unmarshal(data, &crd)
if err != nil {
panic(err)
}
func patchEphemeralVolumeFields(schema *obj, baseFields ...string) {
patchMetadata(schema, append(baseFields, "ephemeral", "properties", "volumeClaimTemplate", "properties", "metadata", "properties")...)
}

if !shouldMinimize {
name := crd["metadata"].(obj)["name"].(string)
switch name {
case "cronworkflows.argoproj.io", "clusterworkflowtemplates.argoproj.io", "workflows.argoproj.io", "workflowtemplates.argoproj.io", "workflowtasksets.argoproj.io":
fmt.Printf("Minimizing %s due to kubebuilder issues\n", filename)
shouldMinimize = true
}
}
func patchVolumeFields(schema *obj, baseFields ...string) {
patchEphemeralVolumeFields(schema, append(baseFields, "volumes", "items", "properties")...)
}

if !shouldMinimize {
return
}
func patchTemplateFields(schema *obj, baseFields ...string) {
patchVolumeFields(schema, baseFields...)
// "container" and "script" templates embed the k8s.io/api/core/v1/Container
// struct, which requires the "name" field to be specified, but it's not actually required,
// and there's no easy way to override validating rules for embedded structs in kubebuilder.
schema.RemoveNestedField(append(baseFields, "container", "required")...)
schema.RemoveNestedField(append(baseFields, "script", "required")...)

crd = stripSpecAndStatusFields(crd)
// Hack to transform "steps" from an array of objects, e.g.
// steps:
// - steps:
// - name: foo
// template: bar
// to an array of arrays, e.g.:
// steps:
// - - name: foo
// template: bar
//
// The reason it's represented as an array of objects in "workflow_types.go"
// is because that's the only way to get kubebuilder to generate the spec
// without breaking API compatibility.
stepFields := append(baseFields, "steps", "items")
schema.CopyNestedField(append(stepFields, "properties", "steps"), stepFields)
}

data, err = yaml.Marshal(crd)
if err != nil {
panic(err)
// minimizeCRD generates a stripped-down CRD as a workaround for "Request entity too large: limit is 3145728" errors due to https://github.com/kubernetes/kubernetes/issues/82292.
// This isn't an issue if you use server-side apply, but not everyone has that available.
func minimizeCRD(filename string) {
data := Read(filename)
shouldMinimize := false
// TODO: The 512KB limit is just to avoid changing too much in https://github.com/argoproj/argo-workflows/pull/14044
// Increase this back to 1MB after that's merged.
if len(data) > 512*1024 {
fmt.Printf("Minimizing %s due to CRD size (%d) exceeding 512KB\n", filename, len(data))
shouldMinimize = true
}
err = os.WriteFile(filename, data, 0o600)
if err != nil {
panic(err)
if !shouldMinimize {
return
}
crd := ParseYaml(data)
stripSpecAndStatusFields(crd)
crd.WriteYaml(filename)
}

// stripSpecAndStatusFields strips the "spec" and "status" fields from the CRD, as those are usually the largest.
func stripSpecAndStatusFields(crd obj) obj {
spec := crd["spec"].(obj)
versions := spec["versions"].([]interface{})
version := versions[0].(obj)
properties := version["schema"].(obj)["openAPIV3Schema"].(obj)["properties"].(obj)
for k := range properties {
if k == "spec" || k == "status" {
properties[k] = obj{"type": "object", "x-kubernetes-preserve-unknown-fields": true, "x-kubernetes-map-type": "atomic"}
}
func stripSpecAndStatusFields(crd *obj) {
schema := crd.OpenAPIV3Schema()
preserveMarker := obj{"type": "object", "x-kubernetes-preserve-unknown-fields": true, "x-kubernetes-map-type": "atomic"}
if _, ok := schema["spec"]; ok {
schema["spec"] = preserveMarker
}
if _, ok := schema["status"]; ok {
schema["status"] = preserveMarker
}
return crd
}
75 changes: 75 additions & 0 deletions hack/manifests/helpers.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,75 @@
package main

import (
"fmt"
"os"
"path/filepath"

"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
"sigs.k8s.io/yaml"
)

type obj map[string]interface{}

func (o *obj) RemoveNestedField(fields ...string) {
unstructured.RemoveNestedField(*o, fields...)
}

func (o *obj) SetNestedField(value interface{}, fields ...string) {
parentField := nestedFieldNoCopy[map[string]interface{}](o, fields[:len(fields)-1]...)
parentField[fields[len(fields)-1]] = value
}

func (o *obj) CopyNestedField(sourceFields []string, targetFields []string) {
value := nestedFieldNoCopy[any](o, sourceFields...)
o.SetNestedField(value, targetFields...)
}

func (o *obj) Name() string {
return nestedFieldNoCopy[string](o, "metadata", "name")
}

func (o *obj) OpenAPIV3Schema() obj {
versions := nestedFieldNoCopy[[]interface{}](o, "spec", "versions")
version := obj(versions[0].(map[string]interface{}))
return nestedFieldNoCopy[map[string]interface{}](&version, "schema", "openAPIV3Schema", "properties")
}

func nestedFieldNoCopy[T any](o *obj, fields ...string) T {
value, found, err := unstructured.NestedFieldNoCopy(*o, fields...)
if !found {
panic(fmt.Sprintf("failed to find field %v", fields))
}
if err != nil {
panic(err.Error())
}
return value.(T)
}

func (o *obj) WriteYaml(filename string) {
data, err := yaml.Marshal(o)
if err != nil {
panic(err)
}
err = os.WriteFile(filename, data, 0o600)
if err != nil {
panic(err)
}
}

func Read(filename string) []byte {
data, err := os.ReadFile(filepath.Clean(filename))
if err != nil {
panic(err)
}
return data
}

func ParseYaml(data []byte) *obj {
crd := make(obj)
err := yaml.Unmarshal(data, &crd)
if err != nil {
panic(err)
}
return &crd
}
3 changes: 0 additions & 3 deletions hack/manifests/types.go

This file was deleted.

4 changes: 2 additions & 2 deletions manifests/base/crds/full/README.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# Full CRDs

These CRDs have the full schema.
As a result, they are large and probably not suitable to be used in your cluster due to [kubernetes/kubernetes#82292](https://github.com/kubernetes/kubernetes/issues/82292).
These CRDs have the full schema and must be applied using [server-side apply](https://kubernetes.io/docs/reference/using-api/server-side-apply/) to avoid hitting size limits.
See [Full CRDs](https://argo-workflows.readthedocs.io/en/latest/installation/#full-crds) for more details.
Loading
Loading