HIP: images lock file #281

Open
wants to merge 8 commits into
base: main
186 changes: 186 additions & 0 deletions hips/hip-0019.md
@@ -0,0 +1,186 @@
---
hip: 9999
title: "Add support for an images-lock.yml file"
authors: [ "Javier Salmerón García [email protected]", "Martín Pérez [email protected]", "Pablo Caraballo Llorente [email protected]" ]
created: "2022-12-16"
type: "feature"
status: "draft"
---

## Abstract

This HIP builds on top of [HIP-0015](https://github.com/helm/community/blob/main/hips/hip-0015.md), taking it a step further.

While the original document described a standard way of declaring Helm Charts' images through `annotations`, here we propose a complementary way of generating a complete snapshot of all the container images that exist at the time a Helm Chart is packaged: **an images lock file**. This file would be uploaded to OCI registries alongside the Charts, providing both the evidence and the inventory data needed to simplify moving Helm Charts between registries and deploying Helm Charts in air-gapped environments.

## Motivation

Container images are stored separately from Helm Charts. On the one hand, this keeps the Helm Chart very small. On the other hand, it brings many challenges in terms of distribution and network access to images, which could be avoided if the application were self-contained.

@gjenkins8 gjenkins8 Jan 25, 2023


From my experience, there are two differing ways helm charts are used:

  1. The chart "floats" around the application. That is, Helm chart version X can be used to install a range of application versions Y. For example, https://artifacthub.io/packages/helm/kubegemsapp/nginx#nginx-parameters has the `image.tag` property, which effectively corresponds to the version of nginx being deployed (and has little correspondence to the Helm chart version there).

  2. The chart version is synonymous with the application version. That is, Helm chart version X is used to deploy application version X. In this case the chart must not allow parameterization of image tags. I don't know of an example in the wild, but we follow this pattern internally at my company for delivering software.

I think for this HIP to practically succeed, there needs to be a way to account for the first scenario. But don't get me wrong, also enhancing/formalizing a methodology for the second scenario would be fantastic. Very glad to see this HIP!


This is great feedback @gjenkins8. I don't see much of an impediment in this proposal for 1. or 2. Basically, what we are asking of the Helm ecosystem is a way to generate a bill of materials at some agreed lifecycle command (e.g. `helm dependency update`). Today that command generates a `Chart.lock`, but in the future it could further decorate that lock or generate a different file listing the container image dependencies.

Those container image annotations don't need to point to a specific version. For example, a chart using the nginx container could declare the dependency as 1.23.3, 1.23, 1, or not declare it at all and use latest. The lock file just states what was available at the given tag at dependency update time. As a Helm Chart is just a recipe, a user should always be able to use `--set image.tag=1.12` and get an nginx deployment with nginx 1.12.

Now, let's say some new OCI/relocation/deployment/verification tooling exists built on top of this HIP and a user wants to use the latest nginx Helm Chart with nginx 1.12. That user, or automation, could simply change the annotation and pin that Helm Chart to that specific container image, or could manually update the images lock (discouraged). Perhaps there are even more ways to do this.


This decoupling, in addition to the lack of an image inventory, makes Helm difficult to use in air-gapped scenarios. An organization trying to install a Helm Chart with no connection to external repositories would first have to:

* Migrate all the container images the Helm Chart relies on to its isolated internal registry (same for `dependencies`).
* Update the Helm Chart with the new references (same for `dependencies`).
* Push the modified Helm Chart to the isolated registry (same for `dependencies`).
* Install the Helm Chart.

All of this has to be done through manual steps, or, in many cases, companies end up building their own [bespoke internal tooling](https://github.com/vmware-tanzu/asset-relocation-tool-for-kubernetes).

In addition to this, there's no standard way of declaring the images a Helm Chart depends on. This forces software providers to rely on self-made conventions for the values files plus external tooling, or to declare these images in other places (again with another self-made structure). Following the example above, the same organization would then have to:

* Identify the container images the Helm chart relies on from all different combinations of values that the Helm Chart can take as input (same for `dependencies`).
* Migrate the container images the Helm Chart relies on to the isolated registry (same for `dependencies`).
* Update the Helm Chart with the new references (same for `dependencies`).
* Push the modified Helm Chart to the isolated registry (same for `dependencies`).
* Install the Helm Chart.

If Helm provided support for a container images lock file, many of these issues could be addressed as part of the standard Helm tooling, automating the process and simplifying the work needed to move applications between registries.
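To make the "update the Helm Chart with the new references" step more concrete, here is a minimal sketch of rewriting image references so they point to an isolated registry. It is only an illustration: the function names and the `registry.internal.example` host are hypothetical, and it assumes every reference is fully qualified (it starts with a registry host), as in the examples used in this HIP.

```python
def relocate_image(image: str, new_registry: str) -> str:
    """Swap the registry host of a fully qualified image reference.

    Assumption: the reference starts with a registry host, e.g.
    docker.io/bitnami/wordpress:6.0.2-debian-11-r9.
    """
    _old_registry, sep, remainder = image.partition("/")
    if not sep:
        raise ValueError(f"expected a fully qualified reference, got {image!r}")
    # Tolerate a trailing slash in the target registry/prefix.
    return f"{new_registry.rstrip('/')}/{remainder}"


def relocate_images(images: list[str], new_registry: str) -> list[str]:
    """Relocate every image in an inventory to the new registry."""
    return [relocate_image(image, new_registry) for image in images]


if __name__ == "__main__":
    inventory = [
        "docker.io/bitnami/wordpress:6.0.2-debian-11-r9",
        "docker.io/bitnami/memcached:1.6.17-debian-11-r6",
    ]
    for reference in relocate_images(inventory, "registry.internal.example"):
        print(reference)
```

The rewrite itself is trivial once the inventory exists; the hard part today is obtaining a complete and trustworthy `inventory`, which is what the images lock file is meant to provide.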

## Rationale

During the drafting of how to prepare an accurate images lock file, several alternatives found on the Internet were considered and discarded.

The first option one could think of is rendering the Helm Chart and parsing the images from the resulting Kubernetes manifests. From that point, it would be relatively easy to do the rest. However, post-render attempts at getting the image list aren't reliable, as the manifests change depending on the contents of the values files or the user input.

Some vendors like ArtifactHub use [annotations](https://github.com/artifacthub/hub/blob/master/charts/artifact-hub/Chart.yaml#L80), others like OpenShift use their own [proprietary fields](https://github.com/mongodb/helm-charts/blob/39df2aa9bc77e9f34c02c18ab7dbac172e5134a1/charts/enterprise-operator/values-openshift.yaml) in values files. Both end up in the same basket: promoting self-made conventions that are neither backed by an official specification nor natively supported by Helm.

A third approach would be to `grep` the image references out of the Chart files. Again, this ends up promoting self-made conventions in order to make the _grepping_ work. While discussing this point, we also considered Operators: these components may rely on images that aren't declared in the Helm files but are pulled directly from the source code.

Thus, we came to the conclusion that:

* [HIP-0015](https://github.com/helm/community/blob/main/hips/hip-0015.md) already pushes for declaring lists of container images through official `annotations`.
* Annotations are user-friendly (easy to write) but not machine-friendly (they lack a specification, miss digests...).
* An `images-lock.yaml` (or similarly named) file would be backed by a spec, and it could contain other useful information, such as the images' digests, that supply chains can use to enforce additional good security practices.
* Helm could internally generate this file out of the `annotations` proposed in `HIP-0015`.
* Helm could apply an existing `images-lock.yaml` file to the _post-rendered_ manifests to enforce a BOM (just like other tools such as `npm` do with their `package-lock.json`).
* By parsing this file, Helm could help users to migrate Charts between registries, either by providing new commands or through new features in the existing commands.


This could also be developed as a plugin initially.


Yes, this is a good point @joejulian. It might actually be worth starting this as a plugin.


Took a while due to other priorities but we ended up following your advice @joejulian https://github.com/vmware-labs/distribution-tooling-for-helm
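
As an illustration of the BOM enforcement idea from the Rationale list (checking post-rendered manifests against the lock), the following is a rough sketch rather than a proposed implementation. It assumes the locked image references are already available as a set of strings, and it finds images with a deliberately naive line-based regular expression instead of a real YAML parser. The function names are hypothetical.

```python
import re

# Naive matcher for "image: <ref>" and "- image: <ref>" lines in a rendered
# manifest. A real implementation would walk the parsed Kubernetes objects.
IMAGE_LINE = re.compile(
    r"""^[ \t]*(?:-[ \t]+)?image:[ \t]*["']?([^\s"']+)""",
    re.MULTILINE,
)


def rendered_images(manifest: str) -> set[str]:
    """Collect every image reference found in a rendered manifest."""
    return set(IMAGE_LINE.findall(manifest))


def images_outside_lock(manifest: str, locked_images: set[str]) -> set[str]:
    """Return the rendered images that the lock does not account for."""
    return rendered_images(manifest) - locked_images


if __name__ == "__main__":
    manifest = """\
spec:
  containers:
    - name: wordpress
      image: docker.io/bitnami/wordpress:6.0.2-debian-11-r9
    - name: sidecar
      image: "docker.io/library/nginx:latest"
"""
    locked = {"docker.io/bitnami/wordpress:6.0.2-debian-11-r9"}
    print(images_outside_lock(manifest, locked))
```

A non-empty result would mean the rendered release uses an image the lock does not list, which tooling could report or reject depending on policy.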


## Specification

This is an example of a potential `images-lock` file that we would love to get feedback on:

```yaml
apiVersion: v0
kind: ImagesLock
metadata:
  generatedAt: "2022-09-26T14:43:00.026746763Z"
  generatedBy: Helm
chart:
  name: wordpress
  version: 10.3.6
  location: oci://an-oci-repo.com/bitnami/wordpress:10.3.6
  digests:
    - digest: sha256:60f13c95c1a2dd37f25fe31e9cf3525704a544261a0b4c6b36f212990911b2c9
      arch: linux/amd64
images:
  - name: wordpress
    image: docker.io/bitnami/wordpress:6.0.2-debian-11-r9
    digests:
      - digest: sha256:b4cb055a643d1d51d1678f3e011cf008b227852bc0a4118126deeed844ccfb6a
        arch: linux/amd64
      - digest: sha256:4d2760961bdd2e9956d752d637e7795e2eea973c25825bd18cb881c3edeee8ff
        arch: linux/arm64
    chart: wordpress
  - name: bitnami-shell
    image: docker.io/bitnami/bitnami-shell:11-debian-11-r37
    digests:
      - digest: sha256:0946f1d16c010bed0a47e3c09ff193d898c868044bb4c1b037c83b7c53c26deb
        arch: linux/amd64
    chart: wordpress
  - name: apache-exporter
    image: docker.io/bitnami/apache-exporter:0.11.0-debian-11-r42
    digests:
      - digest: sha256:ea4aa97834500297e4252f5f98492c52a7bfd1993bd5ef4e485ae84996b1c63a
        arch: linux/amd64
    chart: wordpress
  - name: memcached
    image: docker.io/bitnami/memcached:1.6.17-debian-11-r6
    digests:
      - digest: sha256:337a65b5c6e98e4f91606caf66b09b40b21f30b0a67749d5aa705eca1d501be5
        arch: linux/amd64
    chart: memcached
```
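
To show how tooling might consume such a file, here is a small sketch that resolves an image entry to a digest-pinned reference for a given architecture. It assumes the lock has already been parsed into a plain dictionary with the shape shown above (only a subset is reproduced). The function names are hypothetical and not part of any existing Helm API.

```python
def strip_tag(image: str) -> str:
    """Drop the tag from an image reference, leaving registry ports intact."""
    last_slash = image.rfind("/")
    colon = image.rfind(":")
    # A colon before the last slash belongs to a registry port, not a tag.
    return image[:colon] if colon > last_slash else image


def pinned_reference(lock: dict, name: str, arch: str) -> str:
    """Return '<repository>@<digest>' for the named image and architecture."""
    for entry in lock["images"]:
        if entry["name"] != name:
            continue
        for candidate in entry["digests"]:
            if candidate["arch"] == arch:
                return f"{strip_tag(entry['image'])}@{candidate['digest']}"
        raise LookupError(f"no digest for {name!r} on {arch!r}")
    raise LookupError(f"image {name!r} is not in the lock")


# Subset of the example lock above, as it might look once parsed.
EXAMPLE_LOCK = {
    "images": [
        {
            "name": "wordpress",
            "image": "docker.io/bitnami/wordpress:6.0.2-debian-11-r9",
            "digests": [
                {
                    "digest": "sha256:b4cb055a643d1d51d1678f3e011cf008b227852bc0a4118126deeed844ccfb6a",
                    "arch": "linux/amd64",
                },
                {
                    "digest": "sha256:4d2760961bdd2e9956d752d637e7795e2eea973c25825bd18cb881c3edeee8ff",
                    "arch": "linux/arm64",
                },
            ],
            "chart": "wordpress",
        }
    ]
}

if __name__ == "__main__":
    print(pinned_reference(EXAMPLE_LOCK, "wordpress", "linux/arm64"))
```

Pinning by digest rather than by tag is what lets the lock act as evidence: a tag can be moved to a different image, while a digest cannot.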

Ideally, the images lock would only contain references to images needed within the main Helm Chart. This assumes that the dependencies already have their own images lock. Yet, the specification should allow pinning images used outside the main Helm Chart (that is, in the dependencies), as we're doing here with `memcached`.

This file should be distributed either alongside or within the Helm Chart. This means that Helm should either package the `images-lock.yml` file within the chart before distributing it, or push it to an OCI registry as another blob. If the second approach is chosen, the blob could be named after the Helm Chart checksum, similarly to what [cosign](https://github.com/sigstore/cosign#sign-a-container-and-store-the-signature-in-the-registry) does with signatures.
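
For reference, cosign stores a signature under a tag derived from the signed artifact's digest (`sha256:<hex>` becomes `sha256-<hex>.sig`). A comparable naming scheme for the lock blob could be sketched as follows. The `imageslock` suffix and the function name are assumptions made for illustration only. This HIP does not define them.

```python
def lock_tag_for(chart_digest: str, suffix: str = "imageslock") -> str:
    """Derive a registry tag for the lock blob from the chart digest.

    Mirrors the cosign convention of replacing ':' with '-' and appending
    a suffix; the default suffix here is purely hypothetical.
    """
    algorithm, sep, hex_value = chart_digest.partition(":")
    if not sep or not algorithm or not hex_value:
        raise ValueError(f"expected '<algorithm>:<hex>', got {chart_digest!r}")
    return f"{algorithm}-{hex_value}.{suffix}"


if __name__ == "__main__":
    digest = "sha256:60f13c95c1a2dd37f25fe31e9cf3525704a544261a0b4c6b36f212990911b2c9"
    print(lock_tag_for(digest))
```

Deriving the tag from the chart digest means anyone holding the chart can compute where its lock should live without any extra index.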

In addition to this, if the lock file is available, we could create a Helm built-in object called `Images`, just like we have `Chart` or `Release`. This would be useful for generating the template YAML, and it would avoid replicating the image information in both the lock and the `values.yaml` file.

As an alternative to the `images-lock.yml` file, this bill of materials could live inside the existing `Chart.lock`. While this reuses an existing file, which is in part intended to accommodate this kind of metadata, it also implies that a packaged chart may bundle different content than the original (e.g. if we took Bitnami's [WordPress](https://github.com/bitnami/charts/blob/main/bitnami/wordpress/Chart.lock) chart and packaged it, the resulting `Chart.lock` would differ from the original). Yet, this could be addressed by tracking the `Chart.lock` file with images in `git`, or by not tracking it at all (as other users may do). This debate already exists for other package managers and has been ongoing for a while. A minimal draft of the `Chart.lock` file could be as follows:

```yaml
dependencies:
  - name: memcached
    repository: https://charts.bitnami.com/bitnami
    version: 6.3.2
  - name: mariadb
    repository: https://charts.bitnami.com/bitnami
    version: 11.4.1
  - name: common
    repository: https://charts.bitnami.com/bitnami
    version: 2.2.2
digest: sha256:5a55c2b684433bf6fa9efcf0d3c4ab23ceef313808203b766c485b6e03bda60f
generated: "2022-12-13T02:28:18.307577267Z"
images:
  - name: wordpress
    image: docker.io/bitnami/wordpress:6.0.2-debian-11-r9
    digests:
      - digest: sha256:b4cb055a643d1d51d1678f3e011cf008b227852bc0a4118126deeed844ccfb6a
        arch: linux/amd64
      - digest: sha256:4d2760961bdd2e9956d752d637e7795e2eea973c25825bd18cb881c3edeee8ff
        arch: linux/arm64
    chart: wordpress
  - name: bitnami-shell
    image: docker.io/bitnami/bitnami-shell:11-debian-11-r37
    digests:
      - digest: sha256:0946f1d16c010bed0a47e3c09ff193d898c868044bb4c1b037c83b7c53c26deb
        arch: linux/amd64
    chart: wordpress
  - name: apache-exporter
    image: docker.io/bitnami/apache-exporter:0.11.0-debian-11-r42
    digests:
      - digest: sha256:ea4aa97834500297e4252f5f98492c52a7bfd1993bd5ef4e485ae84996b1c63a
        arch: linux/amd64
    chart: wordpress
```

## Backwards compatibility

There isn't any backward incompatibility, since no existing schema is modified. Charts making use of the `annotations` and the `images-lock` file would benefit from the feature; others just wouldn't.

## Security implications

Initially, there are no security implications.

## How to teach this

First, users will need to know how to define the `annotations` so that the `images-lock.yaml` file is properly generated. This could probably be addressed through a section either in the `Chart.yaml` [documentation](https://helm.sh/docs/topics/charts/#the-chartyaml-file) or next to it.

Then, a description of the lock file would be needed, to ensure that users understand the benefits they can get from it: having a bill of materials of the chart, overriding images at installation, migrating whole charts...

Once users have defined the `annotations` properly and understand what the `images-lock.yaml` file is for, they'll need to know how to use the features built on top of it. These may come in different forms: new commands (e.g. `migrate` for moving Charts between registries alongside their images, plus rewriting the lock file), parameters (e.g. `push --with-images` could do the same) or just transparently (e.g. `helm install` plus a modified lock file would pin specific images). Depending on this, the affected part of the CLI documentation should be updated accordingly.

It may be less relevant to end users how the `images-lock.yaml` is packaged alongside the Helm Chart, whether it is pushed as a separate OCI blob, or how it is named in the registry. These points should be documented somewhere as well, but likely with less emphasis.

In addition to the points made above, given the size of the improvement, promotion through conference talks, presentations or any other public conversation would be strongly suggested.


Maybe something in here about the linter and chart-testing


## Reference implementation

* [Images in annotations](https://github.com/bitnami/charts/blob/521fe9919a27b088a46f5548e591506d24c2739d/bitnami/rabbitmq-cluster-operator/Chart.yaml).
* [Images Lock file](https://github.com/paleloser/charts/blob/helm-hip/bitnami/wordpress/images.lock.yml).

## Rejected ideas

Already discussed in [Rationale](#rationale).

## Open issues

We think there may be two points to review in this proposal:

* First, Helm Chart annotations are schemaless, so any tooling needs to parse them in order to resolve the container images.
* Another caveat is that adding an annotation field may cause image tags to exist in at least two places: once in the annotation, and one or more times in the Helm Chart templates themselves.

## References

N/A