-
Notifications
You must be signed in to change notification settings - Fork 477
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enhancement for RHCOS in the marketplace #1655
base: master
Are you sure you want to change the base?
Enhancement for RHCOS in the marketplace #1655
Conversation
Skipping CI for Draft Pull Request. |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
`.architectures[("aarch64","x86_64")].rhel_coreos_extensions` fields. | ||
|
||
|
||
### Workflow Description |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This describe the "older" workflow, not the one I suggested in https://miro.com/app/board/uXjVK0sf6No=/, which is more linear and does not need us to do multiple things in a single PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for bringing this back up. I didn't understand the Alternate Workflow, but after reviewing it a bit more, I understand a bit better:
Marketplace images are contained in a separate rhcos-marketplace.json file. Do you have a sense of how that file is generated? By RHCOS team or the installer team?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We generate the rhcos.json file with our tooling. Ideally the new rhcos-marketplace.json
file is generated by the team responsible for uploading the images, using the same format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's https://github.com/coreos/stream-metadata-go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@travier I investigated the idea of using multiple files, but landed on the current proposal as a simpler way to solve the same problem. I addressed multiple files in the Alternatives
section. Would you please take a fresh look?
The idea is really quite simple, but I hope I have not made it too confusing when trying to explain.
images and including them in the RHCOS stream. | ||
* Setup non-paid Azure marketplace offering: determine publisher | ||
* Definition in RHCOS stream to separate multiple similar offers (e.g. ROSA-specific marketplace images vs. | ||
OpenShift AWS marketplace images, Azure paid vs. free images) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we call out publishing of x86 and ARM64 images? Or put that as non-goals? Generally, we (ARO HCP) would like to have both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can add it. they would be requirements for self-managed (as we support both arches).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would this give me the ability to infer an azure imageID or marketplace Architecture/Offer/Publisher/Sku out of a release payload rhcos.json/ocp version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would this give me the ability to infer an azure imageID or marketplace Architecture/Offer/Publisher/Sku out of a release payload rhcos.json/ocp version?
Yes, you can currently extract the coreos stream from a release image, for example you can get the 4.16.6 stream like this:
$ oc image extract --file /coreos/coreos-stream.json $(oc adm release info --image-for machine-os-images quay.io/openshift-release-dev/ocp-release:4.16.6-x86_64)
Currently that only includes the source vhd, but if we are able to include marketplace images, they could be queried from that file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have Jira ticket(s) tracking the work needed for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ARM publishing is already handled (at the top level of rhcos.json).
Do we have Jira ticket(s) tracking the work needed for this?
If RHCOS team thinks this is a viable idea, I can create cards for the proposal here.
|
||
ROSA and ARO both make use of marketplace images. ARO HCP depend on the | ||
availability of Azure marketplace images. It is an Open Question as to whether | ||
ARO requires a set of Azure marketplace images that is distinct from typical |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ARO doesn't rely on any changes to the underlying RHCOS images that are used by upstream OCP. We take the existing VHDs in the rhcos.json
file and publish them ourselves manually today.
The one difference between marketplace offerings though is the ARO one is that the ARO one is a Microsoft published marketplace offering, which I believe has subtle differences between a third-party published offering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. So we need two separate sets of azure marketplace images? I assume standalone openshift cannot use the marketplace images from the 1P Microsoft Marketplace, as they are billed specifically for ARO? If not for other reasons...
Does ARO need support with publishing the images? ARO is publishing the images now, but would like that to delegate that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suppose free images are not something we can implement in the Azure Marketplace without offer acceptance, etc. In that case, additional automation might need to be implemented to add the account to install an ARO cluster to an Azure Marketplace listing that contains the install images.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ARO images are free today. We use a different mechanism to bill customers.
Does ARO need support with publishing the images? ARO is publishing the images now, but would like that to delegate that?
Yes, that's the goal with our recent conversations with all the stakeholders in SPSTRAT-271 and related cards. There are some nuances with the 1p offering which we can talk about offline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies if I am joining late to the party here. So, is the publishing of ARO images (which is currently handled by the ARO team), out of scope for the work this enhancement is trying to propose?
Will all non-ARO Azure Marketplace images all be of the type NoPurchasePlan
? If not, will there be instances where a non-ARO Azure OpenShift Install would want to use PurchasePlan*
images?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will all non-ARO Azure Marketplace images all be of the type
NoPurchasePlan
? If not, will there be instances where a non-ARO Azure OpenShift Install would want to usePurchasePlan*
images?
These questions were answered later in the doc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As an update, we were able to successfully use the release engineering's tooling to upload an RHCOS image to the 1p Azure Marketplace.
|
||
**pull request** is a pull request, opened by an RHCOS engineer, | ||
bumping the RHCOS stream in openshift/installer. The bump | ||
is based |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing part of the sentence?
Inactive enhancement proposals go stale after 28d of inactivity. See https://github.com/openshift/enhancements#life-cycle for details. Mark the proposal as fresh by commenting If this proposal is safe to close now please do so with /lifecycle stale |
|
||
OPEN QUESTION: How to handle EMEA vs North America | ||
OPEN QUESTION: how to handle azure gen1 image | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should another open question be how we handle multi-cloud images in Azure? Is govcloud in scope? China cloud? etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more thing is how we as an organization should ask the release engineering / spstrat team (assuming they're the ones owning the upload process) how to structure the images, specifically in Azure.
Azure is broken down as such:
- publisher: consistent
- offer: can make new offers, but typically you'd split fcos and rhcos here
- version: the version of the rhcos vhd
- sku: this is where things get interesting. Marketplace allows configuring different options the OS supports such as trusted launch, cloud-init, supports sriov, etc. Typically openshift minor releases will enable new functionality such as this, hence why it would be good to configure a new SKU for every minor version so we can support / enable these features for customers.
When configuring images in the marketplace, you configure them at a plan level
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should another open question be how we handle multi-cloud images in Azure? Is govcloud in scope? China cloud? etc.
I think it's a good question, and one we will need to answer for the implementation, but I don't think it affects the overall enhancement. I'm happy to add it as an open question in the next update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sku: this is where things get interesting. Marketplace allows configuring different options the OS supports such as trusted launch, cloud-init, supports sriov, etc. Typically openshift minor releases will enable new functionality such as this, hence why it would be good to configure a new SKU for every minor version so we can support / enable these features for customers.
I have incorporated this into the enhancement. The sku will be variable, but deterministic.
Stale enhancement proposals rot after 7d of inactivity. See https://github.com/openshift/enhancements#life-cycle for details. Mark the proposal as fresh by commenting If this proposal is safe to close now please do so with /lifecycle rotten |
/remove-lifecycle rotten |
1 similar comment
/remove-lifecycle rotten |
/test markdownlint |
@patrickdillon: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
$ az vm image list --publisher azureopenshift --offer aro4 --all -o table --sku 418 | ||
Architecture Offer Publisher Sku Urn Version | ||
-------------- ------- -------------- ------- ------------------------------------------------- --------------------- | ||
Arm64 aro4 azureopenshift 418-arm azureopenshift:aro4:418-arm:418.94.202410090804-0 418.94.202410090804-0 | ||
x64 aro4 azureopenshift 418-v2 azureopenshift:aro4:418-v2:418.94.202410090804-0 418.94.202410090804-0 | ||
x64 aro4 azureopenshift aro_418 azureopenshift:aro4:aro_418:418.94.202410090804-0 418.94.202410090804-0 | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are URNs immutable or could azureopenshift:aro4:418-v2:418.94.202410090804-0
today be something different than azureopenshift:aro4:418-v2:418.94.202410090804-0
tomorrow? Ideally they'd be immutable either by Azure policy or our publishing pipeline policy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally they'd be immutable either by Azure policy or our publishing pipeline policy
Yes they would be inherently immutable by the coreos stream publishing policy! 418.94.202410090804-0
refers to a specific build, so as long as the conventions for publishing the marketplace image (i.e. correct build version is used and specified) and the stream are practiced the image for that URN will always be the same.
authors: | ||
- "@patrickdillon" | ||
reviewers: # Include a comment about what domain expertise a reviewer is expected to bring and what area of the enhancement you expect them to focus on. For example: - "@networkguru, for networking aspects, please look at IP bootstrapping aspect" | ||
- TBD |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved, would like to have lgtm from the following folks, or a delegate of their choosing, before we merge.
@jlebon or @mike-nguyen from RHCOS
@bennerv from ARO
@yuqi-zhang from MCO
@bryan-cox from HCP
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, but doublechecking with some other HyperShift folks to see if they are good with this as well.
/approved |
Early feedback on this enhancement proposed splitting marketplace images into a separate file. | ||
Multiple files solves the issues regarding holding the initial bump PR to wait for marketplace | ||
images to be uploaded, but there are many assumptions throughout openshift that the stream is contained | ||
in a single file, such as `openshift-install coreos print-stream-json`, the | ||
[coreos configmap manifest](https://github.com/openshift/installer/blob/master/hack/build-coreos-manifest.go), | ||
& perhaps more. I believe the current approach solves the problems we were trying to solve with this alternative | ||
but in a way that requires less reworking of existing code and keeps `rhcos.json` as a single canonical file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, even so I still think that having it be a separate file would be cleaner. I may be wrong, but offhand it doesn't seem like it'd require that much work given how simple e.g. the print and configmap code is. One thing that helps is that this is new information, so only things that need to care about Marketplace images need to know how to access it.
Currently, plume cosa2stream
is purely about transforming some JSON about a build into some other JSON schema. I think it's confusing to have the command now also do cloud queries to find out what the Marketplace images are. It also makes it non-deterministic. But also, having it be separate tools and files managed by separate teams is organizationally just easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My primary concern with using the separate file is that we would lose having a single canonical source for RHCOS images. We plan on moving all Azure installs to marketplace images, so for example, clients would need to know "I'm looking for Azure images so I don't look in rhcos.json like I do for all other images, I need to look in rhcos-marketplace.json because I know Azure uses marketplace images."
I'm not sure if there is a way to use json refs or another solution to mitigate this issue (i.e. so that even though the stream is contained in separate json files it can still essentially be deserialized as a single canonical source). I will definitely look into that more (& please lmk if you know).
I don't think plume cosa2stream
is the right place--this is my ignorance of cosa. It might be ore
that would be a more appropriate place for this to live, but I've had some difficulty pinpointing exactly how we publish the current Azure images.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Considering this more, I think my concern can be handled by (some sort of) an additional layer of abstraction, like the one I mentioned (json ref) or just a package for loading both streams. I will update the enhancement to use the additional file implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We plan on moving all Azure installs to marketplace images, so for example, clients would need to know "I'm looking for Azure images so I don't look in rhcos.json like I do for all other images, I need to look in rhcos-marketplace.json because I know Azure uses marketplace images."
Users are not supposed to look at the JSON file directly. They should really use the openshift-install coreos print-stream-json
command and we can make that command combine or print the right images in the code. See: https://docs.openshift.com/container-platform/4.17/installing/installing_platform_agnostic/installing-platform-agnostic.html#installation-user-infra-machines-iso_installing-platform-agnostic
I agree with Jonathan that this should be a different file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We introduced that command explicitly so that we would not create an interface with the rhcos.json
file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Users are not supposed to look at the JSON file directly.
we can make that command combine or print the right images in the code
@travier, +1 I see now that the separate json file is more of an implementation detail and we should be able to combine the separate files to maintain a single canonical source, which is my main concern
They should really use the openshift-install coreos print-stream-json
We introduced that command explicitly so that we would not create an interface with the rhcos.json file.
I'm not concerned about users, I'm concerned about openshift clients, such as hypershift. But I think they are looking at the configmap the installer creates, so again we should be able to handle this by combining both files when creating the configmap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That WFM. I think keeping it separate in-tree, but together in the cluster is a good balance.
"no-purchase-plan": { | ||
"publisher": "RedHat", | ||
"offer": "rhcos", | ||
"sku": "rhcos", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops, sku is supposed to be something like 418
I stand by what I said in #1655 (comment). This does not reflect the flow we agreed on. We should have a discussion about this again. |
Inactive enhancement proposals go stale after 28d of inactivity. See https://github.com/openshift/enhancements#life-cycle for details. Mark the proposal as fresh by commenting If this proposal is safe to close now please do so with /lifecycle stale |
Stale enhancement proposals rot after 7d of inactivity. See https://github.com/openshift/enhancements#life-cycle for details. Mark the proposal as fresh by commenting If this proposal is safe to close now please do so with /lifecycle rotten |
/remove-lifecycle rotten |
I have an implementation up in openshift/installer#9329 and coreos/stream-metadata-go@main...patrickdillon:stream-metadata-go:mkt-ext. I believe this is very much in line with @travier's suggested approach. Working on #9329 has made things more concrete and brought forward some more open questions (e.g. we need to discuss fcos azure marketplace images). I will update the enhancement ASAP upon return from recharge. |
Inactive enhancement proposals go stale after 28d of inactivity. See https://github.com/openshift/enhancements#life-cycle for details. Mark the proposal as fresh by commenting If this proposal is safe to close now please do so with /lifecycle stale |
Rough draft of an enhancement/design for automating publishing of cloud marketplace images and utilizing them in the installer.
It may make sense to break this into smaller pieces, particularly publishing non-paid Azure marketplace images, for prioritization.
This is very rough at the moment, but perfect is the enemy of good, and getting some feedback, iteration, and centralization may help.