Move from OSTree to OCI for updates #1823

Open
jlebon opened this issue Oct 30, 2024 · 18 comments

@jlebon
Member

jlebon commented Oct 30, 2024

Currently, FCOS pushes updates via an OSTree repo. To better align with the bootable containers initiative, let's move to updating FCOS via a container image. Container images are already being published at https://quay.io/repository/fedora/fedora-coreos.

Note that this ticket is separate from #1263, which also covers fleshing out the layering story. In this ticket, we're strictly scoping the effort to changing the transport used.

jlebon added a commit to jlebon/coreos-assembler that referenced this issue Nov 1, 2024
Take the digest pullspec for the base OS bootable container and put it
in the new `oci-image` field in the release metadata.

Part of coreos/fedora-coreos-tracker#1823.
jlebon added a commit to jlebon/coreos-assembler that referenced this issue Nov 1, 2024
When updating the release index, gather the OCI pullspecs across all
arches for a given release into a single list and inject it into the new
`oci-images` key, the same way we do for OSTree commits and `commits`.

Part of coreos/fedora-coreos-tracker#1823.
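
For illustration, here is a minimal Python sketch of the gathering step described in the two commits above. It is not the coreos-assembler code. The layout of the release metadata (`release`, `architectures`, `commit`) and the inner fields of each image entry (`image`, `digest-ref`) are assumptions made for the example; only the `oci-image`, `oci-images` and `commits` key names come from the commit messages.

```python
def gather_commits(release_metadata):
    """Collect the per-arch OSTree checksums into one `commits` list."""
    return [
        {"architecture": arch, "checksum": data["commit"]}
        for arch, data in sorted(release_metadata["architectures"].items())
    ]


def gather_oci_images(release_metadata):
    """Collect the per-arch `oci-image` entries into one `oci-images` list."""
    images = []
    for arch, data in sorted(release_metadata["architectures"].items()):
        image = data.get("oci-image")
        if image is None:
            continue  # this arch was built before the field existed
        images.append({"architecture": arch, **image})
    return images


def index_entry(release_metadata):
    """Build one release index entry (hypothetical helper)."""
    entry = {
        "version": release_metadata["release"],
        "commits": gather_commits(release_metadata),
    }
    images = gather_oci_images(release_metadata)
    if images:
        # Only inject the new key when at least one arch has a pullspec.
        entry["oci-images"] = images
    return entry
```

Sorting by architecture just keeps the output stable between runs; nothing in the commits says the real list is ordered.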
jlebon added a commit to jlebon/fedora-coreos-cincinnati that referenced this issue Nov 1, 2024
Currently, we're starting a total of N x M scrapers, where N is the
number of streams, and M the number of arches. So right now, this would
be 3 x 4 = 12 scrapers.

This is very wasteful because each scraper individually downloads the
release index and update metadata every 30 seconds, even though that
metadata is not different per architecture. I think the reason it was
set up this way is in case we wanted to host separate e.g. release
index or update files _per_ architecture in S3 instead of all together.
This can be seen by the fact the code supports templating those URLs
with `basearch`. However, it's unlikely we'll be changing that design
decision, so let's just do the saner thing and rework the scraping to
be stream-based.

This is done by changing the scraper to host not one single `Graph`
object, but instead a `HashMap<String, Graph>` which maps architectures
to graphs. Then, when a request for a graph comes in, we look it up in our
cache, keyed by the requested architecture.

This is prep for adding another dimension to the matrix, which is
whether the OCI version of the graph was requested. If we didn't do this
cleanup first, it would have blown up the number of scrapers to 24.

Part of coreos/fedora-coreos-tracker#1823.
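
A rough sketch of the stream-based layout described above, written in Python rather than the Rust used by the real service. `StreamScraper` and `Graph` are hypothetical stand-ins, the dict plays the role of the `HashMap<String, Graph>`, and the release index field names (`releases`, `commits`, `architecture`, `version`) are assumptions for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Hypothetical stand-in for an update graph: an ordered list of versions."""
    nodes: list = field(default_factory=list)


class StreamScraper:
    """One scraper per stream, caching one Graph per architecture."""

    def __init__(self, stream):
        self.stream = stream
        self.graphs = {}  # architecture -> Graph

    def refresh(self, release_index):
        # One pass over the arch-independent release index rebuilds
        # the cached graph of every architecture at once.
        graphs = {}
        for release in release_index["releases"]:
            for commit in release["commits"]:
                graph = graphs.setdefault(commit["architecture"], Graph())
                graph.nodes.append(release["version"])
        self.graphs = graphs

    def graph_for(self, basearch):
        # Request path: a plain lookup keyed by the requested architecture.
        # Unknown architectures get an empty graph in this sketch.
        return self.graphs.get(basearch, Graph())
```

With three streams and four arches, one scraper per (stream, arch) pair means 3 x 4 = 12 downloads of the same metadata per cycle, and an extra OSTree/OCI split would double that to 24. One scraper per stream fetches it 3 times.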
jlebon added a commit to jlebon/fedora-coreos-cincinnati that referenced this issue Nov 1, 2024
When parsing the release index, check for the new `oci-images` key. If
present, also build up a separate graph with only nodes containing OCI
information. In that case, the node payload is the pullspec and the
scheme declared in the node metadata is `oci`.

When a client requests a graph, check if the `oci=` URL parameter was
set. If so, return the OCI graph instead of the OSTree one.

Part of coreos/fedora-coreos-tracker#1823.
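
As a sketch of the two behaviours above (again in Python, not the actual Cincinnati code): build a second node list from `oci-images` when the key is present, then pick a graph based on the `oci=` query parameter. The entry fields (`architecture`, `checksum`, `digest-ref`), the simplified `scheme` metadata key, the `checksum` scheme value for OSTree nodes and the choice to treat only `oci=true` as set are all assumptions for the example.

```python
from urllib.parse import parse_qs


def build_graphs(release_index, basearch):
    """Build the OSTree node list and the OCI node list for one arch."""
    ostree_nodes, oci_nodes = [], []
    for release in release_index["releases"]:
        version = release["version"]
        for commit in release.get("commits", []):
            if commit["architecture"] == basearch:
                ostree_nodes.append({
                    "version": version,
                    "payload": commit["checksum"],
                    "metadata": {"scheme": "checksum"},
                })
        # Older releases may predate the `oci-images` key entirely,
        # so they simply contribute no node to the OCI graph.
        for image in release.get("oci-images", []):
            if image["architecture"] == basearch:
                oci_nodes.append({
                    "version": version,
                    "payload": image["digest-ref"],
                    "metadata": {"scheme": "oci"},
                })
    return {"ostree": ostree_nodes, "oci": oci_nodes}


def select_graph(graphs, query_string):
    """Return the OCI graph when the request sets `oci=true`."""
    values = parse_qs(query_string).get("oci", [])
    wants_oci = bool(values) and values[-1].lower() == "true"
    return graphs["oci" if wants_oci else "ostree"]
```

Clients that never send `oci=` keep getting the OSTree graph, so existing nodes are unaffected until they opt in.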
jlebon added a commit to jlebon/stream-metadata-go that referenced this issue Nov 1, 2024
Add support in both release metadata and the release index for
specifying OCI image pullspecs for a given release. In both cases, the
new fields are located at the same level as the existing `commits` key
holding OSTree checksums.

The key is called `oci-image` in release metadata and `oci-images` in
the release index.

Part of coreos/fedora-coreos-tracker#1823.
@jlebon
Member Author

jlebon commented Nov 1, 2024

OK, some progress on this:

So the next step on this is support in Zincati for a config knob that tells it to request the OCI version of the graph and deploy using rpm-ostree rebase with appropriate custom origin info.

@cgwalters
Member

Awesome!

@jlebon
Member Author

jlebon commented Nov 8, 2024

> So the next step on this is support in Zincati for a config knob that tells it to request the OCI version of the graph

Instead of a config knob, Zincati can just cue off of the origin. If the OS is on OSTree, query the OSTree graph. If it's on OCI, query the OCI graph. So then the barrier release is just about switching the origin over from OSTree to OCI.
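
A minimal sketch of that decision in Python (Zincati itself is written in Rust, so this is illustration only). It assumes the JSON from `rpm-ostree status --json` has a `deployments` list whose entries carry a `booted` flag and, for container-based deployments, a `container-image-reference` field. The function names are hypothetical.

```python
def booted_deployment(status):
    """Return the booted deployment from parsed `rpm-ostree status --json`."""
    for deployment in status.get("deployments", []):
        if deployment.get("booted"):
            return deployment
    raise LookupError("no booted deployment found")


def graph_kind(status):
    """Pick which update graph to query, based only on the booted origin."""
    deployment = booted_deployment(status)
    # Assumption: container-based deployments expose their pullspec under
    # `container-image-reference`; OSTree-based ones do not have the key.
    if deployment.get("container-image-reference"):
        return "oci"
    return "ostree"
```

On a real node, `status` would come from something like `json.loads(subprocess.check_output(["rpm-ostree", "status", "--json"]))`. Since the answer depends only on the origin, the barrier release just has to flip the origin and the agent follows.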

Discussed the rollout plan for this with @travier and @dustymabe:

  • Switch next bootimages to deploy-via-container whenever all the code needed has landed in FCOS.
    • This means that only new nodes will be updating via containers to start, giving us a way to bake it before switching updating nodes.
  • After X next releases (e.g. 2 or 3), do a barrier release to switch the origin on existing nodes.
  • Switch the testing bootimages to deploy-via-container. If we're still before Beta freeze, just do it. If we're after Beta freeze, do it at GA.
    • Stable follows 2 weeks after.
  • After X testing releases, do a barrier release to switch the origin on existing nodes.
    • Stable follows 2 weeks after.

dustymabe pushed a commit to coreos/coreos-assembler that referenced this issue Nov 13, 2024
Take the digest pullspec for the base OS bootable container and put it
in the new `oci-image` field in the release metadata.

Part of coreos/fedora-coreos-tracker#1823.
dustymabe pushed a commit to coreos/coreos-assembler that referenced this issue Nov 13, 2024
When updating the release index, gather the OCI pullspecs across all
arches for a given release into a single list and inject it into the new
`oci-images` key, the same way we do for OSTree commits and `commits`.

Part of coreos/fedora-coreos-tracker#1823.
@jlebon
Member Author

jlebon commented Nov 15, 2024

One thing we discussed related to this was signing. Currently, the FCOS OCI artifacts are not signed (really, AFAICT none of the Fedora container images are signed either). Ideally, we close that gap before we fully switch over.

At least the OSTree commit inside of the OCI is still signed, but we'd still need to download and extract without any verification. But even then it's not really used in any meaningful way client-side currently because when importing into the OSTree repo, it's the OCI "merge commit" that gets deployed. We'd have to unencapsulate instead and in the process point at the remote OSTree config containing the GPG settings. So yeah... all around much better if we just fix this at the OCI level.

Related Robosignatory issue: https://pagure.io/robosignatory/issue/22

@jlebon
Member Author

jlebon commented Nov 20, 2024

> One thing we discussed related to this was signing.

Was chatting with @cgwalters about this. One interesting note is that the Konflux pipelines building rhel-bootc and centos-bootc do sign container images today. See e.g. the shield icons in https://quay.io/repository/centos-bootc/centos-bootc?tab=tags. Ideally, we also have that setup for fedora-bootc as we bring it up in the Fedora Konflux, and we can piggy-back on it for FCOS too (even if we're not building the whole thing in Konflux just yet).

Otherwise, it should also be possible to make the rpm-ostree consumption of the embedded signature better.

jbtrystram added a commit to jbtrystram/zincati that referenced this issue Dec 16, 2024
Parse rpm-ostree status to detect if the current booted deployment
is an OCI image. If so, query the OCI graph for cincinnati and
rebase to the correct OCI image.

Requires coreos/fedora-coreos-cincinnati#99
and coreos/rpm-ostree#5120

Part of: coreos/fedora-coreos-tracker#1823
jbtrystram added a commit to jbtrystram/fedora-coreos-browser that referenced this issue Dec 18, 2024
Cincinnati now serves a separate graph for OCI images, since
coreos/fedora-coreos-cincinnati#99

Allow displaying the OCI graph by passing `oci=true` as a query
parameter.
See coreos/fedora-coreos-tracker#1823
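
For reference, a small client-side sketch of building such a request in Python. The endpoint URL and the `basearch`/`stream` parameter names are assumptions here; only `oci=true` comes from the commit message above.

```python
from urllib.parse import urlencode

# Assumed graph endpoint; adjust for the deployment being queried.
GRAPH_ENDPOINT = "https://updates.coreos.fedoraproject.org/v1/graph"


def graph_url(stream, basearch, oci=False, endpoint=GRAPH_ENDPOINT):
    """Build the graph URL, adding `oci=true` only for the OCI graph."""
    params = {"basearch": basearch, "stream": stream}
    if oci:
        params["oci"] = "true"
    return f"{endpoint}?{urlencode(params)}"
```

Leaving the parameter out entirely, rather than sending `oci=false`, keeps the default request identical to what existing clients already send.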
@travier
Member

travier commented Jan 10, 2025

The Change Checkpoint: Proposal submission deadline (Self Contained Changes) for F42 is Tue 2025-01-14.

@jlebon
Member Author

jlebon commented Jan 10, 2025

Re. signing, @travier and I were playing with

$ rpm-ostree rebase ostree-remote-image:fedora:registry:quay.io/fedora/fedora-coreos:testing-devel

and confirmed that it does verify the OSTree signature using the fedora remote. And I now realize this is also how it's set up in Rawhide already. Ideally though, the signing state would also show up in rpm-ostree status or bootc status as it does for the non-OCI case.
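
To make the pieces of that reference explicit, here is a hedged Python sketch that splits it into verification scheme, remote, transport and image. The set of accepted prefixes reflects my understanding of the ostree-container reference formats and may not be exhaustive. `ImageRef` and `parse_image_ref` are hypothetical names, not part of rpm-ostree.

```python
from typing import NamedTuple, Optional


class ImageRef(NamedTuple):
    scheme: str            # e.g. "ostree-remote-image"
    remote: Optional[str]  # OSTree remote whose GPG config is used, if any
    transport: str         # e.g. "registry"
    image: str             # e.g. "quay.io/fedora/fedora-coreos:testing-devel"

    @property
    def verifies_ostree_signature(self):
        # Only the `ostree-remote-*` forms name a remote to verify against.
        return self.remote is not None


def parse_image_ref(ref):
    """Split an ostree container image reference into its parts."""
    scheme, sep, rest = ref.partition(":")
    if not sep or not rest:
        raise ValueError(f"not an image reference: {ref!r}")
    remote = None
    if scheme in ("ostree-remote-image", "ostree-remote-registry"):
        remote, _, rest = rest.partition(":")
    if scheme in ("ostree-remote-registry", "ostree-unverified-registry"):
        transport, image = "registry", rest
    elif scheme in ("ostree-remote-image", "ostree-unverified-image",
                    "ostree-image-signed"):
        transport, _, image = rest.partition(":")
    else:
        raise ValueError(f"unknown scheme: {scheme!r}")
    if not transport or not image or remote == "":
        raise ValueError(f"incomplete image reference: {ref!r}")
    return ImageRef(scheme, remote, transport, image)
```

For the command above, this yields the remote `fedora`, the transport `registry` and the image `quay.io/fedora/fedora-coreos:testing-devel`.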

Anyway, we still of course want to instead sign the OCI itself but we don't expect that to happen before we move to Konflux.

@cgwalters
Member

Yes, there's a lot of custom stuff in the ostree-container stack to do this for exactly this reason.

> Anyway, we still of course want to instead sign the OCI itself

Yes.

@jlebon
Member Author

jlebon commented Jan 14, 2025

> Ideally though, the signing state would also show up in rpm-ostree status or bootc status as it does for the non-OCI case.

I didn't do exactly this, but I consider it addressed by containers/bootc#1028 and coreos/rpm-ostree#5223. See notably sample output in coreos/rpm-ostree#5223 (comment).

@travier
Member

travier commented Jan 15, 2025

We might also have to fix coreos/rpm-ostree#4951 as well.

jbtrystram added the `meeting` (topics for meetings) label Jan 15, 2025
@jbtrystram
Contributor

Related Fedora change request : https://fedoraproject.org/wiki/Changes/CoreOSOstree2OCIUpdates

@jbtrystram
Contributor

jbtrystram commented Jan 16, 2025

Documenting the next steps (after coreos/zincati#1241 is merged):

  • Release Zincati
  • Document steps to take for people that have disabled Zincati
  • Prepare a barrier release doing the rebase to OCI
  • Write a MOTD explaining what to do if Zincati is disabled (we don't want to force a reboot on people that have disabled Zincati)
  • Add an entry to the major changes documentation page

@jbtrystram
Contributor

So we discussed how to make the switch to OCI through Zincati. Running a service to manually rebase through a barrier release would cause two reboots (one to get the update through Zincati, one to rebase to the OCI image) and double the download, potentially pushing the second reboot outside the Zincati update window.

An alternative is to fake out the origin file, causing Zincati to think the deployment is now OCI, so the next update will be fetched through OCI and rebased.
This works but creates a cosmetic issue:

[core@cosa-devsh ~]$ rpm-ostree status
State: idle
Deployments:
● ostree-remote-image:fedora:registry:quay.io/fedora/fedora-coreos:testing
             CustomOrigin: Fedora CoreOS testing stream
                  Version: 41.20250105.2.0 (2025-01-06T19:46:25Z)

  (error fetching image metadata)
             CustomOrigin: Fedora CoreOS testing stream
                  Version: 41.20241215.2.0 (2024-12-17T00:07:38Z)

See how rpm-ostree status cannot find the "faked deployment". It's cosmetic only, though: I was able to overlay packages just fine.

@travier
Member

travier commented Jan 20, 2025

@jbtrystram
Contributor

Seems those docs are based on FCOS docs already https://docs.fedoraproject.org/en-US/fedora-coreos/proxy/

@dustymabe
Member

We discussed this during the community meeting today:

  • ACTION: @jbtrystram to update the change to make clear that bootimages will migrate first, then existing nodes after. Also mention clearly that deriving the images won't be supported for now (@jbtrystram:matrix.org, 16:53:17)
  • INFO: FCOS will move to default to pulling updates from OCI registry in the F42 time frame. (@dustymabe:matrix.org, 16:56:20)

dustymabe removed the `meeting` (topics for meetings) label Jan 22, 2025
jbtrystram added a commit to jbtrystram/fedora-coreos-config that referenced this issue Jan 23, 2025
Switch boot images to use OCI for updates. This is a step towards
bootable containers support and bootc rebase.

See https://fedoraproject.org/wiki/Changes/CoreOSOstree2OCIUpdates
Requires coreos/zincati#1241
See coreos/fedora-coreos-tracker#1823
jbtrystram added a commit to jbtrystram/fedora-coreos-docs that referenced this issue Jan 23, 2025
With the f42 rebase we will switch disk images to use OCI for
updates. While this is transparent for most users, it's still a big
technical change so it's worth mentioning, at least so users can
look into it.

See https://fedoraproject.org/wiki/Changes/CoreOSOstree2OCIUpdates
coreos/fedora-coreos-tracker#1823