cache-invalidation based on image's contentDigest(a.k.a. imageId) #15678
Hi @x7upLime. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: x7upLime
From PR kubernetes#15678 on, we can address content inside the kicDriver by contentDigest (a.k.a. imageID, a.k.a. ID). Archived images in the kicBase cache can then be addressed and loaded into a more generic kicDriver entity, without worrying about how each container engine treats the distributionDigest/Tag/Name triple.
be73155 to 59955b1
Since pkg/drivers/kic/types.go is the source of truth for the version of the container we instantiate our Kubernetes cluster in, the PR should start here. Initially I thought about hardcoding the contentDigest (a.k.a. imageId) here as well, and then using it to check against the images inside the kicDriver. It later took another turn (we're retrieving it from the tar). Besides, a collaborator showed me that it was a bad idea: maintaining it here would mean bumping it as part of the image build process.

The idea is based on the following concepts:
- contentDigest is the most reliable way to address image content: if the image is tampered with after being pushed to a registry, the contentDigest we'd see after a pull would differ from the one hardcoded here. It is also part of the image itself, i.e. part of the tar archive, which gives us a way to always know whether the cache is up to date, even offline.
- distributionDigest is the most reliable way to determine which image we want to pull from a registry; a tag can be detached from an image and recycled to reference another one with different content. It is not part of the image itself: it is computed on the image in its compressed state, and since different engines/mechanisms can use different types of compression, this digest is unreliable as a way to address content. [*]

[*] refs:
- https://windsock.io/explaining-docker-image-ids/
- google/go-containerregistry#895 (comment)
- https://stackoverflow.com/questions/45533005/why-digests-are-different-depend-on-registry
- https://blog.aquasec.com/docker-image-tags -- follow links
There are two versions of imagePathInCache: one in pkg/minikube/download and one in pkg/minikube/image. They refer to two different caches. I thought we could use the one referring to the kicBase cache instead of rewriting the same thing inside pkg/minikube/node.
The idea is that the distributionDigest was not meant to be used as something content-addressable. So instead of invalidating the kicBase cache based on the distributionDigest present in the kicDriver's storage, we could move to something like this:
- The user expresses a preference for the kicBase image by specifying it as name:tag@digest, name:tag, ... whatever.
- The specified image name is sanitized and used as the filename for the tarball in minikube's cache. [*]
- The image stored inside the kicDriver (docker, podman, ...) is validated against the image's contentDigest (imageID), which we retrieve directly from the .tar archive that we select from the cache by sanitized name.

I initially tried implementing a mechanism based on a JSON file containing entries for the cached images, plus other info we would need to address content, like the distributionDigest (registry digest): something like a repositories.json file, Docker-style (/var/lib/docker/image/overlay2/repositories.json). That would lead to complications like "what if the user removes a .tar to save space?" (basically, how do we keep the file up to date with the cache content), which would add extra complexity to solve, such as logic called on startup that reads all cached files and reflects them in the file.

As far as I understand our use case, which doesn't seem very complex (I might be wrong), a simple/raw mechanism like the one in this commit could suffice: we don't have to maintain a lot of images, the source of truth is hardcoded in the sources, and the image/file relation for the kicBase cache is based on the sanitize(img) mechanism.

[+] refs:
- https://windsock.io/explaining-docker-image-ids/
- https://blog.aquasec.com/docker-image-tags
- https://stackoverflow.com/questions/45533005/why-digests-are-different-depend-on-registry

[*] TODO: this could be one possible shortcoming:
- what if the user wants the image:tag to always be up to date with the registry? Perhaps some flag?
59955b1 to 35dfc10
PR needs rebase.
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. /lifecycle stale
/lifecycle rotten
/close
@k8s-triage-robot: Closed this PR.
In order to accomplish contentDigest-based cache invalidation, this proposes a very simple mechanism based on the cache's filesystem:
- the kicDriver's stored images are checked against the cache based on contentDigest,
- the contentDigest of the cached image is retrieved by reading the manifest.json file inside the tarball,
- the tarball is selected based on the sanitized image name.
Related to #15677.
Solves nothing yet, but would facilitate the work on #15491 and on solving related issues.