Separate Docker Layers for Dependencies and App jars. #1310

Merged: 21 commits, Mar 16, 2020

Conversation

ppiotrow
Contributor

@ppiotrow ppiotrow commented Feb 20, 2020

Fixes #1267
Inspired by https://phauer.com/2019/no-fat-jar-in-docker-image/

The idea is to modify the Docker staging directory so that application-related jars move from lib to an app-lib directory. Then two COPY operations are invoked with the same destination. Dependencies are copied first, as they rarely change (see Steps 13 and 14 in the logs below).
(screenshot: staging directory layout)
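A rough sketch of that mapping transformation in build.sbt terms; this is not the PR's actual code, and the isAppJar predicate is a hypothetical stand-in for the organization filter discussed below:

// Move application jars from lib/ to app-lib/ in the Docker staging mappings.
val isAppJar: File => Boolean = _.getName.startsWith("com.the-company")

mappings in Docker := (mappings in Docker).value.map {
  case (file, path) if path.contains("/lib/") && isAppJar(file) =>
    file -> path.replace("/lib/", "/app-lib/")
  case other => other
}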

Other:

  • The exotic mapGenericFilesToDocker method was reimplemented to resemble the other functions around it.
  • Only the DockerPermissionStrategy.MultiStage code branch was updated; the others would need updates to work with the new staging layout.

Questions:

  • Hardcoding the organization-filter logic seemed fine at first glance, but now I'd rather make it user-configurable via the project's build.sbt, as suggested in the original issue (one possible shape is sketched below).
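Purely hypothetical, one way such a build.sbt setting could look; the key name dockerAppJarOrganizations does not exist in the plugin:

// Jars from these organizations are treated as application code (app-lib layer).
val dockerAppJarOrganizations = settingKey[Seq[String]]("Organizations whose jars go to app-lib")

dockerAppJarOrganizations := Seq(organization.value)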

Output logs from using the newly published plugin in a real project.
First docker/publishLocal:

[success] All package validations passed
[info] Sending build context to Docker daemon  26.57MB
[info] Step 1/20 : FROM docker.artifactory.the-company.com/the-team/openjre:11.0.5-0-e58ca54 as stage0
[info]  ---> 16b6e185c5cb
[info] Step 2/20 : LABEL snp-multi-stage="intermediate"
[info]  ---> Running in 539cc1636855
[info] Removing intermediate container 539cc1636855
[info]  ---> 5fe9c7b0dba8
[info] Step 3/20 : LABEL snp-multi-stage-id="ba011bf6-1ed4-4279-8ab4-937abc1a6236"
[info]  ---> Running in 749a51a0aca9
[info] Removing intermediate container 749a51a0aca9
[info]  ---> 8689656277d9
[info] Step 4/20 : WORKDIR /opt/the-project
[info]  ---> Running in 3ace818a32a7
[info] Removing intermediate container 3ace818a32a7
[info]  ---> 96e4b3bd8537
[info] Step 5/20 : COPY opt /opt
[info]  ---> d40dfada42ef
[info] Step 6/20 : USER some-user
[info]  ---> Running in 6b61481f355f
[info] Removing intermediate container 6b61481f355f
[info]  ---> 16e81c1582bd
[info] Step 7/20 : RUN ["chmod", "-R", "u=rX,g=rX", "/opt/the-project"]
[info]  ---> Running in 2f4a9f97d975
[info] Removing intermediate container 2f4a9f97d975
[info]  ---> 6ad7c76a87bf
[info] Step 8/20 : RUN ["chmod", "u+x,g+x", "/opt/the-project/bin/the-project"]
[info]  ---> Running in 6b2df7afc497
[info] Removing intermediate container 6b2df7afc497
[info]  ---> 4848ae77b613
[info] Step 9/20 : FROM docker.artifactory.the-company.com/the-team/openjre:11.0.5-0-e58ca54
[info]  ---> 16b6e185c5cb
[info] Step 10/20 : USER some-user
[info]  ---> Using cache
[info]  ---> 6af66ce6012f
[info] Step 11/20 : RUN id -u some-user 1>/dev/null 2>&1 || (( getent group 0 1>/dev/null 2>&1 || ( type groupadd 1>/dev/null 2>&1 && groupadd -g 0 some-user || addgroup -g 0 -S some-user )) && ( type useradd 1>/dev/null 2>&1 && useradd --system --create-home --uid 1001 --gid 0 some-user || adduser -S -u 1001 -G some-user some-user ))
[info]  ---> Using cache
[info]  ---> e74737ebd8f2
[info] Step 12/20 : WORKDIR /opt/the-project
[info]  ---> Using cache
[info]  ---> dbc0546c52a8
[info] Step 13/20 : COPY --from=stage0 --chown=some-user:some-user /opt/the-project/lib /opt/the-project
[info]  ---> Using cache
[info]  ---> 98297bc28b34
[info] Step 14/20 : COPY --from=stage0 --chown=some-user:some-user /opt/the-project/app-lib /opt/the-project
[info]  ---> beb82b669e4b
[info] Step 15/20 : ENV APP_CLASS_PATH="/opt/the-project/lib/*"
[info]  ---> Running in 423e10592214
[info] Removing intermediate container 423e10592214
[info]  ---> 2563bea886ff
[info] Step 16/20 : ENV APP_MAIN_CLASS="com.the-company.red.Main"
[info]  ---> Running in 59802f900a45
[info] Removing intermediate container 59802f900a45
[info]  ---> de2cbc001a35
[info] Step 17/20 : USER 1001:0
[info]  ---> Running in fa6765c1141c
[info] Removing intermediate container fa6765c1141c
[info]  ---> a26809b0b4dc
[info] Step 18/20 : ENTRYPOINT ["/usr/local/bin/run-class"]
[info]  ---> Running in b3f18864cdab
[info] Removing intermediate container b3f18864cdab
[info]  ---> 85aac1ee1e36
[info] Step 19/20 : CMD []
[info]  ---> Running in fdea9f4b0a17
[info] Removing intermediate container fdea9f4b0a17
[info]  ---> 2f2c119a32e7
[info] Step 20/20 : USER some-user
[info]  ---> Running in 12e5f019898b
[info] Removing intermediate container 12e5f019898b
[info]  ---> 5a7a508a4b22

Changing some lines of code and running docker/publishLocal again:

[info] Step 1/20 : FROM docker.artifactory.the-company.com/the-team/openjre:11.0.5-0-e58ca54 as stage0
[info]  ---> 16b6e185c5cb
[info] Step 2/20 : LABEL snp-multi-stage="intermediate"
[info]  ---> Running in 125156cd5b67
[info] Removing intermediate container 125156cd5b67
[info]  ---> c904b531d432
[info] Step 3/20 : LABEL snp-multi-stage-id="c999d42d-5607-426f-b0c0-f207de303312"
[info]  ---> Running in 24f829d1375b
[info] Removing intermediate container 24f829d1375b
[info]  ---> a083afc95f36
[info] Step 4/20 : WORKDIR /opt/the-project
[info]  ---> Running in 2ff312e71914
[info] Removing intermediate container 2ff312e71914
[info]  ---> 75e2a8e9ff7a
[info] Step 5/20 : COPY opt /opt
[info]  ---> 6bac37070916
[info] Step 6/20 : USER some-user
[info]  ---> Running in 6dfbf0b79135
[info] Removing intermediate container 6dfbf0b79135
[info]  ---> fbed0213ad52
[info] Step 7/20 : RUN ["chmod", "-R", "u=rX,g=rX", "/opt/the-project"]
[info]  ---> Running in 95e6c38877e3
[info] Removing intermediate container 95e6c38877e3
[info]  ---> f8ca32bdfb14
[info] Step 8/20 : RUN ["chmod", "u+x,g+x", "/opt/the-project/bin/the-project"]
[info]  ---> Running in f68765152ce6
[info] Removing intermediate container f68765152ce6
[info]  ---> 77a6c19df80d
[info] Step 9/20 : FROM docker.artifactory.the-company.com/the-team/openjre:11.0.5-0-e58ca54
[info]  ---> 16b6e185c5cb
[info] Step 10/20 : USER some-user
[info]  ---> Using cache
[info]  ---> 6af66ce6012f
[info] Step 11/20 : RUN id -u some-user 1>/dev/null 2>&1 || (( getent group 0 1>/dev/null 2>&1 || ( type groupadd 1>/dev/null 2>&1 && groupadd -g 0 some-user || addgroup -g 0 -S some-user )) && ( type useradd 1>/dev/null 2>&1 && useradd --system --create-home --uid 1001 --gid 0 some-user || adduser -S -u 1001 -G some-user some-user ))
[info]  ---> Using cache
[info]  ---> e74737ebd8f2
[info] Step 12/20 : WORKDIR /opt/the-project
[info]  ---> Using cache
[info]  ---> dbc0546c52a8
[info] Step 13/20 : COPY --from=stage0 --chown=some-user:some-user /opt/the-project/lib /opt/the-project
[info]  ---> Using cache
[info]  ---> 98297bc28b34
[info] Step 14/20 : COPY --from=stage0 --chown=some-user:some-user /opt/the-project/app-lib /opt/the-project
[info]  ---> 9378c36fa0e7
[info] Step 15/20 : ENV APP_CLASS_PATH="/opt/the-project/lib/*"
[info]  ---> Running in e94424f66cbd
[info] Removing intermediate container e94424f66cbd
[info]  ---> 8d1d7a06e8f5
[info] Step 16/20 : ENV APP_MAIN_CLASS="com.the-company.red.Main"
[info]  ---> Running in 149b46c7ad6c
[info] Removing intermediate container 149b46c7ad6c
[info]  ---> 81600a029db3
[info] Step 17/20 : USER 1001:0
[info]  ---> Running in 4dd779a082a3
[info] Removing intermediate container 4dd779a082a3
[info]  ---> 719903d06e3e
[info] Step 18/20 : ENTRYPOINT ["/usr/local/bin/run-class"]
[info]  ---> Running in 694314ccaad5
[info] Removing intermediate container 694314ccaad5
[info]  ---> de3b6d226c99
[info] Step 19/20 : CMD []
[info]  ---> Running in af74538469f1
[info] Removing intermediate container af74538469f1
[info]  ---> b2bec70c0733
[info] Step 20/20 : USER some-user
[info]  ---> Running in bdeb33033857
[info] Removing intermediate container bdeb33033857
[info]  ---> 2d6d81e64ea6

@ppiotrow
Contributor Author

I'm working on making this better by giving up the magic (hardcoded organization) and using the solution suggested in #1267 (comment).
Could you help me decide whether I should care about binary-compatibility validations for mapGenericFilesToDocker? If yes, I'll keep and deprecate it.
I also found in the comments that some people might be unhappy with modifying the staging directory layout. It's quite convenient, at least in the current state of this PR.

@ppiotrow
Contributor Author

ppiotrow commented Feb 20, 2020

Update: I decided to go with a configurable mapping from file prefix to layer index.
The library now contains a default implementation that layers bin, then lib, and then the user's organisation code.
(screenshot: staging directory layout)

I experimented with reading the staging directory via IO.listFiles to learn which layers (/0, /1, /17) are present. This would allow me to create a perfect Dockerfile. However, it would end in a circular dependency: staging needs the Dockerfiles present, and Dockerfile creation would need staging. That is why I decided to throw an exception when a file doesn't match any pattern.
Could you point me to how to do this properly, as exceptions are not common in this plugin?
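A minimal sketch of that prefix-to-layer mapping; the paths and layer numbers are illustrative, and the final case is the exception just mentioned:

// Map a staged file's destination path to a layer index; fail on unmatched paths.
val layerForPath: String => Int = {
  case p if p.startsWith("/opt/the-project/bin") => 0
  case p if p.startsWith("/opt/the-project/lib") => 1
  case p if p.startsWith("/opt/the-project")     => 2 // user organisation code
  case p => sys.error(s"No layer defined for path: $p")
}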

Known issues:

  • The code will fail if no files are assigned to one of the layers defined by dockerLayerGrouping.
  • The code will fail if a file doesn't match any prefix from dockerLayerGrouping.
  • It is unclear what will happen with dockerPackageMappings.

Output now:

FROM docker.artifactory.the-company.com/the-team/openjre:11.0.5-0-e58ca54 as stage0
LABEL snp-multi-stage="intermediate"
LABEL snp-multi-stage-id="63cc94e9-7a5b-4e76-80d4-833177fa7293"
WORKDIR /opt/the-project
COPY opt /opt
USER the-user
RUN ["chmod", "-R", "u=rX,g=rX", "/opt/the-project"]

FROM docker.artifactory.the-company.com/the-team/openjre:11.0.5-0-e58ca54
USER the-user
RUN id -u the-user 1>/dev/null 2>&1 || (( getent group 0 1>/dev/null 2>&1 || ( type groupadd 1>/dev/null 2>&1 && groupadd -g 0 the-user || addgroup -g 0 -S the-user )) && ( type useradd 1>/dev/null 2>&1 && useradd --system --create-home --uid 1001 --gid 0 the-user || adduser -S -u 1001 -G the-user the-user ))
WORKDIR /opt/the-project
COPY --from=stage0 --chown=the-user:the-user /opt/the-project/0 /opt/the-project
COPY --from=stage0 --chown=the-user:the-user /opt/the-project/1 /opt/the-project
COPY --from=stage0 --chown=the-user:the-user /opt/the-project/2 /opt/the-project
ENV APP_CLASS_PATH="/opt/the-project/lib/*"
ENV APP_MAIN_CLASS="com.the-company.red.Main"
USER 1001:0
ENTRYPOINT ["/usr/local/bin/run-class"]
CMD []
USER the-user

@nigredo-tori
Collaborator

nigredo-tori commented Feb 21, 2020

I experimented with reading the staging directory via IO.listFiles to learn which layers (/0, /1, /17) are present. This would allow me to create a perfect Dockerfile. However, it would end in a circular dependency: staging needs the Dockerfiles present, and Dockerfile creation would need staging. That is why I decided to throw an exception when a file doesn't match any pattern.

We could add an intermediate task (e.g. val dockerLayerMappings) which would contain Docker / mappings with their corresponding layers as specified by dockerLayerGrouping (List[(Int, File, String)] or something like that). We can then use that both in Docker / stage (for the transformed mappings) and for building the Dockerfile (the layer indices). This should solve the circular-dependency issue. It would also allow dockerLayerGrouping to remain a SettingKey[String => Int], which means we won't have to deal with files not matching patterns.
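A sketch of that intermediate task, under the stated assumption that dockerLayerGrouping is a SettingKey[String => Int]; the wiring is illustrative, not the PR's final code:

val dockerLayerMappings = taskKey[Seq[(Int, File, String)]]("Docker mappings with their layer index")

dockerLayerMappings := {
  val grouping = dockerLayerGrouping.value
  (mappings in Docker).value.map { case (file, path) =>
    (grouping(path), file, path)
  }
}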

It is reused in Docker / stage and Docker / dockerCommands.
This intermediate step helps with the cyclic dependency.
@ppiotrow
Contributor Author

ppiotrow commented Feb 21, 2020

Thanks to @nigredo-tori's tips, the newest version fixes all three problems above. The pull request is almost complete, apart from these issues:

  • The Docker staging directory layout has changed; was it part of the contract?
  • Should I care about binary-compatibility validations for mapGenericFilesToDocker?
  • When I delete the Docker staging directory between the first and second docker:publish, the cache is not used for the /bin directory. The startup scripts have identical content, so most likely the creation date should be ignored?
    Maybe in a separate issue someone can make the /bin scripts inherit the creation date from their templates?

The last-modified and last-accessed times of the file(s) are not considered in these checksums

(screenshot: docker build output)

After making sure it's on the right track, I'll implement code for the branches other than DockerPermissionStrategy.MultiStage. Waiting for your review now.

@nigredo-tori
Collaborator

When I delete docker staging repository between first and second docker:publish it is not using cache for /bin directory. The startup scripts have identical content but most likely creation date should be ignored?

Creation date shouldn't matter at all:

For the ADD and COPY instructions, the contents of the file(s) in the image are examined and a checksum is calculated for each file. The last-modified and last-accessed times of the file(s) are not considered in these checksums. During the cache lookup, the checksum is compared against the checksum in the existing images. If anything has changed in the file(s), such as the contents and metadata, then the cache is invalidated.

@ppiotrow
Contributor Author

ppiotrow commented Feb 23, 2020

The latest changes address all previous problems, questions, and review remarks. Some changes were required in dockerAdditionalPermissions, as that code is executed in stage0.
The code got a little dirty after the experiments and will be improved once the plugin behaviour is accepted.
I've just found a bug in RUN ["chmod", "-R", "u=rX,g=rX", "/opt/the-project"] while writing this comment (fixed in the next commit).

(screenshot: staging directory layout)

Dockerfile now:

FROM docker.artifactory.the-company.com/the-team/openjre:11.0.5-0-e58ca54 as stage0
LABEL snp-multi-stage="intermediate"
LABEL snp-multi-stage-id="02fda8dd-1fdd-4841-9a70-6f5462f0c3a8"
WORKDIR /opt/the-project
COPY 1 /1/
COPY 2 /2/
USER the-user
RUN ["chmod", "-R", "u=rX,g=rX", "/1/opt/the-project"]
RUN ["chmod", "-R", "u=rX,g=rX", "/2/opt/the-project"]
RUN ["chmod", "u+x,g+x", "/1/opt/the-project/bin/the-project"]

FROM docker.artifactory.the-company.com/the-team/openjre:11.0.5-0-e58ca54
USER the-user
RUN id -u demiourgos728 1>/dev/null 2>&1 || (( getent group 0 1>/dev/null 2>&1 || ( type groupadd 1>/dev/null 2>&1 && groupadd -g 0 the-user || addgroup -g 0 -S the-user )) && ( type useradd 1>/dev/null 2>&1 && useradd --system --create-home --uid 1001 --gid 0 demiourgos728 || adduser -S -u 1001 -G the-user demiourgos728 ))
WORKDIR /opt/the-project
COPY --from=stage0 --chown=demiourgos728:the-user /1/opt/the-project /opt/the-project
COPY --from=stage0 --chown=demiourgos728:the-user /2/opt/the-project /opt/the-project
USER 1001:0
ENTRYPOINT ["/opt/the-project/bin/the-project"]
CMD []
USER demiourgos728

@muuki88 muuki88 added the docker label Feb 26, 2020
Contributor

@muuki88 muuki88 left a comment

Nice work! The implementation looks solid, docs and tests look good to me.

Two small remarks that would be nice to have fixed. Once the failing Docker tests are green (you can ignore the Windows ones; we have to deal with those soon), this is good to merge from my perspective.

Thanks @nigredo-tori for the guidance and well done comments 🤗

@ppiotrow
Contributor Author

I'm looking at the failing tests on my machine.
scripted docker/build-command
This fails because it invokes the Dockerfile from the staging directory, and the Dockerfile's path has changed to build-command/target/docker/stage/0/Dockerfile.
I'm not sure if I should just change the test expectations or try to address it.
https://github.com/sbt/sbt-native-packager/blob/master/src/sbt-test/docker/build-command/build.sbt#L9

@nigredo-tori
Collaborator

nigredo-tori commented Mar 1, 2020

I'm not sure if I should just change the test expectations or try to address it.

This test assumes a particular mapping of files into the staging directory, and we're changing that, so we should definitely change the expectations. Preferably in a way that eliminates the assumption. For example, we can replace

mappings in Docker ++= directory("src/main/resources/docker-test")
dockerBuildCommand := Seq("docker", "build", "-t", "docker-build-command-test:0.1.0", "docker-test/")

with

dockerBuildCommand := Seq(
  "docker", "build", "-t", "docker-build-command-test:0.1.0",
  (baseDirectory.value / "docker-test").getAbsolutePath
)

and move the Dockerfile to the docker-test directory in the test project root.

However, this highlights two more potential issues. This change is breaking for people with custom build commands (only if they use a non-standard Dockerfile path) and, more importantly, for people with hand-written Dockerfiles. I'm not sure what we can do to address this without complicating the plugin even more.

@ppiotrow
Contributor Author

ppiotrow commented Mar 1, 2020

To keep it consistent with the past, we can limit the power of the grouping function by allowing None as the default layer:
case class LayeredMapping(layerId: Option[Int], file: File, path: String)
Then we define the default grouping function to affect only the /lib and /bin directories, as in the sketch below.
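A minimal sketch of such a default grouping; the layer numbers are hypothetical, and anything mapped to None stays in the default, unnumbered layer laid out as before:

// Only /lib and /bin get dedicated layers; everything else keeps the old layout.
val defaultLayerGrouping: String => Option[Int] = path =>
  if (path.contains("/lib/")) Some(2)      // dependency jars, change rarely
  else if (path.contains("/bin/")) Some(1) // startup scripts
  else None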

@ppiotrow ppiotrow requested review from nigredo-tori and muuki88 March 1, 2020 20:28
@muuki88
Contributor

muuki88 commented Mar 3, 2020

Thanks for the good discussion 😃 I tried to give my opinion on this, but my Docker production knowledge is rather limited, so bear with me 😬

stage dir and default layer

The thing is that, for existing Dockerfiles to continue working, the default layer would need to be in the same directory as the Dockerfile (the root of the build context), where all the layers are, and it would not be easy to cleanly copy it.

AFAIK if the user wants the old behaviour, then this would be enough

dockerLayerGrouping := (_ => None)

and everything will be laid out in staging as before. For me this would be what @nigredo-tori described as
"we can provide a single opt-in (opt-out?) setting for the new layering behavior."

The trade-off would be that we would need to maintain (and test) two versions of the same functionality

The complexity from this PR seems reasonable to me, so I would accept the extra maintenance required to check both the layered and non-layered behaviour.

#854 introduced the failing build-command test. Its purpose was not to enforce a stage directory of a certain shape, but to let users configure the docker command to adjust to breaking changes in the CLI. So this test can be adapted as needed.

Long term

If we're thinking long-term, it would make sense to just rip off the band-aid, and provide a consistent and predictable layering logic from the start.

I'm open to suggestions 😄 This issue has been open for years, and this is the fourth attempt, which looks promising to me. Docker changes so fast that I have lost faith in "long-term" solutions for Docker 😞 as they constantly break or require lots of workarounds (server vs. API version, chowning, flags).

Conclusion

I would like to go with this solution, as it solves a long-standing pain point for Docker users. Changing the stage directory format is a price I'm willing to pay. My guess is that quite a few people mess around with custom Dockerfiles and the stage directory due to the lack of a layering system in native-packager. As opting out is possible (correct me if I'm wrong, @ppiotrow), the risk and migration costs for end users are even lower.

muuki88 previously approved these changes Mar 3, 2020
@ppiotrow
Contributor Author

ppiotrow commented Mar 5, 2020

Hi, sorry for the delay; I had some unexpected offline time. I'm going to finish this PR in the coming days.

AFAIK if the user wants the old behaviour, then this would be enough
dockerLayerGrouping := (_ => None)

That is true, but it needs to be tested. Remaining work:

  • Write a test confirming the user can opt out of layering via dockerLayerGrouping := (_ => None) (a sketch of such a check follows this list).
  • Write tests confirming that custom dockerPackageMappings work as they did before layering was introduced.
  • Do a final refactoring of the code before merging.
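A sketch of what the opt-out check could assert in a scripted test's build.sbt; the checkStaging task name is made up, and the assertions assume the pre-layering staging shape:

dockerLayerGrouping := (_ => None)

TaskKey[Unit]("checkStaging") := {
  val stageDir = (stagingDirectory in Docker).value
  // With layering disabled, files should be staged directly (opt/...),
  // with no numbered layer directories such as 1/ or 2/.
  assert((stageDir / "opt").exists, "expected the pre-layering staging layout")
  assert(!(stageDir / "1").exists, "expected no numbered layer directories")
}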

@ppiotrow
Contributor Author

ppiotrow commented Mar 7, 2020

I finished working on this PR. I've checked how people use dockerPackageMappings on GitHub and created a test for it. Opting out is also possible, as proven by a test.
The documentation is up to date. I'm not really happy with the makeCopy* methods, as they repeat some code, but I wasn't able to refactor them any better.
For me this is feature-complete, of reasonable code quality, and tested.

@ppiotrow ppiotrow changed the title Docker Layers separate for Dependencies and App jars. Separate Docker Layers for Dependencies and App jars. Mar 11, 2020
@muuki88
Contributor

muuki88 commented Mar 11, 2020

Thanks a lot for all the time and passion you put into this @ppiotrow
I'll review as soon as possible, but I'll surely merge this ❤️

@muuki88 muuki88 merged commit 411bb5a into sbt:master Mar 16, 2020
@muuki88
Contributor

muuki88 commented Mar 16, 2020

Release 1.7.0 is on its way.

@sideeffffect

Hello guys,
are you sure this feature is working as intended? I ran a small experiment where I built two images in isolation, with only a tiny part of the code differing, to see how much reuse/caching would take place.
From the results I see that no caching/reuse can happen, since the two images differ right from the layers native-packager/demiourgos creates.
I suspect this is because of different timestamps in the layers' filesystem.

(screenshot: layer-by-layer diff of the two images)

(visualization from dive)

@ppiotrow
Contributor Author

ppiotrow commented Mar 23, 2020

Hi @sideeffffect, as far as I can see, the first green layers came from your source image with the JRE, right?
Then you immediately have a difference at the first command, which runs id and the other user-related commands. I think that running the same command on top of the same parent layer should yield a consistent layer. Did you run both builds on the same host? What is your Docker version?

I'm looking at the source code now; shouldn't you have USER root at the beginning?
https://github.com/sbt/sbt-native-packager/blob/master/src/main/scala/com/typesafe/sbt/packager/docker/DockerPlugin.scala#L188

Anyway, can you just upgrade your Docker to a version that supports multi-stage builds?
Their intent is to prepare the files and their permissions in one stage and copy them into the final image without the permissions-related noise. This configuration works best for sure.

@sideeffffect

I think that running the same command on top of the same parent layer should yield a consistent layer.

I'm not sure it will work that way, at least not always. The previous layers are the same:
(screenshot: layer comparison)

Did you run both builds on the same host?

No, those were (hopefully) completely isolated instances; that was the point of the experiment.

Anyway, can you just upgrade your Docker to a version that supports multi-stage builds?

I can try that :)

@ppiotrow
Contributor Author

ppiotrow commented Mar 23, 2020

It seems the Docker SHA digest is not deterministic across different hosts. I've run several experiments with very simple Dockerfiles on my local machine and on a remote machine. All gave inconsistent digests. The machines differ in docker --version, but I don't think that is the issue here.
Dockerfile examples:

FROM anapsix/alpine-java:8
COPY test.txt /
# or ADD https://raw.githubusercontent.com/sbt/sbt-native-packager/master/README.md /
# or RUN echo "hello"

I think we need to read the cache-mechanism description literally:

Starting with a parent image that is already in the cache, the next instruction is compared against all child images derived from that base image to see if one of them was built using the exact same instruction. If not, the cache is invalidated.

A layer will be reused from the cache only if a parent-child relation between the base layer and that Docker command already exists on the host. I admit this is a surprise to me, as I expected consistent behaviour, but it most likely makes sense in the Docker implementation.

You'll still benefit from the newest enhancement if you build your images on a small pool of Docker hosts.

@sideeffffect

sideeffffect commented Mar 23, 2020

But it most likely makes sense in the Docker implementation.

I think what happens is that you copy files at different points in time, and the timestamps are reflected in the layer's filesystem, so the layers end up with different content and different hashes.
That's why Jib (and the sbt-jib plugin) sets all file timestamps to 0, to make the builds/image layers 100% deterministic.

@ppiotrow
Contributor Author

ppiotrow commented Mar 23, 2020

I don't think a fixed timestamp makes the digest deterministic between hosts.
See my example with RUN echo "" that doesn't even create a file.
Also, Jib doesn't mention resetting timestamps to boost caching.
Let me check once more.

Update:
I played around with changing the modification date, but it seems inconsistent between macOS and CentOS. I'll need to experiment more, because I'm still getting different SHAs.
touch -a -t 197001010000.01 text.txt
Anyway, a deterministic file modification date seems like good practice and deserves its own issue and discussion there. If this improves caching between hosts, even better.
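A hypothetical sketch of what such timestamp normalization could look like, post-processing the staged files; this is not part of the PR, and whether it actually stabilizes digests across hosts is exactly what is in question above:

import java.nio.file.{Files, Path}
import java.nio.file.attribute.FileTime

stage in Docker := {
  val dir = (stage in Docker).value
  val epoch = FileTime.fromMillis(0L)
  // Pin every staged file's modification time to a fixed instant.
  Files.walk(dir.toPath).forEach { (p: Path) =>
    Files.setLastModifiedTime(p, epoch)
    ()
  }
  dir
}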

@jroper
Member

jroper commented Apr 6, 2020

Was any consideration given to using projectDependencyArtifacts as the basis for deciding the group mappings? The organization heuristic breaks if you have subprojects that don't share the same organization, and includes too much if you have external dependencies from the same organization as your build. Basing the layer decision on projectDependencyArtifacts would not be a heuristic; it robustly answers "is this artifact part of this build or not?"

The main catch with using this is that since dockerLayerGrouping is a SettingKey, it can't depend on projectDependencyArtifacts, which is a task; so dockerLayerGrouping would have to become a TaskKey for this to work. That's a binary-incompatible change, so dockerLayerGrouping would have to be deprecated and replaced with a new TaskKey.

Additionally, to make this even more robust, if the function were (File, String) => Option[Int], the matching could be done on the source file itself, rather than having to reconstruct the logic for building the file's destination path.
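A sketch of what that could look like, combining both suggestions; the key name dockerGroupLayers and the layer numbers are illustrative, not an existing API:

val dockerGroupLayers = taskKey[(File, String) => Option[Int]]("Decides the Docker layer for a staged file")

dockerGroupLayers := {
  // Artifacts produced by this build (subprojects included), not a heuristic.
  val projectJars = projectDependencyArtifacts.value.map(_.data).toSet
  (file, path) =>
    if (projectJars(file)) Some(2)           // application code, changes often
    else if (path.contains("/lib/")) Some(1) // external dependencies, stable
    else None
}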

Finally, is there any reason String => Option[Int] was chosen over PartialFunction[String, Int]? PartialFunction[String, Int] is much easier to compose. If you wanted to add a custom decision for some custom files you've added, it would be as simple as:

dockerLayerGrouping := dockerLayerGrouping.value.orElse {
  case ... => 2
}

In contrast to now, where you have to capture the previous function outside the lambda (since .value can't be called inside one):

dockerLayerGrouping := {
  val previous = dockerLayerGrouping.value
  path =>
    previous(path).orElse {
      if (...) Some(2)
      else None
    }
}

@jroper
Member

jroper commented Apr 6, 2020

I've submitted a PR with my suggestions here:

#1326

@ppiotrow
Contributor Author

ppiotrow commented Apr 6, 2020

Hi, I didn't know about projectDependencyArtifacts, but it sounds very well suited for this task. It's true that inner-source company artifacts end up promoted to application code; I'm observing this in some of my projects.
A partial function should be fine; let me review the code in the evening.


Successfully merging this pull request may close these issues.

docker: introduce more layers for smaller images