Investigate reuse strategies for docker image #59

Closed
ckunki opened this issue Nov 15, 2023 · 7 comments
Labels
  • refactoring (Code improvement without behavior change)
  • shelved:yes (Closed because this ticket is very unlikely to get implemented)

Comments

@ckunki
Contributor

ckunki commented Nov 15, 2023

Potential use cases

UC-1

  • notebook developer Nadine works on creating a new Jupyter notebook or updating an existing one
  • Nadine wants to use new libraries that are not available yet in the latest release on docker-hub
  • Nadine therefore wants to build a private Docker image from the branch she is currently working on

UC-2

  • As image creation currently (Nov 2023) takes around 7 minutes, Nadine wants to reuse the image in subsequent runs

UC-3

  • Nadine changed a file or dependency, so the Docker image must be re-created to take the change into account
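
One way UC-2 and UC-3 could be reconciled is to derive the image tag from the content of the files that influence the image: an unchanged tag means the existing image can be reused, a new tag means it has to be rebuilt. The following is only a sketch of that general technique, not something this project is known to do. The names `content_tag` and `tag_for_paths` are hypothetical, and which files actually need to be hashed is an open question.

```python
import hashlib
from pathlib import Path
from typing import Iterable, Tuple


def content_tag(items: Iterable[Tuple[str, bytes]], length: int = 12) -> str:
    """Return a short hex tag that changes whenever any name or content changes."""
    digest = hashlib.sha256()
    # Sort by name so the tag does not depend on the order of the input.
    for name, data in sorted(items):
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")
        # Hash the content separately so file boundaries stay unambiguous.
        digest.update(hashlib.sha256(data).digest())
    return digest.hexdigest()[:length]


def tag_for_paths(paths: Iterable[str]) -> str:
    """Hypothetical helper: compute the tag for files on disk."""
    return content_tag((str(p), Path(p).read_bytes()) for p in paths)
```

A build script could compare the computed tag with the tags of locally available images and skip the roughly 7 minute build when a match exists.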
@ckunki ckunki added the refactoring Code improvement without behavior change label Nov 15, 2023
@ckunki
Contributor Author

ckunki commented Nov 15, 2023

Using a stripped-down playbook, I measured a total duration of 35 seconds.
Most of the runtime (> 30 s) is spent copying ~ 1 MB of files into the Docker container.

When using the full playbook, the

  • duration before Ansible starts is negligible
    • 13:19:02.553448520 Setup DSS Docker Container
    • 13:20:06.942173827 Install Poetry
    • 13:21:02.421321996 Install Jupyter
    • 13:21:33.782174510 Install JupyterLab and its dependencies
    • 13:22:04.042733300 Install dependencies used in Jupyterlab
    • 13:22:49.754936638 Copy notebook content
    • 13:23:01.747934461 Install Docker
    • 13:23:35.444312396 docker : Add Docker GPG apt Key
    • 13:23:51.388766915 docker : Add Docker Repository
    • 13:24:02.429195331 docker : Update apt and install docker-ce
    • 13:24:52.802106986 Ansible tasks finished
    • 13:25:07.204318968 stopping container
    • 13:25:12.117735411 PASSED
    • 13:25:23.652135227 End of test
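
To see where the time goes, the timestamps above can be turned into per-step durations. The sketch below assumes that each timestamp marks the start of the named step, so a step's duration is the gap to the next timestamp; the helper names `to_seconds` and `step_durations` are made up for this illustration.

```python
from typing import Iterable, List, Tuple

LOG = """
13:19:02.553448520 Setup DSS Docker Container
13:20:06.942173827 Install Poetry
13:21:02.421321996 Install Jupyter
13:21:33.782174510 Install JupyterLab and its dependencies
13:22:04.042733300 Install dependencies used in Jupyterlab
13:22:49.754936638 Copy notebook content
13:23:01.747934461 Install Docker
13:23:35.444312396 docker : Add Docker GPG apt Key
13:23:51.388766915 docker : Add Docker Repository
13:24:02.429195331 docker : Update apt and install docker-ce
13:24:52.802106986 Ansible tasks finished
13:25:07.204318968 stopping container
13:25:12.117735411 PASSED
13:25:23.652135227 End of test
"""


def to_seconds(stamp: str) -> float:
    """Convert HH:MM:SS.fraction into seconds since midnight."""
    hours, minutes, seconds = stamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def step_durations(lines: Iterable[str]) -> List[Tuple[str, float]]:
    """Pair each step name with the time until the next timestamp."""
    entries = []
    for line in lines:
        stamp, name = line.strip().split(" ", 1)
        entries.append((to_seconds(stamp), name))
    return [
        (name, following - start)
        for (start, name), (following, _) in zip(entries, entries[1:])
    ]


durations = step_durations(LOG.strip().splitlines())
# Print the steps sorted by duration, longest first.
for name, seconds in sorted(durations, key=lambda item: item[1], reverse=True):
    print(f"{seconds:6.1f} s  {name}")
```

Under that assumption the three longest steps are the container setup (about 64 s), the Poetry installation (about 55 s), and the `docker-ce` installation (about 50 s).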

@ckunki
Contributor Author

ckunki commented Nov 16, 2023

Comments from @tkilias

  • Installation of poetry takes quite long and I think we don't need poetry
  • Yes, APT and PIP installs take a while but there are only limited things we could do about it
  • In a Docker Container we don't need systemd and motd setup as they probably won't work anyway

@ckunki
Contributor Author

ckunki commented Nov 16, 2023

@ckunki
Contributor Author

ckunki commented Nov 16, 2023

See also https://docs.pytest.org/en/7.4.x/how-to/capture-stdout-stderr.html

def test_disabling_capturing(capsys):
    print("this output is captured")
    with capsys.disabled():
        print("output not captured, going directly to sys.stdout")
    print("this output is also captured")

You tried to access the function scoped fixture capsys with a session scoped request object, involved factories:

@ckunki
Contributor Author

ckunki commented Nov 16, 2023

I used

- name: Copy notebook content
  ansible.builtin.synchronize:
    src: "roles/jupyter/files/notebook/"
    dest: /root/notebooks
    rsync_opts:
      - "--chmod=0644"

and got the error message

protocol version mismatch -- is your shell clean?
(see the rsync man page for an explanation)
rsync error: protocol incompatibility (code 2) at compat.c(178) [sender=3.1.3]

After adding rsync to the Docker container and replacing ansible.builtin.copy with ansible.builtin.synchronize, the duration decreased from ~ 24 seconds to < 1 second!
Great! 🎉

Unfortunately, installing rsync itself takes 31 seconds. 🙁

@ckunki
Contributor Author

ckunki commented Nov 16, 2023

Capturing stdout in @pytest.fixture(scope="session") only works when pytest is called with -o log_cli=true -o log_cli_level=INFO.

But when these CLI options are provided, capturing is not required anymore, as pytest will log the Ansible output anyway.
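
Since `capsys` is function scoped, a session-scoped fixture could instead forward output through Python's standard `logging` module, which pytest displays live when `log_cli` is enabled. The sketch below assumes the Ansible output is available as an iterable of text lines; the logger name `ansible_output` and the helper `log_lines` are hypothetical and not taken from the project.

```python
import logging
from typing import Iterable

# Hypothetical logger name; pytest's live logging shows records from any logger.
LOG = logging.getLogger("ansible_output")


def log_lines(lines: Iterable[str], logger: logging.Logger = LOG) -> int:
    """Forward each non-empty line to the logger at INFO level.

    Returns the number of lines that were logged.
    """
    count = 0
    for line in lines:
        text = line.rstrip("\n")
        if text:
            logger.info(text)
            count += 1
    return count
```

A session-scoped fixture could call `log_lines(...)` on the output it collects, and the lines would then show up when pytest runs with `-o log_cli=true -o log_cli_level=INFO`.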

@ckunki ckunki assigned ckunki and unassigned ckunki Nov 23, 2023
@ckunki
Contributor Author

ckunki commented Nov 23, 2023

To reuse a Docker image in pytest, DSS added support for the CLI option --dss-docker-image in #69
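
For readers unfamiliar with custom pytest options, this is roughly how such an option can be registered in a `conftest.py`. It is a sketch only and does not reproduce the implementation in #69: that the option takes an image name as its value, the default of `None`, the help text, and the helper `image_to_reuse` are all assumptions. Only the option name `--dss-docker-image` comes from this thread.

```python
def pytest_addoption(parser):
    """Standard pytest hook for registering additional command line options."""
    parser.addoption(
        "--dss-docker-image",
        action="store",
        default=None,  # Assumption: no value means "build a fresh image".
        help="Name of an existing Docker image to reuse instead of building one",
    )


def image_to_reuse(config):
    """Hypothetical helper: return the image name given on the CLI, or None."""
    return config.getoption("--dss-docker-image")
```

Inside a fixture, `image_to_reuse(request.config)` would then tell the test setup whether to skip the image build.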

@redcatbear redcatbear changed the title Investiate reuse strategies for docker image Investigate reuse strategies for docker image Sep 27, 2024
@redcatbear redcatbear added the shelved:yes Closed because this ticket is very unlikely to get implemented label Sep 27, 2024
@redcatbear redcatbear closed this as not planned Won't fix, can't repro, duplicate, stale Sep 27, 2024