
rfc: build improvement #3580

Open
aarnphm opened this issue Feb 17, 2023 · 12 comments

Comments

aarnphm (Contributor) commented Feb 17, 2023

Problem statement

From community reports and internal discussion, bentoml build currently has the following caveats:

  • imports users' service.py as a module, which means all of the code in service.py will be invoked during the build, including every dependency imported in service.py.
    This is not ideal, as it requires users to have all of the dependencies installed in order to build, which might not always be the case in a CI/CD environment.

    The current workaround we have seen from the community is that users set up their environment beforehand in order to run the build on CI. This means users end up installing dependencies twice: once for build and once during containerize.

  • bentoml build creates a bento that includes a Dockerfile, which will be used by containerize to package a BentoContainer. Oftentimes for CI, the desired behaviour is for build to resolve directly to the container.

Proposed solutions

  1. importing service.py during build

a. Using --env

  • feat: conda env for bentos in bentostore #3396 introduces the --env argument to serve, which allows serving within a conda environment. We should also be able to extend this to container, virtualenv, mamba, and so on.
    --env can also be used during build, which will build the given bento within the specified environment:
bentoml build --env container

The behaviour is as follows:

  • Creates a container that contains all of the necessary dependencies defined under bentofile.yaml (PyPI, Conda, system packages, setup script)
  • Attaches the build directory to the container and builds the Bento inside the container environment
    • For BuildKit-supported container daemons, we can use --output=type=local,dest=/path/to/bentoml_home/bentos/bento_name/version to copy the built Bento to the local machine (see the buildx sketch after the note below),
    • For daemons that don't support BuildKit, use cp instead.
  • The container that is used to build the Bento can also be used for containerize and serve directly.

NOTE: #2495 mentions this capability. All of the APIs are there to be used, so community contributions are welcome.
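A minimal sketch of the BuildKit export step described above, assuming a Dockerfile generated from the Bento's build options (the Dockerfile name, bento name, and destination path here are illustrative):

# Build the environment image and export the build output (the built Bento) back to the host's Bento store.
docker buildx build \
  -f env.Dockerfile \
  --output=type=local,dest="$HOME/bentoml/bentos/iris_classifier/latest" \
  .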

b. Not using --env

If users wish not to use --env, then to solve this issue we will need to extract the Service object from service.py without actually importing the file.

  • Propose that we write a custom Python parser 😄 (a minimal sketch using the standard ast module is included at the end of this section)
  • Or run the build in a subprocess
  2. bentoml build directly to a container

For CI, we can also support build doing containerize directly via the flag --format=container:

bentoml build --format=container

By default, build will still create a Bento (--format=bento).
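As an illustration of the custom-parser idea in 1b, here is a minimal sketch that locates the Service assignment in service.py with the standard ast module, without ever importing the file (the function name and level of robustness are illustrative only):

import ast
from typing import Optional


def find_service_name(path: str) -> Optional[str]:
    """Return the variable name bound to `bentoml.Service(...)` without importing the file."""
    with open(path, "r", encoding="utf-8") as f:
        tree = ast.parse(f.read(), filename=path)
    for node in ast.walk(tree):
        if isinstance(node, ast.Assign) and isinstance(node.value, ast.Call):
            func = node.value.func
            # Match either `bentoml.Service(...)` or a bare `Service(...)`.
            is_service = (
                isinstance(func, ast.Attribute) and func.attr == "Service"
            ) or (isinstance(func, ast.Name) and func.id == "Service")
            if is_service and node.targets and isinstance(node.targets[0], ast.Name):
                return node.targets[0].id
    return None


# e.g. find_service_name("service.py") would return "svc" for the examples below.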

Would love to hear more feedback and comments on this.

Additional context

#3577 suggests that we refactor the containerization steps so that the environment setup is cached and the model copying is moved to later steps. This falls under the --env container proposal, where all of the dependencies are set up inside the container once.

Quasarman commented Feb 20, 2023

Awesome, looking forward to not having to install packages twice in a CI/CD env!

charu-vl commented Feb 21, 2023

@aarnphm thanks for writing this up

I think option 1a is ideal, if bentoml can take a dockerfile as input or use the dockerfile specified in the bento config to do it. It would be tedious to need a running container to start the build process. Another reason option 1a seems reasonable is that it seems to fit better into the workflow of creating custom deployment containers (like the sagemaker workflow).

Do you know roughly how much time it would take to implement something like this?

aarnphm (Contributor, Author) commented Feb 22, 2023

@aarnphm thanks for writing this up

I think option 1a is ideal, if bentoml can take a dockerfile as input or use the dockerfile specified in the bento config to do it. It would be tedious to need a running container to start the build process. Another reason option 1a seems reasonable is that it seems to fit better into the workflow of creating custom deployment containers (like the sagemaker workflow).

Do you know roughly how much time it would take to implement something like this?

Sorry, but I don't understand this. 1a requires a container runtime in order to build. I don't think providing an additional dockerfile would be necessary.

I'm not sure I understand what you mean by "sagemaker workflow". I believe bentoctl would help with this (though that is not relevant to this issue).

charu-vl commented Feb 22, 2023

Sorry, I think I misunderstood what you were saying originally. Overall I think 1a is a good way to do it if it results in a bento on the machine that runs bentoml build --env container.
Any thoughts on level of effort?

aarnphm (Contributor, Author) commented Feb 22, 2023

We will need to do some refactoring of the logic of our containerization steps, mostly in https://github.com/bentoml/BentoML/blob/main/src/bentoml/_internal/container/frontend/dockerfile/templates/base.j2

Secondly, we will need to implement a Container env_manager, which lives under https://github.com/bentoml/BentoML/blob/main/src/bentoml/_internal/env_manager/__init__.py, using our container SDK https://github.com/bentoml/BentoML/tree/main/src/bentoml/_internal/container

Thirdly, we will need to figure out support for BuildKit and non-BuildKit environments in terms of different caching strategies.
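Purely as an illustration (the class and method names below are hypothetical, not BentoML's actual env_manager API), a container-backed environment manager could look roughly like this:

import subprocess


class ContainerEnvManager:
    """Hypothetical sketch: build an environment image for a Bento and run commands inside it."""

    def __init__(self, bento_tag: str, backend: str = "docker"):
        self.backend = backend
        # e.g. "iris_classifier-a1b2c3-env"; the naming scheme is illustrative.
        self.image = bento_tag.replace(":", "-") + "-env"

    def prepare(self, context_dir: str) -> None:
        # Build an image containing the dependencies declared in bentofile.yaml.
        subprocess.run([self.backend, "build", "-t", self.image, context_dir], check=True)

    def run(self, *cmd: str) -> None:
        # Execute a command (e.g. `bentoml build`) inside the prepared environment.
        subprocess.run([self.backend, "run", "--rm", self.image, *cmd], check=True)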

phitoduck commented Mar 10, 2023

@aarnphm Can you help me get a sense of what the interface would be in each case?

Let's say I had model weights in a registry somewhere (S3, MLflow, maybe Yatai? I'm less familiar with that).

What would my CI workflow be, using these methods, to fetch a set of weights, install the dependencies used to work with them, and then build the service?

Is this what you're imagining?

# Assume that the service.py references a model called "pytorch-bento"

# pull the model from the registry (assume s3); the dumped location would
aws s3 cp s3://path/to/my/previously-built-pytorch-bento ~/bentoml/models/pytorch-bento

# Option #1: build the bento
DOCKER_BUILDKIT=0 bentoml build --env=container --opt="--tag my-bento:latest" --opt="--platform linux/amd64"

# Option #2: if the bento team decided to use the "build directly to a container" option
DOCKER_BUILDKIT=0 bentoml build --format=docker --opt="--tag my-bento:latest" --opt="--platform linux/amd64"

# push the built bento image to a registry
docker tag my-bento:latest my-org/my-bento:latest
docker push my-org/my-bento:latest

phitoduck commented Mar 10, 2023

Also, I haven't tried this, but could a workaround be to simply wrap all of the import statements (except the bentoml ones) in a try/except block?

You might be able to get away with only doing pip install bentoml during the first installation if you did this (I could be wrong):

# bentofile.yaml
service: "service.py:svc"
labels:
  owner: bentoml-team
  project: gallery
include:
- "*.py"
python:
  packages:
    - scikit-learn
    - pandas
# service.py
from typing import Any, List, Tuple, Union
import time

import bentoml
from bentoml.io import JSON
from pydantic import BaseModel # bento requires this, PIL, and numpy on its own
import numpy as np
from numpy.typing import NDArray

try:
    # put anything import-sensitive in another module; could use OpenCV, pandas, sklearn, keras, etc.
    from my_other_module_that_requires_these_imports import ...
except ImportError:
    print("WARNING: dependencies not installed. Are you runnin 'bentoml build'?")


class PredictRequest(BaseModel):
    input: int


class PredictResponse(BaseModel):
    output: Tuple[str, float]


class ExampleRunnable(bentoml.Runnable):
    SUPPORTED_RESOURCES = ("cpu",)
    SUPPORTS_CPU_MULTI_THREADING = True

    @bentoml.Runnable.method(
        batchable=True,
        batch_dim=0,
    )
    def generate_image_overlayed_with_heatmap(
        self,
        input_data: List[int],
    ) -> List[Tuple[str, float]]:
        print("input data", type(input_data))
        print("input data", input_data)
        # return ["image1", 0.11], ["image2", 0.22]
        return [[f"image{i}", 0.11 * i] for i, _ in enumerate(input_data)]


example_runner = bentoml.Runner(
    models=[],
    runnable_class=ExampleRunnable,
    name="example_runner",
    max_latency_ms=100_000,
    max_batch_size=10,
)

svc = bentoml.Service("dummy_service", runners=[example_runner])


@svc.api(
    route="/predict",
    input=JSON(pydantic_model=PredictRequest),
    output=JSON(pydantic_model=PredictResponse),
)
def predict(input: PredictRequest) -> PredictResponse:
    time.sleep(2)
    result = example_runner.run(
        [input.input],
    )
    print("RESULT", result)
    return PredictResponse(
        output=result[0],
    )

This approach may only get you so far. If you need to instantiate a custom logger, or anything else in the global scope of the file, you'd be back to having to install everything twice.
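One way to stretch this workaround a bit further (a sketch only; make_custom_logger and its module are hypothetical names) is to guard global-scope setup behind the same try/except as the imports:

import logging

try:
    # any setup that needs optional dependencies goes behind the same guard
    from my_other_module_that_requires_these_imports import make_custom_logger
    logger = make_custom_logger("my-service")
except ImportError:
    # fall back to the standard library so 'bentoml build' can still import this file
    logger = logging.getLogger("my-service")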

aarnphm (Contributor, Author) commented Mar 14, 2023

# Assume that the service.py references a model called "pytorch-bento"

# pull the model from the registry (assume s3); the dumped location would
aws s3 cp s3://path/to/my/previously-built-pytorch-bento ~/bentoml/models/pytorch-bento

# Option #1: build the bento
DOCKER_BUILDKIT=0 bentoml build --env=container --opt="--tag my-bento:latest" --opt="--platform linux/amd64"

# Option #2: if the bento team decided to use the "build directly to a container" option
DOCKER_BUILDKIT=0 bentoml build --format=docker --opt="--tag my-bento:latest" --opt="--platform linux/amd64"

# push the built bento image to a registry
docker tag my-bento:latest my-org/my-bento:latest
docker push my-org/my-bento:latest

Sorry for the late reply @phitoduck, was busy the last few days.

For Option 1, the environment will result in a container (maybe give it the name <bento>-<generated-string>) that will be saved locally. --opt here will just be a no-op (for symmetry).

You can then push the env container as a cache to a registry somewhere, and reference it via --env-container-opt='--cache-from=...' (or something similar), which just uses <container-engine> build --cache-from when building the environment, so the cache is hit.

flowchart TD
    A[aws s3 cp ...] -->|pull| B(pytorch-bento)
    B --> C{bentoml build --env=container pytorch-bento}
    D[(cached env container)] --> |--env-container-opt ='--cache-from=...'| C 
    C --> E[bento]

For Option 2, you can think of it as the combination of the current build -> containerize. --opt here matches containerize --opt.
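A rough sketch of how the cached env container could be reused across CI runs (the registry path and tag are illustrative, and --env-container-opt is the hypothetical flag from the proposal above):

# first CI run: build the env container, tag it, and push it as a cache
bentoml build --env=container
docker tag pytorch-bento-env my-org/pytorch-bento-env:cache
docker push my-org/pytorch-bento-env:cache

# later CI runs: reuse the pushed layers when rebuilding the environment
bentoml build --env=container --env-container-opt='--cache-from=my-org/pytorch-bento-env:cache'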

aarnphm (Contributor, Author) commented Mar 14, 2023

# service.py
from typing import Any, List, Tuple, Union
import time

import bentoml
from bentoml.io import JSON
from pydantic import BaseModel # bento requires this, PIL, and numpy on its own
import numpy as np
from numpy.typing import NDArray

try:
    # put anything import-sensitive in another module; could use OpenCV, pandas, sklearn, keras, etc.
    from my_other_module_that_requires_these_imports import ...
except ImportError:
    print("WARNING: dependencies not installed. Are you runnin 'bentoml build'?")


class PredictRequest(BaseModel):
    input: int


class PredictResponse(BaseModel):
    output: Tuple[str, float]


class ExampleRunnable(bentoml.Runnable):
    SUPPORTED_RESOURCES = ("cpu",)
    SUPPORTS_CPU_MULTI_THREADING = True

    @bentoml.Runnable.method(
        batchable=True,
        batch_dim=0,
    )
    def generate_image_overlayed_with_heatmap(
        self,
        input_data: List[int],
    ) -> List[Tuple[str, float]]:
        print("input data", type(input_data))
        print("input data", input_data)
        # return ["image1", 0.11], ["image2", 0.22]
        return [[f"image{i}", 0.11 * i] for i, _ in enumerate(input_data)]


example_runner = bentoml.Runner(
    models=[],
    runnable_class=ExampleRunnable,
    name="example_runner",
    max_latency_ms=100_000,
    max_batch_size=10,
)

svc = bentoml.Service("dummy_service", runners=[example_runner])


@svc.api(
    route="/predict",
    input=JSON(pydantic_model=PredictRequest),
    output=JSON(pydantic_model=PredictResponse),
)
def predict(input: PredictRequest) -> PredictResponse:
    time.sleep(2)
    result = example_runner.run(
        [input.input],
    )
    print("RESULT", result)
    return PredictResponse(
        output=result[0],
    )

This doesn't solve the fact that we are still importing service.py into the user's current PYTHONPATH. The crucial improvement for this RFC is to isolate this from the current PYTHONPATH. Running the build in a subprocess is the first step in solving this issue.
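For illustration, a minimal sketch of the subprocess idea (the helper name, import string, and returned metadata are hypothetical, not BentoML's actual internals): the parent process never imports service.py itself, so its sys.modules stays untouched.

import json
import subprocess
import sys

# Child code: import the user's service and report back only the metadata
# the build step needs, keeping the parent interpreter clean.
_CHILD_CODE = """
import importlib, json, sys
module_name, _, attr = sys.argv[1].partition(":")
svc = getattr(importlib.import_module(module_name), attr)
json.dump({"name": svc.name, "runners": [r.name for r in svc.runners]}, sys.stdout)
"""


def inspect_service(import_str: str = "service:svc") -> dict:
    out = subprocess.run(
        [sys.executable, "-c", _CHILD_CODE, import_str],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(out.stdout)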

phitoduck commented Apr 13, 2023

Oh I see! I'll try to restate:

  1. With option (1), you still do two commands like: bentoml build ... --env=container and bentoml containerize .... But in this case, the output of bentoml build ... would be an intermediary docker image that would have many of the same layers as the image built with bentoml containerize .... And the benefits of this would be:
    1. You wouldn't have to install any dependencies outside of docker to build your bento
    2. The bentoml containerize ... command would run much faster since it would hit the cached layers from bentoml build ...
    3. You would have an intermediary "environment image" which you could use for... what would you use this for? faster, more isolated local development?
  2. With Option (2), you just run something like bentoml build ... --format=image and you get a fully built bento image. The benefits here would be the same as Option (1) minus (1.iii).

Is this correct?

If so, Option (1) seems to have one more advantage over Option (2) so that one sounds good :D

And so the workflow would be:

# pull the model from the registry (assume s3); the dumped location would
aws s3 cp s3://path/to/my/previously-built-pytorch-bento ~/bentoml/models/pytorch-bento

# Build the environment image (has many of the same layers as the final containerized bento)
DOCKER_BUILDKIT=0 bentoml build --env=container --opt="--platform linux/amd64"

# Build the final image (it shouldn't be necessary to explicitly pass a --cache-from parameter, correct?
# Because the "docker build ..." process executed by this command should naturally pick up on shared, cached
# layers from the result of the previous command? I'm assuming the containerized bento image *is* the env
# image plus additional layers, although I'm not sure what those would be.)
DOCKER_BUILDKIT=0 bentoml containerize --tag my-pytorch-bento:latest --opt="--platform linux/amd64"

# push the built bento image to a registry
docker tag my-pytorch-bento:latest my-org/my-pytorch-bento:latest
docker push my-org/my-pytorch-bento:latest

aarnphm (Contributor, Author) commented Apr 14, 2023

  1. With option (1), you still do two commands like: bentoml build ... --env=container and bentoml containerize .... But in this case, the output of bentoml build ... would be an intermediary docker image that would have many of the same layers as the image built with bentoml containerize .... And the benefits of this would be:

Yes this is correct.

  1. You would have an intermediary "environment image" which you could use for... what would you use this for? faster, more isolated local development?

You can think of this environment as part of a multi-stage build. containerize will use this 'pseudo-image' as a stage to build the final bento container image.
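Roughly, in Dockerfile terms (illustrative only; this is not the actual template BentoML generates):

# Stage 1: the environment image produced by `bentoml build --env=container`
FROM python:3.10-slim AS bento-env
COPY requirements.txt .
RUN pip install -r requirements.txt

# Stage 2: `bentoml containerize` layers the built Bento on top of that stage
FROM bento-env AS bento
COPY ./bento /home/bentoml/bento
ENTRYPOINT ["bentoml", "serve", "/home/bentoml/bento"]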

  1. With Option (2), you just run something like bentoml build ... --format=image and you get a fully built bento image. The benefits here would be the same as Option (1) minus (1.iii).

I think Option 2 is more of a QOL improvement: instead of having two commands, you just need to run one.

And so the workflow would be:

# pull the model from the registry (assume s3); the dumped location would
aws s3 cp s3://path/to/my/previously-built-pytorch-bento ~/bentoml/models/pytorch-bento

# Build the environment image (has many of the same layers as the final containerized bento)
DOCKER_BUILDKIT=0 bentoml build --env=container --opt="--platform linux/amd64"

# Build the final image (it shouldn't be necessary to explicitly pass a --cache-from parameter, correct?
# Because the "docker build ..." process executed by this command should naturally pick up on shared, cached
# layers from the result of the previous command? I'm assuming the containerized bento image *is* the env
# image plus additional layers, although I'm not sure what those would be.)
DOCKER_BUILDKIT=0 bentoml containerize --tag my-pytorch-bento:latest --opt="--platform linux/amd64"

# push the built bento image to a registry
docker tag my-pytorch-bento:latest my-org/my-pytorch-bento:latest
docker push my-org/my-pytorch-bento:latest

That workflow makes sense to me. Note that we also support pulling from S3 without needing the aws-cli:

bentoml pull s3://path/to/bento .

You can install it with pip install bentoml[aws]

aarnphm mentioned this issue May 2, 2023
aarnphm (Contributor, Author) commented May 3, 2023
aarnphm commented May 3, 2023

The first iteration of this ticket involves supporting build from a subprocess, which addresses polluting users' sys.modules, in #3814.

--env is a QOL improvement and is currently triaged.
