
rfc: build improvement #3580

Open
aarnphm opened this issue Feb 17, 2023 · 12 comments

Comments

aarnphm (Contributor) commented Feb 17, 2023

Problem statement

From community reports and internal discussion, bentoml build currently has the following caveats:

  • imports users' service.py as a module, which means all of the code in service.py will be invoked during the build, including every dependency imported in service.py.
    This is not ideal, as it requires users to have all of the dependencies installed in order to build, which might not always be the case in a CI/CD environment.

    The current workaround we have seen from the community is that users set up their environment beforehand in order to run the build on CI. This means users end up installing dependencies twice: once for build and once during containerize.

  • bentoml build creates a bento that includes a Dockerfile, which will be used by containerize to package a BentoContainer. Oftentimes for CI, the desired behaviour is for build to resolve directly to the container.

Proposed solutions

  1. importing service.py during build

a. Using --env

  • feat: conda env for bentos in bentostore #3396 introduces the --env argument to serve, which allows serving within a conda environment. We should also be able to extend this to container, virtualenv, mamba, and so on.
    --env can also be used during build, which will build the given bento within the specified environment:
bentoml build --env container

The behaviour is as follows:

  • Creates a container that contains all of the necessary dependencies defined under bentofile.yaml (PyPI, Conda, system packages, setup script)
  • Attaches the build directory to the container and builds the Bento inside the container environment
    • For BuildKit-supported container daemons, we can use --output=type=local,dest=/path/to/bentoml_home/bentos/bento_name/version to copy the built Bento to the local machine (see the buildx sketch after the note below),
    • For daemons that don't support BuildKit, use cp instead.
  • The container that is used to build the Bento can also be used for containerize and serve directly.

NOTE: #2495 mentions this capability. All of the APIs are there to be used, so community contributions are welcome.
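A minimal sketch of the BuildKit export step described above, assuming a Dockerfile generated from the Bento's build options (the Dockerfile name, bento name, and destination path here are illustrative):

# Build the environment image and export the build output (the built Bento) back to the host's Bento store.
docker buildx build \
  -f env.Dockerfile \
  --output=type=local,dest="$HOME/bentoml/bentos/iris_classifier/latest" \
  .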

b. Not using --env

If users wish not to use --env, then to solve this issue we will need to extract the Service object from service.py without actually importing the file.

  • Propose that we write a custom Python parser 😄 (a minimal sketch using the standard ast module is included at the end of this section)
  • Or run the build in a subprocess
  2. bentoml build directly to a container

For CI, we can also support build doing containerize directly via the flag --format=container:

bentoml build --format=container

By default, build will still create a Bento (--format=bento).
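As an illustration of the custom-parser idea in 1b, here is a minimal sketch that locates the Service assignment in service.py with the standard ast module, without ever importing the file (the function name and level of robustness are illustrative only):

import ast
from typing import Optional


def find_service_name(path: str) -> Optional[str]:
    """Return the variable name bound to `bentoml.Service(...)` without importing the file."""
    with open(path, "r", encoding="utf-8") as f:
        tree = ast.parse(f.read(), filename=path)
    for node in ast.walk(tree):
        if isinstance(node, ast.Assign) and isinstance(node.value, ast.Call):
            func = node.value.func
            # Match either `bentoml.Service(...)` or a bare `Service(...)`.
            is_service = (
                isinstance(func, ast.Attribute) and func.attr == "Service"
            ) or (isinstance(func, ast.Name) and func.id == "Service")
            if is_service and node.targets and isinstance(node.targets[0], ast.Name):
                return node.targets[0].id
    return None


# e.g. find_service_name("service.py") would return "svc" for the examples below.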

Would love to hear more feedback and comments on this.

Additional context

#3577 suggests that we refactor the containerization steps so that the environment setup is cached and the model copying is moved to later steps. This falls under the --env container proposal, where all of the dependencies are set up inside the container once.

Quasarman commented Feb 20, 2023

Awesome, looking forward to not having to install packages twice in a CI/CD env!

charu-vl commented Feb 21, 2023

@aarnphm thanks for writing this up

I think option 1a is ideal, if bentoml can take a dockerfile as input or use the dockerfile specified in the bento config to do it. It would be tedious to need a running container to start the build process. Another reason option 1a seems reasonable is that it seems to fit better into the workflow of creating custom deployment containers (like the sagemaker workflow).

Do you know roughly how much time it would take to implement something like this?

aarnphm (Contributor, Author) commented Feb 22, 2023

@aarnphm thanks for writing this up

I think option 1a is ideal, if bentoml can take a dockerfile as input or use the dockerfile specified in the bento config to do it. It would be tedious to need a running container to start the build process. Another reason option 1a seems reasonable is that it seems to fit better into the workflow of creating custom deployment containers (like the sagemaker workflow).

Do you know roughly how much time it would take to implement something like this?

Sorry, but I don't understand this. 1a requires a container runtime in order to build. I don't think providing an additional dockerfile would be necessary.

I'm not sure I understand what you mean by "sagemaker workflow". I believe bentoctl would help with this (though that is not relevant to this issue).

charu-vl commented Feb 22, 2023

Sorry, I think I misunderstood what you were saying originally. Overall I think 1a is a good way to do it if it results in a bento on the machine that runs bentoml build --env container.
Any thoughts on level of effort?

aarnphm (Contributor, Author) commented Feb 22, 2023

We will need to do some refactoring of the logic of our containerization steps, mostly in https://github.com/bentoml/BentoML/blob/main/src/bentoml/_internal/container/frontend/dockerfile/templates/base.j2

Secondly, we will need to implement a Container env_manager, which lives under https://github.com/bentoml/BentoML/blob/main/src/bentoml/_internal/env_manager/__init__.py, using our container SDK https://github.com/bentoml/BentoML/tree/main/src/bentoml/_internal/container

Thirdly, we will need to figure out support for BuildKit and non-BuildKit environments in terms of different caching strategies.
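Purely as an illustration (the class and method names below are hypothetical, not BentoML's actual env_manager API), a container-backed environment manager could look roughly like this:

import subprocess


class ContainerEnvManager:
    """Hypothetical sketch: build an environment image for a Bento and run commands inside it."""

    def __init__(self, bento_tag: str, backend: str = "docker"):
        self.backend = backend
        # e.g. "iris_classifier-a1b2c3-env"; the naming scheme is illustrative.
        self.image = bento_tag.replace(":", "-") + "-env"

    def prepare(self, context_dir: str) -> None:
        # Build an image containing the dependencies declared in bentofile.yaml.
        subprocess.run([self.backend, "build", "-t", self.image, context_dir], check=True)

    def run(self, *cmd: str) -> None:
        # Execute a command (e.g. `bentoml build`) inside the prepared environment.
        subprocess.run([self.backend, "run", "--rm", self.image, *cmd], check=True)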

phitoduck commented Mar 10, 2023

@aarnphm Can you help me get a sense of what the interface would be in each case?

Let's say I had model weights in a registry somewhere (S3, MLflow, maybe Yatai? I'm less familiar with that).

What would my CI workflow be, using these methods, to fetch a set of weights, install the dependencies used to work with them, and then build the service?

Is this what you're imagining?

# Assume that the service.py references a model called "pytorch-bento"

# pull the model from the registry (assume s3); the dumped location would
aws s3 cp s3://path/to/my/previously-built-pytorch-bento ~/bentoml/models/pytorch-bento

# Option #1: build the bento
DOCKER_BUILDKIT=0 bentoml build --env=container --opt="--tag my-bento:latest" --opt="--platform linux/amd64"

# Option #2: if the bento team decided to use the "build directly to a container" option
DOCKER_BUILDKIT=0 bentoml build --format=docker --opt="--tag my-bento:latest" --opt="--platform linux/amd64"

# push the built bento image to a registry
docker tag my-bento:latest my-org/my-bento:latest
docker push my-org/my-bento:latest

phitoduck commented Mar 10, 2023

Also, I haven't tried this, but could a workaround be to simply wrap all of the import statements (except the bentoml ones) in a try/except block?

You might be able to get away with only doing pip install bentoml during the first installation if you did this (I could be wrong):

# bentofile.yaml
service: "service.py:svc"
labels:
  owner: bentoml-team
  project: gallery
include:
- "*.py"
python:
  packages:
    - scikit-learn
    - pandas
# service.py
from typing import Any, List, Tuple, Union
import time

import bentoml
from bentoml.io import JSON
from pydantic import BaseModel # bento requires this, PIL, and numpy on its own
import numpy as np
from numpy.typing import NDArray

try:
    # put anything import-sensitive in another module; could use OpenCV, pandas, sklearn, keras, etc.
    from my_other_module_that_requires_these_imports import ...
except ImportError:
    print("WARNING: dependencies not installed. Are you runnin 'bentoml build'?")


class PredictRequest(BaseModel):
    input: int


class PredictResponse(BaseModel):
    output: Tuple[str, float]


class ExampleRunnable(bentoml.Runnable):
    SUPPORTED_RESOURCES = ("cpu",)
    SUPPORTS_CPU_MULTI_THREADING = True

    @bentoml.Runnable.method(
        batchable=True,
        batch_dim=0,
    )
    def generate_image_overlayed_with_heatmap(
        self,
        input_data: List[int],
    ) -> List[Tuple[str, float]]:
        print("input data", type(input_data))
        print("input data", input_data)
        # return ["image1", 0.11], ["image2", 0.22]
        return [[f"image{i}", 0.11 * i] for i, _ in enumerate(input_data)]


example_runner = bentoml.Runner(
    models=[],
    runnable_class=ExampleRunnable,
    name="example_runner",
    max_latency_ms=100_000,
    max_batch_size=10,
)

svc = bentoml.Service("dummy_service", runners=[example_runner])


@svc.api(
    route="/predict",
    input=JSON(pydantic_model=PredictRequest),
    output=JSON(pydantic_model=PredictResponse),
)
def predict(input: PredictRequest) -> PredictResponse:
    time.sleep(2)
    result = example_runner.run(
        [input.input],
    )
    print("RESULT", result)
    return PredictResponse(
        output=result[0],
    )

This approach may only get you so far. If you need to instantiate a custom logger, or anything else in the global scope of the file, you'd be back to having to install everything twice.
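One way to stretch this workaround a bit further (a sketch only; make_custom_logger and its module are hypothetical names) is to guard global-scope setup behind the same try/except as the imports:

import logging

try:
    # any setup that needs optional dependencies goes behind the same guard
    from my_other_module_that_requires_these_imports import make_custom_logger
    logger = make_custom_logger("my-service")
except ImportError:
    # fall back to the standard library so 'bentoml build' can still import this file
    logger = logging.getLogger("my-service")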

aarnphm (Contributor, Author) commented Mar 14, 2023

# Assume that the service.py references a model called "pytorch-bento"

# pull the model from the registry (assume s3); the dumped location would
aws s3 cp s3://path/to/my/previously-built-pytorch-bento ~/bentoml/models/pytorch-bento

# Option #1: build the bento
DOCKER_BUILDKIT=0 bentoml build --env=container --opt="--tag my-bento:latest" --opt="--platform linux/amd64"

# Option #2: if the bento team decided to use the "build directly to a container" option
DOCKER_BUILDKIT=0 bentoml build --format=docker --opt="--tag my-bento:latest" --opt="--platform linux/amd64"

# push the built bento image to a registry
docker tag my-bento:latest my-org/my-bento:latest
docker push my-org/my-bento:latest

Sorry for the late reply @phitoduck, was busy the last few days.

For Option 1, the environment will result in a container (maybe give it the name <bento>-<generated-string>) that will be saved locally. --opt here will just be a no-op (for symmetry).

You can then push the env container as a cache to a registry somewhere, and reference it via --env-container-opt='--cache-from=...' (or something similar), which just uses <container-engine> build --cache-from when building the environment, so the cache is hit.

flowchart TD
    A[aws s3 cp ...] -->|pull| B(pytorch-bento)
    B --> C{bentoml build --env=container pytorch-bento}
    D[(cached env container)] --> |--env-container-opt ='--cache-from=...'| C 
    C --> E[bento]

For Option 2, you can think of it as the combination of the current build -> containerize. --opt here matches containerize --opt.
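A rough sketch of how the cached env container could be reused across CI runs (the registry path and tag are illustrative, and --env-container-opt is the hypothetical flag from the proposal above):

# first CI run: build the env container, tag it, and push it as a cache
bentoml build --env=container
docker tag pytorch-bento-env my-org/pytorch-bento-env:cache
docker push my-org/pytorch-bento-env:cache

# later CI runs: reuse the pushed layers when rebuilding the environment
bentoml build --env=container --env-container-opt='--cache-from=my-org/pytorch-bento-env:cache'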

aarnphm (Contributor, Author) commented Mar 14, 2023

# service.py
from typing import Any, List, Tuple, Union
import time

import bentoml
from bentoml.io import JSON
from pydantic import BaseModel # bento requires this, PIL, and numpy on its own
import numpy as np
from numpy.typing import NDArray

try:
    # put anything import-sensitive in another module; could use OpenCV, pandas, sklearn, keras, etc.
    from my_other_module_that_requires_these_imports import ...
except ImportError:
    print("WARNING: dependencies not installed. Are you runnin 'bentoml build'?")


class PredictRequest(BaseModel):
    input: int


class PredictResponse(BaseModel):
    output: Tuple[str, float]


class ExampleRunnable(bentoml.Runnable):
    SUPPORTED_RESOURCES = ("cpu",)
    SUPPORTS_CPU_MULTI_THREADING = True

    @bentoml.Runnable.method(
        batchable=True,
        batch_dim=0,
    )
    def generate_image_overlayed_with_heatmap(
        self,
        input_data: List[int],
    ) -> List[Tuple[str, float]]:
        print("input data", type(input_data))
        print("input data", input_data)
        # return ["image1", 0.11], ["image2", 0.22]
        return [[f"image{i}", 0.11 * i] for i, _ in enumerate(input_data)]


example_runner = bentoml.Runner(
    models=[],
    runnable_class=ExampleRunnable,
    name="example_runner",
    max_latency_ms=100_000,
    max_batch_size=10,
)

svc = bentoml.Service("dummy_service", runners=[example_runner])


@svc.api(
    route="/predict",
    input=JSON(pydantic_model=PredictRequest),
    output=JSON(pydantic_model=PredictResponse),
)
def predict(input: PredictRequest) -> PredictResponse:
    time.sleep(2)
    result = example_runner.run(
        [input.input],
    )
    print("RESULT", result)
    return PredictResponse(
        output=result[0],
    )

This doesn't solve the fact that we are still importing service.py into the user's current PYTHONPATH. The crucial improvement for this RFC is to isolate this from the current PYTHONPATH. Running the build in a subprocess is the first step in solving this issue.
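For illustration, a minimal sketch of the subprocess idea (the helper name, import string, and returned metadata are hypothetical, not BentoML's actual internals): the parent process never imports service.py itself, so its sys.modules stays untouched.

import json
import subprocess
import sys

# Child code: import the user's service and report back only the metadata
# the build step needs, keeping the parent interpreter clean.
_CHILD_CODE = """
import importlib, json, sys
module_name, _, attr = sys.argv[1].partition(":")
svc = getattr(importlib.import_module(module_name), attr)
json.dump({"name": svc.name, "runners": [r.name for r in svc.runners]}, sys.stdout)
"""


def inspect_service(import_str: str = "service:svc") -> dict:
    out = subprocess.run(
        [sys.executable, "-c", _CHILD_CODE, import_str],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(out.stdout)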

phitoduck commented Apr 13, 2023

Oh I see! I'll try to restate:

  1. With option (1), you still do two commands like: bentoml build ... --env=container and bentoml containerize .... But in this case, the output of bentoml build ... would be an intermediary docker image that would have many of the same layers as the image built with bentoml containerize .... And the benefits of this would be:
    1. You wouldn't have to install any dependencies outside of docker to build your bento
    2. The bentoml containerize ... command would run much faster since it would hit the cached layers from bentoml build ...
    3. You would have an intermediary "environment image" which you could use for... what would you use this for? faster, more isolated local development?
  2. With Option (2), you just run something like bentoml build ... --format=image and you get a fully built bento image. The benefits here would be the same as Option (1) minus (1.iii).

Is this correct?

If so, Option (1) seems to have one more advantage over Option (2) so that one sounds good :D

And so the workflow would be:

# pull the model from the registry (assume s3); the dumped location would
aws s3 cp s3://path/to/my/previously-built-pytorch-bento ~/bentoml/models/pytorch-bento

# Build the environment image (has many of the same layers as the final containerized bento)
DOCKER_BUILDKIT=0 bentoml build --env=container --opt="--platform linux/amd64"

# Build the final image (it shouldn't be necessary to explicitly pass a --cache-from parameter, correct?
# Because the "docker build ..." process executed by this command should naturally pick up on shared, cached
# layers from the result of the previous command? I'm assuming the containerized bento image *is* the env
# image plus additional layers, although I'm not sure what those would be.)
DOCKER_BUILDKIT=0 bentoml containerize --tag my-pytorch-bento:latest --opt="--platform linux/amd64"

# push the built bento image to a registry
docker tag my-pytorch-bento:latest my-org/my-pytorch-bento:latest
docker push my-org/my-pytorch-bento:latest

aarnphm (Contributor, Author) commented Apr 14, 2023

  1. With option (1), you still do two commands like: bentoml build ... --env=container and bentoml containerize .... But in this case, the output of bentoml build ... would be an intermediary docker image that would have many of the same layers as the image built with bentoml containerize .... And the benefits of this would be:

Yes this is correct.

  1. You would have an intermediary "environment image" which you could use for... what would you use this for? faster, more isolated local development?

You can think of this environment as part of a multi-stage build. containerize will use this 'pseudo-image' as a stage to build the final bento container image.
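Roughly, in Dockerfile terms (illustrative only; this is not the actual template BentoML generates):

# Stage 1: the environment image produced by `bentoml build --env=container`
FROM python:3.10-slim AS bento-env
COPY requirements.txt .
RUN pip install -r requirements.txt

# Stage 2: `bentoml containerize` layers the built Bento on top of that stage
FROM bento-env AS bento
COPY ./bento /home/bentoml/bento
ENTRYPOINT ["bentoml", "serve", "/home/bentoml/bento"]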

  1. With Option (2), you just run something like bentoml build ... --format=image and you get a fully built bento image. The benefits here would be the same as Option (1) minus (1.iii).

I think Option 2 is more of a QOL improvement: instead of having two commands, you just need to run one.

And so the workflow would be:

# pull the model from the registry (assume s3); the dumped location would
aws s3 cp s3://path/to/my/previously-built-pytorch-bento ~/bentoml/models/pytorch-bento

# Build the environment image (has many of the same layers as the final containerized bento)
DOCKER_BUILDKIT=0 bentoml build --env=container --opt="--platform linux/amd64"

# Build the final image (it shouldn't be necessary to explicitly pass a --cache-from parameter, correct?
# Because the "docker build ..." process executed by this command should naturally pick up on shared, cached
# layers from the result of the previous command? I'm assuming the containerized bento image *is* the env
# image plus additional layers, although I'm not sure what those would be.)
DOCKER_BUILDKIT=0 bentoml containerize --tag my-pytorch-bento:latest --opt="--platform linux/amd64"

# push the built bento image to a registry
docker tag my-pytorch-bento:latest my-org/my-pytorch-bento:latest
docker push my-org/my-pytorch-bento:latest

That workflow makes sense to me. Note that we also support pulling from S3 without needing the aws-cli:

bentoml pull s3://path/to/bento .

You can install it with pip install bentoml[aws]

aarnphm mentioned this issue May 2, 2023
aarnphm (Contributor, Author) commented May 3, 2023
aarnphm commented May 3, 2023

The first iteration of this ticket involves supporting build from a subprocess, which addresses polluting users' sys.modules, in #3814.

--env is a QOL improvement and is currently triaged.
