rfc: build improvement #3580
Comments
Awesome, looking forward to not having to install packages twice in a CI/CD env!
@aarnphm thanks for writing this up. I think option 1a is ideal, if bentoml can take a Dockerfile as input or use the Dockerfile specified in the bento config to do it. It would be tedious to need a running container to start the build process. Another reason why option 1a seems reasonable is that it seems to fit better into the workflow of creating custom deployment containers (like the sagemaker workflow). Do you know roughly how much time it would take to implement something like this?
Sorry, but I don't understand this. 1a requires a container runtime in order to build it. I don't think providing an additional Dockerfile would be necessary. Not sure if I understand what you mean by "sagemaker workflow"? I believe bentoctl would help with this (which is not relevant to this issue).
Sorry, I think I misunderstood what you were saying originally. Overall I think 1a is a good way to do it if it results in a bento on the machine that runs `bentoml build`.
We will need to do some refactoring in the logic of our containerization steps, mostly at https://github.com/bentoml/BentoML/blob/main/src/bentoml/_internal/container/frontend/dockerfile/templates/base.j2

Secondly, we will need to implement a Container env_manager, which lives under https://github.com/bentoml/BentoML/blob/main/src/bentoml/_internal/env_manager/__init__.py, using our container SDK at https://github.com/bentoml/BentoML/tree/main/src/bentoml/_internal/container

Thirdly, we will need to figure out how to support BuildKit and non-BuildKit environments in terms of different caching strategies.
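For reference on that last point, a rough sketch of how the two caching strategies differ at the plain Docker CLI level (image names and the registry here are placeholders, not anything BentoML-specific):
# BuildKit: the build cache itself can be pushed to and pulled from a registry
# (registry cache export generally requires a docker-container buildx builder)
docker buildx build \
  --cache-from type=registry,ref=registry.example.com/bento-env:buildcache \
  --cache-to type=registry,ref=registry.example.com/bento-env:buildcache,mode=max \
  -t bento-env:latest .
# Legacy (non-BuildKit) builder: layer caching only works from an image that is
# already present locally, so the previously built env image has to be pulled first
docker pull registry.example.com/bento-env:latest || true
DOCKER_BUILDKIT=0 docker build \
  --cache-from registry.example.com/bento-env:latest \
  -t bento-env:latest .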
@aarnphm Can you help me get a sense of what the interface would be in each case? Let's say I had model weights in a registry somewhere (S3, MLflow, maybe Yatai? I'm less familiar with that). What would my CI workflow be, using these methods, to fetch a set of weights, install the dependencies used to work with them, and then build the service? Is this what you're imagining?

# Assume that the service.py references a model called "pytorch-bento"
# pull the model from the registry (assume s3); the dumped location would be ~/bentoml/models/pytorch-bento
aws s3 cp s3://path/to/my/previously-built-pytorch-bento ~/bentoml/models/pytorch-bento
# Option #1: build the bento
DOCKER_BUILDKIT=0 bentoml build --env=container --opt="--tag my-bento:latest" --opt="--platform linux/amd64"
# Option #2: if the bento team decided to support "build directly to a container"
DOCKER_BUILDKIT=0 bentoml build --format=docker --opt="--tag my-bento:latest" --opt="--platform linux/amd64"
# push the built bento to a registry
docker tag my-bento:latest my-org/my-bento:latest
docker push my-org/my-bento:latest
Also, I haven't tried this, but could a workaround be to simply wrap all of the import statements (except the ones bentoml itself requires) in a try/except? You might be able to get away with only doing something like this:

# bentofile.yaml
service: "service.py:svc"
labels:
  owner: bentoml-team
  project: gallery
include:
  - "*.py"
python:
  packages:
    - scikit-learn
    - pandas

# service.py
from typing import Any, List, Tuple, Union
import time

import bentoml
from bentoml.io import JSON
from pydantic import BaseModel  # bento requires this, PIL, and numpy on its own
import numpy as np
from numpy.typing import NDArray

try:
    # put anything import-sensitive in another module; could use OpenCV, pandas, sklearn, keras, etc.
    from my_other_module_that_requires_these_imports import ...
except ImportError:
    print("WARNING: dependencies not installed. Are you running 'bentoml build'?")

class PredictRequest(BaseModel):
    input: int

class PredictResponse(BaseModel):
    output: Tuple[str, float]

class ExampleRunnable(bentoml.Runnable):
    SUPPORTED_RESOURCES = ("cpu",)
    SUPPORTS_CPU_MULTI_THREADING = True

    @bentoml.Runnable.method(
        batchable=True,
        batch_dim=0,
    )
    def generate_image_overlayed_with_heatmap(
        self,
        input_data: List[int],
    ) -> List[Tuple[str, float]]:
        print("input data", type(input_data))
        print("input data", input_data)
        # return ["image1", 0.11], ["image2", 0.22]
        return [(f"image{i}", 0.11 * i) for i, _ in enumerate(input_data)]

example_runner = bentoml.Runner(
    models=[],
    runnable_class=ExampleRunnable,
    name="example_runner",
    max_latency_ms=100_000,
    max_batch_size=10,
)

svc = bentoml.Service("dummy_service", runners=[example_runner])

@svc.api(
    route="/predict",
    input=JSON(pydantic_model=PredictRequest),
    output=JSON(pydantic_model=PredictResponse),
)
def predict(input: PredictRequest) -> PredictResponse:
    time.sleep(2)
    result = example_runner.run(
        [input.input],
    )
    print("RESULT", result)
    return PredictResponse(
        output=result[0],
    )

This approach may only get you so far. If you need to instantiate a custom logger, or anything else in the global scope of the file, you'd be back to having to install everything twice.
Sorry for the late reply @phitoduck, was busy the last few days. For Option 1, the environment will result in a container. You can then push the env container as a cache to a registry somewhere, and reference it when building the bento:

flowchart TD
A[aws s3 cp ...] -->|pull| B(pytorch-bento)
B --> C{bentoml build --env=container pytorch-bento}
D[(cached env container)] --> |--env-container-opt ='--cache-from=...'| C
C --> E[bento]
For Option 2, you can think of it as the combination of the current `build` and `containerize` commands.
This doesn't solve the fact that we are still importing the `service.py` module during the build.
Oh I see! I'll try to restate:
Is this correct? If so, Option (1) seems to have one more advantage over Option (2), so that one sounds good :D And so the workflow would be:

# pull the model from the registry (assume s3); the dumped location would be ~/bentoml/models/pytorch-bento
aws s3 cp s3://path/to/my/previously-built-pytorch-bento ~/bentoml/models/pytorch-bento
# Build the environment image (has many of the same layers as the final containerized bento)
DOCKER_BUILDKIT=0 bentoml build --env=container --opt="--platform linux/amd64"
# Build the final image (it shouldn't be necessary to explicitly pass a --cache-from parameter, correct?
# Because the "docker build ..." process executed by this command should naturally pick up on shared, cached
# layers from the result of the previous command? (I'm assuming the containerized bento image *is* the env
# image plus additional layers, although I'm not sure what those would be)
DOCKER_BUILDKIT=0 bentoml containerize --tag my-pytorch-bento:latest --opt="--platform linux/amd64"
# push the built bento to a registry
docker tag my-pytorch-bento:latest my-org/my-pytorch-bento:latest
docker push my-org/my-pytorch-bento:latest
Yes this is correct.
You can think of this environment as a stage in a multi-stage build.
I think Option 2 is more of a QoL improvement: instead of having two commands, you just need to run one.
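To make the comparison concrete, a rough sketch of the two shapes of CI step; the flags and the pytorch-bento name are the proposals and examples from this thread, not existing CLI options:
# Option 1 (proposed): two commands -- build inside the env container, then containerize
DOCKER_BUILDKIT=0 bentoml build --env=container --opt="--platform linux/amd64"
bentoml containerize pytorch-bento:latest
# Option 2 (proposed): one command -- build resolves directly to a container image
DOCKER_BUILDKIT=0 bentoml build --format=container --opt="--tag my-org/pytorch-bento:latest"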
That workflow makes sense to me. Note that we also support bentoml pull s3://path/to/bento. You can install it with
The first iteration of this ticket involves supporting build from the subprocess, which addresses polluting users'
Problem statement
From community reports and internal discussion, `bentoml build` currently has the following caveats:

- it imports users' `service.py` as a module, which means all of the code in `service.py` will be invoked during the build, including every dependency imported in `service.py`. This is not ideal, as it requires users to have all the dependencies installed to build, which might not always be available in the CI/CD environment. The current workaround we have seen from the community is that users set up their environment beforehand to run the build on CI. This means users end up installing dependencies twice: once for the build and once during containerize.
- `bentoml build` creates a bento that includes a Dockerfile, which will be used by `containerize` to package a Bento container. Oftentimes for CI, the desired behaviour is that build should be able to resolve to the container directly.
Proposed solutions
1. Not importing `service.py` during build

a. Using `--env`

The `--env` argument to `serve` allows serving within a conda environment. We should also be able to extend this to container, virtualenv, mamba, and so on. `--env` can also be used during `build`, which will build the given bento with the specified environment. The behaviour is as follows:

- the environment is set up from `bentofile.yaml` (PyPI, Conda, system packages, setup script)
- `--output=type=local,dest=/path/to/bentoml_home/bentos/bento_name/version` is used to copy the built Bento to the local machine, or `cp` instead (see the sketch at the end of this section)
- the resulting Bento can then be used with `containerize` and `serve` directly

b. Not using `--env`

If users wish not to use `--env`, then to solve this issue we will need to extract the Service object from `service.py` without actually importing the file.

2. `bentoml build` directly to a container

For CI, what we can support is that `build` can also do `containerize` directly via the flag `--format=container`. By default, build will still create a Bento (`--format=bento`).

Would love to hear more feedback and comments on this.
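As a rough illustration of the `--output=type=local` vs `cp` step under option 1a, at the plain Docker level (the stage name, in-image path, and bento store path are placeholders, not BentoML's actual layout):
# With BuildKit: the local exporter writes the build output straight into the bento store
docker buildx build \
  --target bento-stage \
  --output type=local,dest="$HOME/bentoml/bentos/pytorch-bento/latest" \
  .
# Without BuildKit: build an image, create a container from it, and cp the Bento out
DOCKER_BUILDKIT=0 docker build -t pytorch-bento-env .
cid=$(docker create pytorch-bento-env)
docker cp "$cid":/home/bentoml/bento "$HOME/bentoml/bentos/pytorch-bento/latest"
docker rm "$cid"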
Additional context
#3577 suggests that we should refactor the containerization steps so that the environment setup is cached, and move the model copying to later steps. This falls into the `--env container` proposal, where all of the dependencies are set up inside the container once.