
Improve overall user experience of model service #1748

Open
1 of 4 tasks
kyujin-cho opened this issue Nov 29, 2023 · 0 comments
kyujin-cho commented Nov 29, 2023

Main idea

Since the birth of Backend.AI Model Service, the main concern with the feature has been that it is too hard to use for the majority of users who want to serve their own model. To overcome this problem, we decided to add several new features to both Core and WebUI that will potentially enhance the overall experience of the Model Service feature.

  • Core: New "Dry Run" API
    This new API should validate the whole lifecycle of an actual inference session. Its request schema will be identical to that of the model service creation API. The implementation should first read model-definition.yml, create a new inference session accordingly but without binding a routing and endpoint, wait until the model server loads, perform a health check (if one is defined in the model definition), and finally terminate the created session. The API should report progress for the whole flow to the caller via SSE.
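The step-by-step flow above maps naturally onto a stream of SSE progress events, one per lifecycle stage. Below is a minimal sketch of what that event stream could look like; all names here (`dry_run_events`, `STAGES`) are hypothetical illustrations, not actual Backend.AI APIs.

```python
import json
from typing import Iterator

# Hypothetical dry-run lifecycle stages, mirroring the steps described above.
STAGES = [
    "read-model-definition",
    "create-session",
    "wait-for-model-server",
    "health-check",
    "terminate-session",
]


def dry_run_events(model_definition: dict) -> Iterator[str]:
    """Yield SSE-formatted progress events for each dry-run stage.

    A real implementation would perform the actual work at each stage
    (e.g. spawning the inference session); here every stage is simulated
    as succeeding immediately so the event framing is the focus.
    """
    for stage in STAGES:
        # Skip the health check when the model definition does not declare one.
        if stage == "health-check" and not model_definition.get("health_check"):
            continue
        payload = {"stage": stage, "status": "ok"}
        # SSE wire format: "event:" and "data:" lines, blank-line terminated.
        yield f"event: progress\ndata: {json.dumps(payload)}\n\n"


if __name__ == "__main__":
    events = list(dry_run_events({"health_check": {"path": "/health"}}))
    print(len(events))  # 5 stages, health check included
```

Streaming one self-describing event per stage lets the WebUI render live progress and pinpoint exactly which step of the model definition failed, instead of returning a single pass/fail result at the end.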

Alternative ideas

No response

Anything else?

No response

Tasks

  1. comp:client comp:manager size:XL
    kyujin-cho
@kyujin-cho kyujin-cho added the type:feature Add new features label Nov 29, 2023
@achimnol achimnol removed the type:feature Add new features label Oct 18, 2024
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants