If you are interested in participating in the challenge, please send us an email with the subject "MLOps Challenge"
to [email protected]. Make sure to include your GitHub email/username and attach your CV.
Create a service that deploys five NLP models for inference, receives messages through an exposed POST API endpoint, and returns the inference results of all five models in a single response body. The expected deliverable is a service packaged as a Docker image.
Your service could be a well-configured framework or a self-made API server; use any ML model deployment tool you see fit. There is no language limitation. The most important thing here is the reusability of the final project.
- Create a dev branch
- Submit your solution
- Create a PR
- Wait for the test results
Once you have collaborator access to the repository, please create a separate branch for your submission. When you think your submission is ready, please create a pull request and assign @rsolovev and @darknessest as reviewers. We will check your submission, run tests, and respond with benchmark results and possibly some comments.
Please work on your solution for the challenge inside the solution folder.
If you need to add env vars to the container, update the values in the Helm chart. To do that, please use solution/helm/envs/*.yaml.
Don't forget to update the env vars in autotests/helm/values.yaml, i.e., PARTICIPANT_NAME and api_host, to make sure that the auto-tests are executed properly.
For this challenge, you must use the following models. Model performance optimization is not allowed.
- https://huggingface.co/cardiffnlp/twitter-xlm-roberta-base-sentiment
- https://huggingface.co/ivanlau/language-detection-fine-tuned-on-xlm-roberta-base
- https://huggingface.co/svalabs/twitter-xlm-roberta-crypto-spam
- https://huggingface.co/EIStakovskii/xlm_roberta_base_multilingual_toxicity_classifier_plus
- https://huggingface.co/jy46604790/Fake-News-Bert-Detect
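The response format described later in this document keys each result by what appears to be the owner segment of the model URL. As a minimal sketch of that relationship, the snippet below derives a key-to-model-ID table from the URLs listed above. The names `MODEL_URLS`, `model_id_from_url`, and `MODELS_BY_KEY` are hypothetical and only serve as an illustration.

```python
from urllib.parse import urlparse

# Model URLs exactly as listed above.
MODEL_URLS = [
    "https://huggingface.co/cardiffnlp/twitter-xlm-roberta-base-sentiment",
    "https://huggingface.co/ivanlau/language-detection-fine-tuned-on-xlm-roberta-base",
    "https://huggingface.co/svalabs/twitter-xlm-roberta-crypto-spam",
    "https://huggingface.co/EIStakovskii/xlm_roberta_base_multilingual_toxicity_classifier_plus",
    "https://huggingface.co/jy46604790/Fake-News-Bert-Detect",
]


def model_id_from_url(url: str) -> str:
    """Return the "owner/name" model ID from a Hugging Face model URL."""
    return urlparse(url).path.strip("/")


# Map the owner segment (assumed to be the response key) to the full model ID.
MODELS_BY_KEY = {
    model_id_from_url(url).split("/")[0]: model_id_from_url(url)
    for url in MODEL_URLS
}
```

A table like this keeps the model list in one place, so the loading code and the response-building code cannot drift apart.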
Your submission will be deployed on a g4dn.2xlarge instance (see AWS specs), so please bear in mind the hardware limitations when developing your service.
The body of the inference request contains only a text string:
curl --request POST \
--url http://localhost:8000/process \
--header 'Content-Type: application/json' \
--data '"This is how true happiness looks like 👍😜"'
You can also find an example of such a request in autotests/app/src/main.js.
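The same request can also be built from Python. The following is a minimal sketch using only the standard library; the helper name `build_request` is hypothetical, and the default URL is the one from the curl example. Note that the body is a bare JSON string, not a JSON object.

```python
import json
import urllib.request


def build_request(
    text: str, url: str = "http://localhost:8000/process"
) -> urllib.request.Request:
    """Build a POST request whose body is the text encoded as a bare JSON string."""
    # ensure_ascii=False keeps emoji as-is; the body is then encoded as UTF-8.
    body = json.dumps(text, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` sends the request, assuming the service is running at that address.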
Your service should respond in the following format. You can also find an example of the expected response in autotests/app/src/main.js.
{
  "cardiffnlp": {
    "score": 0.2, // float
    "label": "POSITIVE" // "NEGATIVE", or "NEUTRAL"
  },
  "ivanlau": {
    "score": 0.2, // float
    "label": "English" // string, a language
  },
  "svalabs": {
    "score": 0.2, // float
    "label": "SPAM" // or "HAM"
  },
  "EIStakovskii": {
    "score": 0.2, // float
    "label": "LABEL_0" // or "LABEL_1"
  },
  "jy46604790": {
    "score": 0.2, // float
    "label": "LABEL_0" // or "LABEL_1"
  }
}
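One way to assemble and sanity-check a response of this shape is sketched below. The helper name `build_response` and the `(label, score)` input tuples are assumptions made for illustration; only the five top-level keys and the `score`/`label` fields come from the format above.

```python
from typing import Dict, Mapping, Tuple

# Top-level keys of the expected response, as listed in the format above.
EXPECTED_KEYS = ("cardiffnlp", "ivanlau", "svalabs", "EIStakovskii", "jy46604790")


def build_response(
    results: Mapping[str, Tuple[str, float]]
) -> Dict[str, Dict[str, object]]:
    """Turn {model_key: (label, score)} into the expected response body."""
    missing = [key for key in EXPECTED_KEYS if key not in results]
    if missing:
        # Fail loudly rather than return a partial response.
        raise ValueError(f"missing results for: {', '.join(missing)}")
    response: Dict[str, Dict[str, object]] = {}
    for key in EXPECTED_KEYS:
        label, score = results[key]
        response[key] = {"score": float(score), "label": str(label)}
    return response
```

Checking for missing keys before serializing means a model that failed to produce a result surfaces as an error rather than as a silently incomplete body.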
- Performance is of paramount importance here, specifically throughput; it will be the determining factor in choosing the winner.
- Think about error handling.
- We will be stress-testing your code.
- Consider the scalability and reusability of your service.
- Focus on the application.
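As one illustration of the error-handling tip, the sketch below validates a raw request body before it reaches the models. The names `BadRequest` and `parse_body` are hypothetical, and the specific rules (in particular rejecting blank text) are assumptions rather than stated requirements of the challenge.

```python
import json


class BadRequest(ValueError):
    """Raised when the request body is not usable for inference (hypothetical)."""


def parse_body(raw: bytes) -> str:
    """Decode a request body that should be a single JSON string."""
    try:
        payload = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise BadRequest("body must be valid UTF-8 encoded JSON") from exc
    if not isinstance(payload, str):
        raise BadRequest("body must be a JSON string")
    if not payload.strip():
        # Rejecting blank input is an assumption, not a stated requirement.
        raise BadRequest("text must not be empty")
    return payload
```

An API layer could catch `BadRequest` and map it to a 4xx response, so that malformed input during stress testing does not surface as a server error.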