
AWS: TensorFlow Serving: "Sending prediction request" does not work #725

Closed
karlschriek opened this issue May 20, 2019 · 5 comments
Labels: doc-sprint (Issues to work on during the Kubeflow Doc Sprint), kind/bug


karlschriek commented May 20, 2019

I'm going through the following guide: https://www.kubeflow.org/docs/components/tfserving_new/

I've managed to get the guide running up to this point: "Sending prediction request directly" (see my issue here: #724)

Running kubectl get svc mnist-service will return the external IP to use. But the line that follows will not work:

curl -X POST -d @input.json http://EXTERNAL_IP:8500/v1/models/mnist:predict

Firstly, the file input.json does not exist and there is no explanation in the guide for how to create it.
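For reference, something like the following could produce a minimal input.json (this is only a sketch; it assumes the standard TF Serving REST request format with an "instances" key and a 28x28 grayscale MNIST input, but the exact shape depends on how the model was exported):

```python
import json

# Build a minimal TF Serving REST request body for an MNIST-style model.
# Assumption: the model expects a batch of 28x28 grayscale images.
image = [[0.0] * 28 for _ in range(28)]  # one blank 28x28 image
payload = {"instances": [image]}

with open("input.json", "w") as f:
    json.dump(payload, f)
```

The resulting file would then be posted with `curl -X POST -d @input.json ...` as in the guide, though a blank image will of course only verify that the endpoint responds, not that predictions are meaningful.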

Secondly (and more importantly), ports 8500 and 8501 do not appear to be open. The only open ports are 9000:30347/TCP and 8000:31058/TCP.

For example
curl http://aba09de577ae811e9bc1c0a77a60ece6-1422674746.us-east-1.elb.amazonaws.com:8000 returns an empty result

curl http://aba09de577ae811e9bc1c0a77a60ece6-1422674746.us-east-1.elb.amazonaws.com:8500 times out

Lastly, since the example supposedly uses an "mnist" model, it would be useful to include instructions on where to find this model and where to place it (for example in S3) in order to use it with this guide.

EDIT

On further inspection, it seems that the pods are not starting either. kubectl describe pods yields the following:

Warning FailedScheduling 40s (x8 over 4m) default-scheduler 0/1 nodes are available: 1 Insufficient nvidia.com/gpu.

I presume this means I need to explicitly add nodes with GPUs to the cluster. Following the steps in https://www.kubeflow.org/docs/aws/deploy/install-kubeflow/ only an m5.2xlarge nodegroup is created. My expectation when working through the guides is that one follows from the other without any unexplained steps. If it is indeed necessary to also create (for example) p3.2xlarge nodegroups, I would expect this to be done explicitly in the installation example, or the guide should show how to add the nodegroups.
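If GPU nodegroups do need to be created manually, something along these lines might work with eksctl (a sketch only; the cluster name, region and node counts here are assumptions and must be adjusted to the actual deployment):

```shell
# Add a GPU nodegroup to an existing EKS cluster.
# Assumptions: cluster is named "my-kubeflow-cluster" in us-east-1.
eksctl create nodegroup \
  --cluster my-kubeflow-cluster \
  --region us-east-1 \
  --name gpu-nodes \
  --node-type p3.2xlarge \
  --nodes 1
```

Whether the NVIDIA device plugin also needs to be installed separately would depend on the Kubeflow/EKS version, so the guide should spell that out too.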

@issue-label-bot

Issue-Label Bot is automatically applying the label kind/bug to this issue, with a confidence of 0.63. Please mark this comment with 👍 or 👎 to give our bot feedback!


@sarahmaddox sarahmaddox added the doc-sprint Issues to work on during the Kubeflow Doc Sprint label Jun 6, 2019
@sarahmaddox
Contributor

Related: #727

@sarahmaddox
Contributor

Note that this issue refers to an AWS deployment. The TensorFlow Serving guide should explain the situation in general terms (for clouds other than AWS) and can give an AWS-specific example where useful.

I'm marking this issue for the doc sprint. It will take some testing to ensure the updates are correct.

@sarahmaddox sarahmaddox changed the title TensorFlow Serving: "Sending prediction request" does not work AWS: TensorFlow Serving: "Sending prediction request" does not work Jan 2, 2020
@Jeffwan
Member

Jeffwan commented Feb 23, 2020

I think this is related to #727, we can close the issue. Device plugin is installed by default in 0.7 and we will make it optional in 1.0 with clear docs. These improvements are tracked in 1.0 doc stories.

We can close this one, feel free to reopen.

/close

@k8s-ci-robot
Contributor

@Jeffwan: Closing this issue.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
