Nitpicky items in the docs (#18)
* Nitpicky items in the docs

Signed-off-by: Mike McKiernan <[email protected]>

* Clarify registry reqs

I tried to imitate the phrasing I found
when I accessed the following address in
an incognito window:

https://catalog.ngc.nvidia.com/orgs/nim/teams/meta/containers/llama-3.1-8b-instruct/tags

This is the text:

This page requires an active subscription to an NVIDIA AI Enterprise product

Because the repository has a developer audience, I included the address
for signing up.

Signed-off-by: Mike McKiernan <[email protected]>

---------

Signed-off-by: Mike McKiernan <[email protected]>
mikemckiernan authored Aug 5, 2024
1 parent 0c7b037 commit e707787
Showing 3 changed files with 63 additions and 36 deletions.
5 changes: 5 additions & 0 deletions docs/install.md
@@ -1,3 +1,8 @@
+<!--
+SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+SPDX-License-Identifier: Apache-2.0
+-->
+
 # Installing NIM Operator for Kubernetes using Helm
 
 ### Pre-requisites
49 changes: 31 additions & 18 deletions docs/nimcache.md
@@ -1,22 +1,31 @@
-# Caching NIM models
-Follow these steps to cache NIM models into a persistent storage (PVC)
+<!--
+SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+SPDX-License-Identifier: Apache-2.0
+-->
 
-### Pre-requisites
+# Caching NIM Models
 
-* NVIDIA GPU Operator have to be installed
-* NVIDIA NIM Operator for K8s have to be installed
-* Access to following NGC repositories required
-  - nvcr.io/nvstaging/cloud-native
-  - nvcr.io/nvidian/nim-llm-dev
-* Local Path Provisioner for creating a Persistent Volume (PV)
+Follow these steps to cache NIM models in a persistent volume.
 
-### 1. Create a Namespace for running NIM services
+## Prerequisites
+
+* NVIDIA GPU Operator is installed.
+* NVIDIA NIM Operator is installed.
+* You must have an active subscription to an NVIDIA AI Enterprise product or be an NVIDIA Developer Program
+  [member](https://build.nvidia.com/explore/discover?integrate_nim=true&developer_enroll=true&self_hosted_api=true&signin=true).
+  Access to the containers and models for NVIDIA NIM microservices is restricted.
+
+* A persistent volume provisioner is installed.
+
+  The Local Path Provisioner from Rancher is acceptable for development on a single-node cluster.
+
+## 1. Create a Namespace for Running NIM Microservices
 
 ```sh
 kubectl create ns nim-service
 ```
 
-### 2. Create an Image Pull Secret for the NIM container
+### 2. Create an Image Pull Secret for the NIM Container
 
 Replace <ngc-cli-api-key> with your NGC CLI API key.
 
@@ -27,12 +36,13 @@ kubectl create secret -n nim-service docker-registry ngc-secret \
   --docker-password=<ngc-cli-api-key>
 ```
 
-### 3. Create the `NIMCache` instance with auto-selection of models enabled
+## 3. Create the NIM Cache Instance and Enable Model Auto-Detection
 
 Update the `NIMCache` custom resource (CR) with appropriate values for model selection.
 These include `model.precision`, `model.engine`, `model.qosProfile`, `model.gpu.product` and `model.gpu.ids`.
-With these, the NIM operator will extract supported profiles and use that for caching.
+With these, the NIM Operator can extract the supported profiles and use that for caching.
 
-Alternatively if `model.profiles` are specified, then that particular model profile will be downloaded.
+Alternatively, if you specify `model.profiles`, then the model puller downloads and caches that particular model profile.
 
 ```yaml
 apiVersion: apps.nvidia.com/v1alpha1
@@ -73,23 +83,26 @@ spec:
 kubectl create -f nimcache.yaml -n nim-service
 ```
 
-### 5. Verify the progress of NIM model caching
+### 5. Verify the Progress of NIM Model Caching
 
+Verify that the NIM Operator has initiated the caching job and track status via the CR.
+
 ```sh
 kubectl get nimcache -n nim-service -o wide
 ```
 
-```console
+```output
 NAME                      STATUS   PVC                           AGE
 meta-llama3-8b-instruct   ready    meta-llama3-8b-instruct-pvc   2024-07-04T23:22:13Z
 ```
 
+Get the NIM cache so you can view the status:
+
 ```sh
 kubectl get nimcache -n nim-service -o yaml
 ```
 
-```console
+```output
 apiVersion: apps.nvidia.com/v1alpha1
 kind: NIMCache
 metadata:
@@ -164,4 +177,4 @@ status:
       tp: "2"
   pvc: meta-llama3-8b-instruct-pvc
   state: ready
-```
+```
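The caching flow documented above (create the `NIMCache`, then check its `status` until `state` reports `ready`) can be scripted. The sketch below is an illustration, not part of the commit: the resource and namespace names are taken from the examples, the `kubectl wait` jsonpath form assumes kubectl 1.23 or later, and because no live cluster is available here, the state-parsing step runs against a saved copy of the `kubectl get nimcache -o yaml` status rather than a real API response.

```shell
# Waiting for the NIMCache from the examples above to become ready.
# Against a live cluster (assumes kubectl >= 1.23 for --for=jsonpath):
#   kubectl wait nimcache/meta-llama3-8b-instruct -n nim-service \
#     --for=jsonpath='{.status.state}'=ready --timeout=30m
#
# The same state check, demonstrated against a saved copy of the status:
status_yaml='status:
  pvc: meta-llama3-8b-instruct-pvc
  state: ready'

# Pull the .status.state value out of the YAML text.
state=$(printf '%s\n' "$status_yaml" | sed -n 's/^  state: *//p')

if [ "$state" = "ready" ]; then
  echo "cache is ready"
else
  echo "cache not ready: state=${state:-unset}"
fi
```

The blocking `kubectl wait` form is usually more robust in automation than polling `kubectl get` in a loop, since it exits nonzero on timeout.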
45 changes: 27 additions & 18 deletions docs/nimservice.md
@@ -1,13 +1,17 @@
+<!--
+SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+SPDX-License-Identifier: Apache-2.0
+-->
+
 # Create a NIM Service
 
-### Pre-requisites
+## Prerequisites
 
-* Create a namespace e.g. `nim-service`
-* Create a `NIMCache` instance in the namespace `nim-service` following the guide [here](https://gitlab-master.nvidia.com/dl/container-dev/k8s-nim-operator/-/blob/51e9727929b16982a2dba6d7fccbd0474f566bf8/docs/nimcache.md).
+* A `NIMCache` instance in the namespace `nim-service`.
 
-### 1. Create the CR for NIMService
+## 1. Create the NIM Service Instance
 
-nimservice.yaml:
+Create a file, such as `nimservice.yaml`, with contents like the following example:
 
 ```yaml
 apiVersion: apps.nvidia.com/v1alpha1
@@ -39,42 +43,43 @@ spec:
     openaiPort: 8000
 ```
+Apply the manifest:
 ```sh
 kubectl create -f nimservice.yaml -n nim-service
 ```
 
-### 2. Check the status of NIMService deployment
+### 2. Check the Status of NIM Service Deployment
 
 ```sh
 kubectl get nimservice -n nim-service
 ```
 
-```console
-kubectl get nimservice -n nim-service
-NAME                             STATUS   AGE
-meta-llama3-8b-instruct-latest   ready    115m
+```output
+NAME                      STATUS   AGE
+meta-llama3-8b-instruct   Ready    115m
 ```
 
 ```sh
 kubectl get pods -n nim-service
 ```
 
-```console
-NAME                                             READY   STATUS      RESTARTS   AGE
-meta-llama3-8b-instruct-latest-db9d899fd-mfmq2   1/1     Running     0          108m
-meta-llama3-8b-instruct-latest-job-xktnk         0/1     Completed   0          4m38s
+```output
+NAME                                      READY   STATUS      RESTARTS   AGE
+meta-llama3-8b-instruct-db9d899fd-mfmq2   1/1     Running     0          108m
+meta-llama3-8b-instruct-job-xktnk         0/1     Completed   0          4m38s
 ```
 
-### 3. Verify with a sample pod
+### 3. Verify the Microservice is Running
 
-test-pod.yaml:
+Create a file, `verify-pod.yaml`, with contents like the following example:
 
 ```yaml
 ---
 apiVersion: v1
 kind: Pod
 metadata:
-  name: test-streaming-chat
+  name: verify-streaming-chat
 spec:
   containers:
   - name: curl
@@ -118,10 +123,14 @@ spec:
   restartPolicy: Never
 ```
 
+Apply the manifest:
 ```sh
 kubectl create -f test-pod.yaml -n nim-service
 ```
 
+Confirm the verification pod ran to completion:
+
 ```sh
 kubectl get pods -n nim-service
 ```
@@ -130,6 +139,6 @@ kubectl get pods -n nim-service
 NAME                                             READY   STATUS      RESTARTS   AGE
 meta-llama3-8b-instruct-latest-db9d899fd-mfmq2   1/1     Running     0          112m
 meta-llama3-8b-instruct-latest-job-xktnk         0/1     Completed   0          8m8s
-test-streaming-chat                              0/1     Completed   0          99m
+verify-streaming-chat                            0/1     Completed   0          99m
 ```
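The verification pod above drives the NIM microservice's OpenAI-compatible endpoint with curl. A hand-run equivalent is sketched below, with assumptions clearly marked: the in-cluster service DNS name and port follow from the `NIMService` example (`openaiPort: 8000`), while the model identifier is an illustrative guess based on the cache example, not something this commit states.

```shell
# ASSUMPTIONS: the endpoint host and port are derived from the NIMService
# example above; "meta/llama3-8b-instruct" is an illustrative model id.
endpoint="http://meta-llama3-8b-instruct.nim-service:8000/v1/chat/completions"
payload='{
  "model": "meta/llama3-8b-instruct",
  "messages": [{"role": "user", "content": "Say hello in one short sentence."}],
  "stream": true
}'

# Inside the cluster, the verification pod would effectively run:
#   curl -s "$endpoint" -H "Content-Type: application/json" -d "$payload"
# Here we only confirm the request body is well-formed JSON:
printf '%s' "$payload" | python3 -c 'import json,sys; json.load(sys.stdin); print("payload ok")'
```

With `"stream": true` the endpoint returns server-sent-event chunks, which is why the docs verify completion of a streaming-chat pod rather than a single response body.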
