doc: NVIDIA API Catalog and NIMs #545

Merged · 5 commits · Jul 9, 2024
68 changes: 66 additions & 2 deletions docs/user_guides/configuration-guide.md
@@ -85,9 +85,73 @@ The meaning of the attributes is as follows:

You can use any LLM provider that is supported by LangChain, e.g., `ai21`, `aleph_alpha`, `anthropic`, `anyscale`, `azure`, `cohere`, `huggingface_endpoint`, `huggingface_hub`, `openai`, `self_hosted`, `self_hosted_hugging_face`. Check out the LangChain official documentation for the full list.

```{note}
To use any of the providers, you will need to install additional packages; when you first try to use a configuration with a new provider, you will typically receive an error from LangChain that will instruct you on what packages should be installed.
```
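
For example, a minimal `config.yml` entry for the `openai` provider looks as follows (a sketch; it assumes the `openai` Python package is installed and the `OPENAI_API_KEY` environment variable is set):

```yaml
models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo-instruct
```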

```{important}
While you can technically instantiate any of the LLM providers above, some will work better than others with the NeMo Guardrails toolkit, depending on the capabilities of the model. The toolkit includes prompts that have been optimized for certain types of models (e.g., `openai`, `nemollm`). For others, you can optimize the prompts yourself (see the [LLM Prompts](#llm-prompts) section).
```
#### NIM for LLMs

[NVIDIA NIM](https://docs.nvidia.com/nim/index.html) is a set of easy-to-use microservices designed to accelerate the deployment of generative AI models across the cloud, data center, and workstations.
[NVIDIA NIM for LLMs](https://docs.nvidia.com/nim/large-language-models/latest/introduction.html) brings the power of state-of-the-art LLMs to enterprise applications, providing unmatched natural language processing and understanding capabilities. [Learn more about NIMs](https://developer.nvidia.com/blog/nvidia-nim-offers-optimized-inference-microservices-for-deploying-ai-models-at-scale/).

NeMo Guardrails supports connecting to a NIM as follows:

```yaml
models:
  - type: main
    engine: nim
    model: <MODEL_NAME>
    parameters:
      base_url: <NIM_ENDPOINT_URL>
```

For example, to connect to a locally deployed `meta/llama3-8b-instruct` model on port 8000, use the following model configuration:

```yaml
models:
  - type: main
    engine: nim
    model: meta/llama3-8b-instruct
    parameters:
      base_url: http://localhost:8000/v1
```

```{important}
To use the `nim` LLM provider, you must install the `langchain-nvidia-ai-endpoints` package (`pip install langchain-nvidia-ai-endpoints`).
```
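
To sanity-check a local NIM deployment outside of NeMo Guardrails, you can query it directly through the same package; a minimal sketch, assuming the `meta/llama3-8b-instruct` deployment from the example above:

```python
# Minimal connectivity check against a locally deployed NIM.
from langchain_nvidia_ai_endpoints import ChatNVIDIA

llm = ChatNVIDIA(
    base_url="http://localhost:8000/v1",  # the NIM endpoint from the config above
    model="meta/llama3-8b-instruct",
)
print(llm.invoke("Say hello in one short sentence.").content)
```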


#### NVIDIA AI Endpoints

[NVIDIA AI Endpoints](https://www.nvidia.com/en-us/ai-data-science/foundation-models/) give users easy access to NVIDIA hosted API endpoints for NVIDIA AI Foundation Models like Llama 3, Mixtral 8x7B, Stable Diffusion, etc.
These models, hosted on the [NVIDIA API catalog](https://build.nvidia.com/), are optimized, tested, and hosted on the NVIDIA AI platform, making them fast and easy to evaluate, further customize, and seamlessly run at peak performance on any accelerated stack.

To use an LLM through NVIDIA AI Endpoints, use the following model configuration:

```yaml
models:
  - type: main
    engine: nvidia_ai_endpoints
    model: <MODEL_NAME>
```

For example, to use the `meta/llama3-8b-instruct` model, use the following model configuration:

```yaml
models:
  - type: main
    engine: nvidia_ai_endpoints
    model: meta/llama3-8b-instruct
```

```{important}
To use the `nvidia_ai_endpoints` LLM provider, you must install the `langchain-nvidia-ai-endpoints` package (`pip install langchain-nvidia-ai-endpoints`) and configure a valid `NVIDIA_API_KEY`.
```
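
For example (a sketch; the key value is a placeholder, generate your own from the [NVIDIA API catalog](https://build.nvidia.com/)):

```bash
pip install langchain-nvidia-ai-endpoints
export NVIDIA_API_KEY="nvapi-..."  # replace with your actual API key
```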

For more details, check out this [user guide](./llm/nvidia_ai_endpoints/README.md).

Here's an example configuration for using the `llama3` model with [Ollama](https://ollama.com/):
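
A minimal sketch, assuming Ollama serves the model locally on its default port:

```yaml
models:
  - type: main
    engine: ollama
    model: llama3
    parameters:
      base_url: http://localhost:11434
```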

20 changes: 7 additions & 13 deletions docs/user_guides/llm/nvidia_ai_endpoints/README.md
@@ -1,6 +1,6 @@
# Using LLMs hosted on NVIDIA API Catalog

This guide teaches you how to use NeMo Guardrails with LLMs hosted on NVIDIA API Catalog. It uses the [ABC Bot configuration](../../../../examples/bots/abc) and changes the model to `meta/llama3-70b-instruct`.

## Prerequisites

@@ -12,17 +12,11 @@ Before you begin, ensure you have the following prerequisites in place:
1. Install the `langchain-nvidia-ai-endpoints` package:

   ```bash
   pip install -U --quiet langchain-nvidia-ai-endpoints
   ```


2. An NVIDIA NGC account to access AI Foundation Models. To create a free account, go to the [NVIDIA NGC website](https://ngc.nvidia.com/).

3. An API key from NVIDIA API Catalog:
- Generate an API key by navigating to the AI Foundation Models section on the NVIDIA NGC website, selecting a model with an API endpoint, and generating an API key. You can use this API key for all models available in the NVIDIA API Catalog.
- Export the NVIDIA API key as an environment variable:

```bash
export NVIDIA_API_KEY=$NVIDIA_API_KEY # Replace with your own key
```

@@ -51,13 +45,13 @@ Update the `models` section of the `config.yml` file to the desired model supported…
```yaml
models:
  - type: main
    engine: nvidia_ai_endpoints
    model: meta/llama3-70b-instruct
...
```

## Usage

Load the guardrail configuration:

```python
from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./config")
rails = LLMRails(config)
```

@@ -82,11 +76,11 @@ print(response['content'])
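
Next, send a test message to the bot; a minimal sketch, assuming a question covered by the ABC Bot's sample employee handbook:

```python
response = rails.generate(messages=[{
    "role": "user",
    "content": "How many vacation days do I get?"
}])
print(response['content'])
```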

```
According to the employee handbook, eligible employees are entitled to 20 days of paid vacation per year, accrued monthly.
```

You can see that the bot responds correctly.

## Conclusion

In this guide, you learned how to connect a NeMo Guardrails configuration to an NVIDIA API Catalog LLM model. This guide uses `meta/llama3-70b-instruct`; however, you can connect any other model by following the same steps.