# NoxtuaCompliance

## Get Started

This repository contains the code to run NoxtuaCompliance with vLLM. A Gradio application is included for quick testing via a chat interface.

### Prerequisites

1. Install Docker and Python (tested with version 3.11.2).

2. Run vLLM:

   ```bash
   docker run --runtime nvidia --gpus all \
     -v ~/.cache/huggingface:/root/.cache/huggingface \
     -p 8000:8000 --ipc=host \
     vllm/vllm-openai:v0.6.6.post1 \
     --model xaynetwork/NoxtuaCompliance \
     --tensor-parallel-size=8 \
     --disable-log-requests \
     --max-model-len 120000 \
     --gpu-memory-utilization 0.95
   ```

   Set `--tensor-parallel-size` to the number of available GPUs; it must match the number of GPUs exposed to the container via the `--gpus` flag of the docker command.

3. Validate the hosted model (a Python-based check is also sketched after this list):

   ```bash
   curl http://0.0.0.0:8000/v1/models
   ```
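Since vLLM exposes an OpenAI-compatible API, the hosted model can also be validated from Python. A minimal sketch using the `openai` client, which is an assumption here (install it with `pip install openai`; it is not necessarily listed in this repository's requirements):

```python
from openai import OpenAI

# vLLM does not require a real API key unless started with --api-key,
# so a placeholder value is fine.
client = OpenAI(base_url="http://0.0.0.0:8000/v1", api_key="EMPTY")

# List the hosted models -- equivalent to the curl check above.
for model in client.models.list():
    print(model.id)

# Send a minimal chat completion to confirm the model responds.
response = client.chat.completions.create(
    model="xaynetwork/NoxtuaCompliance",
    messages=[{"role": "user", "content": "Hello, are you up?"}],
    max_tokens=64,
)
print(response.choices[0].message.content)
```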

### Setup

```bash
pip install -r requirements.txt
```

### Gradio Application

```bash
python app.py
```

This command starts the Gradio chat application on localhost under port 8020. Open the displayed link in the browser, e.g. `http://0.0.0.0:8020` or `http://localhost:8020`.
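For reference, here is a minimal sketch of how a Gradio chat front-end wired to the vLLM endpoint can look. This is an illustration under the assumptions above (endpoint on port 8000, chat UI on port 8020), not the actual contents of `app.py`:

```python
import gradio as gr
from openai import OpenAI

# Point the OpenAI client at the local vLLM server started above.
client = OpenAI(base_url="http://0.0.0.0:8000/v1", api_key="EMPTY")

def respond(message, history):
    # history is a list of (user, assistant) pairs in Gradio's classic
    # tuple format; rebuild it as OpenAI-style chat messages.
    messages = []
    for user_msg, assistant_msg in history:
        messages.append({"role": "user", "content": user_msg})
        messages.append({"role": "assistant", "content": assistant_msg})
    messages.append({"role": "user", "content": message})
    completion = client.chat.completions.create(
        model="xaynetwork/NoxtuaCompliance",
        messages=messages,
    )
    return completion.choices[0].message.content

gr.ChatInterface(respond).launch(server_name="0.0.0.0", server_port=8020)
```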