This repository contains the code to run NoxtuaCompliance with vLLM. A Gradio application is included for quick testing via a chat interface.
- Install Docker and Python (tested with version 3.11.2).
- Run vLLM:

  ```bash
  docker run --runtime nvidia --gpus all \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    -p 8000:8000 --ipc=host \
    vllm/vllm-openai:v0.6.6.post1 \
    --model xaynetwork/NoxtuaCompliance \
    --tensor-parallel-size=8 \
    --disable-log-requests \
    --max-model-len 120000 \
    --gpu-memory-utilization 0.95
  ```

  Adjust `--tensor-parallel-size` to the number of available GPUs; it must match the GPU count made available to the Docker container via `--gpus`. A quick way to check the count is shown below.
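  To see how many GPUs are visible before picking that value, a minimal sketch using PyTorch (assuming `torch` with CUDA support is installed):

  ```python
  import torch

  # Number of CUDA devices visible to this process; use this value
  # for --tensor-parallel-size (and for the docker --gpus setting).
  print(torch.cuda.device_count())
  ```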
- Validate the hosted model:

  ```bash
  curl http://0.0.0.0:8000/v1/models
  ```
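  Since vLLM exposes an OpenAI-compatible API, you can also send a test request directly, e.g. with the `openai` Python client (a minimal sketch; the prompt is illustrative, and the placeholder API key is a vLLM convention):

  ```python
  from openai import OpenAI

  # vLLM's OpenAI-compatible server; no real API key is required.
  client = OpenAI(base_url="http://0.0.0.0:8000/v1", api_key="EMPTY")

  response = client.chat.completions.create(
      model="xaynetwork/NoxtuaCompliance",
      messages=[{"role": "user", "content": "Hello!"}],  # illustrative prompt
  )
  print(response.choices[0].message.content)
  ```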
- Install the Python requirements and start the Gradio application:

  ```bash
  pip install -r requirements.txt
  python app.py
  ```
This starts the Gradio chat application on localhost at the configured port 8020. Open the displayed link in the browser, e.g. "http://0.0.0.0:8020" or "http://localhost:8020".
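For reference, a minimal Gradio chat wired to the vLLM endpoint could look like the sketch below. This is not the repository's `app.py`; the endpoint, model name, and port are taken from the commands above, and everything else is an assumption:

```python
import gradio as gr
from openai import OpenAI

# Endpoint and model as started by the docker command above (assumption).
client = OpenAI(base_url="http://0.0.0.0:8000/v1", api_key="EMPTY")

def respond(message, history):
    # Rebuild the conversation from Gradio's (user, assistant) history pairs.
    messages = []
    for user_msg, assistant_msg in history:
        messages.append({"role": "user", "content": user_msg})
        messages.append({"role": "assistant", "content": assistant_msg})
    messages.append({"role": "user", "content": message})
    reply = client.chat.completions.create(
        model="xaynetwork/NoxtuaCompliance", messages=messages
    )
    return reply.choices[0].message.content

gr.ChatInterface(respond).launch(server_port=8020)
```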