A plugin for the llm CLI that lets you use the text generation models (LLMs) running globally on Cloudflare Workers AI, including models like Llama 3.1, Mistral 7B, Gemma and a number of task-specific fine-tunes.
llm-cloudflare is useful for:
- Using and building with LLMs that may not run efficiently on your local machine (limited GPU, memory, etc.), by having Workers AI run them on a GPU near you instead.
- Validating the performance of a model, or comparing multiple models.
- Experimenting without needing to download models ahead of time.
Prerequisite: You'll need the llm CLI installed first.
Install and set up the plugin:
# Install the plugin from PyPI (llm install wraps pip)
llm install llm-cloudflare
# Provide a valid Workers AI token
# Docs: https://developers.cloudflare.com/workers-ai/get-started/rest-api/#1-get-api-token-and-account-id
llm keys set cloudflare
# Set your Cloudflare account ID
# Docs: https://developers.cloudflare.com/workers-ai/get-started/rest-api/#1-get-api-token-and-account-id
export CLOUDFLARE_ACCOUNT_ID="32charlonghexstringhere"
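Before sending a first prompt, it can help to confirm that the setup steps above took effect. The following is a minimal sketch: `check_account_id` is a hypothetical helper (not part of llm or this plugin) that only checks the environment variable is non-empty, and the `llm plugins` and `llm keys` calls list installed plugins and stored key names respectively.

```shell
# Hypothetical helper (not provided by llm or llm-cloudflare): report whether
# CLOUDFLARE_ACCOUNT_ID is set to a non-empty value in the current shell.
check_account_id() {
  if [ -z "${CLOUDFLARE_ACCOUNT_ID:-}" ]; then
    echo "CLOUDFLARE_ACCOUNT_ID is not set" >&2
    return 1
  fi
  echo "CLOUDFLARE_ACCOUNT_ID is set"
}

check_account_id || echo "Export CLOUDFLARE_ACCOUNT_ID before using the plugin" >&2

# If llm is on the PATH, show installed plugins and stored key names.
if command -v llm >/dev/null 2>&1; then
  llm plugins   # llm-cloudflare should appear in this list
  llm keys      # "cloudflare" should appear among the key names
fi
```

Note that `export` only affects the current shell session; adding the line to your shell profile makes it persistent.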
Use it by specifying a Workers AI model:
llm -m "@cf/meta/llama-3.1-8b-instruct" "Write a Cloudflare Worker in ESM format that returns an empty JSON object as a response. Show only the code."
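If you call the same model repeatedly, a small shell wrapper saves retyping the model ID. This is a sketch only: `cf_llm` and the `CF_MODEL` variable are hypothetical names introduced here, not something the plugin provides. Because llm reads piped standard input and combines it with the prompt argument, the wrapper also works at the end of a pipeline.

```shell
# Hypothetical wrapper (not provided by llm or this plugin): send a prompt to
# one Workers AI model. Override the model by setting CF_MODEL.
cf_llm() {
  llm -m "${CF_MODEL:-@cf/meta/llama-3.1-8b-instruct}" "$@"
}

# Usage:
#   cf_llm "Explain what a Cloudflare Worker is in one sentence."
#   cat worker.js | cf_llm "Summarize what this code does."
```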
You can set a Workers AI model as the default model in llm:
# Set Llama 3.1 8B as the default
llm models default "@cf/meta/llama-3.1-8b-instruct"
# See what model is set as the default
llm models default
# @cf/meta/llama-3.1-8b-instruct
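The default only applies when `-m` is omitted, so you can still pass `-m` to try other models against the same prompt. A rough sketch of such a comparison follows; `compare_models` is a hypothetical helper, and the Mistral model ID in the usage comment is an assumption, so check the output of llm models for the IDs your installed plugin version actually supports.

```shell
# Hypothetical helper: run one prompt against several models in turn and
# print each reply under a header naming the model.
# Usage: compare_models "prompt" model1 [model2 ...]
compare_models() {
  local prompt="$1"
  shift
  local model
  for model in "$@"; do
    echo "=== $model ==="
    llm -m "$model" "$prompt"
    echo
  done
}

# Example (second model ID is an assumption, verify it with `llm models`):
#   compare_models "Explain HTTP caching in two sentences." \
#     "@cf/meta/llama-3.1-8b-instruct" \
#     "@cf/mistral/mistral-7b-instruct-v0.1"
```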
This plugin provides access to the text generation models (LLMs) provided by Workers AI.
To see what models are available, invoke llm models. Models prefixed with "Cloudflare Workers AI" are provided by this plugin.
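When several plugins are installed, the full list can be long. One way to narrow it down is to filter on the prefix; `list_cf_models` below is a hypothetical helper that assumes the prefix text appears verbatim on each of the plugin's lines.

```shell
# Hypothetical helper: show only the models contributed by this plugin by
# filtering the `llm models` output on the fixed prefix string.
list_cf_models() {
  llm models | grep -F "Cloudflare Workers AI"
}

# Usage:
#   list_cf_models
```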
The list of supported models is generated by scripts, so newly released Workers AI models become available only after this plugin is updated.
In the future, this plugin may also add support for Workers AI's embedding models for use with llm embed.
Credit to @hex for https://github.com/hex/llm-perplexity, which heavily inspired the design of this plugin.
Copyright Cloudflare, Inc (2024). Apache-2.0 licensed. See the LICENSE file for details.