
Added Groq LLM model support #1447

Closed
Conversation

paresh2806
Contributor

@paresh2806 paresh2806 commented Jul 9, 2024

What problem does this PR solve?

This PR adds support for the Groq LLM (Large Language Model).

Groq is an AI solutions company delivering ultra-low latency inference with the first-ever LPU™ Inference Engine. The Groq API enables developers to integrate state-of-the-art LLMs, such as Llama-2 and llama3-70b-8192, into low latency applications with the request limits specified below. Learn more at groq.com.

| ID                 | Requests per Minute | Requests per Day | Tokens per Minute |
|--------------------|---------------------|------------------|-------------------|
| gemma-7b-it        | 30                  | 14,400           | 15,000            |
| gemma2-9b-it       | 30                  | 14,400           | 15,000            |
| llama3-70b-8192    | 30                  | 14,400           | 6,000             |
| llama3-8b-8192     | 30                  | 14,400           | 30,000            |
| mixtral-8x7b-32768 | 30                  | 14,400           | 5,000             |
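For context, Groq exposes an OpenAI-compatible chat-completions endpoint, so integrating it typically amounts to pointing a standard chat client at Groq's base URL with a Groq API key. A minimal sketch of such a call is below; the `build_payload` helper and the `GROQ_API_KEY` environment variable name are illustrative assumptions, not code from this PR.

```python
import json
import os
import urllib.request

# Groq's OpenAI-compatible chat-completions endpoint.
GROQ_CHAT_URL = "https://api.groq.com/openai/v1/chat/completions"


def build_payload(model: str, prompt: str, temperature: float = 0.7) -> dict:
    """Build an OpenAI-style chat-completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }


def chat(prompt: str, model: str = "llama3-70b-8192") -> str:
    """Send one chat turn to Groq; expects GROQ_API_KEY in the environment."""
    req = urllib.request.Request(
        GROQ_CHAT_URL,
        data=json.dumps(build_payload(model, prompt)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['GROQ_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # Response follows the OpenAI schema: first choice, assistant message.
    return body["choices"][0]["message"]["content"]
```

Because the request/response shape matches OpenAI's, a provider abstraction that already supports OpenAI can often reuse the same client code and swap only the base URL and key.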

Type of change

  • New Feature
  • Other (added Groq LLM model support)

@paresh2806
Contributor Author

@JinHai-CN & @KevinHuSh

Should I add support for Gemini APIs to model providers in the same pull request, or should I create a new pull request for it?

@KevinHuSh
Collaborator

> @JinHai-CN & @KevinHuSh
>
> Should I add support for Gemini APIs to model providers in the same pull request, or should I create a new pull request for it?

Gemini's already supported.

@KevinHuSh
Collaborator

> @JinHai-CN & @KevinHuSh
>
> Should I add support for Gemini APIs to model providers in the same pull request, or should I create a new pull request for it?

Why closed?

@paresh2806
Contributor Author

paresh2806 commented Jul 11, 2024 via email

I was in the middle of merging conflicts.

KevinHuSh added a commit that referenced this pull request Jul 12, 2024
#1432  #1447 
This PR adds support for the GROQ LLM (Large Language Model).

Groq is an AI solutions company delivering ultra-low latency inference
with the first-ever LPU™ Inference Engine. The Groq API enables
developers to integrate state-of-the-art LLMs, such as Llama-2 and
llama3-70b-8192, into low latency applications with the request limits
specified below. Learn more at [groq.com](https://groq.com/).
Supported Models


| ID                 | Requests per Minute | Requests per Day | Tokens per Minute |
|--------------------|---------------------|------------------|-------------------|
| gemma-7b-it        | 30                  | 14,400           | 15,000            |
| gemma2-9b-it       | 30                  | 14,400           | 15,000            |
| llama3-70b-8192    | 30                  | 14,400           | 6,000             |
| llama3-8b-8192     | 30                  | 14,400           | 30,000            |
| mixtral-8x7b-32768 | 30                  | 14,400           | 5,000             |

---------

Co-authored-by: paresh0628 <[email protected]>
Co-authored-by: Kevin Hu <[email protected]>
Halfknow pushed a commit to Halfknow/ragflow that referenced this pull request Nov 11, 2024
infiniflow#1432  infiniflow#1447 