Added Groq LLM model support #1447
Conversation
Should I add support for Gemini APIs to model providers in the same pull request, or should I create a new pull request for it?
Gemini's already supported.
Why closed?
Apologies from my side.
I thought there were conflicts on my end. I am new to open-source contributions.
#1432 #1447 This PR adds support for the Groq LLM (Large Language Model). Groq is an AI solutions company delivering ultra-low-latency inference with the first-ever LPU™ Inference Engine. The Groq API enables developers to integrate state-of-the-art LLMs, such as Llama-2 and llama3-70b-8192, into low-latency applications with the request limits specified below. Learn more at [groq.com](https://groq.com/).

Supported Models

| ID                 | Requests per Minute | Requests per Day | Tokens per Minute |
|--------------------|---------------------|------------------|-------------------|
| gemma-7b-it        | 30                  | 14,400           | 15,000            |
| gemma2-9b-it       | 30                  | 14,400           | 15,000            |
| llama3-70b-8192    | 30                  | 14,400           | 6,000             |
| llama3-8b-8192     | 30                  | 14,400           | 30,000            |
| mixtral-8x7b-32768 | 30                  | 14,400           | 5,000             |

Co-authored-by: paresh0628 <[email protected]>
Co-authored-by: Kevin Hu <[email protected]>
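Since the free-tier limits in the table above cap each model at 30 requests per minute, a caller may want to throttle client-side rather than wait for HTTP 429 responses. Below is a minimal sketch of a sliding-window limiter; the class name and structure are illustrative assumptions, not part of this PR or any Groq SDK.

```python
import collections
import time


class MinuteRateLimiter:
    """Sliding-window throttle for a per-minute request quota (illustrative)."""

    def __init__(self, max_requests: int = 30, window_s: float = 60.0):
        self.max_requests = max_requests
        self.window_s = window_s
        self.timestamps = collections.deque()  # monotonic times of recent requests

    def acquire(self) -> None:
        """Block until a request slot is free within the sliding window."""
        now = time.monotonic()
        # Drop timestamps that have aged out of the window.
        while self.timestamps and now - self.timestamps[0] >= self.window_s:
            self.timestamps.popleft()
        if len(self.timestamps) >= self.max_requests:
            # Sleep until the oldest request exits the window, then retry.
            time.sleep(self.window_s - (now - self.timestamps[0]))
            return self.acquire()
        self.timestamps.append(time.monotonic())


# Call limiter.acquire() before each Groq API request.
limiter = MinuteRateLimiter(max_requests=30)
```

In practice one would also honor any `Retry-After` header the server returns, since the server's view of the quota is authoritative.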
What problem does this PR solve?
This PR adds support for the Groq LLM (Large Language Model).
Groq is an AI solutions company delivering ultra-low-latency inference with the first-ever LPU™ Inference Engine. The Groq API enables developers to integrate state-of-the-art LLMs, such as Llama-2 and llama3-70b-8192, into low-latency applications with the request limits specified below. Learn more at groq.com.
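As a rough illustration of what integrating the Groq API involves: Groq exposes an OpenAI-compatible chat-completions endpoint. The sketch below builds (but does not send) such a request with only the standard library; the endpoint URL and model ID follow Groq's public documentation, while the API key is a placeholder and the helper function is an assumption for this example, not code from this PR.

```python
import json
import urllib.request

# Groq's OpenAI-compatible chat-completions endpoint (per Groq's public docs).
GROQ_API_URL = "https://api.groq.com/openai/v1/chat/completions"


def build_chat_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build an HTTP request for a Groq chat completion (illustrative helper)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        GROQ_API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # placeholder key goes here
            "Content-Type": "application/json",
        },
        method="POST",
    )


# One of the models from the Supported Models table.
req = build_chat_request("llama3-70b-8192", "Hello", "YOUR_API_KEY")
# Sending it would be: urllib.request.urlopen(req)
```

Because the endpoint is OpenAI-compatible, existing OpenAI client code can typically be pointed at it by swapping the base URL and key.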
Type of change