The generated python code contains redundancies "]" #575
Comments
Hi, @willwu1984, I am trying to reproduce the issue. However, I'm unsure about the position where the completion is triggered.
The ghost text screenshot:
Could you please provide the version information? You can obtain it by executing the following command:
Based on the logs, it appears that the version in use is not the latest release (v0.3.0). It is advisable to test with the latest release (tabbyml/tabby:0.3.0) to check whether the issue still persists.
I have upgraded the image, but the problem still exists.
The server log:
It seems that after updating the server (beginning at line 56 of the server log), the two completion requests returned empty responses. The screenshot may show the cached completion on the client side. @wsxiaoys
Response:
Server state:
Others:
The model simply generates the wrong output. Surprisingly, this only happens with the ctranslate2 inference engine, though. This isn't something that can be fixed immediately on Tabby's side. However, over the long run, we might consider implementing grammar-constrained sampling (ggerganov/llama.cpp#1773) to eliminate cases like this.
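As a rough illustration of the idea (a toy sketch, not Tabby's or llama.cpp's actual implementation), grammar-constrained sampling masks out any candidate token that would violate a grammar before sampling. The sketch below enforces only a bracket-balance rule, which is exactly the kind of constraint that would forbid a stray `]`:

```python
import random

def allowed(token, open_brackets):
    # Toy "grammar" rule: a closing bracket is legal only if one is open.
    if token == "]":
        return open_brackets > 0
    return True

def constrained_sample(candidates, open_brackets):
    # Drop candidates the grammar forbids, then sample from what remains.
    legal = [(tok, w) for tok, w in candidates if allowed(tok, open_brackets)]
    tokens, weights = zip(*legal)
    return random.choices(tokens, weights=weights)[0]

# The model strongly prefers "]" even though no bracket is open;
# the grammar constraint forces it to pick something else.
candidates = [("]", 0.9), ("\n", 0.05), ("return", 0.05)]
print(constrained_sample(candidates, open_brackets=0))
```

A real implementation (as in the linked llama.cpp PR) applies such a mask to the model's logits at every decoding step, using a full grammar rather than a single bracket rule.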
@wsxiaoys How can I configure Tabby to use ggml models?
The ggml (llama.cpp) inference engine is currently exclusive to the Metal backend. For more detailed information, please visit https://github.com/TabbyML/tabby/blob/main/MODEL_SPEC.md. Are there more bad cases generated solely from CodeLlama-7B?
So far we've found that most code containing arrays has problems, and the issue isn't limited to Python; JavaScript behaves the same way. By the way, llama.cpp also supports the CUDA environment. Is it possible to add a configuration option to use it?
If this duplication occurs in more than just this simple case, I would say it's likely a bug - Let us investigate it further and get back to you. In the meantime, if you come across such cases, please consider posting a screenshot or log record to this thread. It would be very helpful for us to debug and pinpoint the issue. Thank you!
In 0.5.0 we've fully switched to gguf for CUDA - this should fix the issue. Hi @willwu1984, could you test it?
@wsxiaoys This issue has been fixed using version v0.5.4. Great, thank you! |
Describe the bug
The generated Python code has syntax errors and contains redundant "]" characters.
# sort array by bubble sort
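For reference, a minimal sketch of the kind of completion one would expect for that prompt (a hypothetical correct implementation, not Tabby's actual output); the reported suggestion instead contained stray `]` tokens:

```python
# sort array by bubble sort
def bubble_sort(arr):
    n = len(arr)
    for i in range(n):
        for j in range(n - 1 - i):
            # Swap adjacent elements that are out of order.
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
    return arr

print(bubble_sort([5, 2, 9, 1]))  # [1, 2, 5, 9]
```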
Information about GPU
Additional context
compose.yaml
VS Code Version: 1.83.1