How could you add parallelism to make the encoding faster? #7
Good points. How are you benchmarking these times today? There is some overhead with the pyo3 code as it stands, and I hope we can optimize that away once openai/tiktoken#40 and openai/tiktoken#50 land.
I'd like to contribute to this issue. Is this project still active?
Issue is open and the project is active! :) Happy to advise / review any PRs.
Sweet! Besides this issue, any other notable issues/enhancements to work on? Gonna take a closer look tomorrow.
Great to hear! Issues that can be worked on are listed here in the GitHub issues. I'd recommend tackling each issue one at a time. You can comment on each issue that you're interested in working on.
On lines 140-141 of lib.rs, there is a comment where the author mentions he tried threading with rayon but noticed it wasn't much faster than Python threads.
Currently the Python version gets me the token length in ~0.26 seconds, while this crate takes ~1.8 seconds, so I propose we add threading back to speed up the process.
I am still a bit new to Rust, so this post is mainly to gather suggestions: how would we go about integrating threading?
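
To make the question concrete, here is a rough sketch of one way batch-level threading could look. It is not the crate's actual code: `count_tokens` is a hypothetical stand-in for the real encoder (it just counts whitespace-separated words), `count_tokens_parallel` is a made-up name, and it uses `std::thread::scope` from the standard library instead of rayon so the snippet has no dependencies. Whether this helps in practice would depend on how much of the ~1.8 seconds is pyo3 overhead rather than encoding work.

```rust
use std::thread;

/// Hypothetical stand-in for the real tokenizer:
/// one "token" per whitespace-separated word.
fn count_tokens(text: &str) -> usize {
    text.split_whitespace().count()
}

/// Count tokens for every document, splitting the batch across
/// up to `workers` scoped threads. Results keep the input order.
fn count_tokens_parallel(docs: &[&str], workers: usize) -> Vec<usize> {
    if docs.is_empty() {
        return Vec::new();
    }
    let workers = workers.max(1);
    // Ceiling division so every document lands in exactly one chunk.
    let chunk_size = (docs.len() + workers - 1) / workers;

    thread::scope(|scope| {
        // Spawn one thread per chunk; collect the handles first so all
        // threads are running before we start joining.
        let handles: Vec<_> = docs
            .chunks(chunk_size)
            .map(|chunk| {
                scope.spawn(move || {
                    chunk.iter().map(|d| count_tokens(d)).collect::<Vec<usize>>()
                })
            })
            .collect();

        // Join in spawn order, which preserves the original document order.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}
```

Calling `count_tokens_parallel(&["hello world", "a b c"], 2)` would run each document on its own thread. Splitting per document only pays off when there are many documents or each one is large; for a single short string the thread start-up cost would likely dominate.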