When you attempt to retrieve an encoding using the code below, you are met with the following error message: KeyError: 'Could not automatically map o1 to a tokeniser. Please use tiktoken.get_encoding to explicitly get the tokeniser you expect.'
I ran this using tiktoken==0.8.0, i.e. the latest release. I'm not sure whether this is intended behaviour; I know that o1 is just an alias, but I think it would make sense to support aliases.
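A minimal reproduction sketch, assuming the lookup in question is the standard encoding_for_model call with the bare model name:

```python
import tiktoken

# Raises KeyError on tiktoken 0.8.0: "o1" is not in MODEL_TO_ENCODING,
# and MODEL_PREFIX_TO_ENCODING only contains the prefix "o1-".
enc = tiktoken.encoding_for_model("o1")
```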
The reason this fails is that the MODEL_PREFIX_TO_ENCODING dictionary, which maps model prefixes to encodings, contains the entry
"o1-": "o200k_base",
So tiktoken.encoding_for_model('o1-') and complete o1 model names work without issue. Either the dash could be dropped from the prefix entry, or o1 could be added to the MODEL_TO_ENCODING dictionary explicitly.
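In the meantime, a workaround sketch along the lines the error message itself suggests (o200k_base is the encoding the "o1-" prefix maps to):

```python
import tiktoken

# Request the encoding explicitly instead of relying on the model-name lookup.
enc = tiktoken.get_encoding("o200k_base")

# Full model names that match the "o1-" prefix already resolve correctly.
enc_preview = tiktoken.encoding_for_model("o1-preview")
```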