ModelRunnerCpp does not transfer SamplingConfig Tensor fields correctly #1183
Hello team,

the tensorrt_llm.runtime.SamplingConfig defines multiple fields as either scalar or torch.Tensor, for example random_seed, top_k or top_p. Inside ModelRunnerCpp these fields are all assumed to be scalar and are wrapped into lists with a single item, see here. Thus, when using ModelRunnerCpp it is not possible to set independent values for each batch entry; trying to do so fails with an error.

Looking at the C++ SamplingConfig code, it should be supported to provide a list with one entry per batch item. Accordingly, it would be necessary to cast torch.Tensors to lists; a sketch of this conversion follows below.

Additionally, the parameters top_p_decay, top_p_min and top_p_reset_ids are scalar in tensorrt_llm.runtime.SamplingConfig, but are supposed to be vectors of length batch size in the C++ implementation of the config.

It would be great if you could have a look at this and possibly fix it. Thank you!
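For illustration, the requested conversion could look roughly like the sketch below. The helper name to_per_batch_list and the length check are assumptions, not the actual ModelRunnerCpp code; the point is the torch.Tensor → tolist() cast alongside the existing single-item wrapping of scalars.

```python
import torch

def to_per_batch_list(value, batch_size):
    """Normalize a sampling field to the list form the C++ runtime expects.

    Hypothetical helper for illustration only: scalars keep the existing
    single-item wrapping, tensors are converted with tolist() and
    length-checked against the batch size.
    """
    if value is None:
        return None
    if isinstance(value, torch.Tensor):
        values = value.flatten().tolist()  # e.g. tensor([1, 2, 3, 4]) -> [1, 2, 3, 4]
        if len(values) != batch_size:
            raise ValueError(f"expected {batch_size} entries, got {len(values)}")
        return values
    return [value]  # scalar: the single-item wrapping ModelRunnerCpp already does

# One independent seed per batch entry, plus a scalar top_k:
seeds = to_per_batch_list(torch.tensor([1, 2, 3, 4]), batch_size=4)  # [1, 2, 3, 4]
top_k = to_per_batch_list(5, batch_size=4)                           # [5]
```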
Comments

@Funatiq, could you please take a look at this? I believe that everything should be supported for this on the C++ side. Only …

Same issue.

The random_seed should be allowed to be set to values such as [1, 2, 3, 4] when batch_size is greater than 1. I need to fall back to a Python session to make it work, but I would prefer to use a C++ session.
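For illustration, the usage this commenter is after might look like the sketch below. The end_id/pad_id values are placeholders, and setting the fields as attributes after construction is an assumption about the API rather than documented usage:

```python
import torch
from tensorrt_llm.runtime import SamplingConfig

# Desired behaviour: one independent seed per batch entry (batch_size = 4).
sampling_config = SamplingConfig(end_id=2, pad_id=2)  # placeholder token ids
sampling_config.random_seed = torch.tensor([1, 2, 3, 4], dtype=torch.int64)
sampling_config.temperature = 0.8  # scalars should keep working as before
```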
The fix will be included in the next push to main (ETA Mar 19).

@Funatiq @kaiyux Thank you for processing this so quickly. I took a look at it today, but I still cannot pass a random_seed tensor; the corresponding conversion is missing the tolist(). For the other fields of SamplingConfig (temperature, frequency_penalty, ...) the tolist() is there. Could you please take a look? Thank you.

You're right, we somehow missed that.