[QST] Performance differences between encode() vs __call__() on tf Encoder block on CPU
#1213
❓ Questions & Help
What is the preferred way of generating predictions from a trained Encoder from a TwoTowerModelV2? There seem to be at least two ways of doing that, with apparently huge performance differences.

Details
After training a TwoTowerModelV2, I noticed a huge performance difference between calling the model.query_encoder.encode() method of each tower versus calling the encoder directly, model.query_encoder(), on a single node with CPU.

Setup
Calling encode()

This takes more than 1 hour on 434457 rows. Resource usage metrics show that the CPU is idle most of the time, which is quite unexpected.
I tried increasing the number of partitions of the transformed dataset and setting .compute(scheduler='processes') to benefit from Dask's parallelization, but it didn't work (it failed with serialization issues).

Calling __call__() with Loader

This takes ~30 seconds on 434457 rows. As my data fits into memory, this ended up being a clear winner.
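For context, the fast path described above amounts to plain batched in-memory inference: iterate over the data in fixed-size batches, call the encoder on each batch, and concatenate the results. A minimal, library-free sketch of that pattern, with a toy matrix multiply standing in for the trained query_encoder (all names here are hypothetical illustrations, not the Merlin Models API):

```python
import numpy as np

# Toy stand-in for a trained query tower: a single linear projection.
# In the real setup this would be model.query_encoder called on each batch.
rng = np.random.default_rng(0)
weights = rng.normal(size=(16, 8))  # 16 input features -> 8-dim embedding

def toy_encoder(batch: np.ndarray) -> np.ndarray:
    """Encode one batch of rows into embeddings."""
    return batch @ weights

def batched_encode(data: np.ndarray, batch_size: int = 1024) -> np.ndarray:
    """Iterate over in-memory data in fixed-size batches, like a Loader
    would, and concatenate the per-batch embeddings."""
    outputs = [
        toy_encoder(data[start:start + batch_size])
        for start in range(0, len(data), batch_size)
    ]
    return np.concatenate(outputs, axis=0)

data = rng.normal(size=(4096, 16))  # stand-in for the 434457-row dataset
embeddings = batched_encode(data, batch_size=512)
print(embeddings.shape)  # (4096, 8)
```

The point of the sketch is that nothing in this loop involves a scheduler or inter-process serialization, which is consistent with the low CPU idle time you would expect from the Loader-based path.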
Is this difference expected or am I doing something wrong?