
Memory leak when running long GPU inferences #487

Open
RandomDefaultUser opened this issue Sep 28, 2023 · 0 comments
Labels: bug (Something isn't working), important

@RandomDefaultUser (Member) commented:

I have encountered odd behavior when running long series of GPU inferences (over 200 in a row) on the hemera GPUs. After around 200-300 inferences, I get an out-of-memory error, which seems to originate somewhere on the Python side. This looks like a memory leak.

I have tried to identify the problem, but standard profiling and investigation didn't give much insight. I suspect it could be related to our LAMMPS interface, but I am not sure yet. I will have to investigate further.
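
For reference, a minimal sketch of how the leak could be narrowed down, assuming a PyTorch-based inference loop; `run_single_inference()` is a hypothetical stand-in for one full inference call:

```python
import gc
import tracemalloc

import torch

tracemalloc.start()

for i in range(300):
    # Hypothetical helper: one full inference run.
    run_single_inference()

    if i % 10 == 0:
        # Force cleanup first, so any growth that remains is a real leak
        # rather than cached or not-yet-collected memory.
        gc.collect()
        torch.cuda.empty_cache()
        gpu_mib = torch.cuda.memory_allocated() / 1024**2
        heap_bytes, _ = tracemalloc.get_traced_memory()
        print(f"iter {i}: GPU {gpu_mib:.1f} MiB, "
              f"Python heap {heap_bytes / 1024**2:.1f} MiB")

# Show where the surviving Python allocations were made.
for stat in tracemalloc.take_snapshot().statistics("lineno")[:10]:
    print(stat)
```

If the GPU counter stays flat while the Python heap keeps growing, that would point toward the Python/LAMMPS glue code rather than the model itself; if the GPU counter grows, tensors are probably being kept alive between inferences.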
