Outline memory required for tedana to run #267
Comments
@dowdlelt ran a quick test and reported that memory usage matches expectations: smaller voxel sizes (i.e., more voxels) cause rapid increases in memory consumption, and usage scales near-linearly with the number of echoes.
According to @dowdlelt, a user-supplied mask can drastically reduce the number of voxels that need to be processed, and thus the memory required: an AFNI EPI mask reduced a 3mm (resampled to 2.5mm) isotropic dataset from ~540k voxels to ~120k voxels, roughly an 80% reduction in memory. This should improve further in the next release thanks to @tsalo's contributions in #226; a comparison between the nilearn-generated mask and the AFNI mask showed only an ~5k voxel difference.
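As a rough illustration of the masking point above, here is a minimal sketch (not tedana's internal code) that uses nilearn's `compute_epi_mask` and `apply_mask` to compare the voxel count with and without a mask. The file names are hypothetical, and a user-supplied mask image (e.g., one generated with AFNI) could be used in place of the computed one.

```python
import nibabel as nib
import numpy as np
from nilearn.masking import compute_epi_mask, apply_mask

echo_files = ["echo1.nii.gz", "echo2.nii.gz", "echo3.nii.gz"]  # hypothetical file names

first_echo = nib.load(echo_files[0])
n_vox_full = int(np.prod(first_echo.shape[:3]))

# Derive a brain mask from the first echo (a user-supplied mask image works too).
mask_img = compute_epi_mask(first_echo)
n_vox_masked = int(mask_img.get_fdata().sum())

print(f"Voxels without mask: {n_vox_full}, within mask: {n_vox_masked} "
      f"({100 * (1 - n_vox_masked / n_vox_full):.0f}% fewer)")

# apply_mask returns a float32 (n_timepoints, n_voxels) array per echo, so the masked
# data across echoes occupy roughly 4 * n_timepoints * n_vox_masked * n_echoes bytes.
masked_data = [apply_mask(f, mask_img) for f in echo_files]
```

Counting voxels inside the mask, rather than across the full grid, gives a much more realistic memory estimate for high-resolution data.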
With more datasets coming in, this should be more easily testable.
Since memory usage seems very deterministic (some function of n_voxels x n_echoes), it may be possible to add a runtime check that compares estimated RAM usage to available RAM and warns the user. Even just printing the estimated RAM usage would help with user-friendliness: if things don't work, users can scroll back and see it flagged as a potential problem.
That's a great idea! It's probably not too hard to estimate the required amount and add 10% as a buffer.
I believe the data are being loaded into memory as float32 (assuming NIfTI-1), which means the number of bytes used will be: `nbytes = 4 * x * y * z * t * e`. If the data are loaded as float64, substitute 8 for the 4 in that equation. The trickier part is determining how many copies of the data are made in memory during computation. You could use memory-profiler to try to make some estimates, but as a lower bound that's a good start.
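Putting the formula and the proposed check together, a minimal sketch might look like the following. This assumes float32 data and a 10% buffer; psutil is an extra dependency used only to query available memory, the file names are illustrative, and this is not tedana's actual implementation.

```python
import numpy as np
import nibabel as nib
import psutil  # extra dependency, used here only to query available RAM

def estimate_required_bytes(echo_files, bytes_per_value=4, buffer_frac=0.10):
    """Rough lower bound on RAM needed to hold all echoes in memory, plus a buffer.

    bytes_per_value=4 assumes float32; use 8 for float64.
    """
    total = 0
    for fname in echo_files:
        img = nib.load(fname)  # reads the header only; data are not loaded here
        total += bytes_per_value * int(np.prod(img.shape))  # x * y * z * t per echo
    return int(total * (1 + buffer_frac))

echo_files = ["echo1.nii.gz", "echo2.nii.gz", "echo3.nii.gz"]  # hypothetical file names
required = estimate_required_bytes(echo_files)
available = psutil.virtual_memory().available

print(f"Estimated RAM needed: {required / 1e9:.1f} GB, available: {available / 1e9:.1f} GB")
if required > available:
    print("Warning: estimated memory use exceeds available RAM; "
          "consider a tighter mask or a machine with more memory.")
```

Note that this only accounts for a single in-memory copy of the data; as noted above, intermediate copies made during computation can push peak usage well beyond this lower bound.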
Yeah, it's also hard to measure memory usage instantaneously. We'll add this to things to look out for under Testing & Validation. I'll try to familiarize myself with that tool; thanks @rmarkello.
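For reference, one way to measure peak memory around a tedana run with memory-profiler might look like the sketch below. The `tedana_workflow` call and its arguments are illustrative and may not match the exact API of a given tedana version; the file names and echo times are made up.

```python
from memory_profiler import memory_usage  # pip install memory-profiler
from tedana.workflows import tedana_workflow  # call signature may differ by version

echo_files = ["echo1.nii.gz", "echo2.nii.gz", "echo3.nii.gz"]  # hypothetical file names
echo_times = [14.5, 38.5, 62.5]  # echo times in ms; illustrative values

# memory_usage samples the process's memory while the wrapped call runs and
# returns the samples in MiB; the max of the samples is the run's peak usage.
samples = memory_usage(
    (tedana_workflow, (), {"data": echo_files, "tes": echo_times}),
    interval=0.5,
)
print(f"Peak memory during tedana run: {max(samples):.0f} MiB")
```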
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions to tedana!
Summary
In several issues and in informal conversations, users have reported large RAM usage. With no formal guidelines, it is difficult for a user to know what to expect from their data. We should outline memory requirements for various data sets, most notably high-resolution data.
Additional Detail
In issues #254 and #144, as well as in informal conversation, users have noted large RAM usage. While @tsalo has done refactoring that should ameliorate this problem, and an upcoming release should reduce usage, it would be good to have guidelines for how RAM usage should scale with dataset size.
We should bear in mind that peak usage can be problematic: consuming all available RAM leads to system thrashing like that described in #254, where the operating system spends all of its time swapping memory to and from disk because it has run out of RAM, and the problem cascades to every running program as the system struggles to handle the I/O load.
Next Steps