-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds grains entry point #1082
Adds grains entry point #1082
Conversation
Closes #741 Adds the "swiss-army knife" component to run just filtering on files. This involved modifying how the `.topostats` files are loaded and extracted because of the nesting (see #1068). Tests currently fail because the `tests/resources/test_image/minicircle_small.topostats` is version `0.1` and doesn't therefore work with the refactored structure (surprised #1068 passed all tests actualy!). A separate commit will be made for updating this test file and the associated tweaking of tests.
The `tests/resources/test_image/minicircle_small.topostats` was version `0.1` and failed the updates and tests that now work with `0.2`. I've therefore updated this test file and tweaked the associated tests to work with these files. All tests pass locally (watch them fail on CI!).
Closes #742 - Adds `processing.process_grains()` - Adds `run_modules.grains()` Together these allow the `topostats grains` entry point to run which loads `*.topostats` files (`v0.2`), extracts the flattened image and re-runs grains. If any processing artefacts from previous runs are present in the `.topostats` files they are removed. Output is written to the specified directory so that comparisions will be possible. Added in tests and checked that some things in the previous `topstats filters` step work too. Some code is in place for subsequent modules that will be added in turn and they may need refining.
In working through adding the `topostats grains` entry point I was confused why `.topostats` files had `image` which contained the flattened image but all the processing stages after `Filters` used `image_flattened`. After checking with @SylviaWhittle we have opted to make things consistent across the file output and the processing. Updates tests in light of these changes, previously the result of `loadscans.get_data()` left `img_dict` as a dictionary of the _data_ but to align with other scan types we actually want a nested dictionary with the keys as the filenames then the data (whether that is a single scan from most raw data or the dictionary that `.topostats` files hold). This raises (again as I've asked a similar question before before) the disconnect between a `topostats` object internal to TopoStats and as stored in the HDF5 file format and the value returned by AFMReader. As can be seen [here](https://github.com/AFM-SPM/AFMReader/blob/022dcf286914c23a30da42e4ea401aa577b0b193/AFMReader/topostats.py#L55) for convenience the `image`, `pixel_to_nm_scaling` are extracted from `data` and returned as part of a Tuple along with the `data` from which it was extracted. This _might_ be convenient for users but I see no reason why we shouldn't return just `data` and then users can access these values via `data["image"]` and `data["pixel_to_nm_scaling"]`. This would mean the result of loading a `.topostats` object matches the interal representation and we can remove some logic and wrangling that has been introduced in this PR to sort that out.
I've successfully used the
using a barebones config: # Config file generated 2025-01-21 12:04:47
# # For more information on configuration and how to use it:
# https://afm-spm.github.io/TopoStats/main/configuration.html
base_dir: ./data # Directory in which to search for data files
output_dir: ./output # Directory to output results to
log_level: info # Verbosity of output. Options: warning, error, info, debug
cores: 1 # Number of CPU cores to utilise for processing multiple files simultaneously.
file_ext: .topostats # File extension of the data files.
loading:
channel: Height # Channel to pull data from in the data files.
extract: all # Array to extract when loading .topostats files.
filter:
run: false # Options : true, false
grains:
run: true
plotting:
run: true # Options : true, false
image_set: all # Options : all, core |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Works well for me, (tested locally)
Just a couple documentation things (the comment and the thing here:)
Remove the WIP tag from both the topostats -h
command and also the topostats grains -h
command (I think?)
Brilliant, thanks for testing @SylviaWhittle
Good spot thanks and not at all incorrect, thanks for picking that up. Corrected along with the comment. Linting error in pre-commit isn't from this Pull Request as I wrote the documentation separately and it was merged the other day. Footnotes
|
Closes #742
processing.process_grains()
run_modules.grains()
Together these allow the
topostats grains
entry point to run which loads*.topostats
files (v0.2
), extracts the flattened image and re-runs grains.If any processing artefacts from previous runs are present in the
.topostats
files they are removed. Output is written to the specified directory so that comparisions will be possible.Added in tests and checked that some things in the previous
topstats filters
step work too. Some code is in place for subsequent modules that will be added in turn and they may need refining.In working through adding the
topostats grains
entry point I was confused why.topostats
files hadimage
which contained the flattened image but all the processing stages afterFilters
usedimage_flattened
. After checking with@SylviaWhittle we have opted to make things consistent across the file output and the processing.
Updates tests in light of these changes, previously the result of
loadscans.get_data()
leftimg_dict
as a dictionary of the data but to align with other scan types we actually want a nested dictionary with the keys as the filenamesthen the data (whether that is a single scan from most raw data or the dictionary that
.topostats
files hold).For future discussion
I found that because
AFMReader
returns a tuple I had to add additional logic (it baffled me for a while until I realised this!)This raises (again as I've asked a similar question before) the disconnect between a
topostats
object internal to TopoStats and as stored in the HDF5 file format and the value returned by AFMReader. As can be seen herefor convenience the
image
,pixel_to_nm_scaling
are extracted fromdata
and returned as part of a Tuple along with thedata
from which it was extracted. This might be convenient for users but I see no reason why we shouldn't return justdata
and then users can access these values viadata["image"]
anddata["pixel_to_nm_scaling"]
. This would mean the result of loading a.topostats
object matches the interal representation and we can remove some logic and wrangling that has been introduced in this PR to sort that out.Before submitting a Pull Request please check the following.
New functions/methods have tests which check the intended behaviour is correct.I felt testing that thepop()
method worked seemed excessive. Perhaps when I add in steps forgrainstats
and others I'll check thecorrect items are removed.