Adds grains entry point #1082

ns-rse · 2025-01-28T21:04:16Z

Closes #742

Adds processing.process_grains()
Adds run_modules.grains()

Together these allow the topostats grains entry point to run which loads *.topostats files (v0.2), extracts the flattened image and re-runs grains.

If any processing artefacts from previous runs are present in the .topostats files they are removed. Output is written to the specified directory so that comparisons are possible.
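As a rough illustration of the clean-up described above (not the code in this PR), stale results can be dropped from a loaded dictionary with `pop()`. The function name `strip_previous_results` and the key names in `STALE_KEYS` are hypothetical, chosen only for this sketch; the keys actually removed by `process_grains()` may differ.

```python
from copy import deepcopy

# Hypothetical key names: the keys written by later processing stages may differ.
STALE_KEYS = ("grain_masks", "grain_trace_data")


def strip_previous_results(topostats_data: dict, stale_keys=STALE_KEYS) -> dict:
    """Return a copy of the loaded data with results from earlier runs removed."""
    cleaned = deepcopy(topostats_data)
    for key in stale_keys:
        # pop() with a default does not raise if the key is absent.
        cleaned.pop(key, None)
    return cleaned


# Hypothetical contents of a previously processed .topostats file.
loaded = {
    "image": [[0.0, 1.0]],
    "pixel_to_nm_scaling": 0.5,
    "grain_masks": {"above": [[0, 1]]},
}
cleaned = strip_previous_results(loaded)
```

Working on a copy keeps the loaded data intact, which is convenient if the original and re-processed results are to be compared afterwards.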

Added tests and checked that parts of the previous topostats filters step work too. Some code is in place for subsequent modules, which will be added in turn and may need refining.

In working through adding the topostats grains entry point I was confused why .topostats files had image, which contained the flattened image, while all the processing stages after Filters used image_flattened. After checking with @SylviaWhittle we have opted to make things consistent across the file output and the processing.

Updates tests in light of these changes. Previously the result of loadscans.get_data() left img_dict as a dictionary of the data, but to align with other scan types we actually want a nested dictionary with the filenames as keys and the data as values (whether that is a single scan from most raw data or the dictionary that .topostats files hold).
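The nested layout described above might look something like the following sketch. The helper `nest_by_filename` is hypothetical, and using the filename stem as the key is an assumption made for illustration; the dictionary contents are made-up values.

```python
from pathlib import Path


def nest_by_filename(path, data: dict) -> dict:
    """Wrap one loaded file's data in a dictionary keyed by its filename stem."""
    # Assumption: the key is the stem (no directory, no extension).
    return {Path(path).stem: data}


img_dict = {}
# Hypothetical contents of a loaded .topostats file.
topostats_data = {"image": [[0.0, 1.0]], "pixel_to_nm_scaling": 0.5}
img_dict.update(nest_by_filename("data/minicircle_small.topostats", topostats_data))

# Downstream code can then treat every scan type the same way.
for filename, scan in img_dict.items():
    scaling = scan["pixel_to_nm_scaling"]
```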

For future discussion

I found that because AFMReader returns a tuple I had to add additional logic (it baffled me for a while until I realised this!)

This raises (again, as I've asked a similar question before) the disconnect between a topostats object, both as held internally by TopoStats and as stored in the HDF5 file format, and the value returned by AFMReader. As can be seen [here](https://github.com/AFM-SPM/AFMReader/blob/022dcf286914c23a30da42e4ea401aa577b0b193/AFMReader/topostats.py#L55), for convenience the image and pixel_to_nm_scaling are extracted from data and returned as part of a tuple along with the data from which they were extracted. This might be convenient for users, but I see no reason why we shouldn't return just data; users can then access these values via data["image"] and data["pixel_to_nm_scaling"]. This would mean the result of loading a .topostats object matches the internal representation, and we can remove some logic and wrangling that has been introduced in this PR to sort that out.
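To make the difference concrete, here is a minimal sketch contrasting the two return shapes. Both `load_as_tuple` and `load_as_dict` are hypothetical stand-ins written for this comparison; they are not AFMReader functions and take an in-memory dictionary rather than a file path.

```python
def load_as_tuple(data: dict) -> tuple:
    """Stand-in for a loader that returns (image, pixel_to_nm_scaling, data)."""
    return data["image"], data["pixel_to_nm_scaling"], data


def load_as_dict(data: dict) -> dict:
    """Stand-in for the proposed behaviour: return only the data dictionary."""
    return data


# Made-up values standing in for the contents of a .topostats file.
stored = {"image": [[0.0, 1.0]], "pixel_to_nm_scaling": 0.5}

# Tuple shape: the caller has to unpack and, when it only wants the
# dictionary, discard the first two items.
_, _, data_from_tuple = load_as_tuple(stored)

# Dictionary shape: the same values are reached by key, and the result
# already matches the internal representation.
data_direct = load_as_dict(stored)
image = data_direct["image"]
pixel_to_nm_scaling = data_direct["pixel_to_nm_scaling"]
```

With the dictionary shape, the calling code needs no special case for `.topostats` files, which is the wrangling the paragraph above suggests could be removed.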

Before submitting a Pull Request please check the following.

Existing tests pass.
Pre-commit checks pass.
New functions/methods have typehints and docstrings.
~~New functions/methods have tests which check the intended behaviour is correct.~~ I felt testing that the pop() method worked seemed excessive. Perhaps when I add in steps for grainstats and others I'll check the correct items are removed.

Closes #741. Adds the "swiss-army knife" component to run just filtering on files. This involved modifying how the `.topostats` files are loaded and extracted because of the nesting (see #1068). Tests currently fail because `tests/resources/test_image/minicircle_small.topostats` is version `0.1` and therefore doesn't work with the refactored structure (surprised #1068 passed all tests actually!). A separate commit will be made for updating this test file and the associated tweaking of tests.

The `tests/resources/test_image/minicircle_small.topostats` was version `0.1` and failed the updates and tests that now work with `0.2`. I've therefore updated this test file and tweaked the associated tests to work with these files. All tests pass locally (watch them fail on CI!).

SylviaWhittle · 2025-01-31T13:47:40Z

Might be incorrect, but should the WIP DO NOT USE flag be removed from the grains entry in the help output?

SylviaWhittle · 2025-01-31T13:49:52Z

I've successfully used the grains program to re-process a pre-flattened .topostats image, resulting in the expected grains being found.

topostats -c config.yaml grains
[Fri, 31 Jan 2025 13:47:21] [INFO    ] [topostats] The YAML configuration file is valid.
[Fri, 31 Jan 2025 13:47:21] [INFO    ] [topostats] The YAML plotting configuration file is valid.
[Fri, 31 Jan 2025 13:47:21] [ERROR   ] [topostats] Splining enabled but Filters disabled. Please check your configuration file.
[Fri, 31 Jan 2025 13:47:21] [ERROR   ] [topostats] [processing.py] [1444] Splining enabled but Filters disabled. Please check your configuration file.
[Fri, 31 Jan 2025 13:47:21] [INFO    ] [topostats] Configuration file loaded from      : config.yaml
[Fri, 31 Jan 2025 13:47:21] [INFO    ] [topostats] Scanning for images in              : data
[Fri, 31 Jan 2025 13:47:21] [INFO    ] [topostats] Output directory                    : output
[Fri, 31 Jan 2025 13:47:21] [INFO    ] [topostats] Looking for images with extension   : .topostats
[Fri, 31 Jan 2025 13:47:21] [INFO    ] [topostats] Images with extension .topostats in data : 1
[Fri, 31 Jan 2025 13:47:21] [INFO    ] [topostats] Thresholding method (Filtering)     : std_dev
[Fri, 31 Jan 2025 13:47:21] [INFO    ] [topostats] Thresholding method (Grains)        : std_dev
[Fri, 31 Jan 2025 13:47:21] [INFO    ] [topostats] Extracting image from data/minicircle_small_orignal.topostats
13:47:21 | INFO |topostats.py:topostats:load_topostats:38 | Loading image from : data/minicircle_small_orignal.topostats
13:47:21 | INFO |topostats.py:topostats:load_topostats:46 | [minicircle_small_orignal] TopoStats file version : 0.2
Processing images from data, results are under output:   0%|                                     | 0/1 [00:00<?, ?it/s][Fri, 31 Jan 2025 13:47:24] [INFO    ] [topostats] Processing : minicircle_small
[Fri, 31 Jan 2025 13:47:24] [INFO    ] [topostats] [minicircle_small] : *** Grain Finding ***
[Fri, 31 Jan 2025 13:47:24] [INFO    ] [topostats] [minicircle_small] : Grains found for direction above : 3
[Fri, 31 Jan 2025 13:47:24] [INFO    ] [topostats] [minicircle_small] : Plotting Grain Finding Images
[Fri, 31 Jan 2025 13:47:25] [INFO    ] [topostats] [minicircle_small] : Grain Finding stage completed successfully.
[Fri, 31 Jan 2025 13:47:25] [INFO    ] [topostats] [minicircle_small] : Saving image to .topostats file
Processing images from data, results are under output: 100%|█████████████████████████████| 1/1 [00:04<00:00,  4.24s/it][Fri, 31 Jan 2025 13:47:25] [INFO    ] [topostats] [minicircle_small] Grain detection completed (NB - Filtering was *not* re-run).
Processing images from data, results are under output: 100%|█████████████████████████████| 1/1 [00:04<00:00,  4.24s/it]


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


  _______      _____      __ __       _____     ______    _______      _____      _______    ______
/\_______)\   ) ___ (    /_/\__/\    ) ___ (   / ____/\ /\_______)\   /\___/\   /\_______)\ / ____/\
\(___  __\/  / /\_/\ \   ) ) ) ) )  / /\_/\ \  ) ) __\/ \(___  __\/  / / _ \ \  \(___  __\/ ) ) __\/
  / / /     / /_/ (_\ \ /_/ /_/ /  / /_/ (_\ \  \ \ \     / / /      \ \(_)/ /    / / /      \ \ \
 ( ( (      \ \ )_/ / / \ \ \_\/   \ \ )_/ / /  _\ \ \   ( ( (       / / _ \ \   ( ( (       _\ \ \
  \ \ \      \ \/_\/ /   )_) )      \ \/_\/ /  )____) )   \ \ \     ( (_( )_) )   \ \ \     )____) )
  /_/_/       )_____(    \_\/        )_____(   \____\/    /_/_/      \/_/ \_\/    /_/_/     \____\/


[Fri, 31 Jan 2025 13:47:25] [INFO    ] [topostats]

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ COMPLETE ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  TopoStats Version           : 2.3.1.dev93+g51269f793.d20250124
  Base Directory              : data
  File Extension              : .topostats
  Files Found                 : 1
  Successfully Processed^1    : 1 (100.0%)
  All statistics              : output/all_statistics.csv
  Distribution Plots          : Disabled. Enable in config 'summary_stats/run' if needed.

  Configuration               : output/config.yaml

  Email                       : topostats@sheffield.ac.uk
  Documentation               : https://afm-spm.github.io/topostats/
  Source Code                 : https://github.com/AFM-SPM/TopoStats/
  Bug Reports/Feature Request : https://github.com/AFM-SPM/TopoStats/issues/new/choose
  Citation File Format        : https://github.com/AFM-SPM/TopoStats/blob/main/CITATION.cff

  ^1 Successful processing of an image is detection of grains and calculation of at least
     grain statistics. If these have been disabled the percentage will be 0.

  If you encounter bugs/issues or have feature requests please report them at the above URL
  or email us.

  If you have found TopoStats useful please consider citing it. A Citation File Format is
  linked above and available from the Source Code page.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

using a barebones config:

# Config file generated 2025-01-21 12:04:47
# # For more information on configuration and how to use it:
# https://afm-spm.github.io/TopoStats/main/configuration.html
base_dir: ./data # Directory in which to search for data files
output_dir: ./output # Directory to output results to
log_level: info # Verbosity of output. Options: warning, error, info, debug
cores: 1 # Number of CPU cores to utilise for processing multiple files simultaneously.
file_ext: .topostats # File extension of the data files.
loading:
  channel: Height # Channel to pull data from in the data files.
  extract: all # Array to extract when loading .topostats files.
filter:
  run: false # Options : true, false
grains:
  run: true
plotting:
  run: true # Options : true, false
  image_set: all # Options : all, core

Result:

SylviaWhittle

Works well for me (tested locally).

Just a couple documentation things (the comment and the thing here:)

Remove the WIP tag from both the topostats -h command and also the topostats grains -h command (I think?)

topostats/run_modules.py

ns-rse · 2025-01-31T17:22:40Z

Brilliant, thanks for testing @SylviaWhittle

> Might be incorrect, but should the WIP DO NOT USE flag be removed from the grains entry in the help output?

Good spot thanks and not at all incorrect, thanks for picking that up.

Corrected along with the comment.

Linting error in pre-commit isn't from this Pull Request, as I wrote the documentation separately and it was merged the other day. ~~Not sure why or how it's crept into main without being picked up~~¹, will address separately.

¹ I see now, it's because I edited it as a "suggestion" rather than in my editor, so there was no automatic line wrapping and I didn't wait for reapproval. Incoming fix shortly.

ns-rse and others added 6 commits January 15, 2025 14:45

tests: Remove os specific file path from caplog test

d7a6e8b

Merge branch 'main' into ns-rse/742-grains-entry-point

be9650b

ns-rse mentioned this pull request Jan 29, 2025

[feature] : Convert img_path to Path object when loading .topostats files AFM-SPM/AFMReader#106

Closed

SylviaWhittle requested changes Jan 31, 2025

feature: PR feedback, thanks @SylivaWhittle

778caaa

ns-rse added Grains refactor labels Jan 31, 2025

ns-rse requested review from llwiggins, MaxGamill-Sheffield and SylviaWhittle January 31, 2025 17:33

SylviaWhittle approved these changes Jan 31, 2025

ns-rse merged commit e9cad80 into main Jan 31, 2025
10 of 11 checks passed

ns-rse deleted the ns-rse/742-grains-entry-point branch January 31, 2025 20:27

ns-rse mentioned this pull request Feb 3, 2025

Return just data when loading .topostats AFM-SPM/AFMReader#109

Closed
