Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

move snow aging call to avoid need for osnowd in restart #545

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

har917
Copy link
Collaborator

@har917 har917 commented Feb 12, 2025

CABLE

Thank you for submitting a pull request to the CABLE Project.

Description

As per #539 it would be useful (for coupled applications) if there was no need for %osnowd in the restart file.
The current need stems from the call to snow_aging() prior to the first setting of %osnowd as part of the time stepping.
Moving the call to snow_aging to after the call to soil_snow (or sli_main) avoids this as both osnowd and snowd are evaluated in those routines.

This change will not be bitwise reproducible - but is likely scientifically of no impact.

There are partner changes that are needed in the coupled routines (particularly CM2 and AM3) which will be included in this issue - but have not been tested.

Once accepted we may also be able to review/remove the need for osnowd in the offline restart and MPI sections.

Fixes #539

Type of change

Please delete options that are not relevant.

  • Bug fix
  • New or updated documentation
  • Minor change in science algorithm sequencing

Checklist

  • The new content is accessible and located in the appropriate section
  • I have checked that links are valid and point to the intended content
  • I have checked my code/text and corrected any misspellings

Testing

  • Are the changes bitwise-compatible with the main branch? If working on an optional feature, are the results bitwise-compatible when this feature is off? If yes, copy benchcab output showing successful completion of the bitwise compatibility tests or equivalent results below this line.

  • Are the changes non bitwise-compatible with the main branch because of a bug fix or a feature being newly implemented or improved? If yes, add the link to the modelevaluation.org analysis versus the main branch or equivalent results below this line.

Please add a reviewer when ready for review.


📚 Documentation preview 📚: https://cable--545.org.readthedocs.build/en/545/

@har917
Copy link
Collaborator Author

har917 commented Feb 12, 2025

re implementation in CM2 and AM3 - snow age state variable ssnow%snage was already in the prognostic bank so this does not need to be modified.

@har917
Copy link
Collaborator Author

har917 commented Feb 12, 2025

benchcab tests - unsurprisingly bitwise tests failed. More interestingly this only happened at 7 (of 42 sites) - IT-Lav, US-FPe, US-GLE, US-Ha1, US-Me2, US-MMS and US-NR1.

It maybe interesting to chase these down further - are these sites the only sites with snow in the restart/initial conditions? For these runs it appears that initial conditions are coming from offline_CSIRO_1x1.nc

IT-Lav site: latitude = 45.95, longitude=11.28, snowd = 0.034m
US-Fpe : latitude = 48.3, longitude=-105.1, snowd = 0.048m
US-GLE : latitude = 41.3, longitude=-106.2, snowd = 0.070m
US-Ha1: latitude = 42.5, longitude=-72.2, snowd = 0.020m
US-Me2: latitude = 44.5, longitude=-121.6, snowd = 0.066m
US-MMS: latitude = 39.3, longitude=-86.4, snowd = 0.016m
US-NR1: latitude = 40.0, longitude=-105.5, snowd = 0.014m

i.e. the initial conditions all have snow, but with very small amount. There are other sites with snow that pass the bitwise test.

The move of the snow_aging routine means that it is potentially sensitive to what happens on the first time step (snowd, osnowd, ssdn and tggsn that gets into the snow_Aging routine) - osnowd will be different on the first call to snow_Aging and ssdn and tggsn could well be different e.g. if the snow pack disappears during that first time step (as could happen with v. small amounts of snow).

Otherwise - it seems that the change has no impact!

Running through me.org


EDIT: not necessarily a first time step issue: IT-Lav passes on many variables for 17296 time steps - then fails. However the albedo does differ on first time step (a somewhat odd result).

@har917
Copy link
Collaborator Author

har917 commented Feb 12, 2025

benchcab output file and yaml file for completed tests.
benchmark_cable_qsub.sh.o135167668.txt
config.yaml.txt

@JhanSrbinovsky
Copy link
Collaborator

@har917 When this was implemented it was thought to be necessary for calculating snow albedo on the first timestep. IF it is moved to later what will this do the albedo

@har917
Copy link
Collaborator Author

har917 commented Feb 12, 2025

IF it is moved to later what will this do the albedo.

History: back in CABLE2.x days the albedo and snow aging routines were wrapped in together. Issue 331 refactored the code so now CABLE3-albedo only depends on the snow depth, age, density and temperature, not old snow depth.

However - it's not quite as simple as this. This issue in effect moves the snow_aging routine from after the call to albedo (current time step) to the end of the previous time step - order of operations was albedo then snow_aging, it's now snow_aging then albedo. Tracking dependencies is too complicated but my expectation was that there would be differences but not scientifically important ones.

The results seen so far indicate it's even better than this - and it is only making any difference for cases which initialise with a very small amount of snow. I suspect that these come about because either i) the new routine allows for the complete removal of snow pack on the first time step whereas the old routine maintained it (or vice versa) and/or ii) the value of %tss that is getting passed into snow_aging is slightly different.


Follow up on the second cause - looking at the code again I think I've (inadvertently) also fixed a bug.

In offline serial mode it looks like %tss is read in from the restart file (under load_parameters) and then promptly overwritten by %tgg (line 417 in cable_serial), i.e. the value of %tss that gets passed into snow_aging (CABLE3 MAIN) on the first time step is wrong.

The corresponding line in mpi_worker is not there.

This means that by moving snow_aging I've also removed the dependency on this bug in the serial model - @ccarouge is it worth removing that line as part of this PR?

What is wrong in both serial and mpi versions is that %otss and %otss_0 (which I don't think is ever used anymore) are being initialised to %tgg not to %tss in cable_mpiworker and cable_serial

@har917
Copy link
Collaborator Author

har917 commented Feb 13, 2025

me.org output - at this aggregate level there is no discernable impact from this change.

@har917
Copy link
Collaborator Author

har917 commented Feb 13, 2025

@ccarouge sending for main review - noting questions around whether we need to address the surrounding i/o code better or not.

@JhanSrbinovsky this PR includes edits to the cbm() routines in each of the applications - these have not been tested. We should devise a test this aspect of the PR as well (likely requires PRs in UM7 and AM3 repos)

@access-hive-bot
Copy link

This pull request has been mentioned on ACCESS Hive Community Forum. There might be relevant details there:

https://forum.access-hive.org.au/t/cable-in-am3-development-notes/4073/6

Copy link
Member

@ccarouge ccarouge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do the change for CM2? I don't think we'll ever implement changes to CM2 (the user base is too small).

@har917
Copy link
Collaborator Author

har917 commented Feb 14, 2025

Why do the change for CM2?

Because we should be trying to keep the repo internally consistent - not changing the CM2 code will mean that the cbm's in the different applications are fundamentally different -> confusion.

To be honest - the better way forward is to remove the CM2 (and ESM1.5) code from the repo entirely since we're not supposed to be modifying (expecting to modify) the code for these applications any more.

Copy link
Collaborator

@JhanSrbinovsky JhanSrbinovsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll have to port this file manually to ACCESS3-JULES and UM7 - we need them now

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll port this file manually to ACCESS3-JULES

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll port this file manually to UM7

@har917
Copy link
Collaborator Author

har917 commented Feb 18, 2025

@ccarouge Are you happy for this to go into MAIN? or do you want me to unpick the CM2 change?

@JhanSrbinovsky we probably need short AM3 and ESM1.6 runs with this in before merging I suspect.

@JhanSrbinovsky
Copy link
Collaborator

output in feat-snow_aging-expt-c20a3efa/ for ESM1.6

AM3 - rose-calc is giving grief at the minute

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

investigate moving snow_Aging() routine
4 participants