GraphCast to GenCast: Input File Changes? #121

Open
bondijoe27 opened this issue Jan 8, 2025 · 2 comments

@bondijoe27

Hi,

I'm exploring transitioning from GraphCast to GenCast. Besides model definition and parameters, do the input files require changes? Specifically:

Are GraphCast input files directly compatible with GenCast?

If not, what specific input file adjustments are needed (e.g., format, variables, dimensions)?

Any pointers to GenCast input file documentation or examples would be appreciated.

Thanks!

@bondijoe27
Author

We are currently running forecasts, and this is our input:

<xarray.Dataset> Size: 698MB
Dimensions:                  (lat: 721, lon: 1440, time: 2, batch: 1, level: 13)
Coordinates:
  * lat                      (lat) float32 3kB -90.0 -89.75 -89.5 ... 89.75 90.0
  * lon                      (lon) float32 6kB 0.0 0.25 0.5 ... 359.5 359.8
  * time                     (time) timedelta64[ns] 16B 00:00:00 06:00:00
  * level                    (level) int32 52B 50 100 150 200 ... 850 925 1000
    datetime                 (batch, time) datetime64[ns] 16B ...
Dimensions without coordinates: batch
Data variables: (12/13)
    geopotential_at_surface  (lat, lon) float32 4MB ...
    2m_temperature           (batch, time, lat, lon) float32 8MB ...
    mean_sea_level_pressure  (batch, time, lat, lon) float32 8MB ...
    10m_u_component_of_wind  (batch, time, lat, lon) float32 8MB ...
    10m_v_component_of_wind  (batch, time, lat, lon) float32 8MB ...
    geopotential             (batch, time, level, lat, lon) float32 108MB ...
    ...                       ...
    specific_humidity        (batch, time, level, lat, lon) float32 108MB ...
    vertical_velocity        (batch, time, level, lat, lon) float32 108MB ...
    u_component_of_wind      (batch, time, level, lat, lon) float32 108MB ...
    v_component_of_wind      (batch, time, level, lat, lon) float32 108MB ...
    land_sea_mask            (lat, lon) float32 4MB ...
    total_precipitation_6hr  (batch, time, lat, lon) float32 8MB ...

@alvarosg
Collaborator

alvarosg commented Jan 8, 2025

The input files should be identical to those required for GraphCast, except that GenCast also requires sea surface temperature (SST). HRES-fc0 SST data has a placeholder value over land; before feeding it to the model, you should set the values over land to NaN by finding the pixels where SST is NaN in the ERA5 data and setting those same pixels to NaN. See the "Load the example data" section here:

For HRES-fc0 sea surface temperature, we assigned NaNs to grid cells in which sea surface temperature was NaN in the ERA5 dataset (this remains fixed at all times).
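
The masking step described above could look roughly like the sketch below. This is not taken from the GenCast code: the helper name `mask_sst_over_land` and the tiny demo arrays are made up for illustration, and it assumes both SST fields are `xarray.DataArray` objects on the same `lat`/`lon` grid.

```python
import numpy as np
import xarray as xr


def mask_sst_over_land(hres_sst: xr.DataArray, era5_sst: xr.DataArray) -> xr.DataArray:
    """Set HRES SST to NaN wherever the ERA5 SST is NaN (hypothetical helper)."""
    # The ERA5 NaN pattern is fixed in time, so a single (lat, lon) slice is enough.
    for dim in ("batch", "time"):
        if dim in era5_sst.dims:
            era5_sst = era5_sst.isel({dim: 0}, drop=True)
    land = era5_sst.isnull()
    # where() keeps values where the condition is True and fills NaN elsewhere;
    # the (lat, lon) mask is broadcast over the batch and time dimensions.
    return hres_sst.where(~land)


# Tiny synthetic demo (values are invented, not real data).
lat = np.array([-0.25, 0.0, 0.25], dtype=np.float32)
lon = np.array([0.0, 0.25], dtype=np.float32)
era5 = xr.DataArray(
    np.array([[280.0, np.nan], [281.0, 282.0], [np.nan, 283.0]], dtype=np.float32),
    dims=("lat", "lon"),
    coords={"lat": lat, "lon": lon},
)
hres = xr.DataArray(
    np.full((1, 2, 3, 2), 285.0, dtype=np.float32),  # placeholder value everywhere
    dims=("batch", "time", "lat", "lon"),
    coords={"lat": lat, "lon": lon},
)
masked = mask_sst_over_land(hres, era5)
```

Note that `where` aligns the two arrays on their coordinates, so the `lat` and `lon` values of both fields need to match exactly; otherwise non-matching grid points are dropped.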

To be 100% sure, I recommend building the input data yourself for the same date as the provided example data and verifying that you get identical inputs.
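
One way to run that check is sketched below. The function `compare_inputs` is a hypothetical helper, not part of GraphCast or GenCast, and the variable name `sea_surface_temperature` in the demo is an assumption. By default the comparison is exact (zero tolerance) while treating NaNs in the same position as equal, so a matching land mask does not count as a difference.

```python
import numpy as np
import xarray as xr


def compare_inputs(built: xr.Dataset, reference: xr.Dataset,
                   rtol: float = 0.0, atol: float = 0.0) -> list[str]:
    """Return a list of differences; an empty list means the inputs match."""
    problems = []
    built_vars, ref_vars = set(built.data_vars), set(reference.data_vars)
    if ref_vars - built_vars:
        problems.append(f"missing variables: {sorted(ref_vars - built_vars)}")
    if built_vars - ref_vars:
        problems.append(f"unexpected variables: {sorted(built_vars - ref_vars)}")
    for name in sorted(built_vars & ref_vars):
        a, b = built[name], reference[name]
        if a.dims != b.dims or a.shape != b.shape:
            problems.append(f"{name}: dims or shape differ")
            continue
        # equal_nan=True so identical NaN masks (e.g. SST over land) compare equal.
        if not np.allclose(a.values, b.values, rtol=rtol, atol=atol, equal_nan=True):
            problems.append(f"{name}: values differ")
    return problems


# Tiny synthetic demo (values are invented, not real data).
reference = xr.Dataset({
    "2m_temperature": (("lat", "lon"), np.array([[280.0, 281.0]], dtype=np.float32)),
    "sea_surface_temperature": (("lat", "lon"), np.array([[np.nan, 290.0]], dtype=np.float32)),
})
built = reference.copy(deep=True)
built["2m_temperature"] = built["2m_temperature"] + 1.0
built = built.drop_vars("sea_surface_temperature")
report = compare_inputs(built, reference)
```

In practice `reference` would be the example dataset opened with `xr.open_dataset` and `built` the dataset you assembled for the same date; loosening `rtol` or `atol` is a judgment call if small floating-point differences are expected.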
