Exposure pipeline #4
base: main
Conversation
@t-downing what kind of feedback do you think makes the most sense here? There's a convo to be had about what we want our "production" setup to be, but I think that's probably best had elsewhere.
@hannahker yes good question, I should have specified. I think here it would be good just to agree on the core methods for:
Let me just point out where exactly I'm talking about.
# filter to only pixels with flood extent >= 5% to reduce noise
ds_recent_filtered = ds_recent.where(ds_recent >= 0.05)
# interpolate to Worldpop grid and
# multiply by population to get exposure
exposure = ds_recent_filtered.interp_like(pop, method="nearest") * pop
Here is where we are actually calculating the exposure rasters. I am unsure whether it makes sense to only take pixels with flood extent ≥ 5%. I think we lose a fair amount of information this way, and I'm not sure we really benefit from reducing the noise.
We also may want to think about whether to multiply by the relevant population raster for that year. Also, which population raster to use? There are several options on WorldPop (we're currently using `2020_1km_Aggregated_UNadj`). There is also GHSL.
I don't have a lot of background in working with the Floodscan data, but I'd lean towards removing that 5% threshold. I think we're already doing a lot of smoothing, so this seems a little overcautious as a noise-reduction step. We'd also want to be able to justify why we picked that 5% number, which seems slightly arbitrary.
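To make the trade-off concrete, here's a toy sketch of what the 5% cutoff discards. The flood fractions and the uniform population are made up for illustration, and this uses plain numpy rather than the xarray objects in the PR:

```python
import numpy as np

# Hypothetical flood-extent fractions for five pixels
flood = np.array([0.01, 0.04, 0.05, 0.20, 0.60])
# Hypothetical uniform population per pixel
pop = np.array([1000.0, 1000.0, 1000.0, 1000.0, 1000.0])

# With the 5% threshold: pixels below 0.05 are masked out entirely
exposure_thresh = np.where(flood >= 0.05, flood, np.nan) * pop

# Without the threshold: every pixel contributes
exposure_full = flood * pop

print(np.nansum(exposure_thresh))  # 850.0
print(exposure_full.sum())         # 900.0
```

In this toy case the threshold drops 50 of 900 exposed people; whether that loss matters presumably depends on how noisy sub-5% Floodscan values actually are.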
for pcode, row in tqdm(
    adm.set_index("ADM2_PCODE").iterrows(), total=len(adm)
):
    da_clip = ds_exp_recent.rio.clip([row.geometry])
    dff = (
        da_clip.sum(dim=["x", "y"])
        .to_dataframe(name="total_exposed")["total_exposed"]
        .astype(int)
        .reset_index()
    )
    dff["ADM2_PCODE"] = pcode
    dfs.append(dff)

df_exp_adm_new = pd.concat(dfs, ignore_index=True)
And here is where we're calculating the raster stats (in this case, just `sum`). But I guess we should replace this with whatever standard method we are using in `ds-raster-stats`.
Would we also want to upsample the raster as we're taking the raster stats here? I know we're going to the 1 km WorldPop grid, but if we're aggregating to the admin 2 level we might want to go a bit more granular.
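For what it's worth, here's a toy numpy sketch of a total-preserving nearest-neighbour upsample. The values and the ×4 factor are made up, and real regridding would go through something like rioxarray's reprojection rather than `np.repeat`; the point is just that finer pixels let an admin boundary split a coarse cell's exposure proportionally instead of all-or-nothing:

```python
import numpy as np

# Hypothetical 2x2 exposure raster at ~1 km resolution
exposure = np.array([[100.0, 200.0],
                     [300.0, 400.0]])

# Nearest-neighbour upsample by a factor of 4 per axis (to ~250 m);
# dividing by the number of subpixels (4 * 4 = 16) conserves the total
factor = 4
upsampled = (
    np.repeat(np.repeat(exposure, factor, axis=0), factor, axis=1)
    / factor**2
)

print(exposure.sum())    # 1000.0
print(upsampled.sum())   # 1000.0
print(upsampled.shape)   # (8, 8)
```

A boundary clipping the upsampled grid then picks up fractional shares of each original 1 km cell, at the cost of pretending exposure is uniform within that cell.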
Pipeline to calculate flood exposure using Worldpop and Floodscan. Basic methodology is to:
1. calculate recent flood exposure rasters (`src.datasources.floodscan.calculate_recent_flood_exposure_rasters()`)
2. calculate recent flood exposure raster stats (`src.datasources.floodscan.calculate_recent_flood_exposure_rasterstats()`)
Only things that need looking at are the actual functions used by the pipeline (i.e. what is outlined above). There are a couple of notebooks that may be of interest, whose functionality hasn't yet been integrated into either the pipeline or the app:
- `exposure_plotting`: the first two plots have already been integrated into the app, but the admin bounds ones haven't been yet. I think these could be pretty useful for picking out where specifically flooding is high.
- `floodscan_historical`: just calculating the 1998-2023 flood exposure using the `.nc` in the Google Drive (faster than stacking up all the historical COGs), which needs to be done whenever a new country is added (takes about two hours).