
Exposure pipeline #4

Open · wants to merge 11 commits into main
Conversation

t-downing (Collaborator)

Pipeline to calculate flood exposure using Worldpop and Floodscan. Basic methodology is to:

  1. Calculate exposure raster (with src.datasources.floodscan.calculate_recent_flood_exposure_rasters())
  1. Filter Floodscan raster to ≥ 0.05 (i.e. only keep pixels with at least 5% flooding, to reduce noise)
    2. Interpolate Floodscan raster to Worldpop grid
    3. Multiply Floodscan raster by Worldpop raster to get exposure raster
  2. Take raster stats (currently just sum, with src.datasources.floodscan.calculate_recent_flood_exposure_rasterstats())
    1. Iterate over admin2s and clip raster to calculate sum
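The exposure-raster calculation (steps 1.1 to 1.3 above) can be sketched end-to-end with tiny synthetic arrays. The array values and coordinates here are illustrative only; the real pipeline reads the Floodscan and WorldPop rasters from `src.datasources`:

```python
import numpy as np
import xarray as xr

# Synthetic stand-ins for the real rasters (values are illustrative only)
flood = xr.DataArray(
    np.array([[0.01, 0.10], [0.50, 0.03]]),  # flooded fraction per pixel
    coords={"y": [0.0, 1.0], "x": [0.0, 1.0]},
    dims=["y", "x"],
)
pop = xr.DataArray(
    np.array([[100.0, 200.0], [300.0, 400.0]]),  # people per pixel
    coords={"y": [0.0, 1.0], "x": [0.0, 1.0]},
    dims=["y", "x"],
)

# 1. Filter: keep only pixels with flooded fraction >= 5% (others become NaN)
flood_filtered = flood.where(flood >= 0.05)
# 2. Interpolate to the population grid (nearest-neighbour, as in the pipeline)
flood_on_pop_grid = flood_filtered.interp_like(pop, method="nearest")
# 3. Multiply flooded fraction by population to get people exposed per pixel
exposure = flood_on_pop_grid * pop
```

Pixels dropped by the 5% filter carry NaN through to the exposure raster, so they are excluded (rather than counted as zero) in any later `sum(skipna=True)`-style aggregation.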

The only things that need looking at are the actual functions used by the pipeline (i.e. what is outlined above). There are a couple of notebooks that may be of interest, whose functionality hasn't yet been integrated into either the pipeline or the app:

  • exposure_plotting: the first two plots have already been integrated into the app, but the admin bounds ones haven't been yet. I think these could be pretty useful for picking out where specifically flooding is high.
  • floodscan_historical: just calculating the 1998-2023 flood exposure using the .nc in the Google Drive (faster than stacking up all the historical COGs), which needs to be done whenever a new country is added (takes about two hours).

@hannahker (Collaborator)

@t-downing what kind of feedback do you think makes the most sense here? There's a convo to be had about what we want our "production" setup to be, but I think that's probably best had elsewhere.

@t-downing (Collaborator, Author)

@hannahker yes, good question; I should have specified. I think here it would be good just to agree on the core methods for:

  1. calculating the exposure rasters
  2. taking the raster stats of the exposure

Let me just point out where exactly I'm talking about.

Comment on lines +97 to +101
# filter to only pixels with flood extent > 5% to reduce noise
ds_recent_filtered = ds_recent.where(ds_recent >= 0.05)
# interpolate to Worldpop grid and
# multiply by population to get exposure
exposure = ds_recent_filtered.interp_like(pop, method="nearest") * pop
t-downing (Collaborator, Author)

Here is where we are actually calculating the exposure rasters. I am unsure whether it makes sense to only take pixels with flood extent ≥ 5%. I think we lose a fair amount of information this way, and I'm not sure we really benefit from the noise reduction.

We also may want to think about whether to multiply by the relevant population raster for that year. Also, which population raster to use? There are several options on WorldPop (of which we're using the 2020_1km_Aggregated_UNadj). There is also GHSL.

Collaborator

I don't have a lot of background in working with the Floodscan data, but I'd lean towards removing that 5% threshold. I think we're already doing a lot of smoothing, so removing noise this way seems a little overcautious. We'd also want to be able to justify why we picked that 5% number, which seems slightly arbitrary.

Comment on lines +222 to +235
for pcode, row in tqdm(
adm.set_index("ADM2_PCODE").iterrows(), total=len(adm)
):
da_clip = ds_exp_recent.rio.clip([row.geometry])
dff = (
da_clip.sum(dim=["x", "y"])
.to_dataframe(name="total_exposed")["total_exposed"]
.astype(int)
.reset_index()
)
dff["ADM2_PCODE"] = pcode
dfs.append(dff)

df_exp_adm_new = pd.concat(dfs, ignore_index=True)
t-downing (Collaborator, Author)

And here is where we're calculating the raster stats (in this case, just the sum). But I guess we should replace this with whatever standard method we are using in ds-raster-stats.
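I don't know the ds-raster-stats internals, but one common alternative to a per-polygon clip loop is to rasterize the admin zones once onto the exposure grid and take a label-based sum. A minimal sketch of the idea, with hypothetical pcodes and a hand-built zone array (in practice the labels would come from rasterizing the ADM2 boundaries, e.g. with `rasterio.features.rasterize`):

```python
import numpy as np

# Hypothetical exposure raster (people exposed per pixel; NaN = filtered out)
exposure = np.array([[10.0, 20.0], [30.0, np.nan]])
# Matching zone-label array: each pixel's index into the pcode list
zones = np.array([[0, 0], [1, 1]])
pcodes = ["AB01", "AB02"]  # illustrative placeholder pcodes

# NaN-safe zonal sum: treat masked (NaN) pixels as zero exposure
weights = np.nan_to_num(exposure).ravel()
sums = np.bincount(zones.ravel(), weights=weights, minlength=len(pcodes))
totals = dict(zip(pcodes, sums.astype(int)))
```

This touches every pixel exactly once instead of clipping the raster per admin2, which tends to scale much better as the number of polygons grows.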

@hannahker (Collaborator) · Nov 20, 2024

Would we also want to upsample the raster as we're taking the raster stats here? I know we're going to the 1km WorldPop grid, but if we're calculating at the admin 2 level we might want to go a bit more granular.
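One thing to watch if we do upsample before taking stats: the exposure raster holds people per pixel (a count, not a density), so a plain nearest-neighbour upsample inflates zonal sums by the square of the resampling factor unless values are rescaled. A numpy-only sketch with synthetic values:

```python
import numpy as np

# Hypothetical coarse exposure raster (people exposed per pixel)
coarse = np.array([[4.0, 8.0], [12.0, 16.0]])
factor = 2

# Nearest-neighbour upsample: each coarse pixel becomes factor**2 fine pixels
fine = np.repeat(np.repeat(coarse, factor, axis=0), factor, axis=1)
# Rescale so total exposure is conserved across the grid
fine = fine / factor**2
```

The finer grid then lets small admin2 polygons capture partial coarse pixels without changing the country-wide total.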
