Check on some lakes in the MNTOHA modeling set #26

Open
jordansread opened this issue Jul 6, 2020 · 5 comments

@jordansread

I am missing Lake of the Woods (DOW 39000200 or 39000201) from both the PGDL and GLM3 TOHA releases, and I am missing Cass Lake (04003000) from PGDL, which doesn't make sense to me.

@aappling-usgs
Contributor

I don't see Lake of the Woods on Tallgrass or in my current list of 638 modelable lakes. Do you have an NHD-HR ID for that one to help me double-check? Do I need to update my source data again?

I do see Cass Lake on Tallgrass from the April 15 runs:

(base) [aappling@tg-login2 lake-temperature-neural-networks] ls -Rl 2_model/out/nhdhr_166868528
2_model/out/nhdhr_166868528:
total 8
drwxr-sr-x 2 aappling cida 4096 Apr 15 17:33 finetune_predict
drwxr-sr-x 2 aappling cida 4096 Apr 13 17:56 pretrain_predict

2_model/out/nhdhr_166868528/finetune_predict:
total 21380
-rw-r--r-- 1 aappling cida       67 Apr 15 17:33 checkpoint
-rw-r--r-- 1 aappling cida      759 Apr 15 17:19 model_config.tsv
-rw-r--r-- 1 aappling cida    29060 Apr 15 17:33 model.data-00000-of-00001
-rw-r--r-- 1 aappling cida      505 Apr 15 17:33 model.index
-rw-r--r-- 1 aappling cida  2941882 Apr 15 17:33 model.meta
-rw-r--r-- 1 aappling cida     9817 Apr 15 17:33 params.npz
-rw-r--r-- 1 aappling cida 14596135 Apr 15 17:33 preds.npz
-rw-r--r-- 1 aappling cida    21925 Apr 15 17:33 stats.npz
-rw-r--r-- 1 aappling cida  4266540 Apr 15 17:20 varied_inputs.npz

2_model/out/nhdhr_166868528/pretrain_predict:
total 37252
-rw-r--r-- 1 aappling cida       67 Apr 13 17:56 checkpoint
-rw-r--r-- 1 aappling cida      715 Apr 13 17:28 model_config.tsv
-rw-r--r-- 1 aappling cida    29060 Apr 13 17:56 model.data-00000-of-00001
-rw-r--r-- 1 aappling cida      505 Apr 13 17:56 model.index
-rw-r--r-- 1 aappling cida  2941882 Apr 13 17:56 model.meta
-rw-r--r-- 1 aappling cida     9822 Apr 13 17:56 params.npz
-rw-r--r-- 1 aappling cida 14701592 Apr 13 17:56 preds.npz
-rw-r--r-- 1 aappling cida    42383 Apr 13 17:56 stats.npz
-rw-r--r-- 1 aappling cida 20391792 Apr 13 17:29 varied_inputs.npz

So the next question for Cass Lake is where it's failing to get to ScienceBase.

@jordansread
Author

Lake of the Woods is nhdhr_123319728.

I can dig in on my end too and check the data release.

@aappling-usgs
Contributor

I do see Cass Lake in my prep data.frames in mntoha-data-release...

> pgdl_predictions_df <- remake::fetch('pgdl_predictions_df')
> pgdl_predictions_df %>% slice(grep('166868528', site_id)) %>% glimpse
Rows: 1
Columns: 4
$ site_id         <chr> "nhdhr_166868528"
$ source_filepath <chr> "../lake-temperature-neural-networks/3_assess/out/nhd…
$ source_hash     <chr> "ffb2261c0f7d5a3f38e10bd8e9577e65"
$ out_file        <chr> "pgdl_nhdhr_166868528_temperatures.csv"

Should be in Group 2:

> pgdl_site_ids_grouped <- remake::fetch('pgdl_site_ids_grouped')
>  pgdl_site_ids_grouped %>% slice(grep('nhdhr_166868528', site_id))
# A tibble: 1 x 2
  site_id         group_id
  <chr>           <chr>
1 nhdhr_166868528 02_N47.00-48.00_W94.00-97.25

And actually, I see Lake of the Woods in there too, searching by nhdhr:

>  pgdl_site_ids_grouped %>% slice(grep('nhdhr_123319728', site_id))
# A tibble: 1 x 2
  site_id         group_id
  <chr>           <chr>
1 nhdhr_123319728 01_N48.00-49.50_W89.50-97.25
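
The two `grep` lookups above could also be done in one pass with an exact match on `site_id`, which avoids accidental partial matches. This is only a minimal sketch: `lookup_groups` is a hypothetical helper, not part of mntoha-data-release, and the toy table below stands in for `remake::fetch('pgdl_site_ids_grouped')`, assuming just the `site_id` and `group_id` columns shown in the output. `nhdhr_000` is a made-up ID used to show what a missing lake looks like.

```r
# Hypothetical helper: return the group for each requested site,
# with NA where a site is absent from the grouped table.
lookup_groups <- function(grouped, site_ids) {
  idx <- match(site_ids, grouped$site_id)  # exact match, NA when not found
  data.frame(
    site_id = site_ids,
    group_id = grouped$group_id[idx],
    stringsAsFactors = FALSE
  )
}

# Toy stand-in built from the two rows printed in this thread
grouped <- data.frame(
  site_id = c("nhdhr_123319728", "nhdhr_166868528"),
  group_id = c("01_N48.00-49.50_W89.50-97.25", "02_N47.00-48.00_W94.00-97.25"),
  stringsAsFactors = FALSE
)

# "nhdhr_000" is a made-up ID, included only to show the NA case
result <- lookup_groups(grouped, c("nhdhr_166868528", "nhdhr_123319728", "nhdhr_000"))
print(result)
```

Any row with an `NA` group_id is a lake that never made it into the grouping step.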

@aappling-usgs
Contributor

aappling-usgs commented Jul 6, 2020

I think I see Lake of the Woods in the Group 1 zipfile (on my Tallgrass mntoha-data-release repo):

> group1 <- unzip('tmp/pgdl_predictions_01_N48.00-49.50_W89.50-97.25.zip', list=TRUE)
> group1 %>% slice(grep('nhdhr_123319728', Name))
                                   Name   Length                Date
1 pgdl_nhdhr_123319728_temperatures.csv 36091975 2020-04-23 15:17:00

and here's Cass:

> group2 <- unzip('tmp/pgdl_predictions_02_N47.00-48.00_W94.00-97.25.zip', list=TRUE)
> group2 %>% slice(grep('nhdhr_166868528', Name))
                                   Name   Length                Date
1 pgdl_nhdhr_166868528_temperatures.csv 19412665 2020-04-23 15:16:00

@aappling-usgs
Contributor

aappling-usgs commented Jul 6, 2020

I downloaded those two zip files from ScienceBase and confirmed that Cass and LotW are indeed in those files:

> unzip('~/Downloads/pgdl_predictions_01_N48.00-49.50_W89.50-97.25.zip', list=TRUE) %>% slice(grep('nhdhr_123319728', Name))
                                   Name   Length                Date
1 pgdl_nhdhr_123319728_temperatures.csv 36091975 2020-04-23 15:17:00

> unzip('~/Downloads/pgdl_predictions_02_N47.00-48.00_W94.00-97.25.zip', list=TRUE) %>% slice(grep('nhdhr_166868528', Name))
                                   Name   Length                Date
1 pgdl_nhdhr_166868528_temperatures.csv 19412665 2020-04-23 15:16:00
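
For checking more lakes against a downloaded zip, the same listing check could be wrapped in a small function. This is a rough sketch under assumptions: the helper names are hypothetical, and the `pgdl_<site_id>_temperatures.csv` filename pattern is inferred from the listings in this thread rather than taken from the release code.

```r
# Hypothetical helpers; the filename pattern is an assumption based on
# the zip listings shown in this thread.
expected_file <- function(site_id) {
  paste0("pgdl_", site_id, "_temperatures.csv")
}

# Given a listing data.frame (as returned by unzip(..., list = TRUE)),
# report TRUE/FALSE per site for whether its expected file is present.
sites_in_listing <- function(listing, site_ids) {
  found <- expected_file(site_ids) %in% basename(listing$Name)
  setNames(found, site_ids)
}

# Convenience wrapper that reads the listing straight from a zip file.
sites_in_zip <- function(zip_path, site_ids) {
  sites_in_listing(unzip(zip_path, list = TRUE), site_ids)
}
```

With the Group 1 download from above, `sites_in_zip('~/Downloads/pgdl_predictions_01_N48.00-49.50_W89.50-97.25.zip', 'nhdhr_123319728')` should report whether the Lake of the Woods file is in the archive.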
