Checklist IDs in EBD that are not in sampling event data #23

Closed
AKissel opened this issue Oct 5, 2018 · 4 comments

@AKissel

AKissel commented Oct 5, 2018

Hello,

I'm new to auk (using version 0.3.0), and working with data for several species in Hawaii. For species of interest, I've filtered my ebd and sampling event data to Hawaii, and then attempted to zero fill these. However, I am getting an error that there are some checklists in the EBD that are missing in the sampling data. Here is an example of my code for one species (Zebra Dove):

# species data
f <- file.path(ebd_dir, 'ebd_zebdov_relMay-2018.txt')
output_file <- file.path(ebd_dir_out, 'ebd_zebdov_relMay-2018.txt_HI.txt')

ebd <- f %>%
  auk_ebd() %>%
  auk_state("US-HI") %>%
  auk_complete() %>%
  auk_filter(file = output_file) %>%
  read_ebd(unique = TRUE)

# sampling event data
f_sampling <- file.path(ebd_dir_sampling, 'ebd_sampling_relAug-2018.txt')
f_out_sampling <- file.path(ebd_dir_sampling, "sampling_HI.txt")

sampling_dat <- auk_sampling(f_sampling) %>%
  auk_state("US-HI") %>%
  auk_complete() %>%
  auk_filter(file = f_out_sampling) %>%
  read_sampling(unique = TRUE)

# zero fill
zf <- auk_zerofill(ebd, sampling_dat, collapse = T)

and the error:

Error in auk_zerofill.data.frame(ebd, sampling_dat, collapse = T) :
Some checklists in EBD are missing from sampling event data.

Further digging suggests there are 131 checklist IDs present in the ebd that are not present in the sampling event data for this species. Wondering if anyone has any insight into why this may be the case, and how I might get around this (aside from dropping those checklists from the ebd)? This is an issue for several species.
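One way to run a check like this is sketched below in base R. The two toy data frames stand in for the real `ebd` and `sampling_dat`, and the checklist IDs are made up for illustration; the only assumption about the real data is that both tables carry a `checklist_id` column after being read with `unique = TRUE`.

```r
# Toy stand-ins for the imported data; the IDs are invented for illustration.
ebd <- data.frame(checklist_id = c("S1", "S2", "S3", "S4"),
                  stringsAsFactors = FALSE)
sampling_dat <- data.frame(checklist_id = c("S1", "S2", "S5"),
                           stringsAsFactors = FALSE)

# Checklist IDs that appear in the EBD but not in the sampling event data.
missing_ids <- setdiff(ebd$checklist_id, sampling_dat$checklist_id)

# How many checklists are affected, and which ones.
length(missing_ids)
missing_ids
```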

Thanks,
Amanda

@mstrimas
Contributor

mstrimas commented Oct 6, 2018

Hi Amanda, based on your example, it appears you're mixing two different versions of the EBD: May-2018 for the EBD and Aug-2018 for the sampling event data. So, you'll want to download the Aug-2018 EBD.
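A quick sanity check for this kind of mismatch can be sketched in base R by comparing the release tags embedded in the two file names. The helper `release_tag()` is hypothetical (it is not part of auk), and the `rel<Mon>-<YYYY>` pattern is an assumption based only on the two file names in this thread.

```r
# Hypothetical helper: pull a tag such as "relMay-2018" out of a file name.
# Returns NA if the name does not contain the assumed pattern.
release_tag <- function(path) {
  name <- basename(path)
  tag <- regmatches(name, regexpr("rel[A-Za-z]{3}-[0-9]{4}", name))
  if (length(tag) == 0) NA_character_ else tag
}

ebd_tag <- release_tag("ebd_zebdov_relMay-2018.txt")
sampling_tag <- release_tag("ebd_sampling_relAug-2018.txt")

# Warn when the EBD and sampling event data come from different releases.
if (!identical(ebd_tag, sampling_tag)) {
  warning("EBD release (", ebd_tag, ") differs from sampling release (",
          sampling_tag, ")")
}
```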

Also, based on the file names, it looks like the file "ebd_zebdov_relMay-2018.txt" may have been downloaded via the custom download form, i.e. you didn't download the full EBD and filter it to Zebra Dove yourself. This will hopefully just work, but using custom download data with the sampling event data is not something I've tested, so there may be issues. Let me know how it goes.

Finally, in general the best way to do this is to combine the filtering of the EBD and sampling event data:

ebd <- auk_ebd(f, file_sampling = f_sampling) %>% 
  auk_state("US-HI") %>% 
  auk_complete() %>% 
  auk_filter(file = output_file, file_sampling = f_out_sampling)
zf <- auk_zerofill(ebd, collapse = TRUE)

@AKissel
Author

AKissel commented Oct 8, 2018

Great, thank you for the tips. I'll reconcile the versions and see if that solves the problem. You're correct that I'm using a custom download. My main reason for not filtering the EBD and sampling data at the same time is that I'm looping through ~30 species, and it seemed redundant (and potentially time consuming) to filter the sampling data every time. I'd be curious to get your thoughts on whether combining the filtering is essential, or whether it would just slow down the process. Thanks!

@eliotmiller
Collaborator

Have a look at the auk_split function.

@mstrimas
Contributor

Don’t loop; just filter to all 30 species at once with auk_species(). You can then read in the file directly or, if it's too big, use auk_split() as Eliot suggests.
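A rough sketch of the single-pass approach follows. The wrapper `filter_species_once()` and its argument names are hypothetical. It assumes auk and AWK are installed, that `f_ebd` contains records for every species of interest (for example the full EBD rather than a single-species custom download), and that both files come from the same release.

```r
# Hypothetical wrapper: filter the EBD and the sampling event data together,
# once for all species, then zero-fill the result. Nothing runs until called.
filter_species_once <- function(f_ebd, f_sampling, species,
                                out_ebd, out_sampling) {
  filters <- auk::auk_ebd(f_ebd, file_sampling = f_sampling)
  filters <- auk::auk_species(filters, species = species)
  filters <- auk::auk_state(filters, state = "US-HI")
  filters <- auk::auk_complete(filters)
  # One AWK pass over each file, writing both filtered outputs.
  filtered <- auk::auk_filter(filters, file = out_ebd,
                              file_sampling = out_sampling)
  # Zero-fill all species at once from the two filtered files.
  auk::auk_zerofill(filtered, collapse = TRUE)
}

# Example call (not run here; the paths are the ones used earlier in the thread):
# zf <- filter_species_once(f, f_sampling,
#                           species = c("Zebra Dove"),  # add the other species
#                           out_ebd = output_file,
#                           out_sampling = f_out_sampling)
```

The sampling event data is then filtered once instead of once per species, which addresses the concern about repeating that step inside a loop.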
