Log & Meeting Updates

Christina Bandaragoda edited this page Aug 17, 2018 · 18 revisions

Latest Update

Aug 16 & Aug 17 Meeting Summary

Tasks and agreements from the recent meetings are listed in this Github Repo/Issues:

Link to Tasks in progress when Kelsey and Christina work on Source Areas in North Carolina.

Link to Tasks related to Aug 16 & 17 discussions on how to deliver the drinking water campaign samples in a data archive

Publication of Genomic/microbial data: Where microbial data are shared is sometimes specified by the journal - the journal may mandate where the data have to be made available. The primary problem is that there is no standardized format; currently each lab picks its own. The preference is to have one standard format in which this type of data is stored. Questions for WQ labs: If there is a trusted domain-specific repository, publish there and reference it in HydroShare. Which repo is best? For our case, is it simpler to keep a diverse study and its data products together in one repo (e.g. HydroShare)? What are the expectations for making use of the University repository (e.g. MetaStorm + UW library)? The key is to make the data accessible and publicly available, with a focus on the target research audience and on community-trusted repositories that are well supported. Also noted: the ddopt format that Qiita (Cheetah) is using (https://qiita.ucsd.edu/).

Data model formats: Cross-tabulated data gives people a better overall idea of what is available and what the results are for any given sample. It is visually easier to digest than long format data; visual confirmations (by people looking at the data) tend to be done in short format (cross-tab), while analysis is usually done in long format. About 30% of the VT analysis is in short format (cross-tabulated); the data is put into a CSV file and then uploaded to R, which tends to like data in a long format (serial). In summary, data needs to be machine and human readable. We will have a process for inputting data in serial (long form) and develop code to convert it to cross-tab (short form).
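As a starting point for the long-to-cross-tab conversion discussed above, here is a minimal Python sketch using only the standard library. The column names (`sample_id`, `analyte`, `value`) and the sample values are hypothetical placeholders, not the team's actual worksheet fields, and blank cells for missing combinations are an assumption.

```python
import csv
import io
from collections import defaultdict

# Hypothetical long-format (serial) input: one row per sample/analyte pair.
LONG_CSV = """sample_id,analyte,value
S1,lead,0.5
S1,copper,1.2
S2,lead,0.1
S2,copper,0.9
S2,iron,3.4
"""


def long_to_crosstab(long_rows, row_key="sample_id", col_key="analyte", val_key="value"):
    """Pivot long-format records into one row per sample, one column per analyte."""
    table = defaultdict(dict)
    columns = []
    for rec in long_rows:
        if rec[col_key] not in columns:
            columns.append(rec[col_key])  # keep first-seen column order
        table[rec[row_key]][rec[col_key]] = rec[val_key]
    header = [row_key] + columns
    # Assumption: sample/analyte combinations with no record are left blank.
    body = [[key] + [cells.get(c, "") for c in columns] for key, cells in table.items()]
    return header, body


rows = list(csv.DictReader(io.StringIO(LONG_CSV)))
header, body = long_to_crosstab(rows)

# Write the cross-tab (short form) as CSV text for human review.
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(header)
writer.writerows(body)
print(out.getvalue())
```

Keeping the long form as the single input and generating the cross-tab from it means the human-readable view never has to be edited by hand.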

The Data Archive Paper outline and list of products in the archive is emerging here - to use as a starting point for discussions in Aug 24 meetings.

The Data Management plan has been updated with specific bullets of data products that are public and private for either explicitly including in the Data Sharing Agreement, or to be sure it's clear how to protect and use these data products.

July 11, 2018 43rd Annual Natural Hazards Workshop Researchers Meeting, Denver, CO.

Link to presentation Building research software infrastructure to prevent disasters like Hurricane Maria

June 26-30, 2018 Meshing with Data Hackathon, San Juan, Puerto Rico

HACKING COMMUNICATIONS SOLUTIONS FOLLOWING NATURAL DISASTERS - Read the blogpost. Freshwater Initiative student Jimmy Phuong is a UW School of Medicine PhD Candidate in Biomedical Informatics and Medical Education studying computationally intensive, data-driven knowledge discovery in health research. He is currently facilitating the collection of spatio-temporal, socio-demographic data in Puerto Rico following Hurricane Maria. In June 2018, he had the opportunity to participate in Meshing with Data, a 43-hour hackathon which brought together computer scientists, software engineers, and other scientists to innovate solutions to communications problems following natural disasters. This work was made possible after the May 29 Workshop (see below) with the collaboration of Dr. Patricia Ordonez, a Professor of Biomedical Informatics at the University of Puerto Rico at Rio Piedras, and Dr. Graciela Ramirez-Toro, Director of the Center for Environmental Education Conservation and Research at the Inter American University of Puerto Rico, and with the support of the National Science Foundation RAPID research and Earth System Information Partners Lab awards.

May 29, 2018 Collaborative RAPID Design Workshop in San Juan and Patillas, Puerto Rico

Link to Workshop materials made available on HydroShare.

May 24, 2018 Meeting Summary

StoryMap: We discussed photo data planning as an option for using field pictures from public streams - Miguel will follow up with William on instructions for adding this data to StoryMap. Amber will give suggestions for adding photos as a HydroShare resource. Option 1. Public photos stay public and go to Story Map; system summaries would be de-identified. Option 2. Private photos stay private, linked to Ben’s schematics with de-identified naming from the data spreadsheet linked to the system names.

Summer presentations: Links to slides and posters for presentations in Puerto Rico and summer conferences are shared by email and in meeting notes, and when finished, on this meeting summary with links to where the materials are published. Team instructions for adding metadata:

Instructions:

  1. Go to www.hydroshare.org and sign up.
  2. Activate your account from the confirmation email.
  3. Go to Collaborate and search for Puerto Rico Water Studies.
  4. Ask to Join. (Group managers will give Edit permissions.)
  5. Click on Resources. The two links above are on this list.
  6. Go to the resource page and edit the fields for Author, Funding, Contributors, etc.
  7. Review the digital citation format.

If you run into any questions or issues, report and discuss them on the HydroShare/PuertoRicoWaterStudies Github repository - Github Issue to recover from any glitches.

Data Sharing Guidelines: A template in development is available at this HydroShare resource. Chris Lenhardt will catalogue permissions by dataset and Graciela will update after seeing suggestions. https://www.hydroshare.org/resource/732493def7a24b5596203401a3f23706/data/contents/RAPID_Data_sharing_guidelines_draft_20180521.docx

Design Workshop May 29 Planning: The agenda was set for presentations by Graciela, Melitza, Fernando, William and Christina. A card sorting exercise and Human Centered Design exercises were discussed. We decided to focus on exploring the ‘no power’ scenario.

Expected Workshop Outcomes:

  1. User driven data priorities by scenario card sorting
  2. Recruitment for Design Interviews for population health data
  3. Collaborative Design for Information Distribution
  4. RAPID project refined personas
  5. Puerto Rico Water Studies Group collaborative authorship experiment (this resource)

May 9, 2018

The format of the water quality data was presented, covering some aspects of QAQC and sample processing. The Virginia Tech group developed a spreadsheet to populate data and sampling sites with QAQC. Ben Davis worked on the sampling trip to Puerto Rico: 6 drinking water systems, 3 surface water systems, 1 drinking water plant, 1 wastewater treatment plant. The goal is to document clearly in a data publication what was done, where samples were taken, why samples were taken, how samples were named, etc. GPS points need to be reconciled. Sample Inventory Master: Ishi Keenum showed a sample data worksheet in which columns are color coded according to endpoint category (and their box and replicate IDs).

  1. Comprehensive name: concat of col B thru J
  2. Col Z thru CH: {chem element}{solubility limit & filtered?}{minimum detection limit}
  3. Col CJ thru CR - results are in the corresponding columns CS thru DA (??)

Mapping the raw data to result endpoints might require chunking out subsets of the columns as subset data sets. Chunking of the data comes as they decide what to study and may depend on results that are still coming in. They plan to produce “bite-sized” chunks of the data for use with specific analyses, and these will likely be shared with R code, etc.
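The "comprehensive name" in item 1 of the worksheet notes above (a concatenation of columns B through J) could be built along these lines. This is a sketch under stated assumptions: the underscore separator, the skipping of empty cells, and the example row values are all hypothetical, since the notes do not record how the worksheet joins the fields.

```python
def col_index(letters):
    """Convert a spreadsheet column label (A, B, ..., Z, AA, ...) to a 0-based index."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index - 1


def comprehensive_name(row, first="B", last="J", sep="_"):
    """Join the cells in columns first..last of one worksheet row into a single name."""
    cells = row[col_index(first): col_index(last) + 1]
    # Assumption: empty cells are skipped so they do not leave doubled separators.
    return sep.join(str(c).strip() for c in cells if str(c).strip())


# Hypothetical row: column A is a row counter, columns B..J hold the naming fields.
example_row = [1, "PR", "SYS1", "TANK", "RAW", "2018", "03", "05", "A", "R1", "ignored"]
name = comprehensive_name(example_row)
print(name)
```

The same `col_index` helper can select other column ranges from the notes (for example Z through CH) when chunking out subset data sets.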

Issues to clear up: there is some confusion about how to present the data because they aren’t sure what is wanted for the “model”, and also what is wanted from the cyberinfrastructure/data sharing/publication side of things. CECIA and VT have different data sheet formats. Chemical data results are perhaps more straightforward to interpret; bacterial data results may have more nuances for interpretation.

Next steps.

  • Work with Graciela to describe sites and their status while sampling occurred. Maps need to be anonymized. Terminology needs to be worked out to make sure the words/semantics of the hydrologic terms are being used correctly. Inorganics nomenclature (maybe the ICPNS)
  • Work on a sampling QAQC protocol document that includes: sampling protocols, sample filtration/preparation details, analysis protocols, QAQC info.

May 1 2018 NSF Software Infrastructure for Sustained Innovation (SI2) Principal Investigators Meeting in Washington, D.C.: link to citation with materials on HydroShare.

Other materials include the poster and slide used for Lightning Talk made available on Figshare.

April 20, 2018 Meeting Summary

Miguel Leon presented his work on Environmental Data Archive Contents, Examples of Data Access, Technology and Data Formats. A recording of the meeting is available on Youtube at this link. The PowerPoint presentation shared in the April 20 meeting is available on HydroShare. Research products now available on HydroShare include:

HydroShare resources for the Luquillo CZO Sensor database

DigitalGlobe remotely sensed imagery: DigitalGlobe has Open Data for Hurricane Maria Pre-event at this link.

NASA G-LiHT data come from Goddard's LiDAR, Hyperspectral and Thermal Imager (more info here)

April 6, 2018 Meeting Summary

In this meeting we initiated conversations to define a team approach for managing confidentiality and privacy (per US Homeland Security, Safe Drinking Water Act, and HIPAA compliance concerns). In the RAPID Almost Like Maria Project Folder, there is a folder called All-Hands Meetings where you can find the Meeting Notes

Group consensus is being developed around the following data management issues, with project policies to support data sharing mechanisms informed by the following federal guidelines:

  1. Institutional Review Board 45 CFR 46 for the protection of human subjects. See FDA site for guidance.

  2. Water and Wastewater systems Sector Homeland security. See EPA site for guidance

  3. Standards for regulated community water system baselines. See EPA site for regulations

  4. Standards for the unregulated contaminant monitoring rule. These are monitoring reports for environmental contaminants that are of concern and are monitored infrequently, but for which no regulatory determination has been established. See EPA site.

  5. Safe Drinking Water Information System (SDWIS) with estimates of population served by select PWS and repository of EPA regulated locations. See SDWIS for Puerto Rico.

Drinking water intake locations (point locations):

We do not make public* the exact location of drinking water intake supplies. This includes maintaining privacy and sharing control of digital artifacts that include Latitude and Longitude values in tables, geographic shapefiles, or locations mapped to scale in a print or interactive online map, any of which make the location explicitly available for public viewing or machine learning. Point locations are private**.

--Privacy management for point locations is owned*** by Graciela Ramirez-Toro

--Point locations of PRASA systems will be established after discussions and invitations to own and manage these datasets (facilitated by Graciela Ramirez-Toro)

--Point locations of tanks in the study area and groundwater wells will be established after discussions and invitations with community operators and well owners (facilitated by Graciela Ramirez-Toro).

Drinking water source locations (polygon areas):

We do not make public* the source area geographic information for community drinking water systems for areas that are less than 1 square kilometer. We do not make public* the source area geographic information for PRASA or regulated public utility areas that are less than 1 square kilometer. We do make public the source area polygons derived from public streams. We do make public the aggregate information on watershed source areas at the spatial scale of the municipalities in Puerto Rico (e.g. Patillas).

--Privacy management for polygon areas of community drinking water systems is owned*** by Graciela Ramirez-Toro

--Privacy management for polygon areas of PRASA systems will be established after discussions and invitations to own and manage these datasets.

--Privacy management for point locations and polygon areas of public systems will be managed by Virginia Tech Data Manager (email currently set to William Rhoads) and Christina Bandaragoda

--Privacy management for polygon areas of the Patillas watershed system will be managed by Jim Phuong and Christina Bandaragoda for population health and water research
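The point and polygon rules above could be encoded as a pre-publication check, for example before a resource is switched to 'Public'. The sketch below is illustrative only: the `Artifact` class, the system labels (`community`, `prasa`, `public_stream`), treating every point location as private, and defaulting unknown systems to private are all assumptions. A `False` result only means these rules do not force privacy; the sharing decision still rests with the data owner.

```python
from dataclasses import dataclass

AREA_THRESHOLD_KM2 = 1.0  # source areas smaller than this are not made public


@dataclass
class Artifact:
    geometry: str          # "point" or "polygon"
    system: str            # hypothetical labels: "community", "prasa", "public_stream"
    area_km2: float = 0.0  # only meaningful for polygons


def must_stay_private(artifact):
    """Return True when the location rules above require the artifact to stay private."""
    if artifact.geometry == "point":
        return True  # assumption: all point locations are treated as private
    if artifact.system == "public_stream":
        return False  # source area polygons from public streams are made public
    if artifact.system in ("community", "prasa"):
        return artifact.area_km2 < AREA_THRESHOLD_KM2
    return True  # assumption: unknown system types default to private


checks = [
    Artifact("point", "community"),
    Artifact("polygon", "community", 0.4),
    Artifact("polygon", "prasa", 2.5),
    Artifact("polygon", "public_stream", 0.2),
]
for a in checks:
    print(a.geometry, a.system, a.area_km2, must_stay_private(a))
```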

Definitions

*Public is a term we use to include data sharing and publication by publishing in a print or online article, posting to a public website, posting to a Google Drive with open or team settings beyond the control of the original data owner, and specifically with regards to the research products of the project, the data posted to HydroShare and set to ‘Public’ (sharing and licensing decisions are based on the data owner discretion).

** Private is a term we use to describe data shared by methods such as the following: sending data by email with written communication of privacy expectations or agreements; sharing to a Google Drive with closed sharing settings within the control of the original data owner; and, specifically with regards to the research products of the project, posting data to HydroShare set to ‘Private’, where the shareable button must be unchecked and a license agreement must be agreed to before downloading.

***Ownership identifies the responsible individuals and institutions managing the distribution of research products. It is at the discretion of the owner to manage email and online sharing practices to protect privacy based on their personal discretion, institutional policies, and pre-existing agreements. With regards to the research products of this project, the data posted to HydroShare will be ‘Owned’ by the individual and/or the institutional data manager, with co-ownership and sharing settings managed by the responsible owner.

March 20, 2018 Meeting Summary

In the RAPID Almost Like Maria Project Folder, there is a new folder called All-Hands Meetings where you can find the Meeting Notes and this March 20 2018 Shared Materials folder with links to William Rhoads's slides on the Puerto Rico Drinking Water Campaign, which provide the dates on which VT projects that various data products will be available.

Graciela Ramirez-Toro gave an overview of her work since 2005, which has included studying how interventions to drinking water quality function based on education and health outcomes. Samples within drinking water systems have implications for human subjects (IRB), with regulations to follow for Homeland Security publications. These regulations frame the infrastructure development use case - the data publication and sharing must fit these protocols.

The Drinking Water Campaign update described the project focus on small community systems that have ongoing challenges as well as unique post-hurricane Maria water quality concerns. The study design includes 1) six small mountain systems with intermittent chlorination treatment (IRB covers these) with small operating budgets supported by shared community resources and tank storage. Samples were collected from raw source and “treated” water at tank. 2) one PRASA EPA regulated facility downstream of the sampled community systems, 3) grab samples from homes across distribution system, 4) grab samples from instream samples along system, 5) wastewater network grab samples to the north with a higher population/industrial use. VT data analysis will focus on molecular/microbial analysis and IAUPR water quality analysis consistent with ongoing studies.

Short Introductions of ongoing work by Miguel, Christina and Jim highlighted the need for the following Working Group focused meetings with scheduling links below. Here are links to recent videos and Jim Phuong's slides on the Population Health qualitative data archive and here is a link to a HydroShare resource where Jim has loaded links to post-Maria population health videos and reports, including the March 19 Six-Month Check-up which we recommend.

March 3, 2018

For reference, the contents of the original proposal plus updates are currently available on this Github website: PuertoRicoWaterStudies/wiki

Portions of the webpage that require review by the team include the recently updated Collaborators page, and the Drinking Water campaign data protection and sharing section at the bottom of the Data Management Plan page. Please let me know if you have edits - would you like to include your students?

Virginia Tech is currently collecting water samples in Puerto Rico with Graciela's support, with analysis planned for April-May. I am working on coordinating smaller design workshop/meetings for May-June when we have had time to get this data into HydroShare, and we can collect user data on infrastructure design.

Upcoming Activities

March: Puerto Rico Field Trip and Analysis (after March 11); start bi-weekly All-Team meetings

April: Data Analysis by CECIA and VT (molecular work completed)

May: Data Analysis by CECIA and VT (data completion expected; 16S amplicon sequencing finish TBD); SI2 PI meeting (short presentation on project to NSF May 30)

June: User Testing and resiliency planning in PR at LCZO meeting, w/CECIA, and w/non-PRASA selected pilot

July: Data Science for Social Good App development (proposed) for health researchers

Aug: Continued development

Sept: Publication writing

Oct: Publication writing

Nov: (Optimistic) Switch public data from private to public. Publish AGU EOS article on RAPID hurricane data archive on HydroShare (with Harvey and Irma).

Links to previous work

January Tasks: See Github Issue

February Tasks: See Github Issue, Field Campaign Design, Add Graciela to RAPID

Update Log (reverse chronology)

February 23, 2018

Although most of us are just beginning our work on this project, the Virginia Tech team has been working hard to coordinate the drinking water data collection scheduled for Feb 24-March 11. We have a new project collaborator, Graciela Ramírez Toro, who is Director of the Center for Environmental Education, Conservation and Research of the Inter American University (IAU) of Puerto Rico. She has been gracious enough to help coordinate and design a field campaign which would not have been possible without her help, and we hope this work will contribute to her 15+ years of drinking water data collection before Hurricane Maria. I would also like to introduce Tim Ferguson-Sauder, an Olin College faculty member who has received support from his college to work on science communication and design with us this summer; and Julia Hart, a writer and communications expert with UW Freshwater. Please click here to see the full list of collaborators (and update if edits are needed).

We will be in touch more frequently from now through November and I would like to schedule a regular bi-weekly meeting. Could I get a quick response on whether Tuesdays (afternoon EST or morning PST) are a good time to meet? If I get too many varied responses, I'll set up a Doodle poll to find a time.

We have two HydroShare groups initiated for publishing our public and confidential data archives: Puerto Rico Water Studies and Puerto Rico Water Studies: Confidential. We will be working closely with VT and IAU colleagues to determine how we use HydroShare to design standards of practice for using cyberinfrastructure to support drinking water and post-disaster data collection and future research uses.

We have two online press releases available. Let me know if your institution wants to include this project in their outreach. Here is an example from UW eScience Institute: RAPID grant awarded for Puerto Rico research and from RENCI: Delving into the data from Hurricane Maria.

We now have a HydroShare/PuertoRicoWaterStudies Github repository where we have started tracking Issues and a Wiki for documenting our project work. If you have not done so already, please let Jim Phuong ([email protected]) know your Github User ID if you would like to be added as a collaborator to this repository.

December 12, 2017

Yesterday we got news that our RAPID has been awarded - this was the final step in the NSF Fastlane administrative system that allows us to begin the project. Congratulations!

Here are some requests, notifications, opportunities and housekeeping that will guide our next steps. I would allocate 5-10 minutes to read this email and respond to HydroShare related requests.

  1. We proposed to use HydroShare as a Central repository, but will also distribute through other community and domain specific websites. I have uploaded the proposal (full Fastlane pdf and Distribution version) and clarifications to this HydroShare resource.

Requests:

a) Please join HydroShare and complete your profile. We will use the profiles to make a group introduction slide show that everyone can browse before/during virtual meetings. Once you are a HydroShare user, we can add you as authors, owners, and contributors to this and new resources that are created.

b) Please use this resource when citing this project; we can digitally add it as metadata using the HydroShare 'derived from' feature in our future work.

c) Please let Jim Phuong ([email protected]) and me know if you have questions and preferences on sharing settings and authorship. This will help us coordinate the development of a team data sharing and publication policy.

  2. The American Geophysical Union annual meeting is this week. I received late special permission to include a poster that describes our proposal. The printed version includes all of your names, but I would like to coordinate a three week review period where I hope you have the opportunity to contribute to the digital version. This is a link to the public pdf version you can download without being a HydroShare user. This is a link to the poster HydroShare resource.

Requests: Please review the pdf or Google Slide and let Jim Phuong ([email protected]) and me know if you have edits and your preferences for being included on this poster. Once you have completed 1) above, we will apply the metadata related to your contributions. I aim to finalize and formally publish the poster HydroShare Resource by January 8, 2018. We can use this as a communication tool at workshops for orienting colleagues who are new to the project.

  3. Drinking Water Working Group - Notification: The Virginia Tech team can now start planning their Water Sampling Data Collection campaign in Puerto Rico. I'll work directly with Kelsey Pieper and update the full team when we know the next steps.

  4. Health Researchers Working Group meeting: Great news! The UW eScience Institute has generously offered to support a UW Seattle campus workshop for health research users of our emerging infrastructure.

Opportunity: If you would like to be involved in user experience design for Health Researchers, please fill out this Doodle poll for an early January meeting to start coordinating the first steps for user-based design.

  5. Housekeeping: Team Communications, including this email, will be stored in the UW Google Drive Folder: Freshwater/Projects/RAPID Almost Like Maria. This will facilitate engaging students on projects using this data and infrastructure. In the future we will move this type of information to a project blog or Slack channel. Let me know if you have strong opinions about NOT wanting detailed information in emails like this, and have a preferred communication strategy for large group projects.

  6. Project Landing Page - CUAHSI is our partner who will be helping coordinate our primary outreach and landing page. Opportunity: If you have opinions and preferences about our outreach and data sharing processes, please email me and I will include you on communications with our CUAHSI team as we plan next steps.