Circuitscape use case #6

Open
c42f opened this issue Nov 30, 2020 · 0 comments

Background

I've been looking at Circuitscape.jl as an interesting use case for DataSets.jl. Here's a design for how DataSets could support Circuitscape user workflows.

Circuitscape is an interesting case because it's a complete application with its own data management code: the Circuitscape.compute() function takes a config file and uses it to discover the input data and output location, and the Circuitscape.start() function is a wizard that helps users create such a config file. Because DataSets tries to do IO management and data discovery, some of the data discovery parts of Circuitscape should be replaced with a DataSets-based interface.

I think users should be able to interactively

  • Manage their project datasets — provided by the data REPL (in future, perhaps some GUI data browser)
  • Launch circuitscape jobs — provided by a data REPL run command.

Workflow example

Here's a quick sketch of the workflow:

The wizard Circuitscape.start() acts as it does currently, but instead of linking to existing data in some arbitrary location in the filesystem, it copies the data into a new DataSet. The type of that dataset can be CircuitScapeInput or some such; internally it's just backed by the exact same directory structure as Circuitscape currently has.

data> run circuitscape   # If run with no data, calls start (?)

# wizard steps ...

[ Info: Created new input dataset `raster_pairwise_1`

data>
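
As a rough illustration of the copy step, here is a minimal sketch that assumes a dataset is nothing more than a named directory under a project root. The functions `next_dataset_name` and `create_input_dataset` and the `<prefix>_<n>` naming scheme are hypothetical; they are not part of DataSets.jl or Circuitscape.jl, and a real implementation would presumably also record the dataset type (for example CircuitScapeInput) in the project's metadata.

```julia
# Hypothetical helpers for illustration only; not DataSets.jl or Circuitscape.jl API.

# Pick the first unused name of the form `<prefix>_<n>` under `project_root`.
function next_dataset_name(project_root::AbstractString, prefix::AbstractString)
    n = 1
    while ispath(joinpath(project_root, "$(prefix)_$n"))
        n += 1
    end
    return "$(prefix)_$n"
end

# Copy the user's input files into a fresh dataset directory and return its name.
# Assumption: a "dataset" here is just a directory holding the copied files.
function create_input_dataset(project_root::AbstractString, prefix::AbstractString, files)
    name = next_dataset_name(project_root, prefix)
    dest = joinpath(project_root, name)
    mkpath(dest)
    for f in files
        cp(f, joinpath(dest, basename(f)))
    end
    @info "Created new input dataset `$name`"
    return name
end
```

Copying rather than linking means the dataset stays valid even if the original files are later moved or edited, at the cost of extra disk space.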

I'm imagining that Circuitscape.compute() would be replaced by the data REPL run command, which would also add functionality for listing which data is available to run with. Something like:

Available circuitscape input data:
  📂 raster_pairwise_1      type=CircuitScapeInput
  📂 raster_one_to_all_1    type=CircuitScapeInput

data> run circuitscape raster_pairwise_1 output_1!
[ Info: ...

data> ls
  📂 output_1               type=CircuitScapeOutput
  📂 raster_pairwise_1      type=CircuitScapeInput
  📂 raster_one_to_all_1    type=CircuitScapeInput

For run to work, the data REPL needs to be resurrected and taught to look at the database of entry points which is currently set up by @datafunc. Then Circuitscape would declare several data entry points with @datafunc circuitscape to hook into data> run.
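
To make the dispatch idea concrete, here is a minimal sketch of an entry point registry and a run dispatcher. Everything in it is an assumption for illustration: `ENTRY_POINTS`, `register_entry_point!`, `run_command` and `circuitscape_entry` are hypothetical names, and the sketch does not use or reproduce the actual @datafunc machinery in DataSets.jl.

```julia
# Hypothetical sketch; none of these names are part of DataSets.jl or Circuitscape.jl.

# Registry of entry points, keyed by the name typed after `run`.
const ENTRY_POINTS = Dict{String,Function}()

# Stand-in for whatever an entry point declaration would record.
register_entry_point!(name::AbstractString, f::Function) = (ENTRY_POINTS[String(name)] = f)

# Parse a data REPL line of the form `run <entry point> [datasets...]` and dispatch it.
function run_command(line::AbstractString)
    words = split(line)
    if length(words) < 2 || words[1] != "run"
        throw(ArgumentError("expected `run <entry point> [datasets...]`"))
    end
    name = String(words[2])
    haskey(ENTRY_POINTS, name) || throw(ArgumentError("no entry point named `$name`"))
    return ENTRY_POINTS[name](String.(words[3:end])...)
end

# Toy entry point: with no arguments it stands in for the wizard,
# with an input and an output name it stands in for a compute job.
function circuitscape_entry(args::String...)
    isempty(args) && return "start wizard"
    length(args) == 2 || throw(ArgumentError("expected <input> <output>"))
    input, output = args
    return "compute $input -> $output"
end

register_entry_point!("circuitscape", circuitscape_entry)
```

With this in place, `run_command("run circuitscape")` reaches the wizard branch and `run_command("run circuitscape raster_pairwise_1 result_1")` reaches the compute branch. A real version would resolve the dataset names against the project and open them with the declared types before calling the entry point.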
