Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

view: Support narratives #129

Closed
wants to merge 1 commit into from
Closed

view: Support narratives #129

wants to merge 1 commit into from

Conversation

tsibley
Copy link
Member

@tsibley tsibley commented Sep 2, 2021

Previously the launched Auspice would either show baked in test
narratives or no narratives at all, depending on the Auspice version.

Resolves #128.

Previously the launched Auspice would either show baked in test
narratives or no narratives at all, depending on the Auspice version.

Resolves #128.
Copy link
Contributor

@huddlej huddlej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love how simple this change is, but it won't support the interface for narratives files we've been recommending in our documentation. I think we've always recommended that users store their narratives in a separate narratives/ directory to distinguish descriptions of datasets from the datasets themselves. Asking users to store their files in the same directory for nextstrain view and in different directories for auspice view will be confusing.

If we want to generally avoid pass-through arguments, could we add an optional argument for --narratives-dir to support the currently recommended file system layout?

I'm also curious about @jameshadfield's thoughts here. If a single directory for auspice JSONs and narratives would be easier for users anyway, we could rethink our historical choice to split these files into separate directories.

@jameshadfield
Copy link
Member

jameshadfield commented Sep 24, 2021

I'll note that we enforce this pattern (./auspice and /narratives) for community datasets. This was discussed a lot at the time and from memory was a very deliberate choice.

From a broader (and more debatable) perspective, should the CLI expose the same options as the tool it's wrapping? For instance, with nextstrain build we can pass through arguments which are then interpreted by snakemake, so should this be the case for nextstrain view?

@tsibley
Copy link
Member Author

tsibley commented Sep 28, 2021

Hmm. This is super helpful feedback and will change the implementation of this PR, so thank you both! nextstrain view should definitely Just Work with the auspice/ and narratives/ convention.

For context, the main reason nextstrain view (as exists now) takes a directory containing *.json instead of a directory containing auspice/*.json is precisely to avoid requiring specifically-named directories. The practice of requiring specific names closely couples file contents with file names/directory hierarchy, which are historically things that were explicitly kept decoupled (i.e. this is what makes the shell and Unix's "everything is a file" so powerful). Specific naming requirements end up manifesting as boilerplate steps a user or program must take, such as mkdir narratives or cd narratives, which manifest into extra documentation to write.

Given that, I see auspice/ and narratives/ as useful conventions to help people stay organized, but things we should avoid requiring. At the same time, as I said above, nextstrain view absolutely needs to Just Work with our conventions.

When nextstrain view was first written, Auspice didn't support local narratives yet, just local datasets, so this reconcilation is long overdue!

If we want to generally avoid pass-through arguments, could we add an optional argument for --narratives-dir to support the currently recommended file system layout?

It feels weird to make the main nextstrain view argument for datasets and relegate narratives to an optional parameter. They seem like they should be on equal footing even if datasets are currently more prevalent.

I have another proposal:

  • If the directory given to nextstrain view contains an auspice/ directory, then use that as the dataset dir.
  • Otherwise, fallback to existing behaviour and expect the given directory to directly contain *.json dataset files.
  • If the given directory contains a narratives/ directory, then use that as the narratives dir.
  • Otherwise, fallback to the behaviour in this PR and expect the given directory to directly contain *.md files.

This is backwards incompatible, but only if you have some datasets in a directory which also has an auspice/ directory in it. This seems unlikely enough, and the upside is that if you're following our build directory layout conventions (auspice/ and optionally narratives/), then nextstrain view <build> Just Works, which is nice, and also aligns the directory argument for nextstrain view with that of nextstrain build.

At the same time, it supports an arbitrary directory containing a pile of dataset and/or narrative files (except for the narrow case of also containing an auspice/ or narratives/ directory).

I'll note that we enforce this pattern (./auspice and /narratives) for community datasets. This was discussed a lot at the time and from memory was a very deliberate choice.

Good point! I think this is the only place we truly require it, though?

I'd love if we starting keeping a record of design decisions like this in a central place, so we can refer back to why we made them.

Constraints like this where "there's only one way to do it" can be useful when the alternatives are not reasonably possible to implement. Alternatives, in this case, could be allowing both kinds of files at the repo root, at arbitrary paths, or at some single but per-repo configurable path. It'd be helpful to know what was considered or not considered and why choices were chosen or not.

From a broader (and more debatable) perspective, should the CLI expose the same options as the tool it's wrapping? For instance, with nextstrain build we can pass through arguments which are then interpreted by snakemake, so should this be the case for nextstrain view?

I think that wrappers like the CLI should generally strive to avoid pass-thru args when possible, as it defeats much of the benefit of wrapping, which is an interface that intends to be smaller, unified across disparate tools, and reside at a conceptually different level. At the same time, pass-thru is sometimes necessary and still the best choice for supporting advanced/uncommon use cases.

See also #128, where @huddlej and I touched on pass-thru arguments:

pass arguments after the first positional argument through to auspice the same way nextstrain build passes arguments through to Snakemake

I prefer even more to avoid this sort of pass thru of arguments when possible, as it can be hard to explain to users who aren't familiar with the concept and it can make it harder to change the interface in the future because pass thru breaks the encapsulation otherwise provided.

@tsibley tsibley marked this pull request as draft September 28, 2021 00:17
@jameshadfield
Copy link
Member

Interesting thoughts, as always. I think all of our build pipelines have adopted the convention of using auspice/ and narratives/. For nextstrain.org it's a different story: {core, staging}, groups and community all do things differently. I think each of these was well thought out on their own, but perhaps without consideration of a unifying structure.

I'll note that auspice view doesn't care where things are stored, it's up to the user to provide this information (i.e. "what do you want to view"). Perhaps one day we will even be able to supply remote paths here. (If these arguments weren't provided, then my intention was to fall back to our convention and look for ./auspice and ./narratives, but this doesn't work on all OSs.)

@tsibley
Copy link
Member Author

tsibley commented Sep 28, 2021

For nextstrain.org it's a different story: {core, staging}, groups and community all do things differently. I think each of these was well thought out on their own, but perhaps without consideration of a unifying structure.

Nod.

For core and staging, eventually, I think it'd be useful to move the narratives into our production S3 buckets used for datasets. This could be namespaced under a narratives/ object prefix, or not. The existing https://github.com/nextstrain/narratives GitHub repo would be maintained, with the change being that narratives are deployed/uploaded the same way as datasets instead of sourced directly from git.

For Groups, the current CLI interface in development centers around existing nextstrain.org URLs, e.g. nextstrain.org/groups/x/a/b refers to a dataset while nextstrain.org/groups/x/narratives/c/d refers to a narrative. Ditto for nextstrain.org/mumps/na vs nextstrain.org/narratives/intro-to-narratives. Where/how the files are stored on the server-side is then an implementation detail that doesn't matter to the user, so we're free to change it from the current practice of mixed together in one namespace.

(If these arguments weren't provided, then my intention was to fall back to our convention and look for ./auspice and ./narratives, but this doesn't work on all OSs.)

Oh, what about {Linux, macOS, Windows} is preventing this?

@tsibley
Copy link
Member Author

tsibley commented Nov 24, 2021

Closing, as we're going to implement a different solution (described in my comment above). I'll update #128 with the summary.

@tsibley tsibley closed this Nov 24, 2021
@tsibley tsibley deleted the view/narratives branch July 18, 2022 19:02
tsibley added a commit that referenced this pull request Dec 6, 2022
Previously the launched Auspice would either show baked in test
narratives or no narratives at all, depending on the Auspice version.

Wee bit of an oversight. orz

Resolves #128.
Supersedes #129.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

view: Allow user to specify narratives directory
3 participants