build documentation containers #2355

Closed
iesahin opened this issue Apr 5, 2021 · 7 comments
Labels
A: docs Area: user documentation (gatsby-theme-iterative) status: stale You've been groomed! type: enhancement Something is not clear, small updates, improvement suggestions

Comments

@iesahin
Contributor

iesahin commented Apr 5, 2021

Currently, the GS, UC, UG and REF examples in dvc.org use code samples that are supposed to be run by the user after downloading example-get-started and the example SO dataset. For exposition purposes this is fine, but we can do better.

We can have a set of documentation containers that have DVC, the example project, and the data needed to test the commands. This reduces the steps to test DVC to a single command, e.g., docker run -it dvcorg/doc-start, which we can mention in the docs.

More importantly, this also allows us to use automated tests to validate the examples. Each documentation page can have an associated container, which runs the examples in the page and checks their outputs. When something is updated in the DVC interface, the example project, or the data, all the documentation can be tested against this update, and we'll be sure the documentation doesn't describe something that's not there. (Sins of omission are a bit harder to detect, though.)

It may also be used to fill the appropriate parts with the command outputs, but I think tracking just the changes is enough for the time being. Automated updates can go awry and we need some kind of writer intervention anyway.

I have built several containers for the Katacoda scenarios in https://github.com/iterative/dvc-doc-containers. I plan to extend these to cover all the documentation.

I have also built a preliminary markdown code runner that runs the code blocks in the documentation in a Docker container. It's currently in https://github.com/iesahin/markdown-code-runner but will be transferred to https://github.com/iterative/markdown-code-runner after a 0.1 release.
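To make the idea concrete, here is a minimal sketch of what such a runner could do. It is not the code in markdown-code-runner. It assumes the docs mark shell examples with fenced blocks tagged `dvc` (or `bash`/`console`) and prefix commands with `$ `; the function names are hypothetical. It only builds the `docker run` argument list and does not execute anything.

```python
import re

TICKS = "`" * 3  # fence marker, built indirectly so this example nests cleanly

# Group 1 is the fence's language tag, group 2 is the block body.
FENCE = re.compile(r"^" + TICKS + r"(\w*)\n(.*?)^" + TICKS, re.MULTILINE | re.DOTALL)


def extract_commands(markdown, languages=("dvc", "bash", "console")):
    """Collect `$ `-prefixed command lines from fenced blocks in the given languages."""
    commands = []
    for lang, body in FENCE.findall(markdown):
        if lang not in languages:
            continue
        for line in body.splitlines():
            if line.startswith("$ "):
                commands.append(line[2:].strip())
    return commands


def docker_argv(image, commands):
    """Build a `docker run` argv that runs all commands in one shell,
    stopping at the first command that fails."""
    script = " && ".join(commands)
    return ["docker", "run", "--rm", image, "sh", "-c", script]
```

The resulting list could be passed to `subprocess.run` in CI, with a non-zero exit code marking the page as broken. Comparing command output against the text in the docs would need an extra step that is not shown here.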

Any comments, ideas, and questions are welcome. Thank you.

Related to #2318

@shcheklein @dberenbaum @jorgeorpinel

@iesahin iesahin self-assigned this Apr 5, 2021
@iesahin iesahin added A: docs Area: user documentation (gatsby-theme-iterative) type: enhancement Something is not clear, small updates, improvement suggestions labels Apr 5, 2021
@jorgeorpinel
Contributor

jorgeorpinel commented Apr 5, 2021

Hmm, good idea but...

Most docs examples are not meant to be reproducible. Really only the Get Started (which already has a repo you can download and run) and the one Tutorial (also has downloadable assets) are. So I'm not sure this is worth the effort and maintenance (other than for Katacoda scenarios), but maybe in the future?

We have other ideas to improve code examples like #759 which seem like lower hanging fruit to me 🍎

Also, could we try to summarize a bit and/or merge this with #2354? It's a lot of reading for the same one thing, unless I'm not getting this correctly. Thanks!

@iesahin
Contributor Author

iesahin commented Apr 6, 2021

> Most docs examples are not meant to be reproducible. Really only the Get Started (which already has a repo you can download and run) and the one Tutorial (also has downloadable assets) are. So I'm not sure this is worth the effort and maintenance (other than for Katacoda scenarios), but maybe in the future?

Hi @jorgeorpinel

If REF examples are not reproducible, how can we know they will continue to work after updates? Testing them manually is not feasible, and relying on users to report errors is not a good approach, IMHO. For example, after DVC 3.0 there should be no dvc run examples, there will probably be many changes in other options, and we'll need to test all the command examples.

I don't think we should mention "install this, install that" in REF examples, but people copy and paste these commands into their shells. Providing a way to do this with the example data will increase adoption, I think.

> We have other ideas to improve code examples like #759 which seem like lower hanging fruit to me 🍎

Having different tabs for code examples is a nice idea, but I don't think it's lower-hanging fruit. It requires an engine update, and I would prefer some automated conversion to Windows syntax instead of typing the examples manually. Basically, it's asking the engine to convert \ to ^ (line continuation) and / to \ in paths. (It may also need to put C:\ somewhere, IDK.) It's not very complicated, but I'm not a Gatsby expert and will probably need to study the current setup deeply before making any such change. (Also, I haven't touched a Windows box in the last two years, in case you decide on manual updates.)
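As a rough illustration of the conversion described above, a naive sketch could look like the following. The function name is hypothetical, and the rules are deliberately simplistic: a trailing \ becomes ^, and / becomes \ in every space-separated token that is not a URL. It does not handle drive letters, quoting, or options that happen to contain slashes.

```python
def to_windows(command):
    """Naively convert a POSIX shell example to Windows cmd syntax."""
    converted = []
    for line in command.splitlines():
        # A trailing backslash is a POSIX line continuation.
        continued = line.rstrip().endswith("\\")
        if continued:
            line = line.rstrip()[:-1]
        tokens = []
        for token in line.split(" "):
            # Leave URLs alone; treat every other slash as a path separator.
            if "://" not in token:
                token = token.replace("/", "\\")
            tokens.append(token)
        line = " ".join(tokens)
        # cmd uses a caret for line continuation.
        converted.append((line + "^") if continued else line)
    return "\n".join(converted)
```

Even this small version shows where manual review would still be needed, for example when a slash is part of an argument that is not a path.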

#2354 is just a naming convention issue. It will be closed after we decide on something. This one is the general documentation containers issue.

Thank you very much.

@jorgeorpinel
Contributor

Having a dvc command syntax checker would definitely be nice, for automating "regression testing" our example code blocks. I just don't think an entire Docker container per doc is needed for that 🙂

@iesahin
Contributor Author

iesahin commented Apr 7, 2021

> I just don't think an entire Docker container per doc is needed for that

If examples can run in the same container, I don't see a need for multiple containers either. I think we can have a single ref-examples container that can run all the examples in REF. (We can use labels for particular commands if needed.) Maybe one container for each UC doc and one for each GS<->Katacoda scenario; the UG pages can share a common container as well. There should be no more than 20, I suppose.

These are not that heavy. Most of the underlying base images are identical (ubuntu or python), so only the differing layers are downloaded or pushed when necessary.

@jorgeorpinel
Contributor

jorgeorpinel commented Apr 7, 2021

OK. What I'm saying is that maybe the issue should be to create a dvc command syntax checker we can use for a docs check, without it having to be Docker containers. That is one possible implementation, but is it something we can run on GitHub CI, or is it too heavy/slow for that? An alternative is to see what the autocomplete scripts currently do and try to reuse that from a new shell script. Or maybe consider creating a formal dvc lexer.
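For comparison, a container-free check could be as small as the sketch below. It is only an illustration: `KNOWN_SUBCOMMANDS` is a hand-written, incomplete set chosen for this example, and a real checker would have to derive the list (and the valid options) from DVC itself or from the autocomplete scripts. The function name is hypothetical.

```python
import shlex

# Illustrative subset only (an assumption); not the full DVC command list.
KNOWN_SUBCOMMANDS = {"init", "add", "repro", "push", "pull", "status", "get", "checkout"}


def check_line(line):
    """Return None if the line looks fine, otherwise an error message."""
    try:
        tokens = shlex.split(line)
    except ValueError as exc:  # e.g. unbalanced quotes
        return f"cannot parse: {exc}"
    if not tokens or tokens[0] != "dvc":
        return None  # not a dvc command; nothing to check
    # Ignore options; the first remaining word is taken as the subcommand.
    args = [t for t in tokens[1:] if not t.startswith("-")]
    if not args:
        return None
    if args[0] not in KNOWN_SUBCOMMANDS:
        return f"unknown dvc subcommand: {args[0]}"
    return None
```

A check like this is cheap enough for any CI, but it only catches unknown subcommands. It cannot tell whether an example still produces the output shown in the docs, which is the part that needs the example project and data.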

And if we generalize the issue in that way, is there any Katacoda-specific Docker container work to split into a separate issue or is that all done? Thanks

@jorgeorpinel
Contributor

That said if you have a proof-of-concept container that can try all the docs samples currently and seems to be useful, do share @iesahin! 🙂

@jorgeorpinel
Contributor

@iesahin do you think this is still relevant? Please reopen and update if so. Thanks


2 participants