Best way to download notebooks created from myst-notebook files? #148

Open
choldgraf opened this issue Apr 9, 2020 · 47 comments
Labels
enhancement New feature or request

Comments

@choldgraf
Member

Currently, when notebooks are created for a page, they end up in jupyter_execute, and are somehow able to be downloaded with the download-jupyter role.

I am trying to figure out the right way to expose download links for all notebooks so that themes can add the ability to download them. E.g. for this dropdown menu:

image

It seems that download-jupyter: creates a one-off hash for the notebook being downloaded. Does it make sense to do this for all notebook content? Is there a better way that I could do this?

@choldgraf choldgraf added the enhancement New feature or request label Apr 9, 2020
@choldgraf
Member Author

Another thought on the generated notebooks - @amueller mentioned that authors may be hesitant to include MyST-specific markdown in Jupyter Notebooks they expect their readers to download and run, because the Jupyter interfaces don't support MyST markdown.

So, I wonder if another feature here could be that MyST-NB also uses Sphinx to output "regular markdown" in the "downloadable" notebook for input notebooks / MyST-notebook files (e.g. that have regular markdown links instead of {ref} in there).

That could be tricky to do, but might be a way to satisfy this condition before we get support for myst-markdown in the jupyter interfaces themselves.

@chrisjsewell
Member

e.g. that have regular markdown links instead of {ref}

I’m not sure what you mean by this, can’t you already use regular links?

@choldgraf
Member Author

Yeah, that was a bad example. I mean things like admonitions, figure or equation directives, etc

@amueller

yeah doing a figure means that if someone looks at the notebook they will not see anything, which is not great.

@chrisjsewell
Member

A figure doesn't have a "regular" markdown equivalent though, that's the main reason for using directives; to extend markdown.
You can use an image ![alt](image/path.png), if you want to have it show up in markdown, but then obviously you can't have a caption.
The emphasis here IMO should be to provide equivalent Markdown syntax extensions, using markdown-it/markdown-it-py, to allow you to write in syntax that Jupyter will support in the first place (rather than doing any post-conversion). For example #126 (comment) will allow you to write equations without specifically using the math directive.
Similarly for admonitions, I want to write an extension to allow for use of fenced divs for admonitions:

:::{note}
My note
:::

@choldgraf
Member Author

Yeah I agree that the best long-term solution here is to support this syntax in Jupyter interfaces via something like a MyST plugin (in jupyterlab / notebook / vscode / etc).

@phaustin
Contributor

phaustin commented May 21, 2020

I get the vague impression that jupyterlab 3.0 is going to change the extension bundling machinery so that a user-selectable markdown parser would be easier to deploy/implement?
jupyterlab/jupyterlab#8385

@choldgraf
Member Author

There's also jupyterlab/jupyterlab#272 where they're discussing using markdown-it as the markdown parser in jupyterlab. If that lands, then it would be much easier to build MyST functionality on top of that parser, since markdown-it-py has much of the same structure

@amueller

amueller commented May 21, 2020

@chrisjsewell I'm not sure I follow. Sure, there is no direct equivalent in markdown, though you could create html that's equivalent. For numbering and referencing that would need to be either post-processing or somehow needs to be supported by jupyter lab.

I think the goal I have in mind is pretty straight-forward: I want a jupyter notebook that has the content that I wrote but that can also be executed. Right now, I can produce content via jupyter notebook as an editor, but there is no way to view the content as a jupyter notebook. I.e. there is no way for the user to execute the code while seeing the figures.

Having an extension would certainly make jupyter notebooks a better editor for writing jupyter book content, but I don't think it's a feasible solution for the consumer side: it's a giant barrier to entry to ask someone to install an add-on so they can read your notebook [unless Anaconda has installed this add-on by default].

@chrisjsewell
Member

@amueller I think I see your point-of-view 😬 but I think it would be best if you could provide a minimal example of a notebook, that you think is currently "unreadable", that we can talk around, and perhaps an example of what you think the notebook should look like

@amueller

amueller commented May 21, 2020

I wouldn't say it's unreadable, but some parts are missing. Figures are missing, notes are missing, sidebars are missing - unless the reader clicks into a markdown cell and finds some directive that's not supported and reads the content. Though for a figure she still won't see it unless she edits the markdown.

Maybe a minimal example is a notebook with a figure and a note. If you open that in Jupyter, it will show two empty markdown cells, aka a white page.
What it would ideally show is a figure and a note.

Some things are not easily possible in jupyter I think, like sidebars. But I'd prefer to have a sidebar rendered inside the text rather than have it completely hidden from the reader.

@chrisjsewell
Member

chrisjsewell commented May 21, 2020

If you open that in Jupyter, it will show two empty markdown cells, aka a white page.

Are you sure about that?

```{figure} https://miro.medium.com/max/512/1*d69DKqFDwBZn_23mizMWcQ.png
This is my caption
```

```{note}
This is a note, but it won't be *formatted*
```

image

Then what I meant by adding extensions to markdown-it-py, is that you could then write something like this, which renders in the notebook (with no add-ons) but would still be parsed correctly by MyST (given you activate the extensions).

![](https://miro.medium.com/max/512/1*d69DKqFDwBZn_23mizMWcQ.png)
!This is my caption

:::{note}
This is a note, and it will be *formatted*
:::

image

@choldgraf
Member Author

choldgraf commented May 22, 2020

Yeah - I think that basically the only options are:

  1. Find ways to inject raw HTML into generated notebooks when a book is built so that it will show up in a jupyter interface
  2. Find ways to support MyST markdown syntax within Jupyter interfaces

To me, 2 is a cleaner and longer-term solution (maybe also simpler as well?). This is just a limitation of the fact that Jupyter only supports CommonMark, which doesn't have support for any of the fancier formatting we're talking about (which is why people tend to hack the same results with raw HTML)

@amueller

@chrisjsewell
Hm you're right, it is not formatted but it's there. Somehow I thought I was missing content, but I guess that was only figures. I'll see if there was something else missing.

@choldgraf I think @chrisjsewell had something in between in mind (for now) which basically renders reasonably ok in Jupyter.

I would totally agree that 2 is the cleaner and nicer long-term solution. We'll see how my book evolves. But while @chrisjsewell's solution would be better than the current situation, I don't find it entirely satisfying. I'll be putting hundreds of hours into formatting these pages, I can put another couple hours into a CI job that replaces the myst markdown with some html.

I'm not saying this is a solution that should be supported by jupyter-book, as it is a bit ugly and adds more abstractions and moving pieces, I'm just saying, as someone writing a book, I'd rather have the extra work than have ugly formatting in my book.

@chrisjsewell
Member

chrisjsewell commented May 22, 2020

there is no way for the user to execute the code while seeing the figures.

BTW what you're talking about is also reminiscent of https://jupyterbook.org/interactive/launchbuttons.html?highlight=thebelab#live-interactive-pages-with-thebelab. I'm certainly not saying your use case doesn't have merit, but surely the point of creating a HTML book is that people read that, rather than downloading all the individual notebooks, having to open them via Jupyter, and then reading those?

@amueller

amueller commented May 22, 2020

but surely the point of creating a HTML book is that people read that, rather than downloading all the individual notebooks, having to open them via Jupyter, and then reading those?

@chrisjsewell I guess that's the disconnect. To me both are equally important.
I want a book that is available as executable jupyter notebooks and as rendered website [and as printed book probably]. I might even be tempted to say the executable notebooks are more important than the website.
If that's not the goal of jupyter-book, then that's of course fine, but it's certainly my goal.
And I don't think of it as 'creating an HTML book'. I think of it as writing a book, and wanting to provide as many convenient ways for people to consume the materials as possible.

@choldgraf
Member Author

choldgraf commented May 22, 2020

Good point - I think there will always be trade-offs, but I think in general we should try to push for a top-quality experience in each of: the content files themselves, the rendered HTML, and the rendered PDF. In the current phase, I think we are probably prioritizing them in the order of HTML > PDF > ipynb, but I think this will shift back-and-forth over time

@amueller

Yeah agreed, there are certainly trade-offs. Fixing the PDFs will be technically somewhat simpler in my experience (I went through all of this when I wrote my last book, which is entirely in jupyter and was converted to asciidoc). It "just" means fixing the latex that's generated. Though actually there are some issues there as well if you're using pandoc (are you?), because the internal representation of pandoc is somewhat restricted. IIRC, pandoc can't do cell spans in tables, so you can't directly use it to create latex that does. Also pandoc doesn't convert raw html that's inside markdown. You can probably see all the pandoc issues I opened 4 years ago still ;)

@chrisjsewell
Member

@amueller

amueller commented May 22, 2020

That's for parsing the markdown, not for generating the latex, though, right?
Oh is it sphinx generating the latex? I guess that has its own engine that's not pandoc. I know very little about that.

@chrisjsewell
Member

Yes markdown-it-py parses to its representation of tokens, then myst-parser converts these to the docutils node tree used by sphinx, which has output specific builders.

@phaustin
Contributor

One thing that's worked fairly well for us to bridge the notebook/rst personalities of an md:myst file is to make sure that every figure is isolated as a jupyter cell using the jupytext cell delimiters, with a simple cell metadata tag like 'fig'. So turning the md:myst file into a notebook that doesn't scare students just requires a script that uses jupytext.read to get it into the nbformat tree, an operation that transforms the figure cells, and jupytext.write to write the denatured notebook, sync and execute.
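
A minimal sketch of the "transform the figure cells" step, assuming each figure cell carries a `fig` tag in `metadata.tags` and holds a single backtick-fenced MyST `figure` directive. The name `denature_figures` and the regex are illustrative assumptions, not the script described above. The notebook is treated as a plain nbformat-style dict so the sketch runs without extra dependencies; in the real workflow `jupytext.read` would supply the notebook and `jupytext.write` would save the result.

```python
import re

# Matches a backtick-fenced MyST figure directive:
#   ```{figure} <path>
#   <options and caption>
#   ```
FIGURE_RE = re.compile(
    r"```\{figure\}\s+(?P<path>\S+)\n(?P<body>.*?)```",
    re.DOTALL,
)


def denature_figures(notebook):
    """Rewrite figure directives in cells tagged 'fig' as plain Markdown images."""
    for cell in notebook["cells"]:
        tags = cell.get("metadata", {}).get("tags", [])
        if cell.get("cell_type") != "markdown" or "fig" not in tags:
            continue

        def to_markdown(match):
            # Drop directive options such as ":name: fig1"; keep the caption text.
            caption = " ".join(
                line.strip()
                for line in match.group("body").splitlines()
                if line.strip() and not line.lstrip().startswith(":")
            )
            image = f"![{caption}]({match.group('path')})"
            return f"{image}\n\n*{caption}*" if caption else image

        cell["source"] = FIGURE_RE.sub(to_markdown, cell["source"])
    return notebook


# Hypothetical one-cell notebook, standing in for the output of jupytext.read.
example = {
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {"tags": ["fig"]},
            "source": "```{figure} images/plot.png\n:name: fig1\nThis is my caption\n```",
        }
    ]
}
print(denature_figures(example)["cells"][0]["source"])
```

Cross-reference targets such as `:name:` are simply dropped here, which matches the point made later in the thread that references do not survive in the notebook view.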

@choldgraf
Member Author

choldgraf commented May 25, 2020

another issue in a similar vein, just for another datapoint: jupyter-book/jupyter-book#629

we've started to get a few questions from people saying they are confused because the MyST syntax doesn't display in Jupyter environments (e.g., in the issue above it is the .. figure directive...)

@chrisjsewell
Member

I think admonitions and image/ figure directives are the main ones to prioritize, in terms of extensions for better "round-tripping", maybe we want to spin that off into a separate issue.

For the latter, perhaps direct parsing of HTML img tags into the doctree might be feasible (using beautifulsoup to actually extract the tag options)
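
A rough sketch of the tag-extraction half of that idea. It swaps in the standard library's `html.parser` for beautifulsoup so that it runs without extra dependencies, and the names `ImgTagCollector` and `img_to_directive_options` are hypothetical. It only gathers the attributes that an image directive could take as options; building the actual docutils node is left out.

```python
from html.parser import HTMLParser


class ImgTagCollector(HTMLParser):
    """Collect the attributes of every <img> tag in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            # attrs is a list of (name, value) pairs.
            self.images.append(dict(attrs))

    # Self-closing tags (<img ... />) need no extra handling: the default
    # handle_startendtag forwards them to handle_starttag.


def img_to_directive_options(html):
    """Map <img> attributes onto the options an image directive would take."""
    collector = ImgTagCollector()
    collector.feed(html)
    collector.close()
    directives = []
    for attrs in collector.images:
        options = {
            key: attrs[key]
            for key in ("alt", "width", "height", "class")
            if attrs.get(key)
        }
        directives.append({"uri": attrs.get("src", ""), "options": options})
    return directives


print(img_to_directive_options('<img src="fig.png" alt="A plot" width="200px">'))
```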

@phaustin
Contributor

Yes, this would be ideal for our teaching. A typical course setup will have a textbook or lab manual written in jupyterbook with crossreferencing, equation numbers, figure captions etc., and a set of student labs, which they will work on in jupyter. As long as the figures can be sized correctly in the notebook, things like cross-references are a minor detail -- students can just click over to the html/pdf to see the fully rendered text.

@chrisjsewell
Member

Yeh cross-referencing is probably not easily possible, because by default in sphinx they are also cross-document

@amueller

Totally agree with what is said here. So @phaustin your workflow is having a master md:myst and generating a book and a notebook from it with the notebook getting some extra polish to render nicely? That's basically the workflow I had imagined only my source would have been a notebook with myst, which should be very similar.

@chrisjsewell so round-tripping sounds like doing the conversion, not having directives that work in both environments as in #148 (comment) ?

My setup is very similar to @phaustin, and having students install an add-on can be quite a big barrier.

@phaustin
Contributor

@amueller -- yes, our holy grail is a single myst:md master, with derived versions that have provenance via scripts and metadata giving topic, level of difficulty, whether a cell is a question or a solution, answer key letter etc. So for a quiz, we can write the solution we'll eventually post, strip the cells with the answers, construct the answer key, print a pdf for an in-class exam, or convert to canvas (our lms) qti xml for an online quiz.
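
As an illustration of the "strip the cells with the answers" step, here is a small sketch that assumes solution cells are marked with a `solution` tag in `metadata.tags`. The tag name and the function `strip_tagged_cells` are hypothetical; the thread does not say how the metadata is actually laid out.

```python
import copy


def strip_tagged_cells(notebook, tag="solution"):
    """Return a copy of the notebook without cells carrying the given tag."""
    stripped = copy.deepcopy(notebook)  # leave the master notebook untouched
    stripped["cells"] = [
        cell
        for cell in stripped["cells"]
        if tag not in cell.get("metadata", {}).get("tags", [])
    ]
    return stripped


# Hypothetical quiz notebook as an nbformat-style dict.
quiz = {
    "cells": [
        {"cell_type": "markdown", "metadata": {"tags": ["question"]}, "source": "What is 2 + 2?"},
        {"cell_type": "markdown", "metadata": {"tags": ["solution"]}, "source": "4"},
    ]
}
student_version = strip_tagged_cells(quiz)
print(len(quiz["cells"]), len(student_version["cells"]))
```

Working on a copy keeps the single master file as the source of truth, with each derived version produced by a separate filter.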

@chrisjsewell
Member

so round-tripping sounds like doing the conversion, not having directives that work in both environments as

Well I just mean that myst, on parsing, would read an HTML img tag as an image or figure directive. You would have to write your source documentation using HTML images (rather than the directives), if you wanted the downloadable notebooks to be that way, but then this avoids having to do any one-way post-processing of notebooks

@amueller

Ah, ok. But then I still can't do cross-referencing, right?

I think @phaustin wants cross-referencing in the source document (or at least in one of the versions of the document) - at least that's what I want. Or do you mean you'd write html with some extra syntax that could then be read by myst to create the references?

What I want and what I understood @phaustin to want is:
a) Have an html & pdf export that has cross-references and all the niceties jupyter-book currently has.
b) Have a jupyter notebook (either as source or as export) that renders figures and notes reasonably well and doesn't scare students / readers with weird syntax.

Bonus:
c) Have it written in a version-controllable form (i.e. myst:md).

I'm not sure how your solution achieves a).

@phaustin
Contributor

For us, the image/figure swap plus perhaps a howto on filtering myst markdown would be about all we would need to get good-enough jupyter notebooks. If you did get markdown-it-py into jupyter as an extension, we would definitely use that on our large first year courses that are running on jupyterhub in the cloud. If it was possible to install a single jupyterlab extension via a conda environment.yml file then I don't see any problem using the extension in smaller classes where the students are using their own laptops.

@chrisjsewell
Member

chrisjsewell commented May 25, 2020

Ah, ok. But then I still can't do cross-referencing, right?

No, that would be non-trivial, so I don't think it would be a short/medium term goal

Have a jupyter notebook (either as source or as export) that renders figures and notes reasonably well

I think this is a reasonable short-medium term goal

and doesn't scare students / readers with weird syntax.

Well that depends on how much of the "sphinx" functionality you want to use. Essentially roles and directives are the primitives of the MyST "language", then any other syntax are alternatives to these; to improve usability/readability. Naturally it would be unfeasible to provide an alternative syntax for every possible role and directive, but we can look to provide them for the most widely used ones.

@amueller

@phaustin can you elaborate on

For us, the image/figure swap plus perhaps a howto on filtering myst markdown would be about all we would need to get good-enough jupyter notebooks.

I'm not sure I understand what you mean. I thought you already had custom processing to do that?

@phaustin
Contributor

yes, but I'd be happy to exchange those regular expressions for unambiguous information from the parser. (this is strictly wish-list though, at the moment the only processing we do is to comment/uncomment the markdown/html image versions in a figure cell).

@lesteve

lesteve commented Dec 18, 2020

I put together a POC script to try option 1. from #148 (comment) "Find ways to inject raw HTML into generated notebooks".

Our main use case is for admonitions : we want to keep using admonitions in JupyterBook and we want them to look decent in Jupyter notebook interfaces. The reason is that people follow the notebooks along when we give the course.

The way it looks can be seen here:
INRIA/scikit-learn-mooc#152 (comment)

The script doing the conversion from py:percent notebooks using MyST admonitions to ipynb files with rendered HTML admonitions is here:
https://github.com/INRIA/scikit-learn-mooc/blob/master/build_tools/convert-python-script-to-notebook.py

There is probably a lot of room for improvements, so suggestions more than welcome! I am guessing that there are some limitations too, for example nesting admonitions is probably not going to work.

The basic idea behind it:

  • use the default markdown-it-py parser to find the lines of the admonitions
  • generate HTML for the admonitions (going to docutils and then HTML). Currently we style the HTML directly, e.g. with bootstrap CSS classes (they seem to work in both Jupyter and JupyterLab). My understanding is that there is no easy way to add CSS to a Jupyter notebook other than adding a code cell with HTML("<style>put_your_css_here</style>") or using custom.css.
  • replace the source of the admonition by its generated HTML in each notebook cell

It feels like I am doing MyST-markdown to CommonMark conversion, so what would a cleaner strategy look like? Would writing a CommonMarkRenderer class make any sense?
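
The three steps in the list above can be sketched roughly as follows. This is not the linked script: it swaps in a regular expression for the markdown-it-py parse and builds the HTML by hand instead of going through docutils. The function name `admonitions_to_html` and the mapping from admonition kind to bootstrap alert class are assumptions made for illustration. As with the original, nested admonitions are not handled.

```python
import html
import re

# Matches a backtick-fenced MyST admonition such as ```{note} ... ```
ADMONITION_RE = re.compile(
    r"^```\{(?P<kind>note|tip|warning|caution|important)\}[ \t]*\n"
    r"(?P<body>.*?)^```[ \t]*$",
    re.DOTALL | re.MULTILINE,
)

# Assumed mapping from admonition kind to a bootstrap alert class.
ALERT_CLASS = {
    "note": "alert-info",
    "tip": "alert-success",
    "important": "alert-info",
    "warning": "alert-warning",
    "caution": "alert-danger",
}


def admonitions_to_html(source):
    """Replace each MyST admonition in a Markdown cell source with an HTML div."""

    def render(match):
        kind = match.group("kind")
        # Escape <, > and & so the body cannot break the surrounding HTML.
        body = html.escape(match.group("body").strip(), quote=False)
        return (
            f'<div class="admonition {kind} alert {ALERT_CLASS[kind]}">\n'
            f'<p class="title" style="font-weight: bold;">{kind.capitalize()}</p>\n'
            f"{body}\n"
            "</div>"
        )

    return ADMONITION_RE.sub(render, source)


cell_source = "Some text.\n\n```{note}\nRemember to set the seed.\n```\n"
print(admonitions_to_html(cell_source))
```

Applying the function to the `source` of every Markdown cell corresponds to the third step, replacing each admonition with its generated HTML.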

@choldgraf
Member Author

I believe that @mmcky and @AakashGfude are working on a MyST->ipynb converter that outputs commonmark markdown: https://github.com/QuantEcon/sphinx-tojupyter

perhaps that'd be useful?

medium-long term I am very hopeful we can get some support for MyST markdown (some of it anyway) inside of Jupyter interfaces (e.g. via work that @rowanc1 is doing or building off of the JupyterLab markdown-it extension that @agoose77 has worked on)

@lesteve

lesteve commented Dec 18, 2020

Nice, thanks a lot for the pointers, I'll try to take a look at them!

@chrisjsewell
Member

chrisjsewell commented Dec 18, 2020

Meh, I think this feels a little bit like "going round the houses".
You could just have myst-parser identify HTML admonition, the same way it does for HTML images: https://github.com/executablebooks/MyST-Parser/blob/master/myst_parser/parse_html.py

@chrisjsewell
Member

chrisjsewell commented Jan 16, 2021

Our main use case is for admonitions : we want to keep using admonitions in JupyterBook and we want them to look decent in Jupyter notebook interfaces.

MyST-Parser now has an extension to read HTML admonitions: executablebooks/MyST-Parser#288 (https://myst-parser.readthedocs.io/en/latest/using/syntax-optional.html#html-admonitions)

@lesteve

lesteve commented Jan 18, 2021

Thanks, I may be missing something, but I don't really see how this helps having admonition looking decent in Jupyter notebook interfaces 🤔.

I tried using a HTML admonition with the development Myst-Parser.

<div class="admonition note" name="html-admonition">
<p class="title">This is the **title**</p>
HTML admonition
</div>

The generated HTML does look good:
image

but this is how it looks in the classic Jupyter notebook interface:
image

To give an idea what my current conversion script does (#148 (comment))

JupyterBook

https://inria.github.io/scikit-learn-mooc/python_scripts/02_numerical_pipeline_hands_on.html

image

Notebook

https://nbviewer.jupyter.org/github/inria/scikit-learn-mooc/blob/master/notebooks/02_numerical_pipeline_hands_on.ipynb
image

@chrisjsewell
Member

chrisjsewell commented Jan 18, 2021

don't really see how this helps having admonition looking decent in Jupyter notebook interfaces

You can easily just add extra classes and/or inline styles:

<div class="admonition tip alert alert-warning">
<p class="title" style="font-weight: bold;">Tip</p>
parameter allows to get a deterministic results even if we
use some random process (i.e. data shuffling).
</div>

in jupyter lab:

image

<div class="admonition" style="background: lightgreen; padding: 10px">
<p class="title" style="; padding: 10px; font-weight: bold; border-color: green; border-style: solid">Tip</p>
parameter allows to get a deterministic results even if we
use some random process (i.e. data shuffling).
</div>

image

@lesteve

lesteve commented Jan 18, 2021

Ah good point thanks!

@chrisjsewell
Member

chrisjsewell commented Jan 18, 2021

I guess inline styles are probably the best way to go, as they are deterministic (i.e. don't depend on the available CSS). Then, when it is converted in sphinx, the style attribute will just be "thrown away", and it will be styled consistently with the sphinx theme you are using

@lesteve

lesteve commented Jan 18, 2021

I guess a limitation is that if you use markdown inside the HTML admonition, it will not render very nicely in Jupyter notebook interfaces.

<div class="admonition alert alert-warning">
<p class="title" style="font-weight: bold;">Tip</p>
`random_state` is **very important**
</div>

image

All in all, personally now that I have my hacky .py -> .ipynb conversion script with simple admonition support, I think I will stick to it (maybe sunk cost fallacy 😉). The main advantages are:

  • have the comfort of staying in simple markdown i.e. not having to write HTML
  • one single place (in the script) where the admonition HTML style is defined. I could easily see how the hand-written style in the HTML admonitions would be copied and pasted many times and possibly diverge with time

The main disadvantage would be that it is a stand-alone hacky script and that its longer-term maintenance is less than clear.

For others though, HTML admonition may be exactly what they need.

@agoose77
Collaborator

@lesteve you can partially mitigate this by adding a newline above the Markdown:

image
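
The screenshot is not preserved here, so the following is a sketch of the blank-line trick as a small helper. In CommonMark, an HTML block opened by `<div>` ends at the first blank line, so text after that blank line is parsed as Markdown again. The helper name `html_admonition` and its default style are hypothetical, reusing the inline styles shown earlier in the thread.

```python
def html_admonition(title, body_markdown, style="background: lightgreen; padding: 10px"):
    """Build a Markdown-cell source for an inline-styled HTML admonition."""
    return (
        f'<div class="admonition" style="{style}">\n'
        f'<p class="title" style="font-weight: bold;">{title}</p>\n'
        "\n"  # blank line: ends the HTML block so the body is parsed as Markdown
        f"{body_markdown}\n"
        "</div>"
    )


print(html_admonition("Tip", "`random_state` is **very important**"))
```

Keeping the style string in one helper also addresses the concern about hand-written styles being copied around and diverging over time.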

@lesteve

lesteve commented Jan 18, 2021

Ah nice I did not think of trying that, thanks!

@mgeier

mgeier commented Jan 22, 2021

FYI, nbsphinx parses <div> elements with alert-info and alert-warning: see https://nbsphinx.readthedocs.io/en/0.8.1/markdown-cells.html#Info/Warning-Boxes.
This even works with LaTeX/PDF output.

A newline should still be used before the content, as mentioned above (and as mentioned in the nbsphinx docs).

There are still problems with nbconvert, though: jupyter/nbconvert#1125

And there is some room for improvement regarding the CSS that's used in JupyterLab and the Classic Notebook.


7 participants