Opt-in Interface for Preprocessors to Preprocess SUMMARY.md #2466

ssanderson · 2024-11-03T21:12:10Z

Problem

The current interface for mdbook preprocessors is that the preprocessor accepts a serialized (PreprocessorContext, Book) pair over stdin, and it returns a new Book by writing to stdout. In order to produce the context/book pair, the main mdbook process needs to read and process SUMMARY.md, which means that preprocessors can't easily be applied to SUMMARY.md, which limits some potentially interesting use-cases for preprocessors.

As a concrete example, in mdbook-minijinja I've written a preprocessor that evaluates book contents as minijinja templates. One of the use-cases I'm interested in for this preprocessor is the ability to conditionally enable or disable certain chapters based on runtime information (e.g., to render different versions of my book for different audiences).

I'm currently able to support a limited amount of SUMMARY.md preprocessing by loading the summary myself, evaluating it as a template, and then re-parsing it using parse_summary and doing a crude diff between the resulting summary and what I get over stdin. This works surprisingly well for simple use-cases like conditionally-enabling and disabling chapters, but it's brittle and doesn't match the mental model the user would likely want to have (for example, this method doesn't work if I want template evaluation to add new chapters that aren't statically in the SUMMARY.md source).

Proposed Solution

I think the ideal solution for my use-case would be for there to be an interface by which my preprocessor could opt in to being invoked on SUMMARY.md prior to the summary being used to construct the rest of the book content. This would essentially add a new "phase" to the mdbook loading scheme. Currently, my rough mental model of how mdbook works is:

mdbook starts
mdbook loads configuration from book.toml, including information about configured preprocessors and renderers.
mdbook loads SUMMARY.md
mdbook parses SUMMARY.md
mdbook loads book content based on parsed summary.md.
mdbook feeds book content through all preprocessors.
mdbook passes fully preprocessed book content to renderers.

What I would propose would be to introduce a new step between (3) and (4), in which mdbook would feed the loaded-but-not-yet-parsed content of SUMMARY.md to a new subcommand preprocess-summary command of the preprocessor (similar to the existing supports subcommand today).

Notes

I suggested above that we add a new preprocess-summary subcommand. For backwards-compatibility reasons, this command would probably want to be opt-in: the existing templates for preprocessors seem like they generally check if they're being invoked via the supports subcommand, and otherwise assume they're being invoked for the main preprocessing command. Invoking a new preprocess-summary subcommand would likely break existing preprocessors. I think a straightforward workaround for this would be to make participation in the new method opt-in via a flag in the preprocessor's book.toml section, e.g.:

[preprocessor.minijinja]
preprocess-summary = true

mdbook would only invoke the preprocess-summary command for preprocessors that opt-in by instructing their users to set this flag. You could also spell this as something like a top-level summary-preprocessors list or something like that.

The text was updated successfully, but these errors were encountered:

ssanderson · 2024-11-03T21:12:59Z

I'd be happy to work on a more concrete proposal/interface for this feature if there's interest in this general direction.

ehuss · 2024-11-04T14:33:18Z

Sorry, I'm not quite following the request here. Preprocessors can add/remove chapters. The structure of SUMMARY.md is essentially the structure of the Book. Is it possible for your preprocessor to use that information to add/remove chapters?

ssanderson · 2024-11-04T15:33:48Z

The essential problem is the same as what's outlined by @Michael-F-Bryan in this comment: #760 (comment)

This probably isn't possible. Preprocessors are only run on the chapters of a book once they've been loaded into memory, by which time the SUMMARY.md has already been parsed (it's how we find the chapters to give the preprocessor, after all).

My goal with this proposal is to allow SUMMARY.md itself to be an arbitrary minijinja template. For example, I'd like to be able to write something like:

# My Title

{% if enable_section_1 %}
{% include "section1_summary.md" %}
{% endif %}
...

where section1_summary.md might be a file in a templates directory like:

## Section 1

- [Chapter 1](./section1_chapter1.md).
- [Chapter 2](./section2_chapter2.md).

I don't think there's a good way to do this with the current preprocessor API, because by the time my preprocessor receives any input from mdbook, mdbook has already loaded and parsed the summary, which will likely fail (or at least produce unhelpful results).

The proposal here is to add a step that allows preprocessors to receive the unparsed content of SUMMARY.md, apply transformations to it, and return the result over stdout. mdbook would only attempt to parse SUMMARY.md and load chapter contents after this initial "summary preprocessing" step.

Add a configuration option to reload SUMMARY.md, evaluate it as a template, and re-parse chapters. This allows us to support full arbitrary preprocessing of summary.md, but it means we discard the results of any preprocessors that ran before us. In most scenarios this will probably be an OK tradeoff. The ideal solution still requires something like rust-lang/mdBook#2466.

ssanderson · 2024-11-09T19:31:14Z

I've made some updates to mdbook-minijinja that solves some of the problems here. I've made it so that my preprocessor reloads SUMMARY.md, evaluates it as a template, and then re-parses the summary and reloads the book using MDBook::load_with_config_and_summary.

The main downside of this approach is that reloading the book contents discards the output of any preprocessors that came before mine, so I'm currently making summary preprocessing optional and documenting that people should configure mdbook-minijinja to run as the first preprocessor if they enable summary preprocessing. I expect that this should be good enough for the majority of use-cases I'm aiming to support, but it does seem a bit unfortunate that the workaround breaks composability with other preprocessors.

ssanderson added the C-enhancement Category: Enhancement or feature request label Nov 3, 2024

ssanderson changed the title ~~Preprocessor Support for SUMMARY.md~~ Opt-in Interface for Preprocessors to Preprocess SUMMARY.md Nov 3, 2024

ssanderson mentioned this issue Nov 3, 2024

Must Run as First Preprocessor when Summary Preprocessing is Enabled ssanderson/mdbook-minijinja#1

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Opt-in Interface for Preprocessors to Preprocess SUMMARY.md #2466

Opt-in Interface for Preprocessors to Preprocess SUMMARY.md #2466

ssanderson commented Nov 3, 2024 •

edited

Loading

ssanderson commented Nov 3, 2024

ehuss commented Nov 4, 2024

ssanderson commented Nov 4, 2024 •

edited

Loading

ssanderson commented Nov 9, 2024

Opt-in Interface for Preprocessors to Preprocess SUMMARY.md #2466

Opt-in Interface for Preprocessors to Preprocess SUMMARY.md #2466

Comments

ssanderson commented Nov 3, 2024 • edited Loading

Problem

Proposed Solution

Notes

ssanderson commented Nov 3, 2024

ehuss commented Nov 4, 2024

ssanderson commented Nov 4, 2024 • edited Loading

ssanderson commented Nov 9, 2024

ssanderson commented Nov 3, 2024 •

edited

Loading

ssanderson commented Nov 4, 2024 •

edited

Loading