Add absolute links support #1802

Open
wants to merge 5 commits into
base: master
2 changes: 2 additions & 0 deletions guide/src/format/configuration/renderers.md
@@ -109,6 +109,7 @@ edit-url-template = "https://github.com/rust-lang/mdBook/edit/master/guide/{path
site-url = "/example-book/"
cname = "myproject.rs"
input-404 = "not-found.md"
+use-site-url-as-root = false
```

The following configuration options are available:
@@ -159,6 +160,7 @@ The following configuration options are available:
navigation links and script/css imports in the 404 file work correctly, even when accessing
urls in subdirectories. Defaults to `/`. If `site-url` is set,
make sure to use document relative links for your assets, meaning they should not start with `/`.
+- **use-site-url-as-root:** Prepend the `site_url` in links with absolute path.
Contributor

Can you also add this to the TOML summary up above?

Author

Yes, done.

Contributor

> links with absolute path

As a user, I would wonder: "which links are those?" They could be:

  • links generated by templates, such as those referencing style sheets and JavaScript code
  • links generated between pages of the book
  • links I insert using Markdown syntax
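
Going by the diff in this PR, the rewrite happens in `adjust_links`, which processes Markdown link and image events plus raw HTML, and for normal page rendering (`path` is `None`) only destinations starting with `/` get the prefix. A rough sketch of that classification, using hypothetical names (`LinkKind`, `classify`) that are not part of mdBook:

```rust
/// Hypothetical classification of a link destination (not mdBook API).
#[derive(Debug, PartialEq)]
pub enum LinkKind {
    /// `#anchor`: fragment-only link.
    Fragment,
    /// `https://...`, `mailto:...`: has a URL scheme, left untouched.
    External,
    /// `/testing`: the only kind prefixed with the site URL when `path` is `None`.
    RootAbsolute,
    /// `../testing`, `bar.md`: document-relative.
    Relative,
}

/// Hand-rolled equivalent of the `^[a-z][a-z0-9+.-]*:` pattern in the diff.
fn has_scheme(dest: &str) -> bool {
    let mut chars = dest.chars();
    match chars.next() {
        Some(c) if c.is_ascii_lowercase() => {}
        _ => return false,
    }
    for c in chars {
        match c {
            ':' => return true,
            'a'..='z' | '0'..='9' | '+' | '.' | '-' => continue,
            _ => return false,
        }
    }
    false
}

/// Decide which bucket a destination falls into.
pub fn classify(dest: &str) -> LinkKind {
    if dest.starts_with('#') {
        LinkKind::Fragment
    } else if has_scheme(dest) {
        LinkKind::External
    } else if dest.starts_with('/') {
        LinkKind::RootAbsolute
    } else {
        LinkKind::Relative
    }
}
```

Under this reading, `classify("/testing")` is the only case in the PR's tests that would pick up the site URL on a normal page; template-generated links are not visible in this diff, so the sketch says nothing about them.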

- **cname:** The DNS subdomain or apex domain at which your book will be hosted.
This string will be written to a file named CNAME in the root of your site, as
required by GitHub Pages (see [*Managing a custom domain for your GitHub Pages
3 changes: 3 additions & 0 deletions src/config.rs
@@ -516,6 +516,8 @@ pub struct HtmlConfig {
pub input_404: Option<String>,
/// Absolute url to site, used to emit correct paths for the 404 page, which might be accessed in a deeply nested directory
pub site_url: Option<String>,
+/// Prepend the `site_url` in links with absolute path.
+pub use_site_url_as_root: bool,
Comment on lines +519 to +520
Contributor

Hm, so technically adding this is a breaking change since HtmlConfig is in the public API. However, we have been adding new fields to this struct for a while now, and nobody has complained. That is something we should definitely fix in the future, but for now I guess we can let it slide. 😦
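
To illustrate the concern: a downstream crate that builds a public struct with a full struct literal stops compiling as soon as a field is added. One common guard is `#[non_exhaustive]`. The sketch below uses a hypothetical `ExampleConfig`, not mdBook's actual `HtmlConfig`, and makes no claim about how mdBook will address this.

```rust
/// Hypothetical stand-in for a public config struct (not mdBook's `HtmlConfig`).
///
/// `#[non_exhaustive]` stops *other crates* from using struct-literal syntax,
/// so adding a field later no longer breaks them.
#[derive(Debug, Clone, Default, PartialEq)]
#[non_exhaustive]
pub struct ExampleConfig {
    pub site_url: Option<String>,
    pub use_site_url_as_root: bool,
}

/// How a downstream caller would build the config: start from the defaults
/// and assign only the fields it cares about.
pub fn example_config_for(site_url: &str) -> ExampleConfig {
    let mut cfg = ExampleConfig::default();
    cfg.site_url = Some(site_url.to_string());
    cfg.use_site_url_as_root = true;
    cfg
}
```

The trade-off is that applying the attribute to an existing struct is itself a breaking change for anyone already using literals, so it would have to land in a semver-major release.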

/// The DNS subdomain or apex domain at which your book will be hosted. This
/// string will be written to a file named CNAME in the root of your site,
/// as required by GitHub Pages (see [*Managing a custom domain for your
@@ -562,6 +564,7 @@ impl Default for HtmlConfig {
edit_url_template: None,
input_404: None,
site_url: None,
+use_site_url_as_root: false,
cname: None,
live_reload_endpoint: None,
redirect: HashMap::new(),
11 changes: 10 additions & 1 deletion src/renderer/html_handlebars/hbs_renderer.rs
@@ -55,7 +55,16 @@ impl HtmlHandlebars {
}

let content = ch.content.clone();
-let content = utils::render_markdown(&content, ctx.html_config.curly_quotes);
+let content = if ctx.html_config.use_site_url_as_root {
+utils::render_markdown_with_abs_path(
+&content,
+ctx.html_config.curly_quotes,
+None,
+ctx.html_config.site_url.as_ref(),
+)
+} else {
+utils::render_markdown(&content, ctx.html_config.curly_quotes)
+};

let fixed_content =
utils::render_markdown_with_path(&ch.content, ctx.html_config.curly_quotes, Some(path));
125 changes: 115 additions & 10 deletions src/utils/mod.rs
@@ -92,12 +92,12 @@ pub fn unique_id_from_content(content: &str, id_counter: &mut HashMap<String, us
/// page go to the original location. Normal page rendering sets `path` to
/// None. Ideally, print page links would link to anchors on the print page,
/// but that is very difficult.
-fn adjust_links<'a>(event: Event<'a>, path: Option<&Path>) -> Event<'a> {
+fn adjust_links<'a>(event: Event<'a>, path: Option<&Path>, abs_url: Option<&String>) -> Event<'a> {
static SCHEME_LINK: Lazy<Regex> = Lazy::new(|| Regex::new(r"^[a-z][a-z0-9+.-]*:").unwrap());
static MD_LINK: Lazy<Regex> =
Lazy::new(|| Regex::new(r"(?P<link>.*)\.md(?P<anchor>#.*)?").unwrap());

-fn fix<'a>(dest: CowStr<'a>, path: Option<&Path>) -> CowStr<'a> {
+fn fix<'a>(dest: CowStr<'a>, path: Option<&Path>, abs_url: Option<&String>) -> CowStr<'a> {
if dest.starts_with('#') {
// Fragment-only link.
if let Some(path) = path {
@@ -126,20 +126,32 @@ fn adjust_links<'a>(event: Event<'a>, path: Option<&Path>) -> Event<'a> {
}

if let Some(caps) = MD_LINK.captures(&dest) {
-fixed_link.push_str(&caps["link"]);
+fixed_link.push_str(&caps["link"].trim_start_matches('/'));
fixed_link.push_str(".html");
if let Some(anchor) = caps.name("anchor") {
fixed_link.push_str(anchor.as_str());
}
+} else if !fixed_link.is_empty() {
+// prevent links with double slashes
+fixed_link.push_str(&dest.trim_start_matches('/'));
} else {
fixed_link.push_str(&dest);
};
-return CowStr::from(fixed_link);
+if dest.starts_with('/') || path.is_some() {
+if let Some(abs_url) = abs_url {
+fixed_link = format!(
+"{}/{}",
+abs_url.trim_end_matches('/'),
+&fixed_link.trim_start_matches('/')
+);
+}
+}
+return CowStr::from(format!("{}", fixed_link));
Comment on lines +129 to +149
Author

I had to change this function because links ended up with leading double slashes when both `path` and `abs_url` are set. I've added tests to make sure the change only affects those cases.
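
The slash handling described in this comment boils down to trimming one side of each piece before joining them. A minimal sketch, assuming a hypothetical helper `join_root` (the PR does this inline inside `fix`, not through a helper):

```rust
/// Hypothetical helper: join a site root and a link with exactly one `/`
/// between them, whatever slashes each side already carries.
pub fn join_root(root: &str, link: &str) -> String {
    format!(
        "{}/{}",
        root.trim_end_matches('/'),   // "/example-book/" -> "/example-book"
        link.trim_start_matches('/')  // "/foo/bar.html"  -> "foo/bar.html"
    )
}
```

For instance, `join_root("/example-book/", "/foo/bar.html")` yields `/example-book/foo/bar.html` rather than a path containing `//`.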

}
dest
}

-fn fix_html<'a>(html: CowStr<'a>, path: Option<&Path>) -> CowStr<'a> {
+fn fix_html<'a>(html: CowStr<'a>, path: Option<&Path>, abs_url: Option<&String>) -> CowStr<'a> {
// This is a terrible hack, but should be reasonably reliable. Nobody
// should ever parse a tag with a regex. However, there isn't anything
// in Rust that I know of that is suitable for handling partial html
@@ -153,7 +165,7 @@ fn adjust_links<'a>(event: Event<'a>, path: Option<&Path>) -> Event<'a> {

HTML_LINK
.replace_all(&html, |caps: &regex::Captures<'_>| {
-let fixed = fix(caps[2].into(), path);
+let fixed = fix(caps[2].into(), path, abs_url);
format!("{}{}\"", &caps[1], fixed)
})
.into_owned()
@@ -162,12 +174,12 @@ fn adjust_links<'a>(event: Event<'a>, path: Option<&Path>) -> Event<'a> {

match event {
Event::Start(Tag::Link(link_type, dest, title)) => {
-Event::Start(Tag::Link(link_type, fix(dest, path), title))
+Event::Start(Tag::Link(link_type, fix(dest, path, abs_url), title))
}
Event::Start(Tag::Image(link_type, dest, title)) => {
-Event::Start(Tag::Image(link_type, fix(dest, path), title))
+Event::Start(Tag::Image(link_type, fix(dest, path, abs_url), title))
}
-Event::Html(html) => Event::Html(fix_html(html, path)),
+Event::Html(html) => Event::Html(fix_html(html, path, abs_url)),
_ => event,
}
}
@@ -190,11 +202,22 @@ pub fn new_cmark_parser(text: &str, curly_quotes: bool) -> Parser<'_, '_> {
}

pub fn render_markdown_with_path(text: &str, curly_quotes: bool, path: Option<&Path>) -> String {
+render_markdown_with_abs_path(text, curly_quotes, path, None)
+}
+
+pub fn render_markdown_with_abs_path(
+text: &str,
+curly_quotes: bool,
+path: Option<&Path>,
+abs_url: Option<&String>,
Contributor

The `&String` type should normally be avoided; `&str` is more flexible for callers, since they can pass both string literals (which would simplify the tests below) and borrowed owned strings.
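
A small sketch of that suggestion, using a hypothetical helper `normalized_root` rather than the PR's actual function: taking `Option<&str>` lets callers pass a literal directly, while an `Option<String>` field converts with `as_deref()`.

```rust
/// Hypothetical helper (not part of the PR): returns the site root without a
/// trailing slash, or an empty string when no root is configured.
///
/// Accepting `Option<&str>` instead of `Option<&String>` means callers do not
/// need to allocate a `String` just to call this.
pub fn normalized_root(abs_url: Option<&str>) -> &str {
    abs_url.map(|u| u.trim_end_matches('/')).unwrap_or("")
}

/// Caller holding an owned `Option<String>`, as a config struct field would be.
pub fn root_from_config(site_url: &Option<String>) -> &str {
    // `as_deref` turns `&Option<String>` into `Option<&str>` without cloning.
    normalized_root(site_url.as_deref())
}
```

With this signature a test can write `normalized_root(Some("ABS_PATH"))` instead of `Some(&"ABS_PATH".to_string())`.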

+) -> String {
+// This function should be merged with `render_markdown_with_path`
+// in the future. Currently, it is used not to break compatibility.
let mut s = String::with_capacity(text.len() * 3 / 2);
let p = new_cmark_parser(text, curly_quotes);
let events = p
.map(clean_codeblock_headers)
-.map(|event| adjust_links(event, path))
+.map(|event| adjust_links(event, path, abs_url))
.flat_map(|event| {
let (a, b) = wrap_tables(event);
a.into_iter().chain(b)
@@ -399,6 +422,88 @@ more text with spaces
}
}

+mod render_markdown_with_abs_path {
+use super::super::render_markdown_with_abs_path;
+use std::path::Path;
+
+#[test]
+fn preserves_external_links() {
+assert_eq!(
+render_markdown_with_abs_path(
+"[example](https://www.rust-lang.org/)",
+false,
+None,
+Some(&"ABS_PATH".to_string())
+),
+"<p><a href=\"https://www.rust-lang.org/\">example</a></p>\n"
+);
+}
+
+#[test]
+fn replace_root_links() {
+assert_eq!(
+render_markdown_with_abs_path(
+"[example](/testing)",
+false,
+None,
+Some(&"ABS_PATH".to_string())
+),
+"<p><a href=\"ABS_PATH/testing\">example</a></p>\n"
+);
+}
+
+#[test]
+fn replace_root_links_using_path() {
+assert_eq!(
+render_markdown_with_abs_path(
+"[example](bar.md)",
+false,
+Some(Path::new("foo/chapter.md")),
+Some(&"ABS_PATH".to_string())
+),
+"<p><a href=\"ABS_PATH/foo/bar.html\">example</a></p>\n"
+);
+assert_eq!(
+render_markdown_with_abs_path(
+"[example](/bar.md)",
+false,
+Some(Path::new("foo/chapter.md")),
+Some(&"ABS_PATH".to_string())
+),
+"<p><a href=\"ABS_PATH/foo/bar.html\">example</a></p>\n"
+);
+assert_eq!(
+render_markdown_with_abs_path(
+"[example](/bar.html)",
+false,
+Some(Path::new("foo/chapter.md")),
+None
+),
+"<p><a href=\"foo/bar.html\">example</a></p>\n"
+);
+}
+
+#[test]
+fn preserves_relative_links() {
+assert_eq!(
+render_markdown_with_abs_path(
+"[example](../testing)",
+false,
+None,
+Some(&"ABS_PATH".to_string())
+),
+"<p><a href=\"../testing\">example</a></p>\n"
+);
+}
+
+#[test]
+fn preserves_root_links() {
+assert_eq!(
+render_markdown_with_abs_path("[example](/testing)", false, None, None),
+"<p><a href=\"/testing\">example</a></p>\n"
+);
+}
+}
#[allow(deprecated)]
mod id_from_content {
use super::super::id_from_content;