Bookmarks are not working with JSoup #408
Comments
Hi @Milchreis, this is caused by the HTML5 parsing rules. If you call doc.outerHtml(), you will get the following representation:
<html>
<head>
<style>
div {
page-break-after: always;
}
#toc {
width: 100%;
border-collapse: collapse;
}
#toc .page-number::after {
/* SPECIAL STUFF HERE! */
content: target-counter(attr(href), page);
width: 30px;
}
</style>
</head>
<body>
<bookmarks>
<bookmark name="Title of element on page 1" href="#page-1" />
<bookmark name="Title of element on page 2" href="#page-2" />
<bookmark name="Title of element on page 3" href="#page-3" />
<bookmark name="Title of element on page 4" href="#page-4" />
</bookmarks>
<h1>Bookmarks and TOC example</h1>
<h2>TOC</h2>
<table id="toc">
<tbody>
<tr>
<td><a href="#page-1">Title of element on page</a></td>
<td class="page-number" href="#page-1"></td>
</tr>
<tr>
<td><a href="#page-2">Title of element on page</a></td>
<td class="page-number" href="#page-2"></td>
</tr>
<tr>
<td><a href="#page-3">Title of element on page</a></td>
<td class="page-number" href="#page-3"></td>
</tr>
<tr>
<td><a href="#page-4">Title of element on page</a></td>
<td class="page-number" href="#page-4"></td>
</tr>
</tbody>
</table>
<div id="page-1">
Page 1
</div>
<div id="page-2">
Page 2
</div>
<div id="page-3">
Page 3
</div>
<div id="page-4">
Page 4
</div>
</body>
</html>
Notice how the bookmarks have been moved from the head to the body. The code, however, only looks for bookmarks in the head.
Note: I've tried with my HTML5 parser (https://github.com/digitalfondue/jfiveparse) and the output is a little different (note how the self-closing "bookmark" elements are interpreted):
<html><head>
<style>
div {
page-break-after: always;
}
#toc {
width: 100%;
border-collapse: collapse;
}
#toc .page-number::after {
/* SPECIAL STUFF HERE! */
content: target-counter(attr(href), page);
width: 30px;
}
</style>
</head><body><bookmarks>
<bookmark name="Title of element on page 1" href="#page-1">
<bookmark name="Title of element on page 2" href="#page-2">
<bookmark name="Title of element on page 3" href="#page-3">
<bookmark name="Title of element on page 4" href="#page-4">
</bookmark></bookmark></bookmark></bookmark></bookmarks>
<h1>Bookmarks and TOC example</h1>
<h2>TOC</h2>
<table id="toc">
<tbody><tr><td><a href="#page-1">Title of element on page</a></td><td class="page-number" href="#page-1"></td></tr>
<tr><td><a href="#page-2">Title of element on page</a></td><td class="page-number" href="#page-2"></td></tr>
<tr><td><a href="#page-3">Title of element on page</a></td><td class="page-number" href="#page-3"></td></tr>
<tr><td><a href="#page-4">Title of element on page</a></td><td class="page-number" href="#page-4"></td></tr>
</tbody></table>
<div id="page-1">Page 1</div>
<div id="page-2">Page 2</div>
<div id="page-3">Page 3</div>
<div id="page-4">Page 4</div>
</body></html>
This is arguably even more correct, as Chrome interprets the HTML the same way. I think I can provide a PR for that. @danfickle, what do you think?
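Until a fix lands, one possible stopgap is to move the `<bookmarks>` element back into the head on the W3C DOM before it reaches the renderer. The following is only a minimal sketch under that assumption, using JDK DOM classes alone; `BookmarkRelocator`, `parse` and `moveBookmarksToHead` are hypothetical names for illustration and are not part of any library mentioned in this thread. It does not deal with the nested interpretation of self-closing `<bookmark>` elements shown above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library discussed in this issue.
public class BookmarkRelocator {

    /** Parses well-formed XHTML into a W3C DOM (a stand-in for the HTML parser's output). */
    static Document parse(String xhtml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xhtml)));
    }

    /** Moves every <bookmarks> element found under <body> into <head>; returns how many were moved. */
    static int moveBookmarksToHead(Document doc) {
        Element head = firstElement(doc, "head");
        Element body = firstElement(doc, "body");
        if (head == null || body == null) {
            return 0;
        }

        // Take a snapshot first: the NodeList is live and shrinks as nodes are moved.
        NodeList live = body.getElementsByTagName("bookmarks");
        List<Node> found = new ArrayList<>();
        for (int i = 0; i < live.getLength(); i++) {
            found.add(live.item(i));
        }

        // appendChild detaches the node from its old parent before re-attaching it.
        for (Node node : found) {
            head.appendChild(node);
        }
        return found.size();
    }

    private static Element firstElement(Document doc, String tag) {
        NodeList list = doc.getElementsByTagName(tag);
        return list.getLength() > 0 ? (Element) list.item(0) : null;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(
            "<html><head></head><body>"
            + "<bookmarks><bookmark name=\"Page 1\" href=\"#page-1\"/></bookmarks>"
            + "<div id=\"page-1\">Page 1</div>"
            + "</body></html>");
        System.out.println("moved: " + moveBookmarksToHead(doc));
    }
}
```

The snapshot step matters because `getElementsByTagName` returns a live list; iterating it directly while moving nodes would skip elements.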
Thank you guys. Waiting for the next release 😊
If you generate a PDF from HTML parsed with JSoup, the bookmarks are not included in the output and no error message is thrown.
Example