Superscripts turn into non superscript #33

Closed
abrambailey opened this issue Apr 12, 2022 · 3 comments · Fixed by #34
Labels
bug Something isn't working

Comments

@abrambailey

Expected behavior: paste a superscript, get a superscript. Instead, ² turns into 2 in the results.

@Mr0grog
Owner

Mr0grog commented Apr 13, 2022

Good catch! Just tried this out and I see what you mean.

I’m not sure I’ll get to this before the weekend, but will definitely get this fixed. In the meantime, pull requests are also welcome. :)


Looks like this is happening because Google Docs renders HTML like:

<span>Test document</span><span><span style="font-size:0.6em;vertical-align:super;">2</span></span>

^ where the “2” is superscripted. The key part is vertical-align: super; in the style attribute. Ditto for subscripts but with vertical-align:sub;.

The best fix here is probably to expand unInlineStyles() or add a similar transform that converts the span above into a sup or sub element as appropriate:

/**
 * Google Docs does italics/bolds/etc on <span>s with style attributes, but
 * rehype-remark doesn't pick up on those well. Instead, transform them into
 * `em`, `strong`, etc. elements.
 *
 * @param {RehypeNode} node Fix the tree below this node
 */
export function unInlineStyles (node) {
  visit(node, isStyled, (node, index, parent) => {
    const style = node.properties.style;
    if (/font-style:\s*italic/.test(style)) {
      wrapChildren(node, hast('em'));
    }
    if (/font-weight:\s*(bold|700)/.test(style)) {
      wrapChildren(node, hast('strong'));
    }
  });
}

There may also need to be some special handling of those elements in the rehype2remark steps, too; not sure.
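
As a rough illustration of the transform suggested above, here is a minimal sketch that wraps the children of any element styled with vertical-align: super or vertical-align: sub in a sup or sub element. It is not the project's actual fix: the names scriptTagForStyle and unInlineScripts are hypothetical, and it walks plain hast-style objects by hand rather than relying on the project's visit(), isStyled, wrapChildren(), and hast() helpers, so it can run on its own.

```javascript
// Hypothetical sketch, not the project's real code. Nodes are assumed to be
// plain hast-style objects: { type, tagName, properties, children }.

/** Return 'sup', 'sub', or null based on `vertical-align` in an inline style. */
function scriptTagForStyle (style) {
  const match = /vertical-align:\s*(super|sub)\b/.exec(style || '');
  if (!match) return null;
  return match[1] === 'super' ? 'sup' : 'sub';
}

/**
 * Walk the tree below `node` and wrap the children of every element styled
 * with `vertical-align: super` or `vertical-align: sub` in `sup` or `sub`.
 */
function unInlineScripts (node) {
  if (!node.children) return node;

  // Handle descendants first, so newly created wrappers are never revisited.
  for (const child of node.children) {
    unInlineScripts(child);
  }

  if (node.type === 'element') {
    const style = node.properties && node.properties.style;
    const tagName = scriptTagForStyle(typeof style === 'string' ? style : '');
    if (tagName) {
      node.children = [
        { type: 'element', tagName, properties: {}, children: node.children }
      ];
    }
  }
  return node;
}

// Usage: the span structure from the Google Docs HTML quoted earlier.
const exampleTree = {
  type: 'element',
  tagName: 'span',
  properties: {},
  children: [{
    type: 'element',
    tagName: 'span',
    properties: { style: 'font-size:0.6em;vertical-align:super;' },
    children: [{ type: 'text', value: '2' }]
  }]
};
unInlineScripts(exampleTree);
// The styled span now contains a single `sup` element holding the "2" text.
```

The styled span itself is left in place, mirroring how unInlineStyles() wraps children instead of replacing the node. Whether rehype-remark then preserves sup and sub in the Markdown output is a separate question this sketch does not address.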

Mr0grog added the bug label Apr 13, 2022
Mr0grog added a commit that referenced this issue Apr 18, 2022
You can have superscript and subscript text in Google Docs, but this app would previously remove the formatting. We now keep it in place, even if it outputs unfortunately verbose markup: most Markdown flavors have no markup for superscript or subscript, so we output HTML tags, e.g.:

    This<sub>is subscripted</sub> and this<sup>is superscripted</sup> text.

This also updates most of our Rehype and Unified dependencies (they offer some new utilities and fixes that make this feature easier to implement). I did *not* update Remark here, though, since it has some major changes in v8 that need more careful review (really, I should have some tests).

Fixes #33.
@Mr0grog
Owner

Mr0grog commented Apr 18, 2022

@abrambailey this should be fixed now! If you are using https://mr0grog.github.io/google-docs-to-markdown/ things should just work, and if you are using a web3-style link, you should switch to https://bafybeihpbbnpk5f2cze5osp4su5hg52jplz7njwgl6nbqlqinlhwbu4imm.ipfs.dweb.link/ :)

@abrambailey
Author

Amazing. Thank you!!!
