ReverseMarkdown is an HTML to Markdown (http://daringfireball.net/projects/markdown/syntax) converter library in C#. Conversion is reliable because the HtmlAgilityPack (HAP) library is used to traverse the HTML DOM.
Note that the library implementation is based on the Ruby-based HTML to Markdown converter xijo/reverse_markdown.
You can install the package from NuGet using Install-Package ReverseMarkdown
or clone the repository and build it yourself.
var converter = new ReverseMarkdown.Converter();
string html = "This a sample <strong>paragraph</strong> from <a href=\"http://test.com\">my site</a>";
string result = converter.Convert(html);
// result: This a sample **paragraph** from [my site](http://test.com)
// with config
var config = new ReverseMarkdown.Config
{
    UnknownTags = Config.UnknownTagsOption.PassThrough, // include the unknown tag completely in the result (this is also the default)
    GithubFlavored = true, // generate GitHub flavoured markdown, supported for BR, PRE and table tags
    RemoveComments = true, // ignore all comments
    SmartHrefHandling = true // remove markdown output for links where appropriate
};
var converter = new ReverseMarkdown.Converter(config);
- GithubFlavored - GitHub style markdown for br, pre and table. Default is false
- RemoveComments - Remove comment tags with text. Default is false
- SmartHrefHandling - how to handle the <a> tag href attribute
  - false - Outputs [{name}]({href}{title}) even if name and href are identical. This is the default option.
  - true - If name and href are equal, outputs just the name. Note that if the Uri is not well formed as per Uri.IsWellFormedUriString (i.e. the string is not correctly escaped, like http://example.com/path/file name.docx) then markdown syntax will be used anyway.
    If href contains the http/https protocol and name doesn't, but they are otherwise the same, output href only.
    If href has the tel: or mailto: scheme but is afterwards identical to name, output name only.
- UnknownTags - handle unknown tags.
  - UnknownTagsOption.PassThrough - Include the unknown tag completely in the result. That is, the tag along with the text will be left in the output. This is the default
  - UnknownTagsOption.Drop - Drop the unknown tag and its content
  - UnknownTagsOption.Bypass - Ignore the unknown tag but try to convert its content
  - UnknownTagsOption.Raise - Raise an error to let you know
- WhitelistUriSchemes - Specify which schemes (without trailing colon) are to be allowed for <a> and <img> tags. Others will be bypassed (output text or nothing). By default allows everything.
  If string.Empty is provided and the href or src scheme couldn't be determined, the link is whitelisted.
  The scheme is determined by the Uri class, except when the url begins with / (file scheme) or // (http scheme)
- TableWithoutHeaderRowHandling - handle tables without header rows
  - TableWithoutHeaderRowHandlingOption.Default - First row will be used as header row (default)
  - TableWithoutHeaderRowHandlingOption.EmptyRow - An empty row will be added as the header row
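The SmartHrefHandling rules above can be read as a small decision procedure. The following is a minimal sketch of that reading, not the library's actual implementation: the class SmartHrefSketch and the method RenderLink are hypothetical names, the title part is left out for brevity, and the well-formedness check is applied only to the identical case, since that is where the description mentions it.

```csharp
using System;

// Hypothetical illustration only; not part of the ReverseMarkdown API.
public static class SmartHrefSketch
{
    public static string RenderLink(string name, string href, bool smartHrefHandling)
    {
        // Full markdown syntax, used whenever no shortcut applies.
        string markdown = $"[{name}]({href})";
        if (!smartHrefHandling)
            return markdown;

        // Identical name and href: output just the name, unless the URI
        // is badly escaped, in which case markdown syntax is kept.
        if (name == href)
            return Uri.IsWellFormedUriString(href, UriKind.RelativeOrAbsolute)
                ? name
                : markdown;

        // href carries http/https and name does not, otherwise identical:
        // output the href only.
        foreach (var prefix in new[] { "http://", "https://" })
        {
            if (href.StartsWith(prefix, StringComparison.OrdinalIgnoreCase)
                && href.Substring(prefix.Length) == name)
                return href;
        }

        // tel: or mailto: followed by exactly the name: output the name only.
        foreach (var prefix in new[] { "tel:", "mailto:" })
        {
            if (href.StartsWith(prefix, StringComparison.OrdinalIgnoreCase)
                && href.Substring(prefix.Length) == name)
                return name;
        }

        return markdown;
    }
}
```

In this sketch, RenderLink("test.com", "http://test.com", true) returns http://test.com, while RenderLink("my site", "http://test.com", true) still returns [my site](http://test.com) because none of the shortcuts apply.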
Note that UnknownTags config has been changed to an enumeration in v2.0.0 (breaking change)
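As an illustration of the enumeration-based setting, the fragment below selects a non-default option. It uses only the types and members shown earlier in this README; the <custom> tag is just an arbitrary example of a tag the converter is assumed not to recognise, and no particular output is claimed here.

```csharp
using ReverseMarkdown;

var config = new Config
{
    // Bypass: ignore the unknown tag itself but still try to convert its content.
    UnknownTags = Config.UnknownTagsOption.Bypass
};
var converter = new Converter(config);

// <custom> stands in for any tag that is not handled by the converter.
string result = converter.Convert("<custom>Hello <strong>world</strong></custom>");
```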
- Supports all the established html tags like h1, h2, h3, h4, h5, h6, p, em, strong, i, b, blockquote, code, img, a, hr, li, ol, ul, table, tr, th, td, br
- Can deal with nested lists
- GitHub Flavoured Markdown conversion is supported for br, pre and table. Use var config = new ReverseMarkdown.Config(githubFlavoured:true); to enable it. Tables are always converted to GitHub flavoured markdown, regardless of this flag.
Copyright © 2019 Babu Annamalai
ReverseMarkdown is licensed under MIT. Refer to License file for more information.