Software development and beyond

Convert Markdown text to HTML and to plaintext in JavaScript

Let’s create a small JavaScript module that converts Markdown-formatted text to HTML and to plaintext. I chose Marked.js because it makes modifying the output very easy.

Here is our basic module to convert Markdown to HTML:

import marked from "marked"

const mdOptions = {
  // whether to conform to original MD implementation
  pedantic: false,
  // GitHub Flavoured Markdown
  gfm: true,
  // tables extension
  tables: true,
  // smarter list behavior
  smartLists: true,
  // "smart" typographic punctuation for things like quotes and dashes
  smartypants: true,
  // sanitize HTML tags
  sanitize: true,
  // ... other options
}

export function convertToHTML(markdownText) {
  marked.setOptions(mdOptions)
  return marked(markdownText)
}

As we can see, all it takes to generate HTML from Markdown is one call to the marked(...) function. I included a set of options where I opted for GitHub Flavoured Markdown with the new table syntax and sanitizing HTML tags, among other things. By choosing to sanitize HTML, we allow users to format text only with Markdown syntax. Have a look at the full set of options to configure the output. Some of the options could be omitted because they match the defaults, but I find it useful to include them. We can also skip setting the options entirely if we don’t want to change the defaults.

We can further change Marked.js rendering with custom renderers. Renderers are basically just functions that accept arguments describing the parsed element and produce text output. Let’s have a look at some examples.

todoListItemRenderer is a renderer that will change the style of li elements when they are used as to-do items (with [ ] or [X]), so that checkboxes in the lists are rendered without bullets before them. This allows us to make better-looking to-do lists. Here it is:

const todoListItemRenderer = (text) => {
  if (text.includes('type="checkbox"')) {
    return `<li style="list-style: none">${text}</li>`
  }
  return `<li>${text}</li>`
}
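To see what this renderer produces, we can call it directly with the kind of HTML it receives for a list item. This is only a sketch: the sample input strings are assumptions that imitate the checkbox markup of a to-do item, and the function is repeated so that the snippet runs on its own.

```javascript
// Same renderer as above, repeated so this snippet is self-contained
const todoListItemRenderer = (text) => {
  if (text.includes('type="checkbox"')) {
    return `<li style="list-style: none">${text}</li>`
  }
  return `<li>${text}</li>`
}

// Assumed sample input: roughly what a "- [ ] Buy milk" item contains
// once its checkbox has been rendered
const todoItem = todoListItemRenderer('<input disabled="" type="checkbox"> Buy milk')

// An ordinary list item keeps its bullet
const plainItem = todoListItemRenderer("Buy milk")

console.log(todoItem)
console.log(plainItem)
```

Only the item containing a checkbox gets the list-style: none style, so regular lists are rendered as before.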

externalLinkRenderer is a renderer that sets the target attribute on external links so that they open in a new window. It also prevents users from producing relative links by leaving them as unrendered Markdown. For instance, this renderer is suitable for rendering Markdown in Electron applications where we don’t want to move to another page in our main application window. Here it is:

const externalLinkRenderer = (href, title, text) => {
  if (href.startsWith("http://") || href.startsWith("https://")) {
    if (!text) {
      text = href
    }
    // title may be empty or null, so only add the attribute when it is set
    const titleAttr = title ? ` title="${title}"` : ""
    return `<a target="_blank" href="${href}"${titleAttr}>${text}</a>`
  }
  return `[${text}](${href})`
}
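If we want to harden this renderer a little, one possible variant is sketched below. It is not part of the original module: the name saferExternalLinkRenderer is hypothetical, adding rel="noopener noreferrer" is my own suggestion for links opened with target="_blank", and the sketch assumes that title can be empty or null when the link has no title.

```javascript
// Hypothetical variant of externalLinkRenderer, not part of the original module
const saferExternalLinkRenderer = (href, title, text) => {
  if (href.startsWith("http://") || href.startsWith("https://")) {
    // Fall back to the URL when the link has no text
    const label = text || href
    // Only emit the title attribute when a title is actually present
    const titleAttr = title ? ` title="${title}"` : ""
    // rel="noopener noreferrer" keeps the new window from reaching back via window.opener
    return `<a target="_blank" rel="noopener noreferrer" href="${href}"${titleAttr}>${label}</a>`
  }
  // Anything else is written back as Markdown source instead of being rendered
  return `[${text}](${href})`
}

const withTitle = saferExternalLinkRenderer("https://example.com", "Example", "a site")
const withoutTitle = saferExternalLinkRenderer("https://example.com", null, "")
const relative = saferExternalLinkRenderer("/settings", null, "settings")

console.log(withTitle)
console.log(withoutTitle)
console.log(relative)
```

The three calls cover an external link with a title, an external link with neither title nor text, and a relative link that stays unrendered.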

So now we have defined our renderers, but we still have to use them. For this we will modify our convertToHTML function:

export function convertToHTML(markdownText) {
  const renderer = new marked.Renderer()
  renderer.listitem = todoListItemRenderer
  renderer.link = externalLinkRenderer
  marked.setOptions(mdOptions)
  return marked(markdownText, { renderer })
}

This is great, because with a little bit of code we are able to produce the HTML output we want from Markdown-styled text. However, sometimes it is useful to strip all Markdown formatting from the text. How can we do that?

For converting Markdown to plaintext there is a small helper library called marked-plaintext. Let’s add a new function called convertToPlainText to our JavaScript module:

import marked from "marked"
import PlainTextRenderer from "marked-plaintext"

const plaintextOptions = {
  sanitize: false
}

export function convertToPlainText(markdownText) {
  const renderer = new PlainTextRenderer()
  renderer.checkbox = (text) => {
    return text
  }
  marked.setOptions(plaintextOptions)
  return marked(markdownText, { renderer })
}

Now, why do I bother with setting the options and defining a renderer for checkbox that doesn’t really do anything?

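Setting sanitize to false matters when converting to plaintext, because we want HTML tags to stay as they were written (not encoded). It is also the default in Marked.js, so a module that only converts to plaintext could omit it. In our module, however, marked.setOptions changes the options globally, so an earlier call to convertToHTML would otherwise leave sanitize switched on.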
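One thing worth checking in your own setup is how options set by one function affect the other. The sketch below assumes that marked.setOptions merges the given options into shared defaults instead of replacing them. It does not call Marked.js at all: defaults and setOptions are hypothetical stand-ins that model only this merging behaviour.

```javascript
// Hypothetical stand-in for the library's shared defaults (not the real Marked.js object)
const defaults = { sanitize: false, gfm: true }

// Assumed behaviour: options are merged into the shared defaults, not replaced
const setOptions = (options) => Object.assign(defaults, options)

// convertToHTML runs first and switches sanitizing on
setOptions({ sanitize: true, smartypants: true })

// If convertToPlainText set no options, the earlier values would still apply
const withoutReset = { ...defaults }

// Setting sanitize explicitly restores what we want for plaintext
setOptions({ sanitize: false })
const withReset = { ...defaults }

console.log(withoutReset.sanitize, withReset.sanitize, withReset.smartypants)
```

In this model, smartypants also stays switched on after the second call, so if that matters, passing a complete set of options for each conversion may be the more predictable choice.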

Defining a renderer function for checkbox is necessary, because the marked-plaintext library doesn’t define one. So if we have to-do style lists, we’d get an error during rendering, because they produce checkboxes by default.

To customize it further, or to build a better plaintext renderer, we can take inspiration from the library itself, because it is really tiny. See its renderer definition. We can in fact just copy the renderer definition into our own code if we don’t want another dependency.
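As a rough idea of what a hand-written plaintext renderer could look like, here is a minimal sketch. The method names follow the ones Marked.js renderers use (heading, paragraph, strong, em, codespan, link, listitem, checkbox), but the object minimalPlainTextRenderer is hypothetical, covers only a few elements, and is not the code from marked-plaintext. It also assumes that checkbox receives a boolean telling whether the item is checked.

```javascript
// Hypothetical, deliberately incomplete plaintext renderer
const minimalPlainTextRenderer = {
  // Block elements: keep the text and separate blocks with a blank line
  heading: (text) => `${text}\n\n`,
  paragraph: (text) => `${text}\n\n`,
  // Inline elements: drop the formatting, keep the text
  strong: (text) => text,
  em: (text) => text,
  codespan: (text) => text,
  // Links: keep only the link text
  link: (href, title, text) => text,
  // Lists: one item per line with a simple dash
  listitem: (text) => `- ${text}\n`,
  // Assumption: checked is a boolean
  checkbox: (checked) => (checked ? "[x] " : "[ ] "),
}

// Calling the methods directly, the way a parser would combine them
const item = minimalPlainTextRenderer.listitem(
  minimalPlainTextRenderer.checkbox(true) + minimalPlainTextRenderer.strong("Buy milk")
)
const paragraph = minimalPlainTextRenderer.paragraph(
  "See " + minimalPlainTextRenderer.link("https://example.com", null, "the docs")
)

console.log(JSON.stringify(item))
console.log(JSON.stringify(paragraph))
```

In a real module we would assign these functions to a renderer instance, the same way convertToHTML assigns listitem and link, so that the elements not covered here still have a renderer.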

And that’s all. We ended up with a little JavaScript module that converts Markdown to HTML and to plaintext:

import marked from "marked"
import PlainTextRenderer from "marked-plaintext"

const mdOptions = {
  // whether to conform to original MD implementation
  pedantic: false,
  // GitHub Flavoured Markdown
  gfm: true,
  // tables extension
  tables: true,
  // smarter list behavior
  smartLists: true,
  // "smart" typographic punctuation for things like quotes and dashes
  smartypants: true,
  // sanitize HTML tags
  sanitize: true,
  // ... other options
}

const plaintextOptions = {
  sanitize: false
}

const todoListItemRenderer = (text) => {
  if (text.includes('type="checkbox"')) {
    return `<li style="list-style: none">${text}</li>`
  }
  return `<li>${text}</li>`
}

const externalLinkRenderer = (href, title, text) => {
  if (href.startsWith("http://") || href.startsWith("https://")) {
    if (!text) {
      text = href
    }
    // title may be empty or null, so only add the attribute when it is set
    const titleAttr = title ? ` title="${title}"` : ""
    return `<a target="_blank" href="${href}"${titleAttr}>${text}</a>`
  }
  return `[${text}](${href})`
}

export function convertToHTML(markdownText) {
  const renderer = new marked.Renderer()
  renderer.listitem = todoListItemRenderer
  renderer.link = externalLinkRenderer
  marked.setOptions(mdOptions)
  return marked(markdownText, { renderer })
}

export function convertToPlainText(markdownText) {
  const renderer = new PlainTextRenderer()
  renderer.checkbox = (text) => {
    return text
  }
  marked.setOptions(plaintextOptions)
  return marked(markdownText, { renderer })
}

We can now use our module as:

import { convertToHTML, convertToPlainText } from "./markdownService"
const html = convertToHTML(markdownText)
const plaintext = convertToPlainText(markdownText)

// or

import * as markdownService from "./markdownService"
const html = markdownService.convertToHTML(markdownText)
const plaintext = markdownService.convertToPlainText(markdownText)

Happy coding!

Last updated on 26.10.2018.
