# Markdown parser written in Clojure/Script

You can try out the parser here.

A markdown parser that compiles to both Clojure and ClojureScript.

Note: `markdown-clj` versions prior to `0.9.68` require Clojure 1.2+ to run; versions `0.9.68+` require Clojure 1.7.

Markdown-clj can be invoked by calling either the `md-to-html` or the `md-to-html-string` function.

The `md-to-html` function accepts an input containing Markdown markup and an output where
the resulting HTML will be written. The input and output parameters will be passed to a reader
and a writer respectively:

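```clojure
(ns foo
  (:use markdown.core)
  ;; input-stream and output-stream come from clojure.java.io
  (:require [clojure.java.io :refer [input-stream output-stream]]))

;; convert a file on disk, writing the HTML to another file
(md-to-html "input.md" "output.html")

;; streams are accepted as well
(md-to-html (input-stream "input.md") (output-stream "test.txt"))
```
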
The `md-to-html-string` function accepts a string with Markdown content and returns a string with the resulting HTML:

```clojure
(md-to-html-string "# This is a test\nsome code follows\n```clojure\n(defn foo [])\n```")
```
```xml
 This is a test
some code follows(defn foo [])
```

Both `md-to-html` and `md-to-html-string` accept optional parameters.

Specifying `:heading-anchors` will create anchors for the heading tags, e.g.:

```clojure
(markdown/md-to-html-string "###foo bar BAz" :heading-anchors true)
```
```xml
foo bar BAz
```

The code blocks default to a highlight.js compatible format of:

```xml
some code
```

Specifying `:code-style` will override the default code class formatting for code blocks, e.g.:

```clojure
(md-to-html-string "# This is a test\nsome code follows\n```clojure\n(defn foo [])\n```"
                   :code-style #(str "class=\"brush: " % "\""))
```
```xml
 This is a test
some code follows
(defn foo [])
```

The parser defaults to using inline references for performance reasons. To enable reference-style links, pass in the `:reference-links? true` option:

```clojure
(md-to-html-string
  "This is [an example][id] reference-style link.
[id]: http://example.com/ 'Optional Title Here'"
  :reference-links? true)
```

To enable footnotes, pass the `:footnotes? true` option:

```clojure
(md-to-html-string
  "Footnotes will appear automatically numbered with a link to the footnote at the bottom of the page [^footnote1].
[^footnote1]: The footnote will contain a back link to the referring text."
  :footnotes? true)
```

The metadata encoded using the syntax described by MultiMarkdown can be optionally extracted from the document.

The `md-to-html` function will attempt to parse the metadata when passed the `:parse-meta? true` option and return it as its output. Additionally, the `md-to-html-string-with-meta` function can be used to parse string input. It returns a map with two keys: `:html`, containing the parsed HTML, and `:metadata`, containing a map with the metadata included at the top of the document.

The value of each key in the metadata map will be a list of zero, one, or many strings. If a metadata value ends in two spaces, the string will end in a newline. If a line does not contain a header and has at least 4 spaces in front of it, it is considered a member of the last key that was found.

```clojure
(import '[java.io StringReader StringWriter])

;; `text` is a string holding the Markdown document
(let [input    (new StringReader text)
      output   (new StringWriter)
      metadata (md-to-html input output :parse-meta? true)
      html     (.toString output)]
  {:metadata metadata :html html})

(md-to-html-string-with-meta
  "Author: Rexella van Imp
    Kim Jong-un
Date: October 31, 2015

# Hello!")

;; returns
{:metadata {:author ["Rexella van Imp"
                     "Kim Jong-un"],
            :date ["October 31, 2015"]},
 :html "<h1>Hello!</h1>"}
```

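Because every metadata value is a list of strings, callers may want a flattened view before using it. The following is a minimal sketch, assuming a hypothetical helper named `flatten-meta` (it is not part of markdown-clj) and using the metadata map from the example above as literal data, so no parser is needed to try it:

```clojure
(require '[clojure.string :as string])

;; Metadata map copied from the example output above.
(def sample-metadata
  {:author ["Rexella van Imp" "Kim Jong-un"]
   :date   ["October 31, 2015"]})

(defn flatten-meta
  "Hypothetical helper: joins each list of metadata strings into one string."
  [metadata]
  (into {}
        (map (fn [[k values]] [k (string/join ", " values)]))
        metadata))

(flatten-meta sample-metadata)
;; => {:author "Rexella van Imp, Kim Jong-un", :date "October 31, 2015"}
```

A key with an empty list of values becomes an empty string under this sketch; whether that is the right behaviour depends on the caller.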
If you pass `:inhibit-separator "some-string"`, then any text within occurrences of `some-string` will be output verbatim, e.g.:

```clojure
(md-to-html-string "For all %$a_0, a_1, ..., a_n in R$% there is _at least one_ %$b_n in R$% such that..."
                   :inhibit-separator "%")
```
```xml
For all $a_0, a_1, ..., a_n in R$ there is at least one $b_n in R$ such that...
```

This may be useful when using `markdown-clj` along with parsers of other languages that have conflicting syntax (e.g. asciimath2jax).

If you need to output the separator itself, enter it twice without any text inside, e.g.:

```clojure
(md-to-html-string "This is one of those 20%% vs 80%% cases."
                   :inhibit-separator "%")
```
```xml
This is one of those 20% vs 80% cases.
```

Some caveats:

- Like other tags, this only works within a single line.
- If you remove the default transformers with `:replacement-transformers` (see below), inhibiting will stop working.
- Currently, dashes (`--` and `---`) can't be suppressed this way.

## Customizing the Parser

Additional transformers can be specified using the `:custom-transformers` key.
A transformer function must accept two arguments: the first is the string representing the current line, and the second is the map representing the current state. It returns a vector containing the transformed text and the state.

The default state keys are:

* `:code` - inside a code section
* `:codeblock` - inside a code block
* `:eof` - end of file
* `:heading` - in a heading
* `:hr` - in a horizontal line
* `:lists` - inside a list
* `:blockquote` - inside a blockquote
* `:paragraph` - in a paragraph
* `:last-line-empty?` - was the last line an empty line?

For example, if we wanted to add a transformer that would capitalize all text, we could do the following:

```clojure
(defn capitalize [text state]
  [(.toUpperCase text) state])

(markdown/md-to-html-string "#foo" :custom-transformers [capitalize])
```
```xml
FOO
```

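Since each transformer also receives the state map, it can choose to leave some lines alone. The following is a minimal sketch, assuming a hypothetical transformer named `shout-todo` (it is not part of markdown-clj) that relies only on the `:code` and `:codeblock` keys listed above. It is called directly here, so no parser is needed to try it:

```clojure
(require '[clojure.string :as string])

(defn shout-todo
  "Hypothetical transformer: upper-cases the word `todo`,
  but leaves the line untouched while inside code."
  [text state]
  (if (or (:code state) (:codeblock state))
    ;; inside code: pass the line through unchanged
    [text state]
    ;; elsewhere: replace whole-word, case-insensitive matches
    [(string/replace text #"(?i)\btodo\b" "TODO") state]))

(shout-todo "remember: todo item" {})
;; => ["remember: TODO item" {}]

(shout-todo "todo" {:codeblock true})
;; => ["todo" {:codeblock true}]
```

It could then be supplied the same way as `capitalize`, e.g. `:custom-transformers [shout-todo]`. How it interacts with the default transformers depends on where it runs in the chain, which this sketch does not verify.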
Alternatively, you could provide a custom set of transformers to replace the default transformers using the `:replacement-transformers` key.

```clojure
(markdown/md-to-html-string "#foo" :replacement-transformers [capitalize])
```

This can also be used to add preprocessor transformers. For example, if we wanted to sanitize any image links, we could do the following:

```clojure
(use 'markdown.transformers 'markdown.core)

;; strips anything that looks like an inline image before the defaults run
(defn escape-images [text state]
  [(clojure.string/replace text #"(!\[.*?\]\()(.+?)(\))" "") state])

(md-to-html-string
  "foo !Alt text bar text"
  :replacement-transformers (cons escape-images transformer-vector))
```
```xml
"foo bar text"
```

## Usage ClojureScript

The ClojureScript portion works the same as above except that the entry function is called `md->html`. It accepts a string followed by the options as its input, and returns the resulting HTML string:

```clojure
(ns myscript
  (:require [markdown.core :refer [md->html md->html-with-meta]]))

(.log js/console
  (md->html "##This is a heading\nwith a paragraph following it"))

(.log js/console
  (md->html "# This is a test\nsome code follows\n```clojure\n(defn foo [])\n```"
            :code-style #(str "class=\"" % "\"")))

(md->html-with-meta "# This is a test\nsome code follows\n```clojure\n(defn foo [])\n```")
```

## Usage JavaScript

```javascript
console.log(markdown.core.mdToHtml("##This is a heading\nwith a paragraph following it"));
```

## Supported syntax

Control characters can be escaped using `\`:

```
\\ backslash
\` backtick
```

#### Basic Elements
Blockquote,
Strong,
Bold,
Bold-Italic,
Emphasis,
Italics,
Heading,
Line,
Linebreak,
Paragraph,
Strikethrough
##### Automatic Links
This is a shortcut style for creating “automatic” links for URLs and email addresses:
```
<http://example.com/>
```

will be turned into:

```
http://example.com/
```

Automatic links for email addresses work similarly, except that they are hex encoded:

```
<address@example.com>
```

will be turned into:

```
address@example.com
```

#### Lists
Ordered List,
Unordered List
#### Code
Code Block,
Indented Code,
Inline Code

##### Heading

The number of hashes indicates the level of the heading:

```
# Heading
##Sub-heading
```

Headings can also be defined using `=` and `-` for `h1` and `h2` respectively:

```
Heading 1
=========

Heading 2
---------
```

##### Line

```
* * *
***
- - -
______
```

##### Linebreak

If a line ends with two or more spaces, a `<br />` tag will be inserted at the end.

##### Emphasis

```
*foo*
```

##### Italics

```
_foo_
```

##### Strong

```
**foo**
```

##### Bold

```
__foo__
```

##### Bold-Italic

```
***bold italic***
```

##### Blockquote

`>` prefixes regular blockquote paragraphs. `>-` prefixes a blockquote footer that can be used for author attribution.

```
>This is a blockquote
with some content

>this is another blockquote

> Everyone thinks of changing the world,
but no one thinks of changing himself.
>- Leo Tolstoy
```

##### Paragraph

```
This is a paragraph, it's
split into separate lines.

This is another paragraph.
```

##### Unordered List

Indenting an item makes it a sublist of the item above it. Ordered and unordered lists can be nested within one another.
List items can be split over multiple lines.

```
* Foo
* Bar
 * Baz
```

```
* foo
* bar
  * baz
    1. foo
    2. bar
       more content
       ## subheading
       ***
       **strong text** in the list
  * fuzz
    * blah
    * blue
* brass
```

##### Ordered List

```
1. Foo
2. Bar
3. Baz
```

##### Inline Code

Any special characters in code will be escaped with their corresponding HTML codes.

```
Here's some code `x + y = z` that's inlined.
```

##### Code block

Using three backquotes indicates the start of a code block; the next three backquotes end the code block section.
Optionally, the language name can be put after the backquotes to produce a tag compatible with highlight.js, e.g.:

```clojure
(defn foo [bar] "baz")
```

##### Indented Code

Indenting by at least 4 spaces creates a code block:

    some
    code
    here

Note: XML is escaped in code sections.

##### Strikethrough

```
~~foo~~
```

##### Superscript

```
a^2 + b^2 = c^2
```

##### Link

```
github
```

##### Reference Link

```
This is [an example][id] reference-style link.

[id]: http://example.com/ "Optional Title Here"
```

Note: reference links require the `:reference-links?` option to be set to true.

##### Footnote

```
"Footnotes will appear automatically numbered with a link to the footnote at the bottom of the page [^footnote1].
[^footnote1]: The footnote will contain a back link to the referring text."
```

Note: to enable footnotes, the `:footnotes?` option must be set to true.

##### Image

```
!Alt text
!Alt text
```

##### Image Reference

```
This is ![an example][id] reference-style image descriptor.

[id]: http://example.com/ "Optional Title Here"
```

Note: reference links require the `:reference-links?` option to be set to true.

##### Table

You can create tables by assembling a list of words and dividing them with hyphens `-` (for the first row), and then separating each column with a pipe `|`:

```
| First Header  | Second Header |
| ------------- | ------------- |
| Content Cell  | Content Cell  |
| Content Cell  | Content Cell  |
```

By including colons `:` within the row of hyphens below the header, you can define text to be left-aligned, right-aligned, or center-aligned:

```
| Left-Aligned  | Center Aligned    | Right Aligned |
| :------------ | :---------------: | ------------: |
| col 3 is      | some wordy text   | $1600         |
| col 2 is      | centered          | $12           |
| zebra stripes | are neat          | $1            |
```

A colon on the left-most side indicates a left-aligned column; a colon on the right-most side indicates a right-aligned column; a colon on both sides indicates a center-aligned column.

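The alignment rules can also be applied when generating table markup programmatically. The following is a minimal sketch, assuming hypothetical helpers named `separator-cell`, `table-row` and `markdown-table` (none of them is part of markdown-clj) and assuming that three hyphens per separator cell are sufficient:

```clojure
(require '[clojure.string :as string])

(defn separator-cell
  "Hypothetical helper: builds one separator cell for the given alignment."
  [alignment]
  (case alignment
    :left   ":---"   ; colon on the left  -> left-aligned
    :center ":---:"  ; colon on both sides -> centered
    :right  "---:"   ; colon on the right -> right-aligned
    "---"))          ; no colon -> no explicit alignment

(defn table-row
  "Joins cells into a single pipe-delimited line."
  [cells]
  (str "| " (string/join " | " cells) " |"))

(defn markdown-table
  "columns is a seq of {:title .. :align ..}; rows is a seq of cell vectors."
  [columns rows]
  (string/join "\n"
               (concat [(table-row (map :title columns))
                        (table-row (map (comp separator-cell :align) columns))]
                       (map table-row rows))))

(markdown-table [{:title "Item" :align :left}
                 {:title "Price" :align :right}]
                [["zebra stripes" "$1"]])
;; => "| Item | Price |\n| :--- | ---: |\n| zebra stripes | $1 |"
```

Every row stays on a single line, which matches the line-by-line way the parser reads its input.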

The parser reads the content line by line, which means that tag content is not allowed to span multiple lines.

Copyright © 2015 Dmitri Sotnikov

Distributed under the Eclipse Public License, the same as Clojure.