# remark-parse

remark plugin to parse Markdown.
[![Build][build-badge]][build]
[![Coverage][coverage-badge]][coverage]
[![Downloads][downloads-badge]][downloads]
[![Size][size-badge]][size]
[![Sponsors][sponsors-badge]][collective]
[![Backers][backers-badge]][collective]
[![Chat][chat-badge]][chat]
[Parser][] for [unified][unified].
Parses Markdown to [mdast][mdast] syntax trees.
Used in the [remark processor][remark] but can be used on its own as well.
Can be [extended][extend] to change how Markdown is parsed.
### Sponsors

Gatsby 🥇 · Vercel 🥇 · Netlify · Holloway · ThemeIsle 🥉 · BoostIO 🥉 · Expo 🥉 · You?
### Install

[npm][]:

```sh
npm install remark-parse
```
### Use

```js
var unified = require('unified')
var createStream = require('unified-stream')
var markdown = require('remark-parse')
var remark2rehype = require('remark-rehype')
var html = require('rehype-stringify')

var processor = unified()
  .use(markdown, {commonmark: true})
  .use(remark2rehype)
  .use(html)

process.stdin.pipe(createStream(processor)).pipe(process.stdout)
```
[See unified for more examples »][unified]
### Contents

* [API](#api)
  * [processor().use(parse[, options])](#processoruseparse-options)
  * [parse.Parser](#parseparser)
* [Extending the Parser](#extending-the-parser)
  * [Parser#blockTokenizers](#parserblocktokenizers)
  * [Parser#blockMethods](#parserblockmethods)
  * [Parser#inlineTokenizers](#parserinlinetokenizers)
  * [Parser#inlineMethods](#parserinlinemethods)
  * [function tokenizer(eat, value, silent)](#function-tokenizereat-value-silent)
  * [tokenizer.locator(value, fromIndex)](#tokenizerlocatorvalue-fromindex)
  * [eat(subvalue)](#eatsubvalue)
  * [add(node[, parent])](#addnode-parent)
  * [add.test()](#addtest)
  * [add.reset(node[, parent])](#addresetnode-parent)
  * [Turning off a tokenizer](#turning-off-a-tokenizer)
* [Security](#security)
* [Contribute](#contribute)
* [License](#license)
### API

[See unified for API docs »][unified]
#### processor().use(parse[, options])

Configure the processor to read Markdown as input and process
[mdast][mdast] syntax trees.
##### options
Options can be passed directly, or passed later through
[processor.data()][data].
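A minimal sketch of both approaches, assuming (as in remark itself) that
options given through data() are read from the settings key:

```js
var unified = require('unified')
var markdown = require('remark-parse')

// Passed directly as plugin options:
unified().use(markdown, {commonmark: true})

// Passed later, through `processor.data()`, under the (assumed) `settings` key:
unified().use(markdown).data('settings', {commonmark: true})
```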
###### options.gfm
GFM mode (boolean, default: true).
```markdown
hello ~~hi~~ world
```
Turns on:
* Fenced code blocks
* Autolinking of URLs
* Deletions (strikethrough)
* Task lists
* Tables
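To turn these features off, pass gfm: false; a minimal sketch:

```js
var unified = require('unified')
var markdown = require('remark-parse')

var processor = unified().use(markdown, {gfm: false})

// With GFM off, the `~~hi~~` from the example above stays plain text
// instead of becoming a `delete` node.
var tree = processor.parse('hello ~~hi~~ world')
```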
###### options.commonmark
CommonMark mode (boolean, default: false).
```markdown
This is a paragraph
and this is also part of the preceding paragraph.
```
Allows:

* Empty lines to split block quotes
* Parentheses (`(` and `)`) around link and image titles
* Any escaped [ASCII punctuation][escapes] character
* Closing parenthesis (`)`) as an ordered list marker
* URL definitions in block quotes

Disallows:

* Indented code blocks directly following a paragraph
* ATX headings (`# Hash headings`) without spacing after opening hashes or
  before closing hashes
* Setext headings (`Underline headings\n---`) when following a paragraph
* Newlines in link and image titles
* White space in link and image URLs in auto-links (links in brackets, `<`
  and `>`)
* Lazy block quote continuation, lines not preceded by a greater than
  character (`>`), for lists, code, and thematic breaks
###### options.pedantic
⚠️ Pedantic was previously used to mimic old-style Markdown mode: no tables, no
fenced code, and with many bugs.
It’s currently still “working”, but please do not use it; it’ll be removed in
the future.
###### options.blocks
Blocks (`Array.<string>`, default: list of [block HTML elements][blocks]).

Defines which HTML elements are seen as block level.
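A minimal sketch of passing the option; the element names here are
illustrative, not the full default list:

```js
var unified = require('unified')
var markdown = require('remark-parse')

// Only these (example) element names are treated as block level.
unified().use(markdown, {blocks: ['div', 'section', 'custom-element']})
```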
#### parse.Parser

Access to the [parser][], if you need it.
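A minimal sketch of getting at it, assuming the processor has been frozen so
that attachers have run:

```js
var unified = require('unified')
var markdown = require('remark-parse')

// Freezing runs the attachers; remark-parse then exposes its constructor
// as `processor.Parser`.
var processor = unified().use(markdown).freeze()

console.log(typeof processor.Parser) // => 'function'
```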
### Extending the Parser

Typically, using [transformers][transformer] to manipulate a syntax tree
produces the desired output.
Sometimes, such as when introducing new syntactic entities with a certain
precedence, interfacing with the parser is necessary.
If the remark-parse plugin is used, it adds a [Parser][parser] constructor
function to the processor.
Other plugins can add tokenizers to its prototype to change how Markdown is
parsed.
The below plugin adds a [tokenizer][] for at-mentions.
```js
module.exports = mentions

function mentions() {
  var Parser = this.Parser
  var tokenizers = Parser.prototype.inlineTokenizers
  var methods = Parser.prototype.inlineMethods

  // Add an inline tokenizer (defined in the following example).
  tokenizers.mention = tokenizeMention

  // Run it just before `text`.
  methods.splice(methods.indexOf('text'), 0, 'mention')
}
```
#### Parser#blockTokenizers

Map of names to [tokenizer][]s (`Object.<Function>`).

These tokenizers (such as fencedCode, table, and paragraph) eat from the
start of a value to a line ending.
See #blockMethods below for a list of methods that are included by default.
#### Parser#blockMethods

List of blockTokenizers names (`Array.<string>`).
Specifies the order in which tokenizers run.
Precedence of default block methods is as follows:
* blankLine
* indentedCode
* fencedCode
* blockquote
* atxHeading
* thematicBreak
* list
* setextHeading
* html
* definition
* table
* paragraph
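A plugin that adds a block-level construct registers its tokenizer and splices
its name into this list at the desired precedence, mirroring the inline
mentions example above. A sketch, where ./tokenize-custom-block is a
hypothetical module exporting a block [tokenizer][]:

```js
var tokenizeCustomBlock = require('./tokenize-custom-block') // hypothetical

module.exports = customBlocks

function customBlocks() {
  var Parser = this.Parser
  var tokenizers = Parser.prototype.blockTokenizers
  var methods = Parser.prototype.blockMethods

  // Register the block tokenizer.
  tokenizers.customBlock = tokenizeCustomBlock

  // Run it just before `paragraph`.
  methods.splice(methods.indexOf('paragraph'), 0, 'customBlock')
}
```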
#### Parser#inlineTokenizers

Map of names to [tokenizer][]s (`Object.<Function>`).

These tokenizers (such as url, reference, and emphasis) eat from the start
of a value.
To increase performance, they depend on [locator][]s.
See #inlineMethods below for a list of methods that are included by default.
#### Parser#inlineMethods

List of inlineTokenizers names (`Array.<string>`).
Specifies the order in which tokenizers run.
Precedence of default inline methods is as follows:
* escape
* autoLink
* url
* html
* link
* reference
* strong
* emphasis
* deletion
* code
* break
* text
#### function tokenizer(eat, value, silent)

There are two types of tokenizers: block level and inline level.
Both are functions, and work the same, but inline tokenizers must have a
[locator][].
The following example shows an inline tokenizer that is added by the mentions
plugin above.
```js
tokenizeMention.notInLink = true
tokenizeMention.locator = locateMention

function tokenizeMention(eat, value, silent) {
  var match = /^@(\w+)/.exec(value)

  if (match) {
    if (silent) {
      return true
    }

    return eat(match[0])({
      type: 'link',
      url: 'https://social-network/' + match[1],
      children: [{type: 'text', value: match[0]}]
    })
  }
}
```
Tokenizers test whether a document starts with a certain syntactic entity.
In silent mode, they return whether that test passes.
In normal mode, they consume that token, a process which is called “eating”.
Locators enable inline tokenizers to function faster by providing information
on where the next entity may occur.
###### Signatures
* Node? = tokenizer(eat, value)
* boolean? = tokenizer(eat, value, silent)

###### Parameters

* eat ([Function][eat]) — Eat, when applicable, an entity
* value (string) — Value which may start an entity
* silent (boolean, optional) — Whether to detect or consume

###### Properties

* locator ([Function][locator]) — Required for inline tokenizers
* onlyAtStart (boolean) — Whether nodes can only be found at the beginning
  of the document
* notInBlock (boolean) — Whether nodes cannot be in block quotes or lists
* notInList (boolean) — Whether nodes cannot be in lists
* notInLink (boolean) — Whether nodes cannot be in links

###### Returns

* boolean?, in *silent* mode — whether a node can be found at the start of
  value
* [Node?][node], in *normal* mode — the node, if one can be found at the
  start of value
#### tokenizer.locator(value, fromIndex)

Locators are required for inline tokenizers.
Their role is to keep parsing performant.
The following example shows a locator that is added by the mentions tokenizer
above.
```js
function locateMention(value, fromIndex) {
  return value.indexOf('@', fromIndex)
}
```
Locators enable inline tokenizers to function faster by providing information on
where the next entity may occur.
Locators may be wrong; it’s OK if there actually isn’t a node to be found at
the index they return.
###### Parameters
* value (string) — Value which may contain an entity
* fromIndex (number) — Position to start searching at
###### Returns
number — Index at which an entity may start, and -1 otherwise.
#### eat(subvalue)

```js
var add = eat('foo')
```

Eat subvalue, which is a string at the start of the [tokenized][tokenizer]
value.
###### Parameters
* subvalue (string) - Value to eat
###### Returns
[add][add].
#### add(node[, parent])

```js
var add = eat('foo')
add({type: 'text', value: 'foo'})
```
Add [positional information][position] to node and add node to parent.
###### Parameters
* node ([Node][node]) - Node to patch position on and to add
* parent ([Parent][parent], optional) - Place to add node to in the
  syntax tree.
  Defaults to the currently processed node
###### Returns
[Node][node] — The given node.
#### add.test()

Get the [positional information][position] that would be patched on node by
add.
###### Returns
[Position][position].
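A sketch of checking that position from inside a tokenizer, continuing the
eat('foo') fragment above:

```js
var add = eat('foo')

// Inspect the positional information `add` would patch, without adding a
// node yet.
var position = add.test()

console.log(position) // => a unist Position: {start: {…}, end: {…}}
```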
#### add.reset(node[, parent])

add, but resets the internal position.
Useful for example in lists, where the same content is first eaten for a list,
and later for list items.
###### Parameters
* node ([Node][node]) - Node to patch position on and insert
* parent ([Node][node], optional) - Place to add node to in
  the syntax tree.
  Defaults to the currently processed node
###### Returns
[Node][node] — The given node.
#### Turning off a tokenizer

In some situations, you may want to turn off a tokenizer to avoid parsing that
syntactic feature.
Preferably, use the [remark-disable-tokenizers][remark-disable-tokenizers]
plugin to turn off tokenizers.
Alternatively, this can be done by replacing the tokenizer from
blockTokenizers (or blockMethods) or inlineTokenizers (or inlineMethods).
The following example turns off indented code blocks:
```js
var remarkParse = require('remark-parse')

// Turn off indented code blocks by overriding their tokenizer.
remarkParse.Parser.prototype.blockTokenizers.indentedCode = indentedCode

function indentedCode() {
  return true
}
```
### Security

As Markdown is sometimes used for HTML, and improper use of HTML can open you up
to a [cross-site scripting (XSS)][xss] attack, use of remark can also be unsafe.
When going to HTML, use remark in combination with the [rehype][rehype]
ecosystem, and use [rehype-sanitize][sanitize] to make the tree safe.
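A minimal sketch of such a pipeline, using rehype-sanitize with its default
schema:

```js
var unified = require('unified')
var markdown = require('remark-parse')
var remark2rehype = require('remark-rehype')
var sanitize = require('rehype-sanitize')
var html = require('rehype-stringify')

// Sanitize the HTML syntax tree (hast) before serializing it.
var processor = unified()
  .use(markdown)
  .use(remark2rehype)
  .use(sanitize)
  .use(html)
```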
Use of remark plugins could also open you up to other attacks.
Carefully assess each plugin and the risks involved in using them.
### Contribute

See [contributing.md][contributing] in [remarkjs/.github][health] for ways
to get started.
See [support.md][support] for ways to get help.
Ideas for new plugins and tools can be posted in [remarkjs/ideas][ideas].
A curated list of awesome remark resources can be found in [**awesome
remark**][awesome].
This project has a [code of conduct][coc].
By interacting with this repository, organization, or community you agree to
abide by its terms.
### License

[MIT][license] © [Titus Wormer][author]
[build-badge]: https://img.shields.io/travis/remarkjs/remark.svg
[build]: https://travis-ci.org/remarkjs/remark
[coverage-badge]: https://img.shields.io/codecov/c/github/remarkjs/remark.svg
[coverage]: https://codecov.io/github/remarkjs/remark
[downloads-badge]: https://img.shields.io/npm/dm/remark-parse.svg
[downloads]: https://www.npmjs.com/package/remark-parse
[size-badge]: https://img.shields.io/bundlephobia/minzip/remark-parse.svg
[size]: https://bundlephobia.com/result?p=remark-parse
[sponsors-badge]: https://opencollective.com/unified/sponsors/badge.svg
[backers-badge]: https://opencollective.com/unified/backers/badge.svg
[collective]: https://opencollective.com/unified
[chat-badge]: https://img.shields.io/badge/chat-spectrum-7b16ff.svg
[chat]: https://spectrum.chat/unified/remark
[health]: https://github.com/remarkjs/.github
[contributing]: https://github.com/remarkjs/.github/blob/HEAD/contributing.md
[support]: https://github.com/remarkjs/.github/blob/HEAD/support.md
[coc]: https://github.com/remarkjs/.github/blob/HEAD/code-of-conduct.md
[ideas]: https://github.com/remarkjs/ideas
[awesome]: https://github.com/remarkjs/awesome-remark
[license]: https://github.com/remarkjs/remark/blob/main/license
[author]: https://wooorm.com
[npm]: https://docs.npmjs.com/cli/install
[unified]: https://github.com/unifiedjs/unified
[data]: https://github.com/unifiedjs/unified#processordatakey-value
[remark]: https://github.com/remarkjs/remark/tree/main/packages/remark
[blocks]: https://github.com/remarkjs/remark/blob/main/packages/remark-parse/lib/block-elements.js
[mdast]: https://github.com/syntax-tree/mdast
[escapes]: https://spec.commonmark.org/0.29/#backslash-escapes
[node]: https://github.com/syntax-tree/unist#node
[parent]: https://github.com/syntax-tree/unist#parent
[position]: https://github.com/syntax-tree/unist#position
[parser]: https://github.com/unifiedjs/unified#processorparser
[transformer]: https://github.com/unifiedjs/unified#function-transformernode-file-next
[extend]: #extending-the-parser
[tokenizer]: #function-tokenizereat-value-silent
[locator]: #tokenizerlocatorvalue-fromindex
[eat]: #eatsubvalue
[add]: #addnode-parent
[remark-disable-tokenizers]: https://github.com/zestedesavoir/zmarkdown/tree/HEAD/packages/remark-disable-tokenizers
[xss]: https://en.wikipedia.org/wiki/Cross-site_scripting
[rehype]: https://github.com/rehypejs/rehype
[sanitize]: https://github.com/rehypejs/rehype-sanitize