# remark-parse

remark plugin to parse Markdown.
[![Build][build-badge]][build]
[![Coverage][coverage-badge]][coverage]
[![Downloads][downloads-badge]][downloads]
[![Size][size-badge]][size]
[![Sponsors][sponsors-badge]][collective]
[![Backers][backers-badge]][collective]
[![Chat][chat-badge]][chat]
[Parser][] for [unified][unified].
Parses Markdown to [mdast][mdast] syntax trees.
Used in the [remark processor][remark] but can be used on its own as well.
Can be [extended][extend] to change how Markdown is parsed.
### Sponsors

Gatsby 🥇 · Vercel 🥇 · Netlify · Holloway · ThemeIsle 🥉 · BoostIO 🥉 · Expo 🥉 · You?
### Install

[npm][]:

```sh
npm install remark-parse
```
### Use

```js
var unified = require('unified')
var createStream = require('unified-stream')
var markdown = require('remark-parse')
var remark2rehype = require('remark-rehype')
var html = require('rehype-stringify')

var processor = unified()
  .use(markdown, {commonmark: true})
  .use(remark2rehype)
  .use(html)

process.stdin.pipe(createStream(processor)).pipe(process.stdout)
```
[See unified for more examples »][unified]
### Contents

* [API](#api)
  * [processor().use(parse[, options])](#processoruseparse-options)
  * [parse.Parser](#parseparser)
* [Extending the Parser](#extending-the-parser)
  * [Parser#blockTokenizers](#parserblocktokenizers)
  * [Parser#blockMethods](#parserblockmethods)
  * [Parser#inlineTokenizers](#parserinlinetokenizers)
  * [Parser#inlineMethods](#parserinlinemethods)
  * [function tokenizer(eat, value, silent)](#function-tokenizereat-value-silent)
  * [tokenizer.locator(value, fromIndex)](#tokenizerlocatorvalue-fromindex)
  * [eat(subvalue)](#eatsubvalue)
  * [add(node[, parent])](#addnode-parent)
  * [add.test()](#addtest)
  * [add.reset(node[, parent])](#addresetnode-parent)
  * [Turning off a tokenizer](#turning-off-a-tokenizer)
* [Security](#security)
* [Contribute](#contribute)
* [License](#license)
### API

[See unified for API docs »][unified]
#### processor().use(parse[, options])

Configure the processor to read Markdown as input and process
[mdast][mdast] syntax trees.
##### options
Options can be passed directly, or passed later through
[processor.data()][data].
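A minimal sketch of both approaches, assuming (as in remark itself) that
options given through data() are read from the settings key:

```js
var unified = require('unified')
var markdown = require('remark-parse')

// Passed directly as plugin options:
unified().use(markdown, {commonmark: true})

// Passed later, through `processor.data()`, under the (assumed) `settings` key:
unified().use(markdown).data('settings', {commonmark: true})
```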
###### options.gfm
GFM mode (boolean, default: true).
```markdown
hello ~~hi~~ world
```
Turns on:
* Fenced code blocks
* Autolinking of URLs
* Deletions (strikethrough)
* Task lists
* Tables
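To turn these features off, pass gfm: false; a minimal sketch:

```js
var unified = require('unified')
var markdown = require('remark-parse')

var processor = unified().use(markdown, {gfm: false})

// With GFM off, the `~~hi~~` from the example above stays plain text
// instead of becoming a `delete` node.
var tree = processor.parse('hello ~~hi~~ world')
```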
###### options.commonmark
CommonMark mode (boolean, default: false).
```markdown
This is a paragraph
and this is also part of the preceding paragraph.
```
Allows:

* Empty lines to split block quotes
* Parentheses (`(` and `)`) around link and image titles
* Any escaped [ASCII punctuation][escapes] character
* Closing parenthesis (`)`) as an ordered list marker
* URL definitions in block quotes

Disallows:

* Indented code blocks directly following a paragraph
* ATX headings (`# Hash headings`) without spacing after opening hashes or
  before closing hashes
* Setext headings (`Underline headings\n---`) when following a paragraph
* Newlines in link and image titles
* White space in link and image URLs in auto-links (links in brackets, `<`
  and `>`)
* Lazy block quote continuation, lines not preceded by a greater than
  character (`>`), for lists, code, and thematic breaks
###### options.pedantic
⚠️ Pedantic was previously used to mimic old-style Markdown mode: no tables, no
fenced code, and with many bugs.
It’s currently still “working”, but please do not use it; it’ll be removed in
the future.
###### options.blocks
Blocks (`Array.<string>`, default: list of [block HTML elements][blocks]).

Defines which HTML elements are seen as block level.
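A minimal sketch of passing the option; the element names here are
illustrative, not the full default list:

```js
var unified = require('unified')
var markdown = require('remark-parse')

// Only these (example) element names are treated as block level.
unified().use(markdown, {blocks: ['div', 'section', 'custom-element']})
```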
#### parse.Parser

Access to the [parser][], if you need it.
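A minimal sketch of getting at it, assuming the processor has been frozen so
that attachers have run:

```js
var unified = require('unified')
var markdown = require('remark-parse')

// Freezing runs the attachers; remark-parse then exposes its constructor
// as `processor.Parser`.
var processor = unified().use(markdown).freeze()

console.log(typeof processor.Parser) // => 'function'
```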
### Extending the Parser

Typically, using [transformers][transformer] to manipulate a syntax tree
produces the desired output.
Sometimes, such as when introducing new syntactic entities with a certain
precedence, interfacing with the parser is necessary.
If the remark-parse plugin is used, it adds a [Parser][parser] constructor
function to the processor.
Other plugins can add tokenizers to its prototype to change how Markdown is
parsed.
The below plugin adds a [tokenizer][] for at-mentions.
```js
module.exports = mentions

function mentions() {
  var Parser = this.Parser
  var tokenizers = Parser.prototype.inlineTokenizers
  var methods = Parser.prototype.inlineMethods

  // Add an inline tokenizer (defined in the following example).
  tokenizers.mention = tokenizeMention

  // Run it just before `text`.
  methods.splice(methods.indexOf('text'), 0, 'mention')
}
```
#### Parser#blockTokenizers

Map of names to [tokenizer][]s (`Object.<Function>`).

These tokenizers (such as fencedCode, table, and paragraph) eat from the
start of a value to a line ending.
See #blockMethods below for a list of methods that are included by default.
#### Parser#blockMethods

List of blockTokenizers names (`Array.<string>`).
Specifies the order in which tokenizers run.
Precedence of default block methods is as follows:
* blankLine
* indentedCode
* fencedCode
* blockquote
* atxHeading
* thematicBreak
* list
* setextHeading
* html
* definition
* table
* paragraph
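A plugin that adds a block-level construct registers its tokenizer and splices
its name into this list at the desired precedence, mirroring the inline
mentions example above. A sketch, where ./tokenize-custom-block is a
hypothetical module exporting a block [tokenizer][]:

```js
var tokenizeCustomBlock = require('./tokenize-custom-block') // hypothetical

module.exports = customBlocks

function customBlocks() {
  var Parser = this.Parser
  var tokenizers = Parser.prototype.blockTokenizers
  var methods = Parser.prototype.blockMethods

  // Register the block tokenizer.
  tokenizers.customBlock = tokenizeCustomBlock

  // Run it just before `paragraph`.
  methods.splice(methods.indexOf('paragraph'), 0, 'customBlock')
}
```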
#### Parser#inlineTokenizers

Map of names to [tokenizer][]s (`Object.<Function>`).

These tokenizers (such as url, reference, and emphasis) eat from the start
of a value.
To increase performance, they depend on [locator][]s.
See #inlineMethods below for a list of methods that are included by default.
#### Parser#inlineMethods

List of inlineTokenizers names (`Array.<string>`).
Specifies the order in which tokenizers run.
Precedence of default inline methods is as follows:
* escape
* autoLink
* url
* html
* link
* reference
* strong
* emphasis
* deletion
* code
* break
* text
#### function tokenizer(eat, value, silent)

There are two types of tokenizers: block level and inline level.
Both are functions, and work the same, but inline tokenizers must have a
[locator][].
The following example shows an inline tokenizer that is added by the mentions
plugin above.
```js
tokenizeMention.notInLink = true
tokenizeMention.locator = locateMention

function tokenizeMention(eat, value, silent) {
  var match = /^@(\w+)/.exec(value)

  if (match) {
    if (silent) {
      return true
    }

    return eat(match[0])({
      type: 'link',
      url: 'https://social-network/' + match[1],
      children: [{type: 'text', value: match[0]}]
    })
  }
}
```
Tokenizers test whether a document starts with a certain syntactic entity.
In silent mode, they return whether that test passes.
In normal mode, they consume that token, a process which is called “eating”.
Locators enable inline tokenizers to function faster by providing information
on where the next entity may occur.
###### Signatures
* Node? = tokenizer(eat, value)
* boolean? = tokenizer(eat, value, silent)

###### Parameters

* eat ([Function][eat]) — Eat, when applicable, an entity
* value (string) — Value which may start an entity
* silent (boolean, optional) — Whether to detect or consume

###### Properties

* locator ([Function][locator]) — Required for inline tokenizers
* onlyAtStart (boolean) — Whether nodes can only be found at the beginning
  of the document
* notInBlock (boolean) — Whether nodes cannot be in block quotes or lists
* notInList (boolean) — Whether nodes cannot be in lists
* notInLink (boolean) — Whether nodes cannot be in links

###### Returns

* boolean?, in *silent* mode — whether a node can be found at the start of
  value
* [Node?][node], in *normal* mode — the node, if one can be found at the
  start of value
#### tokenizer.locator(value, fromIndex)

Locators are required for inline tokenizers.
Their role is to keep parsing performant.
The following example shows a locator that is added by the mentions tokenizer
above.
```js
function locateMention(value, fromIndex) {
  return value.indexOf('@', fromIndex)
}
```
Locators enable inline tokenizers to function faster by providing information on
where the next entity may occur.
Locators may be wrong; it’s OK if there actually isn’t a node to be found at
the index they return.
###### Parameters
* value (string) — Value which may contain an entity
* fromIndex (number) — Position to start searching at
###### Returns
number — Index at which an entity may start, and -1 otherwise.
#### eat(subvalue)

```js
var add = eat('foo')
```

Eat subvalue, which is a string at the start of the [tokenized][tokenizer]
value.
###### Parameters
* subvalue (string) - Value to eat
###### Returns
[add][add].
#### add(node[, parent])

```js
var add = eat('foo')
add({type: 'text', value: 'foo'})
```
Add [positional information][position] to node and add node to parent.
###### Parameters
* node ([Node][node]) - Node to patch position on and to add
* parent ([Parent][parent], optional) - Place to add node to in the
  syntax tree.
  Defaults to the currently processed node
###### Returns
[Node][node] — The given node.
#### add.test()

Get the [positional information][position] that would be patched on node by
add.
###### Returns
[Position][position].
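A sketch of checking that position from inside a tokenizer, continuing the
eat('foo') fragment above:

```js
var add = eat('foo')

// Inspect the positional information `add` would patch, without adding a
// node yet.
var position = add.test()

console.log(position) // => a unist Position: {start: {…}, end: {…}}
```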
#### add.reset(node[, parent])

add, but resets the internal position.
Useful for example in lists, where the same content is first eaten for a list,
and later for list items.
###### Parameters
* node ([Node][node]) - Node to patch position on and insert
* parent ([Node][node], optional) - Place to add node to in
  the syntax tree.
  Defaults to the currently processed node
###### Returns
[Node][node] — The given node.
#### Turning off a tokenizer

In some situations, you may want to turn off a tokenizer to avoid parsing that
syntactic feature.
Preferably, use the [remark-disable-tokenizers][remark-disable-tokenizers]
plugin to turn off tokenizers.
Alternatively, this can be done by replacing the tokenizer from
blockTokenizers (or blockMethods) or inlineTokenizers (or inlineMethods).
The following example turns off indented code blocks:
```js
var remarkParse = require('remark-parse')

// Turn off indented code blocks by overriding their tokenizer.
remarkParse.Parser.prototype.blockTokenizers.indentedCode = indentedCode

function indentedCode() {
  return true
}
```
### Security

As Markdown is sometimes used for HTML, and improper use of HTML can open you up
to a [cross-site scripting (XSS)][xss] attack, use of remark can also be unsafe.
When going to HTML, use remark in combination with the [rehype][rehype]
ecosystem, and use [rehype-sanitize][sanitize] to make the tree safe.
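A minimal sketch of such a pipeline, using rehype-sanitize with its default
schema:

```js
var unified = require('unified')
var markdown = require('remark-parse')
var remark2rehype = require('remark-rehype')
var sanitize = require('rehype-sanitize')
var html = require('rehype-stringify')

// Sanitize the HTML syntax tree (hast) before serializing it.
var processor = unified()
  .use(markdown)
  .use(remark2rehype)
  .use(sanitize)
  .use(html)
```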
Use of remark plugins could also open you up to other attacks.
Carefully assess each plugin and the risks involved in using them.
### Contribute

See [contributing.md][contributing] in [remarkjs/.github][health] for ways
to get started.
See [support.md][support] for ways to get help.
Ideas for new plugins and tools can be posted in [remarkjs/ideas][ideas].
A curated list of awesome remark resources can be found in [**awesome
remark**][awesome].
This project has a [code of conduct][coc].
By interacting with this repository, organization, or community you agree to
abide by its terms.
### License

[MIT][license] © [Titus Wormer][author]
[build-badge]: https://img.shields.io/travis/remarkjs/remark.svg
[build]: https://travis-ci.org/remarkjs/remark
[coverage-badge]: https://img.shields.io/codecov/c/github/remarkjs/remark.svg
[coverage]: https://codecov.io/github/remarkjs/remark
[downloads-badge]: https://img.shields.io/npm/dm/remark-parse.svg
[downloads]: https://www.npmjs.com/package/remark-parse
[size-badge]: https://img.shields.io/bundlephobia/minzip/remark-parse.svg
[size]: https://bundlephobia.com/result?p=remark-parse
[sponsors-badge]: https://opencollective.com/unified/sponsors/badge.svg
[backers-badge]: https://opencollective.com/unified/backers/badge.svg
[collective]: https://opencollective.com/unified
[chat-badge]: https://img.shields.io/badge/chat-spectrum-7b16ff.svg
[chat]: https://spectrum.chat/unified/remark
[health]: https://github.com/remarkjs/.github
[contributing]: https://github.com/remarkjs/.github/blob/HEAD/contributing.md
[support]: https://github.com/remarkjs/.github/blob/HEAD/support.md
[coc]: https://github.com/remarkjs/.github/blob/HEAD/code-of-conduct.md
[ideas]: https://github.com/remarkjs/ideas
[awesome]: https://github.com/remarkjs/awesome-remark
[license]: https://github.com/remarkjs/remark/blob/main/license
[author]: https://wooorm.com
[npm]: https://docs.npmjs.com/cli/install
[unified]: https://github.com/unifiedjs/unified
[data]: https://github.com/unifiedjs/unified#processordatakey-value
[remark]: https://github.com/remarkjs/remark/tree/main/packages/remark
[blocks]: https://github.com/remarkjs/remark/blob/main/packages/remark-parse/lib/block-elements.js
[mdast]: https://github.com/syntax-tree/mdast
[escapes]: https://spec.commonmark.org/0.29/#backslash-escapes
[node]: https://github.com/syntax-tree/unist#node
[parent]: https://github.com/syntax-tree/unist#parent
[position]: https://github.com/syntax-tree/unist#position
[parser]: https://github.com/unifiedjs/unified#processorparser
[transformer]: https://github.com/unifiedjs/unified#function-transformernode-file-next
[extend]: #extending-the-parser
[tokenizer]: #function-tokenizereat-value-silent
[locator]: #tokenizerlocatorvalue-fromindex
[eat]: #eatsubvalue
[add]: #addnode-parent
[remark-disable-tokenizers]: https://github.com/zestedesavoir/zmarkdown/tree/HEAD/packages/remark-disable-tokenizers
[xss]: https://en.wikipedia.org/wiki/Cross-site_scripting
[rehype]: https://github.com/rehypejs/rehype
[sanitize]: https://github.com/rehypejs/rehype-sanitize