Parse [fenced code blocks] from [Markdown] with useful metadata.
```
npm install [--save | --save-dev] code-blocks
```

```js
// ES5/CommonJS
const codeBlocks = require('code-blocks')

// ES2015/ES6/Babel, etc.
import codeBlocks from 'code-blocks'

codeBlocks.fromFile('README.md')
  .then(blocks => {
    // do stuff with blocks here
  })
```
See the API documentation for more examples.
This library uses [remark] to parse Markdown into a [unist] tree,
then finds all of the [fenced code blocks]. Those with a language
identifier after the opening ```` ``` ```` or `~~~`
get some additional properties.
According to the [CommonMark Spec][fenced code blocks]:
> The first word of the info string is typically used to specify the language
> of the code sample, and rendered in the class attribute of the code tag.
> However, this spec does not mandate any particular treatment of the info
> string.
In other words, CommonMark-compliant parsers should safely ignore everything
after the language identifier. That's where we can attach additional key/value
pairs, which are parsed into each code block's `info` object, as in:
~~~markdown
```html title="A dumb example" foo=bar "x.y.z"="1 2 3"
Hello, world!
```
~~~
When parsed with code-blocks, this would yield an array with one object:
```js
[{
  type: 'code',
  lang: 'html',
  value: 'Hello, world!\n',
  info: {
    title: 'A dumb example',
    foo: 'bar',
    'x.y.z': '1 2 3'
  },
  title: 'A dumb example',
  source: {
    file: 'README.md',
    line: 1
  },
  position: {
    // see https://github.com/syntax-tree/unist#position
  }
}]
```
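To make the key/value convention concrete, here is a minimal sketch of how an info string like the one above could be split into a language and an `info` object. `parseInfoString` is a hypothetical helper written for illustration; it is not the library's actual parser, and it assumes that keys and values are either bare words or double-quoted strings with no escaped quotes.

```js
// Hypothetical helper for illustration only; not part of code-blocks.
// Assumes keys and values are bare words or double-quoted strings
// (no escaped quotes inside them).
function parseInfoString(infoString) {
  // The first whitespace-delimited word is the language identifier.
  const match = /^(\S+)\s*(.*)$/.exec(infoString.trim())
  if (!match) return { lang: null, info: {} }

  const [, lang, rest] = match
  const info = {}

  // key=value, where each side is either "quoted text" or a bare word.
  const pair = /(?:"([^"]*)"|([^\s="]+))=(?:"([^"]*)"|(\S+))/g
  let m
  while ((m = pair.exec(rest)) !== null) {
    const key = m[1] !== undefined ? m[1] : m[2]
    const value = m[3] !== undefined ? m[3] : m[4]
    info[key] = value
  }
  return { lang, info }
}

const parsed = parseInfoString('html title="A dumb example" foo=bar "x.y.z"="1 2 3"')
console.log(parsed.lang) // html
console.log(parsed.info) // { title: 'A dumb example', foo: 'bar', 'x.y.z': '1 2 3' }
```

Anything in the remainder that does not look like a `key=value` pair is skipped in this sketch; the real parser may treat such tokens differently.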
The [unist] node objects returned by all of the block parsing
functions are "enhanced" with the following properties:
* `lang` contains only the first "word" of the info string
* `info` is an object of key/value pairs parsed from the remainder of the info
  string
* `title` is the title of the code block, as determined by the title
  algorithm below
* `source` is an object with two keys:
  * `file` is the path provided as the first argument to `fromFile()`
    and `fromFileSync()`, or as the last argument to `fromString()`
    or `fromAST()`. (If no file is provided,
    this value will be `buffer`.)
  * `line` is the starting line of the code block in the markdown
    input.
See the [mdast] documentation for more info about the Code
nodes generated by
[remark], and the [unist] documentation for more on the
underlying structures.
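As a rough illustration of how these properties might be consumed, the sketch below builds a one-line summary of a block. `describeBlock` is a hypothetical helper, and `sample` is a hand-written stand-in shaped like the example object earlier in this document rather than real parser output. The fallback label for a missing `lang` is also an assumption.

```js
// Hypothetical helper, not part of code-blocks: summarize one parsed block
// using only the properties described in this README.
function describeBlock(block) {
  // Assumption: blocks without a language identifier have no usable `lang`.
  const lang = block.lang || 'no language'
  const { file, line } = block.source
  return `${block.title} [${lang}] (${file}:${line})`
}

// Hand-written stand-in for one element of the array resolved by fromFile().
const sample = {
  type: 'code',
  lang: 'html',
  value: 'Hello, world!\n',
  info: { title: 'A dumb example', foo: 'bar', 'x.y.z': '1 2 3' },
  title: 'A dumb example',
  source: { file: 'README.md', line: 1 }
}

console.log(describeBlock(sample))
// A dumb example [html] (README.md:1)
```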
Because code blocks are often meaningless without at least some context, every
block parsed gets a title according to the following algorithm:
1. If a `title` key is found in the block's `info` object, use
   that.
1. Otherwise, find the previous heading in the markdown and use
   its text.
1. If two or more code blocks share the same heading, add a
   numeric suffix: `(2)` for the second, `(3)` for the third,
   and so on.
1. If no previous heading is found, provide a title that
   describes where it comes from, in the form:

   ```
   Code block {n} from {filename}:{line}
   ```

   Where `{n}` is the 1-based index of the code block in the
   parsed file, `{filename}` is the parsed file (or `buffer`),
   and `{line}` is the line at which the code block starts.
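The title rules above can be sketched as a single pass over the document's headings and code blocks. This is an illustrative approximation, not the library's implementation: `assignTitles` and its simplified input shape (`{ type, text }` for headings, `{ type, info, line }` for code blocks) are hypothetical. It also assumes that the suffix is separated from the heading by a space, and that blocks with an explicit `title` do not count toward a heading's numbering.

```js
// Hypothetical sketch of the title algorithm; not the library's own code.
// `nodes` is a simplified, document-ordered list of headings and code blocks.
function assignTitles(nodes, filename = 'buffer') {
  let heading = null     // text of the most recent heading, if any
  let headingCount = 0   // code blocks titled from the current heading
  let codeIndex = 0      // 1-based index of code blocks in the file
  const titles = []

  for (const node of nodes) {
    if (node.type === 'heading') {
      heading = node.text
      headingCount = 0
      continue
    }
    if (node.type !== 'code') continue

    codeIndex += 1
    const info = node.info || {}
    if (info.title) {
      // Rule 1: an explicit title wins.
      titles.push(info.title)
    } else if (heading !== null) {
      // Rules 2 and 3: use the heading text, suffixed from the second block on.
      headingCount += 1
      titles.push(headingCount === 1 ? heading : `${heading} (${headingCount})`)
    } else {
      // Rule 4: no heading yet, so describe where the block comes from.
      titles.push(`Code block ${codeIndex} from ${filename}:${node.line}`)
    }
  }
  return titles
}

const titles = assignTitles([
  { type: 'code', line: 1 },
  { type: 'heading', text: 'Usage' },
  { type: 'code', line: 5 },
  { type: 'code', line: 9 },
  { type: 'code', line: 12, info: { title: 'Custom' } },
  { type: 'code', line: 15 }
], 'README.md')
console.log(titles)
// [ 'Code block 1 from README.md:1', 'Usage', 'Usage (2)', 'Custom', 'Usage (3)' ]
```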
[fenced code blocks]: http://spec.commonmark.org/0.12/#fenced-code-blocks
[markdown]: https://en.wikipedia.org/wiki/Markdown
[unist]: https://unifiedjs.github.io/#syntax-tree
[mdast]: https://github.com/syntax-tree/mdast
[remark]: https://github.com/wooorm/remark