Simple Markdown Parser that returns an Abstract Syntax Tree (AST) with location information.

```bash
npm install @snapp-notes/markdown-parser
```

- 📝 Parse markdown into a structured AST
- 📍 Location tracking for every node
- 🎯 Support for common markdown elements:
  - Headers (H1-H6)
  - Code blocks with language specification
  - Bold text (`**` and `__`)
  - Italic text (`*` and `_`)
  - Inline links
  - List items
  - Plain text
- 🚀 Built with PEG.js/Peggy for reliable parsing
- 📦 ES Module support
- 💪 TypeScript definitions included

```javascript
import { parse } from '@snapp-notes/markdown-parser';

const markdown = '# Hello World\nThis is **bold** text.';
const ast = parse(markdown);
console.log(ast);
```

Output:

```javascript
[
  {
    type: 'header',
    content: '# Hello World',
    level: 1,
    loc: { start: { offset: 0, line: 1, column: 1 }, end: { ... } }
  },
  {
    type: 'text',
    content: '\n',
    loc: { ... }
  },
  {
    type: 'text',
    content: 'This is '
  },
  {
    type: 'bold',
    content: 'bold',
    loc: { ... }
  },
  {
    type: 'text',
    content: ' text.'
  }
]
```

```javascript
import { parse } from '@snapp-notes/markdown-parser';

const ast = parse('# H1\n## H2\n### H3');

// Each header node contains:
// - type: 'header'
// - content: full header text including # symbols
// - level: number (1-6)
// - loc: location information
```
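
Because header nodes carry both `level` and the raw `content`, an outline can be derived from the AST with a simple filter. The sketch below is illustrative only: `buildOutline` is a hypothetical helper, not part of the package, and the sample array is hand-written to match the node shape described above (with `loc` omitted) so the snippet runs without the parser installed.

```javascript
// Hand-written sample nodes shaped like the header nodes described above.
// Assumption: newlines between headers appear as separate text nodes.
const ast = [
  { type: 'header', content: '# H1', level: 1 },
  { type: 'text', content: '\n' },
  { type: 'header', content: '## H2', level: 2 },
  { type: 'text', content: '\n' },
  { type: 'header', content: '### H3', level: 3 },
];

// Hypothetical helper (not provided by the package).
function buildOutline(nodes) {
  return nodes
    .filter((node) => node.type === 'header')
    .map((node) => ({
      level: node.level,
      // content includes the leading # symbols, so strip them for display
      title: node.content.replace(/^#+\s*/, ''),
    }));
}

const outline = buildOutline(ast);

// Print an indented outline, two spaces per header level.
for (const { level, title } of outline) {
  console.log(`${'  '.repeat(level - 1)}- ${title}`);
}
```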
```javascript
import { parse } from '@snapp-notes/markdown-parser';

// Backticks inside the template literal are escaped
const markdown = `\`\`\`javascript
const greeting = "Hello";
console.log(greeting);
\`\`\``;

const ast = parse(markdown);

// Code node contains:
// - type: 'code'
// - content: code content (includes leading newline)
// - language: 'javascript' (or empty string if not specified)
// - loc: location information
```
```javascript
import { parse } from '@snapp-notes/markdown-parser';

// Bold text
parse('**bold text**'); // or '__bold text__'

// Italic text
parse('*italic text*'); // or '_italic text_'

// Mixed formatting
const ast = parse('This is **bold** and *italic* text');
```
```javascript
import { parse } from '@snapp-notes/markdown-parser';

const ast = parse('[Google](https://google.com)');

// Link node contains:
// - type: 'link'
// - text: 'Google'
// - url: 'https://google.com'
// - content: 'Google'
// - loc: location information
```
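
Since link nodes expose `text` and `url` as separate fields, collecting every link in a document is a matter of filtering by `type`. This is a minimal sketch under the node shape listed above: `collectLinks` is a hypothetical helper, and the sample array is written by hand rather than produced by `parse`.

```javascript
// Hand-written sample nodes; loc is omitted for brevity.
const ast = [
  { type: 'text', content: 'Search with ' },
  { type: 'link', text: 'Google', url: 'https://google.com', content: 'Google' },
  { type: 'text', content: '.' },
];

// Hypothetical helper (not provided by the package).
function collectLinks(nodes) {
  return nodes
    .filter((node) => node.type === 'link')
    .map(({ text, url }) => ({ text, url }));
}

const links = collectLinks(ast);
console.log(links);
```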
```javascript
import { parse } from '@snapp-notes/markdown-parser';

const markdown = `* Item 1
* Item 2
* Item 3`;

const ast = parse(markdown);

// List nodes contain:
// - type: 'list'
// - content: '* Item text'
// - loc: location information
```
```javascript
import { parse } from '@snapp-notes/markdown-parser';

const markdown = `# My Document
This is a paragraph with **bold** and *italic* text.
Visit my website for more info.
\`\`\`python
def hello():
    print("Hello, World!")
\`\`\`
* Feature 1
* Feature 2`;

const ast = parse(markdown);

// The AST will contain a mix of different node types
ast.forEach(node => {
  console.log(`${node.type}: ${node.content?.substring(0, 30)}...`);
});
```

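
A flat array of typed nodes lends itself to a `switch`-based renderer. The following is a rough sketch, not something the package ships: `renderNode` and `escapeHtml` are hypothetical helpers, the HTML mapping is one possible choice, and the sample nodes are hand-written in the documented shape so the snippet runs on its own.

```javascript
// Hand-written sample nodes; loc is omitted for brevity.
const ast = [
  { type: 'header', content: '# My Document', level: 1 },
  { type: 'text', content: '\n' },
  { type: 'text', content: 'This is ' },
  { type: 'bold', content: 'bold' },
  { type: 'text', content: ' text.' },
];

// Escape characters that are significant in HTML text and attributes.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical renderer: maps each documented node type to an HTML string.
function renderNode(node) {
  switch (node.type) {
    case 'header': {
      // Header content keeps its # prefix, so remove it before rendering.
      const title = escapeHtml(node.content.replace(/^#+\s*/, ''));
      return `<h${node.level}>${title}</h${node.level}>`;
    }
    case 'bold':
      return `<strong>${escapeHtml(node.content)}</strong>`;
    case 'italic':
      return `<em>${escapeHtml(node.content)}</em>`;
    case 'link':
      return `<a href="${escapeHtml(node.url)}">${escapeHtml(node.text)}</a>`;
    case 'code':
      // Code content starts with a newline, which is dropped here.
      return `<pre><code>${escapeHtml(node.content.replace(/^\n/, ''))}</code></pre>`;
    case 'list':
      // List content keeps its "* " marker.
      return `<li>${escapeHtml(node.content.replace(/^\*\s*/, ''))}</li>`;
    default:
      return escapeHtml(node.content);
  }
}

const html = ast.map(renderNode).join('');
console.log(html);
```

Unknown node types fall through to escaped plain text, which keeps the sketch tolerant of elements the parser does not recognise.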
Parses a markdown string and returns an array of AST nodes.

Parameters:
- `input` (string): The markdown text to parse
- `options` (optional): Parser options
  - `startRule` (optional): The grammar rule to start parsing from (default: `'start'`)

Returns: An array of `MarkdownNode` objects

Throws: `SyntaxError` if the input cannot be parsed
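
Because parsing can throw, callers may want to turn a failure into a value instead of letting it propagate. The sketch below shows one way to do that. `tryParse` is a hypothetical wrapper, and `failingParse` is a stand-in used only to make the snippet runnable. Reading `error.location` is an assumption based on how Peggy-generated parsers usually report errors, so the code treats it as optional.

```javascript
// Hypothetical wrapper; parseFn stands in for the package's parse function.
function tryParse(parseFn, input) {
  try {
    return { ok: true, ast: parseFn(input) };
  } catch (error) {
    // Only handle parse failures; rethrow anything unexpected.
    if (!error || error.name !== 'SyntaxError') throw error;
    // Assumption: the error may carry a Peggy-style location object.
    const start = error.location?.start;
    return {
      ok: false,
      message: error.message,
      line: start?.line ?? null,
      column: start?.column ?? null,
    };
  }
}

// Stand-in parser that always fails, used only for demonstration.
function failingParse() {
  const error = new SyntaxError('Expected end of input');
  error.location = { start: { offset: 3, line: 1, column: 4 } };
  throw error;
}

const result = tryParse(failingParse, 'some input');
console.log(
  result.ok
    ? result.ast
    : `Parse failed at ${result.line}:${result.column}: ${result.message}`
);
```

In real use, the imported `parse` function would be passed as `parseFn`.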
#### TextNode

```typescript
interface TextNode {
  type: 'text' | 'bold' | 'italic' | 'list';
  content: string;
  loc: Location;
}
```

Used for plain text, bold text, italic text, and list items.
#### HeaderNode

```typescript
interface HeaderNode {
  type: 'header';
  content: string;
  level: number; // 1-6
  loc: Location;
}
```

#### CodeNode

```typescript
interface CodeNode {
  type: 'code';
  content: string;
  language?: string;
  loc: Location;
}
```

Note: The content includes a leading newline character.
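
The leading newline usually needs to be removed before the code is displayed or executed. A minimal sketch, assuming a code node shaped as above: `codeText` and `codeLanguage` are hypothetical helpers, and the `'text'` fallback label is an arbitrary choice for illustration.

```javascript
// Hand-written sample node; loc is omitted for brevity.
const codeNode = {
  type: 'code',
  language: 'javascript',
  content: '\nconst greeting = "Hello";\nconsole.log(greeting);',
};

// Hypothetical helper: drop the single leading newline, if present.
function codeText(node) {
  return node.content.replace(/^\r?\n/, '');
}

// Hypothetical helper: language may be missing or an empty string.
function codeLanguage(node) {
  return node.language || 'text';
}

console.log(codeLanguage(codeNode));
console.log(codeText(codeNode));
```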
#### LinkNode

```typescript
interface LinkNode {
  type: 'link';
  text: string;
  url: string;
  content: string;
  loc: Location;
}
```

#### Location

```typescript
interface Location {
  start: Position;
  end: Position;
}

interface Position {
  offset: number; // Character offset from start
  line: number;   // Line number (1-based)
  column: number; // Column number (1-based)
}
```

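
The offsets in `loc` can be used to map a node back to the exact slice of the original markdown, for example to highlight it in an editor. The sketch below assumes that `end.offset` is exclusive, which is how Peggy's `location()` normally behaves but is not stated here. `sourceOf` and `describePosition` are hypothetical helpers, and the sample node is written by hand.

```javascript
const markdown = '# Hello World\nThis is **bold** text.';

// Hand-written header node; end values assume an exclusive end offset.
const headerNode = {
  type: 'header',
  content: '# Hello World',
  level: 1,
  loc: {
    start: { offset: 0, line: 1, column: 1 },
    end: { offset: 13, line: 1, column: 14 },
  },
};

// Hypothetical helper: return the source text a node was parsed from.
function sourceOf(source, node) {
  return source.slice(node.loc.start.offset, node.loc.end.offset);
}

// Hypothetical helper: format a node's start position as "line:column".
function describePosition(node) {
  const { line, column } = node.loc.start;
  return `${line}:${column}`;
}

console.log(sourceOf(markdown, headerNode));
console.log(describePosition(headerNode));
```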

| Element | Syntax | Example |
|---------|--------|---------|
| Header | `#` to `######` | `# Title` |
| Bold | `**text**` or `__text__` | `**bold**` |
| Italic | `*text*` or `_text_` | `*italic*` |
| Link | `[text](url)` | `[Google](https://google.com)` |
| Code Block | `` ```lang\ncode\n``` `` | `` ```js\ncode\n``` `` |
| List Item | `* item` | `* Item 1` |

- Nested formatting (e.g., bold within italic) is not fully supported
- Only unordered lists with `*` are supported
- No support for:
  - Blockquotes
  - Tables
  - Images
  - Horizontal rules
  - Strikethrough
  - Task lists

Generate the parser from the grammar file:

```bash
npm run build
```

Run the test suite:

```bash
npm test
```

Watch mode for development:

```bash
npm run test:watch
```

The parser is built using Peggy (formerly PEG.js). The grammar file is located at `src/grammar.peggy`.

To modify the parser, edit the grammar file and rebuild:

```bash
npm run build
```

Contributions are welcome! Please ensure all tests pass before submitting a pull request.

```bash
npm run build
npm test
```

Copyright (c) 2025 Jakub T. Jankiewicz
Released under the MIT license