htmljs-parser
=============

An HTML parser that recognizes content and string placeholders and allows JavaScript expressions as attribute values.
HTML parsers written according to the HTML spec interpret all
attribute values as strings, which makes it difficult to
describe a value's type (boolean, string, number, array, etc.)
or to provide a complex JavaScript expression as a value.
The ability to describe JavaScript expressions within attributes
is important for HTML-based template compilers.

For example, consider an HTML-based template that supports a custom tag
with an attribute named `message`, whose value can be either a string
literal or a JavaScript expression. Ideally, the template compiler
should be able to handle both forms.
This parser extends the HTML grammar to add these important features:

- JavaScript expressions as attribute values
- Placeholders in the content of an element
```html
Hello ${personName}
```
- Placeholders within attribute value strings
- JavaScript flow-control statements within HTML elements
- JavaScript flow-control statements as elements
# Installation

```bash
npm install htmljs-parser
```
# Usage

```javascript
var parser = require('htmljs-parser').createParser({
    onText: function(event) {
        // Text within an HTML element
        var value = event.value;
    },

    onPlaceholder: function(event) {
        //  ${<value>}
        // $!{<value>}
        var value = event.value; // String
        var escape = event.escape; // Boolean
        var withinBody = event.withinBody; // Boolean
        var withinAttribute = event.withinAttribute; // Boolean
        var withinString = event.withinString; // Boolean
        var withinOpenTag = event.withinOpenTag; // Boolean
        var pos = event.pos; // Integer
    },

    onString: function(event) {
        // Text within ""
        var value = event.value; // String
        var stringParts = event.stringParts; // Array
        var isStringLiteral = event.isStringLiteral; // Boolean
        var pos = event.pos; // Integer
    },

    onCDATA: function(event) {
        // <![CDATA[<value>]]>
        var value = event.value; // String
        var pos = event.pos; // Integer
    },

    onOpenTag: function(event) {
        var tagName = event.tagName; // String
        var attributes = event.attributes; // Array
        var argument = event.argument; // Object
        var pos = event.pos; // Integer
    },

    onCloseTag: function(event) {
        // close tag
        var tagName = event.tagName; // String
        var pos = event.pos; // Integer
    },

    onDocumentType: function(event) {
        // Document Type/DTD
        // <!<value>>
        // Example: <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN">
        var value = event.value; // String
        var pos = event.pos; // Integer
    },

    onDeclaration: function(event) {
        // Declaration
        // <?<value>?>
        // Example: <?xml version="1.0" encoding="UTF-8"?>
        var value = event.value; // String
        var pos = event.pos; // Integer
    },

    onComment: function(event) {
        // Text within XML comment
        var value = event.value; // String
        var pos = event.pos; // Integer
    },

    onScriptlet: function(event) {
        // Text within <% %>
        var value = event.value; // String
        var pos = event.pos; // Integer
    },

    onError: function(event) {
        // Error
        var message = event.message; // String
        var code = event.code; // String
        var pos = event.pos; // Integer
    }
});

// "str" holds the template source to parse
parser.parse(str);
```
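When experimenting with the parser it can help to see every event in the order it is emitted. The following is a minimal sketch and not part of `htmljs-parser`: `createRecorder` is a hypothetical helper that builds an options object with one handler per event name and stores each event in an array. The handlers are invoked by hand here so the snippet runs on its own; passing `recorder.options` to `createParser` appears only in a comment.

```javascript
// Hypothetical helper (not part of htmljs-parser): builds an options object
// whose handlers append every event they receive to a shared array.
function createRecorder() {
    var handlerNames = [
        'onText', 'onPlaceholder', 'onString', 'onCDATA', 'onOpenTag',
        'onCloseTag', 'onDocumentType', 'onDeclaration', 'onComment',
        'onScriptlet', 'onError'
    ];
    var events = [];
    var options = {};

    handlerNames.forEach(function(name) {
        options[name] = function(event) {
            events.push({ handler: name, event: event });
        };
    });

    return { options: options, events: events };
}

var recorder = createRecorder();

// With the real parser this would be:
//   require('htmljs-parser').createParser(recorder.options).parse(str);
// Here two handlers are called directly with hand-written events.
recorder.options.onOpenTag({ type: 'openTag', tagName: 'div', attributes: [] });
recorder.options.onText({ type: 'text', value: 'Simple text' });

var recordedHandlers = recorder.events.map(function(entry) {
    return entry.handler;
}).join(', ');

console.log(recordedHandlers); // prints "onOpenTag, onText"
```

Generating the handlers from a list keeps the recorder in step with the set of events, at the cost of treating every event the same way.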
## Content Parsing Modes

The parser, by default, will look for HTML tags within content. This behavior
might not be desirable for certain tags, so the parser allows the parsing mode
to be changed (usually in response to an `onOpenTag` event).

There are three content parsing modes:

- HTML Content (DEFAULT):
  The parser will look for any HTML tag and content placeholders while in
  this mode and parse opening and closing tags accordingly.

- Parsed Text Content: The parser will look for the closing tag that matches
  the current open tag as well as content placeholders but all other content
  will be interpreted as text.

- Static Text Content: The parser will look for the closing tag that matches
  the current open tag but all other content will be interpreted as raw text.

```javascript
var htmljs = require('htmljs-parser');
var parser = htmljs.createParser({
    onOpenTag: function(event) {
        // open tag
        switch(event.tagName) {
            case 'textarea':
                //fall through
            case 'script':
                //fall through
            case 'style':
                // parse the content within these tags but only
                // look for placeholders and the closing tag.
                parser.enterParsedTextContentState();
                break;
            case 'dummy':
                // treat content within <dummy>...</dummy> as raw
                // text and ignore other tags and placeholders
                parser.enterStaticTextContentState();
                break;
            default:
                // The parser will switch to HTML content parsing mode
                // if the parsing mode is not explicitly changed by
                // "onOpenTag" function.
        }
    }
});

parser.parse(str);
```
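If many tags need a non-default mode, the `switch` statement can be replaced by a lookup table. The sketch below is one possible arrangement under stated assumptions: `CONTENT_MODES` and `applyContentMode` are hypothetical names, and `fakeParser` is a stand-in object that only mimics the two mode-switching methods used above so that the snippet runs without the library.

```javascript
// Assumed mapping from tag name to content parsing mode, mirroring the
// switch statement above ('dummy' is the placeholder tag name used there).
var CONTENT_MODES = {
    textarea: 'parsed-text',
    script: 'parsed-text',
    style: 'parsed-text',
    dummy: 'static-text'
};

// Hypothetical helper: switches the parser's mode for the given tag name and
// returns the chosen mode ('html' means the default mode was left in place).
function applyContentMode(parser, tagName) {
    var known = Object.prototype.hasOwnProperty.call(CONTENT_MODES, tagName);
    var mode = known ? CONTENT_MODES[tagName] : 'html';

    if (mode === 'parsed-text') {
        parser.enterParsedTextContentState();
    } else if (mode === 'static-text') {
        parser.enterStaticTextContentState();
    }
    return mode;
}

// Stand-in for a real parser; it only records which method was called.
var modeCalls = [];
var fakeParser = {
    enterParsedTextContentState: function() { modeCalls.push('parsedText'); },
    enterStaticTextContentState: function() { modeCalls.push('staticText'); }
};

var scriptMode = applyContentMode(fakeParser, 'script');
var dummyMode = applyContentMode(fakeParser, 'dummy');
var divMode = applyContentMode(fakeParser, 'div');

console.log(scriptMode, dummyMode, divMode);
```

With a real parser, `applyContentMode(parser, event.tagName)` would be called from inside `onOpenTag`.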
# Parsing Events

The `htmljs-parser` is an event-based parser, which means that it emits
events as it parses the document. Events are emitted via calls
to `on<eventname>` functions, which are supplied as properties of the options
passed to `require('htmljs-parser').createParser(options)`.
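### onOpenTag

The `onOpenTag` function will be called each time an opening tag is
encountered.

EXAMPLE: Simple tag

INPUT:

```html
<div>
```

OUTPUT EVENT:

```javascript
{
    type: 'openTag',
    tagName: 'div',
    attributes: []
}
```

EXAMPLE: Tag with literal attribute values

INPUT:

```html
<div class="demo" disabled=false data-number=123>
```

OUTPUT EVENT:

```javascript
{
    type: 'openTag',
    tagName: 'div',
    attributes: [
        {
            name: 'class',
            value: '"demo"',
            literalValue: 'demo'
        },
        {
            name: 'disabled',
            value: 'false',
            literalValue: false
        },
        {
            name: 'data-number',
            value: '123',
            literalValue: 123
        }
    ]
}
```

EXAMPLE: Tag with expression attribute

INPUT:

```html
<div message="Hello "+data.name>
```

OUTPUT EVENT:

```javascript
{
    type: 'openTag',
    tagName: 'div',
    attributes: [
        {
            name: 'message',
            value: '"Hello "+data.name'
        }
    ]
}
```

EXAMPLE: Tag with an argument

INPUT:

```html
<for(var i = 0; i < 10; i++)>
```

OUTPUT EVENT:

```javascript
{
    type: 'openTag',
    tagName: 'for',
    argument: {
        value: 'var i = 0; i < 10; i++',
        pos: ... // Integer
    },
    attributes: []
}
```

EXAMPLE: Attribute with an argument

INPUT:

```html
<div if(x > y)>
```

OUTPUT EVENT:

```javascript
{
    type: 'openTag',
    tagName: 'div',
    attributes: [
        {
            name: 'if',
            argument: {
                value: 'x > y',
                pos: ... // Integer
            }
        }
    ]
}
```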
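A template compiler typically turns the attributes of an `openTag` event back into JavaScript source. The following is a minimal sketch of that step, not code from `htmljs-parser`: `attributeToSource` and `attributesToObjectSource` are hypothetical helpers, attributes that carry only an `argument` are not handled, and treating a value-less attribute as `true` is an assumption made for illustration.

```javascript
// Hypothetical helper: returns JavaScript source text for one attribute
// taken from an openTag event.
function attributeToSource(attr) {
    if (Object.prototype.hasOwnProperty.call(attr, 'literalValue')) {
        // Literal values are re-serialized so strings get consistent quoting.
        return JSON.stringify(attr.literalValue);
    }
    if (typeof attr.value === 'string') {
        // Anything else is treated as an expression and wrapped in parentheses.
        return '(' + attr.value + ')';
    }
    // Assumption: an attribute without a value is treated as boolean true.
    return 'true';
}

// Hypothetical helper: builds the source of an object literal holding all
// attributes of one tag.
function attributesToObjectSource(attributes) {
    var props = attributes.map(function(attr) {
        return JSON.stringify(attr.name) + ': ' + attributeToSource(attr);
    });
    return '{ ' + props.join(', ') + ' }';
}

var attrsSource = attributesToObjectSource([
    { name: 'class', value: '"demo"', literalValue: 'demo' },
    { name: 'disabled', value: 'false', literalValue: false },
    { name: 'message', value: '"Hello "+data.name' }
]);

console.log(attrsSource);
// prints { "class": "demo", "disabled": false, "message": ("Hello "+data.name) }
```

Checking for `literalValue` with `hasOwnProperty` rather than truthiness matters because `false` and `0` are valid literal values.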
### onCloseTag

The `onCloseTag` function will be called each time a closing tag is
encountered.

EXAMPLE: Simple close tag

INPUT:

```html
</div>
```

OUTPUT EVENT:

```javascript
{
    type: 'closeTag',
    tagName: 'div'
}
```
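Open and close tag events can be combined to keep track of the current nesting. The sketch below is illustrative only: `createTagTracker` is a hypothetical helper, and it assumes that every open tag event is eventually followed by a close tag event for the same tag, which the text above does not state. The handlers are called by hand so the snippet runs without the library.

```javascript
// Hypothetical helper: tracks the chain of currently open tags.
function createTagTracker() {
    var stack = [];
    return {
        onOpenTag: function(event) {
            stack.push(event.tagName);
        },
        onCloseTag: function(event) {
            // Assumption: close tags arrive in reverse order of open tags.
            stack.pop();
        },
        path: function() {
            return stack.join(' > ');
        }
    };
}

var tracker = createTagTracker();
tracker.onOpenTag({ type: 'openTag', tagName: 'div', attributes: [] });
tracker.onOpenTag({ type: 'openTag', tagName: 'span', attributes: [] });
var pathInsideSpan = tracker.path();

tracker.onCloseTag({ type: 'closeTag', tagName: 'span' });
var pathAfterSpan = tracker.path();

console.log(pathInsideSpan); // prints "div > span"
console.log(pathAfterSpan);  // prints "div"
```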
### onText

The `onText` function will be called each time textual data is
encountered within an element.

NOTE: Text within `<![CDATA[` `]]>` will be emitted via call
to `onCDATA`.

EXAMPLE

In the following example, the text sequence will be emitted as
a text event.

INPUT:

```html
Simple text
```

OUTPUT EVENT:

```javascript
{
    type: 'text',
    value: 'Simple text'
}
```
### onCDATA

The `onCDATA` function will be called when text within `<![CDATA[` `]]>`
is encountered.

EXAMPLE:

INPUT:

```html
<![CDATA[This is text]]>
```

OUTPUT EVENT:

```javascript
{
    type: 'cdata',
    value: 'This is text'
}
```
### onPlaceholder

The `onPlaceholder` function will be called each time a placeholder
is encountered.

If the placeholder starts with the `$!{` sequence then `event.escape`
will be `false`.

If the placeholder starts with the `${` sequence then `event.escape` will be
`true`.

Text within `<![CDATA[` `]]>` and `<!--` `-->` will not be parsed so you
cannot use placeholders for these blocks of code.

EXAMPLE:

INPUT:

```html
${"This is an escaped placeholder"}
$!{"This is a non-escaped placeholder"}
```

OUTPUT EVENTS

```html
${name}
```

```javascript
{
    type: 'placeholder',
    value: 'name',
    escape: true
}
```

--------

```html
$!{name}
```

```javascript
{
    type: 'placeholder',
    value: 'name',
    escape: false
}
```

NOTE: The `escape` flag is merely informational. The application code is responsible
for interpreting this flag to properly escape the expression.

Here's an example of modifying the expression based on the `event.escape` flag:

```javascript
onPlaceholder: function(event) {
    if (event.escape) {
        event.value = 'escapeXml(' + event.value + ')';
    }
}
```
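The handler above emits a call to `escapeXml` but does not define it. The following is a minimal sketch of what such a function could look like, together with the same rewrite applied to two sample events. Both `escapeXml` as written here and `placeholderToSource` are assumptions for illustration; the parser itself does not provide them.

```javascript
// Assumed implementation of the escapeXml helper named in the handler above.
function escapeXml(value) {
    return String(value)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;');
}

// Hypothetical helper: the same rewrite as the onPlaceholder handler above,
// but returning the new expression instead of assigning it to the event.
function placeholderToSource(event) {
    return event.escape ? 'escapeXml(' + event.value + ')' : event.value;
}

var escapedSource = placeholderToSource({ type: 'placeholder', value: 'name', escape: true });
var rawSource = placeholderToSource({ type: 'placeholder', value: 'name', escape: false });

// Evaluate the generated expressions with a sample value for "name".
var renderEscaped = new Function('escapeXml', 'name', 'return ' + escapedSource + ';');
var renderRaw = new Function('escapeXml', 'name', 'return ' + rawSource + ';');

console.log(renderEscaped(escapeXml, '<b>Tom & Jerry</b>'));
// prints &lt;b&gt;Tom &amp; Jerry&lt;/b&gt;
console.log(renderRaw(escapeXml, '<b>Tom & Jerry</b>'));
// prints <b>Tom & Jerry</b>
```

The ampersand is replaced first so that the entities produced by the later replacements are not escaped a second time.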
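### onDocumentType

The `onDocumentType` function will be called when the document type declaration
is encountered _anywhere_ in the content.

EXAMPLE:

INPUT:

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN">
```

OUTPUT EVENT:

```javascript
{
    type: 'documentType',
    value: 'DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN"'
}
```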
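### onDeclaration

The `onDeclaration` function will be called when an XML declaration
is encountered _anywhere_ in the content.

EXAMPLE:

INPUT:

```html
<?xml version="1.0" encoding="UTF-8"?>
```

OUTPUT EVENT:

```javascript
{
    type: 'declaration',
    value: 'xml version="1.0" encoding="UTF-8"'
}
```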
### onComment

The `onComment` function will be called when text within `<!--` `-->`
is encountered.

EXAMPLE:

INPUT:

```html
<!--This is a comment-->
```

OUTPUT EVENT:

```javascript
{
    type: 'comment',
    value: 'This is a comment'
}
```
### onScriptlet

The `onScriptlet` function will be called when text within `<%` `%>`
is encountered.

EXAMPLE:

INPUT:

```html
<% console.log("Hello World!"); %>
```

OUTPUT EVENT:

```javascript
{
    type: 'scriptlet',
    value: ' console.log("Hello World!"); '
}
```
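To show how text, placeholder, and scriptlet events might fit together in a template compiler, here is a minimal sketch that turns a hand-written list of events into a render function. `compileEvents` is a hypothetical name, the events are written out manually rather than produced by the parser, and the `escape` flag is deliberately ignored to keep the example short.

```javascript
// Hypothetical compiler step: converts events into a render function.
function compileEvents(events) {
    var lines = ['var out = "";'];

    events.forEach(function(event) {
        switch (event.type) {
            case 'text':
                // Text is appended as a string literal.
                lines.push('out += ' + JSON.stringify(event.value) + ';');
                break;
            case 'placeholder':
                // The placeholder expression is evaluated at render time.
                // The escape flag is ignored in this sketch.
                lines.push('out += String(' + event.value + ');');
                break;
            case 'scriptlet':
                // Scriptlet code is copied into the function body verbatim.
                lines.push(event.value);
                break;
        }
    });

    lines.push('return out;');
    return new Function('data', lines.join('\n'));
}

var render = compileEvents([
    { type: 'scriptlet', value: ' var greeting = "Hello"; ' },
    { type: 'placeholder', value: 'greeting', escape: false },
    { type: 'text', value: ', ' },
    { type: 'placeholder', value: 'data.name', escape: false },
    { type: 'text', value: '!' }
]);

console.log(render({ name: 'World' })); // prints "Hello, World!"
```

Because scriptlet code is copied verbatim, a variable declared in a scriptlet is visible to later placeholders in the same template.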
### onError

The `onError` function will be called when malformed content is detected.
The most common cause of an error is reaching the end of the
input while still parsing an open tag, close tag, XML comment, CDATA section,
DTD, XML declaration, or placeholder.

Possible error codes:

- `MISSING_END_TAG`
- `MISSING_END_DELIMITER`
- `MALFORMED_OPEN_TAG`
- `MALFORMED_CLOSE_TAG`
- `MALFORMED_CDATA`
- `MALFORMED_PLACEHOLDER`
- `MALFORMED_DOCUMENT_TYPE`
- `MALFORMED_DECLARATION`
- `MALFORMED_COMMENT`
- `EXTRA_CLOSING_TAG`
- `MISMATCHED_CLOSING_TAG`
- ...

EXAMPLE: Input that ends while an open tag is still being parsed

OUTPUT EVENT:

```javascript
{
    type: 'error',
    code: 'MALFORMED_OPEN_TAG',
    message: 'EOF reached while parsing open tag.',
    pos: 0,
    endPos: 9
}
```
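Error events carry a `pos` value, which is more useful to a template author once it is translated into a line and column. The sketch below assumes that `pos` is a zero-based character offset into the input string; the text above does not confirm this. `positionToLineColumn` and `formatError` are hypothetical helpers, and the sample source and event are made up for the demonstration.

```javascript
// Hypothetical helper: converts a zero-based offset (assumed meaning of
// event.pos) into a one-based line and column.
function positionToLineColumn(source, pos) {
    var line = 1;
    var lineStart = 0;

    for (var i = 0; i < pos && i < source.length; i++) {
        if (source.charAt(i) === '\n') {
            line++;
            lineStart = i + 1;
        }
    }
    return { line: line, column: pos - lineStart + 1 };
}

// Hypothetical helper: builds a readable message from an error event.
function formatError(source, event) {
    var where = positionToLineColumn(source, event.pos);
    return event.code + ' at line ' + where.line + ', column ' +
        where.column + ': ' + event.message;
}

// Made-up input and event: the unfinished tag starts at offset 11.
var sampleSource = 'first line\n<div';
var sampleError = {
    type: 'error',
    code: 'MALFORMED_OPEN_TAG',
    message: 'EOF reached while parsing open tag.',
    pos: 11
};

console.log(formatError(sampleSource, sampleError));
// prints MALFORMED_OPEN_TAG at line 2, column 1: EOF reached while parsing open tag.
```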