Event-driven XML parser in TypeScript
npm install sax-ts
Simple API for XML in TypeScript

A SAX-style parser for XML and HTML.
Designed with Deno in mind, so it is also browser compatible.
- A very simple tool to parse through an XML string.
- A handy way to deal with RSS and other mostly-ok-but-kinda-broken XML
docs.
- A perfect way to parse 80 GB of XML data without burning your laptop :)
```typescript
import { SAXParser } from 'https://unpkg.com/sax-ts@1.2.8/src/sax.ts';
// for semver, use "@%5E1.2.8", which is the urlencoded version of "@^1.2.8"

const strict: boolean = true; // change to false for HTML parsing
const options: {} = {}; // refer to "Arguments" section
const parser = new SAXParser(strict, options);

parser.onerror = function (e) {
  // an error happened.
  console.error(e);
};
parser.ontext = function (t) {
  // got some text. t is the string of text.
  console.log('onText: ', t);
};
parser.onopentag = function (node) {
  // opened a tag. node has "name" and "attributes"
  console.log('onOpenTag: ', node);
};
parser.onattribute = function (attr) {
  // an attribute. attr has "name" and "value"
  console.log('onAttribute: ', attr);
};
parser.onend = function () {
  // parser stream is done, and ready to have more stuff written to it.
  console.warn('end of XML');
};

parser.write('Hello, world ! ').close();
```
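Because write can be called repeatedly before close, a large input does not have to be handed over in one piece. The following is a minimal sketch of chunked feeding. It assumes only that the parser exposes write and close as described in this README; ChunkWritable, writeInChunks and the 64 KiB default chunk size are hypothetical names and values chosen for illustration, not part of sax-ts.

```typescript
// Hypothetical minimal shape: only the two methods this sketch relies on.
interface ChunkWritable {
  write(chunk: string): unknown;
  close(): unknown;
}

// Feed `xml` to the parser in pieces of at most `chunkSize` characters,
// then close the stream. Returns the number of write() calls made.
function writeInChunks(
  parser: ChunkWritable,
  xml: string,
  chunkSize: number = 64 * 1024,
): number {
  if (chunkSize <= 0) {
    throw new RangeError('chunkSize must be positive');
  }
  let chunks = 0;
  for (let offset = 0; offset < xml.length; offset += chunkSize) {
    parser.write(xml.slice(offset, offset + chunkSize));
    chunks++;
  }
  parser.close();
  return chunks;
}
```

With the parser from the example above, this would presumably be called as writeInChunks(parser, hugeXmlString). For data that does not fit in memory, the same loop shape applies to chunks read from a file or network stream.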
Arguments
Pass the following arguments to the parser function. All are optional.
strict - Boolean. Disables "forgiving" mode. Default: false.
options - Object bag of settings regarding string formatting. All default to
false.
Settings supported:
- trim - Boolean. Whether or not to trim text and comment nodes.
- normalize - Boolean. If true, then turn any whitespace into a single
space.
- lowercase - Boolean. If true, then lowercase tag names and attribute names
in loose mode, rather than uppercasing them.
- xmlns - Boolean. If true, then namespaces are supported.
- position - Boolean. If false, then don't track line/col/position.
- strictEntities - Boolean. If true, only parse predefined XML
entities (&amp;, &apos;, &gt;, &lt;, and &quot;)

Methods
write - Write bytes onto the stream. You don't have to do this all at
once. You can keep writing as much as you want.
close - Close the stream. Once closed, no more data may be written until
it is done processing the buffer, which is signaled by the end event.
resume - To gracefully handle errors, assign a listener to the error
event. Then, when the error is taken care of, you can call resume to
continue parsing. Otherwise, the parser will not continue while in an error
state.

Events
All events emit with a single argument. To listen to an event, assign a
function to on<eventname>. Functions get executed in the this-context of
the parser object. The list of supported events is also in the exported
EVENTS array.

error - Indication that something bad happened. The error will be hanging
out on parser.error, and must be deleted before parsing can continue. By
listening to this event, you can keep an eye on that kind of stuff. Note:
this happens much more in strict mode. Argument: instance of Error.
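The error and resume descriptions can be combined into a small recovery pattern: record each error, then call resume so parsing continues. This is a sketch under assumptions. ResumableParser and collectErrors are hypothetical names, and the handler signature is inferred from the "Argument: instance of Error" note rather than taken from the sax-ts type definitions. It does not touch parser.error directly.

```typescript
// Hypothetical minimal shape: an assignable error handler plus resume().
interface ResumableParser {
  onerror?: (e: Error) => void;
  resume(): unknown;
}

// Install an error listener that records every error and then resumes,
// so one malformed fragment does not stop the rest of the parse.
function collectErrors(parser: ResumableParser): Error[] {
  const errors: Error[] = [];
  parser.onerror = (e: Error) => {
    errors.push(e);
    parser.resume();
  };
  return errors;
}
```

After parsing finishes, the returned array holds every error that was reported, which is handy for the mostly-ok-but-kinda-broken documents mentioned at the top. Whether resuming is appropriate depends on how much you trust the remaining input.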
```javascript
//TODO: currently error is protected, need to expose it to user somehow.
```
text - Text node. Argument: string of text.
doctype - The <!DOCTYPE declaration. Argument: doctype string.
processinginstruction - Stuff like <?xml foo="blah"?>. Argument:
object with name and body members. Attributes are not parsed, as
processing instructions have implementation dependent semantics.
sgmldeclaration - Random SGML declarations. Stuff like <!ENTITY p>
would trigger this kind of event. This is a weird thing to support, so it
might go away at some point. SAX isn't intended to be used to parse SGML,
after all.
opentagstart - Emitted immediately when the tag name is available,
but before any attributes are encountered. Argument: object with a name
field and an empty attributes set. Note that this is the
same object that will later be emitted in the opentag event.
opentag - An opening tag. Argument: object with name and attributes.
In non-strict mode, tag names are uppercased, unless the lowercase
option is set. If the xmlns option is set, then it will contain
namespace binding information on the ns member, and will have a
local, prefix, and uri member.
closetag - A closing tag. In loose mode, tags are auto-closed if their
parent closes. In strict mode, well-formedness is enforced. Note that
self-closing tags will have closeTag emitted immediately after openTag.
Argument: tag name.
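Since every opentag is eventually matched by a closetag (self-closing tags included), the two events are enough to keep track of where the parser is in the document. The sketch below maintains a stack of open tag names and records the path of each element as it opens. TagNode, TagEvents and trackPaths are hypothetical names, and the handler signatures are assumptions based on the argument descriptions above.

```typescript
// Hypothetical shapes inferred from the opentag/closetag descriptions.
interface TagNode {
  name: string;
  attributes: Record<string, unknown>;
}
interface TagEvents {
  onopentag?: (node: TagNode) => void;
  onclosetag?: (name: string) => void;
}

// Record a slash-separated path for every element, in document order.
function trackPaths(parser: TagEvents): string[] {
  const stack: string[] = []; // names of currently open elements
  const paths: string[] = [];
  parser.onopentag = (node) => {
    stack.push(node.name);
    paths.push('/' + stack.join('/'));
  };
  parser.onclosetag = () => {
    // Each close event matches the most recently opened element.
    stack.pop();
  };
  return paths;
}
```

The returned array fills in as the parser runs, so it should be read after the end event. In non-strict mode the recorded names would be uppercased unless the lowercase option is set.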
attribute - An attribute node. Argument: object with name and value.
In non-strict mode, attribute names are in upper-case, unless the lowercase
option is set. If the xmlns option is set, it will also contain namespace
information.
comment - A comment node. Argument: the string of the comment.
opencdata - The opening tag of a <![CDATA[ block.
cdata - The text of a <![CDATA[ block. Since <![CDATA[ blocks can get
quite large, this event may fire multiple times for a single block, if it
is broken up into multiple write()s. Argument: the string of random
character data.
closecdata - The closing tag (]]>) of a <![CDATA[ block.
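Because cdata may fire several times for one block, code that wants the whole block has to buffer the pieces between opencdata and closecdata. A minimal sketch of that buffering follows. CdataEvents, collectCdata and the onBlock callback are hypothetical names, and the handler signatures are assumptions based on the event descriptions above.

```typescript
// Hypothetical shape covering the three CDATA-related handlers.
interface CdataEvents {
  onopencdata?: () => void;
  oncdata?: (chunk: string) => void;
  onclosecdata?: () => void;
}

// Buffer cdata chunks and hand each complete block to `onBlock`.
function collectCdata(parser: CdataEvents, onBlock: (text: string) => void): void {
  let parts: string[] = [];
  parser.onopencdata = () => {
    parts = []; // start a fresh buffer for the new block
  };
  parser.oncdata = (chunk) => {
    parts.push(chunk);
  };
  parser.onclosecdata = () => {
    onBlock(parts.join(''));
    parts = [];
  };
}
```

Collecting chunks in an array and joining once avoids repeated string concatenation for large blocks. For blocks too large to hold in memory, processing each chunk directly in the cdata handler is the safer choice.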
opennamespace - If the xmlns option is set, then this event will
signal the start of a new namespace binding.
closenamespace - If the xmlns option is set, then this event will
signal the end of a namespace binding.
end - Indication that the closed stream has ended.
ready - Indication that the stream has reset, and is ready to be written
to.
noscript - In non-strict mode,