A fast and lightweight streaming RDFa parser


A fast and lightweight _streaming_ and 100% _spec-compliant_ RDFa 1.1 parser,
with RDFJS representations of RDF terms, quads and triples.
The streaming nature allows triples to be emitted _as soon as possible_, and documents _larger than memory_ to be parsed.
```bash
$ npm install rdfa-streaming-parser
```

or

```bash
$ yarn add rdfa-streaming-parser
```

This package also works out-of-the-box in browsers via tools such as webpack and browserify.

```javascript
import {RdfaParser} from "rdfa-streaming-parser";
```

_or_

```javascript
const RdfaParser = require("rdfa-streaming-parser").RdfaParser;
```

`RdfaParser` is a Node Transform stream
that takes in chunks of RDFa data
and outputs RDFJS-compliant quads.
You can pipe streams into it,
or write strings into the parser directly.

While not required, it is advisable to specify the profile of the parser
by supplying a `contentType` or `profile` constructor option.

```javascript
const fs = require('fs');

const myParser = new RdfaParser({ baseIRI: 'https://www.rubensworks.net/', contentType: 'text/html' });

// Pipe the HTML file into the parser and log every emitted quad
fs.createReadStream('index.html')
  .pipe(myParser)
  .on('data', console.log)
  .on('error', console.error)
  .on('end', () => console.log('All triples were parsed!'));
```

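Each `data` event in the example above carries one RDFJS quad, whose `subject`, `predicate` and `object` terms expose a `termType` and a `value`. As a rough sketch of printing quads more readably than `console.log` does, the hypothetical helpers `formatTerm` and `formatQuad` below (not part of this library) render a quad in a simplified, N-Triples-like form. String escaping is only approximated with `JSON.stringify`, and literal datatypes are ignored.

```javascript
// Hypothetical helper: renders a single RDFJS term as a short string.
function formatTerm(term) {
  switch (term.termType) {
    case 'NamedNode':
      return `<${term.value}>`;
    case 'BlankNode':
      return `_:${term.value}`;
    case 'Literal':
      // Approximate escaping; append the language tag when one is set
      return JSON.stringify(term.value) + (term.language ? `@${term.language}` : '');
    default:
      return term.value;
  }
}

// Hypothetical helper: renders subject, predicate and object on one line.
function formatQuad(quad) {
  return [quad.subject, quad.predicate, quad.object].map(formatTerm).join(' ') + ' .';
}
```

With these in place, the `data` handler could become `.on('data', (quad) => console.log(formatQuad(quad)))`.
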
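To write strings into the parser directly instead:

```javascript
const myParser = new RdfaParser({ baseIRI: 'https://www.rubensworks.net/', contentType: 'text/html' });

myParser
  .on('data', console.log)
  .on('error', console.error)
  .on('end', () => console.log('All triples were parsed!'));

// Illustrative HTML+RDFa markup, written chunk by chunk
myParser.write('<!DOCTYPE html>');
myParser.write(`<html>
<head prefix="foaf: http://xmlns.com/foaf/0.1/">`);
myParser.write(`<link rel="foaf:primaryTopic foaf:maker" href="https://www.rubensworks.net/#me" />`);
myParser.write(`</head>`);
myParser.write(`<body>`);
myParser.write(`</body>`);
myParser.write(`</html>`);
myParser.end();
```
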
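When the quads are needed as a whole rather than one at a time, the same `data`, `error` and `end` events can be gathered into an array. The following is a minimal sketch, assuming a hypothetical helper named `collectQuads` (not part of this library) that works on any readable stream. Note that it buffers everything in memory, so it gives up the streaming benefit for very large documents.

```javascript
// Hypothetical helper: resolves with every item the stream emitted,
// or rejects on the first error.
function collectQuads(quadStream) {
  return new Promise((resolve, reject) => {
    const quads = [];
    quadStream
      .on('data', (quad) => quads.push(quad))
      .on('error', reject)
      .on('end', () => resolve(quads));
  });
}
```

A possible use would be `collectQuads(fs.createReadStream('index.html').pipe(myParser)).then((quads) => console.log(quads.length));`.
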
This parser implements the RDFJS Sink interface,
which makes it possible to alternatively parse streams using the
`import` method.

```javascript
const myParser = new RdfaParser({ baseIRI: 'https://www.rubensworks.net/', contentType: 'text/html' });

const myTextStream = fs.createReadStream('index.html');

myParser.import(myTextStream)
  .on('data', console.log)
  .on('error', console.error)
  .on('end', () => console.log('All triples were parsed!'));
```

Configuration

Optionally, the following parameters can be set in the `RdfaParser` constructor:

* `dataFactory`: A custom RDFJS DataFactory to construct terms and triples. _(Default: `require('@rdfjs/data-model')`)_
* `baseIRI`: An initial default base IRI. _(Default: `''`)_
* `language`: A default language for string literals. _(Default: `''`)_
* `vocab`: The initial vocabulary. _(Default: `''`)_
* `defaultGraph`: The default graph for constructing quads. _(Default: `defaultGraph()`)_
* `features`: A hash of features that should be enabled. Defaults to the features defined by the profile. _(Default: all features enabled)_
* `profile`: The RDFa profile to use. _(Default: profile with all features enabled)_
* `contentType`: The content type of the document that should be parsed. This can be used as an alternative to the `profile` option. _(Default: profile with all features enabled)_
* `htmlParseListener`: An optional listener for the internal HTML parse events, which should implement `IHtmlParseListener`. _(Default: `null`)_

```javascript
new RdfaParser({
  dataFactory: require('@rdfjs/data-model'),
  baseIRI: 'http://example.org/',
  language: 'en-us',
  vocab: 'http://example.org/myvocab',
  defaultGraph: namedNode('http://example.org/graph'),
  features: { langAttribute: true },
  profile: 'html',
  htmlParseListener: new MyHtmlListener(),
});
```

On top of RDFa Core 1.1, there are a few RDFa variants that add specific sets of rules,
which are all supported in this library:

* HTML+RDFa 1.1: Internally identified as the `'html'` profile with `'text/html'` as content type.
* XHTML+RDFa 1.1: Internally identified as the `'xhtml'` profile with `'application/xhtml+xml'` as content type.
* SVG Tiny 1.2: Internally identified as the `'xml'` profile with `'application/xml'`, `'text/xml'` and `'image/svg+xml'` as content types.

This library offers three different ways to define the RDFa profile or set features:

* Content type: Passing a content type such as `'text/html'` to the `contentType` option in the constructor.
* Profile string: Passing `''`, `'core'`, `'html'`, `'xhtml'` or `'xml'` to the `profile` option in the constructor.
* Features object: A custom combination of features can be defined by passing a `features` option in the constructor.

The table below lists all possible RDFa features and in what profile they are available:

| Feature | Core | HTML | XHTML | XML | Description |
| -------------------------------- | ---- |----- | ----- | --- | ----------- |
| baseTag | | ✓ | ✓ | | If the baseIRI can be set via the `<base>` tag. |
| xmlBase | | | | ✓ | If the baseIRI can be set via the xml:base attribute. |
| langAttribute | | ✓ | ✓ | ✓ | If the language can be set via the language attribute. |
| onlyAllowUriRelRevIfProperty | ✓ | ✓ | ✓ | | If non-CURIE and non-URI rel and rev have to be ignored if property is present. |
| inheritSubjectInHeadBody | | ✓ | ✓ | | If the new subject can be inherited from the parent object when we're inside `<head>` or `<body>` and the resource defines no new subject. |
| datetimeAttribute | | ✓ | ✓ | ✓ | If the datetime attribute must be interpreted as datetimes. |
| timeTag | | ✓ | ✓ | ✓ | If the