Generator function that tokenizes a string based on the JSON format.
`npm install @johntalton/json-tokenizer`
- Uses a Generator-based API
- Produces tokens for all input text (including error tokens)
- Accepts an AbortSignal to control termination
- Makes a best effort to match `JSON.parse` restrictions
- Reports start and end positions for errors
Basic initialization and iteration:
```js
import { JSONTokenizer } from '@johntalton/json-tokenizer'

// Signal that aborts after 100 ms, used to bound tokenization
const signal = AbortSignal.timeout(100)
const text = '{ }'

for(const token of JSONTokenizer.tokenize(text, { signal })) {
	const { type, value } = token
	// ...
}
```
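Because the tokenizer is a plain generator, any helper that accepts an iterable can consume it. The following is a minimal sketch, not part of the package: `countTokenTypes` and `standInTokens` are hypothetical names, and the stand-in generator only mimics the `{ type, value }` token shape (with type names borrowed from the token stream example in this README) so the snippet runs without the package installed.

```js
// Hypothetical helper: tallies how many tokens of each type an iterable yields.
function countTokenTypes(tokens) {
	const counts = new Map()
	for(const { type } of tokens) {
		counts.set(type, (counts.get(type) ?? 0) + 1)
	}
	return counts
}

// Stand-in generator (an assumption, for illustration only).
// With the package installed, pass JSONTokenizer.tokenize(text, { signal }) instead.
function* standInTokens() {
	yield { type: 'object-open', value: '{' }
	yield { type: 'whitespace', value: ' ' }
	yield { type: 'object-close', value: '}' }
	yield { type: 'eof', value: '' }
}

const counts = countTokenTypes(standInTokens())
console.log(Object.fromEntries(counts))
```

Keeping the helper independent of the tokenizer means it can be pointed at the real stream or at fixtures in a test.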
The following shows the token stream produced from a valid JSON text:
```js
import { JSONTokenizer } from '@johntalton/json-tokenizer'

const text = JSON.stringify({
	team: 'Mystery Inc',
	members: [ 'Fred', 'Daphne', 'Velma', 'Shaggy', 'Scooby' ],
	aired: 1969
}, undefined, '\t')

const stream = JSONTokenizer.tokenize(text)

for(const token of stream) {
	console.log(token)
}
/*
{ type: 'object-open', value: '{', start: 0, end: 0 }
{ type: 'whitespace', value: '\n\t', start: 1, end: 2 }
{ type: 'open-key-quote', value: '"', start: 3, end: 3 }
{ type: 'key', value: 'team', start: 4, end: 7 }
{ type: 'close-key-quote', value: '"', start: 8, end: 8 }
{ type: 'colon', value: ':', start: 9, end: 9 }
{ type: 'whitespace', value: ' ', start: 10, end: 10 }
{ type: 'open-string-quote', value: '"', start: 11, end: 11 }
{ type: 'string', value: 'Mystery Inc', start: 12, end: 22 }
{ type: 'close-string-quote', value: '"', start: 23, end: 23 }
{ type: 'object-member-comma', value: ',', start: 24, end: 24 }
{ type: 'whitespace', value: '\n\t', start: 25, end: 26 }
{ type: 'open-key-quote', value: '"', start: 27, end: 27 }
{ type: 'key', value: 'members', start: 28, end: 34 }
{ type: 'close-key-quote', value: '"', start: 35, end: 35 }
{ type: 'colon', value: ':', start: 36, end: 36 }
{ type: 'whitespace', value: ' ', start: 37, end: 37 }
{ type: 'array-open', value: '[', start: 38, end: 38 }
{ type: 'whitespace', value: '\n\t\t', start: 39, end: 41 }
{ type: 'open-string-quote', value: '"', start: 42, end: 42 }
{ type: 'string', value: 'Fred', start: 43, end: 46 }
{ type: 'close-string-quote', value: '"', start: 47, end: 47 }
{ type: 'array-element-comma', value: ',', start: 48, end: 48 }
{ type: 'whitespace', value: '\n\t\t', start: 49, end: 51 }
{ type: 'open-string-quote', value: '"', start: 52, end: 52 }
{ type: 'string', value: 'Daphne', start: 53, end: 58 }
{ type: 'close-string-quote', value: '"', start: 59, end: 59 }
{ type: 'array-element-comma', value: ',', start: 60, end: 60 }
{ type: 'whitespace', value: '\n\t\t', start: 61, end: 63 }
{ type: 'open-string-quote', value: '"', start: 64, end: 64 }
{ type: 'string', value: 'Velma', start: 65, end: 69 }
{ type: 'close-string-quote', value: '"', start: 70, end: 70 }
{ type: 'array-element-comma', value: ',', start: 71, end: 71 }
{ type: 'whitespace', value: '\n\t\t', start: 72, end: 74 }
{ type: 'open-string-quote', value: '"', start: 75, end: 75 }
{ type: 'string', value: 'Shaggy', start: 76, end: 81 }
{ type: 'close-string-quote', value: '"', start: 82, end: 82 }
{ type: 'array-element-comma', value: ',', start: 83, end: 83 }
{ type: 'whitespace', value: '\n\t\t', start: 84, end: 86 }
{ type: 'open-string-quote', value: '"', start: 87, end: 87 }
{ type: 'string', value: 'Scooby', start: 88, end: 93 }
{ type: 'close-string-quote', value: '"', start: 94, end: 94 }
{ type: 'whitespace', value: '\n\t', start: 95, end: 96 }
{ type: 'array-close', value: ']', start: 97, end: 97 }
{ type: 'object-member-comma', value: ',', start: 98, end: 98 }
{ type: 'whitespace', value: '\n\t', start: 99, end: 100 }
{ type: 'open-key-quote', value: '"', start: 101, end: 101 }
{ type: 'key', value: 'aired', start: 102, end: 106 }
{ type: 'close-key-quote', value: '"', start: 107, end: 107 }
{ type: 'colon', value: ':', start: 108, end: 108 }
{ type: 'whitespace', value: ' ', start: 109, end: 109 }
{ type: 'number', value: '1969', start: 110, end: 113 }
{ type: 'whitespace', value: '\n', start: 114, end: 114 }
{ type: 'object-close', value: '}', start: 115, end: 115 }
{ type: 'eof', value: '', start: null, end: 116 }
*/
```
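In the output above, every character of the input lands in some token and `end` is inclusive, so the token values can be joined back into the original text and the positions checked for gaps. The sketch below illustrates this on the first few tokens copied from that output. `joinTokenValues` and `hasContiguousPositions` are hypothetical helpers, not part of the package, and the length check assumes plain ASCII input like the sample.

```js
// First five tokens copied from the stream shown above.
const sampleTokens = [
	{ type: 'object-open', value: '{', start: 0, end: 0 },
	{ type: 'whitespace', value: '\n\t', start: 1, end: 2 },
	{ type: 'open-key-quote', value: '"', start: 3, end: 3 },
	{ type: 'key', value: 'team', start: 4, end: 7 },
	{ type: 'close-key-quote', value: '"', start: 8, end: 8 }
]

// Hypothetical helper: concatenates token values back into text.
function joinTokenValues(tokens) {
	let text = ''
	for(const { value } of tokens) {
		text += value
	}
	return text
}

// Hypothetical helper: checks that the inclusive [start, end] ranges
// begin at 0, match each value's length, and leave no gaps or overlaps.
function hasContiguousPositions(tokens) {
	let expectedStart = 0
	for(const { type, value, start, end } of tokens) {
		if(type === 'eof') { continue } // eof carries no start in the output above
		if(start !== expectedStart) { return false }
		if(end - start + 1 !== value.length) { return false }
		expectedStart = end + 1
	}
	return true
}

console.log(JSON.stringify(joinTokenValues(sampleTokens)))
console.log(hasContiguousPositions(sampleTokens))
```

A check like this may be handy when building an editor or highlighter on top of the tokens, where a dropped character would shift every later position.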
A simple set of tests for coverage exists within the repo.
For more complete and varied validation against in-the-wild JSON, the tokenizer has been tested against the following:
- https://github.com/nst/JSONTestSuite
- https://github.com/nlohmann/json_test_data
- https://github.com/open-source-parsers/jsoncpp
- any others that can be found :P