A Node.js module for creating buffers using tagged template strings.
```
npm install bintag
```
```js
var bintag = require('bintag');
```
This module uses the tagged template string syntax to fill buffers with
structured binary data. This is primarily useful in
unit tests for code that deals with binary data formats or network protocols.
The simplest example:
```js
bintag`i4: 1 2 -10 0xaabbccdd`
// =
```
The expression evaluates to a new Buffer instance.
Here `i4:` is a _format specifier_, which means that the numbers following it
are formatted as 4-byte integers.
Note that template strings can be multi-line:
```js
bintag`i1:
0x12 0x34
0x56 0x78`
// =
```
- Substitutions
  - Substitutions as values
  - Substitutions in other positions
- Format specifiers
  - Integers
  - Hexadecimal data
  - Floating point
  - Strings
- Endianness
- Shortcut tags
- Groups
- Repeat count
- Alignment
- Padding
- Offset expressions
- Length expressions
- Base for offset calculations
- Compiled templates
The template string syntax allows substitutions: ${_expression_}. With `bintag`,
you can use substitutions as values to be formatted as well as in
other syntactic constructs.
A substitution expression can evaluate to a single value:
```js
let n = 10;
bintag`i4: ${n}`
// =
```
It can evaluate to an array of values:
```js
let array = [1, 2, 3];
bintag`i1: ${array}`
// =
```
Nested arrays will be flattened:
```js
let array = [1, [2], 3, [[4]], [5, 6], 7];
bintag`i1: ${array}`
// =
```
Anything iterable, such as a Set or a generator, will be treated the same way
as an array:
```js
function* gen(){
  yield 1;
  yield 2;
  yield [3, 4];
}
bintag`i1: ${gen()}`
// =
```
Unless the iterable is a generator, it must produce stable results on two
subsequent traversals, or the result will be undefined.
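
The flattening rule described above can be sketched in plain JavaScript. This is only an illustration of the documented behaviour, not bintag's actual code: the helper name `flattenValues` is made up, and treating strings and Buffers as single values rather than iterating over them is an assumption.

```js
// Hypothetical helper: recursively flattens arrays and other iterables.
// Strings and Buffers are iterable too, but are assumed to be single values.
function* flattenValues(value) {
  const isIterable =
    value != null &&
    typeof value !== 'string' &&
    !Buffer.isBuffer(value) &&
    typeof value[Symbol.iterator] === 'function';
  if (isIterable) {
    for (const item of value) {
      yield* flattenValues(item);
    }
  } else {
    yield value;
  }
}

function* gen() {
  yield 1;
  yield 2;
  yield [3, 4];
}

console.log([...flattenValues([1, [2], 3, [[4]], [5, 6], 7])]); // [ 1, 2, 3, 4, 5, 6, 7 ]
console.log([...flattenValues(gen())]);                         // [ 1, 2, 3, 4 ]
```

A generator can only be traversed once, which may be why the text above singles generators out from other iterables.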
A substitution expression can also evaluate to a Buffer:
```js
let buf = bintag`i1: 0xaa 0xbb`;
bintag`${buf} i2: 2 ${buf}`
// =
```
Arrays and nested arrays of buffers are also supported. You can even mix
buffers with immediate values in the same array. An empty array produces no
buffer content.
Finally, a substitution expression can evaluate to a compiled bintag template
(see below).
A substitution is generally allowed wherever the syntax expects an integer,
such as in a format specifier:
```js
let n=2;
bintag`i${n}: 1`
// =
```
When used in this way, the substitution expression must evaluate to an integer,
or to something that can be converted to an integer. (`1.6` will happily become
`1`, but `{}` becomes `NaN` and will be rejected.)
A substitution cannot be used in place of parts of the syntax that are not
numbers, such as the letter of a format specifier.
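
The conversion rule can be illustrated with a small sketch. The helper name `toInteger` is hypothetical, and the use of `Number()` followed by truncation toward zero is an assumption; the text only states that `1.6` becomes `1` and that `{}` is rejected.

```js
// Hypothetical helper: converts a substitution value to an integer and
// rejects anything that does not convert to a finite number.
function toInteger(value) {
  const n = Math.trunc(Number(value)); // assumption: truncate toward zero
  if (!Number.isFinite(n)) {
    throw new TypeError('not convertible to an integer: ' + String(value));
  }
  return n;
}

console.log(toInteger(1.6)); // 1
console.log(toInteger('2')); // 2
try {
  toInteger({});             // Number({}) is NaN, so this throws
} catch (e) {
  console.log(e.message);
}
```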
All format specifiers end with a colon. Whitespace after the colon is optional.
A format specifier itself does not produce any buffer content, but it specifies
the format in which the values following it are to be formatted. The format
remains active until another format specifier is encountered:
```js
bintag`i1: 1 2 i4: 7 i1: 8`
// =
```
Note: format specifiers are case-sensitive.
The integer format specifier is the letter i followed by a number between 1
and 6. Each formatted integer will take the specified number of bytes in the
buffer.
This format specifier accepts decimal integers with an optional sign, as well
as unsigned hexadecimal numbers:
```js
bintag`i1: 1 +1 -1 0x80 128 -128`
// =
```
Both positive and negative values can be used to represent certain bit
patterns, such as 80 in the example above.
A value that is out of range for both signed and unsigned integers of the
specified width will trigger an exception.
For i2 and upwards, endianness matters. See below.
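
The range rule (a value is accepted if it fits either the signed or the unsigned range of the given width) can be sketched with Node's `Buffer` methods, which handle widths of 1 to 6 bytes. The helper `encodeInt` is hypothetical and is not part of bintag; the default little-endian byte order is an assumption made for the example.

```js
// Hypothetical helper: encodes one integer into `width` bytes (1 to 6).
function encodeInt(value, width, littleEndian = true) {
  const bits = 8 * width;
  const min = -Math.pow(2, bits - 1); // lowest signed value
  const max = Math.pow(2, bits) - 1;  // highest unsigned value
  if (!Number.isInteger(value) || value < min || value > max) {
    throw new RangeError(value + ' does not fit into ' + width + ' byte(s)');
  }
  const buf = Buffer.alloc(width);
  if (value < 0) {
    // Negative values use the two's complement (signed) writers.
    if (littleEndian) buf.writeIntLE(value, 0, width);
    else buf.writeIntBE(value, 0, width);
  } else {
    // Non-negative values use the unsigned writers.
    if (littleEndian) buf.writeUIntLE(value, 0, width);
    else buf.writeUIntBE(value, 0, width);
  }
  return buf;
}

console.log(encodeInt(128, 1));  // <Buffer 80>
console.log(encodeInt(-128, 1)); // <Buffer 80>
console.log(encodeInt(-10, 4));  // <Buffer f6 ff ff ff>
```

Both `128` and `-128` yield the same byte, which matches the remark above about the bit pattern 80.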
The format specifier for hexadecimal data is just `x`. It expects an even
number of hexadecimal digits:
```js
bintag`x: 12 34 abCDef`
// =
```
Whitespace between pairs is optional. The only case when this whitespace
matters is when a repeat count (see below) is used: it will apply to a whole
“word” of hexadecimal data.
The x format specifier handles substitution expressions like i1.
This format is not affected by endianness.
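
As a rough illustration of how such hexadecimal "words" map to bytes, here is a sketch built on `Buffer.from(..., 'hex')`. The helper name `parseHex` is hypothetical, and requiring an even number of digits per word (rather than overall) is an assumption.

```js
// Hypothetical helper: turns whitespace-separated hex "words" into bytes.
function parseHex(text) {
  const words = text.trim().split(/\s+/).filter(Boolean);
  for (const word of words) {
    // Each word is assumed to consist of whole bytes (pairs of hex digits).
    if (!/^(?:[0-9a-fA-F]{2})+$/.test(word)) {
      throw new SyntaxError('expected an even number of hexadecimal digits: ' + word);
    }
  }
  return Buffer.from(words.join(''), 'hex');
}

console.log(parseHex('12 34 abCDef')); // <Buffer 12 34 ab cd ef>
```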
The floating-point format specifiers are `f` for “float” (32-bit) and `d` for
“double” (64-bit) types.
```js
bintag`f: -1.1 d: .5e-10`
// =
```
The standard JS syntax for floating-point literals is supported, including
`Infinity` with an optional sign, and `NaN`. Note that `-0` produces binary
data distinct from `0`.
The format is affected by endianness.
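
The difference between `0` and `-0` is visible at the byte level. The sketch below uses Node's `Buffer` float writers directly; the helper names `encodeFloat` and `encodeDouble` are hypothetical and only illustrate the IEEE 754 encodings involved.

```js
// Hypothetical helpers: encode a number as a 32-bit or 64-bit IEEE 754 value.
function encodeFloat(value, littleEndian = true) {
  const buf = Buffer.alloc(4);
  if (littleEndian) buf.writeFloatLE(value, 0);
  else buf.writeFloatBE(value, 0);
  return buf;
}

function encodeDouble(value, littleEndian = true) {
  const buf = Buffer.alloc(8);
  if (littleEndian) buf.writeDoubleLE(value, 0);
  else buf.writeDoubleBE(value, 0);
  return buf;
}

// Big-endian output puts the sign bit in the first byte.
console.log(encodeFloat(0, false));        // <Buffer 00 00 00 00>
console.log(encodeFloat(-0, false));       // <Buffer 80 00 00 00>
console.log(encodeFloat(Infinity, false)); // <Buffer 7f 80 00 00>
```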
The three string formats are a for ASCII, u for UTF-8, and U for UTF-16.
When a string is forced to ASCII, the lower byte of each character's Unicode
value will be used.
With string formats, substitution expressions must be used. There is no
syntax for string values in the template string.
When a bare string format specifier is used, the result will take exactly the
number of bytes that are necessary to represent all the characters of a string:
```js
bintag`a: ${'abc'}`
// =
```
If the format letter is followed by an integer constant (or a substitution
expression evaluating to an integer), the string will take exactly the
specified number of bytes, and will be truncated or zero-padded as necessary.
```js
bintag`a4: ${['ab', 'xyzzy']}`
// =
```
When a Unicode string is truncated, an incomplete character at the end is never
encoded. If necessary, the string will be zero-padded:
```js
bintag`u8: ${'\u1000\u1000\u1000'}`
// =
```
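
The truncation rule for fixed-length UTF-8 strings can be sketched as follows. The helper `fixedUtf8` is hypothetical and only mirrors the behaviour described above: whole characters are copied while they fit, and the remainder is left as zero bytes.

```js
// Hypothetical helper: encodes `str` as UTF-8 into exactly `size` bytes.
function fixedUtf8(str, size) {
  const out = Buffer.alloc(size);            // zero-filled, so padding is implicit
  let offset = 0;
  for (const ch of str) {                    // iterates by code point
    const bytes = Buffer.from(ch, 'utf8');
    if (offset + bytes.length > size) break; // never emit a partial character
    bytes.copy(out, offset);
    offset += bytes.length;
  }
  return out;
}

// U+1000 takes three bytes in UTF-8, so only two characters fit into 8 bytes.
console.log(fixedUtf8('\u1000\u1000\u1000', 8)); // <Buffer e1 80 80 e1 80 80 00 00>
```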
The z modifier adds a terminating zero byte (two bytes in case of UTF-16).
```js
bintag`az: ${'abc'}`
// =
```
If `z` is combined with a fixed length, the length includes the terminator, and
the string is guaranteed to be zero-terminated. This means that it might be
truncated earlier than without `z` to accommodate the terminator.
The p modifier, followed by an integer between 1 and 4 (or a substitution
expression evaluating to such an integer), makes the string a “Pascal string”:
length followed by string data. The number specifies the width of the length
field.
```js
bintag`ap2: ${'abc'}`
// =
```
For UTF-8, the number of bytes is stored in the length field rather than the
number of Unicode characters. For UTF-16, the number of two-byte pairs is
stored, which can be different from the number of Unicode characters when
surrogate pairs are present.
The length field respects endianness. Strings longer than the maximum length
that can be represented in a length field of the chosen size (such as 255
characters for a 1-byte length) will be truncated.
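
A length-prefixed ASCII string of this kind could be built as in the sketch below. The helper `pascalAscii` is hypothetical, the little-endian default is an assumption for the example, and only the ASCII case is covered.

```js
// Hypothetical helper: builds an ASCII "Pascal string" whose length prefix
// occupies `lenWidth` bytes (1 to 4).
function pascalAscii(str, lenWidth, littleEndian = true) {
  const maxLen = Math.pow(2, 8 * lenWidth) - 1;
  const text = str.slice(0, maxLen); // truncate to what the prefix can express
  const out = Buffer.alloc(lenWidth + text.length);
  if (littleEndian) out.writeUIntLE(text.length, 0, lenWidth);
  else out.writeUIntBE(text.length, 0, lenWidth);
  for (let i = 0; i < text.length; i++) {
    out[lenWidth + i] = text.charCodeAt(i) & 0xff; // lower byte of each character
  }
  return out;
}

console.log(pascalAscii('abc', 2, true));  // <Buffer 03 00 61 62 63>
console.log(pascalAscii('abc', 2, false)); // <Buffer 00 03 61 62 63>
```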
If the p modifier is combined with a fixed length, the latter includes the
size of the length field. Therefore, the fixed length must be greater than the
size of the length field.
```js
bintag`a8p1: ${['abc', '0123456789']}`
// =
```
The UTF-16 encoding is affected by endianness, but ASCII and UTF-8 are not.
By default, data is formatted according to the endianness of the host platform.
This can be overridden by the endianness specifiers: `LE:` for little-endian
and `BE:` for big-endian. An endianness specifier remains in effect until
overridden by another such specifier.
```js
bintag`i2: LE: 0xabcd BE: 0x1122 0xabcd LE: 0x1122`
// =
```
Note: endianness specifiers are case-sensitive.
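
To see what the two byte orders mean for a 16-bit value, and how the host's native order can be queried, consider this sketch. It uses only Node built-ins; the helper `encodeU16` is hypothetical and is not part of bintag.

```js
const os = require('os');

// Hypothetical helper: encodes a 16-bit unsigned value with an explicit
// byte order, defaulting to the byte order of the host platform.
function encodeU16(value, order = os.endianness()) {
  const buf = Buffer.alloc(2);
  if (order === 'LE') buf.writeUInt16LE(value, 0);
  else buf.writeUInt16BE(value, 0);
  return buf;
}

console.log(os.endianness());         // 'LE' or 'BE', depending on the host
console.log(encodeU16(0xabcd, 'LE')); // <Buffer cd ab>
console.log(encodeU16(0xabcd, 'BE')); // <Buffer ab cd>
```

Tests that must produce identical bytes on every platform would want an explicit byte order rather than the host default.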
You can call `bintag.tag` to create a shortcut to a particular set of options.
The shortcut can be used in tagged template expressions in place of `bintag`:
```js
let short = bintag.tag('i2:');
short`1 2`
// =
let utf16le = bintag.tag('LE:U:');
utf16le`${'abc'}`
// =
```
The use of a shortcut is equivalent to specifying the options at the start of
the template.
The following convenience shortcuts are already defined in the bintag module:
`bintag.LE` for `LE:`, `bintag.BE` for `BE:`, and `bintag.hex` for `x:`.
```js
bintag.BE`i2: 1`
// =
```
Here is another convenient way to use the predefined shortcuts:
```js
let hex = require('bintag').hex;
hex`aa bb`
// =
```
A parenthesized group allows you to override format and endianness, and the
original settings will be restored after the group ends:
```js
bintag`i1: 1 2 (i2: 3 4) 5 6`
// =
```
Groups can be nested. At the start of a group, settings are inherited from the
surrounding context.
A nonnegative integer followed by an asterisk specifies a repeat count for the
immediately following value or parenthesized group:
```js
bintag`x: 2*(4*aa 2*1234)`
// =
```
The repeat count can be given by a substitution expression:
```js
let n = 6, x = 8;
bintag`i1: ${n}*${x}`
// =
```
Use ! followed by a positive integer to pad the data with zero bytes up to a
multiple of a number:
```js
bintag`x: aa bb cc !4 dd !2 ee !16`
// =
```
A substitution expression can be used instead of an integer constant to specify
the alignment.
Without a number, `!` aligns to the width determined by the current format. For
integer and floating-point formats, the format's size is used. For UTF-16
strings, the alignment is 2 bytes. For all other formats, bare `!` has no
effect because the alignment is 1 byte.
```js
bintag`i4: 1 ! 2 (x: aa bb) ! 3 4`
// =
```
Note: see below for information about what offsets are relative to.
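
The arithmetic behind alignment is simple: pad with zero bytes until the offset is a multiple of the alignment. The sketch below applies it by hand to the layout described by `x: aa bb cc !4 dd !2 ee !16`. The helpers `alignPadding` and `alignTo` are hypothetical, and offsets are assumed to be counted from the start of the buffer.

```js
// Hypothetical helper: zero bytes needed to reach the next multiple of `alignment`.
function alignPadding(offset, alignment) {
  return (alignment - (offset % alignment)) % alignment;
}

// Hypothetical helper: returns `buf` followed by the required zero padding.
function alignTo(buf, alignment) {
  return Buffer.concat([buf, Buffer.alloc(alignPadding(buf.length, alignment))]);
}

let out = Buffer.from('aabbcc', 'hex');
out = alignTo(out, 4);                                // 3 bytes -> 4 bytes
out = Buffer.concat([out, Buffer.from('dd', 'hex')]);
out = alignTo(out, 2);                                // 5 bytes -> 6 bytes
out = Buffer.concat([out, Buffer.from('ee', 'hex')]);
out = alignTo(out, 16);                               // 7 bytes -> 16 bytes
console.log(out.length);                              // 16
```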
The = character followed by a nonnegative integer pads the data with zero
bytes until the offset from the beginning of the buffer becomes equal to the
number.
```js
bintag`x: aa bb =4 cc dd`
// =
```
An attempt to rewind before the current position will trigger an exception.
A substitution expression can be used instead of an integer constant to specify
the offset.
Note: see below for information about what offsets are relative to.
The `@` character followed by a nonnegative integer computes the offset at
which a parenthesized group in the current template ends up in the output
buffer. `@1` refers to the group whose opening bracket is the leftmost, `@2` to
the second leftmost one, and so on. `@0` refers to the whole template (and
therefore evaluates to 0). Such references can occur before, within, and after
the groups they refer to.
```js
bintag`x: cc (i2: 1 @2) (i2: 2 @1)`
// =
```
The offset will be encoded according to the current format specifier.
A substitution expression cannot be used in place of the integer after the
@ character.
Note: see below for information about what offsets are relative to.
The `#` character followed by a nonnegative integer computes the length a
parenthesized group in the current template occupies in the output buffer.
`#0` refers to the whole template.
```js
bintag`i2: (1 #1) #0`
// =
```
If the group referred to has a repeat count, the size of the content is taken
before the repeat count is applied.
The length will be encoded according to the current format specifier.
When # is instead followed by a substitution expression, the size of the data
is computed; however, the data itself is not placed into the output buffer.
This is useful to compute the size of a buffer or a list of buffers:
```js
let buf = bintag`x: aabb ccdd`;
bintag`i2: #${buf}`
// =
```
Finally, # can be followed by a parenthesized group. In this case, the size
that group would take in the output buffer is computed; however, the group
itself does not produce output.
```js
bintag`i2: #(az: ${'abc'})`
// =
```
Normally, offsets for the purposes of alignment, padding, and offset expressions
are relative to the start of the output buffer. However, within a parenthesized
group with a repeat count, even if the repeat count is 1, offsets are instead
relative to the start of the innermost such group.
```js
bintag`x: 00 2*(aa =4 bb p2 cc)`
// =
```
The same applies within parenthesized groups preceded by #.
```js
bintag`i1: 1 #(x: ddee i4: p)`
// =
```
You can get a “compiled template” object by using bintag.compile in a tagged template
expression:
```js
let t = bintag.compile`x: aa bb`;
```
This also works for shortcuts:
```js
let short = bintag.tag('i2:');
let t = short.compile`1 2`;
```
A “compiled template” has the following API:
* create() method: creates and returns a new Buffer with the binary data
described by the template. This can be called repeatedly to obtain new
buffers without re-parsing the template.
* length property (read-only): the length of the data described by the
template. Can be used to do some sort of allocation or negotiation in
advance.
* write(buf, [offset]) method: writes the data into an existing buffer at the
specified offset (defaults to offset 0). The buffer must be large enough to
contain the data. A non-zero offset specified here does not affect the offset
calculations within the template (that is, the start of the template is still
considered to have offset 0). The method returns the number of bytes written.
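
To make the contract of these three members concrete, here is a stand-in object with the same shape, backed by a fixed byte sequence. It is not bintag's implementation: `makeTemplateLike` is a hypothetical name, and the error thrown for a too-small buffer is an assumption.

```js
// Hypothetical stand-in that mimics the documented shape of a compiled template.
function makeTemplateLike(bytes) {
  const data = Buffer.from(bytes);
  return {
    // Length of the data described by the "template".
    get length() {
      return data.length;
    },
    // Returns a fresh Buffer on every call.
    create() {
      return Buffer.from(data);
    },
    // Writes into an existing buffer and returns the number of bytes written.
    write(buf, offset = 0) {
      if (offset + data.length > buf.length) {
        throw new RangeError('buffer too small'); // assumption: error type
      }
      return data.copy(buf, offset);
    },
  };
}

const t = makeTemplateLike([0xaa, 0xbb]);
const target = Buffer.alloc(t.length + 2); // size known before any data is written
console.log(t.write(target, 1));           // 2
console.log(target);                       // <Buffer 00 aa bb 00>
```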
In addition, a compiled template has a numbered property for each parenthesized
group in the template. These are objects with offset and length read-only
properties that return the offset and length of each parenthesized group.
```js
let t = bintag.compile`i4: 0 (x: aa bb)`
t[1].offset
// = 4
t[1].length
// = 2
```
Note that e.g. `t[1].offset` has the same value as `@1` in the template, and
`t[1].length` has the same value as `#1` in the template.
Compiled templates and arrays of compiled templates can be used within other
templates, in a manner similar to buffers:
```js
let t = bintag.compile`x: aa bb`;
bintag`2*${t}`
// =
```
Arrays, buffers, and other objects referred to by substitution expressions must
not be modified between the compilation and any use of a template.