_A public domain binary encoding library for TypeScript._
```sh
npm install typed-bytes
```

```ts
import * as tb from "typed-bytes";
console.log(
tb.string.encode("Hello world!"),
);
/*
Uint8Array(13) [
12, 72, 101, 108, 108,
111, 32, 119, 111, 114,
108, 100, 33
]
*/
```
Deno users: no install needed. Run the above code directly with this tweak:
```diff
-import * as tb from "typed-bytes";
+import * as tb from "https://raw.githubusercontent.com/voltrevo/monorepo/a52752e/projects/typed-bytes/mod.ts";
```
typed-bytes provides convenient type-aware binary encoding and decoding. The
type awareness provides two main benefits:
1. Smaller encoded size
2. Type information is present on decoded values
This alone isn't anything new; the key is how typed-bytes embraces TypeScript.
In particular:
- there is no need for code-gen
- unions are supported
- exact types are supported
- you can extract type information from the bicoder, so you don't need to
duplicate it
For example:
```ts
const LogMessage = tb.Object({
level: tb.Enum("INFO", "WARN", "ERROR"),
message: tb.string,
});
/*
// on hover:
type LogMessage = {
level: "INFO" | "WARN" | "ERROR";
message: string;
}
*/
type LogMessage = tb.TypeOf<typeof LogMessage>;
const buffer = LogMessage.encode({
level: "INFO",
message: "Test message",
});
/*
0, // Option 0: 'INFO'
12, // Message needs 12 bytes
84, 101, 115, 116, // utf-8 bytes for "Test message"
32, 109, 101, 115,
115, 97, 103, 101
// (Notice how no bytes were used for strings 'level', 'message', or 'INFO')
*/
console.log(buffer);
/*
// on hover:
const decodedValue: {
level: "INFO" | "WARN" | "ERROR";
message: string;
}
*/
const decodedValue = LogMessage.decode(buffer);
```
Suppose we were making a graphics application where the user can draw shapes on
a canvas. We want to be able to encode the canvas and its shapes so we can save
it to disk, synchronize it with a remote display, or what-have-you.
The image above is encoded in just 71 bytes.
Keep reading for a step-by-step guide to create a
vector graphics format to achieve this using typed-bytes.
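As a taste of the direction, here's a rough sketch of what such a format could look like. Note that `tb.number` and `tb.Union` are illustrative names here and may not match the actual typed-bytes API:

```ts
// A rough sketch of a shape format. tb.number and tb.Union are illustrative
// names and may differ from the actual typed-bytes API.
const Position = tb.Object({ x: tb.number, y: tb.number });

const Circle = tb.Object({
  type: tb.Enum("circle"),
  center: Position,
  radius: tb.number,
});

const Rectangle = tb.Object({
  type: tb.Enum("rectangle"),
  topLeft: Position,
  bottomRight: Position,
});

// Unions are supported, so a Shape is either variant, and a Canvas is just an
// array of Shapes.
const Shape = tb.Union(Circle, Rectangle);
const Canvas = tb.Array(Shape);
```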
To use RPC you need to provide a bufferIO which conforms to type tb.BufferIO so
that typed-bytes has a way to send and receive data:
```ts
type BufferIO = {
  read(): Promise<Uint8Array>;
  write(buffer: Uint8Array): Promise<void>;
};
```
_(In future, some convenience methods will probably be added to handle the
common TCP socket and WebSocket use cases here, but it's also important to keep
this interface, because it allows you to provide whatever exotic transport you
desire.)_
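For example, a minimal in-memory implementation (handy for wiring a client and server together in tests; purely illustrative, not part of the library) might look like this:

```ts
// An in-memory pair of BufferIOs: buffers written to one end become readable
// from the other. This is just a sketch, not part of typed-bytes.
function BufferIOPair(): [BufferIO, BufferIO] {
  function Queue() {
    const buffers: Uint8Array[] = [];
    const waiters: ((buffer: Uint8Array) => void)[] = [];

    return {
      push(buffer: Uint8Array) {
        const waiter = waiters.shift();

        if (waiter) {
          waiter(buffer);
        } else {
          buffers.push(buffer);
        }
      },
      pop(): Promise<Uint8Array> {
        const buffer = buffers.shift();

        if (buffer) {
          return Promise.resolve(buffer);
        }

        return new Promise((resolve) => waiters.push(resolve));
      },
    };
  }

  const clientToServer = Queue();
  const serverToClient = Queue();

  const clientIO: BufferIO = {
    read: () => serverToClient.pop(),
    write: (buffer) => {
      clientToServer.push(buffer);
      return Promise.resolve();
    },
  };

  const serverIO: BufferIO = {
    read: () => clientToServer.pop(),
    write: (buffer) => {
      serverToClient.push(buffer);
      return Promise.resolve();
    },
  };

  return [clientIO, serverIO];
}
```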
Then, define your protocol like this:
```ts
const GreeterProtocol = tb.Protocol({
// A method that accepts a string and returns a string
// (You can use more complex types too of course, as well as multiple
// arguments.)
sayHello: tb.Method(tb.string)(tb.string),
});
```
On the server, use tb.serveProtocol:
```ts
tb.serveProtocol(bufferIO, GreeterProtocol, {
  sayHello: (name) => {
    return Promise.resolve(`Hi ${name}!`);
  },
});
```
On the client, use tb.Client:
```ts
const greeterClient = tb.Client(GreeterProtocol);
const reply = await greeterClient.sayHello("Alice");
console.log(reply); // "Hi Alice!"
```
This is all fully typed (and there's still no codegen involved). That means:
- When you type greeterClient., your IDE will show you the list of methods
- Calls to those methods will have their arguments checked and the return type
will be inferred correctly
- When you call serveProtocol you'll get useful intellisense related to the
protocol you passed in, and TypeScript will check your implementation provides
all the methods correctly
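For instance, a call with a mistyped argument is rejected at compile time (a quick sketch using the GreeterProtocol above):

```ts
// Passing a number where the protocol expects a string is a compile error:
// "Argument of type 'number' is not assignable to parameter of type 'string'."
// @ts-expect-error
greeterClient.sayHello(42);

// And the return type is inferred from the protocol definition:
const replyPromise = greeterClient.sayHello("Bob"); // Promise<string>
```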
typed-bytes isn't ready to offer a stable API.
Having said that, I believe it's very usable in its current form if you pin the
version. It's also only ~500 sloc, so if you have problems upgrading, you have
the option of maintaining your own fork.
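For example, you can have npm record an exact version instead of a semver range:

```sh
npm install --save-exact typed-bytes
```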
- Support for omitting fields instead of optionals needing to be present with
null/undefined
- Better support for sparse objects / condense union options at the object level
so that a whole byte isn't needed for each union option
- Optionally including some header bytes representing a digest of the type
information
- Performance testing and tuning
- Tools for aligning with existing encodings
- Advice about versioning and compatibility when using typed-bytes
- Better support for user defined types (e.g. include classes)
- Async support
- Adaptors for files/sockets/etc
- Optional code-gen for boosting performance and supporting other languages
- Incorporate pointers to support a file format enabling incremental changes to
large data structures
See CONTRIBUTING.md.
JSON: less compact, no type information
1. typed-bytes is more compact:
```ts
const msg: LogMessage = {
level: "INFO",
message: "Test message",
};
new TextEncoder().encode(JSON.stringify(msg)); // 41 bytes
LogMessage.encode(msg); // 14 bytes
```
Of course, typed-bytes is relying on the type information to achieve this, and
you need that information to decode the buffer. With JSON, you can decode it in
a different place with just JSON.parse.
2. JSON.parse doesn't check the structure being decoded and doesn't provide
type information:
```ts
// on hover:
// const jsonValue: any
const jsonValue = JSON.parse('{"level":"INFO","message":"Test message"}');
// on hover:
// const tbValue: {
// level: "INFO" | "WARN" | "ERROR";
// message: string;
// }
const tbValue = LogMessage.decode(buffer);
```
If you still really like JSON for its human-readable format, and you like JSON's
API, you might still be interested in using typed-bytes for its type
information. I have included tb.JSON to mirror the JSON API like so:
```ts
// on hover:
// const typedValue: {
// level: "INFO" | "WARN" | "ERROR";
// message: string;
// }
const typedValue = tb.JSON.parse(
LogMessage,
'{"type":"INFO","message":"Test message"}',
);
// (This will also throw if the json is not a valid LogMessage.)
const jsonString = tb.JSON.stringify(LogMessage, {
// These fields are type checked against LogMessage
level: "INFO",
message: "Test message",
});
```
(If you're not interested in type information, then I'm not sure why you're
here.)
MessagePack: less compact, no type information
1. typed-bytes is more compact:
```ts
const msg: LogMessage = {
level: "INFO",
message: "Test message",
};
msgpack.encode(msg); // 33 bytes
LogMessage.encode(msg); // 14 bytes
```
Of course, typed-bytes is relying on the type information to achieve this, and
you need that information to decode the buffer. With MessagePack, you can
decode it in a different place with only the MessagePack library.
2. MessagePack doesn't check the structure being decoded and doesn't provide
type information:
```ts
// on hover:
// const msgpackValue: unknown
const msgpackValue = msgpack.decode(buffer);
// on hover:
// const tbValue: {
// level: "INFO" | "WARN" | "ERROR";
// message: string;
// }
const tbValue = LogMessage.decode(buffer);
```
Protocol Buffers: code-gen, unnecessary code complexity
See the Protobuf mini-project containing these examples.
1. Requires learning a special-purpose .proto language (can be a positive _if_
you need to share a protocol with a team that doesn't want to interact with
TypeScript)
```proto
// messages.proto
syntax = "proto3";
message LogMessage {
enum Level {
INFO = 0;
WARN = 1;
ERROR = 2;
}
Level level = 1;
string message = 2;
}
```
2. Requires code-gen:
```sh
pbjs messages.proto -t static-module -o messages.js
pbts messages.js -o messages.d.ts
```
3. Protobuf requires you to use its wrappers around your objects which is more
verbose:
```ts
// More verbose: special protobuf object instead of vanilla object
const msg = new LogMessage({
// More verbose: enum wrapper instead of vanilla string
level: LogMessage.Level["INFO"],
message: "Test message",
});
```
4. Assuming you want to use protobuf version 3 (as opposed to version 2 which
was superseded by version 3 five years ago), protobuf forces all fields to be
optional.
TypeScript cannot tell you when you have forgotten a field:
```ts
const msg = new LogMessage({
  // Forgot level, but this compiles just fine
  message: "Test message",
});
```
Protobuf is inconsistent about how it represents missing fields:
```ts
const emptyMessage = LogMessage.decode(
  LogMessage.encode(new LogMessage()).finish(),
);
```
If you use protobuf's wrapped object (and likely other contexts when using
cross-language tooling) it will give you its default value for that type:
```ts
console.log(JSON.stringify(emptyMessage.message)); /*
""
*/
// This means you can't tell the difference between the field being missing or
// present as an empty string when accessing the field in this way.
```
But if you want to work with plain objects, .toJSON will omit the fields
entirely:
```ts
console.log(emptyMessage.toJSON()); /*
{}
*/
```
In the real world, fields are very often required. It is generally the expected
default when programming - if you say that a structure has a field, then an
instance of that structure must have that field.
In many cases, this means you need to take special care to deal with the fact
that protobuf considers your fields to be optional, even though your application
considers messages that are missing those fields to be invalid, and thus should
never have been encoded/decoded in the first place.
Protobuf's reason for doing this is that it helps with compatibility: if you
are forced to check whether fields are present, then an old message that lacks
a newly added field can still be processed by the upgraded code (even if that
code throws the message out because the field is required). Some may find this
valuable. typed-bytes lets you make this decision instead of making it for you.
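For example, you could mark only the genuinely optional fields as optional and keep the rest required. Note that `tb.Optional` is an illustrative name here and may not match the actual typed-bytes API:

```ts
// Only `context` is optional; `level` and `message` remain required, and
// forgetting either is a compile error. tb.Optional is an illustrative name
// and may differ from the actual typed-bytes API.
const LogMessageV2 = tb.Object({
  level: tb.Enum("INFO", "WARN", "ERROR"),
  message: tb.string,
  context: tb.Optional(tb.string),
});
```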
5. typed-bytes allows entities of all shapes and sizes, but protobuf only
supports objects:
```ts
const LogMessages = tb.Array(LogMessage);
```
If you want an array in protobuf, you must wrap it in an object:
```proto
message LogMessages {
  repeated LogMessage content = 1;
}
```
Avro: verbose, unofficial TypeScript support, no type information
See the Avro mini-project containing these examples.
Note: Avro doesn't have any official support for JavaScript or TypeScript. The
best unofficial library appears to be avsc, which is what's used for the
comparison here.
1. avsc's first example from
their README.md is
rejected by the TypeScript compiler.
```ts
import avro from "avsc";
/*
Argument of type '{ type: "record"; fields: ({ name: string; type: { type: "enum"; symbols: string[]; }; } | { name: string; type: string; })[]; }' is not assignable to parameter of type 'Schema'.
Type '{ type: "record"; fields: ({ name: string; type: { type: "enum"; symbols: string[]; }; } | { name: string; type: string; })[]; }' is not assignable to type 'string'. ts(2345)
*/
const type = avro.Type.forSchema({
type: "record",
fields: [
{ name: "kind", type: { type: "enum", symbols: ["CAT", "DOG"] } },
{ name: "name", type: "string" },
],
});
```
While troubleshooting this I discovered that the name field is required, so you
can fix the example above by adding that field at the top level and also in the
embedded enum type, as shown below.
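Here is the corrected version (the record and enum names are arbitrary choices for this example):

```ts
// The same schema with the required `name` fields added compiles cleanly.
// (The names "Pet" and "Kind" are arbitrary.)
const type = avro.Type.forSchema({
  name: "Pet",
  type: "record",
  fields: [
    {
      name: "kind",
      type: { type: "enum", name: "Kind", symbols: ["CAT", "DOG"] },
    },
    { name: "name", type: "string" },
  ],
});
```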
2. Schemas are much more verbose than typed-bytes:
```ts
// avsc
const LogMessage = avro.Type.forSchema({
name: "LogMessage",
type: "record",
fields: [
{
name: "level",
type: {
type: "enum",
name: "Level",
symbols: ["INFO", "WARN", "ERROR"],
},
},
{ name: "message", type: "string" },
],
});
```
```ts
// typed-bytes
const LogMessage = tb.Object({
level: tb.Enum("INFO", "WARN", "ERROR"),
message: tb.string,
});
```
3. Type information is not available to the TypeScript compiler (or your IDE):
```ts
// .toBuffer below is typed as:
// (method) Type.toBuffer(value: any): any
const buf = LogMessage.toBuffer({
level: "INFO",
message: "Test message",
});
```
This also means if you want a TypeScript definition of this object, you'll need
to define it redundantly, and TypeScript can't protect you from that redundant
type getting out of sync with your avro schema.
By comparison, in typed-bytes, you can write:
```ts
type LogMessage = tb.TypeOf<typeof LogMessage>;
```
Cap'n Proto: lack of support, slow, hacky
To be clear, we are talking about using Cap'n Proto from TypeScript here. If you
are not using TypeScript these comparisons do not apply.
1. Library describes itself as slow.
> Because v8 cannot inline or otherwise optimize calls into C++ code, and
> because the C++ bindings are implemented in terms of the "dynamic" API, this
> implementation is actually very slow.
2. Library describes itself as hacky.
> This package is a hacky wrapper around the Cap'n Proto C++ library.
3. Cap'n Proto requires that you install it at the system level.
Simply running npm install capnp does not work:
```
// lots of noise
npm ERR! ../src/node-capnp/capnp.cc:31:10: fatal error: capnp/dynamic.h: No such file or directory
npm ERR! 31 | #include <capnp/dynamic.h>
// lots more noise
```
As commented by a node-capnp member,
this is a requirement.
4. After installing at the system level, npm install capnp still does not
work.
I'm running nodejs 16.1.0 on ubuntu 20.04, and I was able to install Cap'n Proto
on my system to fulfil the requirement above just fine with
sudo apt install capnproto. However, npm install capnp continues to fail
with the same error.
I'd like to expand on the Cap'n Proto comparison, but for now I think it is
clear enough that Cap'n Proto is not currently suitable for use with TypeScript.
Contributions welcome.
FlatBuffers: code-gen, strange API, non-js dependencies
See the FlatBuffers mini-project containing these examples.
1. Requires learning a special-purpose .fbs language.
Here's the .fbs file for LogMessage:
```fbs
// FlatBuffers doesn't appear to require namespaces, but for some reason they
// are needed to get correct TypeScript output.
namespace Sample;
enum Level: byte { INFO = 0, WARN = 1, ERROR = 2 }
table LogMessage {
level: Level;
message: string;
}
```
2. Requires code-gen.
```sh
flatc --ts LogMessage.fbs
```
3. Code-gen requires non-js dependency flatc.
On Ubuntu 20.04 I was able to install using:
```sh
sudo apt install flatbuffers-compiler
```
4. Version 2.0.0 of the npm package was released in a broken state.
Hopefully they have fixed this by the time you're reading this. I was unlucky
enough to try to use FlatBuffers for the first time on the day this release went
out, and it took me some time to realise that 2.0.0 was just broken and I needed
to install 1.x.
(Simply running require('flatbuffers') threw an error. As far as I can tell
the artifact they pushed to npm was incomplete.)
5. flatc's TypeScript code requires a workaround to compile.
The first line of code generated by flatc is:
```ts
import { flatbuffers } from "./flatbuffers";
```
(In fact, for some reason, if you don't specify a namespace in your .fbs file,
flatc doesn't even emit this import, and generates unresolved references to
flatbuffers.)
./flatbuffers does not exist, but it's clear this is intended to be the
FlatBuffers library.
Their TypeScript guide
doesn't mention this, but the fix in my case was to create ./flatbuffers.ts
with this content:
```ts
export { flatbuffers } from "flatbuffers";
```
6. FlatBuffers' API is... strange.
Here's what I came up with to encode a LogMessage:
```ts
let builder = new flatbuffers.Builder();
// Strings need to be created externally, otherwise FlatBuffers throws:
// Error: FlatBuffers: object serialization must not be nested.
//
// (typed-bytes doesn't have this kind of issue)
const testMessage = builder.createString("Test message");
// This is clumsy and verbose. I'd also argue it doesn't even meet the
// requirement of encoding a LogMessage as binary. Instead it's an API that
// gives you some tools to help you do that in a way that is still very manual.
Sample.LogMessage.startLogMessage(builder);
Sample.LogMessage.addLevel(builder, Sample.Level.INFO);
Sample.LogMessage.addMessage(builder, testMessage);
const msgOffset = Sample.LogMessage.endLogMessage(builder);
builder.finish(msgOffset);
const buf = builder.asUint8Array();
console.log(buf); /*
// This is really long. I'm not sure why. The other schema-based encodings
// (including typed-bytes) have managed 14-16 bytes. I'm not going to put this
// as a concrete point for now because it might not be true outside of this
// example and FlatBuffers has proved exceptionally difficult to work with so
// I don't have enough time to get to the bottom of this. If you know more
// about what's going on please consider contributing.
Uint8Array(40) [
12, 0, 0, 0, 8, 0, 8, 0, 0, 0,
4, 0, 8, 0, 0, 0, 4, 0, 0, 0,
12, 0, 0, 0, 84, 101, 115, 116, 32, 109,
101, 115, 115, 97, 103, 101, 0, 0, 0, 0
]
*/
```
The decoding part is almost as strange:
```ts
const byteBuffer = new flatbuffers.ByteBuffer(buf);
const decodedValue = Sample.LogMessage.getRootAsLogMessage(byteBuffer);
// Outputs internal details and not level/message:
console.log(decodedValue);
// You need to get the fields one by one.
console.log({
level: decodedValue.level(), // 0, not 'INFO'
message: decodedValue.message(),
});
```
I think FlatBuffers is intended to be very low level. It's targeting a use case
where you interact directly with bytes instead of ever really having js-native
objects. Even so, I expect it is possible to make this API much more ergonomic;
I suspect it's a case of trying to support every major language, and js simply
hasn't received enough attention to produce something that's simple to use.