@cto.af/wtf8

Encode and decode WTF-8 with an API similar to TextEncoder and TextDecoder.

The goal is to be able to parse and generate bytestreams that can store any
JavaScript string, including ones that have unpaired surrogates.
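To illustrate why plain UTF-8 is not enough here, the sketch below uses only the standard `TextEncoder` and a hand-written helper. The helper `encodeThreeByte` is hypothetical and is not part of this package; it only shows the three-byte generalized UTF-8 form that WTF-8 uses for a lone surrogate.

```js
// A string holding a single unpaired high surrogate.
const lone = '\ud800';

// Standard UTF-8 encoding is lossy: the lone surrogate is replaced
// with U+FFFD (bytes EF BF BD), so the original string is gone.
const utf8 = new TextEncoder().encode(lone);
const roundTripped = new TextDecoder().decode(utf8); // '\ufffd'

// Hypothetical helper (not this library's API): encode a code point in
// the range U+0800..U+FFFF using the generalized three-byte UTF-8 form.
function encodeThreeByte(codePoint) {
  return Uint8Array.of(
    0xe0 | (codePoint >> 12),
    0x80 | ((codePoint >> 6) & 0x3f),
    0x80 | (codePoint & 0x3f)
  );
}

// WTF-8 keeps the surrogate itself: U+D800 becomes ED A0 80.
const wtf8 = encodeThreeByte(lone.charCodeAt(0));
```

The bytes `ED A0 80` are rejected by a strict UTF-8 decoder, which is why a dedicated WTF-8 decoder is needed to read them back.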

Installation

```sh
npm install @cto.af/wtf8
```

API

Full API documentation is available.

Example:

```js
import {Wtf8Decoder, Wtf8Encoder} from '@cto.af/wtf8';

const bytes = new Wtf8Encoder().encode('\ud800');
const string = new Wtf8Decoder().decode(bytes); // '\ud800'
```

W3C streams are also provided: `Wtf8EncoderStream` and `Wtf8DecoderStream`.

Notes

This library uses a few of the tricks from the paper
Validating UTF-8 In Less Than One Instruction Per Byte,
but not all of them. Moving data in and out of WASM to be able to use SIMD
might be slightly faster, but since we're not merely validating but instead
actually decoding (and generating replacement characters when fatal is false),
staying in JS seems good enough for the moment.
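For readers unfamiliar with the `fatal` option mentioned above, here is a minimal sketch of what it means for the standard `TextDecoder`. It is an assumption that this package's decoder follows the same convention, based only on the stated goal of a similar API; the variable names are illustrative.

```js
// 'a', then a byte that is never valid in UTF-8, then 'b'.
const invalidBytes = Uint8Array.of(0x61, 0xff, 0x62);

// fatal: false (the default) substitutes U+FFFD for the bad byte.
const lenient = new TextDecoder('utf-8', {fatal: false}).decode(invalidBytes);

// fatal: true throws a TypeError instead of substituting.
let strictError = null;
try {
  new TextDecoder('utf-8', {fatal: true}).decode(invalidBytes);
} catch (err) {
  strictError = err;
}
```

Producing the replacement characters means the decoder has to build output while it scans, which is more work than validation alone.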

---
![Build Status](https://github.com/cto-af/wtf8/actions?query=workflow%3ATests)
![codecov](https://codecov.io/gh/cto-af/wtf8)