Utilities for Unicode code points, based on the Java API.

In this module, _Unicode code point_ is used for character values in the
range between 0x0000 and 0x10FFFF.
On the other hand, _Unicode code unit_ is used for 16-bit integer values
that are code units of the UTF-16 encoding.

Installation

$ npm install codepoint

API

```javascript
var codepoint = require('codepoint');
```

`$3`

The maximum value of a Unicode code point, constant 0x10FFFF.

`$3`

The maximum value of a Unicode high-surrogate code unit in the UTF-16 encoding, constant 0xDBFF.

`$3`

The maximum value of a Unicode low-surrogate code unit in the UTF-16 encoding, constant 0xDFFF.

`$3`

The maximum value of a Unicode surrogate code unit in the UTF-16 encoding, constant 0xDFFF.

`$3`

The maximum value of a Unicode code unit in the UTF-16 encoding, constant 0xFFFF.

`$3`

The minimum value of a Unicode code point, constant 0x0000.

`$3`

The minimum value of a Unicode code unit in the UTF-16 encoding, constant 0x0000.

`$3`

The minimum value of a Unicode high-surrogate code unit in the UTF-16 encoding, constant 0xD800.

`$3`

The minimum value of a Unicode low-surrogate code unit in the UTF-16 encoding, constant 0xDC00.

`$3`

The minimum value of a Unicode supplementary code point, constant 0x10000.

`$3`

The minimum value of a Unicode surrogate code unit in the UTF-16 encoding, constant 0xD800.

`$3`

Determines the number of Unicode code units needed to represent the specified Unicode code point. If the specified code point is equal to or greater than `codepoint.MIN_SUPPLEMENTARY_CODE_POINT`, the function returns `2`. Otherwise, it returns `1`.

* Arguments
  * `cp`: the Unicode code point to be tested.
* Returns
  * `2` if the Unicode code point is a valid supplementary code point; `1` otherwise.
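
The rule above can be sketched in plain JavaScript. This is an illustration only, not the module's implementation, and `codeUnitCountSketch` is a hypothetical name chosen for this example.

```javascript
// Assumption: mirrors the documented threshold 0x10000 (MIN_SUPPLEMENTARY_CODE_POINT).
var MIN_SUPPLEMENTARY_CODE_POINT = 0x10000;

// Hypothetical helper: 2 code units for supplementary code points, 1 otherwise.
function codeUnitCountSketch(cp) {
  return cp >= MIN_SUPPLEMENTARY_CODE_POINT ? 2 : 1;
}

console.log(codeUnitCountSketch(0x41));    // BMP code point
console.log(codeUnitCountSketch(0x1F600)); // supplementary code point
```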

`$3`

Returns the code point at the given `index` of the `str`. If the code unit value at the given `index` in the `str` is in the high-surrogate range, the following index is less than the length of the `str`, and the code unit value at the following index is in the low-surrogate range, then the supplementary code point corresponding to this surrogate pair is returned. Otherwise, the code unit value at the given `index` is returned.

* Arguments
  * `str`: the string.
  * `index`: the index of the Unicode code unit in `str` to be converted. Defaults to `0`.
* Returns
  * the Unicode code point value at the given `index`.
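
As a rough illustration of the lookup described above, here is a standalone sketch in plain JavaScript. The name `codePointAtSketch` is hypothetical, and the code shows the documented rule rather than the module's actual source.

```javascript
// Hypothetical helper: reads the code point starting at `index` (default 0).
function codePointAtSketch(str, index) {
  index = index === undefined ? 0 : index;
  var first = str.charCodeAt(index);
  // A high surrogate followed by a low surrogate forms one supplementary code point.
  if (first >= 0xD800 && first <= 0xDBFF && index + 1 < str.length) {
    var second = str.charCodeAt(index + 1);
    if (second >= 0xDC00 && second <= 0xDFFF) {
      return ((first - 0xD800) << 10) + (second - 0xDC00) + 0x10000;
    }
  }
  // Otherwise the code unit itself is returned, even if it is an unpaired surrogate.
  return first;
}

var sample = 'a\uD83D\uDE00'; // 'a' followed by U+1F600 as a surrogate pair
console.log(codePointAtSketch(sample, 1).toString(16)); // 1f600
console.log(codePointAtSketch(sample, 2).toString(16)); // de00
```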

`$3`

Returns the code point preceding the given `index` of the `str`. If the code unit value at (`index - 1`) in the `str` is in the low-surrogate range, (`index - 2`) is not negative, and the code unit value at (`index - 2`) in the `str` is in the high-surrogate range, then the supplementary code point corresponding to this surrogate pair is returned. Otherwise, the code unit value at (`index - 1`) is returned.

* Arguments
  * `str`: the string.
  * `index`: the index following the code point that should be returned. Defaults to `str.length`.
* Returns
  * the Unicode code point value before the given `index`.
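
The backward lookup can be sketched the same way. `codePointBeforeSketch` is a hypothetical name, and the code only illustrates the rule stated above.

```javascript
// Hypothetical helper: reads the code point that ends just before `index`
// (default: the end of the string).
function codePointBeforeSketch(str, index) {
  index = index === undefined ? str.length : index;
  var last = str.charCodeAt(index - 1);
  // A low surrogate preceded by a high surrogate forms one supplementary code point.
  if (last >= 0xDC00 && last <= 0xDFFF && index - 2 >= 0) {
    var prev = str.charCodeAt(index - 2);
    if (prev >= 0xD800 && prev <= 0xDBFF) {
      return ((prev - 0xD800) << 10) + (last - 0xDC00) + 0x10000;
    }
  }
  return last;
}

var sample = 'a\uD83D\uDE00';
console.log(codePointBeforeSketch(sample).toString(16));    // 1f600
console.log(codePointBeforeSketch(sample, 2).toString(16)); // d83d
```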

`$3`

Returns the number of Unicode code points in the text range of the specified `str`. The text range begins at the specified `beginIndex` and extends to the Unicode code unit at index `endIndex - 1`. Thus the length (in code units) of the text range is `endIndex - beginIndex`. Unpaired surrogates within the text range count as one code point each.

* Arguments
  * `str`: the string.
  * `beginIndex`: the index of the first code unit of the text range. Defaults to `0`.
  * `endIndex`: the index after the last code unit of the text range. Defaults to `str.length`.
* Returns
  * the number of Unicode code points in the specified text range.
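
A minimal sketch of the counting rule, assuming a surrogate pair only counts as one code point when both halves lie inside the range. `codePointCountSketch` is a hypothetical name, not part of the module.

```javascript
// Hypothetical helper: counts code points in str[beginIndex, endIndex).
function codePointCountSketch(str, beginIndex, endIndex) {
  beginIndex = beginIndex === undefined ? 0 : beginIndex;
  endIndex = endIndex === undefined ? str.length : endIndex;
  var count = 0;
  for (var i = beginIndex; i < endIndex; i++) {
    var cu = str.charCodeAt(i);
    // Skip the low half of a complete pair so the pair counts once.
    if (cu >= 0xD800 && cu <= 0xDBFF && i + 1 < endIndex) {
      var next = str.charCodeAt(i + 1);
      if (next >= 0xDC00 && next <= 0xDFFF) i++;
    }
    count++;
  }
  return count;
}

var sample = 'a\uD83D\uDE00b';
console.log(codePointCountSketch(sample));       // 3
console.log(codePointCountSketch(sample, 0, 2)); // 2 (the high surrogate is cut off from its pair)
```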

`$3`

Executes a provided function (`cb`) once for each code point present in the `str`.

* Arguments
  * `str`: the string.
  * `cb`: callback function.
  * `thisArg`: the `this` value for each invocation of `cb`. If it is not provided, `undefined` is used instead.

`cb` is invoked with three arguments:

* Arguments
  * `cp`: the code point value.
  * `index`: the index of the code point in the `str` (i.e. `cp === codepoint.codePointAt(str, index)`).
  * `str`: the string being traversed.
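
The iteration contract above can be illustrated with a standalone sketch. It relies on the built-in `String.prototype.codePointAt` (ES2015) rather than this module, and `forEachCodePointSketch` is a hypothetical name.

```javascript
// Hypothetical helper: calls cb(cp, index, str) once per code point.
function forEachCodePointSketch(str, cb, thisArg) {
  var i = 0;
  while (i < str.length) {
    var cp = str.codePointAt(i); // built-in; decodes a surrogate pair if present
    cb.call(thisArg, cp, i, str);
    i += cp >= 0x10000 ? 2 : 1;  // supplementary code points span two code units
  }
}

forEachCodePointSketch('a\uD83D\uDE00b', function (cp, index) {
  console.log(index + ': U+' + cp.toString(16).toUpperCase());
});
```

The callback receives the code unit index, so the indices in this example are 0, 1 and 3.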

`$3`

Returns a string value containing as many code points as the number of arguments. Each argument specifies one code point of the resulting string, with the first argument specifying the first code point, and so on, from left to right.

* Arguments
  * `cp`: a Unicode code point.
* Returns
  * the string.
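
To make the conversion concrete, here is a sketch that builds a string from code points by hand. `fromCodePointsSketch` is a hypothetical name; the built-in `String.fromCodePoint` (ES2015) behaves similarly for valid input.

```javascript
// Hypothetical helper: one argument per code point, left to right.
function fromCodePointsSketch() {
  var units = [];
  for (var i = 0; i < arguments.length; i++) {
    var cp = arguments[i];
    if (cp >= 0x10000) {
      // Supplementary code point: split into a high and a low surrogate.
      var offset = cp - 0x10000;
      units.push(0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF));
    } else {
      units.push(cp);
    }
  }
  return String.fromCharCode.apply(null, units);
}

console.log(fromCodePointsSketch(0x61, 0x1F600, 0x62) === 'a\uD83D\uDE00b'); // true
```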

`$3`

Returns the leading surrogate (a high surrogate code unit) of the surrogate pair representing the specified supplementary code point in the UTF-16 encoding. If the specified Unicode code point is not a supplementary code point, an unspecified code unit is returned.

* Arguments
  * `cp`: a supplementary code point.
* Returns
  * the leading surrogate code unit used to represent the character in the UTF-16 encoding.
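
The leading surrogate follows from standard UTF-16 arithmetic, sketched below. `highSurrogateSketch` is a hypothetical name, and the result is only meaningful for supplementary code points.

```javascript
// Hypothetical helper: top 10 bits of (cp - 0x10000), offset into the high-surrogate range.
function highSurrogateSketch(cp) {
  return 0xD800 + ((cp - 0x10000) >> 10);
}

console.log(highSurrogateSketch(0x1F600).toString(16)); // d83d
```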

`$3`

Determines whether the specified Unicode code point is in the Basic Multilingual Plane (BMP). Such code points can be represented using a single code unit.

* Arguments
  * `cp`: the Unicode code point to be tested.
* Returns
  * `true` if the specified code point is between `codepoint.MIN_CODE_POINT` and `codepoint.MAX_CODE_UNIT` inclusive; `false` otherwise.

`$3`

Determines if the given Unicode code unit is a Unicode high-surrogate code unit (also known as leading-surrogate code unit).

* Arguments
  * `cu`: the Unicode code unit to be tested.
* Returns
  * `true` if the code unit is between `codepoint.MIN_HIGH_SURROGATE` and `codepoint.MAX_HIGH_SURROGATE` inclusive; `false` otherwise.

`$3`

Determines if the given Unicode code unit is a Unicode low-surrogate code unit (also known as trailing-surrogate code unit).

* Arguments
  * `cu`: the Unicode code unit to be tested.
* Returns
  * `true` if the code unit is between `codepoint.MIN_LOW_SURROGATE` and `codepoint.MAX_LOW_SURROGATE` inclusive; `false` otherwise.

`$3`

Determines whether the specified Unicode code point is in the supplementary character range.

* Arguments
  * `cp`: the Unicode code point to be tested.
* Returns
  * `true` if the specified code point is between `MIN_SUPPLEMENTARY_CODE_POINT` and `MAX_CODE_POINT` inclusive; `false` otherwise.

`$3`

Determines if the given Unicode code unit is a Unicode surrogate code unit.

* Arguments
  * `cu`: the Unicode code unit to be tested.
* Returns
  * `true` if the code unit is between `codepoint.MIN_SURROGATE` and `codepoint.MAX_SURROGATE` inclusive; `false` otherwise.

`$3`

Determines whether the specified pair of code units is a valid Unicode surrogate pair.

* Arguments
  * `highCu`: the high-surrogate code unit to be tested.
  * `lowCu`: the low-surrogate code unit to be tested.
* Returns
  * `true` if the specified high and low-surrogate code values represent a valid surrogate pair; `false` otherwise.
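
A pair check of this kind can be sketched as two range tests, using the surrogate ranges listed among the constants above. `isSurrogatePairSketch` is a hypothetical name used only for this illustration.

```javascript
// Hypothetical helper: true when highCu is in 0xD800..0xDBFF and lowCu is in 0xDC00..0xDFFF.
function isSurrogatePairSketch(highCu, lowCu) {
  var highOk = highCu >= 0xD800 && highCu <= 0xDBFF;
  var lowOk = lowCu >= 0xDC00 && lowCu <= 0xDFFF;
  return highOk && lowOk;
}

console.log(isSurrogatePairSketch(0xD83D, 0xDE00)); // true
console.log(isSurrogatePairSketch(0xDE00, 0xD83D)); // false (order matters)
```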

`$3`

Returns the trailing surrogate (a low surrogate code unit) of the surrogate pair representing the specified supplementary code point in the UTF-16 encoding. If the specified code point is not a supplementary character, an unspecified code unit is returned.

* Arguments
  * `cp`: a supplementary code point.
* Returns
  * the trailing surrogate code unit used to represent the character in the UTF-16 encoding.
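
The trailing surrogate also follows from standard UTF-16 arithmetic, sketched below. `lowSurrogateSketch` is a hypothetical name, and the result is only meaningful for supplementary code points.

```javascript
// Hypothetical helper: low 10 bits of (cp - 0x10000), offset into the low-surrogate range.
function lowSurrogateSketch(cp) {
  return 0xDC00 + ((cp - 0x10000) & 0x3FF);
}

console.log(lowSurrogateSketch(0x1F600).toString(16)); // de00
```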

`$3`

Returns the index within the given `str` that is offset from the given `index` by `codePointOffset` code points. Unpaired surrogates within the text range given by `index` and `codePointOffset` count as one code point each.

* Arguments
  * `str`: the string.
  * `index`: the index to be offset.
  * `codePointOffset`: the offset in code points.
* Returns
  * the index within the `str`. `-1` if `index` is negative or larger than the length of the `str`, or if `codePointOffset` is positive and the subsequence starting with `index` has fewer than `codePointOffset` code points, or if `codePointOffset` is negative and the subsequence before `index` has fewer than the absolute value of `codePointOffset` code points.
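
The stepping rule and the `-1` cases above can be sketched as follows. This is an illustration of the documented behaviour under the name `offsetByCodePointsSketch`, which is hypothetical, not the module's source.

```javascript
// Hypothetical helper: moves `index` by `codePointOffset` code points, or returns -1.
function offsetByCodePointsSketch(str, index, codePointOffset) {
  if (index < 0 || index > str.length) return -1;
  function isHigh(cu) { return cu >= 0xD800 && cu <= 0xDBFF; }
  function isLow(cu) { return cu >= 0xDC00 && cu <= 0xDFFF; }
  var i = index;
  var n = codePointOffset;
  // Forward: consume one code point per step, two code units for a full pair.
  for (; n > 0; n--) {
    if (i >= str.length) return -1;
    var cu = str.charCodeAt(i++);
    if (isHigh(cu) && i < str.length && isLow(str.charCodeAt(i))) i++;
  }
  // Backward: same idea, reading the code unit just before the current index.
  for (; n < 0; n++) {
    if (i <= 0) return -1;
    var prev = str.charCodeAt(--i);
    if (isLow(prev) && i > 0 && isHigh(str.charCodeAt(i - 1))) i--;
  }
  return i;
}

var sample = 'a\uD83D\uDE00b';
console.log(offsetByCodePointsSketch(sample, 0, 2));  // 3
console.log(offsetByCodePointsSketch(sample, 4, -2)); // 1
console.log(offsetByCodePointsSketch(sample, 0, 4));  // -1
```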

`$3`

Converts the specified Unicode code point to its UTF-16 representation stored in an array of code units. If the specified code point is a BMP (Basic Multilingual Plane or Plane 0) value, the resulting array contains the same value as `cp`. If the specified code point is a supplementary code point, the resulting array contains the corresponding surrogate pair.

* Arguments
  * `cp`: a Unicode code point.
* Returns
  * an array of code units holding the UTF-16 representation of `cp`.
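
A small sketch of the two cases described above, using standard UTF-16 arithmetic. `toCodeUnitsSketch` is a hypothetical name chosen for this example.

```javascript
// Hypothetical helper: [cp] for BMP values, [high, low] for supplementary code points.
function toCodeUnitsSketch(cp) {
  if (cp < 0x10000) return [cp];
  var offset = cp - 0x10000;
  return [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)];
}

console.log(toCodeUnitsSketch(0x41));    // [ 65 ]
console.log(toCodeUnitsSketch(0x1F600)); // [ 55357, 56832 ]
```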

`$3`

Converts the specified surrogate pair to its supplementary code point value. This method does not validate the specified surrogate pair. The caller must validate it using `codepoint.isSurrogatePair()` if necessary.

* Arguments
  * `highSurrogate`: the high-surrogate code unit.
  * `lowSurrogate`: the low-surrogate code unit.
* Returns
  * the supplementary code point composed from the specified surrogate pair.
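
The composition is the inverse of the surrogate split and can be sketched as below. `toCodePointSketch` is a hypothetical name; like the function described above, it performs no validation.

```javascript
// Hypothetical helper: combines the 10 payload bits of each surrogate and adds 0x10000.
function toCodePointSketch(highSurrogate, lowSurrogate) {
  return ((highSurrogate - 0xD800) << 10) + (lowSurrogate - 0xDC00) + 0x10000;
}

console.log(toCodePointSketch(0xD83D, 0xDE00).toString(16)); // 1f600
```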

License

node-codeunit is licensed under the
MIT license.
