A parser for files in the Unicode database. Produces a giant array of code point objects for
every character represented by Unicode, with many properties derived from files in the Unicode
database.
BUILD SCRIPTS ONLY: Use in production is not recommended,
as the parsers are not optimized for speed, the text files are huge, and the resulting array uses a
huge amount of memory. To access this data in real-world applications, use modules that have
precompiled the data into a compressed form.
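As a rough illustration of the kind of precompilation a build script might do, the sketch below collapses one property into runs of consecutive code points. It takes the array as a parameter so it can be tried on a small hand-written sample; in a real build script the input would come from `require('codepoints')`. The helper name `buildRanges` and the `[start, end, value]` output shape are assumptions made for this example and are not part of this package.

```js
// Hypothetical build-time helper (not part of codepoints): collapse a
// per-code-point property into [start, end, value] runs.
function buildRanges(codepoints, prop) {
  const ranges = [];
  let current = null;
  for (let i = 0; i < codepoints.length; i++) {
    const cp = codepoints[i];
    // Unassigned code points are undefined in the array; they end the current run.
    const value = cp ? cp[prop] : undefined;
    if (value === undefined) {
      current = null;
    } else if (current && current[2] === value && current[1] === i - 1) {
      current[1] = i; // extend the current run by one code point
    } else {
      current = [i, i, value]; // start a new run
      ranges.push(current);
    }
  }
  return ranges;
}

// Tiny hand-written sample standing in for the real array (illustration only).
const sample = [];
sample[0x41] = { code: 0x41, category: 'Lu' };
sample[0x42] = { code: 0x42, category: 'Lu' };
sample[0x61] = { code: 0x61, category: 'Ll' };

const ranges = buildRanges(sample, 'category');
console.log(JSON.stringify(ranges));
```

A table like this is much smaller than the full array, and a runtime lookup can binary-search it by code point. Whether run-length ranges are the right encoding depends on the property being stored.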
Install using npm:

    npm install codepoints
Basic usage:
```js
const codepoints = require('codepoints');
```
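A small lookup sketch, assuming the array is indexed by code point (which the `code` property and the `undefined` entries for unassigned code points suggest). The `lookup` helper and the hand-written `sample` array are illustrative stand-ins, not part of this package; with the package installed, `sample` would be replaced by `require('codepoints')`.

```js
// Hypothetical helper (not part of codepoints): find the entry for the
// first character of a string, assuming the array index is the code point.
function lookup(codepoints, ch) {
  const code = ch.codePointAt(0); // also handles characters outside the BMP
  return codepoints[code]; // undefined for unassigned code points
}

// Hand-written stand-in for require('codepoints'), for illustration only.
const sample = [];
sample[0x41] = { code: 0x41, name: 'LATIN CAPITAL LETTER A', category: 'Lu' };

const entry = lookup(sample, 'A');
console.log(entry ? entry.name : 'unassigned');

const missing = lookup(sample, 'B'); // not present in the sample
console.log(missing ? missing.name : 'unassigned');
```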
The parser generates data by reading the text files contained in the
Unicode Character Database (UCD). By default, it will use the database
bundled with this package. To use a custom version of the UCD, use `codepoints/parser` instead,
which accepts an optional path to a directory containing the uncompressed UCD data:
```js
const parser = require('codepoints/parser');
const codepoints = parser('/path/to/UCD');
```
Each element in the generated array is either undefined (for unassigned code
points), or an object containing the following properties:
* `code` - the code point index
* `name` - character name
* `unicode1Name` - legacy name used by Unicode 1
* `category` - Unicode category
* `block` - the block name this character is a part of
* `script` - the script this character belongs to
* `eastAsianWidth` - the East Asian width for this character
* `combiningClass` - numeric combining class value
* `combiningClassName` - a string name for the combining class
* `bidiClass` - class for the Unicode bidirectional algorithm
* `bidiMirrored` - whether the character is mirrored in the bidi algorithm
* `numeric` - the numeric value for this character
* `uppercase` - an array of code points mapping this character to upper case, if any
* `lowercase` - an array of code points mapping this character to lower case, if any
* `titlecase` - an array of code points mapping this character to title case, if any
* `folded` - an array of code points mapping this character to a folded equivalent, if any
* `caseConditions` - conditions used during case mapping for this character
* `decomposition` - an array of code points that this character decomposes into. Used by the Unicode normalization algorithm.
* `compositions` - a dictionary mapping of compositions for this character
* `isCompat` - whether the decomposition is a compatibility one
* `isExcluded` - whether the character is excluded from composition
* `NFC_QC` - quickcheck value for NFC (0 = YES, 1 = NO, 2 = MAYBE)
* `NFKC_QC` - quickcheck value for NFKC (0 = YES, 1 = NO, 2 = MAYBE)
* `NFD_QC` - quickcheck value for NFD (0 = YES, 1 = NO)
* `NFKD_QC` - quickcheck value for NFKD (0 = YES, 1 = NO)
* `joiningType` - Arabic joining type
* `joiningGroup` - Arabic joining group
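To show how a few of these properties might be consumed, here is a minimal sketch that upper-cases a string through the `uppercase` arrays and decodes a quickcheck value using the numbering listed above. The helpers `toUpper` and `quickCheckName` are hypothetical, and the `sample` entries are written by hand for illustration; the exact values the parser produces (for example, whether multi-character mappings appear in `uppercase`) should be checked against the real data.

```js
// Hypothetical helper (not part of codepoints): upper-case a string using
// the `uppercase` arrays, leaving characters without a mapping unchanged.
function toUpper(codepoints, str) {
  let out = '';
  for (const ch of str) { // iterates by code point, not by UTF-16 unit
    const entry = codepoints[ch.codePointAt(0)];
    const mapped = entry && entry.uppercase;
    out += mapped && mapped.length ? String.fromCodePoint(...mapped) : ch;
  }
  return out;
}

// Decode a quickcheck value: 0 = YES, 1 = NO, 2 = MAYBE.
const QC_NAMES = ['YES', 'NO', 'MAYBE'];
function quickCheckName(value) {
  return QC_NAMES[value];
}

// Hand-written sample entries, for illustration only.
const sample = [];
sample[0x61] = { code: 0x61, name: 'LATIN SMALL LETTER A', uppercase: [0x41], NFC_QC: 0 };
sample[0xdf] = { code: 0xdf, name: 'LATIN SMALL LETTER SHARP S', uppercase: [0x53, 0x53], NFC_QC: 0 };

console.log(toUpper(sample, 'ß-a'));
console.log(quickCheckName(sample[0x61].NFC_QC));
```

Characters with no entry or no `uppercase` mapping, such as the hyphen in this sample, pass through untouched.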
MIT