Provides fast access to Unicode character properties.
> Fast lookup of Unicode character metadata packaged as modern ES modules.
@chr33s/pdf-unicode-properties is part of the chr33s/pdf monorepo and continues the
Hopding/unicode-properties fork of the original foliojs project. This edition:
- ships as native ES modules only, with NodeNext resolution (Node.js 18+ or a modern bundler required),
- is authored in TypeScript with generated declaration files, and
- keeps the compressed trie assets embedded for seamless usage across Node.js, browsers, and React Native.
The package uses @chr33s/pdf-unicode-trie to compress the properties for all code points into just 12 KB.
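One common way to keep such a table compact is to pack several small properties into a single integer per code point and store those integers in the trie. The sketch below illustrates only that general bit-packing idea. The field widths, the truncated `CATEGORIES` sample list, and the helper names `pack`, `unpackCategory`, and `unpackCombiningClass` are hypothetical and are not taken from this package.

```ts
// Hypothetical bit layout, for illustration only; the package's real layout may differ.
const CATEGORY_BITS = 5; // enough for the 30 Unicode general categories
const COMBINING_BITS = 8; // canonical combining classes fit in 0..255
const CATEGORY_SHIFT = COMBINING_BITS;
const CATEGORY_MASK = (1 << CATEGORY_BITS) - 1;
const COMBINING_MASK = (1 << COMBINING_BITS) - 1;

// Truncated sample list; an index into this array is what gets packed.
const CATEGORIES: string[] = ["Lu", "Ll", "Nd", "Mn", "Zs"];

// Combine a category index and a combining class into one integer.
function pack(categoryIndex: number, combiningClass: number): number {
  return (
    ((categoryIndex & CATEGORY_MASK) << CATEGORY_SHIFT) |
    (combiningClass & COMBINING_MASK)
  );
}

// Recover the category name; unknown indexes fall back to "Cn" (unassigned).
function unpackCategory(value: number): string {
  return CATEGORIES[(value >> CATEGORY_SHIFT) & CATEGORY_MASK] ?? "Cn";
}

// Recover the combining class from the low bits.
function unpackCombiningClass(value: number): number {
  return value & COMBINING_MASK;
}
```

For example, U+0301 COMBINING ACUTE ACCENT has category `Mn` and combining class 230, so under this assumed layout `pack(3, 230)` stores both values in one integer that the two `unpack` helpers can read back.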
```ts
import unicodeProperties, {
  getCategory,
  getNumericValue,
} from "@chr33s/pdf-unicode-properties";

getCategory("2".codePointAt(0) ?? 0); //=> 'Nd'
getNumericValue("2".codePointAt(0) ?? 0); //=> 2

// The default export bundles all helpers together when that is convenient.
unicodeProperties.isDigit("9".codePointAt(0) ?? 0); //=> true
```
```bash
npm install @chr33s/pdf-unicode-properties
```
The package is distributed as native ES modules. Use Node.js 18+ or configure your bundler to resolve NodeNext-style imports.
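The helpers take a numeric code point rather than a string, so classifying a whole string means walking it code point by code point. The following is a minimal sketch of that loop. `mapCodePoints` and the `Lookup` type are hypothetical names introduced here, and the lookup is passed in as a parameter so the snippet runs without the package installed.

```ts
// `lookup` stands in for any per-code-point helper, such as getCategory.
type Lookup<T> = (codePoint: number) => T;

function mapCodePoints<T>(text: string, lookup: Lookup<T>): T[] {
  const results: T[] = [];
  for (let i = 0; i < text.length; ) {
    // codePointAt reads a full surrogate pair when one starts at index i.
    const codePoint = text.codePointAt(i) ?? 0;
    results.push(lookup(codePoint));
    // Code points above U+FFFF occupy two UTF-16 code units.
    i += codePoint > 0xffff ? 2 : 1;
  }
  return results;
}
```

With the package installed, `mapCodePoints("42", getCategory)` would apply `getCategory` to each of the two code points in turn. Advancing by two code units for supplementary characters keeps the second half of a surrogate pair from being looked up as a separate character.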
- `getCategory(codePoint)`: Returns the Unicode general category for the given code point.
- `getScript(codePoint)`: Returns the script for the given code point.
- `getCombiningClass(codePoint)`: Returns the canonical combining class for the given code point.
- `getEastAsianWidth(codePoint)`: Returns the East Asian width for the given code point.
- `getNumericValue(codePoint)`: Returns the numeric value for the given code point, or null if that code point has no numeric value.
- `isAlphabetic(codePoint)`: Returns whether the code point is an alphabetic character.
- `isDigit(codePoint)`: Returns whether the code point is a digit.
- `isPunctuation(codePoint)`: Returns whether the code point is a punctuation character.
- `isLowerCase(codePoint)`: Returns whether the code point is lower case.
- `isUpperCase(codePoint)`: Returns whether the code point is upper case.
- `isTitleCase(codePoint)`: Returns whether the code point is title case.
- `isWhiteSpace(codePoint)`: Returns whether the code point is whitespace: specifically, whether the category is one of Zs, Zl, or Zp.
- `isBaseForm(codePoint)`: Returns whether the code point is a base form. A code point of base form does not graphically combine with preceding characters.
- `isMark(codePoint)`: Returns whether the code point is a mark character (e.g. an accent).
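The whitespace check described above is defined purely in terms of the general category, which suggests how such predicates can be layered on a category lookup. The sketch below shows that idea under stated assumptions. `makeIsWhiteSpace`, `CategoryLookup`, and the small `sampleCategories` table are hypothetical stand-ins, not the package's implementation.

```ts
type CategoryLookup = (codePoint: number) => string;

// The three separator categories named in the whitespace description.
const SEPARATOR_CATEGORIES = new Set<string>(["Zs", "Zl", "Zp"]);

// Build a whitespace predicate on top of any category lookup.
function makeIsWhiteSpace(
  getCategory: CategoryLookup,
): (codePoint: number) => boolean {
  return (codePoint) => SEPARATOR_CATEGORIES.has(getCategory(codePoint));
}

// Tiny stand-in table for demonstration; a real lookup would come from the package.
const sampleCategories: Record<number, string> = {
  0x20: "Zs", // SPACE
  0x2028: "Zl", // LINE SEPARATOR
  0x2029: "Zp", // PARAGRAPH SEPARATOR
  0x09: "Cc", // CHARACTER TABULATION
  0x41: "Lu", // LATIN CAPITAL LETTER A
};

const sampleGetCategory: CategoryLookup = (codePoint) =>
  sampleCategories[codePoint] ?? "Cn";

const isWhiteSpaceSketch = makeIsWhiteSpace(sampleGetCategory);
```

One consequence of a category-based definition is that control characters such as tab (category Cc) are not treated as whitespace, so `isWhiteSpaceSketch(0x09)` is false here while `isWhiteSpaceSketch(0x20)` is true.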