[Unicode 17.0.0](https://www.unicode.org/versions/Unicode17.0.0/)
A high-performance JavaScript library (powered by WebAssembly) for segmenting Unicode strings into grapheme clusters (user-perceived characters) according to UAX #29: Unicode Text Segmentation.
```bash
npm install grapheme-cluster-break
```

```javascript
import { segmentGraphemeClusters } from "grapheme-cluster-break";

// Note: init() must have been called once before segmenting.

// Basic usage
const clusters = segmentGraphemeClusters("Hello");
console.log(clusters); // ['H', 'e', 'l', 'l', 'o']

// Emoji ZWJ sequences
const family = segmentGraphemeClusters("👨‍👩‍👧‍👦");
console.log(family); // ['👨‍👩‍👧‍👦']

// Combining characters
const accent = segmentGraphemeClusters("e\u0301"); // e + combining acute (U+0301)
console.log(accent); // ['é']

// Regional indicators (flags)
const flags = segmentGraphemeClusters("🇨🇳🇺🇸");
console.log(flags); // ['🇨🇳', '🇺🇸']

// Indic conjuncts
const indic = segmentGraphemeClusters("क्ष");
console.log(indic); // ['क्ष']

// CJK characters
const cjk = segmentGraphemeClusters("你好世界");
console.log(cjk); // ['你', '好', '世', '界']

// Hangul
const hangul = segmentGraphemeClusters("한글");
console.log(hangul); // ['한', '글']
```
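
Several of the examples above return clusters that span more than one code point. To see why `String.prototype.length` or code point iteration cannot stand in for grapheme segmentation, the following sketch counts the same string three ways. It uses only JavaScript built-ins: `Intl.Segmenter` is the engine's own segmenter, used here purely as a stand-in so the snippet runs without this package, and `countUnits` is a hypothetical helper name.

```javascript
// Family emoji: four people joined by three zero-width joiners (U+200D).
const family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}\u200D\u{1F466}";

// Hypothetical helper (not part of grapheme-cluster-break): counts one string
// as UTF-16 code units, Unicode code points, and grapheme clusters.
function countUnits(text) {
  // Built-in segmenter, used only as a stand-in for this library.
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  return {
    codeUnits: text.length, // UTF-16 code units
    codePoints: [...text].length, // string iteration yields code points
    graphemes: [...segmenter.segment(text)].length, // user-perceived characters
  };
}

console.log(countUnits(family)); // { codeUnits: 11, codePoints: 7, graphemes: 1 }
console.log(countUnits("e\u0301")); // { codeUnits: 2, codePoints: 2, graphemes: 1 }
```

The built-in segmenter follows whatever Unicode data the JavaScript engine ships with, so its results for recently added characters may differ from this library's.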
Full TypeScript support with type definitions included:
```typescript
import { segmentGraphemeClusters } from "grapheme-cluster-break";
const clusters: string[] = segmentGraphemeClusters("👨‍👩‍👧‍👦");
```
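
Because the result is a plain array of strings, ordinary array methods are enough for tasks such as truncating text without cutting a cluster in half. A minimal sketch, assuming the clusters have already been produced by `segmentGraphemeClusters`. Here `truncateClusters` is a hypothetical helper, and the sample array is hand-written rather than actual library output:

```javascript
// Hypothetical helper (not part of grapheme-cluster-break): keeps at most
// `maxClusters` grapheme clusters and appends an ellipsis if anything was cut.
function truncateClusters(clusters, maxClusters, ellipsis = "\u2026") {
  if (clusters.length <= maxClusters) {
    return clusters.join(""); // nothing to cut
  }
  return clusters.slice(0, maxClusters).join("") + ellipsis;
}

// Hand-written stand-in for a segmentation result:
// "e" + combining acute, "a", and one flag made of two regional indicators.
const clusters = ["e\u0301", "a", "\u{1F1E8}\u{1F1F3}"];

console.log(truncateClusters(clusters, 2)); // "e\u0301a\u2026": two clusters plus an ellipsis
console.log(truncateClusters(clusters, 3)); // unchanged, all three clusters fit
```

Slicing the cluster array rather than the string keeps the combining accent attached to its base letter and never splits the flag into two lone regional indicators.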
`init()` initializes the WebAssembly module. It must be called once before using `segmentGraphemeClusters`.

`segmentGraphemeClusters` segments a string into grapheme clusters.
Parameters:
- `s` - The input string to segment.
- `extended` (optional, default: `true`) - If `true`, uses extended grapheme cluster rules. If `false`, uses legacy rules.
Returns:
- An array of strings, each representing one grapheme cluster.
Throws:
- Error if init() has not been called.
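
Since `segmentGraphemeClusters` throws when `init()` has not run, one way to avoid ordering bugs is to route every call through a wrapper that triggers initialization on first use. The sketch below is an illustration under stated assumptions, not part of the package: `createLazySegmenter` is a hypothetical name, both functions are passed in as arguments, and because the text above does not say whether `init()` returns a promise, the wrapper awaits its result so that either case works.

```javascript
// Hypothetical helper (not part of grapheme-cluster-break). `init` and
// `segment` are injected, so nothing is assumed about the real exports
// beyond what the API description states.
function createLazySegmenter(init, segment) {
  let ready = null; // promise for the single init() call, created on first use

  return async function segmentWhenReady(s, extended = true) {
    if (ready === null) {
      // Promise.resolve accepts a plain return value as well as a promise.
      ready = Promise.resolve(init());
    }
    await ready;
    return segment(s, extended);
  };
}

// Stand-ins so the sketch runs without the WebAssembly module.
let initCalls = 0;
const fakeInit = () => { initCalls += 1; };
const fakeSegment = (s) => Array.from(s); // splits by code point, NOT by grapheme cluster

const segmentWhenReady = createLazySegmenter(fakeInit, fakeSegment);

Promise.all([segmentWhenReady("Hi"), segmentWhenReady("ok")]).then((results) => {
  console.log(results); // the two arrays returned by fakeSegment
  console.log(initCalls); // 1: init ran once even though there were two calls
});
```

With the real package, the two arguments would presumably be the imported `init` and `segmentGraphemeClusters`. The cost of this design is that every call becomes asynchronous, so calling `init()` once at startup may be simpler when a synchronous call site matters.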

Building from source requires:

- Emscripten
- Node.js 18+
- CMake 4.0+
```bash
# Build WASM
npm run build
```
Browser Usage
The library works in modern browsers that support WebAssembly and ES modules.

MIT License