TypeScript definitions for tiny-segmenter
WebVTT parser, compiler, and segmenter with HLS support
This repo builds a .wasm module using icu4c for breaking text into words, so that the [Intl Segmenter Proposal](https://github.com/tc39/proposal-intl-segmenter) can be polyfilled with full compatibility, even in browsers that do not expose the v8BreakIterator API.
A small chunk segmenter.
Work with graphemes, words, and sentences through a small, simple, and fast API built on Intl.Segmenter
A lightweight implementation of the Unicode Text Segmentation (UAX #29)
Polyfill for Intl.Segmenter
Super compact Japanese tokenizer in JavaScript. http://chasen.org/~taku/software/TinySegmenter/
Lightweight Japanese word segmenter
A Node.js implementation of the IKAnalyzer Chinese segmenter.
MP4 video file segmenter for MPEG-DASH usage, based on MP4Box
unicode-segmenter for miniprogram
Split a string into sentences. Supports multiple languages.
Segments Bluesky's rich text facets into tokens
A polyfill for Intl.Segmenter
`data-segmenter` lets package consumers define segments from their data, regardless of the backend data source (such as MongoDB or SQL), and provide those segments to a client or user on the frontend.
SRT parser, compiler, and segmenter with HLS support
Clause segmentation extension for GLOST - segments sentences into clauses
Locale-aware word counting powered by the Web API [`Intl.Segmenter`](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Intl/Segmenter). The script automatically detects the primary writing system for each portion of the input, seg
A recursive segmenter that identifies separate words in Chinese or other Eastern-language text