TypeScript utilities for Uyghur language processing - transliteration, slug generation, script detection, and more
npm install uyghur-utilsTypeScript utilities for Uyghur language processing - transliteration, slug generation, script detection, and more.




- Transliteration - Convert between Uyghur Arabic script and ULY (Uyghur Latin Yéziqi)
- Slug Generation - Generate URL-friendly slugs from Uyghur text
- Script Detection - Detect if text contains Uyghur, Arabic, Latin, Chinese, or Cyrillic
- Text Normalization - Normalize Uyghur text for consistent processing
- Number Conversion - Convert between Arabic-Indic and Western numerals
- RTL Support - Detect text direction for proper rendering
``bashnpm
npm install uyghur-utils
Usage
$3
Convert between Uyghur Arabic script and Latin (ULY):
`typescript
import { toULY, toArabic, transliterate } from 'uyghur-utils';// Arabic to Latin (ULY)
toULY('نوزۇگۇم'); // 'nozugum'
toULY('سالام دۇنيا'); // 'salam dunya'
toULY('ئۇيغۇرچە'); // 'uyghurche'
// Latin to Arabic
toArabic('salam'); // 'سالام'
toArabic('nozugum'); // 'نوزۇگۇم'
// Auto-detect and transliterate
transliterate('نوزۇگۇم'); // 'nozugum' (Arabic → Latin)
transliterate('salam'); // 'سالام' (Latin → Arabic)
`$3
Generate URL-friendly slugs:
`typescript
import { generateSlug, slugify, isValidSlug } from 'uyghur-utils';// From Uyghur text
generateSlug('نوزۇگۇم داستانى'); // 'nozugum-dastani'
generateSlug('ئۇيغۇر مەدەنىيىتى'); // 'uyghur-medeniyiti'
// From English text
generateSlug('The Story of Nozugum'); // 'the-story-of-nozugum'
// With options
generateSlug('Very Long Title Here', { maxLength: 10 }); // 'very-long'
// Validate slugs
isValidSlug('nozugum-dastani'); // true
isValidSlug('Invalid Slug'); // false
`$3
Detect the script/language of text:
`typescript
import {
detectScript,
containsArabicScript,
containsUyghurChars,
isLikelyUyghur,
isRTL,
getTextDirection,
} from 'uyghur-utils';// Detect script type
detectScript('نوزۇگۇم'); // 'uyghur'
detectScript('Hello'); // 'latin'
detectScript('你好'); // 'chinese'
detectScript('Привет'); // 'cyrillic'
// Check for specific scripts
containsArabicScript('نوزۇگۇم'); // true
containsUyghurChars('نوزۇگۇم'); // true (has Uyghur-specific chars)
isLikelyUyghur('نوزۇگۇم'); // true
// RTL detection
isRTL('نوزۇگۇم داستانى'); // true
getTextDirection('نوزۇگۇم'); // 'rtl'
getTextDirection('Hello'); // 'ltr'
`$3
Normalize text for consistent processing:
`typescript
import {
normalizeUyghur,
normalizeForSearch,
removeDiacritics,
areEquivalent,
} from 'uyghur-utils';// Basic normalization
normalizeUyghur('سالام\u200c'); // 'سالام' (removes ZWNJ)
// Search normalization (removes hamza, diacritics, punctuation)
normalizeForSearch('ئۇيغۇرچە'); // 'ۇيغۇرچە'
// Compare texts after normalization
areEquivalent('ئۇيغۇر', 'ۇيغۇر'); // true
`$3
Convert between numeral systems:
`typescript
import {
toWesternNumerals,
toArabicIndicNumerals,
extractNumbers,
formatNumber,
} from 'uyghur-utils';// Arabic-Indic to Western
toWesternNumerals('٢٠٢٤'); // '2024'
toWesternNumerals('سال: ٢٠٢٤'); // 'سال: 2024'
// Western to Arabic-Indic
toArabicIndicNumerals('2024'); // '٢٠٢٤'
// Extract numbers from text
extractNumbers('٢٠٢٤-يىل ٥-ئاي'); // [2024, 5]
// Format numbers
formatNumber(2024, 'arabic-indic'); // '٢٠٢٤'
`API Reference
$3
| Function | Description |
|----------|-------------|
|
toULY(text) | Convert Arabic script to Latin (ULY) |
| toArabic(text) | Convert Latin (ULY) to Arabic script |
| transliterate(text) | Auto-detect and convert |$3
| Function | Description |
|----------|-------------|
|
generateSlug(text, options?) | Generate URL-friendly slug |
| slugify(text, options?) | Alias for generateSlug |
| isValidSlug(slug) | Check if slug is valid |
| sanitizeSlug(slug) | Clean up a slug |$3
| Function | Description |
|----------|-------------|
|
detectScript(text) | Detect primary script |
| containsArabicScript(text) | Check for Arabic script |
| containsUyghurChars(text) | Check for Uyghur-specific chars |
| isLikelyUyghur(text) | Check if text is likely Uyghur |
| isRTL(text) | Check if text is RTL |
| getTextDirection(text) | Get 'rtl' or 'ltr' |$3
| Function | Description |
|----------|-------------|
|
normalizeUyghur(text) | Basic normalization |
| normalizeForSearch(text) | Aggressive search normalization |
| removeDiacritics(text) | Remove diacritical marks |
| removeHamza(text) | Remove hamza characters |$3
| Function | Description |
|----------|-------------|
|
toWesternNumerals(text) | Convert to Western numerals |
| toArabicIndicNumerals(text) | Convert to Arabic-Indic |
| extractNumbers(text) | Extract all numbers |
| formatNumber(num, system) | Format with numeral system |Uyghur Alphabet Reference
The Uyghur language uses Arabic script with 32 letters:
| Arabic | Latin (ULY) | Name |
|--------|-------------|------|
| ا | a | Alef |
| ە | e | E |
| ب | b | Be |
| پ | p | Pe |
| ت | t | Te |
| ج | j | Je |
| چ | ch | Che |
| خ | x | Xe |
| د | d | De |
| ر | r | Re |
| ز | z | Ze |
| ژ | zh | Zhe |
| س | s | Se |
| ش | sh | She |
| غ | gh | Ghe |
| ف | f | Fe |
| ق | q | Qe |
| ك | k | Ke |
| ڭ | ng | Nge |
| گ | g | Ge |
| ل | l | Le |
| م | m | Me |
| ن | n | Ne |
| ھ | h | He |
| و | o | O |
| ۇ | u | U |
| ۆ | ö | Ö |
| ۈ | ü | Ü |
| ۋ | w | We |
| ې | é | É |
| ى | i | I |
| ي | y | Ye |
Contributing
Contributions are welcome! Please follow these guidelines:
$3
This project follows the Conventional Commits specification.
Format:
Types:
-
feat: New feature (triggers minor version bump)
- fix: Bug fix (triggers patch version bump)
- docs: Documentation changes
- style: Code style changes (formatting, etc.)
- refactor: Code refactoring
- test: Adding or updating tests
- chore: Maintenance tasksExamples:
`bash
feat: add support for Cyrillic script transliteration
fix(slug): handle empty string input correctly
docs: update API reference for toULY function
test: add edge case tests for number conversion
`$3
`bash
Install dependencies
pnpm installRun tests
pnpm testBuild
pnpm buildType check
pnpm typecheck
``MIT © azataiot