Georgian Language Hyphenation Library - Browser + Node.js compatible
Installation
`bash
npm install georgian-hyphenation
`
Usage
ES Modules
`javascript
import GeorgianHyphenator from 'georgian-hyphenation';
const hyphenator = new GeorgianHyphenator();
// Basic hyphenation
console.log(hyphenator.hyphenate('საქართველო'));
// Output: საქართველო (soft hyphens, U+00AD, sit at the syllable breaks but render invisibly)
// Get syllables as array
console.log(hyphenator.getSyllables('თბილისი'));
// Output: ['თბი', 'ლი', 'სი']
// Hyphenate entire text
const text = 'საქართველო არის ძალიან ლამაზი ქვეყანა';
console.log(hyphenator.hyphenateText(text));
`
CommonJS
`javascript
const GeorgianHyphenator = require('georgian-hyphenation');
const hyphenator = new GeorgianHyphenator();
console.log(hyphenator.hyphenate('კომპიუტერი'));
`
Browser
`html
`
Or without modules:
`html
`
API
$3
`javascript
// Signature: new GeorgianHyphenator(hyphenChar = '\u00AD')
const hyphenator = new GeorgianHyphenator('\u00AD');
`
Parameters:
- hyphenChar (optional): Character to use for hyphenation. Default is soft hyphen (\u00AD)
$3
#### hyphenate(word)
Hyphenates a single word.
`javascript
hyphenator.hyphenate('საქართველო');
// Returns: 'საქართველო' with a soft hyphen (U+00AD) at each syllable break
`
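Because the default hyphen character is invisible, a returned string can look identical to the input. The helper below is a small debugging sketch, not part of the library: `revealSoftHyphens` is a hypothetical name, and the sample string is built by hand using the break positions shown in the Custom Hyphen Character section.

```javascript
// Hypothetical debugging helper (not a library method):
// replaces every soft hyphen (U+00AD) with a visible marker.
function revealSoftHyphens(text, marker = '-') {
  return text.replace(/\u00AD/g, marker);
}

// Hand-built sample: the word with soft hyphens at its syllable breaks.
const sample = 'სა\u00ADქარ\u00ADთვე\u00ADლო';

console.log(sample.length);             // prints: 13 (10 letters + 3 soft hyphens)
console.log(revealSoftHyphens(sample)); // prints: სა-ქარ-თვე-ლო
```

Passing the result of `hyphenate()` through a helper like this is a quick way to check where the breaks actually landed.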
#### getSyllables(word)
Returns an array of syllables.
`javascript
hyphenator.getSyllables('თბილისი');
// Returns: ['თბი', 'ლი', 'სი']
`
#### hyphenateText(text)
Hyphenates all words in a text string.
`javascript
hyphenator.hyphenateText('საქართველო არის ლამაზი');
// Returns: 'საქართველო არის ლამაზი' with soft hyphens (U+00AD) inside each word
`
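One way to picture text-level hyphenation is as a search-and-replace over runs of Georgian letters, leaving spaces, punctuation, and non-Georgian words untouched. The sketch below makes that assumption explicit. `hyphenateTextSketch` and the `mark` stand-in are hypothetical names, and the library's own tokenisation may differ.

```javascript
// Mkhedruli letters: U+10D0..U+10FA and U+10FD..U+10FF.
// (U+10FB is punctuation and U+10FC is a modifier letter, so both are skipped.)
const GEORGIAN_WORD = /[\u10D0-\u10FA\u10FD-\u10FF]+/g;

// Hypothetical sketch: apply a per-word function to every Georgian word in the text.
function hyphenateTextSketch(text, hyphenateWord) {
  return text.replace(GEORGIAN_WORD, (word) => hyphenateWord(word));
}

// Stand-in for a real per-word hyphenator, so the sketch runs on its own.
const mark = (word) => `<${word}>`;

console.log(hyphenateTextSketch('საქართველო, Tbilisi 2024!', mark));
// prints: <საქართველო>, Tbilisi 2024!
```

With the real package, `hyphenator.hyphenate.bind(hyphenator)` could be passed in place of `mark`.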
#### loadLibrary(data)
Load custom exception dictionary.
`javascript
const customWords = {
  'განათლება': 'გა-ნათ-ლე-ბა',
  'უნივერსიტეტი': 'უ-ნი-ვერ-სი-ტე-ტი'
};
hyphenator.loadLibrary(customWords);
`
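The dictionary format above stores each word with `-` at its break points. A plausible way to use such an entry is to look the word up first and swap the `-` markers for the configured hyphen character. This is only a sketch of that idea: `lookupException` is a hypothetical helper, and how the library consults its dictionary internally is not documented here.

```javascript
// Hypothetical helper: returns the dictionary pattern for `word` with '-'
// replaced by `hyphenChar`, or null when the word has no entry.
function lookupException(word, exceptions, hyphenChar = '\u00AD') {
  if (!Object.prototype.hasOwnProperty.call(exceptions, word)) {
    return null;
  }
  return exceptions[word].split('-').join(hyphenChar);
}

const exceptionWords = {
  'განათლება': 'გა-ნათ-ლე-ბა',
  'უნივერსიტეტი': 'უ-ნი-ვერ-სი-ტე-ტი'
};

console.log(lookupException('განათლება', exceptionWords, '•')); // prints: გა•ნათ•ლე•ბა
console.log(lookupException('თბილისი', exceptionWords, '•'));   // prints: null
```

A `null` result would be the signal to fall back to rule-based hyphenation.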
#### async loadDefaultLibrary()
Load the default exception dictionary (browser only, requires network).
`javascript
await hyphenator.loadDefaultLibrary();
`
Custom Hyphen Character
You can use any character for hyphenation:
`javascript
// Visible hyphen
const hyphenator = new GeorgianHyphenator('-');
console.log(hyphenator.hyphenate('საქართველო'));
// Output: 'სა-ქარ-თვე-ლო'
// Custom separator
const hyphenator2 = new GeorgianHyphenator('•');
console.log(hyphenator2.hyphenate('საქართველო'));
// Output: 'სა•ქარ•თვე•ლო'
`
Compound Words (v2.2.6+)
The library now preserves existing hyphens in compound words:
`javascript
hyphenator.hyphenate('მაგ-რამ');
// Preserves the hyphen: 'მაგ-რამ'
hyphenator.hyphenate('ხელ-ფეხი');
// Preserves the hyphen: 'ხელ-ფეხი'
`
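The behaviour described above, together with the changelog note that only soft hyphens and zero-width spaces are stripped from input, can be pictured as follows. This is a sketch under assumptions, not the library's code: `stripInvisibleBreaks` and `hyphenateCompound` are hypothetical names, and hyphenating each part separately is one possible reading of the feature.

```javascript
// Remove only invisible break characters: soft hyphen (U+00AD) and
// zero-width space (U+200B). Regular hyphens are left alone.
function stripInvisibleBreaks(text) {
  return text.replace(/[\u00AD\u200B]/g, '');
}

// Hypothetical sketch: treat each side of a regular hyphen as its own word,
// then put the original hyphens back.
function hyphenateCompound(word, hyphenatePart) {
  return stripInvisibleBreaks(word)
    .split('-')
    .map(hyphenatePart)
    .join('-');
}

// Stand-in for a real per-word hyphenator, so the sketch runs on its own.
const bracket = (part) => `[${part}]`;

console.log(hyphenateCompound('ხელ\u00AD-ფეხი', bracket));
// prints: [ხელ]-[ფეხი]
```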
CSS Integration
Use soft hyphens for automatic line breaking:
`css
.georgian-text {
  -webkit-hyphens: auto;
  -ms-hyphens: auto;
  hyphens: auto;
}
`
`javascript
const hyphenator = new GeorgianHyphenator('\u00AD'); // soft hyphen
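// Sample text means "Your text here". textContent is enough because
// soft hyphens are plain characters, not markup.
document.querySelector('.georgian-text').textContent =
  hyphenator.hyphenateText('თქვენი ტექსტი აქ');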
`
Algorithm
The library uses a phonetic algorithm based on Georgian syllable structure:
1. Vowel Detection: Identifies vowels (ა, ე, ი, ო, უ)
2. Consonant Cluster Analysis: Recognizes 70+ harmonic clusters
3. Gemination Rules: Splits double consonants between the two letters (კკ → კ-კ, shown here with a visible hyphen)
4. Orphan Prevention: Ensures minimum syllable length (2 characters)
Harmonic Clusters
`
ბლ, ბრ, ბღ, ბზ, გდ, გლ, გმ, გნ, გვ, გზ, გრ, დრ, თლ, თრ, თღ,
კლ, კმ, კნ, კრ, კვ, მტ, პლ, პრ, ჟღ, რგ, რლ, რმ, სწ, სხ, ტკ,
ტპ, ტრ, ფლ, ფრ, ფქ, ფშ, ქლ, ქნ, ქვ, ქრ, ღლ, ღრ, ყლ, ყრ, შთ,
შპ, ჩქ, ჩრ, ცლ, ცნ, ცრ, ცვ, ძგ, ძვ, ძღ, წლ, წრ, წნ, წკ, ჭკ,
ჭრ, ჭყ, ხლ, ხმ, ხნ, ხვ, ჯგ
`
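To make the four steps above concrete, here is a deliberately simplified sketch. It is not the library's implementation. The function names are hypothetical, only a handful of the listed clusters are included, and the rule for choosing a break inside a consonant run (before a trailing harmonic cluster, otherwise after the first consonant) is an assumption made for illustration.

```javascript
// Simplified illustration only; the real algorithm is more involved.
const VOWELS = new Set(['ა', 'ე', 'ი', 'ო', 'უ']);
// A small sample of the harmonic clusters listed above.
const SAMPLE_CLUSTERS = new Set(['ბლ', 'ბრ', 'გრ', 'დრ', 'კლ', 'პრ', 'ტრ', 'ხლ']);

function sketchBreakPoints(word) {
  const chars = Array.from(word);
  // Step 1: vowel detection.
  const vowelIdx = [];
  chars.forEach((ch, i) => { if (VOWELS.has(ch)) vowelIdx.push(i); });

  const breaks = [];
  for (let v = 0; v + 1 < vowelIdx.length; v++) {
    const start = vowelIdx[v] + 1; // first index after the left vowel
    const end = vowelIdx[v + 1];   // index of the right vowel
    let at = start;                // zero or one consonant: break right after the vowel
    if (end - start >= 2) {
      at = start + 1;              // assumed default: break after the first consonant
      // Step 2: keep a trailing harmonic cluster together with the next vowel.
      if (SAMPLE_CLUSTERS.has(chars[end - 2] + chars[end - 1])) at = end - 2;
      // Step 3: gemination, split between identical neighbours.
      for (let i = start; i + 1 < end; i++) {
        if (chars[i] === chars[i + 1]) { at = i + 1; break; }
      }
    }
    // Step 4: orphan prevention, keep at least 2 characters on each side.
    if (at >= 2 && chars.length - at >= 2) breaks.push(at);
  }
  return breaks;
}

function sketchSyllables(word) {
  const chars = Array.from(word);
  const parts = [];
  let prev = 0;
  for (const b of sketchBreakPoints(word)) {
    parts.push(chars.slice(prev, b).join(''));
    prev = b;
  }
  parts.push(chars.slice(prev).join(''));
  return parts;
}

console.log(sketchSyllables('საქართველო').join('-')); // prints: სა-ქარ-თვე-ლო
console.log(sketchSyllables('თბილისი').join('-'));     // prints: თბი-ლი-სი
```

The sketch reproduces the two documented outputs above, but it should not be expected to agree with the library on every word. Cases it gets wrong are exactly what the exception dictionary is for.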
Browser Support
- ✅ Chrome/Edge 90+
- ✅ Firefox 88+
- ✅ Safari 14+
- ✅ Node.js 14+
Performance
- Average hyphenation speed: ~0.05ms per word
- Memory usage: ~50KB with dictionary loaded
- Optimized with Set for O(1) cluster lookups
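Figures like these depend on hardware and runtime, so it can be useful to measure them locally. The helper below is a rough timing sketch, not a rigorous benchmark. `averageMsPerWord` is a hypothetical name, and `standIn` is a placeholder function used so the example runs without the package installed.

```javascript
// Use performance.now() where available, otherwise fall back to Date.now().
const now = () =>
  (typeof performance !== 'undefined' ? performance.now() : Date.now());

// Hypothetical helper: average wall-clock milliseconds per processed word.
function averageMsPerWord(hyphenateFn, words, rounds = 200) {
  for (const w of words) hyphenateFn(w); // warm-up pass
  const start = now();
  for (let r = 0; r < rounds; r++) {
    for (const w of words) hyphenateFn(w);
  }
  return (now() - start) / (rounds * words.length);
}

// Placeholder workload; swap in (w) => hyphenator.hyphenate(w) to time the library.
const standIn = (word) => Array.from(word).join('\u00AD');

const ms = averageMsPerWord(standIn, ['საქართველო', 'თბილისი', 'კომპიუტერი']);
console.log(`${ms.toFixed(5)} ms per word`);
```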
Examples
Formatting Page Content
`javascript
const hyphenator = new GeorgianHyphenator();
function formatText(text) {
  return hyphenator.hyphenateText(text);
}
document.getElementById('content').innerHTML = formatText(bookText);
`
Justified Text
`javascript
const hyphenator = new GeorgianHyphenator('\u00AD');
const justified = hyphenator.hyphenateText(
  'საქართველო არის ერთ-ერთი უძველესი ქვეყანა მსოფლიოში'
);
`
Syllable-by-Syllable Output
`javascript
const hyphenator = new GeorgianHyphenator('·');
const syllables = hyphenator.getSyllables('პროგრამირება');
// Print one syllable every 200 ms
syllables.forEach((syllable, i) => {
  setTimeout(() => {
    console.log(syllable);
  }, i * 200);
});
`
Changelog
v2.2.6
- ✨ Preserves regular hyphens in compound words
- 🐛 Fixed hyphen stripping to only remove soft hyphens and zero-width spaces
- 📝 Improved documentation
$3
* 🌐 Browser Fix: Fixed CDN URL for reliable dictionary loading
* 📦 NPM Files: Added data/ folder to published package (files whitelist)
* 🔧 Error Handling: Improved fallback when dictionary unavailable
* 📝 Documentation: Corrected examples, removed non-existent words
$3
* 🧹 Sanitization: Added _stripHyphens for automatic input cleaning
* ⚡ Performance: Converted harmonicClusters to Set (O(1) lookup)
* 📦 ESM: Full ES Modules support
* 📚 Dictionary: Added loadDefaultLibrary() method
$3
* 🎓 Academic Rewrite: Phonological distance analysis
* 🛡️ Anti-Orphan: Minimum 2 characters on each side
* 🎼 Harmonic Clusters: Georgian-specific consonant groups
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
MIT © Guram Zhgamadze
Author
Guram Zhgamadze
- GitHub: @guramzhgamadze
- Email: guramzhgamadze@gmail.com
Related
- Georgian Language Resources
- Unicode Georgian Range
---
Made with ❤️ for the Georgian language community