Tibetan transliteration (Wylie, EWTS) and approximate phonetics
This JavaScript package implements two things:
- conversion between Unicode Tibetan text and Extended Wylie transliteration (EWTS)
- approximate Tibetan phonetics according to THL and other systems
```bash
npm install tibetan-ewts-converter
```
As of version 2, this is a pure ES module.
```javascript
import { EwtsConverter } from 'tibetan-ewts-converter/EwtsConverter';
const ewts = new EwtsConverter();
console.log(ewts.to_unicode("sangs rgyas"));
console.log(ewts.to_ewts("སངས་རྒྱས"));
```

```javascript
import { get_phonetics } from 'tibetan-ewts-converter';
const pho = get_phonetics({ style: "lotsawahouse", lang: "en" });
console.log(pho.phonetics("sangs rgyas", { autosplit: true }));
```
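
One way to sanity-check a transliteration is a round trip through the two converter methods shown above. The sketch below assumes only that `to_unicode` and `to_ewts` behave as in those examples. The helper name `roundTrip` is hypothetical and not part of the package, and the converter is passed in as an argument so the helper does not depend on how it was constructed. Since the converter may normalize spacing and case, a mismatch is a prompt to look closer and not necessarily an error.

```javascript
// Hypothetical helper, not part of tibetan-ewts-converter.
// `converter` is any object exposing to_unicode() and to_ewts(),
// for example an EwtsConverter instance.
function roundTrip(converter, ewtsText) {
  // EWTS -> Unicode Tibetan
  const unicode = converter.to_unicode(ewtsText);
  // Unicode Tibetan -> EWTS again
  const back = converter.to_ewts(unicode);
  // `stable` is true only when the text survives the round trip unchanged.
  return { input: ewtsText, unicode, back, stable: back === ewtsText };
}
```

With the package installed, this could be called as `roundTrip(new EwtsConverter(), "sangs rgyas")`.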
The constructor accepts an optional object with named options:
- `check`: generate warnings for illegal consonant sequences and the like; default is true.
- `check_strict`: stricter checking, examine the whole stack; default is true.
- `fix_spacing`: remove spaces after newlines, collapse multiple tseks into one, fix case, etc.; default is true.
- `sloppy`: silently fix a number of common Wylie mistakes when converting to Unicode; default is false.
- `leave_dubious`: when converting to Unicode, leave dubious syllables unprocessed, between \[brackets\], instead of doing a best attempt; default is false.
- `pass_through`: when converting to EWTS, pass through non-Tibetan characters instead of converting to \[comments\]; default is false.
```javascript
let ewts = new EwtsConverter({ check_strict: false, leave_dubious: true, sloppy: true });
```
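
If the same option combinations recur across a project, they can be kept in one place. The following is a minimal sketch: the preset names `strict` and `lenient` and the helper `converterOptions` are hypothetical, and only the option names themselves come from the list above. The helper builds plain objects, so it makes no assumption about the converter beyond its accepting an options object.

```javascript
// Hypothetical presets; only the option names are taken from this README.
const PRESETS = {
  // Check whole stacks and leave dubious syllables bracketed instead of guessing.
  strict: { check: true, check_strict: true, sloppy: false, leave_dubious: true },
  // Tolerate common Wylie mistakes and always make a best attempt.
  lenient: { check: true, check_strict: false, sloppy: true, leave_dubious: false },
};

// Returns a fresh options object: preset values first, then caller overrides.
function converterOptions(preset, overrides = {}) {
  if (!Object.prototype.hasOwnProperty.call(PRESETS, preset)) {
    throw new Error(`Unknown preset: ${preset}`);
  }
  return { ...PRESETS[preset], ...overrides };
}
```

The result can be handed straight to the constructor, for example `new EwtsConverter(converterOptions("lenient", { pass_through: true }))`.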
`get_phonetics` accepts an optional object with named options:
- `style`: one of 'thl', 'lotsawahouse', 'rigpa', 'lhasey', 'padmakara'
- `lang`: 2-letter language code, for styles that have language variants (e.g. 'en', 'es')
The `phonetics` method takes a string (Tibetan Unicode or EWTS) and an optional options object.
Unless you're using a better external tokenizer, always pass the option `{ autosplit: true }`.
See the code for the many other options that allow fine control of phonetics generation. You can also directly import and use the classes `TibetanPhonetics`, `TibetanPhoneticsRigpa`, `TibetanPhoneticsLhasey` and `TibetanPhoneticsPadmakara`.
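
To see how the styles differ on the same input, the calls described above can be wrapped in a small loop. This is a sketch under stated assumptions: `comparePhonetics` is a hypothetical helper and not part of the package, the `get_phonetics` function is passed in as an argument so the block carries no import of its own, and it is assumed that passing `lang` to a style without language variants is harmless.

```javascript
// Style names as listed in this README.
const STYLES = ["thl", "lotsawahouse", "rigpa", "lhasey", "padmakara"];

// Hypothetical helper, not part of tibetan-ewts-converter.
// `getPhonetics` is expected to be the package's get_phonetics function.
function comparePhonetics(getPhonetics, text, lang = "en") {
  const results = {};
  for (const style of STYLES) {
    // One phonetics object per style (assumption: `lang` is ignored where unused).
    const pho = getPhonetics({ style, lang });
    // autosplit is always on, as the README recommends.
    results[style] = pho.phonetics(text, { autosplit: true });
  }
  return results;
}
```

With the package installed, `comparePhonetics(get_phonetics, "sangs rgyas")` would return an object keyed by style name.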
The first version of this code was written in Perl around 2008. In 2010 the EWTS/Unicode converter was ported to Java at the request of TBRC, now BDRC.
The Java code for phonetics was then ported to other languages by different groups:
- Python port by Esukhia
- C# port by radiantspace
- Another Python port by radiantspace
- JavaScript ports from BDRC, Ksana Forge and Karmapa Digital Toolbox
- This JavaScript port of 2021, going back to the original Perl code but incorporating some of the improvements made by various groups.
Phonetics generation was added to this project in 2025, also ported from the original Perl with the help of AI.
License: Apache 2.0.