translit
A small Python library for name-aware transliteration. Eight language pairs. github.com/T-Fizz/translit
Try it
Runs the actual Python library locally in your browser via Pyodide + WebAssembly. The first load downloads about 30 MB, which is then cached; subsequent calls are instant. Thai is the only pair unavailable here because its library has native code that Pyodide can't run. Use the package directly for that.
Examples
from translit_core import transliterate
transliterate("カナちゃん", "en") # → "Kana-chan"
transliterate("山田太郎", "en", source_lang="ja") # → "Yamada Taro"
transliterate("山田太郎", "en", source_lang="ja",
name_order="given-first") # → "Taro Yamada"
transliterate("김정은", "en", source_lang="ko") # → "Kim Jeong-eun"
transliterate("Иван Сергеевич Тургенев", "en") # → "Ivan Sergeevich Turgenev"
transliterate("नरेंद्र मोदी", "en", source_lang="hi") # → "Narendra Modi"
transliterate("عبد الرحمن", "en", source_lang="ar") # → "Abdul-Rahman"
transliterate("ทักษิณ", "en", source_lang="th") # → "Thaksin"
transliterate("John Smith", "ja") # → "ジョン・スミス"
transliterate returns None when the engine can't deterministically transliterate an input. This is fail-soft by design: callers see a clear signal, never a half-translated guess.
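Since None is the failure signal, callers need an explicit fallback. Below is a minimal sketch of one option, assuming the caller prefers to keep the original text when transliteration fails. `romanize_or_keep` and the stub `fake_transliterate` are hypothetical names, not part of the library; the stub only exists so the snippet runs without the package installed, and in real use you would pass `translit_core.transliterate` instead.

```python
from typing import Callable, Optional


# Hypothetical helper, not part of translit_core.
def romanize_or_keep(
    text: str,
    target: str,
    transliterate: Callable[..., Optional[str]],
    **options: str,
) -> str:
    """Return the transliteration, or the original text if the engine returns None."""
    result = transliterate(text, target, **options)
    return text if result is None else result


# Stand-in for translit_core.transliterate so the sketch is self-contained.
# It knows one input and returns None for everything else.
def fake_transliterate(text: str, target: str, **options: str) -> Optional[str]:
    known = {("カナちゃん", "en"): "Kana-chan"}
    return known.get((text, target))


print(romanize_or_keep("カナちゃん", "en", fake_transliterate))
print(romanize_or_keep("???", "en", fake_transliterate))
```

Checking `is None` rather than truthiness keeps an empty-string result distinct from a failure, should the engine ever return one.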
Install
pip install -e git+https://github.com/T-Fizz/translit.git#egg=translit-core
Supported pairs
| Source | Target | Method |
|---|---|---|
| Japanese (kana + kanji) | Latin | pykakasi passport-style + 22 honorifics + reverse-katakana round-trip |
| Chinese (Hanzi) | Latin | pypinyin (no tones) |
| Korean (Hangul) | Latin | RR + 36-name surname overlay + 3 honorifics |
| Russian (Cyrillic) | Latin | BGN/PCGN press-style |
| Hindi (Devanagari) | Latin | IAST + diacritic strip + schwa deletion |
| Arabic | Latin | Curated 50-name overlay |
| Thai | Latin | RTGS via pythainlp |
| English (Latin) | Japanese (katakana) | alkana + acronym fallback |
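The Korean row combines Revised Romanization with a surname overlay. The sketch below shows one way such an overlay could sit on top of an RR step; it is an illustration, not the library's implementation. `SURNAME_OVERLAY`, `TWO_SYLLABLE_SURNAMES`, `split_korean_name`, and `surname_spelling` are hypothetical names, the overlay holds only three entries rather than the full 36-name table, and the RR output is passed in by the caller instead of being computed.

```python
from typing import Tuple

# Illustrative subset; the library's actual 36-name table is not reproduced here.
SURNAME_OVERLAY = {"김": "Kim", "이": "Lee", "박": "Park"}

# Two-syllable family names that must not be split after the first syllable.
TWO_SYLLABLE_SURNAMES = {"남궁", "황보", "사공", "제갈", "선우", "독고"}


def split_korean_name(name: str) -> Tuple[str, str]:
    """Split a Hangul full name into (family, given).

    Assumption: a two-syllable surname is only recognized when at least one
    more syllable follows, so a bare two-character input is split 1 + 1.
    """
    if len(name) > 2 and name[:2] in TWO_SYLLABLE_SURNAMES:
        return name[:2], name[2:]
    return name[:1], name[1:]


def surname_spelling(family: str, rr_fallback: str) -> str:
    """Prefer the traditional spelling; otherwise keep the RR form supplied by the caller."""
    return SURNAME_OVERLAY.get(family, rr_fallback)


family, given = split_korean_name("김정은")
print(family, given, surname_spelling(family, "Gim"))
```

Keeping the overlay as plain data means a new traditional spelling is a one-line change and the lookup stays deterministic.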
What it handles
- Japanese honorifics including compounds (ちゃん, 先生, 兄ちゃん, にゃん)
- CJK source disambiguation (田中 → Tanaka if ja, Tian Zhong if zh)
- Multi-kanji morpheme splitting (山田太郎 → "Yamada Taro")
- Round-trip katakana → Western name (ヴィクター → "Victor")
- Korean traditional surname spellings (Kim/Lee/Park, not Gim/I/Bak)
- Korean 2-syllable family names (남궁/황보/사공/제갈/선우/독고)
- Russian press-style (Mikhail not Mihail, Fyodor not Fedor, ъе/ье → ye)
- Hindi schwa deletion with cluster + aspirated-digraph awareness
- Arabic input normalization: tashkeel, alif variants, tatweel, Allah ligature
- Half-width katakana / full-width Latin (NFKC)
- Punctuation stripping, silent-character dropout detection
- Family-first vs given-first name ordering (name_order=)
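Several items in this list are input normalization steps that can be approximated with the standard library alone. The sketch below shows NFKC width folding followed by removal of Arabic tashkeel and tatweel and folding of alif variants. It is an assumption about how such a pre-pass might look, not the library's actual code, and `normalize_input` is a hypothetical name.

```python
import re
import unicodedata

# Tashkeel (U+064B..U+0652), superscript alif (U+0670), and tatweel (U+0640).
_ARABIC_MARKS = re.compile("[\u064B-\u0652\u0670\u0640]")

# Alif with madda, hamza above, or hamza below -> bare alif (U+0627).
_ALIF_VARIANTS = str.maketrans(
    {"\u0622": "\u0627", "\u0623": "\u0627", "\u0625": "\u0627"}
)


def normalize_input(text: str) -> str:
    """Fold width variants with NFKC, then strip Arabic marks and unify alifs."""
    text = unicodedata.normalize("NFKC", text)
    text = _ARABIC_MARKS.sub("", text)
    return text.translate(_ALIF_VARIANTS)


print(normalize_input("ｶﾅ"))        # half-width katakana
print(normalize_input("Ｊｏｈｎ"))  # full-width Latin
```

Running NFKC first means later steps only ever see one form of each character, which keeps lookup tables small.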
Read more
- INTERNALS.md — every edge case, why each library was picked or discarded
- DESIGN.md — architecture and API contract for the optional FastAPI service
- TENETS.md — the principles behind the deterministic-and-fail-soft choices