Languages/Scripts supported by Tesseract OCR

Languages

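| LangCode | Language | Marks for 4.00 (Nov. 2016) / 4.0.0 tessdata / 4.0.0 tessdata_best / 4.0.0 tessdata_fast |
| --- | --- | --- |
| afr | Afrikaans | xxxx |
| amh | Amharic | xxxx |
| ara | Arabic | xxxx |
| asm | Assamese | xxxx |
| aze | Azerbaijani | xxxx |
| aze_cyrl | Azerbaijani - Cyrillic | xxxx |
| bel | Belarusian | xxxx |
| ben | Bengali | xxxx |
| bod | Tibetan | xxxx |
| bos | Bosnian | xxxx |
| bre | Breton | xxxx |
| bul | Bulgarian | xxxx |
| cat | Catalan; Valencian | xxxx |
| ceb | Cebuano | xxxx |
| ces | Czech | xxxx |
| chi_sim | Chinese - Simplified | xxxx |
| chi_tra | Chinese - Traditional | xxxx |
| chr | Cherokee | xxxx |
| cos | Corsican | xxx |
| cym | Welsh | xxxx |
| dan | Danish | xxxx |
| dan_frak | Danish - Fraktur (contrib) | |
| deu | German | xxxx |
| deu_frak | German - Fraktur (contrib) | |
| deu_latf | German (Fraktur Latin) | xxxx |
| dzo | Dzongkha | xxxx |
| ell | Greek, Modern (1453-) | xxxx |
| eng | English | xxxx |
| enm | English, Middle (1100-1500) | xxxx |
| epo | Esperanto | xxxx |
| equ | Math / equation detection module | xxx |
| est | Estonian | xxxx |
| eus | Basque | xxxx |
| fao | Faroese | xxx |
| fas | Persian | xxxx |
| fil | Filipino (old - Tagalog) | xxx |
| fin | Finnish | xxxx |
| fra | French | xxxx |
| frk | German - Fraktur (now deu_latf) | xxxx |
| frm | French, Middle (ca.1400-1600) | xxxx |
| fry | Western Frisian | xxx |
| gla | Scottish Gaelic | xxx |
| gle | Irish | xxxx |
| glg | Galician | xxxx |
| grc | Greek, Ancient (to 1453) (contrib) | xxxx |
| guj | Gujarati | xxxx |
| hat | Haitian; Haitian Creole | xxxx |
| heb | Hebrew | xxxx |
| hin | Hindi | xxxx |
| hrv | Croatian | xxxx |
| hun | Hungarian | xxxx |
| hye | Armenian | xxx |
| iku | Inuktitut | xxxx |
| ind | Indonesian | xxxx |
| isl | Icelandic | xxxx |
| ita | Italian | xxxx |
| ita_old | Italian - Old | xxxx |
| jav | Javanese | xxxx |
| jpn | Japanese | xxxx |
| kan | Kannada | xxxx |
| kat | Georgian | xxxx |
| kat_old | Georgian - Old | xxxx |
| kaz | Kazakh | xxxx |
| khm | Central Khmer | xxxx |
| kir | Kirghiz; Kyrgyz | xxxx |
| kmr | Kurmanji (Kurdish - Latin Script) | xxxx |
| kor | Korean | xxxx |
| kor_vert | Korean (vertical) | xxxx |
| kur | Kurdish (Arabic Script) | |
| lao | Lao | xxxx |
| lat | Latin | xxxx |
| lav | Latvian | xxxx |
| lit | Lithuanian | xxxx |
| ltz | Luxembourgish | xxxx |
| mal | Malayalam | xxxx |
| mar | Marathi | xxxx |
| mkd | Macedonian | xxxx |
| mlt | Maltese | xxxx |
| mon | Mongolian | xxxx |
| mri | Maori | xxxx |
| msa | Malay | xxxx |
| mya | Burmese | xxxx |
| nep | Nepali | xxxx |
| nld | Dutch; Flemish | xxxx |
| nor | Norwegian | xxxx |
| oci | Occitan (post 1500) | xxxx |
| ori | Oriya | xxxx |
| osd | Orientation and script detection module | xxxx |
| pan | Panjabi; Punjabi | xxxx |
| pol | Polish | xxxx |
| por | Portuguese | xxxx |
| pus | Pushto; Pashto | xxxx |
| que | Quechua | xxxx |
| ron | Romanian; Moldavian; Moldovan | xxxx |
| rus | Russian | xxxx |
| san | Sanskrit | xxxx |
| sin | Sinhala; Sinhalese | xxxx |
| slk | Slovak | xxxx |
| slk_frak | Slovak - Fraktur (contrib) | |
| slv | Slovenian | xxxx |
| snd | Sindhi | xxxx |
| spa | Spanish; Castilian | xxxx |
| spa_old | Spanish; Castilian - Old | xxxx |
| sqi | Albanian | xxxx |
| srp | Serbian | xxxx |
| srp_latn | Serbian - Latin | xxxx |
| sun | Sundanese | xxxx |
| swa | Swahili | xxxx |
| swe | Swedish | xxxx |
| syr | Syriac | xxxx |
| tam | Tamil | xxxx |
| tat | Tatar | xxxx |
| tel | Telugu | xxxx |
| tgk | Tajik | xxxx |
| tgl | Tagalog (new - Filipino) | x |
| tha | Thai | xxxx |
| tir | Tigrinya | xxxx |
| ton | Tonga | xxxx |
| tur | Turkish | xxxx |
| uig | Uighur; Uyghur | xxxx |
| ukr | Ukrainian | xxxx |
| urd | Urdu | xxxx |
| uzb | Uzbek | xxxx |
| uzb_cyrl | Uzbek - Cyrillic | xxxx |
| vie | Vietnamese | xxxx |
| yid | Yiddish | xxxx |
| yor | Yoruba | xxxx |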
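
The LangCode values are what Tesseract's `-l` option takes on the command line, with several codes joined by `+`. The following is a minimal Python sketch of that usage, not an official wrapper. The function names (`build_lang_arg`, `tesseract_command`, `installed_languages`) and the file names in the usage note are hypothetical, and `installed_languages` assumes a `tesseract` binary on the PATH whose `--list-langs` output starts with a header line beginning with "List of".

```python
import subprocess


def build_lang_arg(codes):
    """Join LangCode values into the form taken by tesseract's -l option."""
    cleaned = [code.strip() for code in codes if code.strip()]
    if not cleaned:
        raise ValueError("at least one language code is required")
    return "+".join(cleaned)


def tesseract_command(image_path, output_base, codes):
    """Return the argv list for a tesseract run; nothing is executed here."""
    return ["tesseract", image_path, output_base, "-l", build_lang_arg(codes)]


def installed_languages():
    """List the traineddata names the local tesseract binary reports.

    Assumption: the header line starts with 'List of'. Both stdout and
    stderr are read because the stream used may differ between versions.
    """
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        capture_output=True, text=True, check=True,
    )
    lines = (result.stdout + result.stderr).splitlines()
    return [ln.strip() for ln in lines if ln.strip() and not ln.startswith("List of")]
```

For example, `tesseract_command("page.png", "page", ["eng", "deu"])` builds the arguments for recognizing a page with English and German together. Comparing the requested codes against `installed_languages()` before running is one way to catch a missing traineddata file early.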

Scripts

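| Code | Script | Marks for 4.00 (Nov 2016) / 4.0.0 tessdata / 4.0.0 tessdata_best / 4.0.0 tessdata_fast |
| --- | --- | --- |
| arab | Arabic | xxx |
| armn | Armenian | xxx |
| beng | Bengali | xxx |
| cans | Canadian_Aboriginal | xxx |
| cher | Cherokee | xxx |
| cyrl | Cyrillic | xxx |
| deva | Devanagari | xxx |
| ethi | Ethiopic | xxx |
| frak | Fraktur | xxx |
| geor | Georgian | xxx |
| grek | Greek | xxx |
| gujr | Gujarati | xxx |
| guru | Gurmukhi | xxx |
| hans | HanS (Han simplified) | xxx |
| hans-vert | HanS_vert (Han simplified vertical) | xxx |
| hant | HanT (Han traditional) | xxx |
| hant-vert | HanT_vert (Han traditional vertical) | xxx |
| hang | Hangul | xxx |
| hang-vert | Hangul_vert (Hangul vertical) | xxx |
| hebr | Hebrew | xxx |
| jpan | Japanese | xxx |
| jpan-vert | Japanese_vert (Japanese vertical) | xxx |
| knda | Kannada | xxx |
| khmr | Khmer | xxx |
| laoo | Lao | xxx |
| latn | Latin | xxx |
| mlym | Malayalam | xxx |
| mymr | Myanmar | xxx |
| orya | Oriya (Odia) | xxx |
| sinh | Sinhala | xxx |
| syrc | Syriac | xxx |
| taml | Tamil | xxx |
| telu | Telugu | xxx |
| thaa | Thaana | xxx |
| thai | Thai | xxx |
| tibt | Tibetan | xxx |
| viet | Vietnamese | xxx |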
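
Script models are selected with `-l` in the same way as language models. The sketch below is only an illustration and rests on one assumption that the table does not state: that the script traineddata files sit in a `script/` subfolder of the tessdata directory and are named after the script as written above (for example `Latin`, or `HanS_vert` without the parenthetical gloss). The helper name `script_lang_arg` is hypothetical.

```python
def script_lang_arg(script_name):
    """Build a -l value for a script model.

    Assumption: script models are stored as script/<Name>.traineddata,
    so the value passed to -l is 'script/<Name>'.
    """
    name = script_name.strip()
    if not name or "/" in name or " " in name:
        raise ValueError(f"unexpected script name: {script_name!r}")
    return f"script/{name}"
```

Under that assumption, `script_lang_arg("Devanagari")` gives the value to pass as `-l script/Devanagari`. If the local tessdata directory is laid out differently, the prefix needs adjusting.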