Automated Text Recognition for Ottoman Turkish
Our Handwritten Text Recognition (HTR) project applies HTR, an artificial intelligence-driven automatic transcription technology, to Ottoman Turkish.
This initiative, currently focused on HTR with Transkribus, pursues two overarching objectives: first, to make Ottoman Turkish historical archives more accessible to researchers and the general public; second, to contribute to building a digital research infrastructure for Ottoman Turkish.
Our ongoing work with Transkribus involves creating a generalized text recognition model for 19th-century Ottoman Turkish periodicals. As of June 2023, the Character Error Rate (CER) of our most recent HTR model stands at 7.20%, and we are working to lower it. We have made this model publicly available on Transkribus, in keeping with our commitment to open scholarship and to the digital study of Ottoman Turkish.
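For readers unfamiliar with the metric, CER is conventionally defined as the character-level edit distance between a model's output and a reference transcription, divided by the number of characters in the reference. The following is a minimal sketch under that conventional definition. The function names and the sample strings are illustrative assumptions, not taken from our model or from Transkribus, whose own evaluation may normalize text differently.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of characters in the reference."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: one substituted character in a five-character word.
print(character_error_rate("kitab", "kitap"))  # 0.2
```

Under this definition, a CER of 7.20% corresponds to roughly seven character-level errors per hundred characters of reference text.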