Digital Ottoman Corpora

‘a digital infrastructure creation platform’

Welcome to the Digital Ottoman Corpora, a pioneering project aimed at transforming the way we access, analyze, and engage with Ottoman Turkish texts. Operating under the "collections as data" approach, our initiative embodies the intersection of history, digital humanities, and artificial intelligence methods to enhance the study and understanding of Ottoman Turkish historical collections.

Why Digital Corpora?
While Ottoman Turkish possesses a rich cultural and historical archive, its representation in the digital humanities has been limited. The extensive volume of analog or partially digitized material, paired with the lack of OCR technology for these sources, makes distant reading and large-scale analysis of Ottoman Turkish historical collections an elusive endeavor. In other words, we face the paradox of an archive that is at once abundant and computationally inaccessible. The Digital Ottoman Corpora intervenes at this critical juncture, focusing on the creation of searchable, computable text collections that can be studied with textual analysis and data visualization tools.

The overarching aim of the Digital Ottoman Corpora project is to facilitate the dissemination and detailed analysis of Ottoman Turkish historical texts. The pressing need for computer-readable Ottoman Turkish texts drives our work, and we hope that our concerted efforts will provide a comprehensive platform for scholars, researchers, and enthusiasts alike.

Welcome to a new era of Ottoman Turkish study, where collections become data, and the past is but a click away!

Projects

We bring together multiple sub-projects under the Digital Ottoman Corpora banner, each employing unique methodologies to overcome the challenges posed by the traditional format of Ottoman Turkish texts.

HTR

Our artificial intelligence-based text recognition project incorporates Handwritten Text Recognition (HTR) tools for Ottoman Turkish. This groundbreaking technology paves the way for automating the transcription of the script, facilitating the large-scale creation of computer-readable, keyword-searchable text.

Crowdsourcing

We introduce the first Ottoman Turkish crowdsourcing project designed on Zooniverse. This novel approach leverages the power of community engagement for the transcription of Ottoman Turkish, broadening both participation in and insight into this vital historical archive.

Digital Editions

Under the Digital Ottoman Corpora umbrella, we present our Digital Editions project. This venture will publish selected Ottoman Turkish works on an open-access platform, pairing facsimiles of the original texts with their transcriptions, thus broadening access to the written record of this rich historical era.