CATMuS Medieval
Camps, Jean-Baptiste; Boschetti, Federico; Fischer, Franz
2023-01-01
Abstract
CATMuS (Consistent Approach to Transcribing ManuScript) Medieval is a Kraken HTR model trained on strictly graphematic transcriptions in four languages (in descending order of importance in the dataset: Old and Middle French, Latin, Spanish (and other languages of Spain), and Italian). No abbreviations are resolved. This model is the result of a collaboration between researchers from the CREMMA, GalliCorpora, HTRomance and DEEDS projects. It follows the CREMMA Guidelines (supplemented by the CREMMA Medii Aevi) and will be consolidated under the CATMuS Medieval Guidelines in an upcoming paper. The model is trained with NFD Unicode normalization: each diacritic (including superscripts) is transcribed as its own character, separately from the "main" character.
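The effect of NFD normalization on a transcription can be illustrated with a minimal Python sketch. It uses only the standard-library `unicodedata` module; the helper names `to_nfd` and `split_marks` and the sample strings are hypothetical and do not come from the model or its guidelines.

```python
import unicodedata


def to_nfd(text: str) -> str:
    """Return text in Unicode Normalization Form D (canonical decomposition)."""
    return unicodedata.normalize("NFD", text)


def split_marks(text: str) -> list[tuple[str, bool]]:
    """Pair each code point of the NFD form with a flag: is it a combining mark?"""
    return [(ch, unicodedata.combining(ch) > 0) for ch in to_nfd(text)]


# Hypothetical sample: a precomposed "e with acute" (U+00E9).
precomposed = "\u00e9"
decomposed = to_nfd(precomposed)

# After NFD the base letter and the diacritic are two separate code points.
for ch in decomposed:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))

# A superscript letter written as a combining character (here U+0363) has no
# precomposed form, so NFD leaves "q" + mark as two code points.
superscript_sample = "q\u0363"
print(split_marks(superscript_sample))
```

In practice, text that is compared with the model output (for example, ground truth stored in NFC) would presumably need to be normalized to NFD first; otherwise visually identical strings can be counted as mismatches.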