Error Correcting HTR’ed Byzantine Text
Konstantina Liagkou;Holger Essler;Jean-Baptiste Camps;Franz Fischer
2023-01-01
Abstract
The automated correction of errors in Handwritten Text Recognition (HTR) output is challenging and far from solved. To address this challenge, we set up a shared task on AIcrowd that received 271 submissions, very few of which succeeded. This paper presents the datasets, the best methods, and an experimental analysis of error correction for HTRed manuscripts and papyri in Byzantine Greek, the language that followed Classical and preceded Modern Greek. Using recognised and transcribed data from seven centuries, we compare the two best-performing methods, one based on a neural encoder-decoder architecture and the other on linguistic knowledge. We show that both can reduce the recognition error rate, by up to 2.5 points at the character level and up to 15 points at the word level, and we highlight the strengths and weaknesses of each.