HTRogène, Medieval Latin corpus of ground-truth for Handwritten Text Recognition and Layout Segmentation [Data set]
Boschetti F.; Fischer F.
2025-01-01
Abstract
The dataset comprises carefully selected manuscripts, each containing approximately 10 columns of text (equivalent to 5 bi-column pages or 10 single-column pages). The data adheres to the Segmonto guidelines, ensuring consistency and compatibility with other datasets that follow the same standards. Each image is accompanied by two XML files:
- Files suffixed with .chocomufin.xml are normalized for compliance with broader datasets.
- The other XML files contain repository-specific information.

Files in this product:
There are no files associated with this product.
Documents in ARCA are protected by copyright and all rights are reserved, unless otherwise indicated.