Semi-Automatic Data Annotation, POS Tagging and Mildly Context-Sensitive Disambiguation: The eXtended Revised AraMorph (XRAM)
Giuliano Lancioni;Marta Campanelli;Simona Olivieri
2018-01-01
Abstract
An extended and revised form of Tim Buckwalter’s Arabic lexical and morphological resource AraMorph, named eXtended Revised AraMorph (XRAM), is presented. A number of weaknesses and inconsistencies of the original model are addressed, allowing wider coverage of real-world classical and contemporary (both formal and informal) Arabic texts. Building upon previous research, XRAM enhancements include (i) flag-selectable usage markers, (ii) probabilistic, mildly context-sensitive POS tagging, filtering, disambiguation and ranking of alternative morphological analyses, and (iii) semi-automatic expansion of lexical coverage through the extraction of lexical and morphological information from existing lexical resources. Testing XRAM through a front-end Python module showed a remarkable success level.

File | Size | Format |
---|---|---|
2018_Lancioni et al_Semi-automatic data annotation_World Scientific.pdf | 704.77 kB | Adobe PDF |

Availability: not available
Type: Publisher's version
License: Publisher's copyright