Semi-Automatic Data Annotation, POS Tagging and Mildly Context-Sensitive Disambiguation: The eXtended Revised AraMorph (XRAM)

Giuliano Lancioni;Marta Campanelli;Simona Olivieri
2018-01-01

Abstract

This work presents eXtended Revised AraMorph (XRAM), an extended and revised form of Tim Buckwalter’s Arabic lexical and morphological resource AraMorph. XRAM addresses a number of weaknesses and inconsistencies of the original model by allowing wider coverage of real-world classical and contemporary (both formal and informal) Arabic texts. Building upon previous research, its enhancements include (i) flag-selectable usage markers; (ii) probabilistic, mildly context-sensitive POS tagging, filtering, disambiguation and ranking of alternative morphological analyses; and (iii) semi-automatic extension of lexical coverage through the extraction of lexical and morphological information from existing lexical resources. Tests run through a front-end Python module showed a remarkable success level.
Published in: Computational Linguistics, Speech and Image Processing for Arabic Language (2018)
Files in this item:
File: 2018_Lancioni et al_Semi-automatic data annotation_World Scientific.pdf (not available)
Type: Publisher's version
License: Publisher's copyright
Size: 704.77 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/10278/5058907