
Italian-Arabic domain terminology extraction from parallel corpora

Fawi, Fathi Hassan Ahmed; Delmonte, Rodolfo
2015-01-01

Abstract

In this paper we present our approach to extracting multi-word terms (MWTs) from an Italian-Arabic parallel corpus of legal texts. The approach is a hybrid model that combines linguistic and statistical knowledge. The linguistic component applies part-of-speech (POS) tagging to the corpus texts in both languages so that syntactic patterns can be formulated to identify candidate terms. The candidate terms will then be ranked by statistical association measures, which represent the statistical knowledge. Once two MWT lists have been created, one for each language, the parallel corpus will be used to validate and identify translation equivalents.
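The first two stages described in the abstract (pattern-based candidate identification followed by statistical ranking) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration: the abstract does not state which syntactic patterns or which association measures are used, so the two-tag patterns, the choice of pointwise mutual information (PMI), the toy tagged sentences, and the function names `extract_candidates` and `rank_by_pmi` are all hypothetical. The sketch expects text that has already been POS-tagged and does not cover the translation-equivalent step.

```python
import math
from collections import Counter

# Hypothetical POS patterns for two-word candidates (illustrative only,
# not the patterns used in the paper).
PATTERNS = {("NOUN", "ADJ"), ("NOUN", "NOUN")}


def extract_candidates(tagged_sentences, patterns=PATTERNS):
    """Count adjacent word pairs whose POS tags match one of the patterns."""
    candidates = Counter()
    for sentence in tagged_sentences:
        for (w1, t1), (w2, t2) in zip(sentence, sentence[1:]):
            if (t1, t2) in patterns:
                candidates[(w1.lower(), w2.lower())] += 1
    return candidates


def rank_by_pmi(tagged_sentences, candidates):
    """Rank candidates by pointwise mutual information, highest first.

    PMI is an assumed stand-in for the unspecified association measures.
    """
    word_counts = Counter(w.lower() for s in tagged_sentences for w, _ in s)
    n_words = sum(word_counts.values())
    n_bigrams = sum(max(len(s) - 1, 0) for s in tagged_sentences)
    scored = []
    for (w1, w2), freq in candidates.items():
        p_pair = freq / n_bigrams
        p_w1 = word_counts[w1] / n_words
        p_w2 = word_counts[w2] / n_words
        scored.append(((w1, w2), math.log2(p_pair / (p_w1 * p_w2))))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy, hand-tagged Italian sentences (invented for this example).
toy_corpus = [
    [("il", "DET"), ("codice", "NOUN"), ("civile", "ADJ"),
     ("prevede", "VERB"), ("sanzioni", "NOUN")],
    [("il", "DET"), ("codice", "NOUN"), ("civile", "ADJ"),
     ("è", "AUX"), ("chiaro", "ADJ")],
    [("la", "DET"), ("legge", "NOUN"), ("nuova", "ADJ"),
     ("entra", "VERB"), ("in", "ADP"), ("vigore", "NOUN")],
]

candidates = extract_candidates(toy_corpus)
ranking = rank_by_pmi(toy_corpus, candidates)
for pair, score in ranking:
    print(" ".join(pair), round(score, 2))
```

On this toy input the rarer pair is ranked above the more frequent one, which reflects PMI's known bias toward low-frequency items. A real pipeline would likely apply a frequency threshold or a different measure, and would run the same two steps separately on the Arabic side with its own patterns.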
2015
Proceedings of the Second Italian Conference on Computational Linguistics CLiC-it 2015
Files in this record:
Italian-Arabic domain terminology extraction from parallel corpora.pdf

Open access

Type: Publisher's version
License: Undefined license
Size: 164.67 kB
Format: Adobe PDF

Documents in ARCA are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10278/3663337