Overlaps in AVIP/IPAR, the Italian Treebank of Spontaneous Speech

DELMONTE, Rodolfo; TONELLI, Sara
2007

Abstract

This paper presents work carried out on the 60,000-word Italian Spontaneous Speech Corpus, which merges two corpora: the first, AVIP, produced under the national project API (the Italian version of MapTask), and the second, IPAR. In particular, we present the parser used to produce syntactic structures for overlapping, temporally aligned turns. We argue in favour of a joint, and thus temporally aligned, representation of overlapping material that captures all the linguistic information made available by the local context. This differs from simply aligning linguistic annotations on a time scale across separate layers of representation: we are interested in collapsing overlapping linguistic material into a single representation in order to capture pragmatic inferences. The result is a syntactically branching node, which we call OVL, containing both the overlapper's and the overlappee's material (linguistic or non-linguistic). The paper also comments on data from our Written Text Treebank, VIT, and compares them with the Oral Treebank.
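As a rough illustration of the idea of a single branching node holding both speakers' material, the following minimal sketch may help. It is not the authors' parser: only the label OVL comes from the abstract, while the `Segment` and `Node` classes, the labels `TURN`, `OVERLAPPEE` and `OVERLAPPER`, the time format (seconds), and the sample utterances are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Segment:
    """A stretch of speech by one speaker (hypothetical format, times in seconds)."""
    speaker: str
    start: float
    end: float
    text: str


@dataclass
class Node:
    """A labelled tree node: leaves carry a segment, inner nodes carry children."""
    label: str
    children: List["Node"] = field(default_factory=list)
    segment: Optional[Segment] = None


def overlaps(a: Segment, b: Segment) -> bool:
    """True if the two time intervals share a non-empty stretch."""
    return max(a.start, b.start) < min(a.end, b.end)


def attach_overlap(overlappee: Segment, overlapper: Segment) -> Node:
    """Return one OVL node holding both speakers' material, or a plain
    TURN node for the overlappee when the intervals do not intersect.
    All labels other than OVL are illustrative assumptions."""
    if not overlaps(overlappee, overlapper):
        return Node("TURN", segment=overlappee)
    return Node("OVL", children=[
        Node("OVERLAPPEE", segment=overlappee),
        Node("OVERLAPPER", segment=overlapper),
    ])


# Invented sample data: a giver's instruction overlapped by a follower's backchannel.
giver = Segment("G", 0.0, 2.0, "vai a destra")
follower = Segment("F", 1.5, 2.2, "sì")
tree = attach_overlap(giver, follower)
print(tree.label, [child.label for child in tree.children])
```

Keeping both segments under one node, rather than on two time-aligned layers, is what lets a later processing step read the overlapper's contribution in the context of the overlappee's material.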
Proc. SRSL7 - Semantic Representation of Spoken Language, CAEPIA
Documents in ARCA are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: http://hdl.handle.net/10278/39719