Design and collection of the Bimodal Italian Learner Corpus of Chinese (BILCC)

This article presents BILCC (Bimodal Italian Learner Corpus of Chinese), a new methodological resource for L2 Chinese acquisition research. BILCC was created for two reasons: first, to fill a gap in existing L2 Chinese corpora, which contain no data from Italian learners; second, to support research on L2 Chinese acquisition by Italian learners, given the growing interest in Chinese language learning and the flourishing research community devoted to L2 Chinese acquisition in Italy. BILCC was constructed according to strict design criteria and collects written and spoken data from 106 Italian L2 Chinese learners at beginner, intermediate and advanced levels. It is a specific-purpose corpus, designed to explore L1 Italian learners' pragmalinguistic knowledge of Chinese shì...de clefts. The written subcorpus consists of 53,437 Chinese characters and the spoken subcorpus of 54,212 Chinese characters. BILCC also includes an equivalent native speaker subcorpus, which collects data from 35 L1 Chinese speakers. The learner corpus is fully documented with rich metadata and is annotated at the error and pragmatic levels.

Alessia Iurato

2024

Publication year: 2024

Volume title: Continuing Learner Corpus Research: Challenges and Opportunities

Appears in types: 3.1 Article in book

Use this identifier to cite or link to this document: https://hdl.handle.net/10278/5033540