Designing and compiling the written sub-corpus of the Bimodal Italian Learner Corpus of Chinese (BILCC): Methodological issues

This article introduces a new methodological resource for research in L2 Chinese acquisition: the written sub-corpus of the Bimodal Italian Learner Corpus of Chinese (BILCC). The corpus, methodologically grounded in the Learner Corpus Research (LCR) framework, has been assembled according to strict design criteria. It is a specific-purpose corpus, designed to explore L1 Italian learners' pragmalinguistic knowledge of Chinese shì...de clefts. The corpus consists of contextualized written data produced by 103 Italian learners at beginner, intermediate, and advanced levels, totaling 53,437 Chinese characters, 38,793 tokens, and 693 word types. The corpus design also includes an equivalent sub-corpus of data from 30 L1 Chinese speakers. The paper presents the features of the corpus design and describes the corpus typology as well as the environment, learner, and task variables. The data collection procedure and the corpus size are also discussed. Finally, the paper demonstrates the effectiveness of the tasks used for data collection, which are motivated by SLA theory, by presenting statistical analyses of the collected data on the production of shì...de clefts.

Alessia Iurato

2023

Publication year: 2023
Volume title: Studies on Chinese Language and Linguistics in Italy
DOI: https://dx.doi.org/10.30682/sitlec45
Appears in categories: 3.1 Article in book

Files in this item:

File	Size	Format
Iurato_179-228_removed.pdf (open access; type: publisher's version; license: free access, no restrictions)	734.49 kB	Adobe PDF

Documents in ARCA are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10278/5003704
