Designing and compiling the written sub-corpus of the Bimodal Italian Learner Corpus of Chinese (BILCC): Methodological issues

Alessia Iurato
2023-01-01

Abstract

This article introduces a new methodological resource for research on L2 Chinese acquisition: the written sub-corpus of the Bimodal Italian Learner Corpus of Chinese (BILCC). Grounded in the Learner Corpus Research (LCR) framework, the corpus was assembled according to strict design criteria. It is a specific-purpose corpus, designed to explore L1 Italian learners' pragmalinguistic knowledge of Chinese shì...de clefts. It consists of contextualized written data produced by 103 Italian learners at beginner, intermediate, and advanced levels, totaling 53,437 Chinese characters, 38,793 tokens, and 693 word types. The corpus design also includes an equivalent sub-corpus of data from 30 L1 Chinese speakers. The paper presents the features of the corpus design and describes the corpus typology as well as the environment, learner, and task variables. The data collection procedure and the corpus size are also discussed. Finally, the paper demonstrates the effectiveness of the SLA-theory-motivated data collection tasks by presenting statistical analyses of the collected data on the production of shì...de clefts.
Studies on Chinese Language and Linguistics in Italy, 2023
Use this identifier to cite or link to this document: https://hdl.handle.net/10278/5003704