The Effect of Cohesive Features in Integrated and Independent L2 Writing Quality and Text Classification

Keywords: cohesion, integrated writing assessment, L2 writing, TAACO, text classification


Cohesion features were calculated for a corpus of 960 essays by 480 test-takers from the Test of English as a Foreign Language (TOEFL) in order to examine differences in the use of cohesion devices between integrated (source-based) writing and independent writing samples. Cohesion indices were measured using an automated textual analysis tool, the Tool for the Automatic Assessment of Cohesion (TAACO). A discriminant function analysis correctly classified essays as either integrated or independent in 92.3 per cent of cases. Integrated writing was marked by higher use of specific connectives and greater lexical overlap of content words between textual units, whereas independent writing was marked by greater lexical overlap of function words, especially pronouns. Regression analyses found that cohesive indices which distinguish tasks predict writing quality judgments more strongly in independent writing. However, the strongest predictor of human judgments was the same for both tasks: lexical overlap of function words. The findings demonstrate that text cohesion is a multidimensional construct shaped by the writing task, yet the measures of cohesion which affect human judgments of writing quality are not entirely different across tasks. These analyses allow us to better understand cohesive features in writing tasks and their implications for automated writing assessment.

Author Biographies

Rurik Tywoniw, Georgia State University, USA

Rurik Tywoniw is a Ph.D. student in Applied Linguistics at Georgia State University. His research interests include second language assessment, second language literacy, and computational linguistics. His work at Georgia State University includes coordinating the Georgia State Test of English Proficiency and teaching Elementary Japanese.

Scott Crossley, Georgia State University, USA

Scott Crossley is Professor at Georgia State University. His interests include computational linguistics, corpus linguistics, cognitive science, discourse processing, and discourse analysis. His primary research focuses on the development and application of computational tools in second language learning and text comprehensibility.