French Learner Language Oral Corpora (FLLOC)
The FLLOC project


List and brief description of corpora

Linguistic Development Project

This corpus was collected during a one-year ESRC-funded project in 2001/2002 (R00223421; Myles), based at the University of Southampton. It is a cross-sectional corpus containing oral tasks performed by learners in years 9, 10 and 11 (ages 13, 14 and 15 respectively, after 2, 3 and 4 years of learning French) on a one-to-one basis with a researcher. There are 20 learners in each year, and each performs 4 oral activities: a directed conversation using photos as stimuli, a negative elicitation task, an interrogative elicitation task and a picture narrative.

Progression Project

This longitudinal corpus was collected from 1993 to 1996 in the context of an ESRC-funded project based at the University of Southampton and the National Foundation for Educational Research (R000234754; Mitchell & Dickson). A cohort of 60 children was tracked through two years (six terms) of classroom French, from the second term of Year 7 (age 11-12, after 1 term of classroom French) until the first term of Year 9 inclusive (age 13-14, after 2 years of French). Once each term a range of speaking activities was undertaken by the cohort, during what was called a ‘Round’ of data elicitation. The resulting corpus of spoken language data amounts to around 200 hours.

Salford Project

The data for this corpus was collected from a longitudinal study of 12 undergraduates studying French at a British university, from their first year to their fourth year. The research project was titled 'The development of linguistic knowledge and language processing capacity in second language learning'. The participants were recorded carrying out a variety of production tasks in French. There are also some examples of the participants carrying out the same tasks in English for comparison.

The principal investigators were R D Hawkins and R J Towell. Data was collected by N P Bazergui. The study used a repeated-tasks design.

Brussels Project

In this study the researchers investigated the simultaneous learning of two foreign languages (French and English) in an educational context (secondary education in the Dutch-speaking region of Flanders, Belgium).

The objective of the study was to evaluate the linguistic and socio-psychological outcomes of the teaching-learning process in both target languages at the end of secondary education and to examine the contribution of curricular and extra-curricular factors to these outcomes.

To this end, some 150 Dutch-speaking Flemish students from different schools were tested for their proficiency (speaking, reading, writing and metalinguistic knowledge) in both target languages as well as for their attitudinal-motivational disposition towards the two languages.

Newcastle corpus

This corpus was created through an AHRC-funded project (2005/2008; award number 112118) run jointly by the Universities of Newcastle and Southampton (directors Myles and Mitchell). It is a longitudinal corpus containing oral tasks performed by the same 30 English L1 learners in years 12 and 13 of the English school system (ages 16 and 17 respectively, after 5 and 6 years of learning French). The learners worked on a one-to-one basis with a researcher. They all performed 6 oral activities as well as one written test measuring vocabulary development. The corpus also contains data from 15 French native speakers performing the same tasks.

Reading Corpus I

This was a project funded by the Research Endowment Trust Fund of the University of Reading to provide transcripts of secondary school students taking their GCSE oral tests. The GCSE (General Certificate of Secondary Education) is a national examination taken by school students in Britain at the age of 16. Unlike many other summative oral language tests conducted at key points in students' education, the GCSE regulations require that candidates be tested, and their performance simultaneously scored, not by a stranger but by their own class teacher. The fact that this situation exists in a national examination system makes it an important context for investigation.

Reading Corpus II Native Speakers

This was also a project funded by the Research Endowment Trust Fund of the University of Reading. The main objective was to provide transcripts of French students covering the same topics as the British GCSE Higher Level interviews and using a format as close as possible to that of the GCSE.

University of East Anglia Corpus (UEA Corpus)

This corpus was collected by Marie-Noëlle Guillot at the University of East Anglia between November 2002 and December 2004. The data was collected from a cross-sectional study of group oppositional talk (i.e. talk in which speakers express opposing views), both in participants' L1 (French and English) and in their L2 (English and French). The participants were undergraduate student volunteers studying their L2 (French or English) to degree level after an average of 7 years of L2 study in secondary education (predominantly through classroom exposure). Each group performed the same task, a TV-like discussion with a host on the same topic (anti-smoking campaigns), first in the L2, and then in the L1 a few minutes after the L2 recording.