Master TAL - MSc. NLP

Course Unit

Spoken Corpora







Course Description

This course introduces the signals of oral communication that can be collected (sound, images, the geometry of the face and vocal tract, aerodynamic parameters, gestures). It also introduces the technologies used to collect them (microphone, MRI, ultrasound, electromagnetic articulography) and the technical or ethical constraints that these collection methods impose. The second part of the course covers the design of annotated corpora, including the automatic tools used for annotation. The last part is devoted to annotation software and corpus management, which are essential for making proper use of a corpus once it has been annotated.


Learning Outcomes

  • Knowledge of the specific features of spoken corpora

  • Design of the content of spoken corpora

  • Annotation of spoken corpora


Prerequisites

  • First-semester courses of the master's programme have no prerequisites other than those defined for the specialisation

Targeted Skills

  • Capacity to collect, structure, and represent data (sound, text, images, …)
  • Capacity to combine and use interdisciplinary skills and know-how to create innovative solutions

More Information


  • To be completed

Course URL – Arche

  • To be completed

Links with other courses

  • 702-EC2, 803 and 902-EC2

Evaluation procedures

Number of Tests

  • 2

Nature of the tests

  • labs
  • final exam

Group work

  • N/A

Combined with other specialisations

  • No
