R&D Seminar: «Automatic discriminative measurement of voice onset time»

December 12, 2013, 12:30 p.m.
CRIM (405 avenue Ogilvy, Suite 101, Montréal)

These free scientific seminars are given by internationally renowned experts, university professors with whom CRIM collaborates, CRIM's R&D staff, and its master's and doctoral scholarship holders. On the program: accessible presentations on the latest scientific and technological advances. CRIM members and partners are welcome at these seminars.

Thursday, December 12, 2013, from 12:30 p.m. to 1:30 p.m., Room 9
At CRIM, 405 avenue Ogilvy, Suite 101, Montréal
Free talk; register with Carmen.Robert@crim.ca

Speaker: Morgan Sonderegger, Ph.D., Assistant Professor in the Department of Linguistics, McGill University.

Abstract: In both clinical and research settings, most research on human speech production has been carried out using small to medium-sized datasets of read speech recorded in the lab. However, much larger corpora of diverse types of speech, developed by researchers in both automatic speech recognition and linguistics, have become increasingly available and easy to construct. Scaling up to these corpora promises to change the questions researchers can ask about human speech production. However, this promise depends on the development of accurate algorithms to speed up or replace manual annotation, which becomes infeasible for large corpora. With some important exceptions (e.g. vowel formants), such algorithms do not currently exist for most quantities that are widely measured in phonetic research. This talk describes an automatic measurement algorithm for the most widely measured consonantal variable, voice onset time (VOT), treated as a case of predicting structured output from speech. Manually labeled data is used to train a function that takes as input a speech segment of arbitrary length containing a voiceless stop, and outputs its VOT. The function is explicitly trained to minimize the difference between predicted and manually measured VOT; it operates on a set of acoustic feature functions designed based on the spectral and temporal cues used by human VOT annotators. Experiments on four corpora demonstrate that the algorithm is highly accurate (near inter-transcriber reliability) and can be quickly adapted to new datasets. We also describe a concrete application of the algorithm in an ongoing study of language change in Glasgow, where its use has resulted in a massive decrease in annotation time, and discuss outstanding challenges in applying machine learning methods in phonetic research that this application illustrates.
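To make the "structured output" framing in the abstract concrete, here is a minimal sketch of how such a predictor could be organized. It is an illustration under stated assumptions, not the speaker's algorithm: the per-frame features, the three feature functions, the weight values, and the names `features`, `predict_vot`, and `vot_error` are all hypothetical. The idea shown is only the general one: score every candidate pair (burst onset, voicing onset) with a weighted sum of feature functions, return the best-scoring pair, and measure error as the difference between predicted and manually measured VOT.

```python
from typing import List, Sequence, Tuple

# Hypothetical per-frame acoustic features:
# (high-frequency energy, voicing strength). Not from the talk.
Frame = Tuple[float, float]


def features(frames: Sequence[Frame], t_burst: int, t_voice: int) -> List[float]:
    """Hypothetical feature functions for a candidate (burst, voicing) pair."""
    # Rise in energy at the candidate burst onset.
    burst_jump = frames[t_burst][0] - frames[t_burst - 1][0]
    # Rise in voicing strength at the candidate voicing onset.
    voice_jump = frames[t_voice][1] - frames[t_voice - 1][1]
    # Mean voicing inside the candidate VOT interval (expected to be low).
    interval = frames[t_burst:t_voice]
    mean_voicing = sum(f[1] for f in interval) / len(interval)
    return [burst_jump, voice_jump, mean_voicing]


def predict_vot(
    frames: Sequence[Frame],
    weights: Sequence[float],
    min_vot: int = 1,
    max_vot: int = 20,
) -> Tuple[int, int]:
    """Return the (burst, voicing) frame pair with the highest linear score.

    Brute-force search over all pairs with min_vot <= t_voice - t_burst <= max_vot.
    """
    best_score = None
    best_pair = None
    for t_burst in range(1, len(frames) - min_vot):
        last_voice = min(t_burst + max_vot, len(frames) - 1)
        for t_voice in range(t_burst + min_vot, last_voice + 1):
            phi = features(frames, t_burst, t_voice)
            score = sum(w * f for w, f in zip(weights, phi))
            if best_score is None or score > best_score:
                best_score, best_pair = score, (t_burst, t_voice)
    if best_pair is None:
        raise ValueError("segment too short for the given VOT bounds")
    return best_pair


def vot_error(predicted: Tuple[int, int], manual: Tuple[int, int]) -> int:
    """Absolute difference, in frames, between predicted and manual VOT."""
    return abs((predicted[1] - predicted[0]) - (manual[1] - manual[0]))


if __name__ == "__main__":
    # Toy segment: 3 silent frames, 5 frames of aspiration, 4 voiced frames.
    toy = [(0.0, 0.0)] * 3 + [(1.0, 0.0)] * 5 + [(0.3, 1.0)] * 4
    hand_set_weights = [1.0, 1.0, -1.0]  # assumed values, not trained
    print(predict_vot(toy, hand_set_weights))
```

In a trained system the weights would be learned from manually labeled segments so that a quantity like `vot_error` is small on the training data; the hand-set weights here only stand in for that step.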

Biography: Morgan Sonderegger has been an Assistant Professor of Linguistics at McGill University since 2012. He received his Ph.D. in Computer Science and in Linguistics from the University of Chicago (2012). His research addresses the structure and causes of variability in speech sounds, particularly variability over time, both within individual speakers and in communities. He studies these topics by building computational and statistical models of variability in datasets from diverse sources, such as reality television, rhyming poetry, and behavioral experiments. He is also broadly interested in developing quantitative methods for linguistics, especially drawing on methods from machine learning and automatic speech recognition.

This talk will be given in English.
