Projects

Keyword spotting in voice conversations recorded on the energy products trading floor

This project aimed to develop a speech analysis system that continuously processes traders' voice conversations, allowing fast and accurate keyword search in large volumes of recorded audio representing months of monitoring of hundreds of traders. The design of such a system presents several unconventional challenges due to the characteristics of trading-floor speech, the required accuracy level, and the amount of data to be processed.

Financed in part by the PRECARN-Alliance program, the project resulted from a partnership between CRIM's Speech Recognition team, Univoc and Hydro-Québec Production. The technology that was developed is now marketed by Univoc as both a product and a solution that addresses the compliance auditing, investigation and audio mining requirements of organizations with high volumes of recorded conversations.

Emotion Detection System

This project set out to detect emotions in man-machine dialogues in order to facilitate the evaluation of telephone services provided by Bell to its customers. The system aims to detect negative emotions on the part of customers participating in dialogues with robot operators. Two detection systems were developed, one dealing with customer dissatisfaction and the other with the negative emotions that customers might express. In this project, CRIM's Speech Recognition team collaborated with Bell University Laboratories, the Ecole de Technologie Superieure (department of computer engineering and information technology) and NSERC.

Live and delayed closed captioning (SST)

Since 2002, the CRIM Speech Recognition team has been developing a closed captioning technology designed for the deaf and hard-of-hearing.

The first project, begun in 2002, was entitled Sous-titrage en direct de bulletins de nouvelles et d’émissions d’intérêt public (STDirect). It studied the feasibility of live closed captioning for Québec French language news programming and general interest shows, using a voice recognition system installed at broadcasters’ studios. It led to the development of the STDirect system, which has been used on the TVA network since 2004.

Then, a series of projects, Sous-titrage en direct et à distance (SST), took place under real production conditions and satisfied users' real-world quality and reliability requirements.

The STDirect system has won several prizes:

  • 2004 IWAY Award in the Adaptive Technologies category, awarded by CANARIE to Pierre Dumouchel, CRIM’s Scientific Vice President and voice recognition researcher.
  • 2005 OCTAS Award in the Non-Profit Strategic Partner category, awarded to CRIM, Groupe TVA and RQST Conseil-expert by FiQ.
  • 2005 Innovation Award in the Partner category, awarded jointly to CRIM, Groupe TVA and RQST Conseil-expert by ADRIQ.
  • 2005 CATAAlliance Innovation Award.

CRIM has been commercializing its captioning services since 2008.

E-Inclusion Research Network Partner

The objective of the E-Inclusion Research Network is to create powerful, sensory-specific audio-visual tools and methods for multimedia content producers. This project is partly funded by Canadian Heritage.

The Network’s projects are designed to improve the richness of multimedia experiences for people with sensory deficiencies, thus making audio-visual cultural products accessible to all.

CRIM’s Voice Recognition team contributes to the development of live and delayed closed captioning tools for deaf and hard-of-hearing users of Canadian cultural content.

C³GRID project partner

Closed Captioning Computing GRID

Financed in part by the CANARIE–ARIM program, the C³GRID project's objective was to develop a computing grid for the distributed learning of acoustic, visual and language models used in speech recognition.

Originator of the RAP project

Automated speech Recognition, automated transcription and general Access to Parliamentary debates and testimony at various committees

The RAP project provided the deaf and hard-of-hearing with live, multimodal access to the debates and information.

Text-dialog synchronization for post-synchronization and dubbing

This project focused on developing and tuning an automated voice alignment technology for Ryshco Media, and on integrating this application into a post-synchronization and dubbing assistance system.

MADIS project partner

MPEG-7 Audio-Visual Document Indexation System

The MADIS project focused on developing a test bench for the indexation and continuous search of films based on the MPEG-7 standard.

The National Film Board of Canada (NFB) and CRIM’s Vision and Imaging and Speech Recognition teams collaborated on the project. MADIS was financed in part by the CANARIE E-content program.

Prototype for the automatic closed captioning of news and public interest programs

The goal of this project was to adapt CRIM’s speech recognition technology to closed captioning for Groupe TVA, in order to provide subtitles in Québec French for news and public interest programs.

Groupe TVA also mandated RQST Conseil-expert (a closed captioning advocacy group in Québec) to evaluate the subtitles generated by the new system and to determine their appropriateness for deaf and hard-of-hearing users.

Speech recognition based on Bayesian adaptation

This exploratory research aimed to develop new statistical modelling methods for speech recognition. For instance, what is the voice frequency distribution for speakers of a given language? Based on that distribution, the marginal frequency for each speaker within a given group can easily be determined. This frequency may then be used to build a Markovian model of the speaker’s speech using the standard method. It should be noted that the marginal distribution for a given speaker is derived from the data of all speakers within the group, so the resulting Markovian model differs from one obtained through speaker-dependent training. Solving this problem therefore led to the development of new speaker adaptation methods that could be applied to speech recognition as well as to the automated identification of speech.
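
To give a concrete sense of what Bayesian speaker adaptation can look like, here is a minimal sketch of a textbook MAP (maximum a posteriori) update of a single Gaussian mean. It is an illustration only and is not the method developed in this project: the function name map_adapt_mean, the relevance parameter and the one-dimensional features are all assumptions chosen for brevity. The prior mean stands in for statistics pooled over all speakers in a group, and the update pulls it toward the data observed for one speaker.

```python
from typing import Sequence


def map_adapt_mean(prior_mean: float,
                   speaker_frames: Sequence[float],
                   relevance: float = 16.0) -> float:
    """Hypothetical helper: MAP-adapt one Gaussian mean to one speaker.

    prior_mean     -- mean estimated from all speakers in the group
    speaker_frames -- feature values observed for the target speaker
    relevance      -- assumed prior strength; larger values trust the
                      group prior more than the speaker's own data
    """
    n = len(speaker_frames)
    if n == 0:
        # No speaker data: fall back to the group prior unchanged.
        return prior_mean
    sample_mean = sum(speaker_frames) / n
    # Interpolation weight grows toward 1 as more speaker data arrives.
    weight = n / (n + relevance)
    return weight * sample_mean + (1.0 - weight) * prior_mean


if __name__ == "__main__":
    group_mean = 0.0
    frames = [2.0] * 16
    print(map_adapt_mean(group_mean, frames))
```

With little speaker data the adapted mean stays close to the group prior, and with more data it approaches the speaker's own average, which is the general trade-off that adaptation methods of this family aim for.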

 
CONTACT

Gilles Boulianne

Team Director, Speech Recognition & Senior Research and Development Advisor

514 840-1235, ext. 5282
