I currently work for the Netherlands Cancer Institute on a grant from Atos Medical, Hörby, Sweden. My work there is to study and implement speech and language technologies that can help in the rehabilitation and therapy of patients.
Speak Good Chinese is a cross-platform application based on GTK technology that lets you or your students train Mandarin pronunciation. Our software is based on Praat, the leading software in speech analysis. Our speech technology is backed by the Institute of Phonetic Sciences, part of the University of Amsterdam.
We have a short promotional video made by Lifeng Liu and Guangqin Chen (in a small and a large format), which can also be found at digiemotion and at Surfnet.
This project was supported by grant 6046 from the Digitale Universiteit.
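To give an impression of the kind of speech analysis involved, the sketch below shows how a Praat-style pitch analysis turns a recorded syllable into the pitch (F0) contour on which Mandarin tone feedback can be based. It is not taken from the Speak Good Chinese code base: it uses the third-party parselmouth Python interface to Praat, and the file name and pitch range are illustrative only.

    # Illustrative only: extract a pitch (F0) contour from a recorded syllable
    # with Praat's pitch tracker, here via the parselmouth Python interface.
    # Speak Good Chinese itself builds on Praat directly; this is just a sketch.
    import numpy as np
    import parselmouth

    def pitch_contour(wav_path, fmin=75.0, fmax=500.0):
        """Return (times, f0) for the voiced frames of a recording."""
        sound = parselmouth.Sound(wav_path)                 # load the recording
        pitch = sound.to_pitch(pitch_floor=fmin, pitch_ceiling=fmax)
        f0 = pitch.selected_array['frequency']              # Hz, 0 where unvoiced
        times = pitch.xs()
        voiced = f0 > 0
        return times[voiced], f0[voiced]

    if __name__ == "__main__":
        t, f0 = pitch_contour("ma3.wav")                    # hypothetical file name
        if len(f0) == 0:
            print("no voiced frames found")
        else:
            dip = int(np.argmin(f0))                        # crude third-tone check
            print("dipping contour" if 0 < dip < len(f0) - 1 else "no dip found")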
Our understanding of the comprehension of spoken language lacks
quantitative knowledge of how the different aspects
of language are integrated. Both the time-course with which
information becomes available and the way the diverse sources
of information are combined are relatively unknown. Speech
recognition in the classical sense of "structured
word-recognition" is an extremely complicated process. It is
necessary to start tackling the general problem of the
extraction and integration of information in speech
comprehension with a simpler sub-task. A much simpler problem,
which covers the whole spectrum of language communication, is
the prediction of turn-switches in conversation. Turn-switches
in various forms are the basic control mechanism of
conversations. For the hearer, the task is deceptively simple:
determine when to start talking. This makes turn-switching a
good model for the extraction and integration of linguistic
information, as all sources of relevant information
are synchronized with the turn-switching points
(Turn-Relevant Places, or TRPs). From an experimental point of
view, the interference from the task itself, whether or not to
start speaking, is minimal, as the number of choices is
extremely limited. Therefore, the research can concentrate on
the integrating process itself.

This project concerns the
quantitative modeling of TRP identification in conversation as
an integration process of temporally unfolding information at
different levels in speech, from conversation-acts and
semantics to prosody, phonetics, and visual cues. Reaction Time
(RT) measurements from TRP monitoring in manipulated (partial)
conversations will be used to determine exactly when the
relevant information at different levels of speech becomes
available and how it is integrated to predict the position of a
TRP. We will especially look at generalizations of the MERGE
model extended with a Random-Walk decision model. We will
include both the standard flat Bayesian decision rule and more
structured hierarchical models of integration.
Key words: speech comprehension, information integration,
conversation, talk-in-interaction, turn-switching
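The last sentence of the abstract can be made concrete with a small sketch. The code below implements one possible reading of a random-walk (evidence accumulation) decision model with a flat Bayesian update: as each cue (prosodic, phonetic, semantic, ...) becomes available over time, its log likelihood ratio for "TRP" versus "no TRP" is added to an evidence counter, and a turn-switch response is predicted when the counter crosses a threshold. The cue names, likelihoods, and threshold are invented for illustration and are not results or parameters of the project.

    # Sketch of a random-walk (evidence accumulation) decision model with a
    # flat Bayesian update: each incoming cue shifts the accumulated log-odds
    # that a Turn-Relevant Place (TRP) is imminent; a response is predicted
    # once the log-odds cross a decision threshold.
    # All numbers below are invented for illustration.
    import math

    def predict_trp(cues, threshold=2.0, prior_odds=1.0):
        """cues: list of (time_ms, p_cue_given_trp, p_cue_given_no_trp).
        Returns the time at which the threshold is crossed, or None."""
        evidence = math.log(prior_odds)              # start from the prior log-odds
        for t, p_trp, p_no_trp in cues:
            evidence += math.log(p_trp / p_no_trp)   # flat Bayesian update = one random-walk step
            if evidence >= threshold:
                return t                             # predicted "start talking" moment
        return None                                  # no TRP predicted in this stretch

    # Invented example: final lengthening, a falling boundary tone, and
    # syntactic/semantic completion arriving one after the other.
    cues = [
        (1200, 0.60, 0.40),
        (1350, 0.70, 0.30),
        (1500, 0.80, 0.20),
    ]
    print(predict_trp(cues))   # -> 1500 with these made-up likelihoods

A hierarchical variant would replace the single counter with separate accumulators per information level (prosody, semantics, and so on) whose outputs are combined at a second stage, rather than pooling all cues into one flat sum.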