Oncology Related Communication Disorders

I currently work for the Netherlands Cancer Institute on a grant from Atos Medical, Hörby, Sweden. My work there is to study and implement speech and language technologies that can support the rehabilitation and therapy of patients.


Speak Good Chinese is a cross-platform application based on GTK technology that lets you or your students practice Mandarin pronunciation. Our software is based on Praat, a leading program for speech analysis. Our speech technology is backed by the Institute of Phonetic Sciences, part of the University of Amsterdam.

We have a short promotional video made by Lifeng Liu and Guangqin Chen (in a small and a large format), which can also be found at digiemotion and at Surfnet.

This project was supported by grant 6046 from the Digitale Universiteit.

Integration of information in spoken communication

Our understanding of the comprehension of spoken language lacks quantitative knowledge of how the different aspects of language are integrated. Both the time course with which information becomes available and the way the diverse sources of information are combined are relatively unknown. Speech recognition in the classical sense of "structured word-recognition" is an extremely complicated process, so it is necessary to start tackling the general problem of the extraction and integration of information in speech comprehension with a simpler sub-task. A much simpler problem, which still covers the whole spectrum of language communication, is the prediction of turn-switches in conversation. Turn-switches in various forms are the basic control mechanism of conversations. For the hearer, the task is deceptively simple: determine when to start talking. This makes turn-switching a good model for the extraction and integration of linguistic information, as all sources of relevant information are synchronized with the turn-switching points (Turn-Relevant-Places or TRPs). From an experimental point of view, the interference from the task itself, whether or not to start speaking, is minimal, as the number of choices is extremely limited. Therefore, the research can concentrate on the integrating process itself.

This project concerns the quantitative modeling of TRP identification in conversation as an integration process of temporally unfolding information at different levels in speech, from conversation-acts and semantics to prosody, phonetics, and visual cues. Reaction Time (RT) measurements from TRP monitoring in manipulated (partial) conversations will be used to determine exactly when the relevant information at different levels of speech becomes available and how it is integrated to predict the position of a TRP. We will especially look at generalizations of the MERGE model extended with a Random-Walk decision model. We will include both the standard flat Bayesian decision rule and more structured Hierarchical models of integration.
Key words: speech comprehension, information integration, conversation, talk-in-interaction, turn-switching
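
As a rough illustration of the kind of model described above, the sketch below combines a flat Bayesian rule with a random-walk decision. It is a minimal toy under stated assumptions, not the project's actual model: the function names, the per-frame likelihood ratios, and the threshold value are all hypothetical. Each time frame supplies one likelihood ratio per cue (for example prosody and semantics), the cues are treated as independent and their log ratios are summed, and the walk stops as soon as the accumulated evidence crosses a decision boundary. The frame index at which it stops plays the role of a reaction time.

```python
import math


def flat_bayes_step(cue_likelihood_ratios):
    """Combine cues under a flat (independence) assumption.

    Each ratio is P(cue | TRP) / P(cue | no TRP); the log ratios are summed.
    """
    return sum(math.log(ratio) for ratio in cue_likelihood_ratios)


def random_walk_decision(evidence_per_frame, threshold=3.0):
    """Accumulate log-evidence frame by frame until a boundary is crossed.

    Returns (frame_index, decision). The threshold is an illustrative value,
    not one taken from the project description.
    """
    total = 0.0
    for frame_index, ratios in enumerate(evidence_per_frame):
        total += flat_bayes_step(ratios)
        if total >= threshold:
            return frame_index, "TRP"
        if total <= -threshold:
            return frame_index, "no TRP"
    return None, "undecided"


# Hypothetical likelihood ratios for two cues over four time frames.
frames = [(1.0, 1.0), (2.0, 1.5), (3.0, 2.0), (4.0, 2.0)]
print(random_walk_decision(frames))
```

A hierarchical variant would replace `flat_bayes_step` with a rule that first combines cues within a level (say, prosodic cues) before combining across levels; that structure is not sketched here.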


  • Clapham, R. P., Martens, J-P., van Son, R. J. J. H., Hilgers, F. J. M., van den Brekel, M. W. M., & Middag, C. (2016). Computing scores of voice quality and speech intelligibility in tracheoesophageal speech for speech stimuli of varying lengths. Computer Speech and Language, 37, 1-10. DOI: 10.1016/j.csl.2015.10.001
  • Kraaijenga, S. A. C., Oskam, I. M., van Son, R. J. J. H., Hamming-Vrieze, O., Hilgers, F. J. M., van den Brekel, M. W. M., & van der Molen, L. (2016). Assessment of voice, speech, and related quality of life in advanced head and neck cancer patients 10-years+ after chemoradiotherapy. Oral Oncology, 55, 24-30. DOI: 10.1016/j.oraloncology.2016.02.001
  • van Sluis, K. E., van den Brekel, M. W. M., Hilgers, F. J. M., & van Son, R. J. J. H. (2016). Long-Term Stability of Tracheoesophageal Voices. In Interspeech [114]


  • Clapham, R. P., van As-Brooks, C. J., van Son, R. J. J. H., Hilgers, F. J. M., & van den Brekel, M. W. M. (2015). The Relationship Between Acoustic Signal Typing and Perceptual Evaluation of Tracheoesophageal Voice Quality for Sustained Vowels. Journal of Voice, 29(4), 517.e23-517.e29. DOI: 10.1016/j.jvoice.2014.10.002
  • Schuller, B., Steidl, S., Batliner, A., Nöth, E., Vinciarelli, A., Burkhardt, F., ... Weiss, B. (2015). A Survey on perceived speaker traits: Personality, likability, pathology, and the first challenge. Computer Speech and Language, 29(1), 100-131. DOI: 10.1016/j.csl.2014.08.003


  • Clapham, R., Middag, C., Hilgers, F., Martens, J-P., van den Brekel, M., & van Son, R. (2014). Developing automatic articulation, voice quality and accent assessment techniques for speakers treated for advanced head and neck cancer. Speech Communication, 59, 44-54. DOI: 10.1016/j.specom.2014.01.003
  • Middag, C., Clapham, R., van Son, R., & Martens, J-P. (2014). Robust automatic intelligibility assessment techniques evaluated on speakers treated for head and neck cancer. Computer Speech and Language, 28(2), 467-482. DOI: 10.1016/j.csl.2012.10.007


  • Clapham, R. P., van der Molen, L., van Son, R. J. J. H., van den Brekel, M., & Hilgers, F. J. M. (2012). NKI-CCRT corpus: speech intelligibility before and after advanced head and neck cancer treated with concomitant chemoradiotherapy. In N. Calzolari, K. Choukri, T. Declerck, M. Uğur Doğan, B. Maegaard, J. Mariani, J. Odijk, ... S. Piperidis (Eds.), Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12): 23-25 May, 2012, Istanbul, Turkey (pp. 3350-3355). Paris: European Language Resources Association (ELRA).



  • van Son, R. J. J. H., Wesseling, W., Sanders, E., & van den Heuvel, H. (2009). Promoting free dialog video corpora: the IFADV corpus example. In M. Kipp, J. C. Martin, P. Paggio, & D. Heylen (Eds.), Multimodal corpora: from models of natural interaction to systems and applications (pp. 18-37). (Lecture notes in computer science; No. 5509). Berlin: Springer. DOI: 10.1007/978-3-642-04793-0_2


  • van Son, R. J. J. H., Wesseling, W., Sanders, E., & van den Heuvel, H. (2008). The IFADV corpus: a free dialog video corpus. In European Language Resources Association (Ed.), Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08) (pp. 1-8). Paris: ELDA.


  • Escudero, P., Kastelein, J., Weiand, K. A., & van Son, R. J. J. H. (2007). Formal modelling of L1 and L2 perceptual learning: Computational linguistics versus machine learning. Interspeech, 8, 1008-1011.
  • Escudero, P., Kastelein, J., Weiand, K. A., & van Son, R. J. J. H. (2007). Formal modelling of L2 perceptual learning: Computational linguistics versus machine learning. In Proceedings of Interspeech 2007. (pp. 1008-1011). Rundle Mall, Australia: Causal Productions.
  • Weenink, D. J. M., Chen, G., Chen, Z., de Konink, S., Vierkant, D., Hagen, E., & van Son, R. J. J. H. (2007). Learning tone distinction for Mandarin Chinese. Interspeech, 8, 950-953.
  • Wesseling, W., van Son, R. J. J. H., & Pols, L. C. W. (2007). The influence of masking words on the prediction of TRPs in a shadowed dialog. Interspeech, 8, 816-819.


  • Pols, L. C. W., & van Son, R. J. J. H. (2006). Speech dynamics: Acoustic manifestations and perceptual consequences. In P. Divenyi, S. Greenberg, & G. Meyer (Eds.), Dynamics of Speech Production and Perception (pp. 71-80). (NATO Science Series, Life and Behavioural Sciences; No. 374). IOS Press.
  • Wesseling, W., van Son, R. J. J. H., & Pols, L. C. W. (2006). On the sufficiency and redundancy of pitch for TRP projection. Interspeech, 7, 2402-2405.
  • van Son, R. J. J. H., Wesseling, W., & Pols, L. C. W. (2006). Prominent words as anchors for TRP projection. Interspeech, 7, 465-468.



  • van Eijk, N. (Author), Roessler, B. (Author), Zuiderveen Borgesius, F. (Author), Oostveen, M. (Author), et al., . U. (Author), van Son, R. (Author), ... Taylor, L. (Author). (2014). Academics Against Mass Surveillance.


  • Clapham, R., Hilgers, F., van den Brekel, M., & van Son, R. (2011). An exploration into automatic phonological feature evaluation of tracheoesophageal speech. In W. Zonneveld, H. Quené, & W. Heeren (Eds.), Sound and sounds: studies presented to M.E.H. (Bert) Schouten on the occasion of his 65th birthday (pp. 69-79). Utrecht: Utrecht Institute of Linguistics OTS.

Book editor

  • van Hamme, H., & van Son, R. J. J. H. (2007). Proceedings of Interspeech 2007 (CD ROM). Antwerp: ISCA.
