TY - GEN
T1 - Investigating speech reconstruction for laryngectomees for silent speech interfaces
AU - Cao, Beiming
AU - Sebkhi, Nordine
AU - Bhavsar, Arpan
AU - Inan, Omer T.
AU - Samlan, Robin
AU - Mau, Ted
AU - Wang, Jun
N1 - Funding Information:
This work was supported by the National Institutes of Health (NIH) under award numbers R03DC013990 and R01DC016621 and by the American Speech-Language-Hearing Foundation through a New Century Scholar Research Grant. We also thank Avery Singson for helping with data preprocessing, and the volunteering participants.
Publisher Copyright:
Copyright © 2021 ISCA.
PY - 2021
Y1 - 2021
N2 - Silent speech interfaces (SSIs) are devices that convert non-audio bio-signals to speech, and they hold the potential of recovering quality speech for laryngectomees (people who have undergone laryngectomy). Although significant progress has been made, most recent SSI work has focused on data collected from healthy speakers. SSIs for laryngectomees have rarely been investigated. In this study, we investigated the reconstruction of speech for two laryngectomees who use either tracheoesophageal puncture (TEP) or electrolarynx (EL) speech as their post-surgery communication mode. We reconstructed their speech using two SSI designs: (1) real-time recognition-and-synthesis and (2) direct articulation-to-speech synthesis (ATS). The reconstructed speech samples were rated for naturalness and intelligibility in a subjective evaluation by 20 listeners. The results indicated that both designs increased the naturalness of alaryngeal speech. The real-time recognition-and-synthesis design also obtained higher intelligibility for electrolarynx speech, while ATS did not. These preliminary results suggest that the real-time recognition-and-synthesis design may have better potential for clinical applications (for laryngectomees) than ATS.
AB - Silent speech interfaces (SSIs) are devices that convert non-audio bio-signals to speech, and they hold the potential of recovering quality speech for laryngectomees (people who have undergone laryngectomy). Although significant progress has been made, most recent SSI work has focused on data collected from healthy speakers. SSIs for laryngectomees have rarely been investigated. In this study, we investigated the reconstruction of speech for two laryngectomees who use either tracheoesophageal puncture (TEP) or electrolarynx (EL) speech as their post-surgery communication mode. We reconstructed their speech using two SSI designs: (1) real-time recognition-and-synthesis and (2) direct articulation-to-speech synthesis (ATS). The reconstructed speech samples were rated for naturalness and intelligibility in a subjective evaluation by 20 listeners. The results indicated that both designs increased the naturalness of alaryngeal speech. The real-time recognition-and-synthesis design also obtained higher intelligibility for electrolarynx speech, while ATS did not. These preliminary results suggest that the real-time recognition-and-synthesis design may have better potential for clinical applications (for laryngectomees) than ATS.
KW - Alaryngeal speech
KW - Silent speech interfaces
UR - http://www.scopus.com/inward/record.url?scp=85119296143&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85119296143&partnerID=8YFLogxK
U2 - 10.21437/Interspeech.2021-1842
DO - 10.21437/Interspeech.2021-1842
M3 - Conference contribution
AN - SCOPUS:85119296143
T3 - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
SP - 436
EP - 440
BT - 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021
PB - International Speech Communication Association
T2 - 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021
Y2 - 30 August 2021 through 3 September 2021
ER -