Director | |||
|
Carol Espy-Wilson Director, Speech Communication Lab Professor Electrical and Computer Engineering The Institute for Systems Research Email: espy AT umd DOT edu |
||
Current Postdoctoral Fellows |
|||
|
Nina Benway Postdoctoral Fellow Email : benway AT umd DOT edu Nina R Benway, PhD CCC-SLP, is a Postdoctoral Fellow in Electrical and Computer Engineering with Dr. Carol Espy-Wilson. Nina’s research seeks to validate Dr. Espy-Wilson’s acoustic-to-articulatory speech inversion system for child speakers and speakers with speech sound disorders, for the long-term goal of extracting corrective, knowledge-of-performance articulatory feedback from speech sound learner audio. |
||
Current Graduate Students |
|||
|
Ahmed Adel Attia Research Assistant Email : aadel AT umd DOT edu Ahmed Adel Attia is a Ph.D. student and researcher at the University of Maryland. He obtained his bachelor’s degree with honors from Alexandria University in Egypt in 2020, where he ranked among the top students in his class. With extensive expertise in Deep Learning, Ahmed has gained valuable experience in its applications within NLP, computer vision, and speech-related fields. Presently, his research concentrates on speech articulation and production, utilizing deep learning to comprehend and analyze various aspects of speech phenomena. |
||
|
Thanushi Withanage Research Assistant Email: thanushi AT umd DOT edu Thanushi is a Ph.D. student at the Speech Communication Laboratory (SCL). She completed her bachelor’s degree in Electrical and Electronic Engineering at the University of Peradeniya, Sri Lanka, in 2019. Her research interests are in applications of deep learning to identify the correlation of mental health disorders with speech, specifically in depression analysis. |
||
|
Gowtham Premananth Research Assistant Email: gowtham8 AT umd DOT edu Gowtham is a Ph.D. student at the Speech Communication Laboratory (SCL). He completed his bachelor’s degree in Computer Engineering at the University of Jaffna, Sri Lanka, in 2021, where he ranked at the top of his class. His research interests are in the applications of deep learning to speech processing and related multimodal inputs such as video and text. Currently, his research focuses on using deep learning to distinguish different symptoms of mental health disorders such as schizophrenia through a person’s speech and facial gestures. |
||
|
Shuubham Ojha Research Assistant Email: sojha1 AT umd DOT edu Shuubham has been working towards his Ph.D. at the Speech Communication Laboratory (SCL) since October 2023. He completed his bachelor’s in ECE at the Indian Institute of Technology, Guwahati, in 2018 and his master’s at the Indian Institute of Technology, Kanpur, in 2022. His research interests span applications of diffusion models to problems in the speech domain. He has also gained experience in complementary areas such as convex optimization and large language models for speech transcription. |
||
|
Saba Tabatabaee Research Assistant Email: sabatb AT umd DOT edu Saba is a Ph.D. student at the Speech Communication Laboratory (SCL) at the University of Maryland. She graduated from the University of Tehran in Electrical and Computer Engineering in 2021. Her research interests lie in the application of deep learning to speech processing. Currently, she is focusing on speech inversion systems, using deep learning techniques to understand and analyze various aspects of speech articulation. |
||
Current Undergraduate Students |
|||
Alumni |
|||
Post-Doctoral Researchers |
|||
Vahid Khanagha (2013 – 2017; currently at Motorola) | |||
Xinhui Zhou (2010 – 2014; currently at Knowles) | |||
Vikramjit Mitra (2011; currently at Apple Inc.) | |||
Om Deshmukh (2006 – 2007; currently at Envestnet Inc.) | |||
Gongjun Li | |||
Zhaoyan Zhang (2002 – 2004; currently Associate Professor at the UCLA School of Medicine) | |||
Suzanne Boyce (1994 – 1995; currently Professor at the University of Cincinnati) | |||
Ph.D. Students |
|||
Name | Graduated | Title of Thesis | Current Employment |
Yashish M. Siriwardena | 2023 | “Towards Extending Acoustic-to-Articulatory Speech Inversion and Learning Articulatory Representations” | Omnispeech |
Nadee Seneviratne | 2022 | “Generalizable Depression Detection and Severity Prediction Using Articulatory Representations of Speech” | MathWorks |
Saurabh Sahu | 2019 | “Towards Building Generalizable Speech Emotion Recognition Models” | Samsung |
Ganesh Sivaraman | 2017 | “Articulatory representations to address acoustic variability in speech” | Pindrop |
Daniel Garcia-Romero | 2012 | “Robust Speaker Recognition Based on Latent Variable Models” | HLTCOE at JHU |
Srikanth Vishnubhotla | Feb 2011 | “Segregation of Speech Signals in Noisy Environments” | Amazon |
Vikramjit Mitra | Dec 2010 | “Articulatory Information for Robust Speech Recognition” | Apple Inc. |
Tarun Pruthi | Jan 2007 | “Analysis, Vocal-Tract Modeling, and Automatic Detection of Vowel Nasalization” | Meta |
Om Deshmukh | Jul 2006 | “Synergy of Acoustic-Phonetics and Auditory Modeling Towards Robust Speech Recognition” | Envestnet Inc. |
Amit Juneja | Dec 2004 | “Probabilistic landmark detection based on acoustic-phonetic information for automatic speech recognition” | Agile Data Decisions |
Nabil Bitar * | Fall 1997 | “Acoustic modeling of speech based on phonetic features” | GTE |
M. S. (Thesis) |
|||
Name | Graduated | Title of Thesis | Current Employment |
Rahil Parikh ** | May 2022 | “Demystifying End-to-End Speech Segregation Networks” | Amazon |
Yi-Chun Ko ** | Dec 2015 | “A Study of feature sets for emotion recognition from speech signals” | |
Jingting Zhou ** | 2011 | “Automatic Speech CODEC Identification with Applications to Tampering Detection of Speech Recordings” | Texas Instruments |
Srikanth Vishnubhotla ** | Jan 2007 | “Irregular Phonation Detection and Speaker ID” | Apple Inc. |
Sandeep Manocha ** | Jul 2006 | “Robust Voice Mining of Telephone Conversations” | Microsoft |
Thorvaldur Einarsson * | Dec 2003 | “Psychoacoustics based gain compensation for low listening level” | |
Ariel Salomon * | Dec 2000 | “The Automatic Detection of Manner Landmarks using Simple Temporal Measures” | Neighborly |
Michelle Delaney * | May 1998 | “An Analysis of the Recognition Errors of a Phonetic Feature Based Speech Recognizer” | |
Venkatesh Chari * | May 1992 | “Extraction of Formant Frequencies by Adaptive Enhancement of Fourier Spectra” | |
M.S. (Projects) |
|||
Name | Duration | Title of Project | |
Tarun Pruthi ** | 2003 | Automatic Classification of Nasal Consonants | |
Om Deshmukh * | 2001 | A Direct Measure of Proportion of Periodic and Aperiodic Energy in Speech Signals | |
Amit Juneja * | 2001 | Acoustic-Phonetic Approach to Speech Recognition Based on Event Detection and Linear Discriminant Analysis | |
Nandini Srinivasan * | May 2000 | Removal of Artificial Larynx Device Resonances through Inverse Filtering | |
Kun Xia * | 2000 | Refinement of Formant Tracker for Automatic Speech Recognition | |
Eric Craft * | 2000 | Automatic Classification of Baby Babble into Broad Classes | |
Arindam Mandel * | 2000 | Comparison of Knowledge-based Recognition with Human Performance Using Impoverished Speech | |
Bethany Broom * | 2000 | Combining Different Order LPC Spectra to obtain Reliable Pole Estimates for Automatic Formant Tracking | |
Heather Cundiff * | 2000 | Analysis of Acoustic and Articulatory Data for American English /r/ | |
Kun Ma * | May 1999 | Improvement of Alaryngeal Speech through the Automatic Insertion of Prosodic Information | |
Pelin Demirel * | May 1999 | Improvement of Alaryngeal Speech through the Automatic Replacement of the Artificial Excitation Signal with a Normal Excitation Signal | |
Qian Zhang * | May 1998 | Recognition of Impoverished Speech | |
Zach McCaffrey * | May 1998 | Replacement of Artificial Voice Excitation Signal with Natural Excitation Signal using Cepstral Analysis | |
Deborah Schwartz * | May 1996 | Signal Processing Algorithms for Electrolaryngeal Speech Enhancement | |
Carla Valera * | May 1997 | Common Features of Devoiced Semivowels | |
Neeraj Deshmukh * | May 1995 | A Strategy for Acoustic Modeling to Increase Efficiency of HG | |
Kenneth Grimes * | May 1992 | Formant estimation of vowels using Critical-band Filtering | |
Jack McLaughlin * | May 1992 | Extraction of the glottal waveform using inverse filtering | |
Tamer Onat * | May 1992 | Vowel recognition using neural networks and phonetic features | |
Undergraduate Students in Research Programs |
|||
Name | Duration | Title of Research Project | |
Kevin Chen | Summer 2013 | Segmental Signal-to-Noise Ratio Improvement Measurement and Its Analysis in Speech Enhancement Algorithms | |
Armand Kana Tano | Summer 2012 | Time Modification Techniques to Improve Speech Intelligibility for Older Hearing Impaired Listeners | |
Matthew Cohen | Summer 2012 | Time Modification Techniques to Improve Speech Intelligibility for Older Hearing Impaired Listeners | |
Jonathan Deutsche | Summer 2011 | Delta-Spectral Cepstral Coefficients for Robust Speaker Recognition | |
Justin Bare | Summer 2011 | Automatic Volume Leveler for Real Time Speech Applications | |
Jonathan Kola | Summer 2011 | Voice Activity Detection | |
Nick Prior | Summer 2010 | Algorithms on Noisy Speech for Hearing-Aid Users | |
Rob Bailey | Summer 2009 | Robust Speech Recognition: Articulatory Information to Account for Coarticulation | |
Kossivi Edji | Summer 2009 | Robust Speech Recognition: Articulatory Information to Account for Coarticulation | |
Kelly Brock | Feb. 2009 to May 2009 | Analysis of the Performance of the APP’s Periodicity Measure as a function of the Signal-to-Noise ratio of Speech | |
Jose Figuero | Summer 2008 | Algorithm for Noisy Speech for Cochlear Implant Users | |
Alex Colvin | Summer 2007 | A Comparison of Acoustic Parameters and MFCCs for Speaker Identification | |
Ryan Amundsen | Summer 2007 | Automatic Speaker Recognition Phonetic Discriminative Power | |
Timothy Burke | Fall 2006 | Replacing STFT filter bank in MPO processing with an Auditory filter bank | |
Kunle Ogunsuyi | Summer 2006 | Speaker Recognition and Voice Mining | |
Bilal Raja | Summer 2006 | Recognition of Nasalized and Non-Nasalized Vowels | |
Chris Turnes | Spring 2006 | The dependence of the MPO model on the exact structure of the filter bank used in implementation | |
Geetika Nagpal | Fall 2005 | The dependence of the MPO model on the exact structure of the filter bank used in implementation | |
Avinash Yentrapati | Summer 2005 | Articulatory synthesis of sustained speech | |
Ayana George | Summer 2005 | Implementation of a Spectral Mean Subtraction Algorithm for Speech Enhancement | |
Sai Hei Yeung | Summer 2005 | MRI-based 3D Finite-element Analysis and Modeling of the Vocal Tract for American English /r/ | |
Ryan Aminzadeh | Summer 2005 | Unsupervised Speaker Segmentation of two-speaker conversations | |
John Lin | Fall 2004 & Spr 2005 | Comparison of the acoustic properties of speech sound produced in upright vs. supine position | |
Shuo Chen | Summer 2004 | Acoustic Parameters for Identification of Nasalized Vowels | |
Thomas Plummer | Summer 2004 | The Investigation of Acoustical Features in Text-Independent Speaker Verification | |
Qin Zou | Spring 2004 | Compensation Algorithms to Minimize the Effect of Noise on Acoustic Speech Parameters | |
Jalaal Deeb | Summer 2003 | Speaker Adaptation in Text-Independent Speaker Verification | |
Paul Young | Summer 2003 | Creating Feature-Based Finite State Automata for Speech Recognition; first prize in the RITE (Research in Telecommunications Engineering) Program at the University of Maryland | |
Jawahar Singh | Summer & Fall 2003 | A Graded Method for Determining the Proportion of Periodic/Aperiodic Energy in Speech Signals | |
Shong Yin | Summer 2002 | Speaker Recognition Implemented via GMM and Vector Quantization | |
Jason Strohmeir | Summer 2002 | Multilayer Perceptron Neural Network for Speech Recognition | |
Kazuhito Niimi | Spr 1994 | Automatic classification of stop consonants | |
Stephanie Zierten | Fall 1992 & Spr 1993 | Automatic Detection of Place of Articulation in Stop Consonants (Senior Honors Thesis) | |
Armen Balien | Fall 1992 & Spr 1993 | Automatic Detection of Acoustic Properties that Separate Adjacent Sounds with the Same Manner of Articulation (Senior Honors Thesis) | |
Vinay Chandra | Fall 1991 & Spr 1992 | Automatic Discrimination of Strident and Nonstrident Fricatives (Senior Honors Thesis) | |
Valerie Padilla | Spring 1991 | Detecting linguistic features for use in a speech recognition system | |
Charles Robinson *** | Fall 1990 & Spr 1991 | An Acoustic Study of the Influence of /r/ on different F3 trajectories (BS Thesis) | |
Shawn Williams *** | Fall 1989 & Spr 1990 | An Acoustic Study of the Feature Retroflex (BS Thesis) | |
* Boston University |