ANN-based detection of stop consonant place using vowel-driven formant dynamics and coarticulatory cues in the Buckeye Corpus
Soonhyun Hong (Inha University)
Abstract
An artificial neural network (ANN) classifier was developed to predict stop consonant place contrasts in both consonant-vowel and vowel-consonant tokens from the Buckeye Corpus. Two training conditions were compared: one using static F2 onset/offset values and another incorporating dynamic measurements of F2 onset/offset along with their corresponding target values. The results demonstrated that including dynamic F2 cues significantly improved classification accuracy, with additional benefits observed when dynamic cues from F1 and F3 were also incorporated. In general, predictions for postvocalic tokens outperformed those for prevocalic tokens. Among the secondary features examined (F0, gender, vowel identity, vowel duration, word duration, and word-internal segmental positioning), vowel identity provided the most notable improvement, particularly in prevocalic contexts. This suggests that anticipatory coarticulatory effects have a stronger impact on consonants preceding the vowel. Overall, the combination of dynamic cues from F1–F3 with vowel identity emerged as the most robust predictor of stop consonant place. Moreover, the classifier effectively generalized to novel spontaneous speech tokens, offering valuable insights for enhancing both automatic speech recognition systems and phonetic models.
Keywords
Stop place contrasts, formant transitions, ANN classifier, Buckeye Corpus, ASR
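The two training conditions contrasted in the abstract (static F2 onset/offset values versus static plus dynamic F2 cues) can be sketched with a small neural network classifier. The code below is a minimal illustration only, not the authors' implementation: it uses scikit-learn's MLPClassifier on simulated formant data, with hypothetical class means and noise levels standing in for real Buckeye Corpus measurements.

```python
# Hedged sketch: an MLP classifier over simulated formant cues, comparing a
# static-only F2 feature set against a static + dynamic (slope) feature set.
# All numbers below are invented for illustration; the paper's features are
# measured from Buckeye Corpus tokens.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n = 600
# Three stop place classes (0 = labial, 1 = velar, 2 = alveolar), simulated.
y = rng.integers(0, 3, size=n)
# Static cue: F2 at vowel onset (Hz), class-dependent mean plus noise.
f2_onset = np.array([1100.0, 1500.0, 1800.0])[y] + rng.normal(0, 150, n)
# Dynamic cue: F2 slope toward a hypothetical vowel target (Hz/ms).
f2_slope = np.array([4.0, 0.5, -2.0])[y] + rng.normal(0, 1.0, n)

X_static = f2_onset.reshape(-1, 1)
X_dynamic = np.column_stack([f2_onset, f2_slope])

def fit_and_score(X, y):
    """Train a small MLP and return held-out accuracy."""
    Xtr, Xte, ytr, yte = train_test_split(
        X, y, test_size=0.25, random_state=0, stratify=y
    )
    clf = make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
    )
    clf.fit(Xtr, ytr)
    return accuracy_score(yte, clf.predict(Xte))

acc_static = fit_and_score(X_static, y)
acc_dynamic = fit_and_score(X_dynamic, y)
print(f"static F2 only: {acc_static:.2f}; static + dynamic F2: {acc_dynamic:.2f}")
```

Because the simulated slope carries class information beyond the onset value, the dynamic condition should score higher here, loosely mirroring the abstract's finding; extending `X_dynamic` with analogous F1/F3 columns or a vowel-identity feature would mimic the fuller feature sets discussed.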