Segmentation of the period of the fundamental tone of a voice source
- Authors: Sorokin V.N.
- Affiliations: Institute for Information Transmission Problems
- Issue: Vol 62, No 2 (2016)
- Pages: 244-254
- Section: Acoustic Signal Processing. Computer Simulation
- URL: https://journal-vniispk.ru/1063-7710/article/view/185652
- DOI: https://doi.org/10.1134/S1063771016020135
- ID: 185652
Abstract
The extrema of the logarithmic derivative of the mean energy of a voice signal in the 1000–3000 Hz frequency band are used to determine the instants of glottal opening and closure. The accuracy of the analysis is estimated on the CMU ARCTIC database, which contains synchronous recordings of speech signals and electroglottograms. The instants of glottal opening and closure estimated by the developed algorithm are compared with the instants of the maximum and minimum of the derivative of the electroglottogram signal, which are taken as the “true” instants. The mean square deviation of the glottal opening instant from the corresponding extremum of the electroglottogram derivative ranges from 1.03 to 1.64 ms across speakers. The rate of false detections of the glottal opening instant is 0.01 to 0.14%, and the rate of omissions is 0.42 to 2.38%. An error-detection algorithm is also developed. The mean square deviation of the error in detecting the glottal opening instant, expressed relative to the period of the fundamental tone, is in the range of 13–18%, with the most probable error lying between 0 and +5%.
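The core idea of the abstract, locating extrema of the logarithmic derivative of band-limited mean energy, can be illustrated with a minimal Python sketch. It is not the author's algorithm. The second-order band-pass filter, the 1 ms averaging window, the central-difference derivative, the greedy peak picking, the synthetic test signal, and all function names (`bandpass`, `log_energy_derivative`, `pick_extrema`) are assumptions made for illustration. The abstract does not say which extremum marks opening and which marks closure, so the sketch returns both sets of candidates without labelling them.

```python
import math

def bandpass(x, fs, f_lo=1000.0, f_hi=3000.0):
    """Second-order band-pass biquad (Audio EQ Cookbook form).
    Assumption: stands in for the paper's unspecified 1000-3000 Hz filter."""
    f0 = math.sqrt(f_lo * f_hi)            # geometric centre frequency
    q = f0 / (f_hi - f_lo)
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    b0, b2 = alpha, -alpha                 # b1 is zero for this filter type
    a0, a1, a2 = 1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha
    y, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for xn in x:
        yn = (b0 * xn + b2 * x2 - a1 * y1 - a2 * y2) / a0
        x2, x1, y2, y1 = x1, xn, y1, yn
        y.append(yn)
    return y

def log_energy_derivative(x, fs, win_ms=1.0, eps=1e-12):
    """d/dt ln E(t), with E the mean band energy over a trailing window."""
    y = bandpass(x, fs)
    w = max(1, int(round(win_ms * 1e-3 * fs)))
    log_e, acc = [], 0.0
    for n, v in enumerate(y):
        acc += v * v                       # running sum of squared samples
        if n >= w:
            acc -= y[n - w] ** 2           # drop the sample leaving the window
        log_e.append(math.log(max(acc, 0.0) / w + eps))
    d = [0.0] * len(y)                     # end points are left at zero
    for n in range(1, len(y) - 1):
        d[n] = (log_e[n + 1] - log_e[n - 1]) * fs / 2.0   # central difference
    return d

def pick_extrema(d, min_gap):
    """Return (maxima, minima) indices, strongest first, min_gap samples apart."""
    def peaks(s):
        cand = [n for n in range(1, len(s) - 1) if s[n - 1] < s[n] >= s[n + 1]]
        cand.sort(key=lambda n: s[n], reverse=True)
        kept = []
        for n in cand:
            if all(abs(n - m) >= min_gap for m in kept):
                kept.append(n)
        return sorted(kept)
    return peaks(d), peaks([-v for v in d])

# Synthetic stand-in for voiced speech: a damped 2 kHz burst every 10 ms.
fs, period = 16000, 160
x = [0.0] * (4 * period)
for k in range(4):
    for n in range(k * period, len(x)):
        t = (n - k * period) / fs
        x[n] += math.exp(-t / 0.002) * math.sin(2.0 * math.pi * 2000.0 * t)

d = log_energy_derivative(x, fs)
maxima, minima = pick_extrema(d, min_gap=period // 2)
```

On this synthetic signal the strongest maxima of the derivative fall close to the burst onsets, because band energy rises abruptly when a new excitation arrives. Real speech would need a better filter, a tuned window, and the error-detection stage mentioned in the abstract, none of which are reproduced here.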
About the authors
V. N. Sorokin
Institute for Information Transmission Problems
Author for correspondence.
Email: vns@iitp.ru
Russian Federation, Bol’shoi Karetnyi per. 19, Moscow, 101447
