Phonetic Labeling of the 'Corpus of Spontaneous Japanese'

Hideaki Kikuchi 1, Kikuo Maekawa 2, Yosuke Igarashi 3, Kiyoko Yoneyama 4, Masako Fujimoto 5

1 Research Fellow, Second Research Section, Department of Language Research, National Institute for Japanese Language; Assistant Professor, School of Human Sciences, Waseda University
2 Head, Second Research Section, Department of Language Research, National Institute for Japanese Language
3 Doctoral Course, Graduate School of Area and Culture Studies, Tokyo University of Foreign Studies; Research Fellow, The Japan Society for the Promotion of Science
4 Research Fellow, Second Research Section, Department of Language Research, National Institute for Japanese Language; Assistant Professor, Daito Bunka University
5 Research Fellow, Second Research Section, Department of Language Research, National Institute for Japanese Language

In an attempt to construct a large-scale database of spontaneous speech, the authors planned to give segmental and prosodic labels to spontaneous Japanese speech. This paper reports the labeling method and its performance. First, the performance of automatic segmental labeling with a Hidden Markov Model was verified. About four hours of sample speech was automatically labeled at the phoneme level and compared with the results of hand labeling. The average difference in label boundary position between the automatically labeled and hand-labeled data was 14.3 ms. Second, the performance of prosodic labeling with a newly proposed labeling scheme named X-JToBI (eXtended J_ToBI) was verified. Analysis of the labeled data showed that the newly added label inventories appeared in the spontaneous speech data and that the rate of inter-labeler agreement increased for nearly all types of labels.
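The two evaluation measures mentioned above, the average boundary difference and the inter-labeler agreement rate, can be illustrated with a minimal sketch. It is not the authors' evaluation procedure. It assumes that automatic and hand-labeled boundaries are already paired one to one, that the difference is taken as an absolute value, and that agreement is the plain proportion of matching labels. The function names and the sample values are hypothetical.

```python
from statistics import mean


def mean_boundary_difference_ms(auto_ms, hand_ms):
    """Mean absolute difference (ms) between paired boundary times.

    Assumption: auto_ms[i] and hand_ms[i] mark the same boundary;
    the source does not state how boundaries were paired.
    """
    if len(auto_ms) != len(hand_ms):
        raise ValueError("boundary lists must have the same length")
    if not auto_ms:
        raise ValueError("boundary lists must not be empty")
    return mean(abs(a - h) for a, h in zip(auto_ms, hand_ms))


def agreement_rate(labels_a, labels_b):
    """Proportion of positions where two labelers chose the same label.

    Assumption: simple percent agreement, with no chance correction.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    if not labels_a:
        raise ValueError("label lists must not be empty")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


# Illustrative values only; these are not data from the corpus.
auto_boundaries = [100, 250, 400]
hand_boundaries = [110, 240, 420]
labeler_1 = ["H%", "L%", "HL%", "H%"]
labeler_2 = ["H%", "L%", "H%", "H%"]

print(mean_boundary_difference_ms(auto_boundaries, hand_boundaries))
print(agreement_rate(labeler_1, labeler_2))
```

A chance-corrected statistic could replace the plain proportion if the label distribution is strongly skewed; the choice here is only for brevity.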