Computer Science ›› 2016, Vol. 43 ›› Issue (2): 51-56. doi: 10.11896/j.issn.1002-137X.2016.02.011


Recognition of Prosodic Phrases Based on Unlabeled Corpus and “Adhesion” Culling Strategy

QIAN Yi-li and CAI Ying-ying   

  • Online:2018-12-01 Published:2018-12-01

Abstract: Manually building a large-scale annotated corpus is difficult and has several drawbacks. Exploiting the pausing role of punctuation, this paper proposes a prosodic phrase recognition method that uses an unlabeled corpus and an “adhesion” culling strategy. In this method, punctuation marks are graded and assigned different weights when they are used to simulate prosodic boundaries. To recognize prosodic phrase boundaries automatically, a maximum entropy model is trained on the unlabeled corpus and combined with a Top-K method. Based on the mutual information between two adjacent part-of-speech tags, words are bundled into adhesion units, and any prosodic boundaries predicted inside a unit are eliminated. Experimental results show that the graded use of punctuation and the “adhesion” culling strategy significantly improve the model's performance and yield better recognition results.

Key words: Unlabeled corpus, Prosodic phrase boundary, Maximum entropy (ME), Mutual information
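
The “adhesion” culling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact mutual information formula, the threshold, or the data format, so the pointwise mutual information estimate, the `threshold` parameter, and the function names `pos_pmi` and `cull_boundaries` are all assumptions made for illustration.

```python
import math
from collections import Counter


def pos_pmi(tagged_sentences):
    """Estimate pointwise mutual information for adjacent POS-tag pairs.

    Each sentence is a list of (word, tag) tuples. Using PMI here is an
    assumption; the abstract only says "mutual information".
    """
    unigrams = Counter()
    bigrams = Counter()
    for sent in tagged_sentences:
        tags = [tag for _, tag in sent]
        unigrams.update(tags)
        bigrams.update(zip(tags, tags[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    pmi = {}
    for (a, b), count in bigrams.items():
        p_ab = count / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        pmi[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return pmi


def cull_boundaries(sentence, boundaries, pmi, threshold):
    """Drop predicted boundaries that fall inside an adhesion unit.

    `boundaries` holds indices i meaning "boundary after word i".
    Words i and i+1 are treated as adhered when the PMI of their tags
    is at least `threshold` (a hypothetical tuning parameter).
    """
    kept = set()
    for i in boundaries:
        if i + 1 >= len(sentence):
            kept.add(i)  # sentence-final boundary: no following word
            continue
        pair = (sentence[i][1], sentence[i + 1][1])
        # Unseen tag pairs are treated as not adhered.
        if pmi.get(pair, float("-inf")) < threshold:
            kept.add(i)
    return kept
```

In this sketch, a boundary predicted by an upstream model (for example, the maximum entropy classifier) survives only if the tags on either side of it are weakly associated. In practice the threshold would have to be tuned on held-out data.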

