Computer Science (计算机科学) ›› 2016, Vol. 43 ›› Issue (3): 54-56. doi: 10.11896/j.issn.1002-137X.2016.03.010
赵世瑜,线岩团,郭剑毅,余正涛,洪玄贵,王红斌
ZHAO Shi-yu, XIAN Yan-tuan, GUO Jian-yi, YU Zheng-tao, HONG Xuan-gui and WANG Hong-bin
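Abstract: The syllable is the basic unit of word formation and pronunciation in Thai, and Thai syllable segmentation is important for research on Thai lexical analysis, speech synthesis, and speech recognition. Drawing on the structure of Thai syllables, this paper proposes a Thai syllable segmentation method based on conditional random fields (CRFs). The method defines features from Thai letter categories and letter positions, and uses a CRF to label the letters of a Thai sentence as a sequence, which yields the syllable boundaries. A Thai syllable segmentation corpus was annotated on top of the InterBEST 2009 Thai corpus. Experiments on this corpus show that the method makes effective use of letter category and letter position information, reaching a precision, recall, and F-score of 99.115%, 99.284%, and 99.199%, respectively.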
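The sequence-labeling formulation described in the abstract can be illustrated with a minimal Python sketch. Everything specific in it is an assumption rather than the paper's actual design: the two-tag `B`/`I` scheme (syllable-initial versus syllable-internal letter), the coarse letter classes derived from Unicode code point ranges, the one-letter context window, and all function names. The abstract does not give the real tag set or feature templates, and no CRF is trained here. The sketch only shows how per-letter features and labels might be prepared, how predicted labels map back to syllables, and how the reported F-score follows from the reported precision and recall.

```python
def char_class(ch):
    """Coarse letter class from Unicode ranges (a simplified, assumed scheme)."""
    cp = ord(ch)
    if 0x0E01 <= cp <= 0x0E2E:
        return "C"   # consonant letters
    if 0x0E40 <= cp <= 0x0E44:
        return "LV"  # vowels written before the consonant
    if 0x0E30 <= cp <= 0x0E3A:
        return "V"   # other vowel signs
    if 0x0E48 <= cp <= 0x0E4B:
        return "T"   # tone marks
    return "O"       # anything else (digits, punctuation, other signs)


def char_features(text, i, window=1):
    """Features for the letter at index i: letters and classes at relative positions."""
    feats = {}
    for off in range(-window, window + 1):
        j = i + off
        inside = 0 <= j < len(text)
        feats[f"char[{off}]"] = text[j] if inside else "<PAD>"
        feats[f"class[{off}]"] = char_class(text[j]) if inside else "<PAD>"
    return feats


def labels_from_syllables(syllables):
    """Turn a gold syllable list into one B/I label per letter."""
    labels = []
    for syl in syllables:
        if syl:  # skip empty strings so no spurious B label is produced
            labels.extend(["B"] + ["I"] * (len(syl) - 1))
    return labels


def syllables_from_labels(text, labels):
    """Rebuild syllables from per-letter B/I labels."""
    out = []
    for ch, lab in zip(text, labels):
        if lab == "B" or not out:
            out.append(ch)   # start a new syllable
        else:
            out[-1] += ch    # extend the current syllable
    return out


def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    gold = ["ภา", "ษา", "ไทย"]  # illustrative segmentation of "ภาษาไทย"
    text = "".join(gold)
    labels = labels_from_syllables(gold)
    print(labels)
    print(char_features(text, 4))
    print(syllables_from_labels(text, labels))
    print(round(f_measure(99.115, 99.284), 3))
```

In a full pipeline, the feature dictionaries and label sequences would be passed to a CRF toolkit for training, and the predicted labels decoded with `syllables_from_labels`. The last line checks that the harmonic mean of the reported precision and recall agrees with the reported F-score to three decimal places.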