Computer Science ›› 2016, Vol. 43 ›› Issue (Z11): 215-219, 232. doi: 10.11896/j.issn.1002-137X.2016.11A.049


Specific Two Words Chinese Lexical Recognition Based on Broadband and Narrowband Spectrogram Feature Fusion with Zoning Projection

WEI Ying, WANG Shuang-wei, PAN Di, ZHANG Ling, XU Ting-fa and LIANG Shi-li   

  • Online: 2018-12-01 Published: 2018-12-01

Abstract: This paper presents a method for recognizing specific two-word Chinese vocabulary items, based on fusing broadband and narrowband spectrogram features obtained by zoning projection. Image processing techniques are applied to speech recognition at the feature extraction stage. First, equal-width zoning line projection and binary-width zoning line projection are applied to the narrowband spectrogram, giving the first and second feature sets, respectively. In addition, equal-width zoning line projection is applied to the Fourier-transformed narrowband spectrogram, giving the third feature set. Then, equal-width column projection is applied to the broadband spectrogram, giving the fourth feature set. These feature sets serve as the feature vectors, and a support vector machine (SVM) serves as the classifier for whole-word recognition of the specific two-word Chinese vocabulary. The simulation experiment uses 1000 voice samples. The results show a correct recognition rate of 92.4% with the first three feature sets, 80% with the fourth feature set, and up to 95.4% when the features of all four sets are fused. This feature fusion method offers a new approach to whole-word recognition of Chinese vocabulary.

Key words: Speech recognition, Spectrogram, Feature fusion, Line projection, Column projection, Support vector machine (SVM)
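
The projection and fusion steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that a spectrogram is a 2D array with frequency on the rows and time on the columns, that "equal-width zoning line projection" means summing intensities within near-equal horizontal bands, that the Fourier step is a 2D FFT magnitude, and that fusion is plain concatenation. All function names are hypothetical, and the binary-width zoning variant is not covered.

```python
import numpy as np


def equal_width_row_projection(spec, n_zones):
    """Sum intensities inside n_zones near-equal horizontal bands (assumed meaning of 'line projection')."""
    spec = np.asarray(spec, dtype=float)
    bands = np.array_split(spec, n_zones, axis=0)  # split along the row (frequency) axis
    return np.array([band.sum() for band in bands])


def equal_width_column_projection(spec, n_zones):
    """Sum intensities inside n_zones near-equal vertical bands."""
    spec = np.asarray(spec, dtype=float)
    bands = np.array_split(spec, n_zones, axis=1)  # split along the column (time) axis
    return np.array([band.sum() for band in bands])


def fourier_row_projection(spec, n_zones):
    """Row projection of the 2D FFT magnitude (assumption: the abstract does not specify the transform)."""
    magnitude = np.abs(np.fft.fft2(np.asarray(spec, dtype=float)))
    return equal_width_row_projection(magnitude, n_zones)


def fuse(*feature_sets):
    """Fuse feature sets by concatenation (assumed fusion rule)."""
    return np.concatenate([np.ravel(f) for f in feature_sets])


# Toy arrays standing in for narrowband and broadband spectrograms.
narrowband = np.arange(24).reshape(4, 6)
broadband = np.arange(24).reshape(4, 6)

set1 = equal_width_row_projection(narrowband, 2)
set3 = fourier_row_projection(narrowband, 2)
set4 = equal_width_column_projection(broadband, 3)
feature_vector = fuse(set1, set3, set4)
print(feature_vector.shape)
```

A fused vector of this kind, computed per utterance, would then be passed to an SVM classifier for training and recognition.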

