Computer Science ›› 2020, Vol. 47 ›› Issue (11A): 638-641. doi: 10.11896/jsjkx.200500097

• Interdiscipline & Application •

PIFA-based Evaluation Platform for Speech Recognition System

CUI Yang1, LIU Chang-hong2

  1. 1 College of Applied Technology, China University of Labor Relations, Beijing 100048, China
    2 College of Computer Information Engineering, Jiangxi Normal University, Nanchang 330022, China
  • Online: 2020-11-15 Published: 2020-11-17
  • Corresponding author: CUI Yang (cuiyang14@163.com)
  • About author: CUI Yang, born in 1979, Ph.D, lecturer. His main research interests include knowledge engineering and knowledge discovery.
  • Supported by:
    This work was supported by the Research Project of China University of Labor Relations (20XYJS004) and the National Natural Science Foundation of China (61662030).

Abstract: Speech recognition technology is applied in many fields, and performance evaluation of speech recognition systems plays an important role in driving its development. To better compare the performance of different speech recognition systems, this paper summarizes existing speech recognition evaluation methods, proposes an evaluation platform architecture based on PIFA (Performance Influencing Factor Analysis), and implements a general-purpose evaluation platform for speech recognition systems on that architecture. The platform is built around two core concepts, the evaluation database and the evaluation project, and comprises modules for evaluation data generation, data analysis, performance evaluation index calculation, and performance influencing factor analysis. It supports fast, accurate, automated evaluation of speech recognition systems across multiple tasks and many kinds of speech data, and is especially suited to large-vocabulary continuous speech recognition. The platform can statistically analyze the evaluation results to reveal how individual data attributes affect recognition performance, which helps guide improvement of the speech recognition system.

Key words: Evaluation items, Evaluation library, Performance influencing factor analysis, PIFA, Speech recognition

CLC Number:

  • TP311.1
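
The abstract names two of the platform's modules, performance evaluation index calculation and performance influencing factor analysis, without giving their details. As a rough illustration only, the sketch below computes a token-level error rate from edit distance and then pools it by one data attribute. The record fields (`reference`, `hypothesis`, `noise`), the function names, and the sample data are all hypothetical, and a simple per-attribute breakdown stands in here for whatever statistical analysis the platform actually performs.

```python
from collections import defaultdict


def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate_by_attribute(utterances, attribute):
    """Pool edit errors and reference lengths per attribute value."""
    errors = defaultdict(int)
    lengths = defaultdict(int)
    for utt in utterances:
        key = utt[attribute]
        ref = utt["reference"].split()
        hyp = utt["hypothesis"].split()
        errors[key] += edit_distance(ref, hyp)
        lengths[key] += len(ref)
    return {key: errors[key] / lengths[key] for key in errors}


# Hypothetical evaluation records; the attribute name "noise" is illustrative only.
utterances = [
    {"reference": "turn on the light", "hypothesis": "turn on the light", "noise": "quiet"},
    {"reference": "open the door", "hypothesis": "open door", "noise": "quiet"},
    {"reference": "turn on the light", "hypothesis": "turn of the night", "noise": "noisy"},
    {"reference": "open the door", "hypothesis": "open the door", "noise": "noisy"},
]

rates = error_rate_by_attribute(utterances, "noise")
for value, rate in sorted(rates.items()):
    print(f"{value}: {rate:.3f}")
```

Pooling errors and reference lengths before dividing weights each utterance by its length, which is the usual convention for corpus-level error rates. For Mandarin data the same routine could run on characters instead of whitespace-separated tokens, again as an assumption rather than a description of the platform.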