Computer Science, 2022, Vol. 49, Issue (3): 179-184. doi: 10.11896/jsjkx.201200081

• Computer Graphics & Multimedia •

Multiple Fundamental Frequency Estimation Algorithm Based on Generative Adversarial Networks for Image Removal

LI Si-quan, WAN Yong-jing, JIANG Cui-ling   

  1. Department of Information Science and Engineering, East China University of Science and Technology, Shanghai 200000, China
  • Received: 2020-12-08 Revised: 2021-05-06 Online: 2022-03-15 Published: 2022-03-15
  • About author: LI Si-quan, born in 1996, master. His main research interests include computer learning and audio analysis.
    JIANG Cui-ling, born in 1976, Ph.D, lecturer. Her main research interests include artificial intelligence and image processing.

Abstract: Multiple fundamental frequency estimation is widely used in music structure analysis, music-aided education, information retrieval, and other fields. To accurately identify random chords in music, a multiple fundamental frequency estimation algorithm based on generative adversarial networks is proposed. First, the complete audio is divided into note segments, and a homophonic fingerprint is proposed to extract the spectral characteristics of each note segment. Next, a convolutional neural network identifies the current dominant fundamental frequency of the homophonic fingerprint, and the identified dominant fundamental frequency is treated as an image that interferes with recognition of the next fundamental frequency. The interference image is then removed by a generative adversarial network, and the homophonic fingerprint image, with the interference removed, is processed in a new round. Finally, multiple fundamental frequency estimation of complete chords is achieved by repeating this image removal operation step by step. Experiments are carried out on a piano audio database composed of random two-tone and random three-tone chords. The results show that, compared with the classical spectrum iterative deletion algorithm and the large vocabulary chord recognition algorithm, the proposed algorithm adapts to the recognition of random chords, is highly robust across different ranges, and significantly improves the overall accuracy.
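
The identify-then-remove loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`estimate_multiple_f0`, `toy_identify`, `toy_remove`, `render_chord`), the 88-bin "fingerprint", and the octave-harmonic model are all assumptions made for illustration. In the paper, the identification step is a convolutional neural network and the removal step is a generative adversarial network; here both are replaced by simple arithmetic stand-ins so that the control flow can be run.

```python
from typing import Callable, List, Optional

N_KEYS = 88  # assumption: toy fingerprint holds one energy value per piano key


def estimate_multiple_f0(
    fingerprint: List[float],
    identify_dominant: Callable[[List[float]], Optional[int]],
    remove_image: Callable[[List[float], int], List[float]],
    max_notes: int = 3,
) -> List[int]:
    """Repeatedly identify the dominant pitch, then remove its image."""
    found: List[int] = []
    current = fingerprint
    for _ in range(max_notes):
        pitch = identify_dominant(current)  # CNN in the paper; stand-in here
        if pitch is None:  # nothing left that looks like a note
            break
        found.append(pitch)
        current = remove_image(current, pitch)  # GAN in the paper; stand-in here
    return found


def _add_note(fp: List[float], key: int, sign: float) -> None:
    """Hypothetical note model: energy at the key plus a weaker octave harmonic."""
    fp[key] += sign * 1.0
    if key + 12 < N_KEYS:
        fp[key + 12] += sign * 0.5


def render_chord(keys: List[int]) -> List[float]:
    """Build a toy fingerprint by summing the images of the given keys."""
    fp = [0.0] * N_KEYS
    for k in keys:
        _add_note(fp, k, +1.0)
    return fp


def toy_identify(fp: List[float]) -> Optional[int]:
    """Stand-in classifier: strongest bin, or None if below a threshold."""
    best = max(range(N_KEYS), key=lambda k: fp[k])
    return best if fp[best] > 0.25 else None


def toy_remove(fp: List[float], key: int) -> List[float]:
    """Stand-in remover: subtract the identified note's image from a copy."""
    out = list(fp)
    _add_note(out, key, -1.0)
    return out


if __name__ == "__main__":
    chord = render_chord([40, 52])  # two keys one octave apart
    print(estimate_multiple_f0(chord, toy_identify, toy_remove))
```

In this toy model the harmonic of the lower key overlaps the upper key, which is the kind of interference that motivates removing each identified image before looking for the next fundamental frequency.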

Key words: Convolution neural network, Fundamental frequency image, Generative adversarial networks, Homophonic fingerprint, Multiple fundamental frequency estimation

CLC Number: 

  • TP183