Computer Science ›› 2017, Vol. 44 ›› Issue (Z6): 551-556. doi: 10.11896/j.issn.1002-137X.2017.6A.123

• Comprehensive, Interdisciplinary and Applied •

Research and Implementation of Entropy-Based Audio Fingerprint Retrieval

WANG Wei, CHEN Zhi-gao, MENG Xian-kai, LI Wei

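  1. Naval Medical Research Institute, Shanghai 200433; School of Computer Science, Fudan University, Shanghai 201203; School of Computer Science, Fudan University, Shanghai 201203; School of Computer Science, Fudan University, Shanghai 201203
  • Publication date: 2017-12-01 Release date: 2018-12-01
  • Funding:
    This work was supported by the project "Research on Key Technologies for Multi-Version Pop Music Retrieval Based on Vocal Detection and Separation" (NSFC61171128).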

Research and Implementation of Identifying Music through Performances Using Entropy Based Audio-fingerprint

WANG Wei, CHEN Zhi-gao, MENG Xian-kai and LI Wei   

  • Online:2017-12-01 Published:2018-12-01

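Abstract: This paper presents an entropy-based audio fingerprint retrieval technique that uses the entropy of an audio signal as its audio fingerprint (AFP). During retrieval, fingerprints can be compared with a variety of string-matching algorithms. The experiments implement fingerprint matching with longest common subsequence (LCS), edit distance (Levenshtein distance), and dynamic time warping (DTW), and use a collection of song files as the test set. Each song has a counterpart that is either a heavily distorted audio file or a different version performed by another singer; the distorted files are produced from the original by severe audio processing such as adding noise, speeding up, and cropping. The results show that all three matching algorithms correctly identify every song in the test set, which demonstrates the accuracy, robustness, and discriminability of entropy-based audio fingerprint retrieval.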

Keywords: audio fingerprint, retrieval, entropy, longest common subsequence, edit distance, dynamic time warping
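
The abstract does not say how the entropy feature is computed, so the following is only a minimal sketch of one way an entropy-based fingerprint string could be formed. It assumes samples normalized to [-1, 1], a per-frame Shannon entropy of an amplitude histogram, and one symbol per pair of adjacent frames ('1' when entropy rises, '0' otherwise). The names `frame_entropy` and `entropy_fingerprint`, and the parameters `bins`, `frame_len`, and `hop`, are hypothetical and are not taken from the paper.

```python
import math


def frame_entropy(frame, bins=16):
    """Shannon entropy (in bits) of an amplitude histogram.

    Assumption: samples lie in [-1, 1]; out-of-range values are clamped
    into the first or last bin.
    """
    counts = [0] * bins
    for x in frame:
        # Map amplitude from [-1, 1] to a bin index in [0, bins - 1].
        idx = min(bins - 1, max(0, int((x + 1.0) / 2.0 * bins)))
        counts[idx] += 1
    total = len(frame)
    return -sum((c / total) * math.log2(c / total) for c in counts if c)


def entropy_fingerprint(samples, frame_len=1024, hop=512):
    """Return a '0'/'1' string: '1' where entropy rises between frames.

    Assumption: this symbol rule is illustrative only; the paper may
    derive its fingerprint differently.
    """
    entropies = [
        frame_entropy(samples[i:i + frame_len])
        for i in range(0, len(samples) - frame_len + 1, hop)
    ]
    return "".join("1" if b > a else "0" for a, b in zip(entropies, entropies[1:]))
```

Because the result is an ordinary string of symbols, it can be passed directly to string-matching measures such as LCS or edit distance, which is the property the abstract relies on.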

Abstract: This paper introduces a technique for identifying music with an entropy-based audio fingerprint, which uses the entropy of the music signal as the fingerprint. For music identification, this fingerprint allows flexible string-matching algorithms to be used. We adopted longest common subsequence (LCS), Levenshtein distance, and dynamic time warping (DTW) as the matching algorithms and used a number of music recordings as the test set. Every recording has a second performance. Most of these are generated from the original by artificial modification, such as adding noise, acceleration, and cutting; some are the same piece played by a different orchestra. All performances in the collection were correctly identified with LCS, Levenshtein distance, and DTW distance, demonstrating the accuracy, robustness, and discriminative ability of the fingerprint.

Key words: Audio-fingerprint, Identifying, Entropy, LCS, Levenshtein distance, DTW
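
The three matching measures named in the abstract can be sketched with textbook dynamic programming. This is an illustrative sketch, not the authors' implementation: it assumes fingerprints are symbol strings, reads LCS as longest common subsequence (as in the English abstract), and uses absolute difference as the local DTW cost. The helper `best_match` and the song names in the usage note are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of a and b (higher = more similar)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def levenshtein(a, b):
    """Edit distance with unit-cost insertion, deletion, and substitution."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = curr
    return prev[-1]


def dtw_distance(a, b):
    """DTW distance between two numeric sequences.

    Assumption: the local cost is the absolute difference |x - y|.
    """
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)
    for x in a:
        curr = [inf]
        for j, y in enumerate(b, start=1):
            curr.append(abs(x - y) + min(prev[j], curr[j - 1], prev[j - 1]))
        prev = curr
    return prev[-1]


def best_match(query, database):
    """Hypothetical helper: name of the stored fingerprint closest in edit distance."""
    return min(database, key=lambda name: levenshtein(query, database[name]))
```

For example, with `database = {"song_a": "1010110", "song_b": "0001111"}`, the call `best_match("101011", database)` picks `"song_a"`, since the query differs from that fingerprint by a single deletion. To apply `dtw_distance` to a symbol string, the symbols first need to be mapped to numbers, for instance `[int(c) for c in fingerprint]`.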

