Computer Science, 2016, Vol. 43, Issue (Z11): 252-255. doi: 10.11896/j.issn.1002-137X.2016.11A.058

• Pattern Recognition and Image Processing •

An Intelligent Collection Method for Web Fashion Images Based on Hierarchical Semantics

耿增民,商书元,邵新艳,周毅灵,马玲   

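  1. Beijing Key Laboratory of Digital and Interactive Media and Computer Information Center, Beijing Institute of Fashion Technology, Beijing 100029; Computer Information Center, Beijing Institute of Fashion Technology, Beijing 100029; School of Fashion Art and Engineering, Beijing Institute of Fashion Technology, Beijing 100029; Computer Information Center, Beijing Institute of Fashion Technology, Beijing 100029; School of Fashion Art and Engineering, Beijing Institute of Fashion Technology, Beijing 100029
  • Publication date: 2018-12-01 Release date: 2018-12-01
  • Funding:
    This work was supported by the Key Project of the Beijing Education Science "Twelfth Five-Year" Plan (AJA11174), the Humanities and Social Sciences Project of the Ministry of Education (12YJA760014), and a special grant from the Beijing Municipal Commission of Education.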

Hierarchical Semantic-based Web Intelligent Fashion Image Retrieval Method

GENG Zeng-min, SHANG Shu-yuan, SHAO Xin-yan, ZHOU Yi-ling and MA Lin   

  • Online:2018-12-01 Published:2018-12-01

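Summary: With the goal of large-scale intelligent collection of fashion images from the Internet, this paper studies how to exploit the association between the text accompanying fashion images on the Web and fashion image concepts, so that the images corresponding to each semantic concept can be collected automatically. Building on the HITS (Hyperlink-Induced Topic Search) algorithm, it proposes SICR (Semantic-based Image Collection Robot), an image collection algorithm based on hierarchical semantics. Supported by a hierarchical semantic library, the algorithm expands the root set and removes link-farm pages at the same time. Before crawling a linked page, it computes the similarity of the anchor text and performs a concept analysis of the page content, discards pages that do not match the semantics, and downloads only the fashion images that satisfy them. The algorithm overcomes the shortcomings of automatic image extraction algorithms that rely on text analysis or link analysis alone, and it achieves high precision and recall. Experimental results demonstrate the effectiveness of SICR.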

Keywords: image semantics, image retrieval, fashion images, Web mining
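
The anchor-text check performed before a link is crawled can be illustrated with a minimal sketch. The source does not state which similarity measure or threshold SICR uses, so the Jaccard overlap, the 0.2 threshold, the library contents, and the names `SEMANTIC_LIBRARY`, `anchor_similarity`, and `should_crawl` are all assumptions made for illustration only.

```python
# Hypothetical semantic library: concept -> related terms.
# The terms below are illustrative and are not taken from the paper.
SEMANTIC_LIBRARY = {
    "dress": {"dress", "gown", "skirt", "evening"},
    "coat": {"coat", "jacket", "overcoat", "parka"},
}


def anchor_similarity(anchor_text: str, concept: str) -> float:
    """Jaccard overlap between anchor-text tokens and a concept's term set.

    Jaccard overlap is an assumed stand-in for the paper's similarity measure.
    """
    tokens = set(anchor_text.lower().split())
    terms = SEMANTIC_LIBRARY[concept]
    if not tokens:
        return 0.0
    return len(tokens & terms) / len(tokens | terms)


def should_crawl(anchor_text: str, concept: str, threshold: float = 0.2) -> bool:
    """Follow a link only if its anchor text is close enough to the concept.

    The default threshold is an arbitrary illustrative value.
    """
    return anchor_similarity(anchor_text, concept) >= threshold
```

Under these assumptions, a link labelled "Red evening gown" would be followed for the concept "dress", while a link labelled "Contact us" would be skipped.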

Abstract: Aiming at the large-scale automatic collection of fashion images from the Web, this paper studies how to use the association between the accompanying text and the concepts of fashion images on Web pages to collect images automatically. Building on semantic-based Web content acquisition and addressing the drawbacks of the HITS (Hyperlink-Induced Topic Search) method, a novel SICR (Semantic-based Image Collection Robot) method is proposed to collect fashion images from the Web. Supported by a hierarchical semantic library, SICR removes link-farm pages while expanding the root set and computes anchor-text similarity when crawling linked pages. In addition, it performs a brief conceptual analysis of the page content before downloading images. Experimental results on a large-scale dataset show that the proposed method overcomes the deficiencies of purely text-based or purely link-based analysis and improves the precision and recall of fashion image retrieval, demonstrating the effectiveness of SICR.

Key words: Image semantics, Image retrieval, Fashion image, Web mining
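
For readers unfamiliar with HITS, the baseline that SICR builds on, the following is a minimal sketch of the standard hub and authority iteration. It shows plain HITS only, not the SICR extensions (link-farm removal, anchor-text similarity, concept analysis). The function name `hits`, the dictionary graph representation, and the fixed iteration count are assumptions chosen for illustration.

```python
import math


def hits(graph, iterations=50):
    """Standard HITS on a link graph given as {page: [pages it links to]}.

    Returns two dicts (hubs, authorities), each L2-normalised.
    """
    pages = set(graph) | {q for targets in graph.values() for q in targets}
    hubs = {p: 1.0 for p in pages}
    auths = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # Authority score: sum of hub scores of the pages linking in.
        auths = {p: 0.0 for p in pages}
        for p, targets in graph.items():
            for q in targets:
                auths[q] += hubs[p]
        norm = math.sqrt(sum(v * v for v in auths.values())) or 1.0
        auths = {p: v / norm for p, v in auths.items()}

        # Hub score: sum of authority scores of the pages linked to.
        hubs = {p: sum(auths[q] for q in graph.get(p, [])) for p in pages}
        norm = math.sqrt(sum(v * v for v in hubs.values())) or 1.0
        hubs = {p: v / norm for p, v in hubs.items()}

    return hubs, auths
```

On a toy graph such as `{"a": ["c"], "b": ["c"], "c": []}`, page `c` receives all of the authority weight, and `a` and `b` share the hub weight equally.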

