Computer Science, 2018, Vol. 45, Issue (11): 256-260. doi: 10.11896/j.issn.1002-137X.2018.11.040

• Artificial Intelligence •

Study on Big Data Mining Method Based on Sparse Representation and Feature Weighting

CAI Liu-ping (1), XIE Hui (2), ZHANG Fu-quan (3), ZHANG Long-fei (3)

  1. School of Computer Science & Engineering, Tianhe College of Guangdong Polytechnic Normal University, Guangzhou 510540, China
  2. Department of Computer Sciences and Technology, Tsinghua University, Beijing 100084, China
  3. School of Software, Beijing Institute of Technology, Beijing 100081, China
  • Received: 2018-02-14  Published: 2019-02-25

Abstract: To improve the efficiency and accuracy of big data mining, this paper applies sparse representation and feature weighting to big data processing. First, the features of big data are classified by finding the sparse solution of a system of linear equations; a vector norm is used to recast the search for the sparse solution as the minimization of an objective function. After feature classification, feature extraction is performed to reduce the dimensionality of the data. Finally, features are weighted according to the distribution of the data within each class, which completes the data mining process. Experimental results suggest that the proposed algorithm outperforms common feature extraction and feature weighting algorithms in terms of recall and precision.
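
The abstract does not say which vector norm or which solver the paper uses. As a rough illustration only, the sketch below assumes the l1 norm and solves the resulting problem with iterative soft-thresholding (ISTA), then assigns a sample to the class whose dictionary atoms reconstruct it with the smallest residual. The function names, the regularization weight `lam`, and the residual-based decision rule are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1: shrink every entry toward zero by t.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, y, lam=0.1, n_iter=500):
    # Assumed objective: minimize 0.5 * ||A x - y||_2^2 + lam * ||x||_1.
    # A holds one training sample (atom) per column; y is the query sample.
    L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the smooth part
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)
        x = soft_threshold(x - grad / L, lam / L)
    return x

def classify_by_residual(A, labels, y, lam=0.1):
    # Hypothetical decision rule: keep only the coefficients of one class
    # at a time and pick the class with the smallest reconstruction error.
    x = ista(A, y, lam)
    residuals = {}
    for c in np.unique(labels):
        x_c = np.where(labels == c, x, 0.0)
        residuals[c] = np.linalg.norm(y - A @ x_c)
    return min(residuals, key=residuals.get)
```

In this sketch `A` is expected to have unit-norm columns and `labels` is a NumPy array with one class label per column. A larger `lam` gives a sparser coefficient vector at the cost of a looser fit.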

Key words: Big data, Data mining, Feature extraction, Feature weighting, Sparse representation

CLC Number: TP301