Computer Science ›› 2016, Vol. 43 ›› Issue (9): 91-98. doi: 10.11896/j.issn.1002-137X.2016.09.017

• The 3rd CCF Big Data Conference, 2015 •

Research on Association Rule Mining over Schema-level Linked Data Sets

YUAN Liu, ZHANG Long-bo

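  1. School of Computer Science, Shaanxi Normal University, Xi'an 710062; School of Computer Science, Shandong University of Technology, Zibo 255049
  • Publication date: 2018-12-01 Release date: 2018-12-01
  • Funding:
    This work was supported by the National Natural Science Foundation of China project "Research on Personalized Tourism Information Service Models in Cloud Computing Environments" (41271387) and by the Fundamental Research Funds for the Central Universities project "Research on Key Data Mining Technologies for Schema-level Linked Open Data Sets" (GK201503066).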

Association Rules Mining on Schema-level Interconnected Associated Data

YUAN Liu and ZHANG Long-bo   

  • Online: 2018-12-01 Published: 2018-12-01

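Abstract (Chinese version): In view of the big-data characteristics of linked data sets and the semantic information they contain, this paper proposes first establishing schema-level links among linked data sets and then mining association rules. RDF data item patterns are defined on RDF data sets from the same domain, together with rules for generating those patterns. RDF query techniques are used to obtain RDF data item sets from the patterns, and association rules within a specific domain are then derived from those item sets. The proposed method, based on RDF data item patterns of linked data, extends association rule mining from a single data set to the data sets of a whole domain. A Hadoop-based implementation scheme for association rule mining on large-scale RDF data sets is also given. Experimental results verify the value of schema-level links for association rule mining and the effectiveness of the proposed method.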

Keywords (Chinese version): Semantic big data, Linked data, Ontology, RDF, Association rules

Abstract: Drawing on the semantic information implicit in linked data sets, this paper proposes an association rule mining method for large-scale linked data that operates over schema-level links. Instead of mining association rules directly from separate RDF data sets, schema-level links are first established between the data sets. RDF data item patterns and the rules for generating them are defined on the schema-level linked data sets, and RDF query techniques are then used to construct RDF data item sets from those patterns. The proposed pattern generation rules extend the objects of mining from a single data set to multiple data sets in the same domain. A Hadoop-based implementation of association rule mining was also designed. The experimental results confirm the value of establishing schema-level links on linked data and the effectiveness of the proposed method.

Key words: Semantic big data, Linked data, Ontology, RDF, Association rules
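
The pipeline summarized in the abstract (align data sets at the schema level, turn RDF triples into item sets, then derive rules) can be illustrated with a minimal Python sketch. It is not the authors' algorithm or their Hadoop implementation. The triples, the `SCHEMA_ALIGNMENT` mapping, the `dsA:`/`dsB:`/`ex:` prefixes, and both function names are hypothetical, and the choice of one transaction per subject with (predicate, object) items is an assumption made only for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical schema-level links: predicates of two toy data sets
# mapped onto shared terms. Not taken from the paper.
SCHEMA_ALIGNMENT = {
    "dsA:genre": "ex:genre",
    "dsB:category": "ex:genre",
    "dsA:country": "ex:country",
    "dsB:madeIn": "ex:country",
}

# Hypothetical (subject, predicate, object) triples from two data sets.
TRIPLES = [
    ("dsA:film1", "dsA:genre", "ex:Drama"),
    ("dsA:film1", "dsA:country", "ex:FR"),
    ("dsA:film2", "dsA:genre", "ex:Drama"),
    ("dsA:film2", "dsA:country", "ex:FR"),
    ("dsB:film3", "dsB:category", "ex:Drama"),
    ("dsB:film3", "dsB:madeIn", "ex:US"),
    ("dsB:film4", "dsB:category", "ex:Comedy"),
    ("dsB:film4", "dsB:madeIn", "ex:US"),
]


def build_transactions(triples, alignment):
    """Group triples by subject; each item is an aligned (predicate, object) pair."""
    transactions = defaultdict(set)
    for subject, predicate, obj in triples:
        shared_predicate = alignment.get(predicate, predicate)
        transactions[subject].add((shared_predicate, obj))
    return dict(transactions)


def mine_pair_rules(transactions, min_support, min_confidence):
    """Return single-item rules lhs -> rhs as (lhs, rhs, support, confidence)."""
    total = len(transactions)
    item_counts = defaultdict(int)
    pair_counts = defaultdict(int)
    for items in transactions.values():
        for item in items:
            item_counts[item] += 1
        for a, b in combinations(sorted(items), 2):
            pair_counts[(a, b)] += 1

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / total
        if support < min_support:
            continue
        # Try both directions of the frequent pair.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


transactions = build_transactions(TRIPLES, SCHEMA_ALIGNMENT)
rules = mine_pair_rules(transactions, min_support=0.5, min_confidence=0.8)
for lhs, rhs, support, confidence in rules:
    print(lhs, "->", rhs, f"support={support:.2f}", f"confidence={confidence:.2f}")
```

Without the alignment step, `dsA:genre` and `dsB:category` would count as unrelated items, so co-occurrences spanning both data sets would not accumulate support. A real system would presumably obtain the item sets through RDF queries and distribute the counting, as the abstract indicates.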

