Computer Science ›› 2011, Vol. 38 ›› Issue (10): 152-156.

• Database and Data Mining •

MXDR: A Keyword-Based Distributed Retrieval Method for Multiple XML Documents

李霞,李战怀,张利军,陈群,李宁   

  1. (School of Computer Science, Northwestern Polytechnical University, Xi'an 710072)
  • Publication date: 2018-11-16 Release date: 2018-11-16

MXDR, Distributed Information Retrieval for Multi-XML Document Based on Keywords

LI Xia,LI Zhan-huai,ZHANG Li-jun,CHEN Qun,LI Ning   

  • Online: 2018-11-16 Published: 2018-11-16

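Abstract (Chinese version): Keyword-based XML retrieval has been a research focus in information retrieval in recent years. However, because keywords carry no XML structural semantics, the results deviate considerably from what users need, and retrieval quality is hard to improve. XML structured retrieval, in turn, is hard to popularize because users find it difficult to write query expressions that accurately describe their intent. A more prominent problem is that the vast majority of existing XML retrieval research concentrates on a single document, which limits its practicality. This paper therefore proposes a keyword-based structured retrieval method that searches multiple XML documents in a distributed manner, called MXDR (Multi-XML Distributed Retrieval). MXDR first classifies the documents with a clustering method that takes both structure and content into account. It then analyzes the query keywords and the structural information of each class to determine a distributed search strategy, combines the query keywords with the XML structural information to construct structured query statements, and finally carries out the keyword search through a structured query system. Validation on several groups of the real-world Sigmod dataset shows that, compared with the classic SLCA method, MXDR achieves higher recall and precision, and has a particularly significant advantage in retrieval efficiency.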

Keywords: multiple XML documents, keyword retrieval, structured retrieval, distributed

Abstract: The emergence of the Web has increased interest in XML data. Keyword search has attracted a great deal of attention for retrieving XML data because it is a user-friendly mechanism. However, it is hard to improve the quality of keyword search directly, because many keyword-matched nodes may not contribute to the results. A more important issue is that current studies focus on retrieval over a single XML document and therefore lack practicality. To address this challenge, this article proposes a new approach for automatically correcting queries over multiple XML documents, called MXDR (Multi-XML Distributed Retrieval). We first classify the XML documents with a clustering method and elicit their common structure information. We then generate certifiable structured queries by analyzing the given keyword query and the common structure information of the XML datasets. The generated structured queries can be evaluated over the XML data sources with any existing structure search engine. We conducted an experimental study on real-life multi-XML datasets. The experimental results show that MXDR is effective and efficient in supporting structural queries compared with existing proposals.

Key words: Multi-XML, Keyword IR, Structure IR, Distributed
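
The pipeline outlined in the abstract (group documents by structure, pick the groups relevant to the keywords, build structured queries, evaluate them) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It rests on several assumptions: a greedy Jaccard grouping over tag paths stands in for the paper's structure-and-content clustering, the query is simplified to one tag keyword plus one value keyword, `ElementTree.findall` stands in for the structured query system, and all function names and sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET


def label_paths(root):
    """Return the set of root-to-node tag paths, e.g. 'articles/article/title'."""
    paths = set()

    def walk(node, prefix):
        path = f"{prefix}/{node.tag}" if prefix else node.tag
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(root, "")
    return paths


def cluster_by_structure(docs, threshold=0.5):
    """Greedy grouping (an assumption, not the paper's clustering): a document
    joins the first cluster whose path set has Jaccard similarity >= threshold."""
    clusters = []  # each cluster: {"paths": set of tag paths, "docs": [(name, root)]}
    for name, root in docs:
        paths = label_paths(root)
        for cluster in clusters:
            union = paths | cluster["paths"]
            if len(paths & cluster["paths"]) / len(union) >= threshold:
                cluster["paths"] |= paths
                cluster["docs"].append((name, root))
                break
        else:  # no existing cluster was similar enough
            clusters.append({"paths": set(paths), "docs": [(name, root)]})
    return clusters


def structured_queries(cluster, label):
    """Tag paths in the cluster summary whose last step matches the tag keyword."""
    return sorted(p for p in cluster["paths"]
                  if p.split("/")[-1].lower() == label.lower())


def search(clusters, label, value):
    """Query only clusters whose structure mentions `label`; return matches."""
    hits = []
    for cluster in clusters:
        for path in structured_queries(cluster, label):  # empty list skips the cluster
            steps = path.split("/", 1)
            for name, root in cluster["docs"]:
                if root.tag != steps[0]:
                    continue
                nodes = [root] if len(steps) == 1 else root.findall(steps[1])
                hits += [(name, path, n.text) for n in nodes
                         if value.lower() in (n.text or "").lower()]
    return hits


# Hypothetical sample documents, for illustration only.
docs = [
    ("a.xml", ET.fromstring(
        "<articles><article><title>Relational model</title>"
        "<author>Codd</author></article></articles>")),
    ("b.xml", ET.fromstring(
        "<articles><article><title>XML search</title>"
        "<author>Li Xia</author></article></articles>")),
    ("c.xml", ET.fromstring(
        "<books><book><name>Databases</name>"
        "<writer>Codd</writer></book></books>")),
]

clusters = cluster_by_structure(docs)
hits = search(clusters, "author", "codd")
print(len(clusters), hits)
```

In this toy run, the two `articles` documents share a cluster and the `books` document forms its own. The query for `author` is never sent to the `books` cluster, because that cluster's path summary contains no `author` step. This mirrors, in a very reduced form, the idea of using class-level structure to decide where to search.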
