计算机科学 ›› 2016, Vol. 43 ›› Issue (4): 182-187.doi: 10.11896/j.issn.1002-137X.2016.04.037

• 软件与数据库技术 • 上一篇    下一篇

一种基于预先索引的关系数据库关键词搜索方法

葛唯益,宗士强,尹文科   

  1. 信息系统工程重点实验室 南京210007,信息系统工程重点实验室 南京210007,信息系统工程重点实验室 南京210007
  • 出版日期:2018-12-01 发布日期:2018-12-01
  • 基金资助:
    本文受国家自然科学基金(61402426),软件新技术与产业化协同创新中心资助

Keyword Search for Relational Databases Based on Offline Index

GE Wei-yi, ZONG Shi-qiang and YIN Wen-ke   

  • Online:2018-12-01 Published:2018-12-01

摘要: 关系数据库的关键词搜索面临的最大挑战在于满足需求的答案可能来自多个关系的元组的组合。现有主流方法通过定位每个关键词对应的元组并动态发现元组之间的关联来得到搜索结果。然而当数据库规模较大或模式复杂时,这些方法存在搜索效率低的问题;此外,这些方法因只能支持简单的关键词查询而实用性受到限制。为此,提出对元组的组合进行预先索引从而加快搜索,此外还对其索引效率及查询能力进行改进以提高系统的可用性。首先,为了提高搜索和索引效率,提出基于模式图的元组连接枚举技术,该技术利用无环模式图枚举合适的关系连接,将其转换为SQL语句在数据库中执行以得到可能的元组连接;其次,为了保证结果的紧致性,提出了1到m元组连接的预先索引与顺序搜索机制,该机制对元组连接进行由小到大的搜索,并限制所有包含已有结果的元组连接都不再参与搜索;最后,为了支持复杂查询,提出基于域的索引结构,为每个元组连接建立面向不同查询类型的域,通过查找多个域并对结果进行逻辑组合得到最终结果。实验表明,相比于已有技术,本技术具有较快的索引速度与较高的查询效率,并能提供如布尔查询、属性查询等的复杂查询能力。

关键词: 关系数据库,关键词搜索,预先索引,紧致性,复杂查询

Abstract: The biggest challenge for keyword search over relational databases is that results are often assembled from tuples in several tables.Dominant approaches find tuples hit by keywords and identify their joins on the fly to form results,which are rather inefficient for databases with large scale or complex schema.Besides,these approaches only support simple keyword query,which limits their practical usage.Regarding this,we proposed an alternative way by in-dexing joinable tuples offline to speed online search,and strove to improve its index efficiency and query capability for practical usage.Firstly,to improve search and index efficiency,we proposed an approach to utilize schema graph information to enumerate joinable tuples.This approach discovers all suitable joinable tables and translates them to SQL queries,which are sent to database interfaces for getting possible joinable tuples.Secondly,to ensure the compactness of results,we proposed an approach to index joinable tuples and searched them by order of their size.The approach selects rele-vant joinable tuples with small scales in advance and excludes those joinable tuples containing them from the next round of selection.Finally,to support complex query,we proposed a field-based index structure,which uses different fields for different search types.At search time,these fields are queried and their results are sent to perform logical operations to get the final results.Experiments show that,compared to existing approaches,our approach outstands in index and search efficiency and provides query capability close to the SQL.

Key words: Relational database,Keywords search,Offline index,Compactness,Complex query

[1] Agrawal R,Aliamaki A,Bernstein P A,et al.The Claremont Report on Database Research,2008[C/OL].http://db.cs.berkeley.edu/claremont
[2] Bhalotia G,Hulgeri A,Nakhe C,et al.Keyword Searching and Browsing in Databases Using BANKS[C]∥Proc.of the 18th Int’l Conf.on Data Engineering,2002(ICDE 2002).San Jose:IEEE Computer Society Press,2002:431-440
[3] Agrawal S,Chaudhuri S,Das G.DBXplorer:A System for Keyword-based Search over Relational Databases[C]∥Proc.of the 18th Int’l Conf.on Data Engineering,2002(ICDE 2002).San Jose:IEEE Computer Society Press,2002:5-16
[4] Hristidis V,Papakonstantinou Y.DISCOVER:Keyword Search in Relational Databases[C]∥Proc.of the 28th Int’l Conf.on Very Large Data Bases,2002(VLDB 2002).Hong Kong:Morgan Kaufmann Publishers,2002:670-681
[5] Hristidis V,Papakonstantinou Y.Efficient IR-style Keyword Sea-rch over Relational Databases[C]∥Proc.of the 29th Int’l Conf.on Very Large Data Bases,2003(VLDB 2003).Berlin:Morgan Kaufmann Publishers,2003:850-861
[6] Kacholia V,Pandit S,Chakrabarti S,et al.Bidirectional Expansion for Keyword Search on Graph Databases[C]∥Proc.of the 31st Int’l Conf.on Very Large Data Bases,2005(VLDB 2005).Trondheim:ACM Press,2005:505-516
[7] He Hao,Wang Hai-xun,Yang Jun,et al.Binks:Ranked Key-word Searches on Graphs[C]∥Proc.of the 2007 ACM SIGMOD Conf.on Management of Data,2007(SIGMOD 2007).Beijing:ACM,2007:305-316
[8] Li Guo-liang,Ooi B C,Feng Jian-hua,et al.EASE:An Effective 3-in-1 Keyword Search Method for Unstructured,Semi-structured and Structured Data[C]∥Proc.of the 2008 ACM SIGMOD Conf.on Management of Data,2008(SIGMOD 2008).Vancouver:ACM,2008:903-914
[9] Luo Yi,Lin Xue-min,Wang Wei,et al.Spark:Top-k Keyword Query in Relational Databases[C]∥Proc.of the 2007 ACM SIGMOD Conf.on Management of Data,2007(SIGMOD 2007).Beijing:A.CM,2007:115-126
[10] Ding B,Yu J X,Wang S,et al.Finding Top-k Min-cost Connected Trees in Databases[C]∥Proc.of the 23rd Int’l Conf.on Data Engineering,2007(ICDE 2007).Istanbul:IEEE Computer Society Press,2007:836-845
[11] Su Q,Widom J.Indexing Relational Database Content Offline for Efficient Keyword-based Search[C]∥Proc.of the 9th Int’l Data Engineering and Applications Symposium,2005.Montreal:IEEE Computer Society Press,2005:297-306
[12] Zhan Jiang,Wang Shan.ITREKS:Keyword Search over Rela-tional Database by Indexing Tuple Relationship[C]∥Proc.of the 12th Int’l Conf.on Database Systems for Adavanced Applications,2007.Bangkok:Springer-Verlag,2007:67-78
[13] Li Guo-liang,Feng Jian-hua,Zhou Li-zhu.Retune:Retrievingand Materializing Tuple Units for Effective Keyword Search over Relational Databases[C]∥Proc.of the 27th Int’l Data on Conceptual Modeling,2008.Barcelona:Springer-Verlag,2008:469-483
[14] Dixon P.Basics of Oracle Text Retrieval[J].Bulletin of theTechnical Committee on Data Engineering,2001,24(4):11-14
[15] Maier A,Simmen D E.DB2 Optimization in Support of FullText Search[J].Bulletin of the Technical Committee on Data Engineering,2001,24(4):3-6
[16] Hamilton JR,Nayak TK.Microsoft SQL Server Full-text Search[J].Bulletin of the Technical Committee on Data Engineering,2001,24(4):7-10
[17] Reich G,Widmayer P.Beyond Steiner’s Problem:A VLSI Oriented Generalization[C]∥Proc.of the 15th Int’l Workshop on Graph-Theoretic Concepts in Computer Science,1989.Castle Rolduc:Springer-Verlag,1989:196-210
[18] Nie Z,Zhang Y,Wen J,et al.Object-Level Ranking:BringingOrder to Web Objects[C]∥Proc.of the World Wide Web.2005
[19] Zhang J,Shao R J,Zeng Y M.Research on Object-level Information Retrieval over Relational Databases [J].Computer Science,2012,39(1):142-147(in Chinese) 张俊,邵仁俊,曾一鸣.对象级别的关系数据库信息检索技术研究[J].计算机科学,2012,39(1):142-147
[20] Owda M,Bandar Z,Crockett K.Conversation-Based Natural Lan-guage Interface to Relational Databases[C]∥2007 IEEE/WIC/ACM/ International Conferences on Web Intelligence and Intelligent Agent Technology Workshops.2007:363-367
[21] Li H,Tian J W,Wang Y Y,et al.Ontology-based Natural Language Interface to Relational Databases [J].Computer Science,2010,37(6):200-205(in Chinese) 李虎,田金文,王缓缓,等.基于Ontology的数据库自然语言查询接口的研究[J].计算机科学,2010,37(6):200-205

No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!