Computer Science, 2013, Vol. 40, Issue (6): 183-186.

• Software and Database Technology •

A Kind of Efficient Search Method Based on Big Data

YOU Chuan-chuan, ZHANG Gui-gang

  1. State Key Laboratory of Software Engineering, Wuhan University, Wuhan 430072; Research Institute of Information Technology, Tsinghua University, Beijing 100084
  • Online: 2018-11-16 Published: 2018-11-16
  • Supported by:
    This work was supported by the National 973 Program of China (2011CB302302)

Abstract: To address the low efficiency of queries over big data, this paper proposes an efficient search method. Shared historical query results are kept as an intermediate result set. When a new query request arrives, it is first matched against the historical queries; if a match is found, the results of the matched part are used directly as part of the result of the new query. This avoids a large amount of repeated computation of historical queries, saves search time, and improves query efficiency. Experimental comparison and analysis show that the new big-data query method improves query efficiency.

Key words: Big data, Search, Query network, Cloud database
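
The matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class `HistoryQueryCache`, the function `scan`, the toy `DOCUMENTS` collection, and the choice to treat a query as a conjunction of keyword sub-queries are all assumptions made here for illustration. The sketch only shows the general idea of reusing shared historical results for the matched part of a new query and computing the rest.

```python
from typing import Callable, Dict, FrozenSet, Iterable, List, Set

# Toy document collection (hypothetical data, for illustration only).
DOCUMENTS: Dict[int, str] = {
    1: "big data search method",
    2: "cloud database query",
    3: "big data cloud database",
}


def scan(term: str) -> Set[int]:
    """Stand-in for an expensive sub-query: scan every document for one keyword."""
    return {doc_id for doc_id, text in DOCUMENTS.items() if term in text.split()}


class HistoryQueryCache:
    """Hypothetical store of shared historical sub-query results."""

    def __init__(self, evaluate: Callable[[str], Set[int]]) -> None:
        self._evaluate = evaluate                      # computes one sub-query
        self._history: Dict[str, FrozenSet[int]] = {}  # sub-query -> shared result
        self.evaluations = 0                           # sub-queries actually computed

    def search(self, query: Iterable[str]) -> Set[int]:
        """Answer a conjunctive keyword query, reusing matched historical results."""
        partial_results: List[FrozenSet[int]] = []
        for term in query:
            cached = self._history.get(term)
            if cached is None:
                # No match in the history: compute the sub-query and share its result.
                cached = frozenset(self._evaluate(term))
                self._history[term] = cached
                self.evaluations += 1
            partial_results.append(cached)
        if not partial_results:
            return set()
        # Assumption: sub-query results are combined by intersection (logical AND).
        return set(frozenset.intersection(*partial_results))


cache = HistoryQueryCache(scan)
first = cache.search(["big", "data"])    # both sub-queries are computed
second = cache.search(["big", "cloud"])  # "big" is matched in the history and reused
print(first, second, cache.evaluations)
```

In this sketch the second query triggers only one new scan, because the result for "big" is taken from the history. A real system would also need a policy for invalidating shared results when the underlying data changes, which the abstract does not describe.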

