Computer Science (计算机科学), 2013, Vol. 40, Issue 11: 228-230.
强保华,李巍,邹显春,汪天天,吴春明
QIANG Bao-hua, LI Wei, ZOU Xian-chun, WANG Tian-tian and WU Chun-ming
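Abstract: Generating an integrated query interface is an important step in Deep Web data integration. Effectively clustering query interfaces from different domains is one of the core problems to solve when generating such an interface. The traditional vector space model relies solely on keyword matching when clustering Deep Web query interfaces. To address this shortcoming, latent semantic analysis (LSA) is introduced to uncover the semantic relationships between query interfaces, and an LSA-based clustering algorithm for Deep Web query interfaces is presented. Experiments were conducted on data provided by the UIUC Web integration repository. The results show that LSA increases the similarity between query interfaces of the same domain and markedly improves the quality of Deep Web query interface clustering.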
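The contrast the abstract draws between keyword matching and LSA can be illustrated with a minimal sketch. It is not the paper's algorithm: the toy interfaces, the raw term counts (no weighting), the rank `k=2`, and all function names are assumptions made for illustration, and the clustering step itself is omitted. Two book interfaces share no attribute label, so their vector-space cosine similarity is zero, yet a third interface links their vocabularies and LSA places them close together.

```python
import numpy as np

# Hypothetical toy data: each query interface is reduced to its attribute labels.
interfaces = [
    "author isbn",                  # book interface
    "writer publisher",             # book interface, no label shared with the first
    "author writer title",          # book interface linking the two vocabularies
    "departure destination date",   # airfare interface
    "destination passengers date",  # airfare interface
]

def term_document_matrix(docs):
    """Build a terms x documents matrix of raw term counts."""
    vocab = sorted({w for d in docs for w in d.split()})
    index = {w: i for i, w in enumerate(vocab)}
    m = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.split():
            m[index[w], j] += 1.0
    return m

def cosine_matrix(cols):
    """Pairwise cosine similarity between the columns of `cols`."""
    norms = np.linalg.norm(cols, axis=0)
    norms[norms == 0] = 1.0  # avoid division by zero for empty vectors
    unit = cols / norms
    return unit.T @ unit

def lsa_document_vectors(m, k):
    """Project documents into a k-dimensional latent space via truncated SVD."""
    _, s, vt = np.linalg.svd(m, full_matrices=False)
    return s[:k, None] * vt[:k, :]  # shape: (k, number of documents)

a = term_document_matrix(interfaces)
vsm_sim = cosine_matrix(a)                             # keyword matching only
lsa_sim = cosine_matrix(lsa_document_vectors(a, k=2))  # latent semantic space

print("VSM similarity, book 0 vs book 1:", round(vsm_sim[0, 1], 3))
print("LSA similarity, book 0 vs book 1:", round(lsa_sim[0, 1], 3))
print("LSA similarity, book 0 vs airfare 3:", round(lsa_sim[0, 3], 3))
```

A clustering algorithm could then be run on `lsa_sim` instead of `vsm_sim`. The choice of `k` is a tuning parameter, and real interface data would likely need term weighting such as TF-IDF.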