Computer Science (计算机科学), 2013, Vol. 40, Issue (Z6): 157-159.
LI Gui, LI Zheng-yu, CHEN Shao-gang, HAN Zi-yang, SUN Ping and SUN Huan-liang
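Abstract: Domain-oriented Web data mining comprises domain Web data extraction and domain Web data integration. For domain data extraction, a Web structured data model and a Web table schema are proposed, and algorithms for Web table location and data record extraction are given. For domain Web data integration, a data integration algorithm based on a domain model is proposed. The model and algorithms are validated against the practical requirements of an industry domain.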