Computer Science, 2016, Vol. 43, Issue (Z11): 31-34. doi: 10.11896/j.issn.1002-137X.2016.11A.007
XIE Fang-li, ZHOU Guo-min and WANG Jian
[1] Gibson D, Punera K, et al. The volume and evolution of Web page templates[C]∥Special Interest Tracks and Posters of the 14th International Conference on World Wide Web. ACM, 2005
[2] Wang Ji-ying, Lochovsky F H. Data-rich section extraction from HTML pages[C]∥Proceedings of the Third International Conference on Web Information Systems Engineering, 2002 (WISE 2002). IEEE, 2002: 313-322
[3] Yi L, Liu B, Li X. Eliminating noisy information in web pages for data mining[C]∥Proceedings of the 9th ACM SIGKDD Int Conference on Knowledge Discovery and Data Mining. New York: ACM, 2003: 296-305
[4] Ou Jian-wen, Dong Shou-bin, Cai Bin. A method for extracting topic information from template-based Web pages[J]. Journal of Tsinghua University (Science and Technology), 2008(S1): 1743-1747 (in Chinese)
[5] Bauer D, et al. FIASCO: Filtering the Internet by Automatic Subtree Classification, Osnabruck. Building and Exploring Web Corpora[C]∥Proceedings of the 3rd Web as Corpus Workshop, Incorporating Cleaneval. Vol. 4. 2007
[6] Lin S H, Ho J M. Discovering informative content blocks from Web documents[C]∥Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2002
[7] Shi Da-ming, Lin Hong-fei, Yang Zhi-hao. A Web page noise removal method based on page framework and rules[J]. Computer Engineering, 2007, 3(19): 276-278 (in Chinese)
[8] Cai Deng, et al. VIPS: a vision based page segmentation algorithm[R]. Microsoft Technical Report MSR-TR-2003-79, 2003
[9] Zou Yong-qiang, Zhong Zhi-nong. An efficient noise filtering method for news Web pages[J]. Microcomputer & Its Applications, 2011, 0(16): 64-67 (in Chinese)