Computer Science (计算机科学) ›› 2011, Vol. 38 ›› Issue (1): 198-202.
• Databases and Data Mining •
宫继兵,唐杰,杨文军
GONG Ji-bing, TANG Jie, YANG Wen-jun
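Abstract (from the Chinese original): Large-scale online video makes it convenient for users to share information, but it also poses new challenges for national regulators. For efficiency reasons, online video monitoring mainly considers video description information. This paper studies the extraction of online video description information and proposes a new Web information extraction method, the general extraction engine framework, which mainly comprises a formal description of the video description information extraction problem and a user-aware logical model of video websites. The method has been applied in a video monitoring project of a national ministry and has achieved good results. Experimental results show that the method substantially outperforms other methods in scalability, generality, and extraction accuracy.
Keywords (from the Chinese original): general extraction engine framework, online video monitoring, logical model of video websites, Web information extraction, extraction pattern generation algorithm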
Abstract: The large volume of online video makes it easy for users to share information, but it also poses a major challenge for managing that content, in particular for online monitoring. A critical requirement for monitoring video information is to accurately and adaptively identify the key information describing each video, which is also the first step for video analysis and video search. In this paper, we focus on the problem of extracting video information from different websites. Specifically, we propose an engine framework for information extraction. We formally define the description model in the framework and implement a customizable engine for information extraction. The proposed framework has been applied in a real-world application at a national department and obtains promising results. Experimental results show that the proposed approach extracts video information effectively and significantly outperforms the baseline methods.
Key words: General extraction engine framework, Internet video monitoring, Logical model of video website, Web information extraction, Algorithms for generating extraction patterns
宫继兵,唐杰,杨文军. 通用抽取引擎框架:一种新的Web信息抽取方法的研究[J]. 计算机科学, 2011, 38(1): 198-202. https://doi.org/
GONG Ji-bing, TANG Jie, YANG Wen-jun. General Extraction Engine Framework: Research of a New Approach for Web Information Extraction[J]. Computer Science, 2011, 38(1): 198-202. https://doi.org/
Link to this article: https://www.jsjkx.com/CN/
https://www.jsjkx.com/CN/Y2011/V38/I1/198