Computer Science ›› 2012, Vol. 39 ›› Issue (5): 208-212.

• Artificial Intelligence •

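Effective Approach to Label Extraction and Matching

Zou Xianchun (邹显春), Wu Chunming (吴春明), Li Shengyu (李盛瑜)

  1. (College of Computer and Information Science, Southwest University, Chongqing 400715) (College of Computer and Information Engineering, Chongqing Technology and Business University, Chongqing 400067)
  • Online: 2018-11-16 Published: 2018-11-16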

Abstract: Label extraction and matching are an important part of query interface understanding. This paper proposes a vision-based approach to label extraction and matching. The factors that affect matching are analyzed in depth, and a method for reconstructing the query interface form is given, which automatically restores the form's visual layout features from the interface's HTML source code. The final matching algorithm jointly considers matching based on the label tag, matching based on text semantics, and matching based on position features. Experiments on 277 query interfaces across 8 domains show that the proposed approach achieves high matching accuracy.

Key words: Label extraction, Position feature, Form layout, Element-label matching
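
The abstract names three matching signals (the label tag, text semantics, and position features) but does not give the scoring details. The following is only a minimal sketch of how such signals might be combined, not the authors' algorithm. The class names, the Jaccard word overlap used as a stand-in for text semantics, the "label sits to the left of or above its element" heuristic, the weights, and the pixel scale are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Label:
    text: str
    x: float          # assumed layout coordinates of the label (pixels)
    y: float
    for_id: str = ""  # value of the HTML `for` attribute, if present


@dataclass
class Element:
    id: str
    name: str
    x: float          # assumed layout coordinates of the form element
    y: float


def tokens(s):
    # Lowercase and split on anything that is not a letter or digit.
    return {t for t in re.split(r"[^a-z0-9]+", s.lower()) if t}


def text_score(label, element):
    # Assumed proxy for text semantics: Jaccard overlap between the
    # label words and the words in the element's name attribute.
    a, b = tokens(label.text), tokens(element.name)
    return len(a & b) / len(a | b) if a and b else 0.0


def position_score(label, element, scale=100.0):
    # Assumed heuristic: a label lies to the left of or above its element,
    # and closer labels score higher. `scale` is a hypothetical pixel scale.
    dx, dy = element.x - label.x, element.y - label.y
    if dx < 0 or dy < 0:
        return 0.0
    return 1.0 / (1.0 + (dx + dy) / scale)


def score(label, element, w_tag=1.0, w_text=0.5, w_pos=0.5):
    # Hypothetical weights; an explicit `for` link dominates the others.
    tag = 1.0 if label.for_id and label.for_id == element.id else 0.0
    return (w_tag * tag
            + w_text * text_score(label, element)
            + w_pos * position_score(label, element))


def match(labels, elements):
    # Assign each element the label with the highest combined score.
    result = {}
    for e in elements:
        best = max(labels, key=lambda l: score(l, e), default=None)
        if best is not None and score(best, e) > 0:
            result[e.id] = best.text
    return result


if __name__ == "__main__":
    labels = [
        Label("Author", 10, 10, for_id="au"),
        Label("Book title", 10, 40),
        Label("Price range", 10, 70),
    ]
    elements = [
        Element("au", "q1", 120, 10),
        Element("t", "title", 120, 40),
        Element("p", "f3", 120, 70),
    ]
    print(match(labels, elements))
```

In this toy form, the first element is matched through its `for` link, the second mainly through word overlap with its name, and the third by position alone, which illustrates why a combined score can succeed where any single signal would leave some elements unmatched.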
