Computer Science ›› 2012, Vol. 39 ›› Issue (4): 177-180.

Research on Deep Web Classification Based on Domain Feature Text

  

  • Online:2018-11-16 Published:2018-11-16

Abstract: Automatic Deep Web classification is the basis for building a Deep Web data integration system. This paper proposes an approach to classifying Deep Web sources based on domain feature text. Using ontology knowledge, concepts that express the same semantics are first extracted from different texts. Domain correlation is then defined as a quantitative criterion for feature text selection, avoiding the subjectivity and uncertainty of manual selection. During construction of the interface vector space model, an improved weighting method named W-TFIDF is proposed to evaluate the different roles of feature text. Finally, a KNN algorithm is used to classify the interface vectors. Comparative experiments indicate that the feature text selected by this method is accurate and effective, and that the new weighting method significantly improves classification precision and shows good stability in KNN classification.

Key words: Feature text, Domain classification, Vector space model, Deep Web
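
The pipeline outlined in the abstract (interface text, then weighted vectors, then KNN) can be illustrated with a minimal sketch. The abstract does not give the formula for the improved weighting method or the definition of domain correlation, so this sketch substitutes standard TF-IDF weights and cosine similarity. The toy interface texts, the domain labels, and all function names are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def fit_idf(docs):
    """Compute smoothed IDF weights from tokenized training documents."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # +1.0 keeps terms that occur in every document from getting zero weight.
    return {term: math.log(n / count) + 1.0 for term, count in df.items()}

def vectorize(doc, idf):
    """Turn a token list into a sparse TF-IDF vector (dict: term -> weight)."""
    tf = Counter(term for term in doc if term in idf)
    total = sum(tf.values())
    if total == 0:
        return {}
    return {term: (count / total) * idf[term] for term, count in tf.items()}

def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def knn_classify(query, train_vecs, labels, k=3):
    """Majority vote among the k training vectors most similar to the query."""
    ranked = sorted(zip(train_vecs, labels),
                    key=lambda pair: cosine(query, pair[0]),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical feature text from query interfaces in two domains.
train_docs = [
    ["title", "author", "isbn", "publisher"],
    ["title", "author", "keyword"],
    ["author", "isbn", "price"],
    ["job", "title", "location", "salary"],
    ["job", "company", "location"],
    ["salary", "company", "keyword"],
]
labels = ["books", "books", "books", "jobs", "jobs", "jobs"]

idf = fit_idf(train_docs)
train_vecs = [vectorize(doc, idf) for doc in train_docs]

new_interface = ["author", "title", "price"]
print(knn_classify(vectorize(new_interface, idf), train_vecs, labels, k=3))
```

Replacing `vectorize` with a different weighting function is the only change needed to compare weighting schemes under the same KNN classifier, which is the kind of comparison the abstract reports.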
