Computer Science ›› 2012, Vol. 39 ›› Issue (12): 149-152.


xScraper: Bulk- and Deep-extracting Non-structured Web Information Based on Web-Harvest Techniques

  

  • Online: 2018-11-16 Published: 2018-11-16

Abstract: A system named xScraper was developed based on an investigation of the data extraction rules in Web-Harvest. Its five main functions are: (1) flexible specification of extraction rules to meet different application requirements; (2) controllable bulk extraction of non-structured data (including images) from the same Web site; (3) deep extraction of topic-related information across many Web sites; (4) extraction of metadata from Web sites and its transformation into XML tags; (5) management of non-structured multimedia information in databases. xScraper is a simple, practical, and extendable system. It provides value-added services over Web-Harvest and can meet different requirements of Web information extraction.

Key words: Web information extraction, xScraper, Web-Harvest core techniques
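
Function (4) in the abstract, extracting metadata from a Web page and turning it into XML tags, can be illustrated with a minimal sketch. This is not xScraper's or Web-Harvest's implementation, which the abstract does not describe; it uses only the Python standard library, and the names MetaCollector, to_tag_name, and metadata_to_xml are hypothetical, chosen for illustration.

```python
from html.parser import HTMLParser
import re
import xml.etree.ElementTree as ET


class MetaCollector(HTMLParser):
    """Collects (name, content) pairs from <meta name=... content=...> tags."""

    def __init__(self):
        super().__init__()
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name, content = attr.get("name"), attr.get("content")
        if name and content is not None:
            self.pairs.append((name, content))


def to_tag_name(name):
    """Map a metadata name to a string usable as an XML element name.

    Assumption: characters outside letters, digits, '_', '.', '-' are
    replaced by '_', and a leading '_' is added if the first character
    is not a letter or underscore.
    """
    cleaned = re.sub(r"[^A-Za-z0-9_.-]", "_", name.strip())
    if not re.match(r"[A-Za-z_]", cleaned):
        cleaned = "_" + cleaned
    return cleaned


def metadata_to_xml(html):
    """Return an XML string with one element per <meta> name/content pair."""
    collector = MetaCollector()
    collector.feed(html)
    root = ET.Element("metadata")
    for name, content in collector.pairs:
        ET.SubElement(root, to_tag_name(name)).text = content
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    page = (
        '<html><head>'
        '<meta name="author" content="A. Writer">'
        '<meta name="keywords" content="web, extraction">'
        '</head><body></body></html>'
    )
    print(metadata_to_xml(page))
```

Each metadata name becomes an element under a single root, so the result can be stored or queried with ordinary XML tooling. A real extractor would also need to fetch pages and handle character encodings, which this sketch leaves out.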
