Saturday, 24 December 2011

WebSPHINX

Link: http://www.cs.cmu.edu/~rcm/websphinx/

Intro: WebSPHINX (Website-Specific Processors for HTML INformation eXtraction) is a Java class library and interactive development environment for web crawlers. A web crawler (also called a robot or spider) is a program that browses and processes Web pages automatically.
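
To make "browses and processes Web pages automatically" concrete, here is a minimal sketch of the loop at the heart of a crawler. It does not use the WebSPHINX API. The class name TinyCrawler, the crawl method, and the in-memory link map that stands in for real page fetching are all hypothetical, chosen only so the example runs offline.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical illustration; not part of WebSPHINX.
class TinyCrawler {

    // Visits pages breadth-first, starting from `start`.
    // `links` maps each page to the pages it links to (an assumed
    // stand-in for downloading a page and extracting its links).
    // Stops once `maxPages` pages have been visited.
    static List<String> crawl(String start,
                              Map<String, List<String>> links,
                              int maxPages) {
        List<String> visited = new ArrayList<>();
        Set<String> seen = new HashSet<>();       // pages already queued
        Queue<String> frontier = new ArrayDeque<>();

        frontier.add(start);
        seen.add(start);

        while (!frontier.isEmpty() && visited.size() < maxPages) {
            String page = frontier.remove();
            visited.add(page);                    // "process" the page

            // Queue every outgoing link that has not been seen yet.
            for (String next : links.getOrDefault(page, List.of())) {
                if (seen.add(next)) {
                    frontier.add(next);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        Map<String, List<String>> site = Map.of(
            "/index", List.of("/a", "/b"),
            "/a", List.of("/b", "/index"),
            "/b", List.of("/c"));

        System.out.println(crawl("/index", site, 10));
        // prints [/index, /a, /b, /c]
    }
}
```

The seen set is what keeps the crawler from looping forever when pages link back to each other, and the page limit keeps a crawl bounded. A real crawler would also fetch pages over HTTP and respect each site's robots rules.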
