org.apache.lenya.lucene.parser
Interface HTMLParser

All Known Implementing Classes:
AbstractHTMLParser, PDFParserWrapper, SwingHTMLParser

public interface HTMLParser


Method Summary
 java.lang.String getKeywords()
          Returns the keywords of the HTML document.
 java.io.Reader getReader()
          Returns a reader that reads the contents of the HTML document.
 java.lang.String getTitle()
          Returns the title of the HTML document.
 void parse(java.io.File file)
          Parses the given file.
 void parse(java.net.URI uri)
          Parses the resource identified by the given URI.
 

Method Detail

parse

void parse(java.io.File file)
           throws ParseException
Parses the given file.

Throws:
ParseException

parse

void parse(java.net.URI uri)
           throws ParseException
Parses the resource identified by the given URI.

Throws:
ParseException

getTitle

java.lang.String getTitle()
                          throws java.io.IOException
Returns the title of the HTML document.

Throws:
java.io.IOException

getKeywords

java.lang.String getKeywords()
                             throws java.io.IOException
Returns the keywords of the HTML document.

Throws:
java.io.IOException

getReader

java.io.Reader getReader()
                         throws java.io.IOException
Returns a reader that reads the contents of the HTML document.

Throws:
java.io.IOException
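
Example (a sketch, not taken from the Lenya sources): the methods above suggest a parse-then-query usage pattern, which might look roughly like the code below. To keep the example compilable on its own, it declares local stand-ins for HTMLParser and ParseException that mirror the signatures listed on this page. StubHTMLParser, HTMLParserDemo, readAll and summarize are hypothetical names introduced for illustration only. The sketch also assumes that parse is called before the getters, which this page does not state explicitly.

```java
import java.io.File;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.net.URI;

// Local stand-in for the ParseException referenced above (assumption: a checked exception).
class ParseException extends Exception {
    ParseException(String message) { super(message); }
}

// Local mirror of the documented interface, so the sketch compiles without Lenya.
interface HTMLParser {
    void parse(File file) throws ParseException;
    void parse(URI uri) throws ParseException;
    String getTitle() throws IOException;
    String getKeywords() throws IOException;
    Reader getReader() throws IOException;
}

// Hypothetical in-memory implementation; it returns canned values instead of parsing HTML.
class StubHTMLParser implements HTMLParser {
    private boolean parsed = false;
    private String title = "";
    private String keywords = "";
    private String body = "";

    public void parse(File file) throws ParseException {
        if (file == null) throw new ParseException("no file given");
        parse(file.toURI());
    }

    public void parse(URI uri) throws ParseException {
        if (uri == null) throw new ParseException("no URI given");
        title = "Example title";
        keywords = "lenya, lucene";
        body = "Example body text";
        parsed = true;
    }

    public String getTitle() throws IOException { check(); return title; }
    public String getKeywords() throws IOException { check(); return keywords; }
    public Reader getReader() throws IOException { check(); return new StringReader(body); }

    // Stub-only behaviour: refuse to answer before parse has been called.
    private void check() throws IOException {
        if (!parsed) throw new IOException("nothing has been parsed yet");
    }
}

class HTMLParserDemo {
    // Drains a Reader into a String.
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    // Parses first, then queries title, keywords and contents through the interface.
    static String summarize(HTMLParser parser, URI uri) throws ParseException, IOException {
        parser.parse(uri);
        try (Reader reader = parser.getReader()) {
            return parser.getTitle() + " | " + parser.getKeywords() + " | " + readAll(reader);
        }
    }

    public static void main(String[] args) throws Exception {
        URI uri = URI.create("file:///tmp/example.html");
        System.out.println(summarize(new StubHTMLParser(), uri));
    }
}
```

Because summarize depends only on the interface, any of the implementing classes listed at the top of this page could in principle be passed in place of the stub. Both checked exception types are propagated to the caller rather than swallowed.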


Copyright © 1999-2005 Apache Software Foundation. All Rights Reserved.