org.apache.lenya.lucene
Class ReTokenizeFile

java.lang.Object
  extended by org.apache.lenya.lucene.ReTokenizeFile

public class ReTokenizeFile
extends java.lang.Object

DOCUMENT ME!


Constructor Summary
ReTokenizeFile()
           
 
Method Summary
 java.lang.String emphasizeAsXML(java.lang.String string, java.lang.String[] words)
          Encloses every entry of words that appears in string in <word> tags.
 java.lang.String getExcerpt(java.io.File file, java.lang.String[] words)
           
protected  java.lang.String includeInCDATA(java.lang.String string)
          Includes a string in CDATA delimiters.
static void main(java.lang.String[] args)
          DOCUMENT ME!
protected  java.lang.String readFile(java.io.File file, java.nio.charset.Charset charset)
          Reads a file in the specified encoding.
protected  java.lang.String readFileWithEncoding(java.io.File file)
          Reads a file and, if it is an XML file, determines its encoding.
protected  java.lang.String readHtmlFile(java.io.File file)
          Reads an HTML file.
 java.lang.String removeTags(java.lang.String string)
          Removes tags.
 java.lang.String reTokenize(java.io.File file)
          DOCUMENT ME!
 void setOffset(int offset)
          Sets the offset.
 java.lang.String tidy(java.lang.String string)
          Used by search-and-results.xsp.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ReTokenizeFile

public ReTokenizeFile()
Method Detail

main

public static void main(java.lang.String[] args)
DOCUMENT ME!

Parameters:
args - DOCUMENT ME!

reTokenize

public java.lang.String reTokenize(java.io.File file)
                            throws java.lang.Exception
DOCUMENT ME!

Parameters:
file - DOCUMENT ME!
Returns:
DOCUMENT ME!
Throws:
java.lang.Exception - DOCUMENT ME!

getExcerpt

public java.lang.String getExcerpt(java.io.File file,
                                   java.lang.String[] words)
                            throws java.io.FileNotFoundException,
                                   java.io.IOException
Throws:
java.io.FileNotFoundException
java.io.IOException

removeTags

public java.lang.String removeTags(java.lang.String string)
Removes tags.

Parameters:
string - Content with tags
Returns:
Content without tags
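
As an illustration of the behaviour described above, here is a minimal sketch. It is not the Lenya implementation: the class name RemoveTagsSketch is hypothetical, and it assumes that a "tag" is any run of characters between < and >.

```java
final class RemoveTagsSketch {
    // Assumption: a tag is "<", any characters other than ">", then ">".
    // Comments, CDATA sections and ">" inside attribute values are not handled.
    static String removeTags(String content) {
        return content.replaceAll("<[^>]*>", "");
    }

    public static void main(String[] args) {
        System.out.println(removeTags("<p>Hello <b>world</b></p>")); // prints "Hello world"
    }
}
```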

tidy

public java.lang.String tidy(java.lang.String string)
Used by search-and-results.xsp. Is this really still necessary?

Parameters:
string - content
Returns:
content without the characters <, > and &
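
The documented result, content without <, > and &, could be produced along the following lines. This is a sketch under assumptions: the class name TidySketch is hypothetical, and it assumes the three characters are dropped outright rather than escaped or replaced with spaces, which this page does not specify.

```java
final class TidySketch {
    // Assumption: the three characters are simply deleted.
    static String tidy(String content) {
        return content.replaceAll("[<>&]", "");
    }

    public static void main(String[] args) {
        System.out.println(tidy("AT&T <rocks>")); // prints "ATT rocks"
    }
}
```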

emphasizeAsXML

public java.lang.String emphasizeAsXML(java.lang.String string,
                                       java.lang.String[] words)
Encloses every entry of words that appears in string in <word> tags. The whole string is enclosed in <excerpt> tags.

Parameters:
string - The string to process.
words - The words to emphasize.
Returns:
The string with each matching word enclosed in <word> tags, wrapped as a whole in <excerpt> tags.
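
The wrapping described for emphasizeAsXML can be sketched as follows. This is not the Lenya source: the class name EmphasizeSketch is hypothetical, matching is assumed to be case-sensitive and literal, and the input is assumed to contain no characters that need XML escaping.

```java
import java.util.regex.Pattern;

final class EmphasizeSketch {
    static String emphasizeAsXML(String text, String[] words) {
        // Build one alternation so all words are matched in a single pass;
        // this avoids re-matching inside <word> tags that were just inserted.
        StringBuilder alternation = new StringBuilder();
        for (String word : words) {
            if (word == null || word.isEmpty()) {
                continue; // an empty pattern would match at every position
            }
            if (alternation.length() > 0) {
                alternation.append('|');
            }
            alternation.append(Pattern.quote(word));
        }
        String body = text;
        if (alternation.length() > 0) {
            // $0 refers to the whole match, i.e. the word that was found.
            body = Pattern.compile(alternation.toString())
                          .matcher(text)
                          .replaceAll("<word>$0</word>");
        }
        return "<excerpt>" + body + "</excerpt>";
    }

    public static void main(String[] args) {
        System.out.println(emphasizeAsXML("the quick brown fox", new String[] {"quick", "fox"}));
        // prints "<excerpt>the <word>quick</word> brown <word>fox</word></excerpt>"
    }
}
```

Note that this sketch matches substrings, so "fox" would also be marked inside "foxes"; whether the real method respects word boundaries is not stated here.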

includeInCDATA

protected java.lang.String includeInCDATA(java.lang.String string)
Includes a string in CDATA delimiters.
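
A CDATA section starts with <![CDATA[ and ends with ]]>. A minimal sketch of such a wrapper follows; the class name CdataSketch is hypothetical, and the sketch assumes the input never contains the sequence ]]>, which would end the section early.

```java
final class CdataSketch {
    // Assumption: the input does not contain "]]>"; no splitting is attempted.
    static String includeInCDATA(String content) {
        return "<![CDATA[" + content + "]]>";
    }

    public static void main(String[] args) {
        System.out.println(includeInCDATA("a < b")); // prints "<![CDATA[a < b]]>"
    }
}
```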


readFileWithEncoding

protected java.lang.String readFileWithEncoding(java.io.File file)
                                         throws java.io.FileNotFoundException,
                                                java.io.IOException
Reads a file and, if it is an XML file, determines its encoding.

Parameters:
file - the file to read. If the file is an XML file that specifies an encoding, that encoding is used instead of the default.
Returns:
the contents of the file.
Throws:
java.io.FileNotFoundException
java.io.IOException
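
One way to determine the encoding of an XML file is to inspect the encoding pseudo-attribute of its XML declaration. The sketch below illustrates that idea only; it is not taken from the Lenya source. The names XmlEncodingSketch and detectCharset are hypothetical, the UTF-8 fallback is an assumption, and byte order marks and UTF-16 input are not handled.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class XmlEncodingSketch {
    // Matches e.g. <?xml version="1.0" encoding="ISO-8859-1"?> at the start of the input.
    private static final Pattern ENCODING = Pattern.compile(
        "^<\\?xml[^>]*encoding\\s*=\\s*[\"']([A-Za-z0-9][A-Za-z0-9._-]*)[\"']");

    // Hypothetical helper: returns the declared charset, or the fallback if none is declared.
    static Charset detectCharset(byte[] bytes, Charset fallback) {
        // The declaration is ASCII-compatible in most encodings, so decode a short prefix.
        int length = Math.min(bytes.length, 200);
        String head = new String(bytes, 0, length, StandardCharsets.ISO_8859_1);
        Matcher matcher = ENCODING.matcher(head);
        if (matcher.find() && Charset.isSupported(matcher.group(1))) {
            return Charset.forName(matcher.group(1));
        }
        return fallback;
    }

    static String readFileWithEncoding(File file) throws IOException {
        byte[] bytes = Files.readAllBytes(file.toPath());
        // Assumption: UTF-8 when the file declares no encoding.
        return new String(bytes, detectCharset(bytes, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        byte[] sample = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><a/>"
            .getBytes(StandardCharsets.US_ASCII);
        System.out.println(detectCharset(sample, StandardCharsets.UTF_8)); // prints "ISO-8859-1"
    }
}
```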

readHtmlFile

protected java.lang.String readHtmlFile(java.io.File file)
                                 throws java.io.FileNotFoundException,
                                        java.io.IOException
Reads an HTML file.

Parameters:
file - the file to read
Returns:
the content of the file.
Throws:
java.io.FileNotFoundException - if the file does not exist.
java.io.IOException - if something else went wrong.

readFile

protected java.lang.String readFile(java.io.File file,
                                    java.nio.charset.Charset charset)
                             throws java.io.FileNotFoundException,
                                    java.io.IOException
Reads a file in the specified encoding.

Parameters:
file - the file to read.
charset - the file encoding
Returns:
the content of the file.
Throws:
java.io.FileNotFoundException - if the file does not exist.
java.io.IOException - if something else went wrong.
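
Reading a file with an explicit Charset can be sketched with standard java.io classes as shown below. The class name ReadFileSketch is hypothetical, and nothing here is taken from the Lenya source; it only mirrors the documented signature and exceptions.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

final class ReadFileSketch {
    static String readFile(File file, Charset charset)
            throws FileNotFoundException, IOException {
        StringBuilder content = new StringBuilder();
        // FileInputStream throws FileNotFoundException if the file cannot be opened.
        try (Reader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), charset))) {
            char[] buffer = new char[4096];
            int read;
            while ((read = reader.read(buffer)) != -1) {
                content.append(buffer, 0, read);
            }
        }
        return content.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: ReadFileSketch <file>");
            return;
        }
        System.out.print(readFile(new File(args[0]), Charset.forName("UTF-8")));
    }
}
```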

setOffset

public void setOffset(int offset)
Sets the offset.



Copyright © 1999-2005 Apache Software Foundation. All Rights Reserved.