This project has retired. For details please refer to its Attic page.

org.apache.lenya.lucene
Class ReTokenizeFile

java.lang.Object
  extended by org.apache.lenya.lucene.ReTokenizeFile

public class ReTokenizeFile
extends java.lang.Object

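Utility for reading files, removing markup from their content, and building excerpts in which given words are emphasized.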


Constructor Summary
ReTokenizeFile()
           
 
Method Summary
 java.lang.String emphasizeAsXML(java.lang.String string, java.lang.String[] words)
          Encloses every entry of words that appears in string in <word> tags.
 java.lang.String getExcerpt(java.io.File file, java.lang.String[] words)
           
protected  java.lang.String includeInCDATA(java.lang.String string)
          Includes a string in CDATA delimiters.
static void main(java.lang.String[] args)
          Command-line entry point.
protected  java.lang.String readFile(java.io.File file, java.nio.charset.Charset charset)
          Reads a file using the specified charset.
protected  java.lang.String readFileWithEncoding(java.io.File file)
          Reads a file; if the file is an XML file, determines its encoding.
protected  java.lang.String readHtmlFile(java.io.File file)
          Reads an HTML file.
 java.lang.String removeTags(java.lang.String string)
          Removes tags.
 java.lang.String reTokenize(java.io.File file)
          DOCUMENT ME!
 void setOffset(int offset)
          Sets the offset.
 java.lang.String tidy(java.lang.String string)
          Used by search-and-results.xsp.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ReTokenizeFile

public ReTokenizeFile()
Method Detail

main

public static void main(java.lang.String[] args)
Command-line entry point.

Parameters:
args - the command-line arguments

reTokenize

public java.lang.String reTokenize(java.io.File file)
                            throws java.lang.Exception
DOCUMENT ME!

Parameters:
file - DOCUMENT ME!
Returns:
DOCUMENT ME!
Throws:
java.lang.Exception - DOCUMENT ME!

getExcerpt

public java.lang.String getExcerpt(java.io.File file,
                                   java.lang.String[] words)
                            throws java.io.FileNotFoundException,
                                   java.io.IOException
Throws:
java.io.FileNotFoundException
java.io.IOException

removeTags

public java.lang.String removeTags(java.lang.String string)
Removes tags.

Parameters:
string - Content with tags
Returns:
Content without tags
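
The following is a minimal sketch of tag removal, not the Lenya implementation. It assumes that a "tag" is any text between '<' and the next '>'; the class name RemoveTagsSketch is hypothetical.

```java
public class RemoveTagsSketch {

    /**
     * Strips everything that looks like a tag, i.e. a '<' followed by
     * any characters up to and including the next '>'.
     * Assumption: the real method may treat comments or CDATA differently.
     */
    static String removeTags(String content) {
        return content.replaceAll("<[^>]*>", "");
    }

    public static void main(String[] args) {
        // prints "Hello world"
        System.out.println(removeTags("<p>Hello <b>world</b></p>"));
    }
}
```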

tidy

public java.lang.String tidy(java.lang.String string)
Used by search-and-results.xsp. It is unclear whether this method is still necessary.

Parameters:
string - content
Returns:
the content without the characters <, > and &
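
As an illustration of the documented return value, the sketch below deletes the three characters outright. Whether the real method deletes or escapes them is not stated here, so this behavior is an assumption; the class name TidySketch is hypothetical.

```java
public class TidySketch {

    /**
     * Returns the content with every '<', '>' and '&' removed.
     * Assumption: the characters are dropped rather than escaped.
     */
    static String tidy(String content) {
        return content.replaceAll("[<>&]", "");
    }

    public static void main(String[] args) {
        // prints "abc"
        System.out.println(tidy("a<b>&c"));
    }
}
```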

emphasizeAsXML

public java.lang.String emphasizeAsXML(java.lang.String string,
                                       java.lang.String[] words)
Encloses every entry of words that appears in string in <word> tags. The whole string is enclosed in <excerpt> tags.

Parameters:
string - The string to process.
words - The words to emphasize.
Returns:
the string enclosed in <excerpt> tags, with each matching word enclosed in <word> tags
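
The described behavior can be sketched as follows. This is an illustrative version under stated assumptions, not the Lenya source: matching is case-sensitive, words match as plain substrings, and the input is not XML-escaped. The class name EmphasizeSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class EmphasizeSketch {

    /**
     * Wraps each occurrence of any entry of words in <word> tags and
     * the whole result in <excerpt> tags.
     */
    static String emphasizeAsXML(String string, String[] words) {
        // Quote each word so regex metacharacters are matched literally.
        List<String> quoted = new ArrayList<>();
        for (String word : words) {
            if (!word.isEmpty()) {
                quoted.add(Pattern.quote(word));
            }
        }
        String body = string;
        if (!quoted.isEmpty()) {
            // A single pass over the text avoids re-matching inserted tags.
            String alternation = String.join("|", quoted);
            body = Pattern.compile(alternation)
                          .matcher(string)
                          .replaceAll("<word>$0</word>");
        }
        return "<excerpt>" + body + "</excerpt>";
    }

    public static void main(String[] args) {
        // prints "<excerpt>the <word>quick</word> fox</excerpt>"
        System.out.println(emphasizeAsXML("the quick fox", new String[] {"quick"}));
    }
}
```

Replacing all words in one pass, rather than one word at a time, keeps a search word such as "word" from matching the tags that an earlier replacement inserted.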

includeInCDATA

protected java.lang.String includeInCDATA(java.lang.String string)
Includes a string in CDATA delimiters.
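
A minimal sketch of CDATA wrapping, assuming the method simply surrounds the string with the standard XML CDATA delimiters; the class name CdataSketch is hypothetical.

```java
public class CdataSketch {

    /**
     * Surrounds the string with XML CDATA delimiters.
     * Caveat: input that itself contains "]]>" would end the section
     * early; this sketch does not handle that case.
     */
    static String includeInCDATA(String string) {
        return "<![CDATA[" + string + "]]>";
    }

    public static void main(String[] args) {
        // prints "<![CDATA[a < b]]>"
        System.out.println(includeInCDATA("a < b"));
    }
}
```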


readFileWithEncoding

protected java.lang.String readFileWithEncoding(java.io.File file)
                                         throws java.io.FileNotFoundException,
                                                java.io.IOException
Reads a file; if the file is an XML file, determines its encoding.

Parameters:
file - the file to read (an XML file that specifies an encoding is read using that encoding)
Returns:
the contents of the file.
Throws:
java.io.FileNotFoundException
java.io.IOException
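
One plausible way to determine the encoding of an XML file is to inspect the encoding pseudo-attribute of its XML declaration. The sketch below shows only that detection step and is not the Lenya implementation; the class name, the detectCharset helper and the fallback parameter are all hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class XmlEncodingSketch {

    // Matches an XML declaration at the start of the text and captures
    // the value of its encoding pseudo-attribute.
    private static final Pattern XML_DECL_ENCODING = Pattern.compile(
            "^<\\?xml[^>]*encoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']");

    /**
     * Returns the charset named in the XML declaration found in head,
     * or the fallback if there is no declaration or the charset is
     * not supported by this JVM.
     */
    static Charset detectCharset(String head, Charset fallback) {
        Matcher m = XML_DECL_ENCODING.matcher(head);
        if (m.find() && Charset.isSupported(m.group(1))) {
            return Charset.forName(m.group(1));
        }
        return fallback;
    }

    public static void main(String[] args) {
        String head = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><a/>";
        // prints "ISO-8859-1"
        System.out.println(detectCharset(head, StandardCharsets.UTF_8));
    }
}
```

A caller would typically decode the first bytes of the file as ASCII-compatible text, pass them as head, and then re-read the whole file with the returned charset.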

readHtmlFile

protected java.lang.String readHtmlFile(java.io.File file)
                                 throws java.io.FileNotFoundException,
                                        java.io.IOException
Reads an HTML file.

Parameters:
file - the file to read
Returns:
the content of the file.
Throws:
java.io.FileNotFoundException - if the file does not exist.
java.io.IOException - if something else went wrong.

readFile

protected java.lang.String readFile(java.io.File file,
                                    java.nio.charset.Charset charset)
                             throws java.io.FileNotFoundException,
                                    java.io.IOException
Reads a file using the specified charset.

Parameters:
file - the file to read.
charset - the file encoding
Returns:
the content of the file.
Throws:
java.io.FileNotFoundException - if the file does not exist.
java.io.IOException - if something else went wrong.
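
A sketch of reading a whole file with an explicit charset, matching the documented signature and exceptions. It uses only standard java.io classes and is an assumption about how such a method could look, not the Lenya source; the class name ReadFileSketch is hypothetical.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

public class ReadFileSketch {

    /**
     * Reads the complete file and decodes it with the given charset.
     * FileInputStream throws FileNotFoundException if the file is missing.
     */
    static String readFile(File file, Charset charset)
            throws FileNotFoundException, IOException {
        StringBuilder content = new StringBuilder();
        // try-with-resources closes the reader even if reading fails.
        try (Reader reader = new InputStreamReader(new FileInputStream(file), charset)) {
            char[] buffer = new char[4096];
            int read;
            while ((read = reader.read(buffer)) != -1) {
                content.append(buffer, 0, read);
            }
        }
        return content.toString();
    }
}
```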

setOffset

public void setOffset(int offset)
Sets the offset.



Copyright © 1999-2005 Apache Software Foundation. All Rights Reserved.