java.lang.Object
    com.sun.speech.freetts.en.TokenizerImpl
Implements the Tokenizer interface. Breaks an input sequence of characters into a sequence of tokens.
Field Summary
static java.lang.String DEFAULT_POSTPUNCTUATION_SYMBOLS
    A string containing the default post-punctuation characters.
static java.lang.String DEFAULT_PREPUNCTUATION_SYMBOLS
    A string containing the default pre-punctuation characters.
static java.lang.String DEFAULT_SINGLE_CHAR_SYMBOLS
    A string containing the default single characters.
static java.lang.String DEFAULT_WHITESPACE_SYMBOLS
    A string containing the default whitespace characters.
static int EOF
    A constant indicating that the end of the stream has been read.
Constructor Summary
TokenizerImpl()
    Constructs a Tokenizer.
TokenizerImpl(java.io.Reader file)
    Creates a tokenizer that will return tokens from the given file.
TokenizerImpl(java.lang.String string)
    Creates a tokenizer that will return tokens from the given string.
Method Summary
java.lang.String getErrorDescription()
    If hasErrors() returns true, returns a description of the error encountered; otherwise returns null.
Token getNextToken()
    Returns the next token.
boolean hasErrors()
    Returns true if there were errors while reading tokens.
boolean hasMoreTokens()
    Returns true if there are more tokens, false otherwise.
boolean isBreak()
    Determines if the current token should start a new sentence.
void setInputReader(java.io.Reader reader)
    Sets the input reader.
void setInputText(java.lang.String inputString)
    Sets the text to tokenize.
void setPostpunctuationSymbols(java.lang.String symbols)
    Sets the postpunctuation symbols of this Tokenizer to the given symbols.
void setPrepunctuationSymbols(java.lang.String symbols)
    Sets the prepunctuation symbols of this Tokenizer to the given symbols.
void setSingleCharSymbols(java.lang.String symbols)
    Sets the single character symbols of this Tokenizer to the given symbols.
void setWhitespaceSymbols(java.lang.String symbols)
    Sets the whitespace symbols of this Tokenizer to the given symbols.

Methods inherited from class java.lang.Object
    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
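Field Detail
public static final int EOF
    A constant indicating that the end of the stream has been read.

public static final java.lang.String DEFAULT_WHITESPACE_SYMBOLS
    A string containing the default whitespace characters.

public static final java.lang.String DEFAULT_SINGLE_CHAR_SYMBOLS
    A string containing the default single characters.

public static final java.lang.String DEFAULT_PREPUNCTUATION_SYMBOLS
    A string containing the default pre-punctuation characters.

public static final java.lang.String DEFAULT_POSTPUNCTUATION_SYMBOLS
    A string containing the default post-punctuation characters.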
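
The whitespace, prepunctuation and postpunctuation symbol sets determine how raw text is divided around each word. As a rough illustration only, the sketch below separates whitespace-delimited chunks into leading prepunctuation, a word, and trailing postpunctuation. It is not the FreeTTS implementation: the class `SymbolSplitSketch`, its constants, and the symbol strings are hypothetical and are not the actual default values of the fields above. Single character symbols are left out to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; NOT part of FreeTTS and NOT its real defaults.
final class SymbolSplitSketch {
    // Assumed symbol sets, chosen only for this example.
    static final String WHITESPACE = " \t\n\r";
    static final String PREPUNCTUATION = "\"'([{";
    static final String POSTPUNCTUATION = "\"'.,:;!?)]}";

    /** Splits text into {prepunctuation, word, postpunctuation} triples. */
    static List<String[]> split(String text) {
        List<String[]> result = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            // Skip whitespace between chunks.
            while (i < text.length() && WHITESPACE.indexOf(text.charAt(i)) >= 0) {
                i++;
            }
            int start = i;
            // Consume one chunk of non-whitespace characters.
            while (i < text.length() && WHITESPACE.indexOf(text.charAt(i)) < 0) {
                i++;
            }
            if (start == i) {
                break; // only trailing whitespace remained
            }
            String chunk = text.substring(start, i);

            // Peel prepunctuation off the front of the chunk.
            int wordStart = 0;
            while (wordStart < chunk.length()
                    && PREPUNCTUATION.indexOf(chunk.charAt(wordStart)) >= 0) {
                wordStart++;
            }
            // Peel postpunctuation off the back of the chunk.
            int wordEnd = chunk.length();
            while (wordEnd > wordStart
                    && POSTPUNCTUATION.indexOf(chunk.charAt(wordEnd - 1)) >= 0) {
                wordEnd--;
            }
            result.add(new String[] {
                chunk.substring(0, wordStart),
                chunk.substring(wordStart, wordEnd),
                chunk.substring(wordEnd)
            });
        }
        return result;
    }

    public static void main(String[] args) {
        for (String[] t : split("(Hello, world!)")) {
            System.out.println("pre=[" + t[0] + "] word=[" + t[1] + "] post=[" + t[2] + "]");
        }
    }
}
```

In this sketch, changing one of the constant strings plays the role that the corresponding `set...Symbols` method plays on a tokenizer: the same input splits differently once a character moves into or out of a set.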
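Constructor Detail
public TokenizerImpl()
    Constructs a Tokenizer.

public TokenizerImpl(java.lang.String string)
    Creates a tokenizer that will return tokens from the given string.
    Parameters:
        string - the string to tokenize

public TokenizerImpl(java.io.Reader file)
    Creates a tokenizer that will return tokens from the given file.
    Parameters:
        file - where to read the input from

Method Detail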
public void setWhitespaceSymbols(java.lang.String symbols)
    Sets the whitespace symbols of this Tokenizer to the given symbols.
    Specified by:
        setWhitespaceSymbols in interface Tokenizer
    Parameters:
        symbols - the whitespace symbols

public void setSingleCharSymbols(java.lang.String symbols)
    Sets the single character symbols of this Tokenizer to the given symbols.
    Specified by:
        setSingleCharSymbols in interface Tokenizer
    Parameters:
        symbols - the single character symbols

public void setPrepunctuationSymbols(java.lang.String symbols)
    Sets the prepunctuation symbols of this Tokenizer to the given symbols.
    Specified by:
        setPrepunctuationSymbols in interface Tokenizer
    Parameters:
        symbols - the prepunctuation symbols

public void setPostpunctuationSymbols(java.lang.String symbols)
    Sets the postpunctuation symbols of this Tokenizer to the given symbols.
    Specified by:
        setPostpunctuationSymbols in interface Tokenizer
    Parameters:
        symbols - the postpunctuation symbols

public void setInputText(java.lang.String inputString)
    Sets the text to tokenize.
    Specified by:
        setInputText in interface Tokenizer
    Parameters:
        inputString - the string to tokenize

public void setInputReader(java.io.Reader reader)
    Sets the input reader.
    Specified by:
        setInputReader in interface Tokenizer
    Parameters:
        reader - the input source

public Token getNextToken()
    Returns the next token.
    Specified by:
        getNextToken in interface Tokenizer
    Returns:
        the next token, or null if there are no more tokens

public boolean hasMoreTokens()
    Returns true if there are more tokens, false otherwise.
    Specified by:
        hasMoreTokens in interface Tokenizer
    Returns:
        true if there are more tokens; false otherwise

public boolean hasErrors()
    Returns true if there were errors while reading tokens.
    Specified by:
        hasErrors in interface Tokenizer
    Returns:
        true if there were errors; false otherwise

public java.lang.String getErrorDescription()
    If hasErrors() returns true, returns a description of the error encountered; otherwise returns null.
    Specified by:
        getErrorDescription in interface Tokenizer

public boolean isBreak()
    Determines if the current token should start a new sentence.
    Specified by:
        isBreak in interface Tokenizer
    Returns:
        true if a new sentence should be started
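
The methods above imply a simple reading pattern: loop while hasMoreTokens() is true, treat a null from getNextToken() as the end of input, and consult hasErrors() and getErrorDescription() once reading stops. The sketch below shows that pattern against a small stand-in, not against FreeTTS itself. The interface `TokenSourceSketch` and the classes `WhitespaceTokenSource` and `TokenLoopSketch` are hypothetical, the stand-in returns plain strings rather than Token objects, and it never reports an error.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical driver showing the read loop; not FreeTTS code.
final class TokenLoopSketch {
    /** Reads every token from the source, then checks for errors. */
    static List<String> drain(TokenSourceSketch source) {
        List<String> out = new ArrayList<>();
        while (source.hasMoreTokens()) {
            String token = source.getNextToken();
            if (token == null) {
                break; // documented contract: null means no more tokens
            }
            out.add(token);
        }
        if (source.hasErrors()) {
            throw new IllegalStateException(source.getErrorDescription());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(drain(new WhitespaceTokenSource("Breaks text into tokens")));
    }
}

// Hypothetical stand-in that reuses the method names documented above.
// It is NOT the FreeTTS Tokenizer interface; tokens are plain Strings here.
interface TokenSourceSketch {
    boolean hasMoreTokens();
    String getNextToken();         // null when no tokens remain
    boolean hasErrors();
    String getErrorDescription();  // null when hasErrors() is false
}

// Trivial implementation that splits on runs of whitespace only.
final class WhitespaceTokenSource implements TokenSourceSketch {
    private final String[] words;
    private int next = 0;

    WhitespaceTokenSource(String text) {
        String trimmed = text.trim();
        this.words = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    @Override
    public boolean hasMoreTokens() {
        return next < words.length;
    }

    @Override
    public String getNextToken() {
        return hasMoreTokens() ? words[next++] : null;
    }

    @Override
    public boolean hasErrors() {
        return false; // this in-memory stand-in cannot fail
    }

    @Override
    public String getErrorDescription() {
        return null;
    }
}
```

Checking for errors after the loop rather than inside it keeps the loop body short; a caller that needs to know which token preceded a failure would check hasErrors() on each iteration instead.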