Use of org.apache.lucene.analysis.standard.StandardTokenizer in project nutch by apache.
Class LuceneTokenizer, method generateTokenStreamFromText.
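private TokenStream generateTokenStreamFromText(String content, TokenizerType tokenizerType) {
    Tokenizer tokenizer;
    // Pick the tokenizer implementation; unknown types fall back to StandardTokenizer.
    switch (tokenizerType) {
        case CLASSIC:
            tokenizer = new ClassicTokenizer();
            break;
        case STANDARD:
        default:
            tokenizer = new StandardTokenizer();
    }
    // The no-argument constructors take no input, so the text is supplied through setReader().
    tokenizer.setReader(new StringReader(content));
    // tokenStream is not declared in this excerpt; it is presumably a field of the enclosing class.
    tokenStream = tokenizer;
    return tokenStream;
}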
Use of org.apache.lucene.analysis.standard.StandardTokenizer in project omegat by omegat-org.
Class BaseTokenizer, method getStandardTokenStream.
/**
 * Minimal implementation that returns the default implementation
 * corresponding to all false parameters. Subclasses should override this to
 * handle true parameters.
 */
protected TokenStream getStandardTokenStream(String strOrig) throws IOException {
    StandardTokenizer tokenizer = new StandardTokenizer();
    tokenizer.setReader(new StringReader(strOrig));
    return tokenizer;
}