Example 1 with ComplexSeg

use of com.chenlb.mmseg4j.ComplexSeg in project java-basic by tzuyichao.

the class TestMMSeg4J method main.

public static void main(String[] args) throws IOException {
    Dictionary dictionary = Dictionary.getInstance();
    MMSeg mmSeg = new MMSeg(new StringReader("上一堂課之後跑18km與2500rpm的挑戰"), new ComplexSeg(dictionary));
    // next() returns null once the input is exhausted.
    Word word;
    while ((word = mmSeg.next()) != null) {
        System.out.println(word.getString());
    }
}
Also used : Dictionary(com.chenlb.mmseg4j.Dictionary) Word(com.chenlb.mmseg4j.Word) MMSeg(com.chenlb.mmseg4j.MMSeg) ComplexSeg(com.chenlb.mmseg4j.ComplexSeg) StringReader(java.io.StringReader)
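The `while ((word = mmSeg.next()) != null)` loop in Example 1 is a pull-style iteration that stops on `null`. A minimal sketch of the same pattern collected into a list is shown below. `TokenDrain`, `PullSource`, and `drain` are hypothetical names introduced only for illustration; they are not part of mmseg4j, and the demo feeds the helper from a plain iterator rather than a real segmenter.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class TokenDrain {

    /** Hypothetical stand-in for any source whose next() returns null when exhausted. */
    @FunctionalInterface
    public interface PullSource<T> {
        T next() throws IOException;
    }

    /** Pulls items until the source returns null and collects them in order. */
    public static <T> List<T> drain(PullSource<T> source) throws IOException {
        List<T> items = new ArrayList<>();
        T item;
        while ((item = source.next()) != null) {
            items.add(item);
        }
        return items;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder data, not real segmenter output.
        Iterator<String> it = Arrays.asList("alpha", "beta", "gamma").iterator();
        List<String> tokens = drain(() -> it.hasNext() ? it.next() : null);
        System.out.println(tokens);
    }
}
```

Assuming `MMSeg.next()` has the shape used in Example 1 (returns a `Word`, may throw `IOException`), a call such as `drain(mmSeg::next)` should fit this interface, though that has not been verified against the library here.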

Example 2 with ComplexSeg

use of com.chenlb.mmseg4j.ComplexSeg in project jstarcraft-nlp by HongZhaoHua.

the class MMSegTokenizerFactory method newSeg.

private Seg newSeg(Map<String, String> configuration) {
    Seg seg = null;
    logger.info("create new Seg ...");
    // Falls back to max-word mode when "mode" is missing or unrecognized.
    String mode = configuration.get("mode");
    if ("simple".equals(mode)) {
        logger.info("use simple mode");
        seg = new SimpleSeg(dic);
    } else if ("complex".equals(mode)) {
        logger.info("use complex mode");
        seg = new ComplexSeg(dic);
    } else {
        logger.info("use max-word mode");
        seg = new MaxWordSeg(dic);
    }
    return seg;
}
Also used : SimpleSeg(com.chenlb.mmseg4j.SimpleSeg) MaxWordSeg(com.chenlb.mmseg4j.MaxWordSeg) ComplexSeg(com.chenlb.mmseg4j.ComplexSeg) Seg(com.chenlb.mmseg4j.Seg)

Example 3 with ComplexSeg

use of com.chenlb.mmseg4j.ComplexSeg in project jstarcraft-nlp by HongZhaoHua.

the class MmsegSegmentFactory method build.

@Override
public MMSeg build(Map<String, String> configurations) {
    Dictionary dictionary;
    String dictionaryPath = get(configurations, "dictionaryPath");
    if (StringUtility.isBlank(dictionaryPath)) {
        dictionary = Dictionary.getInstance();
    } else {
        File file = new File(dictionaryPath);
        dictionary = Dictionary.getInstance(file);
    }
    String configuration = get(configurations, "mode", "MaxWord");
    Seg seg = null;
    switch(configuration) {
        case "Complex":
            seg = new ComplexSeg(dictionary);
            break;
        case "Simple":
            seg = new SimpleSeg(dictionary);
            break;
        case "MaxWord":
            seg = new MaxWordSeg(dictionary);
            break;
        default:
            throw new IllegalArgumentException("Unsupported mode: " + configuration);
    }
    MMSeg mmSeg = new MMSeg(new StringReader(""), seg);
    return mmSeg;
}
Also used : Dictionary(com.chenlb.mmseg4j.Dictionary) SimpleSeg(com.chenlb.mmseg4j.SimpleSeg) MaxWordSeg(com.chenlb.mmseg4j.MaxWordSeg) ComplexSeg(com.chenlb.mmseg4j.ComplexSeg) Seg(com.chenlb.mmseg4j.Seg) MMSeg(com.chenlb.mmseg4j.MMSeg) StringReader(java.io.StringReader) File(java.io.File)
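Examples 2 and 3 resolve the mode string differently: Example 2 silently falls back to max-word for any unrecognized value, while Example 3 throws on anything other than its three known names. The sketch below isolates the two policies so they can be compared side by side. `SegModes`, `Mode`, `parseLenient`, and `parseStrict` are hypothetical names, the enum stands in for the real `Seg` subclasses, and treating a missing value as `"MaxWord"` in the strict variant is an assumption about what the `get(configurations, "mode", "MaxWord")` helper does.

```java
public class SegModes {

    /** Stand-in for SimpleSeg, ComplexSeg, and MaxWordSeg. */
    public enum Mode { SIMPLE, COMPLEX, MAX_WORD }

    /** Lenient policy (as in Example 2): anything unrecognized, including null, becomes MAX_WORD. */
    public static Mode parseLenient(String value) {
        if ("simple".equals(value)) {
            return Mode.SIMPLE;
        }
        if ("complex".equals(value)) {
            return Mode.COMPLEX;
        }
        return Mode.MAX_WORD;
    }

    /** Strict policy (as in Example 3): a missing value is assumed to default to "MaxWord"; unknown values throw. */
    public static Mode parseStrict(String value) {
        String mode = (value == null) ? "MaxWord" : value;
        switch (mode) {
            case "Simple":
                return Mode.SIMPLE;
            case "Complex":
                return Mode.COMPLEX;
            case "MaxWord":
                return Mode.MAX_WORD;
            default:
                throw new IllegalArgumentException("Unsupported mode: " + mode);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseLenient("complex"));
        System.out.println(parseLenient("typo"));
        System.out.println(parseStrict("Complex"));
    }
}
```

The strict variant surfaces configuration typos immediately, whereas the lenient one keeps running with a mode the caller may not have intended. Note also that the two examples use different spellings (`"complex"` versus `"Complex"`), so a configuration value valid for one is not valid for the other.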

Example 4 with ComplexSeg

use of com.chenlb.mmseg4j.ComplexSeg in project jstarcraft-nlp by HongZhaoHua.

the class MmsegSegmenterTestCase method getSegmenter.

@Override
protected Tokenizer getSegmenter() {
    Dictionary dictionary = Dictionary.getInstance();
    ComplexSeg complex = new ComplexSeg(dictionary);
    // MMSeg mmSeg = new MMSeg(new StringReader(""), complex);
    MmsegTokenizer tokenizer = new MmsegTokenizer(complex);
    return tokenizer;
}
Also used : Dictionary(com.chenlb.mmseg4j.Dictionary) ComplexSeg(com.chenlb.mmseg4j.ComplexSeg) MmsegTokenizer(com.jstarcraft.nlp.lucene.mmseg.MmsegTokenizer)

Example 5 with ComplexSeg

use of com.chenlb.mmseg4j.ComplexSeg in project jstarcraft-nlp by HongZhaoHua.

the class MmsegTokenizerTestCase method getTokenizer.

@Override
protected NlpTokenizer<? extends NlpToken> getTokenizer() {
    Dictionary dictionary = Dictionary.getInstance();
    ComplexSeg complex = new ComplexSeg(dictionary);
    MMSeg mmSeg = new MMSeg(new StringReader(""), complex);
    return new MmsegTokenizer(mmSeg);
}
Also used : Dictionary(com.chenlb.mmseg4j.Dictionary) ComplexSeg(com.chenlb.mmseg4j.ComplexSeg) MMSeg(com.chenlb.mmseg4j.MMSeg) StringReader(java.io.StringReader) MmsegTokenizer(com.jstarcraft.nlp.tokenization.mmseg.MmsegTokenizer)

Aggregations

ComplexSeg (com.chenlb.mmseg4j.ComplexSeg): 5
Dictionary (com.chenlb.mmseg4j.Dictionary): 4
MMSeg (com.chenlb.mmseg4j.MMSeg): 3
StringReader (java.io.StringReader): 3
MaxWordSeg (com.chenlb.mmseg4j.MaxWordSeg): 2
Seg (com.chenlb.mmseg4j.Seg): 2
SimpleSeg (com.chenlb.mmseg4j.SimpleSeg): 2
Word (com.chenlb.mmseg4j.Word): 1
MmsegTokenizer (com.jstarcraft.nlp.lucene.mmseg.MmsegTokenizer): 1
MmsegTokenizer (com.jstarcraft.nlp.tokenization.mmseg.MmsegTokenizer): 1
File (java.io.File): 1