
Example 1 with MaxWordSeg

Use of com.chenlb.mmseg4j.MaxWordSeg in project jstarcraft-nlp by HongZhaoHua.

From the class MMSegTokenizerFactory, method newSeg.

private Seg newSeg(Map<String, String> configuration) {
    logger.info("create new Seg ...");
    // "mode" selects the segmentation algorithm. Anything other than
    // "simple" or "complex" (including a missing key) falls back to max-word.
    String mode = configuration.get("mode");
    // dic and logger are declared in the enclosing class, outside this snippet.
    Seg seg;
    if ("simple".equals(mode)) {
        logger.info("use simple mode");
        seg = new SimpleSeg(dic);
    } else if ("complex".equals(mode)) {
        logger.info("use complex mode");
        seg = new ComplexSeg(dic);
    } else {
        logger.info("use max-word mode");
        seg = new MaxWordSeg(dic);
    }
    return seg;
}
Also used: SimpleSeg (com.chenlb.mmseg4j.SimpleSeg), MaxWordSeg (com.chenlb.mmseg4j.MaxWordSeg), ComplexSeg (com.chenlb.mmseg4j.ComplexSeg), Seg (com.chenlb.mmseg4j.Seg)

Example 2 with MaxWordSeg

Use of com.chenlb.mmseg4j.MaxWordSeg in project jstarcraft-nlp by HongZhaoHua.

From the class MmsegSegmentFactory, method build.

@Override
public MMSeg build(Map<String, String> configurations) {
    Dictionary dictionary;
    String dictionaryPath = get(configurations, "dictionaryPath");
    if (StringUtility.isBlank(dictionaryPath)) {
        dictionary = Dictionary.getInstance();
    } else {
        File file = new File(dictionaryPath);
        dictionary = Dictionary.getInstance(file);
    }
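    // Mode names are case-sensitive here; a missing "mode" defaults to "MaxWord".
    String configuration = get(configurations, "mode", "MaxWord");
    Seg seg;
    switch (configuration) {
        case "Complex":
            seg = new ComplexSeg(dictionary);
            break;
        case "Simple":
            seg = new SimpleSeg(dictionary);
            break;
        case "MaxWord":
            seg = new MaxWordSeg(dictionary);
            break;
        default:
            // Name the offending value so a misconfiguration is easy to diagnose.
            throw new IllegalArgumentException("Unsupported mode: " + configuration);
    }
    // The empty StringReader is placeholder input; real text is supplied later.
    return new MMSeg(new StringReader(""), seg);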
}
Also used: Dictionary (com.chenlb.mmseg4j.Dictionary), SimpleSeg (com.chenlb.mmseg4j.SimpleSeg), MaxWordSeg (com.chenlb.mmseg4j.MaxWordSeg), ComplexSeg (com.chenlb.mmseg4j.ComplexSeg), Seg (com.chenlb.mmseg4j.Seg), MMSeg (com.chenlb.mmseg4j.MMSeg), StringReader (java.io.StringReader), File (java.io.File)
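
The two examples above resolve the "mode" setting differently: Example 1 falls back to max-word for any value it does not recognize, while Example 2 defaults only when the key is absent and rejects unknown values. The sketch below isolates that difference using only the standard library. The class SegModes, the enum Mode, and both method names are hypothetical stand-ins and are not part of mmseg4j or jstarcraft-nlp; it also assumes that the get(configurations, key, default) helper in Example 2 behaves like Map.getOrDefault.

```java
import java.util.Collections;
import java.util.Map;

public class SegModes {

    /** Hypothetical stand-in for the three Seg implementations. */
    enum Mode { SIMPLE, COMPLEX, MAX_WORD }

    /** Lenient lookup in the style of Example 1: unknown or missing values become max-word. */
    static Mode resolveLenient(Map<String, String> configuration) {
        String mode = configuration.get("mode");
        if ("simple".equals(mode)) {
            return Mode.SIMPLE;
        }
        if ("complex".equals(mode)) {
            return Mode.COMPLEX;
        }
        return Mode.MAX_WORD;
    }

    /** Strict lookup in the style of Example 2: a missing key defaults, an unknown value throws. */
    static Mode resolveStrict(Map<String, String> configuration) {
        // Assumption: mirrors get(configurations, "mode", "MaxWord") from Example 2.
        String mode = configuration.getOrDefault("mode", "MaxWord");
        switch (mode) {
            case "Simple":
                return Mode.SIMPLE;
            case "Complex":
                return Mode.COMPLEX;
            case "MaxWord":
                return Mode.MAX_WORD;
            default:
                throw new IllegalArgumentException("Unsupported mode: " + mode);
        }
    }

    public static void main(String[] args) {
        // A misspelled mode name shows the difference between the two strategies.
        Map<String, String> typo = Collections.singletonMap("mode", "complx");
        System.out.println(resolveLenient(typo));
        try {
            resolveStrict(typo);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With the lenient strategy a misspelled mode silently selects max-word segmentation, whereas the strict strategy surfaces the mistake at construction time. Note also that the two examples expect different spellings ("complex" versus "Complex"), so a configuration written for one factory is not directly usable with the other.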

Aggregations

ComplexSeg (com.chenlb.mmseg4j.ComplexSeg): 2
MaxWordSeg (com.chenlb.mmseg4j.MaxWordSeg): 2
Seg (com.chenlb.mmseg4j.Seg): 2
SimpleSeg (com.chenlb.mmseg4j.SimpleSeg): 2
Dictionary (com.chenlb.mmseg4j.Dictionary): 1
MMSeg (com.chenlb.mmseg4j.MMSeg): 1
File (java.io.File): 1
StringReader (java.io.StringReader): 1