Example 41 with Term

use of org.ansj.domain.Term in project ansj_seg by NLPchina.

the class NumRecognition method recognition.

/**
	 * Merges adjacent numeric terms (number + number) into a single term.
	 * 
	 * @param terms the term array to scan, modified in place
	 */
public void recognition(Term[] terms) {
    int length = terms.length - 1;
    Term from = null;
    Term to = null;
    Term temp = null;
    for (int i = 0; i < length; i++) {
        if (terms[i] == null) {
            continue;
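        } else if (".".equals(terms[i].getName()) || "\uFF0E".equals(terms[i].getName())) {
            // ASCII "." or fullwidth "．" (U+FF0E): if the terms before and after it
            // are both numbers, merge the three into one decimal number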
            to = terms[i].to();
            from = terms[i].from();
            if (from.termNatures().numAttr.flag && to.termNatures().numAttr.flag) {
                from.setName(from.getName() + "." + to.getName());
                TermUtil.termLink(from, to.to());
                terms[to.getOffe()] = null;
                terms[i] = null;
                i = from.getOffe() - 1;
            }
            continue;
        } else if (!terms[i].termNatures().numAttr.flag) {
            continue;
        }
        temp = terms[i];
        // Merge all consecutive number terms into terms[i]
        while ((temp = temp.to()).termNatures().numAttr.flag) {
            terms[i].setName(terms[i].getName() + temp.getName());
        }
        // If the next term can end a number (e.g. a quantifier), append it as well
        if (MyStaticValue.isQuantifierRecognition && temp.termNatures().numAttr.numEndFreq > 0) {
            terms[i].setName(terms[i].getName() + temp.getName());
            temp = temp.to();
        }
        // If they differ, terms[i] has been changed
        if (terms[i].to() != temp) {
            TermUtil.termLink(terms[i], temp);
            // Set the now-unused intermediate elements to null
            for (int j = i + 1; j < temp.getOffe(); j++) {
                terms[j] = null;
            }
            i = temp.getOffe() - 1;
        }
    }
}
Also used : Term(org.ansj.domain.Term)
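
The merging idea in `recognition` can be illustrated without ansj_seg. What follows is a minimal sketch on plain strings, not the library's implementation: `NumMergeSketch`, `isNumeric` and `mergeNumbers` are hypothetical names, and a token is assumed to be numeric only when it consists of ASCII digits (the real method consults `termNatures().numAttr.flag` instead).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NumMergeSketch {

    // Assumption: a token is "numeric" if it is non-empty and all ASCII digits.
    static boolean isNumeric(String s) {
        if (s.isEmpty()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                return false;
            }
        }
        return true;
    }

    // Merges runs of numeric tokens, and joins number "." number into one token.
    static List<String> mergeNumbers(List<String> tokens) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            String tok = tokens.get(i);
            if (!isNumeric(tok)) {
                out.add(tok);
                i++;
                continue;
            }
            StringBuilder sb = new StringBuilder(tok);
            i++;
            // Absorb the numeric tokens that follow directly.
            while (i < tokens.size() && isNumeric(tokens.get(i))) {
                sb.append(tokens.get(i));
                i++;
            }
            // Absorb a decimal point only when a numeric token follows it.
            if (i + 1 < tokens.size() && ".".equals(tokens.get(i)) && isNumeric(tokens.get(i + 1))) {
                sb.append('.');
                i++;
                while (i < tokens.size() && isNumeric(tokens.get(i))) {
                    sb.append(tokens.get(i));
                    i++;
                }
            }
            out.add(sb.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [price, 3.14, yuan]
        System.out.println(mergeNumbers(Arrays.asList("price", "3", ".", "1", "4", "yuan")));
    }
}
```

Unlike the original, the sketch builds a new list rather than nulling out array slots and relinking terms, which keeps the example short at the cost of losing offset information.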

Example 42 with Term

use of org.ansj.domain.Term in project ansj_seg by NLPchina.

the class UserDefineRecognition method makeNewTerm.

private void makeNewTerm() {
    // Concatenate the names of all non-null terms in [offe, endOffe]
    StringBuilder sb = new StringBuilder();
    for (int j = offe; j <= endOffe; j++) {
        if (terms[j] != null) {
            sb.append(terms[j].getName());
        }
    }
    // Build the merged term at the start offset and insert it into the term array
    TermNatures termNatures = new TermNatures(new TermNature(tempNature, tempFreq));
    Term term = new Term(sb.toString(), offe, termNatures);
    term.selfScore(-1 * tempFreq);
    TermUtil.insertTerm(terms, term, type);
}
Also used : TermNatures(org.ansj.domain.TermNatures) Term(org.ansj.domain.Term) TermNature(org.ansj.domain.TermNature)

Example 43 with Term

use of org.ansj.domain.Term in project ansj_seg by NLPchina.

the class Graph method mergerByScore.

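/**
	 * Scores every term that starts at position to against the given predecessor term.
	 * 
	 * @param fromTerm the predecessor term
	 * @param to the start position of the candidate successor terms
	 */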
private void mergerByScore(Term fromTerm, int to) {
    Term term = null;
    if (terms[to] != null) {
        term = terms[to];
        while (term != null) {
            // Relation: to.set(from)
            term.setPathSelfScore(fromTerm);
            term = term.next();
        }
    }
}
Also used : Term(org.ansj.domain.Term)

Example 44 with Term

use of org.ansj.domain.Term in project ansj_seg by NLPchina.

the class Graph method rmLittleSinglePath.

/**
	 * Removes meaningless nodes so that Viterbi has fewer paths to evaluate.
	 */
public void rmLittleSinglePath() {
    int maxTo = -1;
    Term temp = null;
    for (int i = 0; i < terms.length; i++) {
        if (terms[i] == null)
            continue;
        maxTo = terms[i].toValue();
        if (maxTo - i == 1 || i + 1 == terms.length)
            continue;
        for (int j = i; j < maxTo; j++) {
            temp = terms[j];
            if (temp != null && temp.toValue() <= maxTo && temp.getName().length() == 1) {
                terms[j] = null;
            }
        }
    }
}
Also used : Term(org.ansj.domain.Term)
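
The pruning rule in `rmLittleSinglePath` can be sketched on a plain array. This is a simplified illustration, not the library's code: `PruneSketch` and `pruneCoveredSingles` are hypothetical names, and each offset is assumed to hold at most one candidate word (the real `Term` slots are linked lists, and the end position comes from `toValue()`).

```java
import java.util.Arrays;

public class PruneSketch {

    // slots[i] holds the candidate word starting at character offset i, or null.
    // A single-character candidate is dropped when a longer candidate that
    // starts at or before it already covers the same position.
    static void pruneCoveredSingles(String[] slots) {
        for (int i = 0; i < slots.length; i++) {
            if (slots[i] == null) {
                continue;
            }
            int maxTo = i + slots[i].length(); // end offset (exclusive) of the word at i
            if (maxTo - i == 1) {
                continue; // a single-character word covers nothing else
            }
            for (int j = i; j < maxTo && j < slots.length; j++) {
                String w = slots[j];
                if (w != null && j + w.length() <= maxTo && w.length() == 1) {
                    slots[j] = null;
                }
            }
        }
    }

    public static void main(String[] args) {
        String[] slots = {"abc", "b", "cd", "d"};
        pruneCoveredSingles(slots);
        // prints [abc, null, cd, null]
        System.out.println(Arrays.toString(slots));
    }
}
```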

Example 45 with Term

use of org.ansj.domain.Term in project ansj_seg by NLPchina.

the class Graph method walkPathByScore.

public void walkPathByScore() {
    Term term = null;
    // Score the BEGIN node first
    mergerByScore(root, 0);
    // Then score forward, starting from the first word
    for (int i = 0; i < terms.length; i++) {
        term = terms[i];
        while (term != null && term.from() != null && term != end) {
            int to = term.toValue();
            mergerByScore(term, to);
            term = term.next();
        }
    }
    optimalRoot();
}
Also used : Term(org.ansj.domain.Term)
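
The forward scoring pass in `walkPathByScore` resembles a best-path search over a word lattice. The sketch below shows that general technique under stated assumptions, and is not the ansj_seg implementation: `LatticeSketch`, `Edge` and `bestPath` are hypothetical names, the costs are made up, lower cost is assumed to be better, and every edge is assumed to satisfy `0 <= from < to <= n`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class LatticeSketch {

    // A candidate word spanning character offsets [from, to) with a cost.
    static final class Edge {
        final int from;
        final int to;
        final String word;
        final double cost;

        Edge(int from, int to, String word, double cost) {
            this.from = from;
            this.to = to;
            this.word = word;
            this.cost = cost;
        }
    }

    // Returns the cheapest sequence of words covering offsets 0..n,
    // or an empty list if offset n cannot be reached.
    static List<String> bestPath(int n, List<Edge> edges) {
        double[] best = new double[n + 1];
        Edge[] back = new Edge[n + 1];
        Arrays.fill(best, Double.POSITIVE_INFINITY);
        best[0] = 0.0;
        // All edges point forward, so best[pos] is final once pos is reached.
        for (int pos = 0; pos < n; pos++) {
            if (best[pos] == Double.POSITIVE_INFINITY) {
                continue;
            }
            for (Edge e : edges) {
                if (e.from != pos) {
                    continue;
                }
                double candidate = best[pos] + e.cost;
                if (candidate < best[e.to]) {
                    best[e.to] = candidate;
                    back[e.to] = e; // remember the best predecessor edge
                }
            }
        }
        // Follow the back-pointers from the end to recover the path.
        List<String> words = new ArrayList<>();
        for (int pos = n; pos > 0 && back[pos] != null; pos = back[pos].from) {
            words.add(back[pos].word);
        }
        Collections.reverse(words);
        return words;
    }

    public static void main(String[] args) {
        List<Edge> edges = Arrays.asList(
                new Edge(0, 1, "a", 2.0), new Edge(1, 2, "b", 2.0), new Edge(0, 2, "ab", 1.5),
                new Edge(2, 4, "cd", 1.0), new Edge(2, 3, "c", 2.0), new Edge(3, 4, "d", 2.0));
        // prints [ab, cd]
        System.out.println(bestPath(4, edges));
    }
}
```

Scanning all edges for each position keeps the sketch short; an array of per-offset candidate lists, like the `terms` array in the original, avoids that repeated scan.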

Aggregations

Term (org.ansj.domain.Term) 55
ArrayList (java.util.ArrayList) 10
Result (org.ansj.domain.Result) 8
Test (org.junit.Test) 8
TermNatures (org.ansj.domain.TermNatures) 5
AsianPersonRecognition (org.ansj.recognition.arrimpl.AsianPersonRecognition) 4
ForeignPersonRecognition (org.ansj.recognition.arrimpl.ForeignPersonRecognition) 4
NumRecognition (org.ansj.recognition.arrimpl.NumRecognition) 4
Graph (org.ansj.util.Graph) 4
Forest (org.nlpcn.commons.lang.tire.domain.Forest) 4
LinkedList (java.util.LinkedList) 3
NewWord (org.ansj.domain.NewWord) 3
UserDefineRecognition (org.ansj.recognition.arrimpl.UserDefineRecognition) 3
NatureRecognition (org.ansj.recognition.impl.NatureRecognition) 3
GetWord (org.nlpcn.commons.lang.tire.GetWord) 3
BufferedReader (java.io.BufferedReader) 2
HashMap (java.util.HashMap) 2
TermNature (org.ansj.domain.TermNature) 2
ToAnalysis (org.ansj.splitWord.analysis.ToAnalysis) 2
Analyzer (org.apache.lucene.analysis.Analyzer) 2