Example 16 with Term

use of org.ansj.domain.Term in project ansj_seg by NLPchina.

the class ForeignPersonRecognition method recognition.

public void recognition(Term[] terms) {
    this.terms = terms;
    String name = null;
    Term term = null;
    reset();
    for (int i = 0; i < terms.length; i++) {
        if (terms[i] == null) {
            continue;
        }
        term = terms[i];
        // If the name would start with a person-name prefix or suffix, skip it.
        if (tempList.size() == 0) {
            if (term.termNatures().personAttr.end > 10) {
                continue;
            }
            if ((terms[i].getName().length() == 1 && ISNOTFIRST.contains(terms[i].getName().charAt(0)))) {
                continue;
            }
        }
        name = term.getName();
        if (term.termNatures() == TermNatures.NR || term.termNatures() == TermNatures.NW || name.length() == 1) {
            boolean flag = validate(name);
            if (flag) {
                tempList.add(term);
            }
        } else if (tempList.size() == 1) {
            reset();
        } else if (tempList.size() > 1) {
            TermUtil.insertTerm(terms, tempList, TermNatures.NR);
            reset();
        }
    }
}
Also used : Term(org.ansj.domain.Term)

Example 17 with Term

use of org.ansj.domain.Term in project ansj_seg by NLPchina.

the class NewWordRecognition method makeNewTerm.

private void makeNewTerm() {
    Term term = new Term(sb.toString(), offe, tempNature.natureStr, 1);
    term.selfScore(score);
    term.setNature(tempNature);
    if (sb.length() > 3) {
        term.setSubTerm(TermUtil.getSubTerm(from, to));
    }
    TermUtil.termLink(from, term);
    TermUtil.termLink(term, to);
    TermUtil.insertTerm(terms, term, InsertTermType.SCORE_ADD_SORT);
    TermUtil.parseNature(term);
}
Also used : Term(org.ansj.domain.Term)

Example 18 with Term

use of org.ansj.domain.Term in project ansj_seg by NLPchina.

the class BookRecognition method recognition.

public void recognition(Result result) {
    List<Term> terms = result.getTerms();
    String end = null;
    String name;
    LinkedList<Term> mergeList = null;
    List<Term> list = new LinkedList<Term>();
    for (Term term : terms) {
        name = term.getName();
        if (end == null) {
            if ((end = ruleMap.get(name)) != null) {
                mergeList = new LinkedList<Term>();
                mergeList.add(term);
            } else {
                list.add(term);
            }
        } else {
            mergeList.add(term);
            if (end.equals(name)) {
                Term ft = mergeList.pollFirst();
                for (Term sub : mergeList) {
                    ft.merage(sub);
                }
                ft.setNature(nature);
                list.add(ft);
                mergeList = null;
                end = null;
            }
        }
    }
    // An opening delimiter was never closed: keep its pending terms unmerged.
    if (mergeList != null) {
        list.addAll(mergeList);
    }
    result.setTerms(list);
}
Also used : Term(org.ansj.domain.Term) LinkedList(java.util.LinkedList)
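
The merging pattern in this example can be tried without the ansj_seg classes. The following is a minimal sketch on plain strings, not the project's code: the class name DelimiterMergeSketch, the method mergeBetween, and the rules map (opening delimiter to closing delimiter) are hypothetical names chosen for illustration. It assumes that tokens following an unclosed opening delimiter should be kept unmerged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class DelimiterMergeSketch {

    // Merges every run of tokens that starts with an opening delimiter and
    // ends with its matching closing delimiter into a single token.
    // `rules` maps an opening delimiter to its closing delimiter (hypothetical).
    static List<String> mergeBetween(List<String> tokens, Map<String, String> rules) {
        List<String> out = new ArrayList<>();
        List<String> pending = null; // tokens collected since the opening delimiter
        String end = null;           // closing delimiter we are waiting for

        for (String token : tokens) {
            if (end == null) {
                end = rules.get(token);
                if (end != null) {
                    // Opening delimiter found: start collecting.
                    pending = new ArrayList<>();
                    pending.add(token);
                } else {
                    out.add(token);
                }
            } else {
                pending.add(token);
                if (end.equals(token)) {
                    // Closing delimiter found: emit the merged token.
                    out.add(String.join("", pending));
                    pending = null;
                    end = null;
                }
            }
        }
        // Assumption: an opening delimiter that was never closed keeps its tokens unmerged.
        if (pending != null) {
            out.addAll(pending);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> rules = Collections.singletonMap("《", "》");
        System.out.println(mergeBetween(Arrays.asList("读", "《", "红楼梦", "》", "了"), rules));
        System.out.println(mergeBetween(Arrays.asList("读", "《", "红楼梦"), rules));
    }
}
```

With a closed pair, the delimiters and everything between them collapse into one token; with an unclosed opening delimiter, the input comes back unchanged.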

Example 19 with Term

use of org.ansj.domain.Term in project ansj_seg by NLPchina.

the class NatureRecognition method guessNature.

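/**
 * Guesses the part of speech of a word using rules.
 *
 * @param word the word to classify
 * @return the guessed TermNatures; TermNatures.NW if no rule applies
 */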
public static TermNatures guessNature(String word) {
    String nature = null;
    SmartForest<String[]> smartForest = SUFFIX_FOREST;
    int len = 0;
    for (int i = word.length() - 1; i >= 0; i--) {
        smartForest = smartForest.get(word.charAt(i));
        if (smartForest == null) {
            break;
        }
        len++;
        if (smartForest.getStatus() == 2) {
            nature = smartForest.getParam()[0];
        } else if (smartForest.getStatus() == 3) {
            nature = smartForest.getParam()[0];
            break;
        }
    }
    if ("nt".equals(nature) && (len > 1 || word.length() > 3)) {
        return TermNatures.NT;
    } else if ("ns".equals(nature)) {
        return TermNatures.NS;
    } else if (word.length() < 5) {
        Result parse = ToAnalysis.parse(word);
        for (Term term : parse.getTerms()) {
            if ("nr".equals(term.getNatureStr())) {
                return TermNatures.NR;
            }
        }
    } else if (ForeignPersonRecognition.isFName(word)) {
        return TermNatures.NRF;
    }
    return TermNatures.NW;
}
Also used : Term(org.ansj.domain.Term) Result(org.ansj.domain.Result)
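
The suffix lookup at the start of guessNature walks the word from its last character backwards. The sketch below illustrates the idea with a plain Map instead of the SmartForest trie, so it is a simplification rather than the library's algorithm: it checks every suffix instead of stopping when the trie path ends. The class SuffixGuessSketch, the method guessBySuffix, and the sample rule table are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class SuffixGuessSketch {

    // Returns the tag of the longest suffix of `word` found in `suffixTags`,
    // or null when no suffix matches.
    static String guessBySuffix(String word, Map<String, String> suffixTags) {
        String tag = null;
        // Grow the suffix one character at a time, starting from the last character.
        for (int i = word.length() - 1; i >= 0; i--) {
            String candidate = suffixTags.get(word.substring(i));
            if (candidate != null) {
                tag = candidate; // a longer matching suffix overrides a shorter one
            }
        }
        return tag;
    }

    public static void main(String[] args) {
        // Hypothetical rule table: suffix -> part-of-speech tag.
        Map<String, String> suffixTags = new HashMap<>();
        suffixTags.put("公司", "nt");
        suffixTags.put("市", "ns");

        System.out.println(guessBySuffix("北京市", suffixTags));
        System.out.println(guessBySuffix("阿里巴巴公司", suffixTags));
        System.out.println(guessBySuffix("你好", suffixTags));
    }
}
```

A trie such as SmartForest avoids building a substring at every step, which is presumably why the original uses one; the Map version trades that efficiency for brevity.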

Example 20 with Term

use of org.ansj.domain.Term in project ansj_seg by NLPchina.

the class NatureRecognition method recognition.

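/**
 * Tags each word in the given list with its part of speech.
 *
 * @param words the words to tag
 * @param offe the offset of the first word
 * @return the tagged terms
 */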
public List<Term> recognition(List<String> words, int offe) {
    List<Term> terms = new ArrayList<Term>(words.size());
    int tempOffe = 0;
    for (String word : words) {
        TermNatures tn = getTermNatures(word);
        terms.add(new Term(word, offe + tempOffe, tn));
        tempOffe += word.length();
    }
    new NatureRecognition().recognition(new Result(terms));
    return terms;
}
Also used : TermNatures(org.ansj.domain.TermNatures) ArrayList(java.util.ArrayList) Term(org.ansj.domain.Term) Result(org.ansj.domain.Result)
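
The offset bookkeeping in this method (offe + tempOffe, with tempOffe advanced by each word's length) can be checked in isolation. This is a small sketch under the assumption that the words are contiguous in the source text; the class OffsetSketch and the method offsets are hypothetical names, not part of ansj_seg.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OffsetSketch {

    // Computes the start offset of each word, given the offset of the first
    // word and assuming the words follow one another without gaps.
    static List<Integer> offsets(List<String> words, int base) {
        List<Integer> result = new ArrayList<>(words.size());
        int running = 0; // total length of the words seen so far
        for (String word : words) {
            result.add(base + running);
            running += word.length();
        }
        return result;
    }

    public static void main(String[] args) {
        // Word lengths 1, 2, 2 starting at offset 10.
        System.out.println(offsets(Arrays.asList("我", "喜欢", "分词"), 10)); // prints [10, 11, 13]
    }
}
```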

Aggregations

Term (org.ansj.domain.Term): 55
ArrayList (java.util.ArrayList): 10
Result (org.ansj.domain.Result): 8
Test (org.junit.Test): 8
TermNatures (org.ansj.domain.TermNatures): 5
AsianPersonRecognition (org.ansj.recognition.arrimpl.AsianPersonRecognition): 4
ForeignPersonRecognition (org.ansj.recognition.arrimpl.ForeignPersonRecognition): 4
NumRecognition (org.ansj.recognition.arrimpl.NumRecognition): 4
Graph (org.ansj.util.Graph): 4
Forest (org.nlpcn.commons.lang.tire.domain.Forest): 4
LinkedList (java.util.LinkedList): 3
NewWord (org.ansj.domain.NewWord): 3
UserDefineRecognition (org.ansj.recognition.arrimpl.UserDefineRecognition): 3
NatureRecognition (org.ansj.recognition.impl.NatureRecognition): 3
GetWord (org.nlpcn.commons.lang.tire.GetWord): 3
BufferedReader (java.io.BufferedReader): 2
HashMap (java.util.HashMap): 2
TermNature (org.ansj.domain.TermNature): 2
ToAnalysis (org.ansj.splitWord.analysis.ToAnalysis): 2
Analyzer (org.apache.lucene.analysis.Analyzer): 2