Example 1 with Data

Use of edu.illinois.cs.cogcomp.ner.LbjTagger.Data in project cogcomp-nlp by CogComp.

From the class ReferenceUtils, method createNerDataStructuresForText.

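// Converts a TextAnnotation into the Data structure used by the NER tagger:
// one LinkedVector of NEWord objects per non-empty sentence.
public Data createNerDataStructuresForText(TextAnnotation ta) {
    ArrayList<LinkedVector> sentences = new ArrayList<>();
    String[] tokens = ta.getTokens();
    // Maps each NEWord position to its token index in the TextAnnotation.
    // It is filled in below but not read again within this method.
    int[] tokenindices = new int[tokens.length];
    int tokenIndex = 0;
    int neWordIndex = 0;
    for (int i = 0; i < ta.getNumberOfSentences(); i++) {
        Sentence sentence = ta.getSentence(i);
        String[] wtoks = sentence.getTokens();
        LinkedVector words = new LinkedVector();
        for (String w : wtoks) {
            if (w.length() > 0) {
                // Add the token with the placeholder label "unlabeled".
                NEWord.addTokenToSentence(words, w, "unlabeled");
                tokenindices[neWordIndex] = tokenIndex;
                neWordIndex++;
            } else {
                // Empty tokens are treated as a hard error rather than skipped.
                throw new IllegalStateException("Bad (zero length) token.");
            }
            tokenIndex++;
        }
        // Skip sentences that produced no words.
        if (words.size() > 0)
            sentences.add(words);
    }
    // Wrap the sentences in an NERDocument and build the Data container.
    Data data = new Data(new NERDocument(sentences, "input"));
    return data;
}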
Also used : LinkedVector(edu.illinois.cs.cogcomp.lbjava.parse.LinkedVector) ArrayList(java.util.ArrayList) Data(edu.illinois.cs.cogcomp.ner.LbjTagger.Data) NERDocument(edu.illinois.cs.cogcomp.ner.LbjTagger.NERDocument) Sentence(edu.illinois.cs.cogcomp.core.datastructures.textannotation.Sentence)
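
The loop structure of the method above can be tried out without the CogComp libraries. The following is a minimal sketch under stated assumptions: plain Java lists stand in for LinkedVector and NEWord, and the names TokenIndexSketch, Result and build are hypothetical, not part of cogcomp-nlp. It shows only the bookkeeping: empty sentences are skipped, zero-length tokens raise an exception, and each kept word records its global token index.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the sentence/token bookkeeping; not CogComp code.
class TokenIndexSketch {

    // Holds the non-empty sentences and, for each kept word, its global token index.
    static final class Result {
        final List<List<String>> sentences = new ArrayList<>();
        final List<Integer> tokenIndices = new ArrayList<>();
    }

    // sentenceTokens[i] plays the role of ta.getSentence(i).getTokens().
    static Result build(String[][] sentenceTokens) {
        Result result = new Result();
        int tokenIndex = 0;
        for (String[] tokens : sentenceTokens) {
            List<String> words = new ArrayList<>();
            for (String token : tokens) {
                if (token.isEmpty()) {
                    // Same policy as the original: an empty token is a hard error.
                    throw new IllegalStateException("Bad (zero length) token.");
                }
                words.add(token);
                result.tokenIndices.add(tokenIndex);
                tokenIndex++;
            }
            // Sentences without words are not added to the result.
            if (!words.isEmpty()) {
                result.sentences.add(words);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[][] input = {
            {"John", "lives", "in", "Urbana", "."},
            {},
            {"He", "works", "there", "."}
        };
        Result r = build(input);
        System.out.println(r.sentences.size()); // prints 2
        System.out.println(r.tokenIndices);     // prints [0, 1, 2, 3, 4, 5, 6, 7, 8]
    }
}
```

Because a zero-length token throws instead of being skipped, the word position and the token index always advance together in this sketch, so the recorded indices are simply 0, 1, 2, and so on.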

Aggregations

Sentence (edu.illinois.cs.cogcomp.core.datastructures.textannotation.Sentence): 1
LinkedVector (edu.illinois.cs.cogcomp.lbjava.parse.LinkedVector): 1
Data (edu.illinois.cs.cogcomp.ner.LbjTagger.Data): 1
NERDocument (edu.illinois.cs.cogcomp.ner.LbjTagger.NERDocument): 1
ArrayList (java.util.ArrayList): 1