Example 1 with Preprocessor

Use of edu.illinois.cs.cogcomp.depparse.io.Preprocessor in project cogcomp-nlp by CogComp.

From the class MainClass, method getStructuredData.

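// Reads dependency instances from the file at filepath and collects them into an SLProblem.
// useGoldPOS, conllIndexOffset, logger and getSLPair are not defined in this excerpt;
// they are presumably static members of the enclosing MainClass.
private static SLProblem getStructuredData(String filepath, LabeledChuLiuEdmondsDecoder infSolver) throws Exception {
    CONLLReader depReader = new CONLLReader(new Preprocessor(), useGoldPOS, conllIndexOffset);
    depReader.startReading(filepath);
    SLProblem problem = new SLProblem();
    // The loop treats a null return from getNext() as the end of the input.
    DepInst instance = depReader.getNext();
    while (instance != null) {
        infSolver.updateInferenceSolver(instance);
        // Convert the instance into an (input, gold structure) pair and add it as a training example.
        Pair<IInstance, IStructure> pair = getSLPair(instance);
        problem.addExample(pair.getFirst(), pair.getSecond());
        instance = depReader.getNext();
    }
    logger.info("{} of dependency instances.", problem.size());
    return problem;
}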
Also used: Preprocessor (edu.illinois.cs.cogcomp.depparse.io.Preprocessor), DepInst (edu.illinois.cs.cogcomp.depparse.core.DepInst), CONLLReader (edu.illinois.cs.cogcomp.depparse.io.CONLLReader)

Example 2 with Preprocessor

Use of edu.illinois.cs.cogcomp.depparse.io.Preprocessor in project cogcomp-nlp by CogComp.

From the class MainClass, method annotate.

private static void annotate(String filepath) throws IOException {
    DepAnnotator annotator = new DepAnnotator();
    TextAnnotationBuilder taBuilder = new TokenizerTextAnnotationBuilder(new StatefulTokenizer(true));
    Preprocessor preprocessor = new Preprocessor();
    // Files.lines keeps the file open until the stream is closed, so use try-with-resources
    // (requires an import of java.util.stream.Stream).
    try (Stream<String> lines = Files.lines(Paths.get(filepath))) {
        lines.forEach(line -> {
            TextAnnotation ta = taBuilder.createTextAnnotation(line);
            try {
                preprocessor.annotate(ta);
                annotator.addView(ta);
                System.out.println(ta.getView(annotator.getViewName()).toString());
            } catch (AnnotatorException e) {
                // A failure on one line is reported and the remaining lines are still processed.
                e.printStackTrace();
            }
        });
    }
}
Also used: TextAnnotationBuilder (edu.illinois.cs.cogcomp.annotation.TextAnnotationBuilder), TokenizerTextAnnotationBuilder (edu.illinois.cs.cogcomp.nlp.utility.TokenizerTextAnnotationBuilder), StatefulTokenizer (edu.illinois.cs.cogcomp.nlp.tokenizer.StatefulTokenizer), AnnotatorException (edu.illinois.cs.cogcomp.annotation.AnnotatorException), Preprocessor (edu.illinois.cs.cogcomp.depparse.io.Preprocessor), TextAnnotation (edu.illinois.cs.cogcomp.core.datastructures.textannotation.TextAnnotation)
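
The annotate method above processes a file one line at a time through Files.lines. The stream that Files.lines returns holds the file open until it is closed, so it is usually wrapped in try-with-resources. The following is a minimal sketch of that pattern using only the JDK. The class name LineAnnotationSketch, the method annotateLines, and the annotateLine function parameter are hypothetical stand-ins for the per-line annotation step; none of them are part of cogcomp-nlp.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Stream;

// Hypothetical sketch: not part of cogcomp-nlp.
class LineAnnotationSketch {

    // Applies annotateLine (a stand-in for the real annotation step) to each line
    // of the file, in order, and collects the results.
    static List<String> annotateLines(Path file, Function<String, String> annotateLine) throws IOException {
        List<String> results = new ArrayList<>();
        // try-with-resources closes the underlying file even if annotateLine throws.
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            lines.forEach(line -> results.add(annotateLine.apply(line)));
        }
        return results;
    }

    public static void main(String[] args) throws IOException {
        // Write two sample sentences to a temporary file, one per line.
        Path tmp = Files.createTempFile("sentences", ".txt");
        Files.write(tmp, Arrays.asList("First sentence .", "Second sentence ."), StandardCharsets.UTF_8);
        try {
            List<String> out = annotateLines(tmp, line -> "annotated: " + line);
            out.forEach(System.out::println);
        } finally {
            Files.delete(tmp);
        }
    }
}
```

Collecting the results in a list rather than printing inside the lambda keeps the per-line step easy to test; the original method prints each view directly instead.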

Aggregations

Preprocessor (edu.illinois.cs.cogcomp.depparse.io.Preprocessor): 2
AnnotatorException (edu.illinois.cs.cogcomp.annotation.AnnotatorException): 1
TextAnnotationBuilder (edu.illinois.cs.cogcomp.annotation.TextAnnotationBuilder): 1
TextAnnotation (edu.illinois.cs.cogcomp.core.datastructures.textannotation.TextAnnotation): 1
DepInst (edu.illinois.cs.cogcomp.depparse.core.DepInst): 1
CONLLReader (edu.illinois.cs.cogcomp.depparse.io.CONLLReader): 1
StatefulTokenizer (edu.illinois.cs.cogcomp.nlp.tokenizer.StatefulTokenizer): 1
TokenizerTextAnnotationBuilder (edu.illinois.cs.cogcomp.nlp.utility.TokenizerTextAnnotationBuilder): 1