Example 16 with CollectionReader

use of org.apache.uima.collection.CollectionReader in project dkpro-tc by dkpro.

the class LinewiseTextOutcomeReaderTest method testReader.

@Test
public void testReader() throws Exception {
    CollectionReader reader = CollectionReaderFactory.createReader(
            LinewiseTextOutcomeReader.class,
            LinewiseTextOutcomeReader.PARAM_TEXT_INDEX, 2,
            LinewiseTextOutcomeReader.PARAM_OUTCOME_INDEX, 1,
            LinewiseTextOutcomeReader.PARAM_SOURCE_LOCATION, "src/test/resources/semEval2017Task4/",
            LinewiseTextOutcomeReader.PARAM_PATTERNS, "*.txt");
    List<String> readDocumentSpans = new ArrayList<>();
    List<String> readOutcomes = new ArrayList<>();
    while (reader.hasNext()) {
        JCas emptyCas = JCasFactory.createJCas();
        reader.getNext(emptyCas.getCas());
        readDocumentSpans.add(JCasUtil.selectSingle(emptyCas, TextClassificationTarget.class).getCoveredText());
        readOutcomes.add(JCasUtil.selectSingle(emptyCas, TextClassificationOutcome.class).getOutcome());
    }
    assertEquals(15, readDocumentSpans.size());
    assertEquals(15, readOutcomes.size());
}
Also used : CollectionReader(org.apache.uima.collection.CollectionReader) ArrayList(java.util.ArrayList) JCas(org.apache.uima.jcas.JCas) Test(org.junit.Test)

Example 17 with CollectionReader

use of org.apache.uima.collection.CollectionReader in project dkpro-tc by dkpro.

the class TestFoldUtil method countNumberOfTextClassificationSequencesAndUnitsPerCas.

private List<List<Integer>> countNumberOfTextClassificationSequencesAndUnitsPerCas(List<File> writtenBins) throws Exception {
    List<List<Integer>> arrayList = new ArrayList<>();
    List<Integer> units = new ArrayList<>();
    List<Integer> seq = new ArrayList<>();
    for (File f : writtenBins) {
        JCas jcas = JCasFactory.createJCas();
        CollectionReader reader = createReader(jcas, f);
        // Only the first CAS of each bin file is read and counted.
        reader.getNext(jcas.getCas());
        Collection<TextClassificationTarget> colUni = JCasUtil.select(jcas, TextClassificationTarget.class);
        units.add(colUni.size());
        Collection<TextClassificationSequence> colSeq = JCasUtil.select(jcas, TextClassificationSequence.class);
        seq.add(colSeq.size());
    }
    arrayList.add(seq);
    arrayList.add(units);
    return arrayList;
}
Also used : CollectionReader(org.apache.uima.collection.CollectionReader) ArrayList(java.util.ArrayList) TextClassificationTarget(org.dkpro.tc.api.type.TextClassificationTarget) JCas(org.apache.uima.jcas.JCas) List(java.util.List) TextClassificationSequence(org.dkpro.tc.api.type.TextClassificationSequence) File(java.io.File)

Example 18 with CollectionReader

use of org.apache.uima.collection.CollectionReader in project webanno by webanno.

the class CasToBratJsonTest method testGenerateBratJsonGetDocument.

/**
 * Generates brat JSON data for the document and compares it with the expected output file.
 */
@Test
public void testGenerateBratJsonGetDocument() throws Exception {
    MappingJackson2HttpMessageConverter jsonConverter = new MappingJackson2HttpMessageConverter();
    String jsonFilePath = "target/test-output/output_cas_to_json_document.json";
    String file = "src/test/resources/tcf04-karin-wl.xml";
    CAS cas = JCasFactory.createJCas().getCas();
    CollectionReader reader = CollectionReaderFactory.createReader(TcfReader.class, TcfReader.PARAM_SOURCE_LOCATION, file);
    reader.getNext(cas);
    JCas jCas = cas.getJCas();
    AnnotatorState state = new AnnotatorStateImpl(Mode.ANNOTATION);
    state.getPreferences().setWindowSize(10);
    state.setFirstVisibleUnit(WebAnnoCasUtil.getFirstSentence(jCas));
    state.setProject(project);
    VDocument vdoc = new VDocument();
    preRenderer.render(vdoc, state, jCas, annotationSchemaService.listAnnotationLayer(project));
    GetDocumentResponse response = new GetDocumentResponse();
    BratRenderer.render(response, state, vdoc, jCas, annotationSchemaService);
    JSONUtil.generatePrettyJson(jsonConverter, response, new File(jsonFilePath));
    // AssertJ convention: the actual value goes in assertThat(), the expected value in isEqualTo().
    assertThat(linesOf(new File(jsonFilePath), "UTF-8")).isEqualTo(linesOf(new File("src/test/resources/output_cas_to_json_document_expected.json"), "UTF-8"));
}
Also used : MappingJackson2HttpMessageConverter(org.springframework.http.converter.json.MappingJackson2HttpMessageConverter) CollectionReader(org.apache.uima.collection.CollectionReader) GetDocumentResponse(de.tudarmstadt.ukp.clarin.webanno.brat.message.GetDocumentResponse) CAS(org.apache.uima.cas.CAS) VDocument(de.tudarmstadt.ukp.clarin.webanno.api.annotation.rendering.model.VDocument) AnnotatorStateImpl(de.tudarmstadt.ukp.clarin.webanno.api.annotation.model.AnnotatorStateImpl) AnnotatorState(de.tudarmstadt.ukp.clarin.webanno.api.annotation.model.AnnotatorState) JCas(org.apache.uima.jcas.JCas) File(java.io.File) Test(org.junit.Test)

Example 19 with CollectionReader

use of org.apache.uima.collection.CollectionReader in project webanno by webanno.

the class DiffUtils method readXMI.

public static JCas readXMI(String aPath, TypeSystemDescription aType) throws UIMAException, IOException {
    CollectionReader reader = createReader(XmiReader.class, XmiReader.PARAM_SOURCE_LOCATION, "src/test/resources/" + aPath);
    JCas jcas;
    if (aType != null) {
        TypeSystemDescription builtInTypes = TypeSystemDescriptionFactory.createTypeSystemDescription();
        List<TypeSystemDescription> allTypes = new ArrayList<>();
        allTypes.add(builtInTypes);
        allTypes.add(aType);
        jcas = JCasFactory.createJCas(CasCreationUtils.mergeTypeSystems(allTypes));
    } else {
        jcas = JCasFactory.createJCas();
    }
    reader.getNext(jcas.getCas());
    return jcas;
}
Also used : CollectionReader(org.apache.uima.collection.CollectionReader) TypeSystemDescription(org.apache.uima.resource.metadata.TypeSystemDescription) ArrayList(java.util.ArrayList) JCas(org.apache.uima.jcas.JCas)

Example 20 with CollectionReader

use of org.apache.uima.collection.CollectionReader in project webanno by webanno.

the class LineOrientedTextReaderTest method test.

@Test
public void test() throws Exception {
    JCas doc = JCasFactory.createJCas();
    CollectionReader reader = createReader(LineOrientedTextReader.class, LineOrientedTextReader.PARAM_SOURCE_LOCATION, "LICENSE.txt");
    reader.getNext(doc.getCas());
    // select(doc, Sentence.class).forEach(s -> System.out.println(s.getCoveredText()));
    assertEquals(169, select(doc, Sentence.class).size());
    assertEquals(0, select(doc, Token.class).size());
}
Also used : CollectionReader(org.apache.uima.collection.CollectionReader) JCas(org.apache.uima.jcas.JCas) Test(org.junit.Test)

Aggregations

CollectionReader (org.apache.uima.collection.CollectionReader): 35
JCas (org.apache.uima.jcas.JCas): 28
ArrayList (java.util.ArrayList): 25
TextClassificationOutcome (org.dkpro.tc.api.type.TextClassificationOutcome): 15
AnalysisEngine (org.apache.uima.analysis_engine.AnalysisEngine): 14
Test (org.junit.Test): 13
Token (de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Token): 8
CAS (org.apache.uima.cas.CAS): 7
File (java.io.File): 5
TextClassificationTarget (org.dkpro.tc.api.type.TextClassificationTarget): 5
List (java.util.List): 4
AnalysisEngineDescription (org.apache.uima.analysis_engine.AnalysisEngineDescription): 4
TextClassificationSequence (org.dkpro.tc.api.type.TextClassificationSequence): 4
POS (de.tudarmstadt.ukp.dkpro.core.api.lexmorph.type.pos.POS): 3
Lemma (de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Lemma): 3
TypeSystemDescription (org.apache.uima.resource.metadata.TypeSystemDescription): 3
Evaluator (de.tudarmstadt.ukp.clarin.webanno.constraints.evaluator.Evaluator): 2
PossibleValue (de.tudarmstadt.ukp.clarin.webanno.constraints.evaluator.PossibleValue): 2
ValuesGenerator (de.tudarmstadt.ukp.clarin.webanno.constraints.evaluator.ValuesGenerator): 2
ConstraintsGrammar (de.tudarmstadt.ukp.clarin.webanno.constraints.grammar.ConstraintsGrammar): 2