Example 1 with JCasId

Use of org.dkpro.tc.api.type.JCasId in project dkpro-tc by dkpro.

In the class JCasIdSetter, method process:

@Override
public void process(JCas arg0) throws AnalysisEngineProcessException {
    boolean exists = JCasUtil.exists(arg0, JCasId.class);
    if (!exists) {
        JCasId id = new JCasId(arg0);
        id.setId(jcasId++);
        id.addToIndexes();
    }
}
Also used : JCasId(org.dkpro.tc.api.type.JCasId)
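
The guard in process assigns an id only when the CAS does not already carry one, so running the annotator twice on the same CAS leaves the first id in place. A minimal plain-Java sketch of that assign-once pattern follows. It does not use UIMA: a Map stands in for the CAS index, and IdAssigner and its process method are hypothetical names chosen for illustration, not part of dkpro-tc.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java stand-in for the "assign an id only if none exists" pattern.
// The Map plays the role of the CAS index; IdAssigner is a hypothetical name.
class IdAssigner {
    private int nextId = 0;
    private final Map<String, Integer> ids = new HashMap<>();

    // Returns the id for a document, assigning a fresh one on first sight only.
    int process(String documentKey) {
        Integer existing = ids.get(documentKey);
        if (existing == null) {
            existing = nextId++;
            ids.put(documentKey, existing);
        }
        return existing;
    }

    public static void main(String[] args) {
        IdAssigner assigner = new IdAssigner();
        System.out.println(assigner.process("doc-a")); // first call assigns an id
        System.out.println(assigner.process("doc-a")); // second call reuses it
        System.out.println(assigner.process("doc-b")); // new document, next id
    }
}
```

The counter advances only when a new id is actually handed out, which matches the placement of jcasId++ inside the if block above.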

Example 2 with JCasId

Use of org.dkpro.tc.api.type.JCasId in project dkpro-tc by dkpro.

In the class TestReaderSentenceToDocument, method getNext:

@Override
public void getNext(JCas aJCas) throws IOException, CollectionException {
    // setting the document text
    aJCas.setDocumentText(texts.get(offset));
    aJCas.setDocumentLanguage(LANGUAGE_CODE);
    // as we are creating more than one CAS out of a single file, we need to have different
    // document titles and URIs for each CAS
    // otherwise, serialized CASes will be overwritten
    DocumentMetaData dmd = DocumentMetaData.create(aJCas);
    dmd.setDocumentTitle("Sentence" + offset);
    dmd.setDocumentUri("Sentence" + offset);
    dmd.setDocumentId(String.valueOf(offset));
    JCasId id = new JCasId(aJCas);
    id.setId(jcasId);
    id.addToIndexes();
    // setting the outcome / label for this document
    TextClassificationOutcome outcome = new TextClassificationOutcome(aJCas);
    outcome.setOutcome(getTextClassificationOutcome(aJCas));
    outcome.addToIndexes();
    new TextClassificationTarget(aJCas, 0, aJCas.getDocumentText().length()).addToIndexes();
    offset++;
}
Also used : JCasId(org.dkpro.tc.api.type.JCasId) TextClassificationOutcome(org.dkpro.tc.api.type.TextClassificationOutcome) TextClassificationTarget(org.dkpro.tc.api.type.TextClassificationTarget) DocumentMetaData(de.tudarmstadt.ukp.dkpro.core.api.metadata.type.DocumentMetaData)

Example 3 with JCasId

Use of org.dkpro.tc.api.type.JCasId in project dkpro-tc by dkpro.

In the class SharedNounChunksTest, method setUp:

@Before
public void setUp() throws ResourceInitializationException, AnalysisEngineProcessException {
    AnalysisEngineDescription desc = createEngineDescription(BreakIteratorSegmenter.class);
    AnalysisEngine engine = createEngine(desc);
    jcas1 = engine.newJCas();
    jcas1.setDocumentLanguage("en");
    jcas1.setDocumentText("This is the text of view 1");
    JCasId id = new JCasId(jcas1);
    id.setId(jcasId++);
    id.addToIndexes();
    engine.process(jcas1);
    jcas2 = engine.newJCas();
    jcas2.setDocumentLanguage("en");
    jcas2.setDocumentText("This is the text of view 2");
    id = new JCasId(jcas2);
    id.setId(jcasId++);
    id.addToIndexes();
    engine.process(jcas2);
}
Also used : JCasId(org.dkpro.tc.api.type.JCasId) AnalysisEngineDescription(org.apache.uima.analysis_engine.AnalysisEngineDescription) AnalysisEngine(org.apache.uima.analysis_engine.AnalysisEngine) Before(org.junit.Before)

Example 4 with JCasId

Use of org.dkpro.tc.api.type.JCasId in project dkpro-tc by dkpro.

In the class TestReaderSingleLabelUnitReader, method getNext:

@Override
public void getNext(CAS aCAS) throws IOException, CollectionException {
    super.getNext(aCAS);
    JCas jcas;
    try {
        jcas = aCAS.getJCas();
        JCasId id = new JCasId(jcas);
        id.setId(jcasId++);
        id.addToIndexes();
    } catch (CASException e) {
        // keep the original CASException as the cause instead of dropping it
        throw new CollectionException(e);
    }
    String documentText = aCAS.getDocumentText();
    int s = 0;
    for (String t : documentText.split(" ")) {
        int e = documentText.indexOf(t, s) + t.length();
        new TextClassificationTarget(jcas, s, e).addToIndexes();
        new TextClassificationOutcome(jcas, s, e).addToIndexes();
        // continue after the current token and its separating space
        s = e + 1;
    }
}
Also used : JCasId(org.dkpro.tc.api.type.JCasId) CollectionException(org.apache.uima.collection.CollectionException) TextClassificationOutcome(org.dkpro.tc.api.type.TextClassificationOutcome) TextClassificationTarget(org.dkpro.tc.api.type.TextClassificationTarget) JCas(org.apache.uima.jcas.JCas) CASException(org.apache.uima.cas.CASException)
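
The offset bookkeeping in getNext can be checked in isolation with plain strings. The following is a minimal sketch, assuming the intent is one [begin, end) span per space-separated token; TokenOffsets and spans are hypothetical names, not part of dkpro-tc, and the int[] pairs merely stand in for the annotation offsets.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-alone sketch: compute [begin, end) character offsets for
// space-separated tokens. TokenOffsets and spans are hypothetical names.
class TokenOffsets {

    static List<int[]> spans(String text) {
        List<int[]> result = new ArrayList<>();
        int searchFrom = 0;
        for (String token : text.split(" ")) {
            // search at or after the end of the previous token, so an earlier
            // occurrence of the same characters is not matched by mistake
            int begin = text.indexOf(token, searchFrom);
            int end = begin + token.length();
            result.add(new int[] { begin, end });
            searchFrom = end;
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "This is a test";
        for (int[] span : spans(text)) {
            System.out.println(span[0] + "-" + span[1] + " "
                    + text.substring(span[0], span[1]));
        }
    }
}
```

Passing the end of the previous token as the search start matters here: in "This is a test" the token "is" also occurs inside "This", and a search from index 0 would report that earlier position.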

Example 5 with JCasId

Use of org.dkpro.tc.api.type.JCasId in project dkpro-tc by dkpro.

In the class PairFeatureTestBase, method runExtractor:

public Set<Feature> runExtractor(AnalysisEngine engine, PairFeatureExtractor extractor) throws ResourceInitializationException, TextClassificationException, AnalysisEngineProcessException {
    JCas jcas1 = engine.newJCas();
    jcas1.setDocumentLanguage("en");
    jcas1.setDocumentText("This is the text of view 1");
    engine.process(jcas1);
    JCasId id = new JCasId(jcas1);
    id.setId(jcasId++);
    id.addToIndexes();
    JCas jcas2 = engine.newJCas();
    jcas2.setDocumentLanguage("en");
    jcas2.setDocumentText("This is the text of view 2");
    engine.process(jcas2);
    id = new JCasId(jcas2);
    id.setId(jcasId++);
    id.addToIndexes();
    return extractor.extract(jcas1, jcas2);
}
Also used : JCasId(org.dkpro.tc.api.type.JCasId) JCas(org.apache.uima.jcas.JCas)

Aggregations

JCasId (org.dkpro.tc.api.type.JCasId): 24
TextClassificationTarget (org.dkpro.tc.api.type.TextClassificationTarget): 11
JCas (org.apache.uima.jcas.JCas): 9
TextClassificationOutcome (org.dkpro.tc.api.type.TextClassificationOutcome): 8
CASException (org.apache.uima.cas.CASException): 6
FeatureExtractorResource_ImplBase (org.dkpro.tc.api.features.FeatureExtractorResource_ImplBase): 6
CollectionException (org.apache.uima.collection.CollectionException): 5
ArrayList (java.util.ArrayList): 4
TextClassificationException (org.dkpro.tc.api.exception.TextClassificationException): 4
DocumentMetaData (de.tudarmstadt.ukp.dkpro.core.api.metadata.type.DocumentMetaData): 3
AnalysisEngine (org.apache.uima.analysis_engine.AnalysisEngine): 3
AnalysisEngineProcessException (org.apache.uima.analysis_engine.AnalysisEngineProcessException): 3
Instance (org.dkpro.tc.api.features.Instance): 3
PairFeatureExtractor (org.dkpro.tc.api.features.PairFeatureExtractor): 3
IOException (java.io.IOException): 2
Feature (org.dkpro.tc.api.features.Feature): 2
FeatureExtractor (org.dkpro.tc.api.features.FeatureExtractor): 2
TextClassificationSequence (org.dkpro.tc.api.type.TextClassificationSequence): 2
AnalysisEngineDescription (org.apache.uima.analysis_engine.AnalysisEngineDescription): 1
AnnotationFS (org.apache.uima.cas.text.AnnotationFS): 1