Example 31 with TextClassificationTarget

Use of org.dkpro.tc.api.type.TextClassificationTarget in project dkpro-tc by dkpro.

Class InstanceExtractor, method getSingleInstanceDocument.

private Instance getSingleInstanceDocument(Instance instance, JCas jcas, boolean supportSparseFeatures) throws TextClassificationException {
    int jcasId = JCasUtil.selectSingle(jcas, JCasId.class).getId();
    TextClassificationTarget documentTcu = JCasUtil.selectSingle(jcas, TextClassificationTarget.class);
    if (addInstanceId) {
        instance.addFeature(InstanceIdFeature.retrieve(jcas));
    }
    for (FeatureExtractorResource_ImplBase featExt : featureExtractors) {
        if (!(featExt instanceof FeatureExtractor)) {
            throw new TextClassificationException("Using incompatible feature in document mode: " + featExt.getResourceName());
        }
        if (supportSparseFeatures) {
            instance.addFeatures(getSparse(jcas, documentTcu, featExt));
        } else {
            instance.addFeatures(getDense(jcas, documentTcu, featExt));
        }
        instance.setOutcomes(getOutcomes(jcas, null));
        instance.setWeight(getWeight(jcas, null));
        instance.setJcasId(jcasId);
    }
    return instance;
}
Also used : JCasId(org.dkpro.tc.api.type.JCasId) FeatureExtractor(org.dkpro.tc.api.features.FeatureExtractor) PairFeatureExtractor(org.dkpro.tc.api.features.PairFeatureExtractor) TextClassificationException(org.dkpro.tc.api.exception.TextClassificationException) TextClassificationTarget(org.dkpro.tc.api.type.TextClassificationTarget) FeatureExtractorResource_ImplBase(org.dkpro.tc.api.features.FeatureExtractorResource_ImplBase)

Example 32 with TextClassificationTarget

Use of org.dkpro.tc.api.type.TextClassificationTarget in project dkpro-tc by dkpro.

Class InstanceExtractor, method getSequenceInstances.

public List<Instance> getSequenceInstances(JCas jcas, boolean useSparse) throws TextClassificationException {
    List<Instance> instances = new ArrayList<Instance>();
    int jcasId = JCasUtil.selectSingle(jcas, JCasId.class).getId();
    int sequenceId = 0;
    int unitId = 0;
    Collection<TextClassificationSequence> sequences = JCasUtil.select(jcas, TextClassificationSequence.class);
    for (TextClassificationSequence seq : sequences) {
        unitId = 0;
        List<TextClassificationTarget> seqTargets = JCasUtil.selectCovered(jcas, TextClassificationTarget.class, seq);
        for (TextClassificationTarget aTarget : seqTargets) {
            aTarget.setId(unitId++);
            Instance instance = new Instance();
            if (addInstanceId) {
                instance.addFeature(InstanceIdFeature.retrieve(jcas, aTarget, sequenceId));
            }
            for (FeatureExtractorResource_ImplBase featExt : featureExtractors) {
                if (useSparse) {
                    instance.addFeatures(getSparse(jcas, aTarget, featExt));
                } else {
                    instance.addFeatures(getDense(jcas, aTarget, featExt));
                }
            }
            // set and write outcome label(s)
            instance.setOutcomes(getOutcomes(jcas, aTarget));
            instance.setWeight(getWeight(jcas, aTarget));
            instance.setJcasId(jcasId);
            instance.setSequenceId(sequenceId);
            instance.setSequencePosition(aTarget.getId());
            instances.add(instance);
        }
        sequenceId++;
    }
    return instances;
}
Also used : JCasId(org.dkpro.tc.api.type.JCasId) Instance(org.dkpro.tc.api.features.Instance) ArrayList(java.util.ArrayList) TextClassificationTarget(org.dkpro.tc.api.type.TextClassificationTarget) TextClassificationSequence(org.dkpro.tc.api.type.TextClassificationSequence) FeatureExtractorResource_ImplBase(org.dkpro.tc.api.features.FeatureExtractorResource_ImplBase)
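
The numbering scheme in Example 32 (the sequence ID increments once per sequence, the unit position restarts at 0 inside each sequence) can be tried without a UIMA pipeline. The following is a minimal sketch under stated assumptions: plain string lists stand in for TextClassificationSequence and TextClassificationTarget annotations, and the class name, the method name and the "sequenceId_position:text" key format are hypothetical illustrations, not the format produced by InstanceIdFeature.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SequenceNumberingSketch {

    // Returns one "sequenceId_position:text" key per unit, in document order.
    // The key format is an assumption made for this illustration only.
    static List<String> number(List<List<String>> sequences) {
        List<String> keys = new ArrayList<>();
        int sequenceId = 0;
        for (List<String> seq : sequences) {
            int unitId = 0; // the position restarts in every sequence
            for (String unit : seq) {
                keys.add(sequenceId + "_" + unitId + ":" + unit);
                unitId++;
            }
            sequenceId++; // the sequence ID advances once per sequence
        }
        return keys;
    }

    public static void main(String[] args) {
        List<List<String>> sequences = Arrays.asList(
                Arrays.asList("The", "cat"),
                Arrays.asList("It", "sat", "down"));
        // prints [0_0:The, 0_1:cat, 1_0:It, 1_1:sat, 1_2:down]
        System.out.println(number(sequences));
    }
}
```

As in the original method, the pair (sequence ID, position) identifies a unit within one JCas; the JCas ID is what distinguishes units across documents.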

Example 33 with TextClassificationTarget

Use of org.dkpro.tc.api.type.TextClassificationTarget in project dkpro-tc by dkpro.

Class ValidityCheckConnectorPost, method checkErrorConditionMissingOutcomeForTargetIfUnitOrSequenceMode.

private void checkErrorConditionMissingOutcomeForTargetIfUnitOrSequenceMode(List<TextClassificationTarget> targets, List<TextClassificationOutcome> outcomes) throws AnalysisEngineProcessException {
    // labeled with an outcome annotation
    if (featureModeI == 2 || featureModeI == 4) {
        if (targets.size() == 0) {
            throw new AnalysisEngineProcessException(new TextClassificationException("Your experiment is supposed to have [" + TextClassificationTarget.class.getName() + "] annotations, which are missing"));
        } else {
            if (targets.size() != outcomes.size()) {
                throwException("Number of targets [" + targets.size() + "] != number of outcomes [" + outcomes.size() + "]");
            }
            for (int i = 0; i < targets.size(); i++) {
                TextClassificationTarget t = targets.get(i);
                TextClassificationOutcome o = outcomes.get(i);
                if (t.getBegin() != o.getBegin() || t.getEnd() != o.getEnd()) {
                    throwException("Index of target and outcome do not match: target span [" + t.getBegin() + " - " + t.getEnd() + "] != outcome span [" + o.getBegin() + " - " + o.getEnd() + "]");
                }
            }
        }
    }
}
Also used : TextClassificationException(org.dkpro.tc.api.exception.TextClassificationException) TextClassificationOutcome(org.dkpro.tc.api.type.TextClassificationOutcome) TextClassificationTarget(org.dkpro.tc.api.type.TextClassificationTarget) AnalysisEngineProcessException(org.apache.uima.analysis_engine.AnalysisEngineProcessException)
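
The alignment check in Example 33 pairs targets and outcomes by list index and requires identical begin and end offsets. The following is a minimal sketch of that logic without UIMA types; the Span class is a hypothetical stand-in for the offsets of TextClassificationTarget and TextClassificationOutcome, and SpanAlignmentSketch and firstProblem are illustrative names that do not exist in dkpro-tc.

```java
import java.util.Arrays;
import java.util.List;

public class SpanAlignmentSketch {

    // Hypothetical stand-in for an annotation's begin/end character offsets.
    static final class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    // Returns a description of the first problem found, or null if every
    // target has an outcome with identical offsets at the same list index.
    static String firstProblem(List<Span> targets, List<Span> outcomes) {
        if (targets.isEmpty()) {
            return "no targets";
        }
        if (targets.size() != outcomes.size()) {
            return "Number of targets [" + targets.size()
                    + "] != number of outcomes [" + outcomes.size() + "]";
        }
        for (int i = 0; i < targets.size(); i++) {
            Span t = targets.get(i);
            Span o = outcomes.get(i);
            if (t.begin != o.begin || t.end != o.end) {
                return "span mismatch at index " + i;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Span> targets = Arrays.asList(new Span(0, 4), new Span(5, 9));
        List<Span> aligned = Arrays.asList(new Span(0, 4), new Span(5, 9));
        List<Span> shifted = Arrays.asList(new Span(0, 4), new Span(6, 9));
        System.out.println(firstProblem(targets, aligned)); // prints null
        System.out.println(firstProblem(targets, shifted)); // prints span mismatch at index 1
    }
}
```

The sketch returns a message instead of throwing, which keeps it easy to test; the original method raises an AnalysisEngineProcessException at the first violation.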

Example 34 with TextClassificationTarget

Use of org.dkpro.tc.api.type.TextClassificationTarget in project dkpro-tc by dkpro.

Class SequenceContextMetaCollector, method process.

@Override
public void process(JCas jcas) throws AnalysisEngineProcessException {
    Collection<TextClassificationSequence> sequences = JCasUtil.select(jcas, TextClassificationSequence.class);
    for (TextClassificationSequence seq : sequences) {
        int id = seq.getId();
        for (TextClassificationTarget unit : JCasUtil.selectCovered(jcas, TextClassificationTarget.class, seq)) {
            String idString;
            try {
                idString = (String) InstanceIdFeature.retrieve(jcas, unit, id).getValue();
                ContextMetaCollectorUtil.addContext(jcas, unit, idString, bw);
            } catch (Exception e) {
                throw new AnalysisEngineProcessException(e);
            }
        }
    }
}
Also used : TextClassificationTarget(org.dkpro.tc.api.type.TextClassificationTarget) TextClassificationSequence(org.dkpro.tc.api.type.TextClassificationSequence) AnalysisEngineProcessException(org.apache.uima.analysis_engine.AnalysisEngineProcessException)

Example 35 with TextClassificationTarget

Use of org.dkpro.tc.api.type.TextClassificationTarget in project dkpro-tc by dkpro.

Class SingleLabelReaderBase, method getNext.

@Override
public void getNext(CAS aCAS) throws IOException, CollectionException {
    super.getNext(aCAS);
    JCas jcas;
    try {
        jcas = aCAS.getJCas();
    } catch (CASException e) {
        // preserve the original CASException as the cause
        throw new CollectionException(e);
    }
    TextClassificationOutcome outcome = new TextClassificationOutcome(jcas);
    outcome.setOutcome(getTextClassificationOutcome(jcas));
    outcome.setWeight(getTextClassificationOutcomeWeight(jcas));
    outcome.addToIndexes();
    new TextClassificationTarget(jcas, 0, jcas.getDocumentText().length()).addToIndexes();
}
Also used : CollectionException(org.apache.uima.collection.CollectionException) TextClassificationOutcome(org.dkpro.tc.api.type.TextClassificationOutcome) TextClassificationTarget(org.dkpro.tc.api.type.TextClassificationTarget) JCas(org.apache.uima.jcas.JCas) CASException(org.apache.uima.cas.CASException)

Aggregations

TextClassificationTarget (org.dkpro.tc.api.type.TextClassificationTarget): 61
JCas (org.apache.uima.jcas.JCas): 29
ArrayList (java.util.ArrayList): 22
TextClassificationOutcome (org.dkpro.tc.api.type.TextClassificationOutcome): 18
Feature (org.dkpro.tc.api.features.Feature): 16
Test (org.junit.Test): 16
AnalysisEngine (org.apache.uima.analysis_engine.AnalysisEngine): 12
TextClassificationSequence (org.dkpro.tc.api.type.TextClassificationSequence): 12
Token (de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Token): 11
JCasId (org.dkpro.tc.api.type.JCasId): 11
AnalysisEngineDescription (org.apache.uima.analysis_engine.AnalysisEngineDescription): 8
AnalysisEngineProcessException (org.apache.uima.analysis_engine.AnalysisEngineProcessException): 7
TextClassificationException (org.dkpro.tc.api.exception.TextClassificationException): 7
FeatureTestUtil.assertFeature (org.dkpro.tc.testing.FeatureTestUtil.assertFeature): 6
CollectionReader (org.apache.uima.collection.CollectionReader): 5
FeatureExtractorResource_ImplBase (org.dkpro.tc.api.features.FeatureExtractorResource_ImplBase): 5
DocumentMetaData (de.tudarmstadt.ukp.dkpro.core.api.metadata.type.DocumentMetaData): 4
Sentence (de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Sentence): 4
OpenNlpPosTagger (de.tudarmstadt.ukp.dkpro.core.opennlp.OpenNlpPosTagger): 4
BreakIteratorSegmenter (de.tudarmstadt.ukp.dkpro.core.tokit.BreakIteratorSegmenter): 4