Example 1 with FeatureExtractorResource_ImplBase

use of org.dkpro.tc.api.features.FeatureExtractorResource_ImplBase in project dkpro-tc by dkpro.

the class InstanceExtractor method getSingleInstanceUnit.

private Instance getSingleInstanceUnit(Instance instance, JCas jcas, boolean supportsSparseFeature) throws Exception {
    int jcasId = JCasUtil.selectSingle(jcas, JCasId.class).getId();
    TextClassificationTarget unit = JCasUtil.selectSingle(jcas, TextClassificationTarget.class);
    if (addInstanceId) {
        instance.addFeature(InstanceIdFeature.retrieve(jcas, unit));
    }
    for (FeatureExtractorResource_ImplBase featExt : featureExtractors) {
        if (supportsSparseFeature) {
            instance.addFeatures(getSparse(jcas, unit, featExt));
        } else {
            instance.addFeatures(getDense(jcas, unit, featExt));
        }
        instance.setOutcomes(getOutcomes(jcas, unit));
        instance.setWeight(getWeight(jcas, unit));
        instance.setJcasId(jcasId);
    }
    return instance;
}
Also used : JCasId(org.dkpro.tc.api.type.JCasId) TextClassificationTarget(org.dkpro.tc.api.type.TextClassificationTarget) FeatureExtractorResource_ImplBase(org.dkpro.tc.api.features.FeatureExtractorResource_ImplBase)

Example 2 with FeatureExtractorResource_ImplBase

use of org.dkpro.tc.api.features.FeatureExtractorResource_ImplBase in project dkpro-tc by dkpro.

the class InstanceExtractor method getUnitInstances.

public List<Instance> getUnitInstances(JCas jcas, boolean supportSparseFeatures) throws TextClassificationException {
    List<Instance> instances = new ArrayList<Instance>();
    int jcasId = JCasUtil.selectSingle(jcas, JCasId.class).getId();
    Collection<TextClassificationTarget> targets = JCasUtil.select(jcas, TextClassificationTarget.class);
    for (TextClassificationTarget aTarget : targets) {
        Instance instance = new Instance();
        if (addInstanceId) {
            Feature feat = InstanceIdFeature.retrieve(jcas, aTarget);
            instance.addFeature(feat);
        }
        for (FeatureExtractorResource_ImplBase featExt : featureExtractors) {
            if (!(featExt instanceof FeatureExtractor)) {
                throw new TextClassificationException("Feature extractor does not implement interface [" + FeatureExtractor.class.getName() + "]: " + featExt.getResourceName());
            }
            if (supportSparseFeatures) {
                instance.addFeatures(getSparse(jcas, aTarget, featExt));
            } else {
                instance.addFeatures(getDense(jcas, aTarget, featExt));
            }
        }
        // set and write outcome label(s)
        instance.setOutcomes(getOutcomes(jcas, aTarget));
        instance.setWeight(getWeight(jcas, aTarget));
        instance.setJcasId(jcasId);
        // instance.setSequenceId(sequenceId);
        instance.setSequencePosition(aTarget.getId());
        instances.add(instance);
    }
    return instances;
}
Also used : JCasId(org.dkpro.tc.api.type.JCasId) FeatureExtractor(org.dkpro.tc.api.features.FeatureExtractor) PairFeatureExtractor(org.dkpro.tc.api.features.PairFeatureExtractor) Instance(org.dkpro.tc.api.features.Instance) TextClassificationException(org.dkpro.tc.api.exception.TextClassificationException) ArrayList(java.util.ArrayList) TextClassificationTarget(org.dkpro.tc.api.type.TextClassificationTarget) Feature(org.dkpro.tc.api.features.Feature) InstanceIdFeature(org.dkpro.tc.core.feature.InstanceIdFeature) FeatureExtractorResource_ImplBase(org.dkpro.tc.api.features.FeatureExtractorResource_ImplBase)

Example 3 with FeatureExtractorResource_ImplBase

use of org.dkpro.tc.api.features.FeatureExtractorResource_ImplBase in project dkpro-tc by dkpro.

the class InstanceExtractor method getSingleInstancePair.

private Instance getSingleInstancePair(Instance instance, JCas jcas) throws TextClassificationException {
    try {
        int jcasId = JCasUtil.selectSingle(jcas, JCasId.class).getId();
        if (addInstanceId) {
            instance.addFeature(InstanceIdFeature.retrieve(jcas));
        }
        for (FeatureExtractorResource_ImplBase featExt : featureExtractors) {
            if (!(featExt instanceof PairFeatureExtractor)) {
                throw new TextClassificationException("Using non-pair FE in pair mode: " + featExt.getResourceName());
            }
            JCas view1 = jcas.getView(Constants.PART_ONE);
            JCas view2 = jcas.getView(Constants.PART_TWO);
            instance.setOutcomes(getOutcomes(jcas, null));
            instance.setWeight(getWeight(jcas, null));
            instance.setJcasId(jcasId);
            instance.addFeatures(((PairFeatureExtractor) featExt).extract(view1, view2));
        }
    } catch (CASException e) {
        throw new TextClassificationException(e);
    }
    return instance;
}
Also used : JCasId(org.dkpro.tc.api.type.JCasId) PairFeatureExtractor(org.dkpro.tc.api.features.PairFeatureExtractor) TextClassificationException(org.dkpro.tc.api.exception.TextClassificationException) JCas(org.apache.uima.jcas.JCas) CASException(org.apache.uima.cas.CASException) FeatureExtractorResource_ImplBase(org.dkpro.tc.api.features.FeatureExtractorResource_ImplBase)
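
In the loop above, the outcomes, weight, JCas id and the two views do not depend on featExt, yet they are recomputed on every iteration and are never set when the extractor list is empty. A minimal sketch of one possible restructuring, with the invariant work hoisted out of the loop, is shown here. All types in it (Extractor, PairExtractor, Instance, LengthDiff, UnitOnly) are hypothetical stand-ins written so the example compiles on its own; they are not the dkpro-tc classes, and setting metadata for an empty extractor list is a deliberate behavior change in this sketch.

```java
import java.util.ArrayList;
import java.util.List;

class PairModeSketch {

    // Hypothetical stand-ins for the dkpro-tc types, not the real API.
    interface Extractor {
        String name();
    }

    interface PairExtractor extends Extractor {
        List<String> extract(String partOne, String partTwo);
    }

    static final class Instance {
        final List<String> features = new ArrayList<>();
        String outcome;
        int jcasId;
    }

    // Example pair extractor: reports the length difference of the two parts.
    static final class LengthDiff implements PairExtractor {
        public String name() {
            return "LengthDiff";
        }

        public List<String> extract(String partOne, String partTwo) {
            List<String> out = new ArrayList<>();
            out.add("lengthDiff=" + Math.abs(partOne.length() - partTwo.length()));
            return out;
        }
    }

    // Example extractor that does not support pair mode.
    static final class UnitOnly implements Extractor {
        public String name() {
            return "UnitOnly";
        }
    }

    static Instance build(List<? extends Extractor> extractors, String partOne,
            String partTwo, String outcome, int jcasId) {
        Instance instance = new Instance();
        // Loop-invariant metadata is set once, even if no extractor is configured.
        instance.outcome = outcome;
        instance.jcasId = jcasId;
        for (Extractor e : extractors) {
            if (!(e instanceof PairExtractor)) {
                throw new IllegalArgumentException(
                        "Using non-pair FE in pair mode: " + e.name());
            }
            instance.features.addAll(((PairExtractor) e).extract(partOne, partTwo));
        }
        return instance;
    }

    public static void main(String[] args) {
        List<Extractor> extractors = new ArrayList<>();
        extractors.add(new LengthDiff());
        Instance instance = build(extractors, "abc", "a", "yes", 7);
        System.out.println(instance.features + " " + instance.outcome);
    }
}
```

Whether the per-iteration placement in the original is intentional cannot be told from the snippet alone, so this is an alternative to compare against, not a correction of the project code.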

Example 4 with FeatureExtractorResource_ImplBase

use of org.dkpro.tc.api.features.FeatureExtractorResource_ImplBase in project dkpro-tc by dkpro.

the class ValidityCheckConnector method verifyNonPairFeatureExtractors.

private void verifyNonPairFeatureExtractors(String[] featureExtractors) throws Exception {
    for (String featExt : featureExtractors) {
        FeatureExtractorResource_ImplBase featExtC = (FeatureExtractorResource_ImplBase) Class.forName(featExt).newInstance();
        implementsFeatureExtractorInterface(featExt, featExtC);
        checkErrorConditionImplementsConflictingFeatureExtractorInterfaces(featExt, featExtC);
    }
}
Also used : FeatureExtractorResource_ImplBase(org.dkpro.tc.api.features.FeatureExtractorResource_ImplBase)
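
The method above creates each extractor from its class name with Class.newInstance(), which has been deprecated since Java 9 because it propagates checked exceptions thrown by the constructor without wrapping them. A minimal sketch of the same load-then-verify idea using getDeclaredConstructor().newInstance() follows. BaseExtractor, UnitExtractor, PairExtractor and the two concrete classes are hypothetical stand-ins so the example compiles without dkpro-tc on the classpath, and what the two helper methods in the original actually check is assumed from their names only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the dkpro-tc base class and interfaces.
abstract class BaseExtractor {}
interface UnitExtractor {}
interface PairExtractor {}

class TokenCountExtractor extends BaseExtractor implements UnitExtractor {}
class ConflictingExtractor extends BaseExtractor implements UnitExtractor, PairExtractor {}

class ExtractorLoaderSketch {

    // Instantiates each class by name and checks it is usable in non-pair mode.
    static List<BaseExtractor> loadNonPairExtractors(String[] classNames) {
        List<BaseExtractor> loaded = new ArrayList<>();
        for (String name : classNames) {
            BaseExtractor extractor;
            try {
                // asSubclass throws ClassCastException if the class is not an extractor.
                extractor = Class.forName(name)
                        .asSubclass(BaseExtractor.class)
                        .getDeclaredConstructor()
                        .newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot instantiate " + name, e);
            }
            if (!(extractor instanceof UnitExtractor)) {
                throw new IllegalArgumentException(name + " is not a unit extractor");
            }
            if (extractor instanceof PairExtractor) {
                throw new IllegalArgumentException(name + " implements conflicting interfaces");
            }
            loaded.add(extractor);
        }
        return loaded;
    }

    public static void main(String[] args) {
        String[] names = { TokenCountExtractor.class.getName() };
        System.out.println(loadNonPairExtractors(names).size());
    }
}
```

Using asSubclass before instantiation means a wrong class name is rejected before any constructor runs, whereas the cast in the original happens only after the object has been created.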

Example 5 with FeatureExtractorResource_ImplBase

use of org.dkpro.tc.api.features.FeatureExtractorResource_ImplBase in project dkpro-tc by dkpro.

the class TestTaskUtils method testInstanceMultiplicationWithoutUnitId.

@Test
public void testInstanceMultiplicationWithoutUnitId() throws Exception {
    JCas jCas = initJCas(false);
    FeatureExtractorResource_ImplBase[] featureExtractors = {};
    InstanceExtractor ie = new InstanceExtractor(Constants.FM_SEQUENCE, featureExtractors, true);
    List<Instance> multipleInstances = ie.getInstances(jCas, false);
    assertEquals(6, multipleInstances.size());
    // Sequence 1
    int idx = 0;
    assertEquals("4711_0_0", multipleInstances.get(idx).getFeatures().iterator().next().getValue());
    assertEquals(0, multipleInstances.get(idx).getSequenceId());
    assertEquals(0, multipleInstances.get(idx).getSequencePosition());
    assertEquals("DT", multipleInstances.get(idx).getOutcome());
    idx = 1;
    assertEquals("4711_0_1", multipleInstances.get(idx).getFeatures().iterator().next().getValue());
    assertEquals(0, multipleInstances.get(idx).getSequenceId());
    assertEquals(1, multipleInstances.get(idx).getSequencePosition());
    assertEquals("NN", multipleInstances.get(idx).getOutcome());
    idx = 2;
    assertEquals("4711_0_2", multipleInstances.get(idx).getFeatures().iterator().next().getValue());
    assertEquals(0, multipleInstances.get(idx).getSequenceId());
    assertEquals(2, multipleInstances.get(idx).getSequencePosition());
    assertEquals("VBZ", multipleInstances.get(idx).getOutcome());
    // Sequence 2
    idx = 3;
    assertEquals("4711_1_0", multipleInstances.get(idx).getFeatures().iterator().next().getValue());
    assertEquals(1, multipleInstances.get(idx).getSequenceId());
    assertEquals(0, multipleInstances.get(idx).getSequencePosition());
    assertEquals("DT", multipleInstances.get(idx).getOutcome());
    idx = 4;
    assertEquals("4711_1_1", multipleInstances.get(idx).getFeatures().iterator().next().getValue());
    assertEquals(1, multipleInstances.get(idx).getSequenceId());
    assertEquals(1, multipleInstances.get(idx).getSequencePosition());
    assertEquals("NN", multipleInstances.get(idx).getOutcome());
    idx = 5;
    assertEquals("4711_1_2", multipleInstances.get(idx).getFeatures().iterator().next().getValue());
    assertEquals(1, multipleInstances.get(idx).getSequenceId());
    assertEquals(2, multipleInstances.get(idx).getSequencePosition());
    assertEquals("VBZ", multipleInstances.get(idx).getOutcome());
}
Also used : Instance(org.dkpro.tc.api.features.Instance) JCas(org.apache.uima.jcas.JCas) FeatureExtractorResource_ImplBase(org.dkpro.tc.api.features.FeatureExtractorResource_ImplBase) InstanceExtractor(org.dkpro.tc.core.task.uima.InstanceExtractor) Test(org.junit.Test)
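
The six assertion groups above differ only in sequence id, position and expected tag, and the asserted ids read as <prefix>_<sequenceId>_<position>, where the prefix 4711 presumably comes from initJCas. A minimal sketch of generating the expected ids in a loop follows. The pattern is inferred from the test data alone, and expectedIds is a hypothetical helper, not part of dkpro-tc.

```java
import java.util.ArrayList;
import java.util.List;

class InstanceIdPatternSketch {

    // Builds ids of the form prefix_sequenceId_position, sequence by sequence.
    static List<String> expectedIds(String prefix, int sequences, int unitsPerSequence) {
        List<String> ids = new ArrayList<>();
        for (int seq = 0; seq < sequences; seq++) {
            for (int pos = 0; pos < unitsPerSequence; pos++) {
                ids.add(prefix + "_" + seq + "_" + pos);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // Two sequences of three units each, as in the test above.
        for (String id : expectedIds("4711", 2, 3)) {
            System.out.println(id);
        }
    }
}
```

With such a helper, the test body could iterate over an index i and compare against sequence id i / 3, position i % 3 and the tag at i % 3 in {"DT", "NN", "VBZ"} instead of repeating the four assertions six times.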

Aggregations

FeatureExtractorResource_ImplBase (org.dkpro.tc.api.features.FeatureExtractorResource_ImplBase): 11
Instance (org.dkpro.tc.api.features.Instance): 6
JCasId (org.dkpro.tc.api.type.JCasId): 6
TextClassificationTarget (org.dkpro.tc.api.type.TextClassificationTarget): 5
JCas (org.apache.uima.jcas.JCas): 4
TextClassificationException (org.dkpro.tc.api.exception.TextClassificationException): 4
ArrayList (java.util.ArrayList): 3
PairFeatureExtractor (org.dkpro.tc.api.features.PairFeatureExtractor): 3
InstanceExtractor (org.dkpro.tc.core.task.uima.InstanceExtractor): 3
Test (org.junit.Test): 3
FeatureExtractor (org.dkpro.tc.api.features.FeatureExtractor): 2
AnalysisEngineProcessException (org.apache.uima.analysis_engine.AnalysisEngineProcessException): 1
CASException (org.apache.uima.cas.CASException): 1
Feature (org.dkpro.tc.api.features.Feature): 1
TextClassificationSequence (org.dkpro.tc.api.type.TextClassificationSequence): 1
InstanceIdFeature (org.dkpro.tc.core.feature.InstanceIdFeature): 1