Example 6 with Feature

use of org.dkpro.tc.api.features.Feature in project dkpro-tc by dkpro.

the class InstanceTest method instanceAddSingleFeatureTest.

@Test
public void instanceAddSingleFeatureTest() throws Exception {
    Feature f1 = new Feature("feature1", "value1", FeatureType.STRING);
    Feature f2 = new Feature("feature2", "value1", FeatureType.STRING);
    List<Feature> features = new ArrayList<>();
    features.add(f1);
    features.add(f2);
    Instance instance = new Instance(features, "outcome");
    Feature f3 = new Feature("feature3", "value1", FeatureType.STRING);
    instance.addFeature(f3);
    assertEquals(3, instance.getFeatures().size());
}
Also used : Instance(org.dkpro.tc.api.features.Instance) ArrayList(java.util.ArrayList) Feature(org.dkpro.tc.api.features.Feature) Test(org.junit.Test)

Example 7 with Feature

use of org.dkpro.tc.api.features.Feature in project dkpro-tc by dkpro.

the class InstanceTest method instanceAddFeatureListTest.

@Test
public void instanceAddFeatureListTest() throws Exception {
    Feature f1 = new Feature("feature1", "value1", FeatureType.STRING);
    Feature f2 = new Feature("feature2", "value1", FeatureType.STRING);
    List<Feature> features = new ArrayList<>();
    features.add(f1);
    features.add(f2);
    Instance instance = new Instance(features, "outcome");
    List<Feature> s = new ArrayList<Feature>();
    Feature f3 = new Feature("feature3", "value3", FeatureType.STRING);
    Feature f4 = new Feature("feature4", "value4", FeatureType.STRING);
    s.add(f3);
    s.add(f4);
    instance.addFeatures(s);
    assertEquals(4, instance.getFeatures().size());
}
Also used : Instance(org.dkpro.tc.api.features.Instance) ArrayList(java.util.ArrayList) Feature(org.dkpro.tc.api.features.Feature) Test(org.junit.Test)

Example 8 with Feature

use of org.dkpro.tc.api.features.Feature in project dkpro-tc by dkpro.

the class InstanceTest method instanceInitializationWithArrayOfOutcomes.

@Test
public void instanceInitializationWithArrayOfOutcomes() throws Exception {
    Feature f1 = new Feature("feature1", "value1", FeatureType.STRING);
    Feature f2 = new Feature("feature2", "value1", FeatureType.STRING);
    List<Feature> features = new ArrayList<>();
    features.add(f1);
    features.add(f2);
    Instance instance = new Instance(features, "outcome", "outcome2");
    assertEquals(2, instance.getFeatures().size());
    assertEquals(2, instance.getOutcomes().size());
}
Also used : Instance(org.dkpro.tc.api.features.Instance) ArrayList(java.util.ArrayList) Feature(org.dkpro.tc.api.features.Feature) Test(org.junit.Test)

Example 9 with Feature

use of org.dkpro.tc.api.features.Feature in project dkpro-tc by dkpro.

the class FilterLuceneCharacterNgramStartingWithLetter method applyFilter.

@Override
public void applyFilter(File inputFeatureFile) throws Exception {
    Gson gson = new Gson();
    // NOTE: for large data, iterating over a stream would be more reasonable
    // than this bulk read of all lines
    List<String> outputLines = new ArrayList<>();
    List<String> inputLines = FileUtils.readLines(inputFeatureFile, "utf-8");
    for (String l : inputLines) {
        // de-serialize
        Instance[] instances = gson.fromJson(l, Instance[].class);
        List<Instance> filter_out = new ArrayList<>();
        for (Instance inst : instances) {
            // collect features whose name starts with "charngram"
            List<Feature> features = new ArrayList<>(inst.getFeatures());
            List<Feature> deletionTargets = new ArrayList<>();
            for (Feature f : features) {
                if (f.getName().startsWith("charngram")) {
                    deletionTargets.add(f);
                }
            }
            // remove those features
            for (Feature f : deletionTargets) {
                features.remove(f);
            }
            // update instances
            inst.setFeatures(features);
            // keep the filtered instance; the array is re-serialized after the loop
            filter_out.add(inst);
        }
        outputLines.add(gson.toJson(filter_out.toArray(new Instance[0]), Instance[].class));
    }
    // Write new file to temporary location
    File tmp = File.createTempFile("tmpFeatureFile", "tmp");
    FileUtils.writeLines(tmp, "utf-8", outputLines);
    // overwrite input file with new file
    FileUtils.copyFile(tmp, inputFeatureFile);
    tmp.delete();
}
Also used : Instance(org.dkpro.tc.api.features.Instance) ArrayList(java.util.ArrayList) Gson(com.google.gson.Gson) Feature(org.dkpro.tc.api.features.Feature) File(java.io.File)
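
The comment at the top of applyFilter notes that streaming suits large data better than a bulk read, yet the method loads every line into memory first. A minimal sketch of the streaming variant, using only the JDK, might look like the following. The class StreamingLineFilter, the method rewriteLines, and the lineFilter parameter are hypothetical names for illustration and are not part of dkpro-tc; the per-line Gson de-serialization and feature removal from the example would go inside lineFilter.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.function.UnaryOperator;

// Hypothetical helper, not part of dkpro-tc.
public class StreamingLineFilter {

    // Rewrites 'input' in place, holding only one line in memory at a time.
    // 'lineFilter' is an assumed per-line transformation (for example:
    // parse the JSON line, drop unwanted features, serialize it again).
    public static void rewriteLines(File input, UnaryOperator<String> lineFilter)
            throws IOException {
        Path source = input.toPath();
        // Same temp-file naming as the example above
        Path tmp = Files.createTempFile("tmpFeatureFile", "tmp");
        try {
            try (BufferedReader reader = Files.newBufferedReader(source, StandardCharsets.UTF_8);
                 BufferedWriter writer = Files.newBufferedWriter(tmp, StandardCharsets.UTF_8)) {
                String line;
                while ((line = reader.readLine()) != null) {
                    // Transform and write each line as soon as it is read
                    writer.write(lineFilter.apply(line));
                    writer.newLine();
                }
            }
            // Overwrite the input file with the filtered copy
            Files.copy(tmp, source, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // Clean up the temporary file even if filtering fails
            Files.deleteIfExists(tmp);
        }
    }
}
```

Both streams are closed before the copy, so the input file is only replaced once the filtered output has been written completely.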

Example 10 with Feature

use of org.dkpro.tc.api.features.Feature in project dkpro-tc by dkpro.

the class TokenLengthRatio method extract.

@Override
public Set<Feature> extract(JCas jcas, TextClassificationTarget aTarget) throws TextClassificationException {
    long maxLen = getMax();
    double ratio = getRatio(aTarget.getCoveredText().length(), maxLen);
    return new Feature(FEATURE_NAME, ratio, FeatureType.NUMERIC).asSet();
}
Also used : Feature(org.dkpro.tc.api.features.Feature)

Aggregations

Feature (org.dkpro.tc.api.features.Feature) 94
Test (org.junit.Test) 48
Instance (org.dkpro.tc.api.features.Instance) 30
ArrayList (java.util.ArrayList) 29
HashSet (java.util.HashSet) 21
FeatureTestUtil.assertFeature (org.dkpro.tc.testing.FeatureTestUtil.assertFeature) 17
AnalysisEngine (org.apache.uima.analysis_engine.AnalysisEngine) 16
TextClassificationTarget (org.dkpro.tc.api.type.TextClassificationTarget) 16
JCas (org.apache.uima.jcas.JCas) 15
AnalysisEngineDescription (org.apache.uima.analysis_engine.AnalysisEngineDescription) 13
File (java.io.File) 8
Attribute (weka.core.Attribute) 8
Token (de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Token) 7
Sentence (de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Sentence) 6
TextClassificationException (org.dkpro.tc.api.exception.TextClassificationException) 5
Chunk (de.tudarmstadt.ukp.dkpro.core.api.syntax.type.chunk.Chunk) 4
OpenNlpPosTagger (de.tudarmstadt.ukp.dkpro.core.opennlp.OpenNlpPosTagger) 4
BreakIteratorSegmenter (de.tudarmstadt.ukp.dkpro.core.tokit.BreakIteratorSegmenter) 4
Instances (weka.core.Instances) 4
IOException (java.io.IOException) 3