Example 16 with AnalysisEngineDescription

Use of org.apache.uima.analysis_engine.AnalysisEngineDescription in project dkpro-tc by dkpro.

From the class LuceneMetaCollectionBasedFeatureTestBase, method prepareFeatureExtractor.

protected AnalysisEngineDescription prepareFeatureExtractor(File outputPath, Class<? extends Resource_ImplBase> class1, Object[] parameters) throws ResourceInitializationException {
    // Wrap the feature extractor class and its parameters as external resource descriptions.
    List<ExternalResourceDescription> fes = makeResource(class1, parameters);
    // Build the connector engine that writes the extracted features to outputPath via JsonDataWriter
    // (single-label learning mode, document feature mode).
    return TaskUtils.getFeatureExtractorConnector(outputPath.getAbsolutePath(), JsonDataWriter.class.getName(), Constants.LM_SINGLE_LABEL, Constants.FM_DOCUMENT, false, false, false, false, Collections.emptyList(), fes, new String[] {});
}
Also used: JsonDataWriter (org.dkpro.tc.core.io.JsonDataWriter), AnalysisEngineDescription (org.apache.uima.analysis_engine.AnalysisEngineDescription), ExternalResourceDescription (org.apache.uima.resource.ExternalResourceDescription)

Example 17 with AnalysisEngineDescription

Use of org.apache.uima.analysis_engine.AnalysisEngineDescription in project dkpro-tc by dkpro.

From the class PosNGramTest, method runMetaCollection.

@Override
protected void runMetaCollection(File luceneFolder, AnalysisEngineDescription metaCollector) throws Exception {
    CollectionReaderDescription reader = getMetaReader();
    // Segment the text into sentences and tokens, then POS-tag it as English.
    AnalysisEngineDescription segmenter = AnalysisEngineFactory.createEngineDescription(BreakIteratorSegmenter.class);
    AnalysisEngineDescription posTagger = AnalysisEngineFactory.createEngineDescription(OpenNlpPosTagger.class, OpenNlpPosTagger.PARAM_LANGUAGE, "en");
    // Run the pipeline in order: reader, segmenter, POS tagger, meta collector.
    SimplePipeline.runPipeline(reader, segmenter, posTagger, metaCollector);
}
Also used: CollectionReaderDescription (org.apache.uima.collection.CollectionReaderDescription), AnalysisEngineDescription (org.apache.uima.analysis_engine.AnalysisEngineDescription)

Example 18 with AnalysisEngineDescription

Use of org.apache.uima.analysis_engine.AnalysisEngineDescription in project dkpro-tc by dkpro.

From the class DiffNounChunkCharacterLengthTest, method setUp.

@Before
public void setUp() throws ResourceInitializationException, AnalysisEngineProcessException {
    // A single segmenter engine is shared by both documents.
    AnalysisEngineDescription desc = createEngineDescription(BreakIteratorSegmenter.class);
    AnalysisEngine engine = createEngine(desc);
    // First document (jcas1 and jcas2 are declared outside this method).
    jcas1 = engine.newJCas();
    jcas1.setDocumentLanguage("en");
    jcas1.setDocumentText("This is the text of view 1");
    engine.process(jcas1);
    // Second document.
    jcas2 = engine.newJCas();
    jcas2.setDocumentLanguage("en");
    jcas2.setDocumentText("This is the text of view 2");
    engine.process(jcas2);
}
Also used: AnalysisEngineDescription (org.apache.uima.analysis_engine.AnalysisEngineDescription), AnalysisEngine (org.apache.uima.analysis_engine.AnalysisEngine), Before (org.junit.Before)

Example 19 with AnalysisEngineDescription

Use of org.apache.uima.analysis_engine.AnalysisEngineDescription in project dkpro-tc by dkpro.

From the class SharedNounChunksTest, method setUp.

@Before
public void setUp() throws ResourceInitializationException, AnalysisEngineProcessException {
    // A single segmenter engine is shared by both documents.
    AnalysisEngineDescription desc = createEngineDescription(BreakIteratorSegmenter.class);
    AnalysisEngine engine = createEngine(desc);
    // First document (jcas1, jcas2 and jcasId are declared outside this method).
    jcas1 = engine.newJCas();
    jcas1.setDocumentLanguage("en");
    jcas1.setDocumentText("This is the text of view 1");
    // Add a JCasId annotation carrying the next sequential id before processing.
    JCasId id = new JCasId(jcas1);
    id.setId(jcasId++);
    id.addToIndexes();
    engine.process(jcas1);
    // Second document, prepared the same way with the next id.
    jcas2 = engine.newJCas();
    jcas2.setDocumentLanguage("en");
    jcas2.setDocumentText("This is the text of view 2");
    id = new JCasId(jcas2);
    id.setId(jcasId++);
    id.addToIndexes();
    engine.process(jcas2);
}
Also used: JCasId (org.dkpro.tc.api.type.JCasId), AnalysisEngineDescription (org.apache.uima.analysis_engine.AnalysisEngineDescription), AnalysisEngine (org.apache.uima.analysis_engine.AnalysisEngine), Before (org.junit.Before)

Example 20 with AnalysisEngineDescription

Use of org.apache.uima.analysis_engine.AnalysisEngineDescription in project dkpro-tc by dkpro.

From the class DiffNrOfSentencesPairFeatureExtractorTest, method testExtract.

@Test
public void testExtract() throws ResourceInitializationException, AnalysisEngineProcessException, TextClassificationException {
    // A single segmenter engine is shared by both views.
    AnalysisEngineDescription desc = createEngineDescription(BreakIteratorSegmenter.class);
    AnalysisEngine engine = createEngine(desc);
    // View 1 contains two sentences.
    JCas jcas1 = engine.newJCas();
    jcas1.setDocumentLanguage("en");
    jcas1.setDocumentText("This is the text of view 1. And some more.");
    engine.process(jcas1);
    // View 2 contains one sentence.
    JCas jcas2 = engine.newJCas();
    jcas2.setDocumentLanguage("en");
    jcas2.setDocumentText("This is the text of view 2.");
    engine.process(jcas2);
    DiffNrOfSentencesPairFeatureExtractor extractor = new DiffNrOfSentencesPairFeatureExtractor();
    Set<Feature> features = extractor.extract(jcas1, jcas2);
    // Exactly one feature is expected, with a sentence-count difference of 1.
    assertEquals(1, features.size());
    for (Feature feature : features) {
        assertFeature("DiffNrOfSentences", 1, feature);
    }
}
Also used: DiffNrOfSentencesPairFeatureExtractor (org.dkpro.tc.features.pair.core.length.DiffNrOfSentencesPairFeatureExtractor), AnalysisEngineDescription (org.apache.uima.analysis_engine.AnalysisEngineDescription), JCas (org.apache.uima.jcas.JCas), FeatureTestUtil.assertFeature (org.dkpro.tc.testing.FeatureTestUtil.assertFeature), Feature (org.dkpro.tc.api.features.Feature), AnalysisEngine (org.apache.uima.analysis_engine.AnalysisEngine), Test (org.junit.Test)
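
The expected value of 1 in Example 20 can be checked without UIMA. The following is a minimal sketch, not the dkpro-tc implementation: it assumes that BreakIteratorSegmenter delegates sentence splitting to the JDK's java.text.BreakIterator, and that the feature is the sentence count of view 1 minus that of view 2. The class SentenceCountDiff and its methods are hypothetical names introduced only for this illustration.

```java
import java.text.BreakIterator;
import java.util.Locale;

public class SentenceCountDiff {

    /** Counts the non-blank segments reported by the JDK sentence BreakIterator. */
    static int countSentences(String text, Locale locale) {
        BreakIterator it = BreakIterator.getSentenceInstance(locale);
        it.setText(text);
        int count = 0;
        int start = it.first();
        // Walk consecutive boundaries; each (start, end) pair is one candidate sentence.
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            if (!text.substring(start, end).trim().isEmpty()) {
                count++;
            }
        }
        return count;
    }

    /** Assumption: the feature is count(view 1) minus count(view 2). */
    static int diffNrOfSentences(String view1, String view2, Locale locale) {
        return countSentences(view1, locale) - countSentences(view2, locale);
    }

    public static void main(String[] args) {
        // Same texts as in Example 20.
        String view1 = "This is the text of view 1. And some more.";
        String view2 = "This is the text of view 2.";
        System.out.println(diffNrOfSentences(view1, view2, Locale.ENGLISH));
    }
}
```

With the two texts from the test, view 1 splits into two sentences and view 2 into one, which matches the value the test asserts for DiffNrOfSentences.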

Aggregations

AnalysisEngineDescription (org.apache.uima.analysis_engine.AnalysisEngineDescription): 62
Test (org.junit.Test): 32
File (java.io.File): 27
CollectionReaderDescription (org.apache.uima.collection.CollectionReaderDescription): 25
ArrayList (java.util.ArrayList): 22
AnalysisEngine (org.apache.uima.analysis_engine.AnalysisEngine): 18
JCas (org.apache.uima.jcas.JCas): 16
Feature (org.dkpro.tc.api.features.Feature): 13
FeatureTestUtil.assertFeature (org.dkpro.tc.testing.FeatureTestUtil.assertFeature): 11
ExternalResourceDescription (org.apache.uima.resource.ExternalResourceDescription): 10
AggregateBuilder (org.apache.uima.fit.factory.AggregateBuilder): 8
ResourceInitializationException (org.apache.uima.resource.ResourceInitializationException): 8
JsonDataWriter (org.dkpro.tc.core.io.JsonDataWriter): 8
TextClassificationTarget (org.dkpro.tc.api.type.TextClassificationTarget): 7
Gson (com.google.gson.Gson): 6
IOException (java.io.IOException): 6
Instance (org.dkpro.tc.api.features.Instance): 6
OpenNlpPosTagger (de.tudarmstadt.ukp.dkpro.core.opennlp.OpenNlpPosTagger): 4
BreakIteratorSegmenter (de.tudarmstadt.ukp.dkpro.core.tokit.BreakIteratorSegmenter): 4
CAS (org.apache.uima.cas.CAS): 4