Search in sources :

Example 1 with SimilarityPairFeatureExtractor

use of org.dkpro.tc.features.pair.similarity.SimilarityPairFeatureExtractor in project dkpro-tc by dkpro.

the class SimilarityPairFeatureTest method similarityPairFeatureTest.

@Test
public void similarityPairFeatureTest() throws Exception {
    ExternalResourceDescription gstResource = ExternalResourceFactory.createExternalResourceDescription(GreedyStringTilingMeasureResource.class, GreedyStringTilingMeasureResource.PARAM_MIN_MATCH_LENGTH, "3");
    AnalysisEngineDescription desc = createEngineDescription(NoOpAnnotator.class);
    AnalysisEngine engine = createEngine(desc);
    JCas jcas = engine.newJCas();
    TokenBuilder<Token, Sentence> tb = new TokenBuilder<Token, Sentence>(Token.class, Sentence.class);
    JCas view1 = jcas.createView(VIEW1);
    view1.setDocumentLanguage("en");
    tb.buildTokens(view1, "This is a test .");
    JCas view2 = jcas.createView(VIEW2);
    view2.setDocumentLanguage("en");
    tb.buildTokens(view2, "Test is this .");
    engine.process(jcas);
    SimilarityPairFeatureExtractor extractor = FeatureUtil.createResource(SimilarityPairFeatureExtractor.class, SimilarityPairFeatureExtractor.PARAM_UNIQUE_EXTRACTOR_NAME, "123", SimilarityPairFeatureExtractor.PARAM_SEGMENT_FEATURE_PATH, Token.class.getName(), SimilarityPairFeatureExtractor.PARAM_TEXT_SIMILARITY_RESOURCE, gstResource);
    Set<Feature> features = extractor.extract(jcas.getView(VIEW1), jcas.getView(VIEW2));
    Assert.assertEquals(1, features.size());
    Iterator<Feature> iter = features.iterator();
    assertFeature("SimilarityGreedyStringTiling_3", 0.8125, iter.next(), 0.0001);
}
Also used : TokenBuilder(org.apache.uima.fit.testing.factory.TokenBuilder) AnalysisEngineDescription(org.apache.uima.analysis_engine.AnalysisEngineDescription) JCas(org.apache.uima.jcas.JCas) Token(de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Token) Sentence(de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Sentence) FeatureTestUtil.assertFeature(org.dkpro.tc.testing.FeatureTestUtil.assertFeature) Feature(org.dkpro.tc.api.features.Feature) SimilarityPairFeatureExtractor(org.dkpro.tc.features.pair.similarity.SimilarityPairFeatureExtractor) ExternalResourceDescription(org.apache.uima.resource.ExternalResourceDescription) AnalysisEngine(org.apache.uima.analysis_engine.AnalysisEngine) Test(org.junit.Test)

Aggregations

Sentence (de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Sentence)1 Token (de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Token)1 AnalysisEngine (org.apache.uima.analysis_engine.AnalysisEngine)1 AnalysisEngineDescription (org.apache.uima.analysis_engine.AnalysisEngineDescription)1 TokenBuilder (org.apache.uima.fit.testing.factory.TokenBuilder)1 JCas (org.apache.uima.jcas.JCas)1 ExternalResourceDescription (org.apache.uima.resource.ExternalResourceDescription)1 Feature (org.dkpro.tc.api.features.Feature)1 SimilarityPairFeatureExtractor (org.dkpro.tc.features.pair.similarity.SimilarityPairFeatureExtractor)1 FeatureTestUtil.assertFeature (org.dkpro.tc.testing.FeatureTestUtil.assertFeature)1 Test (org.junit.Test)1