Example 36 with View

use of edu.illinois.cs.cogcomp.core.datastructures.textannotation.View in project cogcomp-nlp by CogComp.

the class PosWordConjunctionSizeTwoWindowSizeTwo method getFeatures.

/**
 * This feature extractor assumes that the TOKENS and POS views have already been
 * generated in the constituent's TextAnnotation. It uses the POS tag and the form
 * of the word itself, as well as those of the words around the constituent.
 */
@Override
public Set<Feature> getFeatures(Constituent c) throws EdisonException {
    TextAnnotation ta = c.getTextAnnotation();
    View TOKENS = null, POS = null;
    try {
        TOKENS = ta.getView(ViewNames.TOKENS);
        POS = ta.getView(ViewNames.POS);
    } catch (Exception e) {
        e.printStackTrace();
    }
    // We can assume that the constituent in this case is a Word(Token) described by the LBJ
    // chunk definition
    int startspan = c.getStartSpan();
    int endspan = c.getEndSpan();
    // All our constituents are words(tokens)
    // words two before & after
    int k = 2;
    int window = 2;
    String[] forms = getWindowK(TOKENS, startspan, endspan, k);
    String[] tags = getWindowKTags(POS, startspan, endspan, k);
    String classifier = "PosWordConjunctionSizeTwoWindowSizeTwo";
    String id, value;
    Set<Feature> result = new LinkedHashSet<>();
    for (int j = 0; j < k; j++) {
        for (int i = 0; i < tags.length; i++) {
            StringBuilder f = new StringBuilder();
            for (int context = 0; context <= j && i + context < tags.length; context++) {
                if (context != 0) {
                    f.append("_");
                }
                f.append(tags[i + context]);
                f.append("-");
                f.append(forms[i + context]);
            }
            // 2 is the center object in the array so i should go from -2 to +2 (with 0 being
            // the center)
            // j is the size of the n-gram so it goes 1 to 2
            id = classifier + ":" + ((i - window) + "_" + (j + 1));
            value = "(" + (f.toString()) + ")";
            result.add(new DiscreteFeature(id + value));
        }
    }
    return result;
}
Also used : LinkedHashSet(java.util.LinkedHashSet) DiscreteFeature(edu.illinois.cs.cogcomp.edison.features.DiscreteFeature) TextAnnotation(edu.illinois.cs.cogcomp.core.datastructures.textannotation.TextAnnotation) View(edu.illinois.cs.cogcomp.core.datastructures.textannotation.View) Feature(edu.illinois.cs.cogcomp.edison.features.Feature) EdisonException(edu.illinois.cs.cogcomp.edison.utilities.EdisonException)
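
The string layout produced by the nested loops in Example 36 can be hard to read off the code. The following is a minimal standalone sketch of the same loop structure over plain arrays, with no cogcomp-nlp dependency. The class name PosWordWindowSketch, the method conjunctions, and the sample sentence are hypothetical and exist only for illustration; the real extractor obtains its arrays from getWindowK and getWindowKTags, whose behavior is not shown here.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical illustration class; not part of cogcomp-nlp.
final class PosWordWindowSketch {

    // Builds "prefix:offset_n(TAG-form_TAG-form)" strings, mirroring the loop in Example 36.
    // tags and forms are assumed to be parallel arrays covering the window around a token.
    static Set<String> conjunctions(String[] tags, String[] forms, int window, int maxN,
            String prefix) {
        if (tags.length != forms.length) {
            throw new IllegalArgumentException("tags and forms must have the same length");
        }
        Set<String> result = new LinkedHashSet<>();
        for (int j = 0; j < maxN; j++) { // j + 1 is the n-gram size
            for (int i = 0; i < tags.length; i++) {
                StringBuilder f = new StringBuilder();
                // Conjoin up to j + 1 consecutive tag-form pairs, stopping at the array end.
                for (int context = 0; context <= j && i + context < tags.length; context++) {
                    if (context != 0) {
                        f.append('_');
                    }
                    f.append(tags[i + context]).append('-').append(forms[i + context]);
                }
                // i - window is the offset relative to the center position of the window.
                String id = prefix + ":" + (i - window) + "_" + (j + 1);
                result.add(id + "(" + f + ")");
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] forms = {"the", "quick", "brown", "fox", "jumps"};
        String[] tags = {"DT", "JJ", "JJ", "NN", "VBZ"};
        for (String feature : conjunctions(tags, forms, 2, 2, "PosWord")) {
            System.out.println(feature);
        }
    }
}
```

Note that in this sketch, as in the loop it mirrors, the last position of each pass has no right-hand neighbor, so its size-two entry contains a single tag-form pair.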

Example 37 with View

use of edu.illinois.cs.cogcomp.core.datastructures.textannotation.View in project cogcomp-nlp by CogComp.

the class NERAnnotatorTest method testResults.

/**
 * See if we get the right entities back. TODO: MS removed @Test annotation as this test
 * currently fails, but benchmark performance is good
 */
public void testResults() {
    TextAnnotation ta = tab.createTextAnnotation(TEST_INPUT);
    View view = null;
    try {
        view = getView(ta);
    } catch (AnnotatorException e) {
        e.printStackTrace();
        fail(e.getMessage());
    }
    for (Constituent c : view.getConstituents()) {
        assertTrue("No entity named \"" + c.toString() + "\"", entities.contains(c.toString()));
    }
}
Also used : AnnotatorException(edu.illinois.cs.cogcomp.annotation.AnnotatorException) TextAnnotation(edu.illinois.cs.cogcomp.core.datastructures.textannotation.TextAnnotation) View(edu.illinois.cs.cogcomp.core.datastructures.textannotation.View) Constituent(edu.illinois.cs.cogcomp.core.datastructures.textannotation.Constituent)

Example 38 with View

use of edu.illinois.cs.cogcomp.core.datastructures.textannotation.View in project cogcomp-nlp by CogComp.

the class NerOntonotesTest method testOntonotesNer.

@Test
public void testOntonotesNer() {
    TextAnnotationBuilder tab = new TokenizerTextAnnotationBuilder(new StatefulTokenizer());
    Properties props = new Properties();
    NERAnnotator nerOntonotes = NerAnnotatorManager.buildNerAnnotator(new ResourceManager(props), ViewNames.NER_ONTONOTES);
    TextAnnotation taOnto = tab.createTextAnnotation("", "", TEST_INPUT);
    try {
        nerOntonotes.getView(taOnto);
    } catch (AnnotatorException e) {
        e.printStackTrace();
        fail(e.getMessage());
    }
    View v = taOnto.getView(nerOntonotes.getViewName());
    // JUnit's assertEquals takes the expected value first, then the actual value
    assertEquals(4, v.getConstituents().size());
}
Also used : TextAnnotationBuilder(edu.illinois.cs.cogcomp.annotation.TextAnnotationBuilder) TokenizerTextAnnotationBuilder(edu.illinois.cs.cogcomp.nlp.utility.TokenizerTextAnnotationBuilder) StatefulTokenizer(edu.illinois.cs.cogcomp.nlp.tokenizer.StatefulTokenizer) AnnotatorException(edu.illinois.cs.cogcomp.annotation.AnnotatorException) ResourceManager(edu.illinois.cs.cogcomp.core.utilities.configuration.ResourceManager) Properties(java.util.Properties) TextAnnotation(edu.illinois.cs.cogcomp.core.datastructures.textannotation.TextAnnotation) View(edu.illinois.cs.cogcomp.core.datastructures.textannotation.View) Test(org.junit.Test)

Example 39 with View

use of edu.illinois.cs.cogcomp.core.datastructures.textannotation.View in project cogcomp-nlp by CogComp.

the class TextAnnotationMapDBHandlerTest method updateTextAnnotation.

@Test
public void updateTextAnnotation() throws Exception {
    TextAnnotation ta = DummyTextAnnotationGenerator.generateAnnotatedTextAnnotation(false, 2);
    mapDBHandler.addTextAnnotation(testDataset, ta);
    ta = mapDBHandler.getDataset(testDataset).next();
    // Add a new view to the TextAnnotation
    String viewName = "TEST_VIEW";
    View dummyView = new View(viewName, "TEST", ta, 0.0);
    ta.addView(viewName, dummyView);
    assertTrue(ta.hasView(viewName));
    // Update the DB
    mapDBHandler.updateTextAnnotation(ta);
    // Check if the update is present
    ta = mapDBHandler.getDataset(testDataset).next();
    assertTrue(ta.hasView(viewName));
    // Revert the changes and check if it's updated
    ta.removeView(viewName);
    mapDBHandler.updateTextAnnotation(ta);
    ta = mapDBHandler.getTextAnnotation(ta);
    assertFalse(ta.hasView(viewName));
}
Also used : TextAnnotation(edu.illinois.cs.cogcomp.core.datastructures.textannotation.TextAnnotation) View(edu.illinois.cs.cogcomp.core.datastructures.textannotation.View) Test(org.junit.Test)

Example 40 with View

use of edu.illinois.cs.cogcomp.core.datastructures.textannotation.View in project cogcomp-nlp by CogComp.

the class WordBigrams method getFeatures.

@Override
public Set<Feature> getFeatures(Constituent instance) throws EdisonException {
    Set<Feature> features = new LinkedHashSet<>();
    View tokens = instance.getTextAnnotation().getView(ViewNames.TOKENS);
    List<Constituent> list = tokens.getConstituentsCoveringSpan(instance.getStartSpan(), instance.getEndSpan());
    list.sort(TextAnnotationUtilities.constituentStartComparator);
    // Maps each token constituent to its surface string
    ITransformer<Constituent, String> surfaceFormTransformer = new ITransformer<Constituent, String>() {
        @Override
        public String transform(Constituent input) {
            return input.getSurfaceForm();
        }
    };
    features.addAll(FeatureNGramUtility.getNgramsOrdered(list, 1, surfaceFormTransformer));
    features.addAll(FeatureNGramUtility.getNgramsOrdered(list, 2, surfaceFormTransformer));
    return features;
}
Also used : LinkedHashSet(java.util.LinkedHashSet) ITransformer(edu.illinois.cs.cogcomp.core.transformers.ITransformer) Feature(edu.illinois.cs.cogcomp.edison.features.Feature) View(edu.illinois.cs.cogcomp.core.datastructures.textannotation.View) Constituent(edu.illinois.cs.cogcomp.core.datastructures.textannotation.Constituent)
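
Example 40 collects unigrams and bigrams over the sorted token list. As a rough illustration of what contiguous ordered n-grams over surface forms look like, here is a small standalone sketch on plain strings. The class OrderedNgramSketch, the method ngrams, and the "_" separator are assumptions made for this example; the exact feature naming used by FeatureNGramUtility.getNgramsOrdered is not shown in the snippet above and may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; not part of cogcomp-nlp.
final class OrderedNgramSketch {

    // Returns every contiguous run of n tokens, left to right, joined with "_".
    static List<String> ngrams(List<String> tokens, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        List<String> out = new ArrayList<>();
        for (int i = 0; i + n <= tokens.size(); i++) {
            out.add(String.join("_", tokens.subList(i, i + n)));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("John", "Smith", "went", "home");
        List<String> features = new ArrayList<>();
        features.addAll(ngrams(tokens, 1)); // unigrams
        features.addAll(ngrams(tokens, 2)); // bigrams
        System.out.println(features);
    }
}
```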

Aggregations

View (edu.illinois.cs.cogcomp.core.datastructures.textannotation.View) 64
Constituent (edu.illinois.cs.cogcomp.core.datastructures.textannotation.Constituent) 51
TextAnnotation (edu.illinois.cs.cogcomp.core.datastructures.textannotation.TextAnnotation) 49
Feature (edu.illinois.cs.cogcomp.edison.features.Feature) 22
Test (org.junit.Test) 21
FeatureExtractor (edu.illinois.cs.cogcomp.edison.features.FeatureExtractor) 16
ProjectedPath (edu.illinois.cs.cogcomp.edison.features.lrec.ProjectedPath) 16
FeatureManifest (edu.illinois.cs.cogcomp.edison.features.manifest.FeatureManifest) 16
FileInputStream (java.io.FileInputStream) 16
AnnotatorException (edu.illinois.cs.cogcomp.annotation.AnnotatorException) 7
PredicateArgumentView (edu.illinois.cs.cogcomp.core.datastructures.textannotation.PredicateArgumentView) 7
ArrayList (java.util.ArrayList) 7
DiscreteFeature (edu.illinois.cs.cogcomp.edison.features.DiscreteFeature) 6
LinkedHashSet (java.util.LinkedHashSet) 6
Set (java.util.Set) 6
POSBaseLineCounter (edu.illinois.cs.cogcomp.edison.utilities.POSBaseLineCounter) 5
POSMikheevCounter (edu.illinois.cs.cogcomp.edison.utilities.POSMikheevCounter) 5
IOException (java.io.IOException) 5
EdisonException (edu.illinois.cs.cogcomp.edison.utilities.EdisonException) 4
JsonObject (com.google.gson.JsonObject) 3