Example 1 with ITransformer

Use of edu.illinois.cs.cogcomp.core.transformers.ITransformer in the cogcomp-nlp project by CogComp.

Class IllinoisLemmatizer, method readFromClasspath.

public static List<String> readFromClasspath(String filename) {
    List<String> lines = null;
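    // Open the first matching classpath resource; try-with-resources closes the stream.
    try (InputStream resource = IOUtils.lsResources(IllinoisLemmatizer.class, filename).get(0).openStream()) {
        // The identity transformer returns each line unchanged.
        lines = LineIO.read(resource, Charset.defaultCharset().name(), new ITransformer<String, String>() {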

            public String transform(String line) {
                return line;
            }
        });
    } catch (IOException | URISyntaxException e) {
        System.err.println("Error while trying to read " + filename + ".");
        System.exit(-1);
    }
    return lines;
}
Also used : ITransformer(edu.illinois.cs.cogcomp.core.transformers.ITransformer) InputStream(java.io.InputStream) IOException(java.io.IOException) URISyntaxException(java.net.URISyntaxException)
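
The pattern in Example 1, reading a resource line by line and passing each line through a transformer, can be sketched without the CogComp classes. This is a minimal sketch under stated assumptions: java.util.function.Function stands in for ITransformer, and readLines is a hypothetical helper, not the LineIO API.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class LineReadSketch {

    // Hypothetical helper (not the CogComp LineIO API): reads every line
    // from the stream and stores the transformer's result for each one.
    static <T> List<T> readLines(InputStream in, Function<String, T> transformer) {
        List<T> out = new ArrayList<>();
        try (BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                out.add(transformer.apply(line));
            }
        } catch (IOException e) {
            // Rethrow unchecked so callers can decide how to handle the failure.
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        InputStream in = new ByteArrayInputStream("run\nran\n".getBytes(StandardCharsets.UTF_8));
        // Function.identity() plays the role of the pass-through transformer.
        System.out.println(readLines(in, Function.identity())); // prints [run, ran]
    }
}
```

Passing a different function, for example String::length, changes the element type of the returned list without touching the reading loop.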

Example 2 with ITransformer

Use of edu.illinois.cs.cogcomp.core.transformers.ITransformer in the cogcomp-nlp project by CogComp.

Class WordBigrams, method getFeatures.

@Override
public Set<Feature> getFeatures(Constituent instance) throws EdisonException {
    Set<Feature> features = new LinkedHashSet<Feature>();
    View tokens = instance.getTextAnnotation().getView(ViewNames.TOKENS);
    List<Constituent> list = tokens.getConstituentsCoveringSpan(instance.getStartSpan(), instance.getEndSpan());
    Collections.sort(list, TextAnnotationUtilities.constituentStartComparator);
    ITransformer<Constituent, String> surfaceFormTransformer = new ITransformer<Constituent, String>() {

        private static final long serialVersionUID = 1L;

        public String transform(Constituent input) {
            return input.getSurfaceForm();
        }
    };
    features.addAll(FeatureNGramUtility.getNgramsOrdered(list, 1, surfaceFormTransformer));
    features.addAll(FeatureNGramUtility.getNgramsOrdered(list, 2, surfaceFormTransformer));
    return features;
}
Also used : LinkedHashSet(java.util.LinkedHashSet) ITransformer(edu.illinois.cs.cogcomp.core.transformers.ITransformer) Feature(edu.illinois.cs.cogcomp.edison.features.Feature) View(edu.illinois.cs.cogcomp.core.datastructures.textannotation.View) Constituent(edu.illinois.cs.cogcomp.core.datastructures.textannotation.Constituent)
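
To illustrate what Example 2 does with the transformer, here is a self-contained sketch that collects ordered unigrams and bigrams from a token list. It is an approximation under assumptions: java.util.function.Function stands in for ITransformer, ngrams is a hypothetical helper, and the underscore separator is an illustrative choice, not the feature naming used by FeatureNGramUtility.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

class NgramSketch {

    // Hypothetical helper: slides a window of size n over the items, in order,
    // and joins the transformed items of each window with '_'.
    static <T> List<String> ngrams(List<T> items, int n, Function<T, String> toText) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + n <= items.size(); i++) {
            StringBuilder sb = new StringBuilder();
            for (int j = i; j < i + n; j++) {
                if (j > i) {
                    sb.append('_');
                }
                sb.append(toText.apply(items.get(j)));
            }
            out.add(sb.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("the", "quick", "fox");
        // LinkedHashSet keeps insertion order and drops repeated features.
        Set<String> features = new LinkedHashSet<>();
        features.addAll(ngrams(tokens, 1, Function.identity()));
        features.addAll(ngrams(tokens, 2, Function.identity()));
        System.out.println(features); // prints [the, quick, fox, the_quick, quick_fox]
    }
}
```

Because the text of each item comes from the function argument, the same helper could emit lowercased forms or other per-token strings by swapping that argument.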

Example 3 with ITransformer

Use of edu.illinois.cs.cogcomp.core.transformers.ITransformer in the cogcomp-nlp project by CogComp.

Class ParseUtils, method getTokenIndexedParseTreeNodeCovering.

public static Tree<Pair<String, IntPair>> getTokenIndexedParseTreeNodeCovering(String parseViewName, Constituent c) {
    // UGLY CODE ALERT!!!
    TextAnnotation ta = c.getTextAnnotation();
    int sentenceId = ta.getSentenceId(c);
    Tree<String> tree = getParseTree(parseViewName, ta, sentenceId);
    final int sentenceStartSpan = ta.getSentence(sentenceId).getStartSpan();
    int start = c.getStartSpan() - sentenceStartSpan;
    int end = c.getEndSpan() - sentenceStartSpan;
    // Find the tree that covers the start and end tokens. However, start
    // and end have been shifted relative to the start of the sentence. So
    // we need to shift it back, which is why we have that UGLY as sin
    // mapper at the end.
    Tree<Pair<String, IntPair>> toknTree = getTokenIndexedTreeCovering(tree, start, end);
    ITransformer<Tree<Pair<String, IntPair>>, Pair<String, IntPair>> transformer = new ITransformer<Tree<Pair<String, IntPair>>, Pair<String, IntPair>>() {

        @Override
        public Pair<String, IntPair> transform(Tree<Pair<String, IntPair>> input) {
            Pair<String, IntPair> label = input.getLabel();
            IntPair newSpan = new IntPair(label.getSecond().getFirst() + sentenceStartSpan, label.getSecond().getSecond() + sentenceStartSpan);
            return new Pair<>(label.getFirst(), newSpan);
        }
    };
    return Mappers.mapTree(toknTree, transformer);
}
Also used : ITransformer(edu.illinois.cs.cogcomp.core.transformers.ITransformer) Tree(edu.illinois.cs.cogcomp.core.datastructures.trees.Tree) TextAnnotation(edu.illinois.cs.cogcomp.core.datastructures.textannotation.TextAnnotation) IntPair(edu.illinois.cs.cogcomp.core.datastructures.IntPair) Pair(edu.illinois.cs.cogcomp.core.datastructures.Pair)
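
The span arithmetic in Example 3 (subtract the sentence start before the tree lookup, then add it back on every node) can be shown in isolation. The sketch below is an assumption-laden stand-in: Node, Span, mapTree, and shift are hypothetical types and helpers written for this illustration, not the library's Tree, Pair, IntPair, or Mappers classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class SpanShiftSketch {

    // Minimal labeled tree; a hypothetical stand-in for the library's Tree type.
    static final class Node<L> {
        final L label;
        final List<Node<L>> children = new ArrayList<>();

        Node(L label) {
            this.label = label;
        }

        Node<L> add(Node<L> child) {
            children.add(child);
            return this;
        }
    }

    // A syntactic label with a token span [start, end).
    static final class Span {
        final String label;
        final int start;
        final int end;

        Span(String label, int start, int end) {
            this.label = label;
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return label + "[" + start + "," + end + ")";
        }
    }

    // Rebuilds the tree with the same shape, computing each new label from the old node.
    static <A, B> Node<B> mapTree(Node<A> node, Function<Node<A>, B> f) {
        Node<B> out = new Node<>(f.apply(node));
        for (Node<A> child : node.children) {
            out.add(mapTree(child, f));
        }
        return out;
    }

    // Converts sentence-relative spans to document-relative spans by adding the offset.
    static Node<Span> shift(Node<Span> tree, final int sentenceStart) {
        return mapTree(tree, n -> new Span(
                n.label.label, n.label.start + sentenceStart, n.label.end + sentenceStart));
    }

    public static void main(String[] args) {
        Node<Span> np = new Node<>(new Span("NP", 0, 2))
                .add(new Node<>(new Span("DT", 0, 1)))
                .add(new Node<>(new Span("NN", 1, 2)));
        Node<Span> shifted = shift(np, 7); // the sentence starts at token 7
        System.out.println(shifted.label);                 // prints NP[7,9)
        System.out.println(shifted.children.get(1).label); // prints NN[8,9)
    }
}
```

The mapping leaves the input tree untouched and returns a new one, so the sentence-relative tree remains available to the caller.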

Example 4 with ITransformer

Use of edu.illinois.cs.cogcomp.core.transformers.ITransformer in the cogcomp-nlp project by CogComp.

Class WordBigrams, method getFeatures.

@Override
public Set<Feature> getFeatures(Constituent instance) throws EdisonException {
    Set<Feature> features = new LinkedHashSet<>();
    View tokens = instance.getTextAnnotation().getView(ViewNames.TOKENS);
    List<Constituent> list = tokens.getConstituentsCoveringSpan(instance.getStartSpan(), instance.getEndSpan());
    list.sort(TextAnnotationUtilities.constituentStartComparator);
    ITransformer<Constituent, String> surfaceFormTransformer = new ITransformer<Constituent, String>() {

        public String transform(Constituent input) {
            return input.getSurfaceForm();
        }
    };
    features.addAll(FeatureNGramUtility.getNgramsOrdered(list, 1, surfaceFormTransformer));
    features.addAll(FeatureNGramUtility.getNgramsOrdered(list, 2, surfaceFormTransformer));
    return features;
}
Also used : LinkedHashSet(java.util.LinkedHashSet) ITransformer(edu.illinois.cs.cogcomp.core.transformers.ITransformer) Feature(edu.illinois.cs.cogcomp.edison.features.Feature) View(edu.illinois.cs.cogcomp.core.datastructures.textannotation.View) Constituent(edu.illinois.cs.cogcomp.core.datastructures.textannotation.Constituent)

Example 5 with ITransformer

Use of edu.illinois.cs.cogcomp.core.transformers.ITransformer in the cogcomp-nlp project by CogComp.

Class ParseHelper, method getTokenIndexedParseTreeNodeCovering.

public static Tree<Pair<String, IntPair>> getTokenIndexedParseTreeNodeCovering(String parseViewName, Constituent c) {
    // UGLY CODE ALERT!!!
    TextAnnotation ta = c.getTextAnnotation();
    int sentenceId = ta.getSentenceId(c);
    Tree<String> tree = getParseTree(parseViewName, ta, sentenceId);
    final int sentenceStartSpan = ta.getSentence(sentenceId).getStartSpan();
    int start = c.getStartSpan() - sentenceStartSpan;
    int end = c.getEndSpan() - sentenceStartSpan;
    // Find the tree that covers the start and end tokens. However, start
    // and end have been shifted relative to the start of the sentence. So
    // we need to shift it back, which is why we have that UGLY as sin
    // mapper at the end.
    Tree<Pair<String, IntPair>> toknTree = getTokenIndexedTreeCovering(tree, start, end);
    ITransformer<Tree<Pair<String, IntPair>>, Pair<String, IntPair>> transformer = new ITransformer<Tree<Pair<String, IntPair>>, Pair<String, IntPair>>() {

        @Override
        public Pair<String, IntPair> transform(Tree<Pair<String, IntPair>> input) {
            Pair<String, IntPair> label = input.getLabel();
            IntPair newSpan = new IntPair(label.getSecond().getFirst() + sentenceStartSpan, label.getSecond().getSecond() + sentenceStartSpan);
            return new Pair<>(label.getFirst(), newSpan);
        }
    };
    return Mappers.mapTree(toknTree, transformer);
}
Also used : ITransformer(edu.illinois.cs.cogcomp.core.transformers.ITransformer) Tree(edu.illinois.cs.cogcomp.core.datastructures.trees.Tree) TextAnnotation(edu.illinois.cs.cogcomp.core.datastructures.textannotation.TextAnnotation) IntPair(edu.illinois.cs.cogcomp.core.datastructures.IntPair) Pair(edu.illinois.cs.cogcomp.core.datastructures.Pair)

Aggregations

ITransformer (edu.illinois.cs.cogcomp.core.transformers.ITransformer): 5
IntPair (edu.illinois.cs.cogcomp.core.datastructures.IntPair): 2
Pair (edu.illinois.cs.cogcomp.core.datastructures.Pair): 2
Constituent (edu.illinois.cs.cogcomp.core.datastructures.textannotation.Constituent): 2
TextAnnotation (edu.illinois.cs.cogcomp.core.datastructures.textannotation.TextAnnotation): 2
View (edu.illinois.cs.cogcomp.core.datastructures.textannotation.View): 2
Tree (edu.illinois.cs.cogcomp.core.datastructures.trees.Tree): 2
Feature (edu.illinois.cs.cogcomp.edison.features.Feature): 2
LinkedHashSet (java.util.LinkedHashSet): 2
IOException (java.io.IOException): 1
InputStream (java.io.InputStream): 1
URISyntaxException (java.net.URISyntaxException): 1