
Example 1 with WordListFilter

Use of edu.illinois.cs.cogcomp.llm.align.WordListFilter in project cogcomp-nlp by CogComp.

From the class LlmComparatorTest, method testRemoveStopwords.

@Test
public void testRemoveStopwords() {
    WordListFilter filter = null;
    try {
        filter = new WordListFilter(new SimConfigurator().getDefaultConfig());
    } catch (IOException e) {
        e.printStackTrace();
        fail(e.getMessage());
    }
    String sent = "This sentence is filled with unnecessary filler like their pronouns , punctuation and function " + "words such as for , by , from , him , her , and to .";
    String[] tokens = sent.split("\\s+");
    String[] filteredTokens = filter.filter(tokens);
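    int numSkipped = 0;
    // Despite its name, this list collects the tokens the filter removed (null slots in filteredTokens).
    List<String> filteredToks = new LinkedList<>();
    for (int i = 0; i < tokens.length; ++i) {
        String tok = filteredTokens[i];
        if (null == tok) {
            numSkipped++;
            filteredToks.add(tokens[i]);
        }
    }
    // JUnit assertions are used because the plain `assert` keyword is skipped unless the JVM runs with -ea.
    // Assumes a static import of org.junit.Assert.assertTrue alongside the fail() already used above.
    assertTrue(numSkipped > 0);
    assertTrue(filteredToks.contains("is"));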
    System.out.println("Original text: " + sent);
    System.out.println("Filtered tokens: ");
    System.out.println(StringUtils.join(filteredToks, "; "));
}
Also used: SimConfigurator (edu.illinois.cs.cogcomp.config.SimConfigurator), WordListFilter (edu.illinois.cs.cogcomp.llm.align.WordListFilter), IOException (java.io.IOException), LinkedList (java.util.LinkedList), Test (org.junit.Test)
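
The test in Example 1 appears to rely on one contract: filter() returns an array aligned with its input, with null in each position where a token was dropped. The sketch below illustrates only that contract. StopwordFilterSketch, its hard-coded stop-word set, and the removedTokens helper are hypothetical stand-ins written for this page; they are not the cogcomp-nlp WordListFilter, which is built from a ResourceManager configuration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for WordListFilter; NOT the cogcomp-nlp implementation.
public class StopwordFilterSketch {

    // Illustrative stop-word list (assumption: the real filter is configured, not hard-coded).
    private static final Set<String> STOPWORDS = new HashSet<>(Arrays.asList(
            "is", "with", "their", "and", "for", "by", "from", "him", "her", "to", ",", "."));

    // Returns an array aligned with the input: null where a token was filtered out,
    // the original token otherwise.
    public static String[] filter(String[] tokens) {
        String[] out = new String[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            String tok = tokens[i];
            out[i] = STOPWORDS.contains(tok.toLowerCase()) ? null : tok;
        }
        return out;
    }

    // Collects the original tokens whose slot in the filtered array is null.
    public static List<String> removedTokens(String[] tokens, String[] filtered) {
        List<String> removed = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            if (filtered[i] == null) {
                removed.add(tokens[i]);
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        String[] tokens = "This sentence is short , by design .".split("\\s+");
        String[] filtered = filter(tokens);
        System.out.println("Removed: " + String.join("; ", removedTokens(tokens, filtered)));
    }
}
```

Keeping the output aligned with the input, rather than returning a shorter array, lets a caller recover the position of every removed token, which is what the loop in the test depends on.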

Example 2 with WordListFilter

Use of edu.illinois.cs.cogcomp.llm.align.WordListFilter in project cogcomp-nlp by CogComp.

From the class LlmStringComparator, method initialize.

private void initialize(ResourceManager rm_, Comparator<String, EntailmentResult> comparator) throws IOException {
    ResourceManager fullRm = new SimConfigurator().getConfig(rm_);
    double threshold = fullRm.getDouble(SimConfigurator.LLM_ENTAILMENT_THRESHOLD.key);
    tokenizer = new IllinoisTokenizer();
    this.comparator = comparator;
    filter = new WordListFilter(fullRm);
    neAligner = new Aligner<String, EntailmentResult>(new NEComparator(), filter);
    aligner = new Aligner<String, EntailmentResult>(comparator, filter);
    scorer = new GreedyAlignmentScorer<String>(threshold);
}
Also used: SimConfigurator (edu.illinois.cs.cogcomp.config.SimConfigurator), EntailmentResult (edu.illinois.cs.cogcomp.mrcs.dataStructures.EntailmentResult), IllinoisTokenizer (edu.illinois.cs.cogcomp.nlp.tokenizer.IllinoisTokenizer), WordListFilter (edu.illinois.cs.cogcomp.llm.align.WordListFilter), ResourceManager (edu.illinois.cs.cogcomp.core.utilities.configuration.ResourceManager)

Aggregations

SimConfigurator (edu.illinois.cs.cogcomp.config.SimConfigurator): 2
WordListFilter (edu.illinois.cs.cogcomp.llm.align.WordListFilter): 2
ResourceManager (edu.illinois.cs.cogcomp.core.utilities.configuration.ResourceManager): 1
EntailmentResult (edu.illinois.cs.cogcomp.mrcs.dataStructures.EntailmentResult): 1
IllinoisTokenizer (edu.illinois.cs.cogcomp.nlp.tokenizer.IllinoisTokenizer): 1
IOException (java.io.IOException): 1
LinkedList (java.util.LinkedList): 1
Test (org.junit.Test): 1