Example 1 with DuplicateSnippetFilter

Use of info.ephyra.answerselection.filters.DuplicateSnippetFilter in project lucida by claritylab, in the method initOther of the class EphyraTREC13To16.

// Layout 1
/**
 * Initializes the pipeline for 'other' questions.
 */
protected void initOther() {
    // query generation
    QueryGeneration.clearQueryGenerators();
    // search
    // - knowledge miners for unstructured knowledge sources
    Search.clearKnowledgeMiners();
    for (String[] indriIndices : IndriKM.getIndriIndices()) {
        Search.addKnowledgeMiner(new IndriKM(indriIndices, false));
    }
    for (String[] indriServers : IndriKM.getIndriServers()) {
        Search.addKnowledgeMiner(new IndriKM(indriServers, true));
    }
    // - knowledge annotators for (semi-)structured knowledge sources
    Search.clearKnowledgeAnnotators();
    // answer extraction and selection
    // (the filters are applied in this order)
    AnswerSelection.clearFilters();
    // initialize scores
    AnswerSelection.addFilter(new ScoreResetterFilter());
    // extract sentences from snippets
    AnswerSelection.addFilter(new SentenceExtractionFilter());
    // cut meaningless introductions from sentences
    AnswerSelection.addFilter(new CutKeywordsFilter());
    AnswerSelection.addFilter(new CutStatementProviderFilter());
    AnswerSelection.addFilter(new SentenceSplitterFilter());
    AnswerSelection.addFilter(new CutKeywordsFilter());
    // remove duplicates
    AnswerSelection.addFilter(new DuplicateSnippetFilter());
    // throw out enumerations of proper names
    AnswerSelection.addFilter(new ProperNameFilter());
    // throw out direct speech snippets, which rarely contain useful information
    AnswerSelection.addFilter(new DirectSpeechFilter());
    // sort out snippets containing no new terms
    AnswerSelection.addFilter(new TermFilter());
    AnswerSelection.addFilter(new WikipediaGoogleTermImportanceFilter(WebTermImportanceFilter.LOG_LENGTH_NORMALIZATION, WebTermImportanceFilter.LOG_LENGTH_NORMALIZATION, false));
    // sort the remaining snippets by score
    AnswerSelection.addFilter(new ScoreSorterFilter());
    // cut off result
    AnswerSelection.addFilter(new ResultLengthFilter(3000));
}
Also used:
ScoreSorterFilter (info.ephyra.answerselection.filters.ScoreSorterFilter)
DirectSpeechFilter (info.ephyra.answerselection.filters.DirectSpeechFilter)
CutStatementProviderFilter (info.ephyra.answerselection.filters.CutStatementProviderFilter)
SentenceSplitterFilter (info.ephyra.answerselection.filters.SentenceSplitterFilter)
WikipediaGoogleTermImportanceFilter (info.ephyra.answerselection.filters.WikipediaGoogleTermImportanceFilter)
IndriKM (info.ephyra.search.searchers.IndriKM)
ScoreResetterFilter (info.ephyra.answerselection.filters.ScoreResetterFilter)
DuplicateSnippetFilter (info.ephyra.answerselection.filters.DuplicateSnippetFilter)
ResultLengthFilter (info.ephyra.answerselection.filters.ResultLengthFilter)
SentenceExtractionFilter (info.ephyra.answerselection.filters.SentenceExtractionFilter)
ProperNameFilter (info.ephyra.answerselection.filters.ProperNameFilter)
TermFilter (info.ephyra.answerselection.filters.TermFilter)
CutKeywordsFilter (info.ephyra.answerselection.filters.CutKeywordsFilter)
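
The initOther method registers its filters in a fixed order, and duplicate removal runs before the scoring and length-limiting steps. The following is a minimal, self-contained sketch of that idea, not the Ephyra implementation: the class FilterPipelineSketch and the methods removeDuplicates, limitLength and applyAll are hypothetical names, snippets are modeled as plain strings, and treating two snippets as duplicates when their lowercased, whitespace-normalized text matches is an assumption (the source does not show how DuplicateSnippetFilter compares snippets).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.function.UnaryOperator;

// Hypothetical sketch: an ordered chain of filters over text snippets.
// None of these names belong to the Ephyra API.
final class FilterPipelineSketch {

    // Keeps the first occurrence of each snippet. Two snippets count as
    // duplicates if they match after trimming, lowercasing and collapsing
    // whitespace (an assumed notion of "duplicate").
    static List<String> removeDuplicates(List<String> snippets) {
        Set<String> seen = new HashSet<>();
        List<String> kept = new ArrayList<>();
        for (String snippet : snippets) {
            String key = snippet.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
            if (seen.add(key)) {
                kept.add(snippet);
            }
        }
        return kept;
    }

    // Returns a filter that keeps leading snippets while their combined
    // length stays within maxChars, and drops everything after that.
    static UnaryOperator<List<String>> limitLength(int maxChars) {
        return snippets -> {
            List<String> kept = new ArrayList<>();
            int total = 0;
            for (String snippet : snippets) {
                if (total + snippet.length() > maxChars) {
                    break;
                }
                kept.add(snippet);
                total += snippet.length();
            }
            return kept;
        };
    }

    // Applies the filters in the order in which they were added.
    static List<String> applyAll(List<UnaryOperator<List<String>>> filters,
                                 List<String> snippets) {
        List<String> current = snippets;
        for (UnaryOperator<List<String>> filter : filters) {
            current = filter.apply(current);
        }
        return current;
    }

    public static void main(String[] args) {
        List<UnaryOperator<List<String>>> filters = new ArrayList<>();
        filters.add(FilterPipelineSketch::removeDuplicates);
        filters.add(limitLength(60));

        List<String> snippets = Arrays.asList(
                "Indri is a search engine.",
                "indri  is a search engine. ",
                "Ephyra is a question answering system.");
        System.out.println(applyAll(filters, snippets));
    }
}
```

Running the duplicate filter before the length limit means that repeated snippets do not use up the length budget, which is one plausible reason for the ordering seen in initOther.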
