Example 1 with QuestionInterpretationG

Use of info.ephyra.querygeneration.generators.QuestionInterpretationG in project lucida by claritylab.

Class OpenEphyraCorpus, method initFactoidCorpus.

/**
 * Initializes the pipeline for factoid questions, using a local corpus as a
 * knowledge source.
 */
protected void initFactoidCorpus() {
    // question analysis
    Ontology wordNet = new WordNet();
    // - dictionaries for term extraction
    QuestionAnalysis.clearDictionaries();
    QuestionAnalysis.addDictionary(wordNet);
    // - ontologies for term expansion
    QuestionAnalysis.clearOntologies();
    QuestionAnalysis.addOntology(wordNet);
    // query generation
    QueryGeneration.clearQueryGenerators();
    QueryGeneration.addQueryGenerator(new BagOfWordsG());
    QueryGeneration.addQueryGenerator(new BagOfTermsG());
    QueryGeneration.addQueryGenerator(new PredicateG());
    QueryGeneration.addQueryGenerator(new QuestionInterpretationG());
    QueryGeneration.addQueryGenerator(new QuestionReformulationG());
    // search
    // - knowledge miners for unstructured knowledge sources
    Search.clearKnowledgeMiners();
    for (String[] indriIndices : IndriKM.getIndriIndices()) {
        Search.addKnowledgeMiner(new IndriKM(indriIndices, false));
    }
    for (String[] indriServers : IndriKM.getIndriServers()) {
        Search.addKnowledgeMiner(new IndriKM(indriServers, true));
    }
    // - knowledge annotators for (semi-)structured knowledge sources
    Search.clearKnowledgeAnnotators();
    // answer extraction and selection
    // (the filters are applied in this order)
    AnswerSelection.clearFilters();
    // - answer extraction filters
    AnswerSelection.addFilter(new AnswerTypeFilter());
    AnswerSelection.addFilter(new AnswerPatternFilter());
    AnswerSelection.addFilter(new WebDocumentFetcherFilter());
    AnswerSelection.addFilter(new PredicateExtractionFilter());
    AnswerSelection.addFilter(new FactoidsFromPredicatesFilter());
    AnswerSelection.addFilter(new TruncationFilter());
    // - answer selection filters
}
Also used : Ontology(info.ephyra.nlp.semantics.ontologies.Ontology) AnswerPatternFilter(info.ephyra.answerselection.filters.AnswerPatternFilter) PredicateExtractionFilter(info.ephyra.answerselection.filters.PredicateExtractionFilter) WebDocumentFetcherFilter(info.ephyra.answerselection.filters.WebDocumentFetcherFilter) IndriKM(info.ephyra.search.searchers.IndriKM) TruncationFilter(info.ephyra.answerselection.filters.TruncationFilter) WordNet(info.ephyra.nlp.semantics.ontologies.WordNet) BagOfWordsG(info.ephyra.querygeneration.generators.BagOfWordsG) PredicateG(info.ephyra.querygeneration.generators.PredicateG) AnswerTypeFilter(info.ephyra.answerselection.filters.AnswerTypeFilter) QuestionReformulationG(info.ephyra.querygeneration.generators.QuestionReformulationG) BagOfTermsG(info.ephyra.querygeneration.generators.BagOfTermsG) QuestionInterpretationG(info.ephyra.querygeneration.generators.QuestionInterpretationG) FactoidsFromPredicatesFilter(info.ephyra.answerselection.filters.FactoidsFromPredicatesFilter)

Example 2 with QuestionInterpretationG

Use of info.ephyra.querygeneration.generators.QuestionInterpretationG in project lucida by claritylab.

Class OpenEphyra, method initFactoid.

/**
 * Initializes the pipeline for factoid questions.
 */
protected void initFactoid() {
    // question analysis
    Ontology wordNet = new WordNet();
    // - dictionaries for term extraction
    QuestionAnalysis.clearDictionaries();
    QuestionAnalysis.addDictionary(wordNet);
    // - ontologies for term expansion
    QuestionAnalysis.clearOntologies();
    QuestionAnalysis.addOntology(wordNet);
    // query generation
    QueryGeneration.clearQueryGenerators();
    QueryGeneration.addQueryGenerator(new BagOfWordsG());
    QueryGeneration.addQueryGenerator(new BagOfTermsG());
    QueryGeneration.addQueryGenerator(new PredicateG());
    QueryGeneration.addQueryGenerator(new QuestionInterpretationG());
    QueryGeneration.addQueryGenerator(new QuestionReformulationG());
    // search
    // - knowledge miners for unstructured knowledge sources
    Search.clearKnowledgeMiners();
    // Search.addKnowledgeMiner(new YahooKM());
    for (String[] indriIndices : IndriKM.getIndriIndices()) {
        Search.addKnowledgeMiner(new IndriKM(indriIndices, false));
    }
    // for (String[] indriServers : IndriKM.getIndriServers())
    //     Search.addKnowledgeMiner(new IndriKM(indriServers, true));
    // - knowledge annotators for (semi-)structured knowledge sources
    Search.clearKnowledgeAnnotators();
    // answer extraction and selection
    // (the filters are applied in this order)
    AnswerSelection.clearFilters();
    // - answer extraction filters
    AnswerSelection.addFilter(new AnswerTypeFilter());
    AnswerSelection.addFilter(new AnswerPatternFilter());
    // AnswerSelection.addFilter(new WebDocumentFetcherFilter());
    AnswerSelection.addFilter(new PredicateExtractionFilter());
    AnswerSelection.addFilter(new FactoidsFromPredicatesFilter());
    AnswerSelection.addFilter(new TruncationFilter());
    // - answer selection filters
    AnswerSelection.addFilter(new StopwordFilter());
    AnswerSelection.addFilter(new QuestionKeywordsFilter());
    AnswerSelection.addFilter(new ScoreNormalizationFilter(NORMALIZER));
    AnswerSelection.addFilter(new ScoreCombinationFilter());
    AnswerSelection.addFilter(new FactoidSubsetFilter());
    AnswerSelection.addFilter(new DuplicateFilter());
    AnswerSelection.addFilter(new ScoreSorterFilter());
}
Also used : ScoreCombinationFilter(info.ephyra.answerselection.filters.ScoreCombinationFilter) ScoreSorterFilter(info.ephyra.answerselection.filters.ScoreSorterFilter) Ontology(info.ephyra.nlp.semantics.ontologies.Ontology) AnswerPatternFilter(info.ephyra.answerselection.filters.AnswerPatternFilter) PredicateExtractionFilter(info.ephyra.answerselection.filters.PredicateExtractionFilter) ScoreNormalizationFilter(info.ephyra.answerselection.filters.ScoreNormalizationFilter) IndriKM(info.ephyra.search.searchers.IndriKM) StopwordFilter(info.ephyra.answerselection.filters.StopwordFilter) TruncationFilter(info.ephyra.answerselection.filters.TruncationFilter) WordNet(info.ephyra.nlp.semantics.ontologies.WordNet) BagOfWordsG(info.ephyra.querygeneration.generators.BagOfWordsG) PredicateG(info.ephyra.querygeneration.generators.PredicateG) AnswerTypeFilter(info.ephyra.answerselection.filters.AnswerTypeFilter) QuestionReformulationG(info.ephyra.querygeneration.generators.QuestionReformulationG) QuestionKeywordsFilter(info.ephyra.answerselection.filters.QuestionKeywordsFilter) DuplicateFilter(info.ephyra.answerselection.filters.DuplicateFilter) FactoidSubsetFilter(info.ephyra.answerselection.filters.FactoidSubsetFilter) BagOfTermsG(info.ephyra.querygeneration.generators.BagOfTermsG) QuestionInterpretationG(info.ephyra.querygeneration.generators.QuestionInterpretationG) FactoidsFromPredicatesFilter(info.ephyra.answerselection.filters.FactoidsFromPredicatesFilter)
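
The comment "(the filters are applied in this order)" matters because a chain of filters can give different results depending on registration order. The following is a minimal sketch of that clear-then-register pattern, not Ephyra's actual AnswerSelection implementation. The class FilterChainSketch and the two string-based filters (cutAtComma, dropDuplicates) are hypothetical stand-ins chosen only to make the ordering effect visible.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical static registry that applies filters in registration order. */
final class FilterChainSketch {
    private static final List<UnaryOperator<List<String>>> FILTERS = new ArrayList<>();

    private FilterChainSketch() {
    }

    /** Removes all registered filters, mirroring the clear-before-add calls above. */
    static void clearFilters() {
        FILTERS.clear();
    }

    /** Appends a filter; it runs after every filter registered before it. */
    static void addFilter(UnaryOperator<List<String>> filter) {
        FILTERS.add(filter);
    }

    /** Feeds the output of each filter into the next one. */
    static List<String> apply(List<String> results) {
        List<String> current = new ArrayList<>(results);
        for (UnaryOperator<List<String>> filter : FILTERS) {
            current = filter.apply(current);
        }
        return current;
    }

    /** Hypothetical filter: keeps only the text before the first comma. */
    static List<String> cutAtComma(List<String> in) {
        List<String> out = new ArrayList<>();
        for (String s : in) {
            int idx = s.indexOf(',');
            out.add(idx < 0 ? s : s.substring(0, idx).trim());
        }
        return out;
    }

    /** Hypothetical filter: drops repeated strings, keeping first occurrences. */
    static List<String> dropDuplicates(List<String> in) {
        return new ArrayList<>(new LinkedHashSet<>(in));
    }
}
```

With the input "Mozart, W.", "Mozart", "Mozart", registering cutAtComma before dropDuplicates leaves a single "Mozart". Registering them the other way round leaves two, because the duplicates only appear after the comma is cut.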

Example 3 with QuestionInterpretationG

Use of info.ephyra.querygeneration.generators.QuestionInterpretationG in project lucida by claritylab.

Class OpenEphyraServer, method initFactoid.

/**
 * Initializes the pipeline for factoid questions.
 */
protected void initFactoid() {
    // question analysis
    Ontology wordNet = new WordNet();
    // - dictionaries for term extraction
    QuestionAnalysis.clearDictionaries();
    QuestionAnalysis.addDictionary(wordNet);
    // - ontologies for term expansion
    QuestionAnalysis.clearOntologies();
    QuestionAnalysis.addOntology(wordNet);
    // query generation
    QueryGeneration.clearQueryGenerators();
    QueryGeneration.addQueryGenerator(new BagOfWordsG());
    QueryGeneration.addQueryGenerator(new BagOfTermsG());
    QueryGeneration.addQueryGenerator(new PredicateG());
    QueryGeneration.addQueryGenerator(new QuestionInterpretationG());
    QueryGeneration.addQueryGenerator(new QuestionReformulationG());
    // search
    // - knowledge miners for unstructured knowledge sources
    Search.clearKnowledgeMiners();
    for (String[] indriIndices : IndriKM.getIndriIndices()) {
        Search.addKnowledgeMiner(new IndriKM(indriIndices, false));
    }
    // - knowledge annotators for (semi-)structured knowledge sources
    Search.clearKnowledgeAnnotators();
    /* Search.addKnowledgeAnnotator(new WikipediaKA("list.txt")); */
    // answer extraction and selection
    // (the filters are applied in this order)
    AnswerSelection.clearFilters();
    // - answer extraction filters
    AnswerSelection.addFilter(new AnswerTypeFilter());
    AnswerSelection.addFilter(new AnswerPatternFilter());
    AnswerSelection.addFilter(new PredicateExtractionFilter());
    AnswerSelection.addFilter(new FactoidsFromPredicatesFilter());
    AnswerSelection.addFilter(new TruncationFilter());
    // - answer selection filters
    AnswerSelection.addFilter(new StopwordFilter());
    AnswerSelection.addFilter(new QuestionKeywordsFilter());
    AnswerSelection.addFilter(new ScoreNormalizationFilter(NORMALIZER));
    AnswerSelection.addFilter(new ScoreCombinationFilter());
    AnswerSelection.addFilter(new FactoidSubsetFilter());
    AnswerSelection.addFilter(new DuplicateFilter());
    AnswerSelection.addFilter(new ScoreSorterFilter());
}
Also used : ScoreCombinationFilter(info.ephyra.answerselection.filters.ScoreCombinationFilter) ScoreSorterFilter(info.ephyra.answerselection.filters.ScoreSorterFilter) Ontology(info.ephyra.nlp.semantics.ontologies.Ontology) AnswerPatternFilter(info.ephyra.answerselection.filters.AnswerPatternFilter) PredicateExtractionFilter(info.ephyra.answerselection.filters.PredicateExtractionFilter) ScoreNormalizationFilter(info.ephyra.answerselection.filters.ScoreNormalizationFilter) IndriKM(info.ephyra.search.searchers.IndriKM) StopwordFilter(info.ephyra.answerselection.filters.StopwordFilter) TruncationFilter(info.ephyra.answerselection.filters.TruncationFilter) WordNet(info.ephyra.nlp.semantics.ontologies.WordNet) BagOfWordsG(info.ephyra.querygeneration.generators.BagOfWordsG) PredicateG(info.ephyra.querygeneration.generators.PredicateG) AnswerTypeFilter(info.ephyra.answerselection.filters.AnswerTypeFilter) QuestionReformulationG(info.ephyra.querygeneration.generators.QuestionReformulationG) QuestionKeywordsFilter(info.ephyra.answerselection.filters.QuestionKeywordsFilter) DuplicateFilter(info.ephyra.answerselection.filters.DuplicateFilter) FactoidSubsetFilter(info.ephyra.answerselection.filters.FactoidSubsetFilter) BagOfTermsG(info.ephyra.querygeneration.generators.BagOfTermsG) QuestionInterpretationG(info.ephyra.querygeneration.generators.QuestionInterpretationG) FactoidsFromPredicatesFilter(info.ephyra.answerselection.filters.FactoidsFromPredicatesFilter)

Example 4 with QuestionInterpretationG

Use of info.ephyra.querygeneration.generators.QuestionInterpretationG in project lucida by claritylab.

Class PatternLearner, method formQueries.

/**
 * Loads target-context-answer-regex tuples from resource files and forms
 * queries.
 *
 * @param dir directory containing the target-context-answer-regex tuples
 * @return queries formed from the tuples
 */
private static Query[] formQueries(String dir) {
    QuestionInterpretationG queryGenerator = new QuestionInterpretationG();
    ArrayList<Query> results = new ArrayList<Query>();
    File[] files = FileUtils.getFiles(dir);
    BufferedReader in;
    String[] tuple, context, kws;
    String prop, line, target, as, regex, queryString;
    QuestionInterpretation qi;
    Query query;
    try {
        for (File file : files) {
            prop = file.getName();
            in = new BufferedReader(new FileReader(file));
            try {
                while (in.ready()) {
                    line = in.readLine().trim();
                    // skip blank lines and comments
                    if (line.length() == 0 || line.startsWith("//")) {
                        continue;
                    }
                    // extract interpretation, answer string and pattern
                    tuple = line.split("#", -1);
                    target = tuple[0];
                    context = new String[tuple.length - 3];
                    for (int i = 1; i < tuple.length - 2; i++) {
                        context[i - 1] = tuple[i];
                    }
                    as = tuple[tuple.length - 2];
                    regex = tuple[tuple.length - 1];
                    // complement answer string or regular expression
                    if (as.equals("")) {
                        as = RegexConverter.regexToQueryStr(regex);
                    } else if (regex.equals("")) {
                        regex = RegexConverter.strToRegex(as);
                    }
                    // create query object
                    qi = new QuestionInterpretation(target, context, prop);
                    kws = new String[] { "\"" + as + "\"" };
                    queryString = queryGenerator.queryString(target, context, kws);
                    query = new Query(queryString, null, 0);
                    query.setInterpretation(qi);
                    // store query, answer and regular expression
                    // (ass and regexs are declared elsewhere in the enclosing class)
                    results.add(query);
                    ass.put(queryString, as);
                    regexs.put(queryString, regex);
                }
            } finally {
                // close the reader even if reading fails, so the file handle is not leaked
                in.close();
            }
        }
    } catch (IOException e) {
        return new Query[0];
    }
    return results.toArray(new Query[results.size()]);
}
Also used : QuestionInterpretation(info.ephyra.questionanalysis.QuestionInterpretation) Query(info.ephyra.querygeneration.Query) ArrayList(java.util.ArrayList) IOException(java.io.IOException) BufferedReader(java.io.BufferedReader) QuestionInterpretationG(info.ephyra.querygeneration.generators.QuestionInterpretationG) FileReader(java.io.FileReader) File(java.io.File)
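
The tuple layout that formQueries reads can be hard to see from the index arithmetic alone: each line is "target#context1#...#contextN#answer#regex", where the context list may be empty and either the answer or the regex may be blank. The sketch below isolates just that parsing step. The class TupleLine is a hypothetical helper, not part of Ephyra, and the check for fewer than three fields is an added assumption (the method above would instead fail on such a line).

```java
import java.util.Arrays;

/** Hypothetical holder for one parsed target-context-answer-regex line. */
final class TupleLine {
    final String target;
    final String[] context;
    final String answer;
    final String regex;

    private TupleLine(String target, String[] context, String answer, String regex) {
        this.target = target;
        this.context = context;
        this.answer = answer;
        this.regex = regex;
    }

    /**
     * Parses one line of the form "target#context1#...#contextN#answer#regex".
     * Returns null for blank lines, "//" comment lines, and lines with fewer
     * than three fields (the last case is an assumption of this sketch).
     */
    static TupleLine parse(String rawLine) {
        String line = rawLine.trim();
        if (line.isEmpty() || line.startsWith("//")) {
            return null;
        }
        // A negative limit keeps trailing empty fields, so a blank regex
        // at the end of the line still counts as a field.
        String[] tuple = line.split("#", -1);
        if (tuple.length < 3) {
            return null;
        }
        String target = tuple[0];
        // Everything between the target and the last two fields is context.
        String[] context = Arrays.copyOfRange(tuple, 1, tuple.length - 2);
        String answer = tuple[tuple.length - 2];
        String regex = tuple[tuple.length - 1];
        return new TupleLine(target, context, answer, regex);
    }
}
```

For example, TupleLine.parse("Mozart#composer#Salzburg#1756#") yields the target "Mozart", two context entries, the answer "1756", and an empty regex, which is the case where formQueries derives the regex from the answer string.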

Example 5 with QuestionInterpretationG

Use of info.ephyra.querygeneration.generators.QuestionInterpretationG in project lucida by claritylab.

Class OpenEphyraCorpus, method initFactoidWeb.

/**
 * Initializes the pipeline for factoid questions, using the Web as a
 * knowledge source.
 *
 * @param resultsCorp results retrieved from the corpus
 */
protected void initFactoidWeb(Result[] resultsCorp) {
    // question analysis
    Ontology wordNet = new WordNet();
    // - dictionaries for term extraction
    QuestionAnalysis.clearDictionaries();
    QuestionAnalysis.addDictionary(wordNet);
    // - ontologies for term expansion
    QuestionAnalysis.clearOntologies();
    QuestionAnalysis.addOntology(wordNet);
    // query generation
    QueryGeneration.clearQueryGenerators();
    QueryGeneration.addQueryGenerator(new BagOfWordsG());
    QueryGeneration.addQueryGenerator(new BagOfTermsG());
    QueryGeneration.addQueryGenerator(new PredicateG());
    QueryGeneration.addQueryGenerator(new QuestionInterpretationG());
    QueryGeneration.addQueryGenerator(new QuestionReformulationG());
    // search
    // - knowledge miners for unstructured knowledge sources
    Search.clearKnowledgeMiners();
    Search.addKnowledgeMiner(new BingKM());
    // Search.addKnowledgeMiner(new GoogleKM());
    // Search.addKnowledgeMiner(new YahooKM());
    // - knowledge annotators for (semi-)structured knowledge sources
    Search.clearKnowledgeAnnotators();
    // answer extraction and selection
    // (the filters are applied in this order)
    AnswerSelection.clearFilters();
    // - answer extraction filters
    AnswerSelection.addFilter(new AnswerTypeFilter());
    AnswerSelection.addFilter(new AnswerPatternFilter());
    AnswerSelection.addFilter(new WebDocumentFetcherFilter());
    AnswerSelection.addFilter(new PredicateExtractionFilter());
    AnswerSelection.addFilter(new FactoidsFromPredicatesFilter());
    AnswerSelection.addFilter(new TruncationFilter());
    // - answer selection filters
    AnswerSelection.addFilter(new StopwordFilter());
    AnswerSelection.addFilter(new QuestionKeywordsFilter());
    AnswerSelection.addFilter(new AnswerProjectionFilter(resultsCorp));
    AnswerSelection.addFilter(new ScoreNormalizationFilter(NORMALIZER));
    AnswerSelection.addFilter(new ScoreCombinationFilter());
    AnswerSelection.addFilter(new FactoidSubsetFilter());
    AnswerSelection.addFilter(new DuplicateFilter());
    AnswerSelection.addFilter(new ScoreSorterFilter());
    AnswerSelection.addFilter(new ResultLengthFilter());
}
Also used : ScoreCombinationFilter(info.ephyra.answerselection.filters.ScoreCombinationFilter) ScoreSorterFilter(info.ephyra.answerselection.filters.ScoreSorterFilter) Ontology(info.ephyra.nlp.semantics.ontologies.Ontology) AnswerPatternFilter(info.ephyra.answerselection.filters.AnswerPatternFilter) PredicateExtractionFilter(info.ephyra.answerselection.filters.PredicateExtractionFilter) ScoreNormalizationFilter(info.ephyra.answerselection.filters.ScoreNormalizationFilter) WebDocumentFetcherFilter(info.ephyra.answerselection.filters.WebDocumentFetcherFilter) StopwordFilter(info.ephyra.answerselection.filters.StopwordFilter) TruncationFilter(info.ephyra.answerselection.filters.TruncationFilter) WordNet(info.ephyra.nlp.semantics.ontologies.WordNet) BagOfWordsG(info.ephyra.querygeneration.generators.BagOfWordsG) PredicateG(info.ephyra.querygeneration.generators.PredicateG) AnswerTypeFilter(info.ephyra.answerselection.filters.AnswerTypeFilter) ResultLengthFilter(info.ephyra.answerselection.filters.ResultLengthFilter) QuestionReformulationG(info.ephyra.querygeneration.generators.QuestionReformulationG) QuestionKeywordsFilter(info.ephyra.answerselection.filters.QuestionKeywordsFilter) DuplicateFilter(info.ephyra.answerselection.filters.DuplicateFilter) FactoidSubsetFilter(info.ephyra.answerselection.filters.FactoidSubsetFilter) BagOfTermsG(info.ephyra.querygeneration.generators.BagOfTermsG) BingKM(info.ephyra.search.searchers.BingKM) QuestionInterpretationG(info.ephyra.querygeneration.generators.QuestionInterpretationG) FactoidsFromPredicatesFilter(info.ephyra.answerselection.filters.FactoidsFromPredicatesFilter) AnswerProjectionFilter(info.ephyra.answerselection.filters.AnswerProjectionFilter)

Aggregations

QuestionInterpretationG (info.ephyra.querygeneration.generators.QuestionInterpretationG): 5
AnswerPatternFilter (info.ephyra.answerselection.filters.AnswerPatternFilter): 4
AnswerTypeFilter (info.ephyra.answerselection.filters.AnswerTypeFilter): 4
FactoidsFromPredicatesFilter (info.ephyra.answerselection.filters.FactoidsFromPredicatesFilter): 4
PredicateExtractionFilter (info.ephyra.answerselection.filters.PredicateExtractionFilter): 4
TruncationFilter (info.ephyra.answerselection.filters.TruncationFilter): 4
Ontology (info.ephyra.nlp.semantics.ontologies.Ontology): 4
WordNet (info.ephyra.nlp.semantics.ontologies.WordNet): 4
BagOfTermsG (info.ephyra.querygeneration.generators.BagOfTermsG): 4
BagOfWordsG (info.ephyra.querygeneration.generators.BagOfWordsG): 4
PredicateG (info.ephyra.querygeneration.generators.PredicateG): 4
QuestionReformulationG (info.ephyra.querygeneration.generators.QuestionReformulationG): 4
DuplicateFilter (info.ephyra.answerselection.filters.DuplicateFilter): 3
FactoidSubsetFilter (info.ephyra.answerselection.filters.FactoidSubsetFilter): 3
QuestionKeywordsFilter (info.ephyra.answerselection.filters.QuestionKeywordsFilter): 3
ScoreCombinationFilter (info.ephyra.answerselection.filters.ScoreCombinationFilter): 3
ScoreNormalizationFilter (info.ephyra.answerselection.filters.ScoreNormalizationFilter): 3
ScoreSorterFilter (info.ephyra.answerselection.filters.ScoreSorterFilter): 3
StopwordFilter (info.ephyra.answerselection.filters.StopwordFilter): 3
IndriKM (info.ephyra.search.searchers.IndriKM): 3