Example 1 with MutableInstance

Use of edu.cmu.minorthird.classify.MutableInstance in project lucida by claritylab.

Class EnglishFeatureExtractor, method createInstance.

/**
 * Creates and populates an Instance from the terms and parse tree of a
 * question. All features are binary features of one of the following types:
 *
 * Word-level features:
 * <ul>
 *   <li>UNIGRAM : individual words in the question
 *   <li>BIGRAM : pairs of adjacent words in the question
 *   <li>WH_WORD : the wh-word in the question, if one exists
 * </ul>
 *
 * Syntactic features:
 * <ul>
 *   <li>MAIN_VERB : the syntactic head of the sentence, as defined in
 *   {@link edu.cmu.lti.chineseNLP.util.TreeHelper TreeHelper}
 *   <li>FOCUS_ADJ : the adjective following a wh-word (e.g. 'long' in 'How long is it?')
 *   <li>WH_DET : whether or not the wh-word is the determiner of a noun phrase, as in 'which printer'
 * </ul>
 *
 * Semantic features:
 * <ul>
 *   <li>FOCUS_TYPE : the semantic type of the focus word
 * </ul>
 *
 * @param terms the Terms that make up the question
 * @param parseTree the syntactic parse tree of the question, in String format
 * @return the populated Instance
 */
public Instance createInstance(List<Term> terms, String parseTree) {
    String question = "";
    for (Term term : terms) question += term + " ";
    question = question.trim();
    MutableInstance instance = new MutableInstance(question);
    // find the focus word
    log.debug("Parse: " + parseTree);
    Tree tree = TreeHelper.buildTree(parseTree, Tree.ENGLISH);
    Term focus = FocusFinder.findFocusTerm(tree);
    if (focus != null)
        log.debug("Focus: " + focus.getText());
    addWordLevelFeatures(instance, terms, focus);
    addSyntacticFeatures(instance, terms, parseTree, focus);
    addSemanticFeatures(instance, focus);
    return instance;
}
Also used : MutableInstance(edu.cmu.minorthird.classify.MutableInstance) Tree(edu.cmu.lti.chineseNLP.util.Tree) Term(edu.cmu.lti.javelin.qa.Term)
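
The word-level features listed in the Javadoc of Example 1 can be illustrated without MinorThird. The addWordLevelFeatures helper is not shown in that example, so the following is only a guess at its general shape: the class name WordLevelFeatureSketch, the wh-word list, and the dot-separated feature names are all assumptions, and plain strings stand in for MinorThird Feature objects.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch: plain strings stand in for MinorThird Feature objects.
class WordLevelFeatureSketch {

    // Assumed wh-word list; the list used by the real extractor is not shown.
    private static final List<String> WH_WORDS = Arrays.asList(
            "what", "which", "who", "whom", "whose", "when", "where", "why", "how");

    /** Returns the names of the binary features that fire for the given tokens. */
    static Set<String> wordLevelFeatures(List<String> tokens) {
        Set<String> features = new LinkedHashSet<>();
        String previous = null;
        boolean whWordSeen = false;
        for (String token : tokens) {
            String word = token.toLowerCase(Locale.ROOT);
            // UNIGRAM: one feature per individual word
            features.add("UNIGRAM." + word);
            // BIGRAM: one feature per pair of adjacent words
            if (previous != null) {
                features.add("BIGRAM." + previous + "_" + word);
            }
            // WH_WORD: only the first wh-word in the question is recorded
            if (!whWordSeen && WH_WORDS.contains(word)) {
                features.add("WH_WORD." + word);
                whWordSeen = true;
            }
            previous = word;
        }
        return features;
    }

    public static void main(String[] args) {
        System.out.println(wordLevelFeatures(Arrays.asList("How", "long", "is", "it")));
    }
}
```

Because the features are binary, a set is enough here: a word that occurs twice still produces a single UNIGRAM entry.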

Example 2 with MutableInstance

Use of edu.cmu.minorthird.classify.MutableInstance in project lucida by claritylab.

Class ScoreNormalizationFilter, method createInstance.

/**
	 * Creates an instance for training/evaluation or classification from an
	 * answer candidate, using the question ID as a subpopulation ID.
	 * 
	 * @param features selected features
	 * @param result answer candidate
	 * @param results all answers to the question
	 * @param qid question ID
	 * @return instance for training/evaluation or classification
	 */
private static Instance createInstance(String[] features, Result result, Result[] results, String qid) {
    // create instance from source object and subpopulation ID
    MutableInstance instance = new MutableInstance(result, qid);
    // add selected features to the instance
    addSelectedFeatures(instance, features, result, results);
    return instance;
}
Also used : MutableInstance(edu.cmu.minorthird.classify.MutableInstance)

Example 3 with MutableInstance

Use of edu.cmu.minorthird.classify.MutableInstance in project lucida by claritylab.

Class ScoreNormalizationFilter, method createInstance.

/**
	 * Creates an instance for training/evaluation or classification from an
	 * answer candidate.
	 * 
	 * @param features selected features
	 * @param result answer candidate
	 * @param results all answers to the question
	 * @return instance for training/evaluation or classification
	 */
private static Instance createInstance(String[] features, Result result, Result[] results) {
    // create instance from source object
    MutableInstance instance = new MutableInstance(result);
    // add selected features to the instance
    addSelectedFeatures(instance, features, result, results);
    return instance;
}
Also used : MutableInstance(edu.cmu.minorthird.classify.MutableInstance)
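
Examples 2 and 3 differ only in whether a subpopulation ID (here the question ID) is attached to the instance. A common reason to group instances this way is to keep all answer candidates of one question on the same side of a train/test split. Whether ScoreNormalizationFilter relies on this is not shown in the source, so the sketch below is only an illustration of the idea: SubpopulationSplitSketch and assignFolds are hypothetical names, and plain strings stand in for instances.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: assigns whole subpopulations, not single instances, to folds.
class SubpopulationSplitSketch {

    /**
     * Maps each distinct subpopulation ID to a fold index, round-robin in
     * order of first appearance, so that instances sharing an ID share a fold.
     */
    static Map<String, Integer> assignFolds(List<String> subpopulationIds, int numFolds) {
        Map<String, Integer> foldOf = new LinkedHashMap<>();
        for (String id : subpopulationIds) {
            if (!foldOf.containsKey(id)) {
                // size() is the number of distinct IDs seen so far
                foldOf.put(id, foldOf.size() % numFolds);
            }
        }
        return foldOf;
    }

    public static void main(String[] args) {
        // One entry per answer candidate; the value is the candidate's question ID.
        List<String> qids = Arrays.asList("q1", "q1", "q2", "q3", "q2");
        System.out.println(assignFolds(qids, 2));
    }
}
```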

Example 4 with MutableInstance

Use of edu.cmu.minorthird.classify.MutableInstance in project lucida by claritylab.

Class QuestionClassifier, method getAnswerTypes.

/**
     * Classifies the question represented by the given List of Terms and parse tree 
     * as having a particular answer type and possibly subtype. 
     * 
     * @param terms the Terms that make up the question to classify
     * @param parseTreeStr the syntactic parse tree of the question, in String format
     * 
     * @return the candidate answer type / subtypes.
     * @throws Exception if the classifier has not been initialized
     */
public List<AnswerType> getAnswerTypes(List<Term> terms, String parseTreeStr) throws Exception {
    if (!isInitialized())
        throw new Exception("getAnswerTypes called while not initialized");
    String question = "";
    for (Term term : terms) question += term.getText() + " ";
    // create the instance
    Instance instance = new MutableInstance(question);
    if (extractor != null)
        instance = extractor.createInstance(terms, parseTreeStr);
    return classify(instance);
}
Also used : Instance(edu.cmu.minorthird.classify.Instance) MutableInstance(edu.cmu.minorthird.classify.MutableInstance) Term(edu.cmu.lti.javelin.qa.Term) IOException(java.io.IOException)

Example 5 with MutableInstance

Use of edu.cmu.minorthird.classify.MutableInstance in project lucida by claritylab.

Class HierarchicalClassifierTrainer, method makeDataset.

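/**
 * Loads the examples in the given file and builds a Dataset from them,
 * keeping only examples whose label is in classLabels and only features
 * whose first name part is in featureTypes. While loadTraining is set, the
 * labels seen are recorded as training labels; afterwards, examples whose
 * label was not seen in training are discarded.
 *
 * @param fileName the file to load examples from
 * @return the filtered Dataset
 */
private Dataset makeDataset(String fileName) {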
    if (trainingLabels == null) {
        loadTraining = true;
        trainingLabels = new HashSet<String>();
    }
    Dataset set = new BasicDataset();
    extractor.setUseClassLevels(useClassLevels);
    extractor.setClassLevels(learnerNames.length);
    Example[] examples = extractor.loadFile(fileName);
    for (int i = 0; i < examples.length; i++) {
        String label = examples[i].getLabel().bestClassName();
        if (classLabels.contains(label)) {
            MutableInstance instance = new MutableInstance(examples[i].getSource(), examples[i].getSubpopulationId());
            Feature.Looper bLooper = examples[i].binaryFeatureIterator();
            while (bLooper.hasNext()) {
                Feature f = bLooper.nextFeature();
                if (featureTypes.contains(f.getPart(0))) {
                    instance.addBinary(f);
                }
            }
            Feature.Looper nLooper = examples[i].numericFeatureIterator();
            while (nLooper.hasNext()) {
                Feature f = nLooper.nextFeature();
                if (featureTypes.contains(f.getPart(0))) {
                    instance.addNumeric(f, examples[i].getWeight(f));
                }
            }
            Example example = new Example(instance, examples[i].getLabel());
            MLToolkit.println(example);
            if (loadTraining) {
                trainingLabels.add(label);
                set.add(example);
            } else {
                if (!trainingLabels.contains(label))
                    MLToolkit.println("Label of test example not found in training set (discarding): " + label);
                else
                    set.add(example);
            }
        } else {
            MLToolkit.println("Discarding example for Class: " + label);
        }
    }
    if (loadTraining)
        loadTraining = false;
    MLToolkit.println("Loaded " + set.size() + " examples for experiment from " + fileName);
    return set;
}
Also used : BasicDataset(edu.cmu.minorthird.classify.BasicDataset) CrossValidatedDataset(edu.cmu.minorthird.classify.experiments.CrossValidatedDataset) Dataset(edu.cmu.minorthird.classify.Dataset) Example(edu.cmu.minorthird.classify.Example) MutableInstance(edu.cmu.minorthird.classify.MutableInstance) Feature(edu.cmu.minorthird.classify.Feature)
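
The check featureTypes.contains(f.getPart(0)) in makeDataset keeps a feature only when the leading part of its name is one of the selected feature types. A minimal sketch of that filter follows, assuming feature names are dot-separated strings such as UNIGRAM.how. FeatureTypeFilterSketch, firstPart and filterByType are hypothetical names, and a map from feature name to weight stands in for a MinorThird instance.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: a name-to-weight map stands in for a MinorThird instance.
class FeatureTypeFilterSketch {

    /** Returns the text before the first dot, or the whole name if there is no dot. */
    static String firstPart(String featureName) {
        int dot = featureName.indexOf('.');
        return dot < 0 ? featureName : featureName.substring(0, dot);
    }

    /** Copies only the features whose first name part is an allowed type. */
    static Map<String, Double> filterByType(Map<String, Double> weights, Set<String> allowedTypes) {
        Map<String, Double> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Double> entry : weights.entrySet()) {
            if (allowedTypes.contains(firstPart(entry.getKey()))) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Double> weights = new LinkedHashMap<>();
        weights.put("UNIGRAM.how", 1.0);
        weights.put("BIGRAM.how_long", 1.0);
        weights.put("FOCUS_TYPE.duration", 1.0);
        Set<String> allowed = new HashSet<>(Arrays.asList("UNIGRAM", "FOCUS_TYPE"));
        System.out.println(filterByType(weights, allowed).keySet());
    }
}
```

As in makeDataset, the filter copies matching features into a fresh container instead of mutating the loaded example.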

Aggregations

MutableInstance (edu.cmu.minorthird.classify.MutableInstance): 7
Term (edu.cmu.lti.javelin.qa.Term): 3
Feature (edu.cmu.minorthird.classify.Feature): 2
Tree (edu.cmu.lti.chineseNLP.util.Tree): 1
BasicDataset (edu.cmu.minorthird.classify.BasicDataset): 1
Dataset (edu.cmu.minorthird.classify.Dataset): 1
Example (edu.cmu.minorthird.classify.Example): 1
Instance (edu.cmu.minorthird.classify.Instance): 1
CrossValidatedDataset (edu.cmu.minorthird.classify.experiments.CrossValidatedDataset): 1
IOException (java.io.IOException): 1
StringReader (java.io.StringReader): 1
ArrayList (java.util.ArrayList): 1
List (java.util.List): 1
DocumentBuilder (javax.xml.parsers.DocumentBuilder): 1
DocumentBuilderFactory (javax.xml.parsers.DocumentBuilderFactory): 1
Document (org.w3c.dom.Document): 1
InputSource (org.xml.sax.InputSource): 1