
Example 1 with ModelInfo

Use of edu.illinois.cs.cogcomp.verbsense.core.ModelInfo in project cogcomp-nlp by CogComp.

Class PruningPreExtractor, method consume.

@Override
protected void consume(Pair<SenseInstance, SenseStructure> input) {
    SenseInstance x = input.getFirst();
    SenseStructure y = input.getSecond();
    FeatureVector features = x.getCachedFeatureVector();
    ModelInfo modelInfo = manager.getModelInfo();
    Lexicon lexicon = modelInfo.getLexicon();
    int threshold = manager.getPruneSize();
    Pair<int[], float[]> pair = lexicon.pruneFeaturesByCount(features.getIdx(), features.getValue(), threshold);
    features = new FeatureVector(pair.getFirst(), pair.getSecond());
    synchronized (buffer) {
        buffer.add(new PreExtractRecord(x.getPredicateLemma(), y.getLabel(), features));
    }
    if (buffer.size() > 10000) {
        synchronized (buffer) {
            if (buffer.size() > 10000) {
                for (PreExtractRecord r : buffer) {
                    try {
                        cache.put(r.lemma, r.label, r.features);
                    } catch (Exception e) {
                        throw new RuntimeException(e);
                    }
                }
                buffer.clear();
            }
        }
    }
    counter.incrementAndGet();
}
Also used: FeatureVector (edu.illinois.cs.cogcomp.sl.util.FeatureVector), SenseStructure (edu.illinois.cs.cogcomp.verbsense.jlis.SenseStructure), ModelInfo (edu.illinois.cs.cogcomp.verbsense.core.ModelInfo), SenseInstance (edu.illinois.cs.cogcomp.verbsense.jlis.SenseInstance), Lexicon (edu.illinois.cs.cogcomp.core.datastructures.Lexicon)
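
The consume method above buffers records and writes them to the cache once more than 10000 have accumulated. The following is a minimal, self-contained sketch of that buffer-and-flush pattern, not the project's implementation. BufferedSink, its flushThreshold parameter, and the Consumer used in place of cache.put(...) are hypothetical names introduced for illustration. Unlike the original, the sketch checks the size inside the same lock that guards the add, so the buffer is never read without synchronization.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-in for the buffer/cache pair used in consume(). */
class BufferedSink<T> {
    private final List<T> buffer = new ArrayList<>();
    private final int flushThreshold;
    // Assumption: the writer plays the role of cache.put(...) in the original.
    private final Consumer<T> writer;

    BufferedSink(int flushThreshold, Consumer<T> writer) {
        this.flushThreshold = flushThreshold;
        this.writer = writer;
    }

    /** Adds a record and flushes once the buffer exceeds the threshold. */
    void add(T record) {
        synchronized (buffer) {
            buffer.add(record);
            if (buffer.size() > flushThreshold) {
                flushLocked();
            }
        }
    }

    /** Writes out whatever is still buffered, for example at end of input. */
    void flush() {
        synchronized (buffer) {
            flushLocked();
        }
    }

    // Caller must hold the lock on buffer.
    private void flushLocked() {
        for (T record : buffer) {
            writer.accept(record);
        }
        buffer.clear();
    }
}
```

A caller would construct one sink per cache, call add from each consumer thread, and call flush once after the last record so that a partially filled buffer is not lost.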

Example 2 with ModelInfo

Use of edu.illinois.cs.cogcomp.verbsense.core.ModelInfo in project cogcomp-nlp by CogComp.

Class SenseInstance, method cacheFeatureVector.

public void cacheFeatureVector(Set<Feature> features) {
    Map<String, Float> featureMap = new HashMap<>();
    for (Feature f : features) {
        featureMap.put(f.getName(), f.getValue());
    }
    ModelInfo modelInfo = manager.getModelInfo();
    Pair<int[], float[]> feats = modelInfo.getLexicon().getFeatureVector(featureMap);
    this.cacheFeatureVector(new FeatureVector(feats.getFirst(), feats.getSecond()));
}
Also used: FeatureVector (edu.illinois.cs.cogcomp.sl.util.FeatureVector), ModelInfo (edu.illinois.cs.cogcomp.verbsense.core.ModelInfo), Feature (edu.illinois.cs.cogcomp.edison.features.Feature)
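
The method above turns a name-to-value map into a pair of parallel arrays (feature ids and feature values) by way of the lexicon. The sketch below illustrates one way such a conversion could work. SparseVectorSketch and the plain Map used as the name-to-id table are hypothetical, and the choice to silently drop names that have no id is an assumption, not a statement about how Lexicon.getFeatureVector behaves.

```java
import java.util.Arrays;
import java.util.Map;

/** Hypothetical sparse vector: ids[i] is the feature id for values[i]. */
class SparseVectorSketch {
    final int[] ids;
    final float[] values;

    SparseVectorSketch(int[] ids, float[] values) {
        this.ids = ids;
        this.values = values;
    }

    /**
     * Converts a name-to-value map into parallel id/value arrays.
     * Assumption: names missing from nameToId are skipped.
     */
    static SparseVectorSketch fromFeatureMap(Map<String, Float> featureMap,
                                             Map<String, Integer> nameToId) {
        int[] ids = new int[featureMap.size()];
        float[] values = new float[featureMap.size()];
        int n = 0;
        for (Map.Entry<String, Float> e : featureMap.entrySet()) {
            Integer id = nameToId.get(e.getKey());
            if (id == null) {
                continue; // unknown feature name
            }
            ids[n] = id;
            values[n] = e.getValue();
            n++;
        }
        // Trim both arrays to the number of features actually kept.
        return new SparseVectorSketch(Arrays.copyOf(ids, n), Arrays.copyOf(values, n));
    }
}
```

The entries come out in the iteration order of the input map, so a caller that needs ids in ascending order would have to sort the pairs afterwards.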

Example 3 with ModelInfo

Use of edu.illinois.cs.cogcomp.verbsense.core.ModelInfo in project cogcomp-nlp by CogComp.

Class PreExtractor, method countFeatures.

/**
 * This is where the actual feature extraction takes place. The features are defined in the
 * <b>features.fex</b> file and are read by {@link FeatureExtractor}.
 *
 * @param x The predicate to extract features from.
 * @throws EdisonException if feature extraction fails
 */
public void countFeatures(SenseInstance x) throws EdisonException {
    ModelInfo modelInfo = manager.getModelInfo();
    Set<Feature> feats = modelInfo.fex.getFeatures(x.getConstituent());
    // This is the only place where a new feature can be added to the lexicon.
    List<Integer> ids = new ArrayList<>();
    List<Float> values = new ArrayList<>();
    synchronized (lexicon) {
        for (Feature f : feats) {
            if (addNewFeatures) {
                if (!lexicon.contains(f.getName())) {
                    lexicon.previewFeature(f.getName());
                }
            } else if (!lexicon.contains(f.getName())) {
                continue;
            }
            int featureId = lexicon.lookupId(f.getName());
            lexicon.countFeature(featureId);
            ids.add(featureId);
            values.add(f.getValue());
        }
    }
    x.cacheFeatureVector(new FeatureVector(ArrayUtilities.asIntArray(ids), ArrayUtilities.asFloatArray(values)));
}
Also used: AtomicInteger (java.util.concurrent.atomic.AtomicInteger), FeatureVector (edu.illinois.cs.cogcomp.sl.util.FeatureVector), ModelInfo (edu.illinois.cs.cogcomp.verbsense.core.ModelInfo), Feature (edu.illinois.cs.cogcomp.edison.features.Feature)
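
countFeatures above either registers unseen feature names or skips them, depending on addNewFeatures, and then increments a per-feature count. A minimal sketch of a lexicon with that behavior might look as follows. CountingLexiconSketch and its methods are hypothetical and do not mirror the real Lexicon API; returning -1 for a skipped feature is a choice made for the sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical lexicon that assigns dense ids and counts occurrences. */
class CountingLexiconSketch {
    private final Map<String, Integer> nameToId = new HashMap<>();
    private final List<Integer> counts = new ArrayList<>();

    /**
     * Counts one occurrence of the named feature and returns its id.
     * If the name is unknown and addNewFeatures is false, nothing is
     * counted and -1 is returned.
     */
    synchronized int observe(String name, boolean addNewFeatures) {
        Integer id = nameToId.get(name);
        if (id == null) {
            if (!addNewFeatures) {
                return -1;
            }
            id = nameToId.size(); // next free dense id
            nameToId.put(name, id);
            counts.add(0);
        }
        counts.set(id, counts.get(id) + 1);
        return id;
    }

    /** Returns how often the feature was observed, or 0 if unknown. */
    synchronized int count(String name) {
        Integer id = nameToId.get(name);
        return id == null ? 0 : counts.get(id);
    }

    synchronized int size() {
        return nameToId.size();
    }
}
```

Making the methods synchronized serves the same purpose as the synchronized (lexicon) block in the original: lookups, insertions, and count updates from several extractor threads cannot interleave.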

Example 4 with ModelInfo

Use of edu.illinois.cs.cogcomp.verbsense.core.ModelInfo in project cogcomp-nlp by CogComp.

Class SenseInstance, method cacheAllFeatureVectors.

public void cacheAllFeatureVectors() {
    ModelInfo modelInfo = manager.getModelInfo();
    try {
        Set<Feature> feats = modelInfo.fex.getFeatures(getConstituent());
        cacheFeatureVector(feats);
    } catch (Exception e) {
        log.error("Unable to extract features for {}", this, e);
        throw new RuntimeException(e);
    }
}
Also used: ModelInfo (edu.illinois.cs.cogcomp.verbsense.core.ModelInfo), Feature (edu.illinois.cs.cogcomp.edison.features.Feature)

Example 5 with ModelInfo

Use of edu.illinois.cs.cogcomp.verbsense.core.ModelInfo in project cogcomp-nlp by CogComp.

Class VerbSenseClassifierMain, method preExtract.

@CommandDescription(description = "Pre-extracts the features for the verb-sense model. Run this before training.", usage = "preExtract")
public static void preExtract() throws Exception {
    SenseManager manager = getManager(true);
    ResourceManager conf = new VerbSenseConfigurator().getDefaultConfig();
    // If models directory doesn't exist create it
    if (!IOUtils.isDirectory(conf.getString(conf.getString(VerbSenseConfigurator.MODELS_DIRECTORY))))
        IOUtils.mkdir(conf.getString(conf.getString(VerbSenseConfigurator.MODELS_DIRECTORY)));
    int numConsumers = Runtime.getRuntime().availableProcessors();
    Dataset dataset = Dataset.PTBTrainDev;
    log.info("Pre-extracting features");
    ModelInfo modelInfo = manager.getModelInfo();
    String featureSet = "" + modelInfo.featureManifest.getIncludedFeatures().hashCode();
    String allDataCacheFile = VerbSenseConfigurator.getFeatureCacheFile(featureSet, dataset, rm);
    FeatureVectorCacheFile featureCache = preExtract(numConsumers, manager, dataset, allDataCacheFile);
    pruneFeatures(numConsumers, manager, featureCache, VerbSenseConfigurator.getPrunedFeatureCacheFile(featureSet, rm));
    Lexicon lexicon = modelInfo.getLexicon().getPrunedLexicon(manager.getPruneSize());
    log.info("Saving lexicon with {} features to {}", lexicon.size(), manager.getLexiconFileName());
    log.info(lexicon.size() + " features in the lexicon");
    lexicon.save(manager.getLexiconFileName());
}
Also used: ModelInfo (edu.illinois.cs.cogcomp.verbsense.core.ModelInfo), VerbSenseConfigurator (edu.illinois.cs.cogcomp.verbsense.utilities.VerbSenseConfigurator), Dataset (edu.illinois.cs.cogcomp.verbsense.data.Dataset), Lexicon (edu.illinois.cs.cogcomp.core.datastructures.Lexicon), SenseManager (edu.illinois.cs.cogcomp.verbsense.core.SenseManager), ResourceManager (edu.illinois.cs.cogcomp.core.utilities.configuration.ResourceManager), FeatureVectorCacheFile (edu.illinois.cs.cogcomp.verbsense.caches.FeatureVectorCacheFile), CommandDescription (edu.illinois.cs.cogcomp.core.utilities.commands.CommandDescription)
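
preExtract ends by saving a pruned lexicon obtained from getPrunedLexicon(manager.getPruneSize()). The sketch below shows one plausible reading of count-based pruning: keep only features seen more often than a threshold and renumber the survivors densely. LexiconPruningSketch and pruneByCount are hypothetical, and both the strict "greater than" comparison and the renumbering are assumptions rather than documented behavior of the real Lexicon class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical count-based pruning of a feature lexicon. */
class LexiconPruningSketch {

    /**
     * Returns a new name-to-id table containing only the features whose
     * count exceeds the threshold. Surviving features receive fresh dense
     * ids in the iteration order of the input map.
     */
    static Map<String, Integer> pruneByCount(Map<String, Integer> featureCounts,
                                             int threshold) {
        Map<String, Integer> pruned = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : featureCounts.entrySet()) {
            // Assumption: strictly more than `threshold` occurrences are required.
            if (e.getValue() > threshold) {
                pruned.put(e.getKey(), pruned.size());
            }
        }
        return pruned;
    }
}
```

Because the ids change after pruning, any feature vectors cached before pruning would need to be remapped or pruned with the same threshold; the pruneFeatures call in preExtract appears to serve that purpose for the cached vectors.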

Aggregations

ModelInfo (edu.illinois.cs.cogcomp.verbsense.core.ModelInfo): 6
Feature (edu.illinois.cs.cogcomp.edison.features.Feature): 3
FeatureVector (edu.illinois.cs.cogcomp.sl.util.FeatureVector): 3
Lexicon (edu.illinois.cs.cogcomp.core.datastructures.Lexicon): 2
CommandDescription (edu.illinois.cs.cogcomp.core.utilities.commands.CommandDescription): 2
FeatureVectorCacheFile (edu.illinois.cs.cogcomp.verbsense.caches.FeatureVectorCacheFile): 2
SenseManager (edu.illinois.cs.cogcomp.verbsense.core.SenseManager): 2
ResourceManager (edu.illinois.cs.cogcomp.core.utilities.configuration.ResourceManager): 1
StructuredProblem (edu.illinois.cs.cogcomp.sl.core.StructuredProblem): 1
AbstractInferenceSolver (edu.illinois.cs.cogcomp.sl.inference.AbstractInferenceSolver): 1
WeightVector (edu.illinois.cs.cogcomp.sl.util.WeightVector): 1
Dataset (edu.illinois.cs.cogcomp.verbsense.data.Dataset): 1
MulticlassInference (edu.illinois.cs.cogcomp.verbsense.inference.MulticlassInference): 1
SenseInstance (edu.illinois.cs.cogcomp.verbsense.jlis.SenseInstance): 1
SenseStructure (edu.illinois.cs.cogcomp.verbsense.jlis.SenseStructure): 1
LearnerParameters (edu.illinois.cs.cogcomp.verbsense.learn.LearnerParameters): 1
VerbSenseConfigurator (edu.illinois.cs.cogcomp.verbsense.utilities.VerbSenseConfigurator): 1
AtomicInteger (java.util.concurrent.atomic.AtomicInteger): 1