Example 1 with Dataset

use of edu.illinois.cs.cogcomp.verbsense.data.Dataset in project cogcomp-nlp by CogComp.

the class VerbSenseClassifierMain method preExtract.

@CommandDescription(description = "Pre-extracts the features for the verb-sense model. Run this before training.", usage = "preExtract")
public static void preExtract() throws Exception {
    SenseManager manager = getManager(true);
    ResourceManager conf = new VerbSenseConfigurator().getDefaultConfig();
    // If models directory doesn't exist create it
    if (!IOUtils.isDirectory(conf.getString(VerbSenseConfigurator.MODELS_DIRECTORY)))
        IOUtils.mkdir(conf.getString(VerbSenseConfigurator.MODELS_DIRECTORY));
    int numConsumers = Runtime.getRuntime().availableProcessors();
    Dataset dataset = Dataset.PTBTrainDev;
    log.info("Pre-extracting features");
    ModelInfo modelInfo = manager.getModelInfo();
    String featureSet = "" + modelInfo.featureManifest.getIncludedFeatures().hashCode();
    String allDataCacheFile = VerbSenseConfigurator.getFeatureCacheFile(featureSet, dataset, rm);
    FeatureVectorCacheFile featureCache = preExtract(numConsumers, manager, dataset, allDataCacheFile);
    pruneFeatures(numConsumers, manager, featureCache, VerbSenseConfigurator.getPrunedFeatureCacheFile(featureSet, rm));
    Lexicon lexicon = modelInfo.getLexicon().getPrunedLexicon(manager.getPruneSize());
    log.info("Saving lexicon with {} features to {}", lexicon.size(), manager.getLexiconFileName());
    log.info(lexicon.size() + " features in the lexicon");
    lexicon.save(manager.getLexiconFileName());
}
Also used : ModelInfo(edu.illinois.cs.cogcomp.verbsense.core.ModelInfo) VerbSenseConfigurator(edu.illinois.cs.cogcomp.verbsense.utilities.VerbSenseConfigurator) Dataset(edu.illinois.cs.cogcomp.verbsense.data.Dataset) Lexicon(edu.illinois.cs.cogcomp.core.datastructures.Lexicon) SenseManager(edu.illinois.cs.cogcomp.verbsense.core.SenseManager) ResourceManager(edu.illinois.cs.cogcomp.core.utilities.configuration.ResourceManager) FeatureVectorCacheFile(edu.illinois.cs.cogcomp.verbsense.caches.FeatureVectorCacheFile) CommandDescription(edu.illinois.cs.cogcomp.core.utilities.commands.CommandDescription)
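
Example 1 names its feature cache after the hash code of the included feature set, so a changed feature set leads to a different cache file. The following is a minimal sketch of that idea only, not the project's implementation: the class `FeatureCacheKeySketch`, both helper methods, and the file-name layout are hypothetical, and it assumes the features can be treated as a collection of strings.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.TreeSet;

class FeatureCacheKeySketch {

    // Hypothetical helper: derives a cache key from a collection of feature names.
    // A Set's hashCode is the sum of its elements' hash codes, so the key does not
    // depend on the order in which the features are listed.
    static String cacheKey(Collection<String> featureNames) {
        return Integer.toString(new TreeSet<>(featureNames).hashCode());
    }

    // Hypothetical file-name layout; the real project may use a different one.
    static String cacheFileName(String directory, String key, String datasetName) {
        return directory + "/features." + key + "." + datasetName + ".cache";
    }

    public static void main(String[] args) {
        String key = cacheKey(Arrays.asList("word", "pos", "lemma"));
        System.out.println(cacheFileName("cache", key, "PTBTrainDev"));
    }
}
```

A hash code is not a unique identifier: two different feature sets can collide, so a key like this is a convenience for spotting stale caches rather than a guarantee.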

Example 2 with Dataset

use of edu.illinois.cs.cogcomp.verbsense.data.Dataset in project cogcomp-nlp by CogComp.

the class VerbSenseClassifierMain method evaluate.

@CommandDescription(description = "Performs evaluation.", usage = "evaluate")
public static void evaluate() throws Exception {
    SenseManager manager = getManager(false);
    Dataset testSet = Dataset.PTBTest;
    ILPSolverFactory solver = new ILPSolverFactory(ILPSolverFactory.SolverType.JLISCuttingPlaneGurobi);
    ClassificationTester senseTester = new ClassificationTester();
    long start = System.currentTimeMillis();
    int count = 0;
    manager.getModelInfo().loadWeightVector();
    IResetableIterator<TextAnnotation> dataset = SentenceDBHandler.instance.getDataset(testSet);
    while (dataset.hasNext()) {
        TextAnnotation ta = dataset.next();
        if (!ta.hasView(SenseManager.getGoldViewName()))
            continue;
        TokenLabelView gold = (TokenLabelView) ta.getView(SenseManager.getGoldViewName());
        ILPInference inference = manager.getInference(solver, gold.getConstituents());
        assert inference != null;
        TokenLabelView prediction = inference.getOutputView();
        evaluateSense(gold, prediction, senseTester);
        count++;
        if (count % 1000 == 0) {
            long end = System.currentTimeMillis();
            log.info(count + " sentences done. Took " + (end - start) + "ms, Micro-F1 so far = " + senseTester.getMicroF1());
        }
    }
    long end = System.currentTimeMillis();
    System.out.println(count + " sentences done. Took " + (end - start) + "ms");
    System.out.println("\n\n* Sense");
    System.out.println(senseTester.getPerformanceTable(false).toOrgTable());
}
Also used : ILPSolverFactory(edu.illinois.cs.cogcomp.infer.ilp.ILPSolverFactory) Dataset(edu.illinois.cs.cogcomp.verbsense.data.Dataset) ClassificationTester(edu.illinois.cs.cogcomp.core.experiments.ClassificationTester) SenseManager(edu.illinois.cs.cogcomp.verbsense.core.SenseManager) TokenLabelView(edu.illinois.cs.cogcomp.core.datastructures.textannotation.TokenLabelView) TextAnnotation(edu.illinois.cs.cogcomp.core.datastructures.textannotation.TextAnnotation) ILPInference(edu.illinois.cs.cogcomp.verbsense.inference.ILPInference) CommandDescription(edu.illinois.cs.cogcomp.core.utilities.commands.CommandDescription)
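
Example 2 logs a running Micro-F1 while it iterates over the test set. As a rough illustration of what such a score measures, the sketch below pools true positives, false positives, and false negatives over all labels before computing precision and recall. The class `MicroF1Sketch` and its methods are hypothetical and are not claimed to match how `ClassificationTester` works internally.

```java
class MicroF1Sketch {
    private int truePositives;
    private int falsePositives;
    private int falseNegatives;

    // Records one gold/predicted label pair. A wrong prediction counts as a
    // false positive for the predicted label and a false negative for the gold label.
    void record(String gold, String predicted) {
        if (gold.equals(predicted)) {
            truePositives++;
        } else {
            falsePositives++;
            falseNegatives++;
        }
    }

    // Micro-averaged F1: precision and recall are computed from the pooled counts.
    double microF1() {
        double precision = ratio(truePositives, truePositives + falsePositives);
        double recall = ratio(truePositives, truePositives + falseNegatives);
        return (precision + recall == 0.0) ? 0.0 : 2 * precision * recall / (precision + recall);
    }

    private static double ratio(int numerator, int denominator) {
        return denominator == 0 ? 0.0 : (double) numerator / denominator;
    }

    public static void main(String[] args) {
        MicroF1Sketch tester = new MicroF1Sketch();
        tester.record("run.01", "run.01");
        tester.record("run.02", "run.01"); // one mistake
        tester.record("eat.01", "eat.01");
        tester.record("see.01", "see.01");
        System.out.println("Micro-F1 = " + tester.microF1());
    }
}
```

When every instance has exactly one gold label and one predicted label, as in this sketch, the pooled precision and recall are equal and micro-F1 reduces to plain accuracy.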

Example 3 with Dataset

use of edu.illinois.cs.cogcomp.verbsense.data.Dataset in project cogcomp-nlp by CogComp.

the class SentenceDBHandler method initializeDatasets.

public void initializeDatasets(String dbFile) {
    Connection connection = DBHelper.getConnection(dbFile);
    for (Dataset d : Dataset.values()) {
        PreparedStatement stmt;
        try {
            stmt = connection.prepareStatement("select * from datasets where name = ?");
            stmt.setString(1, d.name());
            ResultSet rs = stmt.executeQuery();
            if (!rs.next()) {
                stmt = connection.prepareStatement("insert into datasets(name) values (?)");
                stmt.setString(1, d.name());
                stmt.executeUpdate();
            }
        } catch (SQLException e) {
            log.error("Error with database access", e);
            throw new RuntimeException(e);
        }
    }
}
Also used : SQLException(java.sql.SQLException) Dataset(edu.illinois.cs.cogcomp.verbsense.data.Dataset) Connection(java.sql.Connection) ResultSet(java.sql.ResultSet) PreparedStatement(java.sql.PreparedStatement)
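
Example 3 walks over every enum constant and inserts a row only when no row with that name exists, so the method can be called repeatedly without creating duplicates. The sketch below shows the same check-then-insert pattern with an in-memory `Set<String>` standing in for the `datasets` table, so it runs without a database. The class `DatasetRegistrySketch` and the enum `Split` are hypothetical stand-ins, not the project's `Dataset` enum.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class DatasetRegistrySketch {

    // Hypothetical stand-in for the project's Dataset enum.
    enum Split { TRAIN, DEV, TEST }

    // Adds the name of every enum constant that is not yet present in the
    // "table" and returns how many names were newly added.
    static int registerMissing(Set<String> table) {
        int added = 0;
        for (Split split : Split.values()) {
            // Set.add returns false when the name is already there,
            // which mirrors the "select, then insert if absent" check.
            if (table.add(split.name())) {
                added++;
            }
        }
        return added;
    }

    public static void main(String[] args) {
        Set<String> table = new LinkedHashSet<>();
        System.out.println("first call added " + registerMissing(table));
        System.out.println("second call added " + registerMissing(table));
    }
}
```

With a real JDBC connection, the statements and result set would normally be opened in try-with-resources blocks so that they are closed even when a query fails.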

Aggregations

Dataset (edu.illinois.cs.cogcomp.verbsense.data.Dataset): 3
CommandDescription (edu.illinois.cs.cogcomp.core.utilities.commands.CommandDescription): 2
SenseManager (edu.illinois.cs.cogcomp.verbsense.core.SenseManager): 2
Lexicon (edu.illinois.cs.cogcomp.core.datastructures.Lexicon): 1
TextAnnotation (edu.illinois.cs.cogcomp.core.datastructures.textannotation.TextAnnotation): 1
TokenLabelView (edu.illinois.cs.cogcomp.core.datastructures.textannotation.TokenLabelView): 1
ClassificationTester (edu.illinois.cs.cogcomp.core.experiments.ClassificationTester): 1
ResourceManager (edu.illinois.cs.cogcomp.core.utilities.configuration.ResourceManager): 1
ILPSolverFactory (edu.illinois.cs.cogcomp.infer.ilp.ILPSolverFactory): 1
FeatureVectorCacheFile (edu.illinois.cs.cogcomp.verbsense.caches.FeatureVectorCacheFile): 1
ModelInfo (edu.illinois.cs.cogcomp.verbsense.core.ModelInfo): 1
ILPInference (edu.illinois.cs.cogcomp.verbsense.inference.ILPInference): 1
VerbSenseConfigurator (edu.illinois.cs.cogcomp.verbsense.utilities.VerbSenseConfigurator): 1
Connection (java.sql.Connection): 1
PreparedStatement (java.sql.PreparedStatement): 1
ResultSet (java.sql.ResultSet): 1
SQLException (java.sql.SQLException): 1