Example 6 with SparseAveragedPerceptron

use of edu.illinois.cs.cogcomp.lbjava.learn.SparseAveragedPerceptron in project cogcomp-nlp by CogComp.

the class NerBenchmark method trainModel.

/**
 * This method does the actual work of training the model. It requires the config file name, the name of the training
 * directory and a File object for it, and the same for the dev and test directories. The model is trained only on the
 * training data; the dev data is used to test for convergence (rather than running a fixed number of iterations), and
 * the test data is held out to compute the final accuracy results, which are returned.
 * @param confFile the name of the config file.
 * @param trainDirName the name of the train directory.
 * @param trainDir the file object for the train directory.
 * @param devDirName the name of the dev directory.
 * @param devDir the file object for the dev directory.
 * @param testDirName the test directory name.
 * @param testDir the test directory file object.
 * @return the final test results computed on the held-out test data.
 * @throws Exception
 */
private Vector<TestDiscrete[]> trainModel(String confFile, String trainDirName, File trainDir, String devDirName, File devDir, String testDirName, File testDir) throws Exception {
    System.out.println("\n\n----- Training models for evaluation for " + confFile + " ------");
    ParametersForLbjCode prms = Parameters.readConfigAndLoadExternalData(confFile, true);
    ResourceManager rm = new ResourceManager(confFile);
    ModelLoader.load(rm, rm.getString("modelName"), true, prms);
    NETaggerLevel1 taggerLevel1 = (NETaggerLevel1) prms.taggerLevel1;
    NETaggerLevel2 taggerLevel2 = (NETaggerLevel2) prms.taggerLevel2;
    SparseAveragedPerceptron sap1 = (SparseAveragedPerceptron) taggerLevel1.getBaseLTU();
    sap1.setLearningRate(prms.learningRatePredictionsLevel1);
    sap1.setThickness(prms.thicknessPredictionsLevel1);
    System.out.println("L1 learning rate = " + sap1.getLearningRate() + ", thickness = " + sap1.getPositiveThickness());
    if (prms.featuresToUse.containsKey("PredictionsLevel1")) {
        SparseAveragedPerceptron sap2 = (SparseAveragedPerceptron) taggerLevel2.getBaseLTU();
        sap2.setLearningRate(prms.learningRatePredictionsLevel2);
        sap2.setThickness(prms.thicknessPredictionsLevel2);
        System.out.println("L2 learning rate = " + sap2.getLearningRate() + ", thickness = " + sap2.getPositiveThickness());
    }
    // There is a training directory and training is enabled, so train. The model is trained on the
    // training data, while the dev data is used to check for convergence.
    LearningCurveMultiDataset.getLearningCurve(iterations, trainDirName, devDirName, incremental, prms);
    System.out.println("\n\n----- Final results for " + confFile + ", verbose ------");
    NETesterMultiDataset.test(testDirName, true, prms.labelsToIgnoreInEvaluation, prms.labelsToAnonymizeInEvaluation, prms);
    System.out.println("\n\n----- Final results for " + confFile + ", F1 only ------");
    return NETesterMultiDataset.test(testDirName, false, prms.labelsToIgnoreInEvaluation, prms.labelsToAnonymizeInEvaluation, prms);
}
Also used : NETaggerLevel2(edu.illinois.cs.cogcomp.ner.LbjFeatures.NETaggerLevel2) NETaggerLevel1(edu.illinois.cs.cogcomp.ner.LbjFeatures.NETaggerLevel1) ResourceManager(edu.illinois.cs.cogcomp.core.utilities.configuration.ResourceManager) SparseAveragedPerceptron(edu.illinois.cs.cogcomp.lbjava.learn.SparseAveragedPerceptron)
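
Both the benchmark above and the training code below tune two SparseAveragedPerceptron hyperparameters: the learning rate and the positive thickness (margin). As a minimal sketch of how those knobs are set, the values below (0.1 and 3.0) are purely illustrative and not taken from any shipped configuration; the three-argument constructor takes the learning rate, the threshold, and the thickness, matching its use in the examples that follow.

// Illustrative only: the hyperparameter values here are made up for the sketch.
SparseAveragedPerceptron sap = new SparseAveragedPerceptron(0.1, 0, 3.0);
sap.setLearningRate(0.1);   // can also be adjusted after construction, as trainModel does above
sap.setThickness(3.0);
System.out.println("learning rate = " + sap.getLearningRate() + ", thickness = " + sap.getPositiveThickness());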

Example 7 with SparseAveragedPerceptron

use of edu.illinois.cs.cogcomp.lbjava.learn.SparseAveragedPerceptron in project cogcomp-nlp by CogComp.

the class LearningCurveMultiDataset method getLearningCurve.

/**
 * Use fixedNumIterations=-1 to use the automatic convergence criterion; if incremental is true,
 * training starts from the existing model's weights and continues from there. Training data is
 * assumed to be in column format.
 * <p>
 * @param fixedNumIterations if this is > -1, the number of training iterations to run.
 * @param trainDataSet the training data.
 * @param testDataSet the data used to test for convergence.
 * @param incremental true if the existing model should be trained incrementally.
 * @param params the training parameters.
 * @throws Exception
 */
public static void getLearningCurve(Vector<Data> trainDataSet, Vector<Data> testDataSet, int fixedNumIterations, boolean incremental, ParametersForLbjCode params) throws Exception {
    double bestF1Level1 = -1;
    int bestRoundLevel1 = 0;
    // Get the directory name (<configname>.model is appended in LbjTagger/Parameters.java:139)
    String modelPath = params.pathToModelFile;
    String modelPathDir = modelPath.substring(0, modelPath.lastIndexOf("/"));
    if (IOUtils.exists(modelPathDir)) {
        if (!IOUtils.isDirectory(modelPathDir)) {
            String msg = "ERROR: " + NAME + ".getLearningCurve(): model directory '" + modelPathDir + "' already exists as a (non-directory) file.";
            logger.error(msg);
            throw new IOException(msg);
        } else
            logger.warn(NAME + ".getLearningCurve(): writing to existing model path '" + modelPathDir + "'...");
    } else {
        IOUtils.mkdir(modelPathDir);
    }
    NETaggerLevel1.Parameters paramLevel1 = new NETaggerLevel1.Parameters();
    paramLevel1.baseLTU = new SparseAveragedPerceptron(params.learningRatePredictionsLevel1, 0, params.thicknessPredictionsLevel1);
    paramLevel1.baseLTU.featurePruningThreshold = params.featurePruningThreshold;
    logger.info("Level 1 classifier learning rate = " + params.learningRatePredictionsLevel1 + ", thickness = " + params.thicknessPredictionsLevel1);
    NETaggerLevel1 tagger1 = new NETaggerLevel1(paramLevel1, modelPath + ".level1", modelPath + ".level1.lex");
    if (!incremental) {
        logger.info("Training L1 model from scratch.");
        tagger1.forget();
    } else {
        logger.info("Training L1 model incrementally.");
    }
    params.taggerLevel1 = tagger1;
    for (int dataId = 0; dataId < trainDataSet.size(); dataId++) {
        Data trainData = trainDataSet.elementAt(dataId);
        if (params.featuresToUse.containsKey("PredictionsLevel1")) {
            PredictionsAndEntitiesConfidenceScores.getAndMarkEntities(trainData, NEWord.LabelToLookAt.GoldLabel);
            TwoLayerPredictionAggregationFeatures.setLevel1AggregationFeatures(trainData, true);
        }
    }
    // preextract the L1 test and train data.
    String path = params.pathToModelFile;
    String trainPathL1 = path + ".level1.prefetchedTrainData";
    File deleteme = new File(trainPathL1);
    if (deleteme.exists())
        deleteme.delete();
    String testPathL1 = path + ".level1.prefetchedTestData";
    deleteme = new File(testPathL1);
    if (deleteme.exists())
        deleteme.delete();
    logger.info("Pre-extracting the training data for Level 1 classifier, saving to " + trainPathL1);
    BatchTrainer bt1train = prefetchAndGetBatchTrainer(tagger1, trainDataSet, trainPathL1, params);
    logger.info("Pre-extracting the testing data for Level 1 classifier, saving to " + testPathL1);
    BatchTrainer bt1test = prefetchAndGetBatchTrainer(tagger1, testDataSet, testPathL1, params);
    Parser testParser1 = bt1test.getParser();
    // create the best model possible.
    {
        NETaggerLevel1 saveme = null;
        for (int i = 0; (fixedNumIterations == -1 && i < 200 && i - bestRoundLevel1 < 10) || (fixedNumIterations > 0 && i <= fixedNumIterations); ++i) {
            bt1train.train(1);
            testParser1.reset();
            TestDiscrete simpleTest = new TestDiscrete();
            simpleTest.addNull("O");
            TestDiscrete.testDiscrete(simpleTest, tagger1, null, testParser1, true, 0);
            double f1Level1 = simpleTest.getOverallStats()[2];
            if (Double.isNaN(f1Level1))
                f1Level1 = 0;
            if (f1Level1 > bestF1Level1) {
                bestF1Level1 = f1Level1;
                bestRoundLevel1 = i;
                saveme = (NETaggerLevel1) tagger1.clone();
                saveme.beginTraining();
                System.out.println(saveme);
                System.out.println(bestF1Level1);
                System.out.println(f1Level1);
            }
            logger.info(i + " rounds.  Best so far for Level1 : (" + bestRoundLevel1 + ")=" + bestF1Level1);
        }
        saveme.getBaseLTU().featurePruningThreshold = params.featurePruningThreshold;
        saveme.doneTraining();
        saveme.save();
        logger.info("Level 1; best round : " + bestRoundLevel1 + "\tbest F1 : " + bestF1Level1);
    }
    // Read the best model back in, optimize by pruning useless features, then write it out again
    tagger1 = new NETaggerLevel1(paramLevel1, modelPath + ".level1", modelPath + ".level1.lex");
    // trash the l2 prefetch data
    String trainPathL2 = path + ".level2.prefetchedTrainData";
    deleteme = new File(trainPathL2);
    if (deleteme.exists())
        deleteme.delete();
    String testPathL2 = path + ".level2.prefetchedTestData";
    deleteme = new File(testPathL2);
    if (deleteme.exists())
        deleteme.delete();
    NETaggerLevel2.Parameters paramLevel2 = new NETaggerLevel2.Parameters();
    paramLevel2.baseLTU = new SparseAveragedPerceptron(params.learningRatePredictionsLevel2, 0, params.thicknessPredictionsLevel2);
    paramLevel2.baseLTU.featurePruningThreshold = params.featurePruningThreshold;
    NETaggerLevel2 tagger2 = new NETaggerLevel2(paramLevel2, params.pathToModelFile + ".level2", params.pathToModelFile + ".level2.lex");
    if (!incremental) {
        logger.info("Training L2 model from scratch.");
        tagger2.forget();
    } else {
        logger.info("Training L2 model incrementally.");
    }
    params.taggerLevel2 = tagger2;
    // Previously checked if PatternFeatures was in featuresToUse.
    if (params.featuresToUse.containsKey("PredictionsLevel1")) {
        logger.info("Level 2 classifier learning rate = " + params.learningRatePredictionsLevel2 + ", thickness = " + params.thicknessPredictionsLevel2);
        double bestF1Level2 = -1;
        int bestRoundLevel2 = 0;
        logger.info("Pre-extracting the training data for Level 2 classifier, saving to " + trainPathL2);
        BatchTrainer bt2train = prefetchAndGetBatchTrainer(tagger2, trainDataSet, trainPathL2, params);
        logger.info("Pre-extracting the testing data for Level 2 classifier, saving to " + testPathL2);
        BatchTrainer bt2test = prefetchAndGetBatchTrainer(tagger2, testDataSet, testPathL2, params);
        Parser testParser2 = bt2test.getParser();
        // create the best model possible.
        {
            NETaggerLevel2 saveme = null;
            for (int i = 0; (fixedNumIterations == -1 && i < 200 && i - bestRoundLevel2 < 10) || (fixedNumIterations > 0 && i <= fixedNumIterations); ++i) {
                logger.info("Learning level 2 classifier; round " + i);
                bt2train.train(1);
                logger.info("Testing level 2 classifier;  on prefetched data, round: " + i);
                testParser2.reset();
                TestDiscrete simpleTest = new TestDiscrete();
                simpleTest.addNull("O");
                TestDiscrete.testDiscrete(simpleTest, tagger2, null, testParser2, true, 0);
                double f1Level2 = simpleTest.getOverallStats()[2];
                if (f1Level2 >= bestF1Level2) {
                    bestF1Level2 = f1Level2;
                    bestRoundLevel2 = i;
                    saveme = (NETaggerLevel2) tagger2.clone();
                    saveme.beginTraining();
                }
                logger.info(i + " rounds.  Best so far for Level2 : (" + bestRoundLevel2 + ") " + bestF1Level2);
            }
            saveme.getBaseLTU().featurePruningThreshold = params.featurePruningThreshold;
            saveme.doneTraining();
            saveme.save();
        }
        // trash the l2 prefetch data
        deleteme = new File(trainPathL2);
        if (deleteme.exists())
            deleteme.delete();
        deleteme = new File(testPathL2);
        if (deleteme.exists())
            deleteme.delete();
        logger.info("Level1: bestround=" + bestRoundLevel1 + "\t F1=" + bestF1Level1 + "\t Level2: bestround=" + bestRoundLevel2 + "\t F1=" + bestF1Level2);
    }
    NETesterMultiDataset.printTestResultsByDataset(testDataSet, tagger1, tagger2, true, params);
    /*
     * This overrides the saved models, forcing the iteration we are interested in (the
     * fixedNumIterations-th, i.e. the last one) to be saved. Note that both layers are saved for
     * this iteration, so if the best performance for one of the layers came before the final
     * iteration, that layer's performance will decrease slightly.
     */
    if (fixedNumIterations > -1) {
        tagger1.save();
        tagger2.save();
    }
}
Also used : NETaggerLevel2(edu.illinois.cs.cogcomp.ner.LbjFeatures.NETaggerLevel2) NETaggerLevel1(edu.illinois.cs.cogcomp.ner.LbjFeatures.NETaggerLevel1) TestDiscrete(edu.illinois.cs.cogcomp.lbjava.classify.TestDiscrete) IOException(java.io.IOException) Parser(edu.illinois.cs.cogcomp.lbjava.parse.Parser) BatchTrainer(edu.illinois.cs.cogcomp.lbjava.learn.BatchTrainer) File(java.io.File) SparseAveragedPerceptron(edu.illinois.cs.cogcomp.lbjava.learn.SparseAveragedPerceptron)
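
The stopping rule in both training loops above is the same: with fixedNumIterations == -1, training runs for at most 200 rounds and stops once ten consecutive rounds fail to improve the best F1 seen so far; with a positive fixedNumIterations, rounds 0 through fixedNumIterations are run unconditionally. A minimal sketch of that condition, pulled out as a hypothetical helper (the method name and signature are illustrative, not part of the library):

// Hypothetical helper capturing the loop condition used in getLearningCurve above.
private static boolean keepTraining(int round, int bestRound, int fixedNumIterations) {
    if (fixedNumIterations == -1)
        // automatic convergence: cap at 200 rounds, stop after 10 rounds without improvement
        return round < 200 && round - bestRound < 10;
    // fixed schedule: rounds 0 through fixedNumIterations, inclusive
    return fixedNumIterations > 0 && round <= fixedNumIterations;
}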

Example 8 with SparseAveragedPerceptron

use of edu.illinois.cs.cogcomp.lbjava.learn.SparseAveragedPerceptron in project cogcomp-nlp by CogComp.

the class ClassifierComparison method printConstrainedClassifierPerformance.

public static void printConstrainedClassifierPerformance(Parser parser) {
    List<Pair<Classifier, EvaluateDiscrete>> classifiers = new ArrayList<>();
    LocalCommaClassifier learner = new LocalCommaClassifier();
    EvaluateDiscrete unconstrainedPerformance = new EvaluateDiscrete();
    learner.setLTU(new SparseAveragedPerceptron(0.003, 0, 3.5));
    classifiers.add(new Pair<Classifier, EvaluateDiscrete>(new SubstitutePairConstrainedCommaClassifier(), new EvaluateDiscrete()));
    classifiers.add(new Pair<Classifier, EvaluateDiscrete>(new LocativePairConstrainedCommaClassifier(), new EvaluateDiscrete()));
    classifiers.add(new Pair<Classifier, EvaluateDiscrete>(new ListCommasConstrainedCommaClassifier(), new EvaluateDiscrete()));
    classifiers.add(new Pair<Classifier, EvaluateDiscrete>(new OxfordCommaConstrainedCommaClassifier(), new EvaluateDiscrete()));
    int k = 5;
    parser.reset();
    FoldParser foldParser = new FoldParser(parser, k, SplitPolicy.sequential, 0, false);
    for (int i = 0; i < k; foldParser.setPivot(++i)) {
        foldParser.setFromPivot(false);
        foldParser.reset();
        learner.forget();
        BatchTrainer bt = new BatchTrainer(learner, foldParser);
        Lexicon lexicon = bt.preExtract(null);
        learner.setLexicon(lexicon);
        bt.train(250);
        learner.save();
        foldParser.setFromPivot(true);
        foldParser.reset();
        unconstrainedPerformance.reportAll(EvaluateDiscrete.evaluateDiscrete(learner, learner.getLabeler(), foldParser));
        for (Pair<Classifier, EvaluateDiscrete> pair : classifiers) {
            foldParser.reset();
            pair.getSecond().reportAll(EvaluateDiscrete.evaluateDiscrete(pair.getFirst(), learner.getLabeler(), foldParser));
        }
    }
    for (Pair<Classifier, EvaluateDiscrete> pair : classifiers) {
        System.out.println(pair.getFirst().name + " " + pair.getSecond().getOverallStats()[2]);
    }
}
Also used : ListCommasConstrainedCommaClassifier(edu.illinois.cs.cogcomp.comma.lbj.ListCommasConstrainedCommaClassifier) OxfordCommaConstrainedCommaClassifier(edu.illinois.cs.cogcomp.comma.lbj.OxfordCommaConstrainedCommaClassifier) LocativePairConstrainedCommaClassifier(edu.illinois.cs.cogcomp.comma.lbj.LocativePairConstrainedCommaClassifier) Lexicon(edu.illinois.cs.cogcomp.lbjava.learn.Lexicon) ArrayList(java.util.ArrayList) OxfordCommaConstrainedCommaClassifier(edu.illinois.cs.cogcomp.comma.lbj.OxfordCommaConstrainedCommaClassifier) Classifier(edu.illinois.cs.cogcomp.lbjava.classify.Classifier) LocativePairConstrainedCommaClassifier(edu.illinois.cs.cogcomp.comma.lbj.LocativePairConstrainedCommaClassifier) StructuredCommaClassifier(edu.illinois.cs.cogcomp.comma.sl.StructuredCommaClassifier) SubstitutePairConstrainedCommaClassifier(edu.illinois.cs.cogcomp.comma.lbj.SubstitutePairConstrainedCommaClassifier) LocalCommaClassifier(edu.illinois.cs.cogcomp.comma.lbj.LocalCommaClassifier) ListCommasConstrainedCommaClassifier(edu.illinois.cs.cogcomp.comma.lbj.ListCommasConstrainedCommaClassifier) BatchTrainer(edu.illinois.cs.cogcomp.lbjava.learn.BatchTrainer) EvaluateDiscrete(edu.illinois.cs.cogcomp.comma.utils.EvaluateDiscrete) SubstitutePairConstrainedCommaClassifier(edu.illinois.cs.cogcomp.comma.lbj.SubstitutePairConstrainedCommaClassifier) SparseAveragedPerceptron(edu.illinois.cs.cogcomp.lbjava.learn.SparseAveragedPerceptron) Pair(edu.illinois.cs.cogcomp.core.datastructures.Pair) LocalCommaClassifier(edu.illinois.cs.cogcomp.comma.lbj.LocalCommaClassifier) FoldParser(edu.illinois.cs.cogcomp.lbjava.parse.FoldParser)
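
Both comma-classifier examples drive five-fold cross-validation through FoldParser, using the slightly unusual loop increment foldParser.setPivot(++i) to advance the held-out fold. A minimal sketch of that idiom, assuming a Parser over comma instances named parser is already available (the training and evaluation steps are elided):

// Sketch of the k-fold iteration pattern shared by Examples 8 and 9.
int k = 5;
parser.reset();
FoldParser foldParser = new FoldParser(parser, k, SplitPolicy.sequential, 0, false);
for (int i = 0; i < k; foldParser.setPivot(++i)) {
    foldParser.setFromPivot(false);  // serve the k - 1 training folds
    foldParser.reset();
    // ... train a learner on the training folds ...
    foldParser.setFromPivot(true);   // switch to the held-out pivot fold
    foldParser.reset();
    // ... evaluate the learner on the held-out fold ...
}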

Example 9 with SparseAveragedPerceptron

use of edu.illinois.cs.cogcomp.lbjava.learn.SparseAveragedPerceptron in project cogcomp-nlp by CogComp.

the class ClassifierComparison method localCVal.

public static EvaluateDiscrete localCVal(boolean trainOnGold, boolean testOnGold, Parser parser, int learningRounds, double learningRate, double threshold, double thickness, boolean testOnTrain) {
    int k = 5;
    LocalCommaClassifier learner = new LocalCommaClassifier();
    learner.setLTU(new SparseAveragedPerceptron(learningRate, threshold, thickness));
    parser.reset();
    final FoldParser foldParser = new FoldParser(parser, k, SplitPolicy.sequential, 0, false);
    EvaluateDiscrete performanceRecord = new EvaluateDiscrete();
    for (int i = 0; i < k; foldParser.setPivot(++i)) {
        foldParser.setFromPivot(false);
        foldParser.reset();
        learner.forget();
        BatchTrainer bt = new BatchTrainer(learner, foldParser);
        Comma.useGoldFeatures(trainOnGold);
        Lexicon lexicon = bt.preExtract(null);
        learner.setLexicon(lexicon);
        bt.train(learningRounds);
        if (!testOnTrain)
            foldParser.setFromPivot(true);
        foldParser.reset();
        Comma.useGoldFeatures(testOnGold);
        EvaluateDiscrete currentPerformance = EvaluateDiscrete.evaluateDiscrete(learner, learner.getLabeler(), foldParser);
        performanceRecord.reportAll(currentPerformance);
    }
    // System.out.println(performanceRecord.getOverallStats()[2]);
    performanceRecord.printPerformance(System.out);
    // performanceRecord.printConfusion(System.out);
    return performanceRecord;
}
Also used : BatchTrainer(edu.illinois.cs.cogcomp.lbjava.learn.BatchTrainer) EvaluateDiscrete(edu.illinois.cs.cogcomp.comma.utils.EvaluateDiscrete) Lexicon(edu.illinois.cs.cogcomp.lbjava.learn.Lexicon) SparseAveragedPerceptron(edu.illinois.cs.cogcomp.lbjava.learn.SparseAveragedPerceptron) LocalCommaClassifier(edu.illinois.cs.cogcomp.comma.lbj.LocalCommaClassifier) FoldParser(edu.illinois.cs.cogcomp.lbjava.parse.FoldParser)
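
As a usage note, localCVal can be called directly to cross-validate a single hyperparameter setting. A minimal sketch, where parser is assumed to already supply the comma instances and the values (250 rounds, learning rate 0.003, threshold 0, thickness 3.5) are simply the ones appearing in Example 8 rather than recommended settings:

// Illustrative call only; 'parser' is assumed to exist and the hyperparameters are taken from Example 8.
EvaluateDiscrete cv = ClassifierComparison.localCVal(true, true, parser, 250, 0.003, 0, 3.5, false);
System.out.println("5-fold F1: " + cv.getOverallStats()[2]);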

Aggregations

SparseAveragedPerceptron (edu.illinois.cs.cogcomp.lbjava.learn.SparseAveragedPerceptron) 9
BatchTrainer (edu.illinois.cs.cogcomp.lbjava.learn.BatchTrainer) 4
NETaggerLevel1 (edu.illinois.cs.cogcomp.ner.LbjFeatures.NETaggerLevel1) 3
NETaggerLevel2 (edu.illinois.cs.cogcomp.ner.LbjFeatures.NETaggerLevel2) 3
LocalCommaClassifier (edu.illinois.cs.cogcomp.comma.lbj.LocalCommaClassifier) 2
EvaluateDiscrete (edu.illinois.cs.cogcomp.comma.utils.EvaluateDiscrete) 2
OVector (edu.illinois.cs.cogcomp.core.datastructures.vectors.OVector) 2
Feature (edu.illinois.cs.cogcomp.lbjava.classify.Feature) 2
FeatureVector (edu.illinois.cs.cogcomp.lbjava.classify.FeatureVector) 2
TestDiscrete (edu.illinois.cs.cogcomp.lbjava.classify.TestDiscrete) 2
Lexicon (edu.illinois.cs.cogcomp.lbjava.learn.Lexicon) 2
AveragedWeightVector (edu.illinois.cs.cogcomp.lbjava.learn.SparseAveragedPerceptron.AveragedWeightVector) 2
SparseNetworkLearner (edu.illinois.cs.cogcomp.lbjava.learn.SparseNetworkLearner) 2
FoldParser (edu.illinois.cs.cogcomp.lbjava.parse.FoldParser) 2
LinkedVector (edu.illinois.cs.cogcomp.lbjava.parse.LinkedVector) 2
Parser (edu.illinois.cs.cogcomp.lbjava.parse.Parser) 2
File (java.io.File) 2
IOException (java.io.IOException) 2
HashMap (java.util.HashMap) 2
Map (java.util.Map) 2