Example 6 with ClassificationEvent

Use of com.joliciel.talismane.machineLearning.ClassificationEvent in project talismane by joliciel-informatique.

In class PosTagEventStream, method next.

@Override
public ClassificationEvent next() throws TalismaneException, IOException {
    ClassificationEvent event = null;
    if (this.hasNext()) {
        PosTaggedToken taggedToken = currentSentence.get(currentIndex++);
        String classification = taggedToken.getTag().getCode();
        if (LOG.isDebugEnabled())
            LOG.debug("next event, token: " + taggedToken.getToken().getAnalyisText() + " : " + classification);
        PosTaggerContext context = new PosTaggerContextImpl(taggedToken.getToken(), currentHistory);
        List<FeatureResult<?>> posTagFeatureResults = new ArrayList<FeatureResult<?>>();
        for (PosTaggerFeature<?> posTaggerFeature : posTaggerFeatures) {
            RuntimeEnvironment env = new RuntimeEnvironment();
            FeatureResult<?> featureResult = posTaggerFeature.check(context, env);
            if (featureResult != null)
                posTagFeatureResults.add(featureResult);
        }
        if (LOG.isTraceEnabled()) {
            LOG.trace("Token: " + taggedToken.getToken().getAnalyisText());
            SortedSet<String> featureResultSet = posTagFeatureResults.stream().map(f -> f.toString()).collect(Collectors.toCollection(() -> new TreeSet<String>()));
            for (String featureResultString : featureResultSet) {
                LOG.trace(featureResultString);
            }
        }
        event = new ClassificationEvent(posTagFeatureResults, classification);
        currentHistory.addPosTaggedToken(taggedToken);
        if (currentIndex == currentSentence.size()) {
            currentSentence = null;
        }
    }
    return event;
}
Also used : Logger(org.slf4j.Logger) SortedSet(java.util.SortedSet) LoggerFactory(org.slf4j.LoggerFactory) Set(java.util.Set) IOException(java.io.IOException) ClassificationEvent(com.joliciel.talismane.machineLearning.ClassificationEvent) Collectors(java.util.stream.Collectors) TreeSet(java.util.TreeSet) TalismaneException(com.joliciel.talismane.TalismaneException) ArrayList(java.util.ArrayList) LinkedHashMap(java.util.LinkedHashMap) RuntimeEnvironment(com.joliciel.talismane.machineLearning.features.RuntimeEnvironment) PosTaggerFeature(com.joliciel.talismane.posTagger.features.PosTaggerFeature) List(java.util.List) ClassificationEventStream(com.joliciel.talismane.machineLearning.ClassificationEventStream) FeatureResult(com.joliciel.talismane.machineLearning.features.FeatureResult) Map(java.util.Map)
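
The listing above follows a pattern that recurs in the other examples: take the next annotated item, run every feature against it, keep the non-null results, and pair them with the gold label. The sketch below reproduces only that pattern with the standard library. SimpleEvent, SimpleEventStream and EventStreamDemo are hypothetical stand-ins, not Talismane classes, and the two lambda features are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for an event: feature results plus the gold label.
final class SimpleEvent {
    final List<String> featureResults;
    final String classification;

    SimpleEvent(List<String> featureResults, String classification) {
        this.featureResults = featureResults;
        this.classification = classification;
    }
}

// Hypothetical stand-in for an event stream over {token, tag} pairs.
final class SimpleEventStream {
    private final List<String[]> taggedTokens;
    private final List<Function<String, String>> features;
    private int currentIndex = 0;

    SimpleEventStream(List<String[]> taggedTokens, List<Function<String, String>> features) {
        this.taggedTokens = taggedTokens;
        this.features = features;
    }

    boolean hasNext() {
        return currentIndex < taggedTokens.size();
    }

    // Returns null when the stream is exhausted, as the listing does.
    SimpleEvent next() {
        SimpleEvent event = null;
        if (hasNext()) {
            String[] taggedToken = taggedTokens.get(currentIndex++);
            List<String> results = new ArrayList<>();
            for (Function<String, String> feature : features) {
                String result = feature.apply(taggedToken[0]);
                if (result != null) { // a feature may not apply to this token
                    results.add(result);
                }
            }
            event = new SimpleEvent(results, taggedToken[1]);
        }
        return event;
    }
}

final class EventStreamDemo {
    static List<SimpleEvent> collect() {
        List<String[]> corpus = Arrays.asList(
            new String[] { "Paris", "NPP" },
            new String[] { "is", "V" });
        List<Function<String, String>> features = new ArrayList<>();
        // Illustrative features only: a two-letter suffix and a capitalisation flag.
        features.add(t -> t.length() >= 2 ? "suffix2=" + t.substring(t.length() - 2) : null);
        features.add(t -> Character.isUpperCase(t.charAt(0)) ? "capitalised" : null);

        SimpleEventStream stream = new SimpleEventStream(corpus, features);
        List<SimpleEvent> events = new ArrayList<>();
        while (stream.hasNext()) {
            events.add(stream.next());
        }
        return events;
    }

    public static void main(String[] args) {
        for (SimpleEvent event : collect()) {
            System.out.println(event.featureResults + " -> " + event.classification);
        }
    }
}
```

The real stream also carries a history of previously tagged tokens into the feature context; the sketch leaves that out.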

Example 7 with ClassificationEvent

Use of com.joliciel.talismane.machineLearning.ClassificationEvent in project talismane by joliciel-informatique.

In class LanguageDetectorEventStream, method next.

@Override
public ClassificationEvent next() throws TalismaneException {
    LanguageTaggedText languageTaggedText = this.corpusReader.nextText();
    List<FeatureResult<?>> featureResults = new ArrayList<FeatureResult<?>>();
    for (LanguageDetectorFeature<?> feature : features) {
        RuntimeEnvironment env = new RuntimeEnvironment();
        FeatureResult<?> featureResult = feature.check(languageTaggedText.getText(), env);
        if (featureResult != null)
            featureResults.add(featureResult);
    }
    String classification = languageTaggedText.getLanguage().toLanguageTag();
    if (LOG.isTraceEnabled()) {
        for (FeatureResult<?> result : featureResults) {
            LOG.trace(result.toString());
        }
        LOG.trace("classification: " + classification);
    }
    ClassificationEvent event = new ClassificationEvent(featureResults, classification);
    return event;
}
Also used : RuntimeEnvironment(com.joliciel.talismane.machineLearning.features.RuntimeEnvironment) ArrayList(java.util.ArrayList) ClassificationEvent(com.joliciel.talismane.machineLearning.ClassificationEvent) FeatureResult(com.joliciel.talismane.machineLearning.features.FeatureResult)

Example 8 with ClassificationEvent

Use of com.joliciel.talismane.machineLearning.ClassificationEvent in project talismane by joliciel-informatique.

In class LinearSVMModelTrainer, method getFeatureMatrix.

private Feature[][] getFeatureMatrix(ClassificationEventStream corpusEventStream, TObjectIntMap<String> featureIndexMap, TObjectIntMap<String> outcomeIndexMap, TIntList outcomeList, TIntIntMap featureCountMap, CountingInfo countingInfo) {
    try {
        int maxFeatureCount = 0;
        List<Feature[]> fullFeatureList = new ArrayList<Feature[]>();
        while (corpusEventStream.hasNext()) {
            ClassificationEvent corpusEvent = corpusEventStream.next();
            int outcomeIndex = outcomeIndexMap.get(corpusEvent.getClassification());
            if (outcomeIndex < 0) {
                outcomeIndex = countingInfo.currentOutcomeIndex++;
                outcomeIndexMap.put(corpusEvent.getClassification(), outcomeIndex);
            }
            outcomeList.add(outcomeIndex);
            Map<Integer, Feature> featureList = new TreeMap<Integer, Feature>();
            for (FeatureResult<?> featureResult : corpusEvent.getFeatureResults()) {
                if (featureResult.getOutcome() instanceof List) {
                    @SuppressWarnings("unchecked") FeatureResult<List<WeightedOutcome<String>>> stringCollectionResult = (FeatureResult<List<WeightedOutcome<String>>>) featureResult;
                    for (WeightedOutcome<String> stringOutcome : stringCollectionResult.getOutcome()) {
                        String featureName = featureResult.getTrainingName() + "|" + featureResult.getTrainingOutcome(stringOutcome.getOutcome());
                        double value = stringOutcome.getWeight();
                        this.addFeatureResult(featureName, value, featureList, featureIndexMap, featureCountMap, countingInfo);
                    }
                } else {
                    double value = 1.0;
                    if (featureResult.getOutcome() instanceof Double) {
                        @SuppressWarnings("unchecked") FeatureResult<Double> doubleResult = (FeatureResult<Double>) featureResult;
                        value = doubleResult.getOutcome().doubleValue();
                    }
                    this.addFeatureResult(featureResult.getTrainingName(), value, featureList, featureIndexMap, featureCountMap, countingInfo);
                }
            }
            if (featureList.size() > maxFeatureCount)
                maxFeatureCount = featureList.size();
            // convert to array immediately, to avoid double storage
            int j = 0;
            Feature[] featureArray = new Feature[featureList.size()];
            for (Feature feature : featureList.values()) {
                featureArray[j] = feature;
                j++;
            }
            fullFeatureList.add(featureArray);
            countingInfo.numEvents++;
            if (countingInfo.numEvents % 1000 == 0) {
                LOG.debug("Processed " + countingInfo.numEvents + " events.");
            }
        }
        Feature[][] featureMatrix = new Feature[countingInfo.numEvents][];
        int i = 0;
        for (Feature[] featureArray : fullFeatureList) {
            featureMatrix[i] = featureArray;
            i++;
        }
        fullFeatureList = null;
        LOG.debug("Event count: " + countingInfo.numEvents);
        LOG.debug("Feature count: " + featureIndexMap.size());
        return featureMatrix;
    } catch (TalismaneException | IOException e) {
        LOG.error(e.getMessage(), e);
        throw new RuntimeException(e);
    }
}
Also used : TalismaneException(com.joliciel.talismane.TalismaneException) TIntArrayList(gnu.trove.list.array.TIntArrayList) ArrayList(java.util.ArrayList) WeightedOutcome(com.joliciel.talismane.utils.WeightedOutcome) IOException(java.io.IOException) TreeMap(java.util.TreeMap) Feature(de.bwaldvogel.liblinear.Feature) TIntList(gnu.trove.list.TIntList) List(java.util.List) ClassificationEvent(com.joliciel.talismane.machineLearning.ClassificationEvent) FeatureResult(com.joliciel.talismane.machineLearning.features.FeatureResult)
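
The core of getFeatureMatrix is an indexing step: each feature name gets an integer index the first time it is seen, each event becomes a row of (index, value) entries sorted by index, and the rows are collected into a jagged matrix. The sketch below shows that step with java.util collections instead of Trove maps and liblinear Feature objects. SparseEntry and SparseMatrixSketch are hypothetical names, and starting the indexes at 1 is an assumption made here, not something the listing shows.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for one sparse (index, value) cell.
final class SparseEntry {
    final int index;
    final double value;

    SparseEntry(int index, double value) {
        this.index = index;
        this.value = value;
    }
}

final class SparseMatrixSketch {
    // Returns the index for a feature name, assigning the next free one on first sight.
    static int indexOf(Map<String, Integer> featureIndexMap, String name) {
        Integer index = featureIndexMap.get(name);
        if (index == null) {
            index = featureIndexMap.size() + 1; // assumption: 1-based indexes
            featureIndexMap.put(name, index);
        }
        return index;
    }

    // Converts events (feature name -> value) into rows sorted by feature index.
    static SparseEntry[][] toMatrix(List<Map<String, Double>> events,
                                    Map<String, Integer> featureIndexMap) {
        List<SparseEntry[]> rows = new ArrayList<>();
        for (Map<String, Double> event : events) {
            // TreeMap keeps the entries of one row ordered by index.
            Map<Integer, SparseEntry> sorted = new TreeMap<>();
            for (Map.Entry<String, Double> e : event.entrySet()) {
                int index = indexOf(featureIndexMap, e.getKey());
                sorted.put(index, new SparseEntry(index, e.getValue()));
            }
            // Convert each row to an array as soon as it is complete.
            rows.add(sorted.values().toArray(new SparseEntry[0]));
        }
        return rows.toArray(new SparseEntry[0][]);
    }

    static SparseEntry[][] demo() {
        Map<String, Double> first = new LinkedHashMap<>();
        first.put("b", 1.0);
        first.put("a", 0.5);
        Map<String, Double> second = new LinkedHashMap<>();
        second.put("a", 2.0);
        second.put("c", 1.0);
        List<Map<String, Double>> events = new ArrayList<>();
        events.add(first);
        events.add(second);
        return toMatrix(events, new LinkedHashMap<>());
    }

    public static void main(String[] args) {
        for (SparseEntry[] row : demo()) {
            StringBuilder sb = new StringBuilder();
            for (SparseEntry entry : row) {
                sb.append(entry.index).append(':').append(entry.value).append(' ');
            }
            System.out.println(sb.toString().trim());
        }
    }
}
```

Rows stay jagged because each event only stores the features that fired, which is why the listing tracks maxFeatureCount separately instead of allocating a dense matrix.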

Example 9 with ClassificationEvent

Use of com.joliciel.talismane.machineLearning.ClassificationEvent in project talismane by joliciel-informatique.

In class PatternEventStream, method next.

@Override
public ClassificationEvent next() throws TalismaneException, IOException {
    ClassificationEvent event = null;
    if (this.hasNext()) {
        TokenPatternMatch tokenPatternMatch = currentPatternMatches.get(currentIndex);
        TokeniserOutcome outcome = currentOutcomes.get(currentIndex);
        String classification = outcome.name();
        // guard the call so the message is only built when debug logging is on
        if (LOG.isDebugEnabled())
            LOG.debug("next event, pattern match: " + tokenPatternMatch.toString() + ", outcome:" + classification);
        List<FeatureResult<?>> tokenFeatureResults = new ArrayList<FeatureResult<?>>();
        for (TokenPatternMatchFeature<?> feature : tokenPatternMatchFeatures) {
            RuntimeEnvironment env = new RuntimeEnvironment();
            FeatureResult<?> featureResult = feature.check(tokenPatternMatch, env);
            if (featureResult != null) {
                tokenFeatureResults.add(featureResult);
            }
        }
        if (LOG.isTraceEnabled()) {
            SortedSet<String> featureResultSet = tokenFeatureResults.stream().map(f -> f.toString()).collect(Collectors.toCollection(() -> new TreeSet<String>()));
            for (String featureResultString : featureResultSet) {
                LOG.trace(featureResultString);
            }
        }
        event = new ClassificationEvent(tokenFeatureResults, classification);
        currentIndex++;
        if (currentIndex == currentPatternMatches.size()) {
            currentPatternMatches = null;
        }
    }
    return event;
}
Also used : TokeniserAnnotatedCorpusReader(com.joliciel.talismane.tokeniser.TokeniserAnnotatedCorpusReader) SortedSet(java.util.SortedSet) LoggerFactory(org.slf4j.LoggerFactory) TokenSequence(com.joliciel.talismane.tokeniser.TokenSequence) TaggedToken(com.joliciel.talismane.tokeniser.TaggedToken) TreeSet(java.util.TreeSet) TalismaneException(com.joliciel.talismane.TalismaneException) TalismaneSession(com.joliciel.talismane.TalismaneSession) ArrayList(java.util.ArrayList) LinkedHashMap(java.util.LinkedHashMap) RuntimeEnvironment(com.joliciel.talismane.machineLearning.features.RuntimeEnvironment) ClassificationEventStream(com.joliciel.talismane.machineLearning.ClassificationEventStream) TokenPatternMatchFeature(com.joliciel.talismane.tokeniser.features.TokenPatternMatchFeature) FeatureResult(com.joliciel.talismane.machineLearning.features.FeatureResult) Map(java.util.Map) Logger(org.slf4j.Logger) Set(java.util.Set) IOException(java.io.IOException) TokeniserOutcome(com.joliciel.talismane.tokeniser.TokeniserOutcome) ClassificationEvent(com.joliciel.talismane.machineLearning.ClassificationEvent) Decision(com.joliciel.talismane.machineLearning.Decision) Collectors(java.util.stream.Collectors) List(java.util.List) Token(com.joliciel.talismane.tokeniser.Token) Sentence(com.joliciel.talismane.rawText.Sentence)

Example 10 with ClassificationEvent

Use of com.joliciel.talismane.machineLearning.ClassificationEvent in project jochre by urieli.

In class JochreLetterEventStream, method next.

@Override
public ClassificationEvent next() {
    ClassificationEvent event = null;
    if (this.hasNext()) {
        Shape shape = shapeInSequence.getShape();
        LOG.debug("next event, shape: " + shape);
        LetterGuesserContext context = new LetterGuesserContext(shapeInSequence, history);
        List<FeatureResult<?>> featureResults = new ArrayList<>();
        // analyse features
        for (LetterFeature<?> feature : features) {
            RuntimeEnvironment env = new RuntimeEnvironment();
            FeatureResult<?> featureResult = feature.check(context, env);
            if (featureResult != null) {
                featureResults.add(featureResult);
                if (LOG.isTraceEnabled()) {
                    LOG.trace(featureResult.toString());
                }
            }
        }
        String outcome = shape.getLetter();
        event = new ClassificationEvent(featureResults, outcome);
        history.getLetters().add(outcome);
        // set shape to null so that hasNext can retrieve the next one.
        this.shapeInSequence = null;
    }
    return event;
}
Also used : Shape(com.joliciel.jochre.graphics.Shape) RuntimeEnvironment(com.joliciel.talismane.machineLearning.features.RuntimeEnvironment) ArrayList(java.util.ArrayList) ClassificationEvent(com.joliciel.talismane.machineLearning.ClassificationEvent) FeatureResult(com.joliciel.talismane.machineLearning.features.FeatureResult)
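
The comment "set shape to null so that hasNext can retrieve the next one" points at a look-ahead contract between hasNext and next. The hasNext method of JochreLetterEventStream is not shown here, so the sketch below is only an assumption about how such a pair can cooperate: hasNext fetches and caches the next item when none is pending, and next hands it out and clears the cache. LazyStreamSketch is a hypothetical class over plain strings.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical look-ahead stream: hasNext() fetches, next() consumes.
final class LazyStreamSketch {
    private final Iterator<String> source;
    private String current; // null means "nothing fetched yet"

    LazyStreamSketch(Iterator<String> source) {
        this.source = source;
    }

    boolean hasNext() {
        // Fetch the next item only if no item is pending.
        if (current == null && source.hasNext()) {
            current = source.next();
        }
        return current != null;
    }

    // Returns null when the stream is exhausted, as the listing does.
    String next() {
        String result = null;
        if (hasNext()) {
            result = current;
            // Clear the pending item so that hasNext() fetches a new one.
            current = null;
        }
        return result;
    }

    public static void main(String[] args) {
        LazyStreamSketch stream = new LazyStreamSketch(Arrays.asList("a", "b").iterator());
        while (stream.hasNext()) {
            System.out.println(stream.next());
        }
    }
}
```

With this shape, calling hasNext several times in a row is harmless, because the pending item is only replaced after next has consumed it.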

Aggregations

ClassificationEvent (com.joliciel.talismane.machineLearning.ClassificationEvent): 11
ArrayList (java.util.ArrayList): 10
FeatureResult (com.joliciel.talismane.machineLearning.features.FeatureResult): 9
RuntimeEnvironment (com.joliciel.talismane.machineLearning.features.RuntimeEnvironment): 8
TalismaneException (com.joliciel.talismane.TalismaneException): 6
IOException (java.io.IOException): 6
List (java.util.List): 5
ClassificationEventStream (com.joliciel.talismane.machineLearning.ClassificationEventStream): 4
LinkedHashMap (java.util.LinkedHashMap): 4
Map (java.util.Map): 4
Set (java.util.Set): 4
SortedSet (java.util.SortedSet): 4
TreeSet (java.util.TreeSet): 4
Collectors (java.util.stream.Collectors): 4
Logger (org.slf4j.Logger): 4
LoggerFactory (org.slf4j.LoggerFactory): 4
Shape (com.joliciel.jochre.graphics.Shape): 1
TalismaneSession (com.joliciel.talismane.TalismaneSession): 1
Decision (com.joliciel.talismane.machineLearning.Decision): 1
ParseConfigurationFeature (com.joliciel.talismane.parser.features.ParseConfigurationFeature): 1