
Example 16 with TextClassificationException

Use of org.dkpro.tc.api.exception.TextClassificationException in project dkpro-tc by dkpro.

Class InstanceExtractor, method getSingleInstanceDocument.

private Instance getSingleInstanceDocument(Instance instance, JCas jcas, boolean supportSparseFeatures) throws TextClassificationException {
    int jcasId = JCasUtil.selectSingle(jcas, JCasId.class).getId();
    TextClassificationTarget documentTcu = JCasUtil.selectSingle(jcas, TextClassificationTarget.class);
    if (addInstanceId) {
        instance.addFeature(InstanceIdFeature.retrieve(jcas));
    }
    for (FeatureExtractorResource_ImplBase featExt : featureExtractors) {
        if (!(featExt instanceof FeatureExtractor)) {
            throw new TextClassificationException("Using incompatible feature in document mode: " + featExt.getResourceName());
        }
        if (supportSparseFeatures) {
            instance.addFeatures(getSparse(jcas, documentTcu, featExt));
        } else {
            instance.addFeatures(getDense(jcas, documentTcu, featExt));
        }
        instance.setOutcomes(getOutcomes(jcas, null));
        instance.setWeight(getWeight(jcas, null));
        instance.setJcasId(jcasId);
    }
    return instance;
}
Also used : JCasId(org.dkpro.tc.api.type.JCasId) FeatureExtractor(org.dkpro.tc.api.features.FeatureExtractor) PairFeatureExtractor(org.dkpro.tc.api.features.PairFeatureExtractor) TextClassificationException(org.dkpro.tc.api.exception.TextClassificationException) TextClassificationTarget(org.dkpro.tc.api.type.TextClassificationTarget) FeatureExtractorResource_ImplBase(org.dkpro.tc.api.features.FeatureExtractorResource_ImplBase)

Example 17 with TextClassificationException

Use of org.dkpro.tc.api.exception.TextClassificationException in project dkpro-tc by dkpro.

Class ValidityCheckConnector, method checkErrorConditionCasHasTwoVies.

private void checkErrorConditionCasHasTwoVies(JCas jcas) throws AnalysisEngineProcessException {
    try {
        jcas.getView(Constants.PART_ONE);
        jcas.getView(Constants.PART_TWO);
    } catch (CASException e) {
        throw new AnalysisEngineProcessException(new TextClassificationException("Your experiment is configured to be pair classification, but I could not find the two views " + Constants.PART_ONE + " and " + Constants.PART_TWO + ". Please use a reader that inherits from " + Constants.class.getName()));
    }
}
Also used : TextClassificationException(org.dkpro.tc.api.exception.TextClassificationException) CASException(org.apache.uima.cas.CASException) AnalysisEngineProcessException(org.apache.uima.analysis_engine.AnalysisEngineProcessException)
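
The wrapping pattern in Example 17 (a domain-specific exception passed as the cause of a pipeline-level exception) can be sketched without UIMA or DKPro TC on the classpath. The sketch below is only an illustration under stated assumptions: DomainException, PipelineException, requireViews and describeFailure are hypothetical stand-ins, not part of either library, and a plain set of view names replaces the CAS lookup.

```java
import java.util.Set;

// Hypothetical stand-in for a domain-level exception (NOT the DKPro TC class).
class DomainException extends Exception {
    DomainException(String message) {
        super(message);
    }
}

// Hypothetical stand-in for a pipeline-level exception (NOT the UIMA class).
class PipelineException extends Exception {
    PipelineException(Throwable cause) {
        super(cause);
    }
}

class WrappedCauseSketch {
    // Reports a missing view as a domain exception wrapped in the
    // pipeline-level exception, similar in shape to the example above.
    static void requireViews(Set<String> available, String... required)
            throws PipelineException {
        for (String name : required) {
            if (!available.contains(name)) {
                throw new PipelineException(
                        new DomainException("Missing view: " + name));
            }
        }
    }

    // Returns the message of the wrapped cause, or null if nothing failed.
    static String describeFailure(Set<String> available, String... required) {
        try {
            requireViews(available, required);
            return null;
        } catch (PipelineException e) {
            return e.getCause().getMessage();
        }
    }

    public static void main(String[] args) {
        Set<String> views = Set.of("PART_ONE");
        // prints "Missing view: PART_TWO"
        System.out.println(describeFailure(views, "PART_ONE", "PART_TWO"));
    }
}
```

Callers that need the original message unwrap it with getCause(), which is why the domain exception is passed as the cause rather than flattened into a string.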

Example 18 with TextClassificationException

Use of org.dkpro.tc.api.exception.TextClassificationException in project dkpro-tc by dkpro.

Class ValidityCheckConnectorPost, method checkErrorConditionMissingOutcomeForTargetIfUnitOrSequenceMode.

private void checkErrorConditionMissingOutcomeForTargetIfUnitOrSequenceMode(List<TextClassificationTarget> targets, List<TextClassificationOutcome> outcomes) throws AnalysisEngineProcessException {
    // labeled with an outcome annotation
    if (featureModeI == 2 || featureModeI == 4) {
        if (targets.size() == 0) {
            throw new AnalysisEngineProcessException(new TextClassificationException("Your experiment is supposed to have [+" + TextClassificationTarget.class.getName() + "] annotations, which are missing"));
        } else {
            if (targets.size() != outcomes.size()) {
                throwException("Number of targets [" + targets.size() + "] != number of outcomes [" + outcomes.size() + "]");
            }
            for (int i = 0; i < targets.size(); i++) {
                TextClassificationTarget t = targets.get(i);
                TextClassificationOutcome o = outcomes.get(i);
                if (t.getBegin() != o.getBegin() || t.getEnd() != o.getEnd()) {
                    throwException("Index of target and outcome do not match target span: [" + t.getBegin() + " - " + t.getEnd() + "] != outcome span " + o.getBegin() + " - " + o.getEnd());
                }
            }
        }
    }
}
Also used : TextClassificationException(org.dkpro.tc.api.exception.TextClassificationException) TextClassificationOutcome(org.dkpro.tc.api.type.TextClassificationOutcome) TextClassificationTarget(org.dkpro.tc.api.type.TextClassificationTarget) AnalysisEngineProcessException(org.apache.uima.analysis_engine.AnalysisEngineProcessException)
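
The target/outcome check in Example 18 comes down to three conditions: targets exist, both lists have the same size, and spans match position by position. A minimal standalone sketch of that logic follows. Span, SpanAlignmentSketch and validate are hypothetical names introduced here for illustration, and the method returns a message instead of throwing, which differs from the original.

```java
import java.util.List;

// Hypothetical stand-in for an annotation with begin/end character offsets.
class Span {
    final int begin;
    final int end;

    Span(int begin, int end) {
        this.begin = begin;
        this.end = end;
    }
}

class SpanAlignmentSketch {
    // Returns null when every target has an outcome with the same span,
    // otherwise a short message describing the first problem found.
    static String validate(List<Span> targets, List<Span> outcomes) {
        if (targets.isEmpty()) {
            return "no targets";
        }
        if (targets.size() != outcomes.size()) {
            return "size mismatch: " + targets.size() + " != " + outcomes.size();
        }
        for (int i = 0; i < targets.size(); i++) {
            Span t = targets.get(i);
            Span o = outcomes.get(i);
            if (t.begin != o.begin || t.end != o.end) {
                return "span mismatch at index " + i;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Span> targets = List.of(new Span(0, 4), new Span(5, 9));
        List<Span> outcomes = List.of(new Span(0, 4), new Span(5, 8));
        // prints "span mismatch at index 1"
        System.out.println(validate(targets, outcomes));
    }
}
```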

Example 19 with TextClassificationException

Use of org.dkpro.tc.api.exception.TextClassificationException in project dkpro-tc by dkpro.

Class IdfPairMetaCollector, method process.

@Override
public void process(JCas jcas) throws AnalysisEngineProcessException {
    JCas view1;
    JCas view2;
    try {
        view1 = jcas.getView(PART_ONE);
        view2 = jcas.getView(PART_TWO);
    } catch (Exception e) {
        throw new AnalysisEngineProcessException(e);
    }
    FrequencyDistribution<String> document1NGrams;
    FrequencyDistribution<String> document2NGrams;
    try {
        document1NGrams = getNgramsFD(view1);
        document2NGrams = getNgramsFD(view2);
    } catch (TextClassificationException e) {
        throw new AnalysisEngineProcessException(e);
    }
    FrequencyDistribution<String> documentNGrams = new FrequencyDistribution<String>();
    // This is different than other metacollectors.
    for (String key : document1NGrams.getKeys()) {
        documentNGrams.addSample(key, 1);
    }
    for (String key : document2NGrams.getKeys()) {
        documentNGrams.addSample(key, 1);
    }
    for (String ngram : documentNGrams.getKeys()) {
        for (int i = 0; i < documentNGrams.getCount(ngram); i++) {
            Field field = new Field(getFieldName(), ngram, fieldType);
            currentDocument.add(field);
        }
    }
    try {
        writeToIndex();
    } catch (IOException e) {
        throw new AnalysisEngineProcessException(e);
    }
}
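
The counting step in the middle of Example 19 adds each n-gram once per view in which it occurs, regardless of how often it appears inside that view. The sketch below reproduces only that step with standard collections. It assumes that addSample(key, 1) increments the key's count by one; PairKeyCountSketch and countPerView are hypothetical names, and a TreeMap stands in for FrequencyDistribution.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class PairKeyCountSketch {
    // Counts, for each key, the number of views (0, 1 or 2) that contain it.
    static Map<String, Integer> countPerView(Set<String> keysViewOne,
                                             Set<String> keysViewTwo) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String key : keysViewOne) {
            counts.merge(key, 1, Integer::sum);
        }
        for (String key : keysViewTwo) {
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // prints "{a=1, b=2, c=1}"
        System.out.println(countPerView(Set.of("a", "b"), Set.of("b", "c")));
    }
}
```

An n-gram shared by both views ends up with a count of 2, so the loop in the example writes it to the index twice.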
Also used : Field(org.apache.lucene.document.Field) TextClassificationException(org.dkpro.tc.api.exception.TextClassificationException) JCas(org.apache.uima.jcas.JCas) IOException(java.io.IOException) AnalysisEngineProcessException(org.apache.uima.analysis_engine.AnalysisEngineProcessException) ResourceInitializationException(org.apache.uima.resource.ResourceInitializationException) FrequencyDistribution(de.tudarmstadt.ukp.dkpro.core.api.frequency.util.FrequencyDistribution)

Example 20 with TextClassificationException

Use of org.dkpro.tc.api.exception.TextClassificationException in project dkpro-tc by dkpro.

Class LibsvmDataFormatSerializeModelConnector, method trainAndStoreModel.

private void trainAndStoreModel(TaskContext aContext) throws Exception {
    boolean multiLabel = learningMode.equals(Constants.LM_MULTI_LABEL);
    if (multiLabel) {
        throw new TextClassificationException("Multi-label is not yet implemented");
    }
    File fileTrain = getTrainFile(aContext);
    trainModel(aContext, fileTrain);
    copyOutcomeMappingToThisFolder(aContext);
    copyFeatureNameMappingToThisFolder(aContext);
}
Also used : TextClassificationException(org.dkpro.tc.api.exception.TextClassificationException) File(java.io.File)

Aggregations

TextClassificationException (org.dkpro.tc.api.exception.TextClassificationException): 25
ArrayList (java.util.ArrayList): 10
TextClassificationTarget (org.dkpro.tc.api.type.TextClassificationTarget): 7
AnalysisEngineProcessException (org.apache.uima.analysis_engine.AnalysisEngineProcessException): 6
IOException (java.io.IOException): 5
Feature (org.dkpro.tc.api.features.Feature): 5
File (java.io.File): 4
JCas (org.apache.uima.jcas.JCas): 4
ResourceInitializationException (org.apache.uima.resource.ResourceInitializationException): 4
FeatureExtractorResource_ImplBase (org.dkpro.tc.api.features.FeatureExtractorResource_ImplBase): 4
JCasId (org.dkpro.tc.api.type.JCasId): 4
TextClassificationOutcome (org.dkpro.tc.api.type.TextClassificationOutcome): 4
CASException (org.apache.uima.cas.CASException): 3
PairFeatureExtractor (org.dkpro.tc.api.features.PairFeatureExtractor): 3
FrequencyDistribution (de.tudarmstadt.ukp.dkpro.core.api.frequency.util.FrequencyDistribution): 2
Token (de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Token): 2
SimilarityException (dkpro.similarity.algorithms.api.SimilarityException): 2
HashSet (java.util.HashSet): 2
FeatureExtractor (org.dkpro.tc.api.features.FeatureExtractor): 2
Instance (org.dkpro.tc.api.features.Instance): 2