Example 1 with CollectionException

use of org.apache.uima.collection.CollectionException in project webanno by webanno.

the class TcfReader method getNext.

@Override
public void getNext(JCas aJCas) throws IOException, CollectionException {
    Resource res = nextFile();
    initCas(aJCas, res);
    InputStream is = null;
    try {
        is = new BufferedInputStream(res.getInputStream());
        WLData wLData = WLDObjector.read(is);
        TextCorpus aCorpusData = wLData.getTextCorpus();
        convertToCas(aJCas, aCorpusData);
    } catch (WLFormatException e) {
        throw new CollectionException(e);
    } finally {
        closeQuietly(is);
    }
}
Also used : BufferedInputStream(java.io.BufferedInputStream) InputStream(java.io.InputStream) CollectionException(org.apache.uima.collection.CollectionException) TextCorpus(eu.clarin.weblicht.wlfxb.tc.api.TextCorpus) WLData(eu.clarin.weblicht.wlfxb.xb.WLData) WLFormatException(eu.clarin.weblicht.wlfxb.io.WLFormatException)

Example 2 with CollectionException

use of org.apache.uima.collection.CollectionException in project webanno by webanno.

the class TeiReader method getNext.

@Override
public void getNext(CAS aCAS) throws IOException, CollectionException {
    initCas(aCAS, currentResource);
    InputStream is = null;
    try {
        JCas jcas = aCAS.getJCas();
        // Create handler
        Handler handler = newSaxHandler();
        handler.setJCas(jcas);
        handler.setLogger(getLogger());
        // Parse TEI text
        SAXWriter writer = new SAXWriter(handler);
        writer.write(currentTeiElement);
        handler.endDocument();
    } catch (CASException e) {
        throw new CollectionException(e);
    } catch (SAXException e) {
        throw new IOException(e);
    } catch (Exception e) {
        // Keep the original exception as the cause so the failure can be diagnosed
        throw new IOException("This is not a valid WebAnno CPH TEI file", e);
    } finally {
        closeQuietly(is);
    }
    // Move currentTeiElement to the next text
    nextTeiElement();
}
Also used : SAXWriter(org.dom4j.io.SAXWriter) GZIPInputStream(java.util.zip.GZIPInputStream) InputStream(java.io.InputStream) CollectionException(org.apache.uima.collection.CollectionException) JCas(org.apache.uima.jcas.JCas) DefaultHandler(org.xml.sax.helpers.DefaultHandler) CASException(org.apache.uima.cas.CASException) IOException(java.io.IOException) JaxenException(org.jaxen.JaxenException) ResourceInitializationException(org.apache.uima.resource.ResourceInitializationException) DocumentException(org.dom4j.DocumentException) SAXException(org.xml.sax.SAXException)

Example 3 with CollectionException

use of org.apache.uima.collection.CollectionException in project webanno by webanno.

the class TeiReader method initialize.

@Override
public void initialize(UimaContext aContext) throws ResourceInitializationException {
    super.initialize(aContext);
    if (writePOS && !writeTokens) {
        throw new ResourceInitializationException(new IllegalArgumentException("Setting writePOS to 'true' requires writeToken to be 'true' too."));
    }
    try {
        // Init with an empty iterator
        teiElementIterator = asList(new Element[0]).iterator();
        // Make sure we know about the first element
        nextTeiElement();
    } catch (CollectionException | IOException e) {
        throw new ResourceInitializationException(e);
    }
}
Also used : ResourceInitializationException(org.apache.uima.resource.ResourceInitializationException) CollectionException(org.apache.uima.collection.CollectionException) IOException(java.io.IOException)

Example 4 with CollectionException

use of org.apache.uima.collection.CollectionException in project dkpro-tc by dkpro.

the class BrownCorpusReader method getNext.

@Override
public void getNext(CAS cas) throws IOException, CollectionException {
    super.getNext(cas);
    JCas jcas;
    try {
        jcas = cas.getJCas();
    } catch (CASException e) {
        throw new CollectionException(e);
    }
    for (Sentence sentence : JCasUtil.select(jcas, Sentence.class)) {
        TextClassificationSequence sequence = new TextClassificationSequence(jcas, sentence.getBegin(), sentence.getEnd());
        sequence.addToIndexes();
        for (Token token : JCasUtil.selectCovered(jcas, Token.class, sentence)) {
            TextClassificationTarget unit = new TextClassificationTarget(jcas, token.getBegin(), token.getEnd());
            // will add the token content as a suffix to the ID of this unit
            unit.setSuffix(token.getCoveredText());
            unit.addToIndexes();
            TextClassificationOutcome outcome = new TextClassificationOutcome(jcas, token.getBegin(), token.getEnd());
            outcome.setOutcome(getTextClassificationOutcome(jcas, unit));
            outcome.addToIndexes();
        }
    }
}
Also used : CollectionException(org.apache.uima.collection.CollectionException) TextClassificationOutcome(org.dkpro.tc.api.type.TextClassificationOutcome) TextClassificationTarget(org.dkpro.tc.api.type.TextClassificationTarget) JCas(org.apache.uima.jcas.JCas) Token(de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Token) CASException(org.apache.uima.cas.CASException) TextClassificationSequence(org.dkpro.tc.api.type.TextClassificationSequence) Sentence(de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Sentence)

Example 5 with CollectionException

use of org.apache.uima.collection.CollectionException in project dkpro-tc by dkpro.

the class ReutersCorpusReader method getNext.

@Override
public void getNext(CAS aCAS) throws IOException, CollectionException {
    super.getNext(aCAS);
    JCas jcas;
    try {
        jcas = aCAS.getJCas();
    } catch (CASException e) {
        // Pass the CASException along as the cause instead of discarding it
        throw new CollectionException(e);
    }
    for (String outcomeValue : getTextClassificationOutcomes(jcas)) {
        TextClassificationOutcome outcome = new TextClassificationOutcome(jcas);
        outcome.setOutcome(outcomeValue);
        outcome.addToIndexes();
    }
}
Also used : CollectionException(org.apache.uima.collection.CollectionException) TextClassificationOutcome(org.dkpro.tc.api.type.TextClassificationOutcome) JCas(org.apache.uima.jcas.JCas) CASException(org.apache.uima.cas.CASException)

Aggregations

CollectionException (org.apache.uima.collection.CollectionException) 15
CASException (org.apache.uima.cas.CASException) 10
JCas (org.apache.uima.jcas.JCas) 9
TextClassificationOutcome (org.dkpro.tc.api.type.TextClassificationOutcome) 9
JCasId (org.dkpro.tc.api.type.JCasId) 5
TextClassificationTarget (org.dkpro.tc.api.type.TextClassificationTarget) 4
DocumentMetaData (de.tudarmstadt.ukp.dkpro.core.api.metadata.type.DocumentMetaData) 2
IOException (java.io.IOException) 2
InputStream (java.io.InputStream) 2
HashSet (java.util.HashSet) 2
ResourceInitializationException (org.apache.uima.resource.ResourceInitializationException) 2
Sentence (de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Sentence) 1
Token (de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Token) 1
WLFormatException (eu.clarin.weblicht.wlfxb.io.WLFormatException) 1
TextCorpus (eu.clarin.weblicht.wlfxb.tc.api.TextCorpus) 1
WLData (eu.clarin.weblicht.wlfxb.xb.WLData) 1
BufferedInputStream (java.io.BufferedInputStream) 1
GZIPInputStream (java.util.zip.GZIPInputStream) 1
AnalysisEngineProcessException (org.apache.uima.analysis_engine.AnalysisEngineProcessException) 1
TextClassificationException (org.dkpro.tc.api.exception.TextClassificationException) 1