Example 31 with CatalogTransformerException

use of ddf.catalog.transform.CatalogTransformerException in project ddf by codice.

the class TikaInputTransformer method transform.

@Override
public Metacard transform(InputStream input, String id) throws IOException, CatalogTransformerException {
    LOGGER.debug("Transforming input stream using Tika.");
    long bytes;
    if (input == null) {
        throw new CatalogTransformerException("Cannot transform null input.");
    }
    try (TemporaryFileBackedOutputStream fileBackedOutputStream = new TemporaryFileBackedOutputStream()) {
        try {
            bytes = IOUtils.copyLarge(input, fileBackedOutputStream);
        } catch (IOException e) {
            throw new CatalogTransformerException("Could not copy bytes of content message.", e);
        }
        Metadata metadata;
        String bodyText = null;
        String metadataText;
        Metacard metacard = new MetacardImpl(commonTikaMetacardType);
        String contentType = DataType.DATASET.name();
        TikaMetadataExtractor extractor = null;
        try (InputStream inputStreamCopy = fileBackedOutputStream.asByteSource().openStream()) {
            extractor = new TikaMetadataExtractor(inputStreamCopy, previewMaxLength, metadataMaxLength);
        } catch (TikaException | RuntimeException t) {
            LOGGER.debug("Unable to extract tika metadata", t);
        }
        if (extractor != null) {
            metadataText = getMetadataXml(extractor.getMetadataXml());
            Attribute validationAttribute = null;
            if (metadataText.equals(TikaMetadataExtractor.METADATA_LIMIT_REACHED_MSG)) {
                validationAttribute = new AttributeImpl(Validation.VALIDATION_WARNINGS, Collections.singletonList(metadataText));
                metadataText = "";
            }
            bodyText = extractor.getBodyText();
            metadata = extractor.getMetadata();
            contentType = metadata.get(Metadata.CONTENT_TYPE);
            MetacardType metacardType = mergeAttributes(getMetacardType(contentType));
            metacard = MetacardCreator.createMetacard(metadata, id, metadataText, metacardType, useResourceTitleAsTitle);
            if (StringUtils.isNotBlank(bodyText)) {
                metacard.setAttribute(new AttributeImpl(Extracted.EXTRACTED_TEXT, bodyText));
                processContentMetadataExtractors(bodyText, metacard);
            }
            if (StringUtils.isNotBlank(metadataText)) {
                processMetadataExtractors(metadataText, metacard);
            }
            if (validationAttribute != null) {
                metacard.setAttribute(validationAttribute);
            }
        }
        enrichMetacard(fileBackedOutputStream, contentType, bytes, metacard);
        LOGGER.debug("Finished transforming input stream using Tika.");
        return metacard;
    }
}
Also used : TikaException(org.apache.tika.exception.TikaException) TemporaryFileBackedOutputStream(org.codice.ddf.platform.util.TemporaryFileBackedOutputStream) Attribute(ddf.catalog.data.Attribute) CloseShieldInputStream(org.apache.tika.io.CloseShieldInputStream) InputStream(java.io.InputStream) AttributeImpl(ddf.catalog.data.impl.AttributeImpl) Metadata(org.apache.tika.metadata.Metadata) CatalogTransformerException(ddf.catalog.transform.CatalogTransformerException) IOException(java.io.IOException) MetacardImpl(ddf.catalog.data.impl.MetacardImpl) MetacardType(ddf.catalog.data.MetacardType) TikaMetadataExtractor(ddf.catalog.transformer.common.tika.TikaMetadataExtractor) Metacard(ddf.catalog.data.Metacard)
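
The transform above copies the incoming stream into a `TemporaryFileBackedOutputStream` before doing anything else, so the same bytes can be opened more than once (the extractor reads a fresh copy, and the buffer is later handed to `enrichMetacard`). The following is a minimal sketch of that buffer-then-reopen idea using only the JDK. `ReplayableInput` is a hypothetical name, and an in-memory byte array stands in for DDF's file-backed stream, so it is only reasonable for small inputs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper (not part of DDF): buffers a one-shot stream so it can be reopened. */
final class ReplayableInput {
    private final byte[] bytes;

    private ReplayableInput(byte[] bytes) {
        this.bytes = bytes;
    }

    /** Drains the given stream into memory; rejects null input up front. */
    static ReplayableInput copyOf(InputStream input) throws IOException {
        if (input == null) {
            throw new IllegalArgumentException("Cannot buffer null input.");
        }
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int read;
        while ((read = input.read(chunk)) != -1) {
            buffer.write(chunk, 0, read);
        }
        return new ReplayableInput(buffer.toByteArray());
    }

    /** Number of bytes that were copied from the original stream. */
    long size() {
        return bytes.length;
    }

    /** Returns a fresh stream positioned at the start of the buffered content. */
    InputStream openStream() {
        return new ByteArrayInputStream(bytes);
    }
}
```

Each call to `openStream()` yields an independent stream, so one consumer failing or closing its copy does not affect the next.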

Example 32 with CatalogTransformerException

use of ddf.catalog.transform.CatalogTransformerException in project ddf by codice.

the class SaxEventHandlerDelegate method read.

/**
 * Parses the given {@link InputStream}, passing the SAX events to the {@link
 * SaxEventHandlerDelegate#eventHandlers} so they can collect the {@link Attribute}s for a {@link
 * Metacard}.
 *
 * @param inputStream an XML document that can be parsed into a Metacard
 * @return this delegate, once parsing has completed
 * @throws CatalogTransformerException if the document cannot be read or parsed
 */
public SaxEventHandlerDelegate read(InputStream inputStream) throws CatalogTransformerException {
    try {
        InputSource newStream = new InputSource(new BufferedInputStream(inputStream));
        /*
         * Set the parser's ContentHandler to this delegate, which ensures the delegate receives all
         * parse events that should be handled by a SaxEventHandler (startElement, endElement,
         * characters, startPrefixMapping, etc.).
         * Set the parser's ErrorHandler to a freshly configured InputTransformerErrorHandler.
         */
        parser.setContentHandler(this);
        InputTransformerErrorHandler inputTransformerErrorHandler = getInputTransformerErrorHandler().configure(new StringBuilder());
        parser.setErrorHandler(inputTransformerErrorHandler);
        parser.parse(newStream);
    } catch (IOException | SAXException e) {
        throw new CatalogTransformerException("Could not properly parse metacard", e);
    }
    return this;
}
Also used : InputSource(org.xml.sax.InputSource) BufferedInputStream(java.io.BufferedInputStream) CatalogTransformerException(ddf.catalog.transform.CatalogTransformerException) IOException(java.io.IOException) SAXException(org.xml.sax.SAXException)
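
The pattern in this example (register the object itself as the parser's content handler, install an error handler, parse, and translate `IOException`/`SAXException` into one exception type) can be tried with the JDK's SAX classes alone. This is a sketch under stated assumptions, not DDF code: `ElementCollector` is a hypothetical class, and an unchecked `IllegalStateException` stands in for `CatalogTransformerException`.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler that records element names and returns itself after parsing. */
final class ElementCollector extends DefaultHandler {
    private final List<String> names = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        names.add(qName);
    }

    ElementCollector read(InputStream inputStream) {
        try {
            XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            // This object receives both the content events and the error callbacks.
            parser.setContentHandler(this);
            parser.setErrorHandler(this);
            parser.parse(new InputSource(inputStream));
        } catch (IOException | SAXException | ParserConfigurationException e) {
            // Stand-in for CatalogTransformerException; the cause is preserved.
            throw new IllegalStateException("Could not properly parse document", e);
        }
        return this;
    }

    List<String> names() {
        return names;
    }
}
```

Returning `this` from `read` lets the caller chain straight into whatever accessor exposes the parsed result, here `names()`.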

Example 33 with CatalogTransformerException

use of ddf.catalog.transform.CatalogTransformerException in project ddf by codice.

the class AtomTransformer method createOutputStream.

private byte[] createOutputStream(Feed feed) throws CatalogTransformerException {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
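    // Remember the caller's context class loader so it can be restored afterwards.
    ClassLoader tccl = Thread.currentThread().getContextClassLoader();
    try {
        // Write the feed with AtomTransformer's class loader as the thread context class loader.
        Thread.currentThread().setContextClassLoader(AtomTransformer.class.getClassLoader());
        feed.writeTo(baos);
    } catch (IOException e) {
        LOGGER.info("Could not write to output stream.", e);
        throw new CatalogTransformerException("Could not transform into Atom.", e);
    } finally {
        // Always restore the original context class loader, even on failure.
        Thread.currentThread().setContextClassLoader(tccl);
    }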
    return baos.toByteArray();
}
Also used : CatalogTransformerException(ddf.catalog.transform.CatalogTransformerException) ByteArrayOutputStream(org.apache.commons.io.output.ByteArrayOutputStream) IOException(java.io.IOException)
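
The save, swap, and restore handling of the thread context class loader in this example is a general idiom and can be factored into a small helper. The sketch below is illustrative only; `ContextClassLoaderSwap` and `callWith` are hypothetical names that do not exist in DDF.

```java
import java.util.concurrent.Callable;

/** Hypothetical utility: runs an action with a temporary thread context class loader. */
final class ContextClassLoaderSwap {
    private ContextClassLoaderSwap() {}

    static <T> T callWith(ClassLoader loader, Callable<T> action) throws Exception {
        Thread current = Thread.currentThread();
        // Remember the caller's loader before replacing it.
        ClassLoader original = current.getContextClassLoader();
        current.setContextClassLoader(loader);
        try {
            return action.call();
        } finally {
            // Restore the original loader whether the action succeeded or threw.
            current.setContextClassLoader(original);
        }
    }
}
```

Putting the restore in `finally` is the important part: without it, an exception in the action would leave the thread running with the wrong loader.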

Example 34 with CatalogTransformerException

use of ddf.catalog.transform.CatalogTransformerException in project ddf by codice.

the class XmlResponseQueueTransformer method transform.

@Override
public BinaryContent transform(SourceResponse response, Map<String, Serializable> args) throws CatalogTransformerException {
    try {
        PrintWriter writer = printWriterProvider.build(Metacard.class);
        writer.setRawValue("<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n");
        writer.startNode("metacards");
        for (Map.Entry<String, String> nsRow : NAMESPACE_MAP.entrySet()) {
            writer.addAttribute(nsRow.getKey(), nsRow.getValue());
        }
        if (response.getResults() != null && !response.getResults().isEmpty()) {
            StringWriter metacardContent = fjp.invoke(new MetacardForkTask(ImmutableList.copyOf(response.getResults()), fjp, geometryTransformer, threshold, metacardMarshaller));
            writer.setRawValue(metacardContent.getBuffer().toString());
        }
        // Close the "metacards" node.
        writer.endNode();
        ByteArrayInputStream bais = new ByteArrayInputStream(writer.makeString().getBytes(StandardCharsets.UTF_8));
        return new BinaryContentImpl(bais, mimeType);
    } catch (Exception e) {
        LOGGER.info("Failed Query response transformation", e);
        throw new CatalogTransformerException("Failed Query response transformation", e);
    }
}
Also used : StringWriter(java.io.StringWriter) ByteArrayInputStream(java.io.ByteArrayInputStream) CatalogTransformerException(ddf.catalog.transform.CatalogTransformerException) BinaryContentImpl(ddf.catalog.data.impl.BinaryContentImpl) HashMap(java.util.HashMap) Map(java.util.Map) ImmutableMap(com.google.common.collect.ImmutableMap) DataBindingException(javax.xml.bind.DataBindingException) IOException(java.io.IOException) XmlPullParserException(org.xmlpull.v1.XmlPullParserException) MimeTypeParseException(javax.activation.MimeTypeParseException) PrintWriter(ddf.catalog.transformer.api.PrintWriter)
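
When a broad `catch` translates a low-level failure into a domain exception, passing the original exception as the cause keeps its type and stack trace available to callers. The following is a minimal sketch of that wrapping; `CauseDemo` and its nested `TransformFailure` are hypothetical stand-ins for the transformer and `CatalogTransformerException`.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical example of wrapping a failure while preserving its cause. */
final class CauseDemo {
    /** Stand-in for a domain exception such as CatalogTransformerException. */
    static final class TransformFailure extends Exception {
        TransformFailure(String message, Throwable cause) {
            super(message, cause);
        }
    }

    private CauseDemo() {}

    static byte[] encode(String text) throws TransformFailure {
        try {
            if (text == null) {
                throw new IllegalArgumentException("text must not be null");
            }
            return text.getBytes(StandardCharsets.UTF_8);
        } catch (RuntimeException e) {
            // Wrap, but hand the original exception along as the cause.
            throw new TransformFailure("Failed to encode text", e);
        }
    }
}
```

A caller can then inspect `getCause()` to tell, for example, a bad argument from an I/O problem, even though both surface as the same exception type.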

Example 35 with CatalogTransformerException

use of ddf.catalog.transform.CatalogTransformerException in project ddf by codice.

the class VideoInputTransformer method transform.

@Override
public Metacard transform(InputStream input, String id) throws IOException, CatalogTransformerException {
    Metacard metacard;
    try {
        TikaMetadataExtractor tikaMetadataExtractor = new TikaMetadataExtractor(input);
        Metadata metadata = tikaMetadataExtractor.getMetadata();
        String metadataText = tikaMetadataExtractor.getMetadataXml();
        metacard = MetacardCreator.createMetacard(metadata, id, metadataText, metacardType);
        metacard.setAttribute(new AttributeImpl(Core.DATATYPE, DataType.MOVING_IMAGE.toString()));
    } catch (TikaException e) {
        throw new CatalogTransformerException(e);
    }
    return metacard;
}
Also used : TikaMetadataExtractor(ddf.catalog.transformer.common.tika.TikaMetadataExtractor) Metacard(ddf.catalog.data.Metacard) TikaException(org.apache.tika.exception.TikaException) AttributeImpl(ddf.catalog.data.impl.AttributeImpl) Metadata(org.apache.tika.metadata.Metadata) CatalogTransformerException(ddf.catalog.transform.CatalogTransformerException)

Aggregations

CatalogTransformerException (ddf.catalog.transform.CatalogTransformerException) 112
IOException (java.io.IOException) 53
Metacard (ddf.catalog.data.Metacard) 44
InputStream (java.io.InputStream) 40
ByteArrayInputStream (java.io.ByteArrayInputStream) 29
BinaryContent (ddf.catalog.data.BinaryContent) 25
InputTransformer (ddf.catalog.transform.InputTransformer) 21
Serializable (java.io.Serializable) 21
HashMap (java.util.HashMap) 21
Result (ddf.catalog.data.Result) 16
BinaryContentImpl (ddf.catalog.data.impl.BinaryContentImpl) 15
TemporaryFileBackedOutputStream (org.codice.ddf.platform.util.TemporaryFileBackedOutputStream) 14
ArrayList (java.util.ArrayList) 13
MetacardImpl (ddf.catalog.data.impl.MetacardImpl) 12
Test (org.junit.Test) 12
AttributeImpl (ddf.catalog.data.impl.AttributeImpl) 10
MimeType (javax.activation.MimeType) 10
SourceResponse (ddf.catalog.operation.SourceResponse) 9
MetacardTransformer (ddf.catalog.transform.MetacardTransformer) 8
List (java.util.List) 8