
Example 1 with ExtractorFactory

Use of org.semanticdesktop.aperture.extractor.ExtractorFactory in the Apache Stanbol project.

From the class MetaxaCore, method extract.

/**
     * Returns a model containing all the metadata that could be extracted
     * by reading the given input stream using the given MIME type.
     *
     * @param in
     *            an {@link InputStream} where to read the document from
     * @param docId
     *            a {@link URIImpl} with the document URI
     * @param mimeType
     *            a {@link String} with the MIME type
     * @return a {@link Model} containing the metadata or {@code null} if no
     *         extractor is available for the given MIME type
     * @throws ExtractorException
     *             if there is an error when extracting the metadata
     * @throws IOException
     *             if there is an error when reading the input stream
     */
public Model extract(InputStream in, URIImpl docId, String mimeType) throws ExtractorException, IOException {
    @SuppressWarnings("rawtypes") Set factories = this.extractorRegistry.getExtractorFactories(mimeType);
    Model result = null;
    if (factories != null && !factories.isEmpty()) {
        // get extractor from the first available factory
        ExtractorFactory factory = (ExtractorFactory) factories.iterator().next();
        Extractor extractor = factory.get();
        RDFContainerFactory containerFactory = new RDFContainerFactoryImpl();
        RDFContainer container = containerFactory.getRDFContainer(docId);
        // close the stream even if extraction fails, so it is not leaked
        try {
            extractor.extract(container.getDescribedUri(), new BufferedInputStream(in, 8192), null, mimeType, container);
        } finally {
            in.close();
        }
        result = container.getModel();
    }
    return result;
}
Also used: Set (java.util.Set), RDFContainer (org.semanticdesktop.aperture.rdf.RDFContainer), BufferedInputStream (java.io.BufferedInputStream), ExtractorFactory (org.semanticdesktop.aperture.extractor.ExtractorFactory), Model (org.ontoware.rdf2go.model.Model), Extractor (org.semanticdesktop.aperture.extractor.Extractor), RDFContainerFactory (org.semanticdesktop.aperture.rdf.RDFContainerFactory), RDFContainerFactoryImpl (org.semanticdesktop.aperture.rdf.impl.RDFContainerFactoryImpl)
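
The method above follows a common pattern: look up the factories registered for a MIME type, take the first one, obtain an extractor from it, and run that extractor over a buffered stream. The sketch below reproduces only that control flow so it can be run without Aperture on the classpath. Every type in it (ExtractSketch, Registry, the nested Extractor and ExtractorFactory interfaces, byteCountingFactory) is a hypothetical stand-in invented for illustration, not the Aperture API, and a plain Map takes the place of the RDF model.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class ExtractSketch {

    /** Hypothetical stand-in for Aperture's Extractor; not the real interface. */
    public interface Extractor {
        void extract(InputStream in, String mimeType, Map<String, String> result) throws IOException;
    }

    /** Hypothetical stand-in for Aperture's ExtractorFactory. */
    public interface ExtractorFactory {
        Extractor get();
    }

    /** Hypothetical registry mapping a MIME type to the factories that support it. */
    public static final class Registry {
        private final Map<String, Set<ExtractorFactory>> byMimeType = new HashMap<>();

        public void add(String mimeType, ExtractorFactory factory) {
            byMimeType.computeIfAbsent(mimeType, k -> new LinkedHashSet<>()).add(factory);
        }

        public Set<ExtractorFactory> getExtractorFactories(String mimeType) {
            return byMimeType.getOrDefault(mimeType, Collections.emptySet());
        }
    }

    /** Illustrative factory whose extractor records the MIME type and the number of bytes read. */
    public static ExtractorFactory byteCountingFactory() {
        return () -> (in, mimeType, result) -> {
            int count = 0;
            while (in.read() != -1) {
                count++;
            }
            result.put("mimeType", mimeType);
            result.put("byteCount", String.valueOf(count));
        };
    }

    /** Returns the extracted metadata, or null if no factory is registered for the MIME type. */
    public static Map<String, String> extract(Registry registry, InputStream in, String mimeType)
            throws IOException {
        Set<ExtractorFactory> factories = registry.getExtractorFactories(mimeType);
        if (factories.isEmpty()) {
            return null;
        }
        // get extractor from the first available factory
        Extractor extractor = factories.iterator().next().get();
        Map<String, String> result = new LinkedHashMap<>();
        // try-with-resources closes the wrapper and the underlying stream, even on failure
        try (InputStream buffered = new BufferedInputStream(in, 8192)) {
            extractor.extract(buffered, mimeType, result);
        }
        return result;
    }

    private static InputStream stream(String text) {
        return new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        Registry registry = new Registry();
        registry.add("text/plain", byteCountingFactory());

        // prints {mimeType=text/plain, byteCount=5}
        System.out.println(extract(registry, stream("hello"), "text/plain"));
        // prints null, because nothing is registered for this MIME type
        System.out.println(extract(registry, stream("hello"), "application/pdf"));
    }
}
```

As in the original method, a null return signals that no extractor is available, so callers need a null check before using the result. Returning an Optional would be an alternative design, but it would not match the contract documented in the Javadoc above.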
