Examples with ParsingProvider - org.apache.clerezza.rdf.core.serializedform.ParsingProvider

Example 1 with ParsingProvider

use of org.apache.clerezza.rdf.core.serializedform.ParsingProvider in project stanbol by apache.

the class ClerezzaBackendTest method readTestData.

@BeforeClass
public static void readTestData() throws IOException {
    ParsingProvider parser = new JenaParserProvider();
    graph = new IndexedGraph();
    InputStream in = ClerezzaBackendTest.class.getClassLoader().getResourceAsStream("testdata.rdf.zip");
    assertNotNull(in);
    ZipInputStream zipIn = new ZipInputStream(new BufferedInputStream(in));
    InputStream uncloseable = new UncloseableStream(zipIn);
    ZipEntry entry;
    while ((entry = zipIn.getNextEntry()) != null) {
        if (entry.getName().endsWith(".rdf")) {
            //NOTE(rw): the last argument (null here) is the base URI used to resolve relative paths
            parser.parse(graph, uncloseable, SupportedFormat.RDF_XML, null);
        }
    }
    assertTrue(graph.size() > 0);
    zipIn.close();
}

Also used : JenaParserProvider(org.apache.clerezza.rdf.jena.parser.JenaParserProvider) ParsingProvider(org.apache.clerezza.rdf.core.serializedform.ParsingProvider) ZipInputStream(java.util.zip.ZipInputStream) BufferedInputStream(java.io.BufferedInputStream) FilterInputStream(java.io.FilterInputStream) ByteArrayInputStream(java.io.ByteArrayInputStream) InputStream(java.io.InputStream) ZipEntry(java.util.zip.ZipEntry) IndexedGraph(org.apache.stanbol.commons.indexedgraph.IndexedGraph) BeforeClass(org.junit.BeforeClass)
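
Example 1 hands the parser an UncloseableStream rather than the ZipInputStream itself, but the source of that class is not shown here. The sketch below illustrates one plausible reason for the wrapper: a consumer that closes its input after each entry would otherwise close the whole archive. It assumes UncloseableStream is a FilterInputStream whose close() does nothing, which is a guess consistent with FilterInputStream appearing in the import list. The names UncloseableStreamSketch, readAndClose, readEntries and zipOf are hypothetical, and readAndClose merely stands in for a parser; no Clerezza or Stanbol API is used.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class UncloseableStreamSketch {

    /** Hypothetical stand-in for the UncloseableStream used in Example 1. */
    static class UncloseableStream extends FilterInputStream {
        UncloseableStream(InputStream in) {
            super(in);
        }

        @Override
        public void close() {
            // Deliberately a no-op: the owner of the wrapped stream closes it.
        }
    }

    /** Stand-in for a parser: reads the current entry to its end, then closes its input. */
    static String readAndClose(InputStream in) throws IOException {
        try (InputStream s = in) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = s.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    /** Returns the content of every entry whose name ends with the given suffix. */
    static List<String> readEntries(byte[] zipBytes, String suffix) throws IOException {
        List<String> contents = new ArrayList<>();
        try (ZipInputStream zipIn = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            InputStream uncloseable = new UncloseableStream(zipIn);
            ZipEntry entry;
            while ((entry = zipIn.getNextEntry()) != null) {
                if (entry.getName().endsWith(suffix)) {
                    // The consumer closes only the wrapper, so zipIn stays usable.
                    contents.add(readAndClose(uncloseable));
                }
            }
        }
        return contents;
    }

    /** Builds an in-memory zip from alternating entry names and contents. */
    static byte[] zipOf(String... namesAndContents) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zipOut = new ZipOutputStream(bytes)) {
            for (int i = 0; i + 1 < namesAndContents.length; i += 2) {
                zipOut.putNextEntry(new ZipEntry(namesAndContents[i]));
                zipOut.write(namesAndContents[i + 1].getBytes(StandardCharsets.UTF_8));
                zipOut.closeEntry();
            }
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] zip = zipOf("a.rdf", "first", "skip.txt", "ignored", "b.rdf", "second");
        System.out.println(readEntries(zip, ".rdf")); // prints [first, second]
    }
}
```

Passing zipIn directly to readAndClose would close the archive after the first matching entry, and the next getNextEntry() call would fail with an IOException.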

Example 2 with ParsingProvider

use of org.apache.clerezza.rdf.core.serializedform.ParsingProvider in project stanbol by apache.

the class OWLAPIToClerezzaConverter method owlOntologyToClerezzaGraph.

/**
 * Converts an OWL API {@link OWLOntology} to a Clerezza {@link Graph}.
 *
 * @param ontology
 *            {@link OWLOntology}
 * @return the equivalent Clerezza {@link Graph}, or null if the ontology could not be serialized.
 */
public static org.apache.clerezza.commons.rdf.Graph owlOntologyToClerezzaGraph(OWLOntology ontology) {
    org.apache.clerezza.commons.rdf.Graph mGraph = null;
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    OWLOntologyManager manager = ontology.getOWLOntologyManager();
    try {
        manager.saveOntology(ontology, new RDFXMLOntologyFormat(), out);
        ByteArrayInputStream in = new ByteArrayInputStream(out.toByteArray());
        ParsingProvider parser = new JenaParserProvider();
        mGraph = new SimpleGraph();
        parser.parse(mGraph, in, SupportedFormat.RDF_XML, null);
    } catch (OWLOntologyStorageException e) {
        log.error("Failed to serialize OWL Ontology " + ontology + " for conversion", e);
    }
    return mGraph;
}

Also used : JenaParserProvider(org.apache.clerezza.rdf.jena.parser.JenaParserProvider) ParsingProvider(org.apache.clerezza.rdf.core.serializedform.ParsingProvider) ByteArrayInputStream(java.io.ByteArrayInputStream) SimpleGraph(org.apache.clerezza.commons.rdf.impl.utils.simple.SimpleGraph) ByteArrayOutputStream(java.io.ByteArrayOutputStream) OWLOntologyManager(org.semanticweb.owlapi.model.OWLOntologyManager) RDFXMLOntologyFormat(org.semanticweb.owlapi.io.RDFXMLOntologyFormat) OWLOntologyStorageException(org.semanticweb.owlapi.model.OWLOntologyStorageException)

Example 3 with ParsingProvider

use of org.apache.clerezza.rdf.core.serializedform.ParsingProvider in project stanbol by apache.

the class JenaToClerezzaConverter method jenaModelToClerezzaGraph.

/**
 * Converts a Jena {@link Model} to Clerezza {@link Graph}.
 *
 * @param model {@link Model}
 * @return the equivalent Clerezza {@link Graph}.
 */
public static org.apache.clerezza.commons.rdf.Graph jenaModelToClerezzaGraph(Model model) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    model.write(out);
    ByteArrayInputStream in = new ByteArrayInputStream(out.toByteArray());
    ParsingProvider parser = new JenaParserProvider();
    org.apache.clerezza.commons.rdf.Graph mGraph = new SimpleGraph();
    parser.parse(mGraph, in, SupportedFormat.RDF_XML, null);
    return mGraph;
}

Also used : JenaParserProvider(org.apache.clerezza.rdf.jena.parser.JenaParserProvider) ParsingProvider(org.apache.clerezza.rdf.core.serializedform.ParsingProvider) ByteArrayInputStream(java.io.ByteArrayInputStream) SimpleGraph(org.apache.clerezza.commons.rdf.impl.utils.simple.SimpleGraph) ByteArrayOutputStream(java.io.ByteArrayOutputStream)
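
Examples 2 and 3 follow the same pattern: serialize the source model into a ByteArrayOutputStream, then parse those bytes into the target model. The sketch below shows only that in-memory bridge, using java.util.Properties as a stand-in for the Jena and Clerezza types so that it runs without either library. The names InMemoryBridgeSketch, Serializer, Parser, convert and copyViaBytes are hypothetical and not part of any of the projects above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Properties;

class InMemoryBridgeSketch {

    /** Writes a source model to a stream (the role of model.write(out) in Example 3). */
    interface Serializer<S> {
        void write(S source, OutputStream out) throws IOException;
    }

    /** Reads a target model from a stream (the role of parser.parse(...) in Example 3). */
    interface Parser<T> {
        T parse(InputStream in) throws IOException;
    }

    /** Converts source to a target model by serializing to bytes and parsing them back. */
    static <S, T> T convert(S source, Serializer<S> serializer, Parser<T> parser) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        serializer.write(source, out);
        ByteArrayInputStream in = new ByteArrayInputStream(out.toByteArray());
        return parser.parse(in);
    }

    /** Demo conversion: Properties stands in for both the source and the target model. */
    static Properties copyViaBytes(Properties source) throws IOException {
        return InMemoryBridgeSketch.<Properties, Properties>convert(
            source,
            (props, out) -> props.store(out, null),
            in -> {
                Properties target = new Properties();
                target.load(in);
                return target;
            });
    }

    public static void main(String[] args) throws IOException {
        Properties source = new Properties();
        source.setProperty("format", "RDF/XML");
        Properties copy = copyViaBytes(source);
        System.out.println(copy.getProperty("format")); // prints RDF/XML
    }
}
```

The approach is simple but holds the full serialized form in memory, and it only works when the writer and the parser agree on the format.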

Example 4 with ParsingProvider

use of org.apache.clerezza.rdf.core.serializedform.ParsingProvider in project stanbol by apache.

the class UsageExamples method readTestData.

@BeforeClass
public static void readTestData() throws IOException {
    //parse the RDF test data that is later added as the ContentItem metadata
    ParsingProvider parser = new JenaParserProvider();
    Graph rdfData = parseRdfData(parser, "example.rdf.zip");
    IRI contentItemId = null;
    Iterator<Triple> it = rdfData.filter(null, Properties.ENHANCER_EXTRACTED_FROM, null);
    while (it.hasNext()) {
        RDFTerm r = it.next().getObject();
        if (contentItemId == null) {
            if (r instanceof IRI) {
                contentItemId = (IRI) r;
            }
        } else {
            assertEquals("multiple ContentItem IDs contained in the RDF test data", contentItemId, r);
        }
    }
    assertNotNull("RDF data does not contain an Enhancement extracted from " + "the content item", contentItemId);
    InputStream in = getTestResource("example.txt");
    assertNotNull("Example Plain text content not found", in);
    byte[] textData = IOUtils.toByteArray(in);
    IOUtils.closeQuietly(in);
    ci = ciFactory.createContentItem(contentItemId, new ByteArraySource(textData, "text/plain; charset=UTF-8"));
    ci.getMetadata().addAll(rdfData);
}

Example 5 with ParsingProvider

use of org.apache.clerezza.rdf.core.serializedform.ParsingProvider in project stanbol by apache.

the class ContentItemBackendTest method readTestData.

@BeforeClass
public static void readTestData() throws IOException {
    //parse the RDF test data that is later added as the ContentItem metadata
    ParsingProvider parser = new JenaParserProvider();
    Graph rdfData = parseRdfData(parser, "metadata.rdf.zip");
    IRI contentItemId = null;
    Iterator<Triple> it = rdfData.filter(null, Properties.ENHANCER_EXTRACTED_FROM, null);
    while (it.hasNext()) {
        RDFTerm r = it.next().getObject();
        if (contentItemId == null) {
            if (r instanceof IRI) {
                contentItemId = (IRI) r;
            }
        } else {
            assertEquals("multiple ContentItem IDs contained in the RDF test data", contentItemId, r);
        }
    }
    assertNotNull("RDF data does not contain an Enhancement extracted from " + "the content item", contentItemId);
    InputStream in = getTestResource("content.html");
    assertNotNull("HTML content not found", in);
    byte[] htmlData = IOUtils.toByteArray(in);
    IOUtils.closeQuietly(in);
    ci = ciFactory.createContentItem(contentItemId, new ByteArraySource(htmlData, "text/html; charset=UTF-8"));
    htmlContent = new String(htmlData, UTF8);
    //create a Blob with the text content
    in = getTestResource("content.txt");
    //check the resource before reading it, otherwise a missing file fails with a NullPointerException
    assertNotNull("Plain text content not found", in);
    byte[] textData = IOUtils.toByteArray(in);
    IOUtils.closeQuietly(in);
    ci.addPart(new IRI(ci.getUri().getUnicodeString() + "_text"), ciFactory.createBlob(new ByteArraySource(textData, "text/plain; charset=UTF-8")));
    textContent = new String(textData, UTF8);
    //add the metadata
    ci.getMetadata().addAll(rdfData);
}

Also used : JenaParserProvider(org.apache.clerezza.rdf.jena.parser.JenaParserProvider) Triple(org.apache.clerezza.commons.rdf.Triple) IRI(org.apache.clerezza.commons.rdf.IRI) ParsingProvider(org.apache.clerezza.rdf.core.serializedform.ParsingProvider) IndexedGraph(org.apache.stanbol.commons.indexedgraph.IndexedGraph) SimpleGraph(org.apache.clerezza.commons.rdf.impl.utils.simple.SimpleGraph) Graph(org.apache.clerezza.commons.rdf.Graph) BufferedInputStream(java.io.BufferedInputStream) ZipInputStream(java.util.zip.ZipInputStream) FilterInputStream(java.io.FilterInputStream) InputStream(java.io.InputStream) RDFTerm(org.apache.clerezza.commons.rdf.RDFTerm) ByteArraySource(org.apache.stanbol.enhancer.servicesapi.impl.ByteArraySource) BeforeClass(org.junit.BeforeClass)

Aggregations

ParsingProvider (org.apache.clerezza.rdf.core.serializedform.ParsingProvider) 5
JenaParserProvider (org.apache.clerezza.rdf.jena.parser.JenaParserProvider) 5
ByteArrayInputStream (java.io.ByteArrayInputStream) 3
InputStream (java.io.InputStream) 3
SimpleGraph (org.apache.clerezza.commons.rdf.impl.utils.simple.SimpleGraph) 3
BeforeClass (org.junit.BeforeClass) 3
BufferedInputStream (java.io.BufferedInputStream) 2
ByteArrayOutputStream (java.io.ByteArrayOutputStream) 2
FilterInputStream (java.io.FilterInputStream) 2
ZipInputStream (java.util.zip.ZipInputStream) 2
Graph (org.apache.clerezza.commons.rdf.Graph) 2
IRI (org.apache.clerezza.commons.rdf.IRI) 2
RDFTerm (org.apache.clerezza.commons.rdf.RDFTerm) 2
Triple (org.apache.clerezza.commons.rdf.Triple) 2
IndexedGraph (org.apache.stanbol.commons.indexedgraph.IndexedGraph) 2
ByteArraySource (org.apache.stanbol.enhancer.servicesapi.impl.ByteArraySource) 2
ZipEntry (java.util.zip.ZipEntry) 1
RDFXMLOntologyFormat (org.semanticweb.owlapi.io.RDFXMLOntologyFormat) 1
OWLOntologyManager (org.semanticweb.owlapi.model.OWLOntologyManager) 1
OWLOntologyStorageException (org.semanticweb.owlapi.model.OWLOntologyStorageException) 1