
Example 86 with RDFTerm

use of org.apache.clerezza.commons.rdf.RDFTerm in project stanbol by apache.

the class EnhancementStructureHelper method validateNERAnnotations.

/**
     * Validates that fise:TextAnnotations with the dc:type dbp-ont:Person,
     * dbp-ont:Organisation or dbp-ont:Place have a
     * fise:selected-text value (this implicitly also checks that
     * fise:selection-context, fise:start and fise:end are defined).<p>
     * Called by {@link #validateTextAnnotation(Graph, IRI, String, Map)}
     * @param enhancements
     * @param textAnnotation
     * @param selectedTextResource the fise:selected-text value
     */
private static void validateNERAnnotations(Graph enhancements, IRI textAnnotation, RDFTerm selectedTextResource) {
    Iterator<Triple> dcTypeIterator = enhancements.filter(textAnnotation, DC_TYPE, null);
    boolean isNERAnnotation = false;
    while (dcTypeIterator.hasNext() && !isNERAnnotation) {
        RDFTerm dcTypeValue = dcTypeIterator.next().getObject();
        isNERAnnotation = DBPEDIA_PERSON.equals(dcTypeValue) || DBPEDIA_ORGANISATION.equals(dcTypeValue) || DBPEDIA_PLACE.equals(dcTypeValue);
    }
    if (isNERAnnotation) {
        assertNotNull("fise:TextAnnotations with a dc:type of dbp-ont:Person, " + "dbp-ont:Organisation or dbp-ont:Place MUST have a fise:selected-text value (uri " + textAnnotation + ")", selectedTextResource);
    }
}
Also used : Triple(org.apache.clerezza.commons.rdf.Triple) RDFTerm(org.apache.clerezza.commons.rdf.RDFTerm)
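
The pattern above (filter the graph with a null object as wildcard, then test each dc:type value against the three NER types) can be sketched without the Clerezza dependency. This is a minimal illustration only: NerCheckSketch, its Triple class and its filter method are hypothetical stand-ins that model terms as plain strings, and the URI constants are assumed values, not the constants from the Stanbol code base.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in types, NOT the Clerezza API: RDF terms are plain strings.
final class NerCheckSketch {

    static final class Triple {
        final String subject, predicate, object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    // Assumed URIs, used here for illustration only.
    static final String DC_TYPE = "http://purl.org/dc/terms/type";
    static final Set<String> NER_TYPES = new HashSet<>(Arrays.asList(
            "http://dbpedia.org/ontology/Person",
            "http://dbpedia.org/ontology/Organisation",
            "http://dbpedia.org/ontology/Place"));

    // Returns all triples matching the pattern; a null argument matches any value.
    static Iterator<Triple> filter(List<Triple> graph, String s, String p, String o) {
        List<Triple> hits = new ArrayList<>();
        for (Triple t : graph) {
            if ((s == null || s.equals(t.subject))
                    && (p == null || p.equals(t.predicate))
                    && (o == null || o.equals(t.object))) {
                hits.add(t);
            }
        }
        return hits.iterator();
    }

    // True if any dc:type value of the annotation is one of the three NER types.
    static boolean isNerAnnotation(List<Triple> graph, String annotation) {
        Iterator<Triple> it = filter(graph, annotation, DC_TYPE, null);
        while (it.hasNext()) {
            if (NER_TYPES.contains(it.next().object)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Triple> graph = new ArrayList<>();
        graph.add(new Triple("urn:annotation:1", DC_TYPE, "http://dbpedia.org/ontology/Place"));
        System.out.println(isNerAnnotation(graph, "urn:annotation:1"));
    }
}
```

A validator built on this would only require the selected-text value when isNerAnnotation returns true, mirroring the conditional assertNotNull in the original method.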

Example 87 with RDFTerm

use of org.apache.clerezza.commons.rdf.RDFTerm in project stanbol by apache.

the class MultipartRequestTest method testUploadWithMetadata.

/**
     * Stanbol also supports uploading pre-existing metadata with the content.
     * This unit test uses an example that passes TextAnnotations for free-text
     * tags provided by users, which are then linked to entities in DBpedia.
     * @throws IOException
     */
@Test
public void testUploadWithMetadata() throws IOException {
    //create the metadata
    RDFTerm user = new PlainLiteralImpl("Rupert Westenthaler");
    final IRI contentItemId = new IRI("http://www.example.com/test.html");
    Graph metadata = new SimpleGraph();
    addTagAsTextAnnotation(metadata, contentItemId, "Germany", DBPEDIA_PLACE, user);
    addTagAsTextAnnotation(metadata, contentItemId, "Europe", DBPEDIA_PLACE, user);
    addTagAsTextAnnotation(metadata, contentItemId, "NATO", DBPEDIA_ORGANISATION, user);
    addTagAsTextAnnotation(metadata, contentItemId, "Silvio Berlusconi", DBPEDIA_PERSON, user);
    String rdfContentType = SupportedFormat.RDF_XML;
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    serializer.serialize(out, metadata, rdfContentType);
    String rdfContent = new String(out.toByteArray(), UTF8);
    MultipartEntityBuilder ciBuilder = MultipartEntityBuilder.create();
    //add the metadata
    /*
         * NOTE: We need to override getFilename here, because the filename MUST
         *       BE the URI of the ContentItem. This is important because the
         *       metadata contains triples about that ContentItem, so it MUST
         *       BE ensured that the URI of the ContentItem created by
         *       the Stanbol Enhancer is the same as the URI used in the
         *       metadata!
         */
    ciBuilder.addPart("metadata", new StringBody(rdfContent, ContentType.create(rdfContentType).withCharset(UTF8)) {

        @Override
        public String getFilename() {
            //uri of the ContentItem
            return contentItemId.getUnicodeString();
        }
    });
    //add the content
    ciBuilder.addTextBody("content", HTML_CONTENT, ContentType.TEXT_HTML.withCharset(UTF8));
    //send the request
    String receivedContent = executor.execute(builder.buildPostRequest(getEndpoint())
            .withHeader("Accept", "text/rdf+nt")
            .withEntity(ciBuilder.build()))
        .assertStatus(200)
        .assertContentRegexp(
            //and the expected enhancements based on the parsed content
            "http://purl.org/dc/terms/creator.*LanguageDetectionEnhancementEngine",
            "http://purl.org/dc/terms/language.*en",
            "http://fise.iks-project.eu/ontology/entity-label.*Paris",
            "http://purl.org/dc/terms/creator.*org.apache.stanbol.enhancer.engines.opennlp.*NamedEntityExtractionEnhancementEngine",
            "http://fise.iks-project.eu/ontology/entity-label.*Bob Marley",
            //additional enhancements based on parsed metadata
            "http://fise.iks-project.eu/ontology/entity-reference.*http://dbpedia.org/resource/Germany.*",
            "http://fise.iks-project.eu/ontology/entity-reference.*http://dbpedia.org/resource/NATO.*",
            "http://fise.iks-project.eu/ontology/entity-reference.*http://dbpedia.org/resource/Silvio_Berlusconi.*",
            "http://fise.iks-project.eu/ontology/entity-reference.*http://dbpedia.org/resource/Europe.*")
        .getContent();
    log.debug("Content:\n{}\n", receivedContent);
}
Also used : IRI(org.apache.clerezza.commons.rdf.IRI) SimpleGraph(org.apache.clerezza.commons.rdf.impl.utils.simple.SimpleGraph) Graph(org.apache.clerezza.commons.rdf.Graph) MultipartEntityBuilder(org.apache.http.entity.mime.MultipartEntityBuilder) PlainLiteralImpl(org.apache.clerezza.commons.rdf.impl.utils.PlainLiteralImpl) StringBody(org.apache.http.entity.mime.content.StringBody) RDFTerm(org.apache.clerezza.commons.rdf.RDFTerm) ByteArrayOutputStream(java.io.ByteArrayOutputStream) Test(org.junit.Test)

Example 88 with RDFTerm

use of org.apache.clerezza.commons.rdf.RDFTerm in project stanbol by apache.

the class GraphMultiplexer method buildPublicKey.

/**
     * Creates an {@link OWLOntologyID} object by combining the ontologyIRI and the versionIRI, where
     * applicable, of the stored graph.
     * 
     * @param resource
     *            the ontology
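     * @return the public key of the ontology; if no ontology IRI is stored for the resource, the key
     *         decoded from the resource itself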
     */
protected OWLOntologyID buildPublicKey(final IRI resource) {
    // TODO desanitize?
    org.semanticweb.owlapi.model.IRI oiri = null, viri = null;
    Iterator<Triple> it = meta.filter(resource, HAS_ONTOLOGY_IRI_URIREF, null);
    if (it.hasNext()) {
        RDFTerm obj = it.next().getObject();
        if (obj instanceof IRI)
            oiri = org.semanticweb.owlapi.model.IRI.create(((IRI) obj).getUnicodeString());
        else if (obj instanceof Literal)
            oiri = org.semanticweb.owlapi.model.IRI.create(((Literal) obj).getLexicalForm());
    } else {
        // Anonymous ontology? Decode the resource itself (which is not null)
        return OntologyUtils.decode(resource.getUnicodeString());
    }
    it = meta.filter(resource, HAS_VERSION_IRI_URIREF, null);
    if (it.hasNext()) {
        RDFTerm obj = it.next().getObject();
        if (obj instanceof IRI)
            viri = org.semanticweb.owlapi.model.IRI.create(((IRI) obj).getUnicodeString());
        else if (obj instanceof Literal)
            viri = org.semanticweb.owlapi.model.IRI.create(((Literal) obj).getLexicalForm());
    }
    if (viri == null)
        return new OWLOntologyID(oiri);
    else
        return new OWLOntologyID(oiri, viri);
}
Also used : Triple(org.apache.clerezza.commons.rdf.Triple) IRI(org.apache.clerezza.commons.rdf.IRI) BlankNodeOrIRI(org.apache.clerezza.commons.rdf.BlankNodeOrIRI) Literal(org.apache.clerezza.commons.rdf.Literal) OWLOntologyID(org.semanticweb.owlapi.model.OWLOntologyID) RDFTerm(org.apache.clerezza.commons.rdf.RDFTerm)

Example 89 with RDFTerm

use of org.apache.clerezza.commons.rdf.RDFTerm in project stanbol by apache.

the class GraphMultiplexer method getSize.

@Override
public int getSize(OWLOntologyID publicKey) {
    IRI subj = buildResource(publicKey);
    Iterator<Triple> it = meta.filter(subj, SIZE_IN_TRIPLES_URIREF, null);
    if (it.hasNext()) {
        RDFTerm obj = it.next().getObject();
        if (obj instanceof Literal) {
            String s = ((Literal) obj).getLexicalForm();
            try {
                return Integer.parseInt(s);
            } catch (NumberFormatException ex) {
                log.warn("Not a valid integer value {} for size of {}", s, publicKey);
                return -1;
            }
        }
    }
    return 0;
}
Also used : Triple(org.apache.clerezza.commons.rdf.Triple) IRI(org.apache.clerezza.commons.rdf.IRI) BlankNodeOrIRI(org.apache.clerezza.commons.rdf.BlankNodeOrIRI) Literal(org.apache.clerezza.commons.rdf.Literal) RDFTerm(org.apache.clerezza.commons.rdf.RDFTerm)
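
The return convention of getSize (0 when no size literal is stored, -1 when the stored value is not a valid integer) can be sketched in isolation. This is an illustrative sketch, not the Stanbol implementation: SizeLookupSketch is a hypothetical name, and a plain Map from subject to lexical form stands in for the meta graph.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: a Map from subject to lexical form replaces the meta graph.
final class SizeLookupSketch {

    // Returns the parsed size, 0 if nothing is stored, or -1 if the value is not an integer.
    static int sizeOf(Map<String, String> sizeLiterals, String subject) {
        String lexicalForm = sizeLiterals.get(subject);
        if (lexicalForm == null) {
            return 0; // no size triple for this subject
        }
        try {
            return Integer.parseInt(lexicalForm);
        } catch (NumberFormatException ex) {
            return -1; // a value is present but malformed
        }
    }

    public static void main(String[] args) {
        Map<String, String> meta = new HashMap<>();
        meta.put("urn:ontology:a", "42");
        meta.put("urn:ontology:b", "many");
        System.out.println(sizeOf(meta, "urn:ontology:a"));
        System.out.println(sizeOf(meta, "urn:ontology:b"));
        System.out.println(sizeOf(meta, "urn:ontology:c"));
    }
}
```

Callers need to treat the two sentinel values differently: 0 cannot be told apart from a genuinely empty graph, whereas -1 always signals a malformed entry.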

Example 90 with RDFTerm

use of org.apache.clerezza.commons.rdf.RDFTerm in project stanbol by apache.

the class GraphMultiplexer method buildResource.

/**
     * Creates an {@link IRI} out of an {@link OWLOntologyID}, so it can be used as an identifier. This
     * does NOT necessarily correspond to the IRI that identifies the stored graph. In order to obtain
     * that, check the objects of any MAPS_TO_GRAPH assertions.
     * 
     * @param publicKey
     * @return
     */
protected IRI buildResource(final OWLOntologyID publicKey) {
    if (publicKey == null)
        throw new IllegalArgumentException("Cannot build an IRI resource on a null public key!");
    // The IRI is of the form ontologyIRI[:::versionIRI] (TODO use something less conventional?)
    // XXX should versionIRI also include the version IRI set by owners? Currently not
    // Remember not to sanitize logical identifiers.
    org.semanticweb.owlapi.model.IRI ontologyIri = publicKey.getOntologyIRI(), versionIri = publicKey.getVersionIRI();
    if (ontologyIri == null)
        throw new IllegalArgumentException("Cannot build an IRI resource on an anonymous public key!");
    log.debug("Searching for a meta graph entry for public key:");
    log.debug(" -- {}", publicKey);
    IRI match = null;
    LiteralFactory lf = LiteralFactory.getInstance();
    Literal oiri = lf.createTypedLiteral(new IRI(ontologyIri.toString()));
    Literal viri = versionIri == null ? null : lf.createTypedLiteral(new IRI(versionIri.toString()));
    for (Iterator<Triple> it = meta.filter(null, HAS_ONTOLOGY_IRI_URIREF, oiri); it.hasNext(); ) {
        RDFTerm subj = it.next().getSubject();
        log.debug(" -- Ontology IRI match found. Scanning");
        log.debug(" -- RDFTerm : {}", subj);
        if (!(subj instanceof IRI)) {
            log.debug(" ---- (uncomparable: skipping...)");
            continue;
        }
        if (viri != null) {
            // Must find matching versionIRI
            if (meta.contains(new TripleImpl((IRI) subj, HAS_VERSION_IRI_URIREF, viri))) {
                log.debug(" ---- Version IRI match!");
                match = (IRI) subj;
                // Found
                break;
            } else {
                log.debug(" ---- Expected version IRI match not found.");
                // There could be another with the right versionIRI.
                continue;
            }
        } else {
            // Must find unversioned resource
            if (meta.filter((IRI) subj, HAS_VERSION_IRI_URIREF, null).hasNext()) {
                log.debug(" ---- Unexpected version IRI found. Skipping.");
                continue;
            } else {
                log.debug(" ---- Unversioned match!");
                match = (IRI) subj;
                // Found
                break;
            }
        }
    }
    log.debug("Matching IRI in graph : {}", match);
    if (match == null)
        return new IRI(OntologyUtils.encode(publicKey));
    else
        return match;
}
Also used : Triple(org.apache.clerezza.commons.rdf.Triple) IRI(org.apache.clerezza.commons.rdf.IRI) BlankNodeOrIRI(org.apache.clerezza.commons.rdf.BlankNodeOrIRI) Literal(org.apache.clerezza.commons.rdf.Literal) RDFTerm(org.apache.clerezza.commons.rdf.RDFTerm) TripleImpl(org.apache.clerezza.commons.rdf.impl.utils.TripleImpl) LiteralFactory(org.apache.clerezza.rdf.core.LiteralFactory)
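
The code comment in buildResource describes the fallback identifier as having the form ontologyIRI[:::versionIRI]. A minimal sketch of that form follows. PublicKeySketch and its methods are hypothetical and only illustrate the string layout; the real OntologyUtils.encode and OntologyUtils.decode may apply further sanitizing that is not shown in the source above.

```java
// Hypothetical sketch of the "ontologyIRI[:::versionIRI]" layout; not OntologyUtils itself.
final class PublicKeySketch {

    static final String SEPARATOR = ":::";

    // Joins the two IRIs; the version part is omitted when versionIri is null.
    static String encode(String ontologyIri, String versionIri) {
        if (ontologyIri == null) {
            throw new IllegalArgumentException("An ontology IRI is required.");
        }
        return versionIri == null ? ontologyIri : ontologyIri + SEPARATOR + versionIri;
    }

    // Splits at the first separator; element 1 is null for an unversioned key.
    static String[] decode(String key) {
        int idx = key.indexOf(SEPARATOR);
        if (idx < 0) {
            return new String[] { key, null };
        }
        return new String[] { key.substring(0, idx), key.substring(idx + SEPARATOR.length()) };
    }

    public static void main(String[] args) {
        String key = encode("http://example.org/onto", "http://example.org/onto/1.0");
        String[] parts = decode(key);
        System.out.println(key);
        System.out.println(parts[0] + " | " + parts[1]);
    }
}
```

Splitting at the first separator assumes the ontology IRI itself never contains three consecutive colons, which is one reason an actual implementation might escape or sanitize its input.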

Aggregations

RDFTerm (org.apache.clerezza.commons.rdf.RDFTerm) 126
IRI (org.apache.clerezza.commons.rdf.IRI) 84
Triple (org.apache.clerezza.commons.rdf.Triple) 70
BlankNodeOrIRI (org.apache.clerezza.commons.rdf.BlankNodeOrIRI) 48
Literal (org.apache.clerezza.commons.rdf.Literal) 35
Test (org.junit.Test) 35
HashSet (java.util.HashSet) 30
HashMap (java.util.HashMap) 28
TripleImpl (org.apache.clerezza.commons.rdf.impl.utils.TripleImpl) 26
Graph (org.apache.clerezza.commons.rdf.Graph) 24
ContentItem (org.apache.stanbol.enhancer.servicesapi.ContentItem) 18
ArrayList (java.util.ArrayList) 17
PlainLiteralImpl (org.apache.clerezza.commons.rdf.impl.utils.PlainLiteralImpl) 16
EngineException (org.apache.stanbol.enhancer.servicesapi.EngineException) 13
OWLOntologyID (org.semanticweb.owlapi.model.OWLOntologyID) 13
SimpleGraph (org.apache.clerezza.commons.rdf.impl.utils.simple.SimpleGraph) 12
Collection (java.util.Collection) 10
IndexedGraph (org.apache.stanbol.commons.indexedgraph.IndexedGraph) 10
Lock (java.util.concurrent.locks.Lock) 9
IOException (java.io.IOException) 5