Example 46 with IRI

use of org.apache.clerezza.commons.rdf.IRI in project stanbol by apache.

the class RepresentationReader method parseFromContent.

public Map<String, Representation> parseFromContent(RequestData content, MediaType acceptedMediaType) {
    // (3) Parse the Representation(s) from the entity stream
    if (content.getMediaType().isCompatible(MediaType.APPLICATION_JSON_TYPE)) {
        // parse from json
        throw new UnsupportedOperationException("Parsing of JSON not yet implemented :(");
    } else if (isSupported(content.getMediaType())) {
        // from RDF serialisation
        RdfValueFactory valueFactory = RdfValueFactory.getInstance();
        Map<String, Representation> representations = new HashMap<String, Representation>();
        Set<BlankNodeOrIRI> processed = new HashSet<BlankNodeOrIRI>();
        Graph graph = new IndexedGraph();
        try {
            parser.parse(graph, content.getEntityStream(), content.getMediaType().toString());
        } catch (UnsupportedParsingFormatException e) {
            // String acceptedMediaType = httpHeaders.getFirst("Accept");
            // throw an internal server error, because we check in
            // isReadable(..) for supported types and still end up here with an
            // unsupported format -> therefore it looks like a configuration
            // error on the server (e.g. a missing bundle providing the required parser)
            String message = "Unable to create the Parser for the supported format " + content.getMediaType() + " (" + e + ")";
            log.error(message, e);
            throw new WebApplicationException(Response.status(Status.INTERNAL_SERVER_ERROR).entity(message).header(HttpHeaders.ACCEPT, acceptedMediaType).build());
        } catch (RuntimeException e) {
            // NOTE: Clerezza seems not to provide specific exceptions on
            // parsing errors. Hence the catch for all RuntimeExceptions
            String message = "Unable to parse the provided RDF data (format: " + content.getMediaType() + ", message: " + e.getMessage() + ")";
            log.error(message, e);
            throw new WebApplicationException(Response.status(Status.BAD_REQUEST).entity(message).header(HttpHeaders.ACCEPT, acceptedMediaType).build());
        }
        for (Iterator<Triple> st = graph.iterator(); st.hasNext(); ) {
            BlankNodeOrIRI resource = st.next().getSubject();
            if (resource instanceof IRI && processed.add(resource)) {
                // build a new representation
                representations.put(((IRI) resource).getUnicodeString(), valueFactory.createRdfRepresentation((IRI) resource, graph));
            }
        }
        return representations;
    } else {
        // unsupported media type
        String message = String.format("Parsed Content-Type '%s' is not one of the supported %s", content.getMediaType(), supportedMediaTypes);
        log.info("Bad Request: {}", message);
        throw new WebApplicationException(Response.status(Status.BAD_REQUEST).entity(message).header(HttpHeaders.ACCEPT, acceptedMediaType).build());
    }
}
Also used : IRI(org.apache.clerezza.commons.rdf.IRI) BlankNodeOrIRI(org.apache.clerezza.commons.rdf.BlankNodeOrIRI) HashSet(java.util.HashSet) Set(java.util.Set) WebApplicationException(javax.ws.rs.WebApplicationException) Representation(org.apache.stanbol.entityhub.servicesapi.model.Representation) IndexedGraph(org.apache.stanbol.commons.indexedgraph.IndexedGraph) Graph(org.apache.clerezza.commons.rdf.Graph) Iterator(java.util.Iterator) RdfValueFactory(org.apache.stanbol.entityhub.model.clerezza.RdfValueFactory) HashMap(java.util.HashMap) Map(java.util.Map) MultivaluedMap(javax.ws.rs.core.MultivaluedMap) UnsupportedParsingFormatException(org.apache.clerezza.rdf.core.serializedform.UnsupportedParsingFormatException)
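
The loop at the end of parseFromContent creates one entry per distinct IRI subject: blank-node subjects are skipped, and the return value of Set.add is used to detect subjects that were already handled. The following is a minimal JDK-only sketch of that pattern, not the Stanbol or Clerezza code. The types Node, Iri, Blank and Triple are hypothetical stand-ins, and the value stored per subject is simply the list of its predicates rather than a Representation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class SubjectGroupingSketch {

    /** Hypothetical stand-in for a subject node (not the Clerezza BlankNodeOrIRI). */
    interface Node {}

    /** Hypothetical stand-in for an IRI subject; equality is based on the IRI string. */
    static final class Iri implements Node {
        final String value;
        Iri(String value) { this.value = value; }
        @Override public boolean equals(Object o) {
            return o instanceof Iri && ((Iri) o).value.equals(value);
        }
        @Override public int hashCode() { return value.hashCode(); }
    }

    /** Hypothetical stand-in for a blank node; equality is identity-based. */
    static final class Blank implements Node {}

    /** Hypothetical stand-in for a triple; only subject and predicate are kept. */
    static final class Triple {
        final Node subject;
        final String predicate;
        Triple(Node subject, String predicate) {
            this.subject = subject;
            this.predicate = predicate;
        }
    }

    /** Returns one entry per distinct IRI subject, keyed by the IRI string. */
    static Map<String, List<String>> groupByIriSubject(List<Triple> graph) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        Set<Node> processed = new HashSet<>();
        for (Triple triple : graph) {
            Node subject = triple.subject;
            // Set.add returns false for a subject seen before,
            // so each IRI subject is handled exactly once.
            if (subject instanceof Iri && processed.add(subject)) {
                List<String> predicates = new ArrayList<>();
                for (Triple other : graph) {
                    if (other.subject.equals(subject)) {
                        predicates.add(other.predicate);
                    }
                }
                result.put(((Iri) subject).value, predicates);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Triple> graph = Arrays.asList(
                new Triple(new Iri("urn:a"), "p1"),
                new Triple(new Blank(), "p2"),
                new Triple(new Iri("urn:a"), "p3"),
                new Triple(new Iri("urn:b"), "p1"));
        System.out.println(groupByIriSubject(graph));
    }
}
```

The sketch scans the whole list once per subject for brevity; the original delegates that work to the value factory and the indexed graph.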

Example 47 with IRI

use of org.apache.clerezza.commons.rdf.IRI in project stanbol by apache.

the class RdfRepresentation method remove.

@Override
public void remove(String field, Object parsedValue) {
    if (field == null) {
        throw new IllegalArgumentException("The parsed field MUST NOT be NULL");
    } else if (field.isEmpty()) {
        throw new IllegalArgumentException("The parsed field MUST NOT be Empty");
    }
    if (parsedValue == null) {
        log.warn("NULL parsed as value in remove method for symbol " + getId() + " and field " + field + " -> call ignored");
        return;
    }
    IRI fieldIRI = new IRI(field);
    Collection<Object> removeValues = new ArrayList<Object>();
    ModelUtils.checkValues(valueFactory, parsedValue, removeValues);
    // We still need to implement support for specific types supported by this implementation
    for (Object current : removeValues) {
        if (current instanceof RDFTerm) {
            // native support for Clerezza types!
            graphNode.deleteProperty(fieldIRI, (RDFTerm) current);
        } else if (current instanceof RdfReference) {
            // treat RDF implementations specially to avoid creating new instances
            graphNode.deleteProperty(fieldIRI, ((RdfReference) current).getIRI());
        } else if (current instanceof Reference) {
            graphNode.deleteProperty(fieldIRI, new IRI(((Reference) current).getReference()));
        } else if (current instanceof RdfText) {
            // treat RDF implementations specially to avoid creating new instances
            graphNode.deleteProperty(fieldIRI, ((RdfText) current).getLiteral());
        } else if (current instanceof Text) {
            removeNaturalText(field, ((Text) current).getText(), ((Text) current).getLanguage());
        } else {
            // otherwise remove the value as a typed literal
            removeTypedLiteral(fieldIRI, current);
        }
    }
}
Also used : IRI(org.apache.clerezza.commons.rdf.IRI) Reference(org.apache.stanbol.entityhub.servicesapi.model.Reference) ArrayList(java.util.ArrayList) RDFTerm(org.apache.clerezza.commons.rdf.RDFTerm) Text(org.apache.stanbol.entityhub.servicesapi.model.Text)
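
The remove method checks the RDF-backed types (RdfReference, RdfText) before the generic interfaces (Reference, Text), so that values which already wrap a native node are reused instead of being converted. A minimal JDK-only sketch of why that ordering matters is shown below. The Reference and RdfReference types here are hypothetical stand-ins declared inside the sketch, not the Stanbol classes, and describe only reports which branch would handle a value.

```java
final class RemoveDispatchSketch {

    /** Hypothetical stand-in for the generic reference interface. */
    interface Reference {
        String getReference();
    }

    /** Hypothetical RDF-backed implementation that already holds a native node. */
    static final class RdfReference implements Reference {
        private final String iri;
        RdfReference(String iri) { this.iri = iri; }
        @Override public String getReference() { return iri; }
        String nativeNode() { return "<" + iri + ">"; }
    }

    /** Reports which branch of the instanceof chain handles the value. */
    static String describe(Object value) {
        if (value instanceof RdfReference) {
            // Most specific type first: reuse the node it already holds.
            return "reuse " + ((RdfReference) value).nativeNode();
        } else if (value instanceof Reference) {
            // Generic interface: a new node has to be built from the string.
            return "convert " + ((Reference) value).getReference();
        } else {
            // Anything else falls through to the typed-literal branch.
            return "typed literal " + value;
        }
    }

    public static void main(String[] args) {
        Reference generic = () -> "urn:b";
        System.out.println(describe(new RdfReference("urn:a")));
        System.out.println(describe(generic));
        System.out.println(describe(42));
    }
}
```

If the two reference checks were swapped, an RdfReference would match the generic branch first and the more specific branch would never run.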

Example 48 with IRI

use of org.apache.clerezza.commons.rdf.IRI in project stanbol by apache.

the class CeliNamedEntityExtractionEnhancementEngineTest method testInput.

private void testInput(String txt, String lang) throws EngineException, IOException {
    ContentItem ci = wrapAsContentItem(txt);
    try {
        // add a simple triple to statically define the language of the test content
        ci.getMetadata().add(new TripleImpl(ci.getUri(), DC_LANGUAGE, new PlainLiteralImpl(lang)));
        nerEngine.computeEnhancements(ci);
        TestUtils.logEnhancements(ci);
        HashMap<IRI, RDFTerm> expectedValues = new HashMap<IRI, RDFTerm>();
        expectedValues.put(Properties.ENHANCER_EXTRACTED_FROM, ci.getUri());
        expectedValues.put(Properties.DC_CREATOR, LiteralFactory.getInstance().createTypedLiteral(nerEngine.getClass().getName()));
        int textAnnoNum = validateAllTextAnnotations(ci.getMetadata(), txt, expectedValues);
        log.info(textAnnoNum + " TextAnnotations found ...");
        int entityAnnoNum = EnhancementStructureHelper.validateAllEntityAnnotations(ci.getMetadata(), expectedValues);
        log.info(entityAnnoNum + " EntityAnnotations found ...");
    } catch (EngineException e) {
        RemoteServiceHelper.checkServiceUnavailable(e);
    }
}
Also used : IRI(org.apache.clerezza.commons.rdf.IRI) PlainLiteralImpl(org.apache.clerezza.commons.rdf.impl.utils.PlainLiteralImpl) HashMap(java.util.HashMap) EngineException(org.apache.stanbol.enhancer.servicesapi.EngineException) RDFTerm(org.apache.clerezza.commons.rdf.RDFTerm) TripleImpl(org.apache.clerezza.commons.rdf.impl.utils.TripleImpl) ContentItem(org.apache.stanbol.enhancer.servicesapi.ContentItem)

Example 49 with IRI

use of org.apache.clerezza.commons.rdf.IRI in project stanbol by apache.

the class CeliSentimentAnalysisEngineTest method testInput.

private void testInput(String txt, String lang) throws EngineException, IOException {
    ContentItem ci = wrapAsContentItem(txt);
    try {
        // add a simple triple to statically define the language of the test content
        ci.getMetadata().add(new TripleImpl(ci.getUri(), DC_LANGUAGE, new PlainLiteralImpl(lang)));
        sentimentAnalysisEngine.computeEnhancements(ci);
        TestUtils.logEnhancements(ci);
        HashMap<IRI, RDFTerm> expectedValues = new HashMap<IRI, RDFTerm>();
        expectedValues.put(Properties.ENHANCER_EXTRACTED_FROM, ci.getUri());
        expectedValues.put(Properties.DC_CREATOR, LiteralFactory.getInstance().createTypedLiteral(sentimentAnalysisEngine.getClass().getName()));
        expectedValues.put(DC_TYPE, CeliConstants.SENTIMENT_EXPRESSION);
        int textAnnoNum = validateAllTextAnnotations(ci.getMetadata(), txt, expectedValues);
        log.info(textAnnoNum + " TextAnnotations found ...");
        assertTrue("2 sentiment expressions should be recognized in: " + txt, textAnnoNum == 2);
        int entityAnnoNum = EnhancementStructureHelper.validateAllEntityAnnotations(ci.getMetadata(), expectedValues);
        assertTrue("0 entity annotations should be recognized in: " + txt, entityAnnoNum == 0);
    } catch (EngineException e) {
        RemoteServiceHelper.checkServiceUnavailable(e);
    }
}
Also used : IRI(org.apache.clerezza.commons.rdf.IRI) PlainLiteralImpl(org.apache.clerezza.commons.rdf.impl.utils.PlainLiteralImpl) HashMap(java.util.HashMap) EngineException(org.apache.stanbol.enhancer.servicesapi.EngineException) RDFTerm(org.apache.clerezza.commons.rdf.RDFTerm) TripleImpl(org.apache.clerezza.commons.rdf.impl.utils.TripleImpl) ContentItem(org.apache.stanbol.enhancer.servicesapi.ContentItem)

Example 50 with IRI

use of org.apache.clerezza.commons.rdf.IRI in project stanbol by apache.

the class DBPSpotlightDisambiguateEnhancementEngine method createEnhancements.

/**
 * Adds the returned DBpedia Spotlight annotations to the content item's
 * metadata. For each DBpedia resource an EntityAnnotation is created and
 * linked to the corresponding TextAnnotation.
 *
 * @param occs
 *            a Collection of entity information
 * @param ci
 *            the content item
 * @param language
 *            the language used for the entity labels
 */
public void createEnhancements(Collection<Annotation> occs, ContentItem ci, Language language) {
    HashMap<RDFTerm, IRI> entityAnnotationMap = new HashMap<RDFTerm, IRI>();
    for (Annotation occ : occs) {
        if (textAnnotationsMap.get(occ.surfaceForm) != null) {
            IRI textAnnotation = textAnnotationsMap.get(occ.surfaceForm);
            Graph model = ci.getMetadata();
            IRI entityAnnotation = EnhancementEngineHelper.createEntityEnhancement(ci, this);
            entityAnnotationMap.put(occ.uri, entityAnnotation);
            Literal label = new PlainLiteralImpl(occ.surfaceForm.name, language);
            model.add(new TripleImpl(entityAnnotation, DC_RELATION, textAnnotation));
            model.add(new TripleImpl(entityAnnotation, ENHANCER_ENTITY_LABEL, label));
            Collection<String> typeNames = occ.getTypeNames();
            if (typeNames != null) {
                for (String typeName : typeNames) {
                    model.add(new TripleImpl(entityAnnotation, ENHANCER_ENTITY_TYPE, new IRI(typeName)));
                }
            }
            model.add(new TripleImpl(entityAnnotation, ENHANCER_ENTITY_REFERENCE, occ.uri));
        }
    }
}
Also used : IRI(org.apache.clerezza.commons.rdf.IRI) Graph(org.apache.clerezza.commons.rdf.Graph) HashMap(java.util.HashMap) PlainLiteralImpl(org.apache.clerezza.commons.rdf.impl.utils.PlainLiteralImpl) Literal(org.apache.clerezza.commons.rdf.Literal) RDFTerm(org.apache.clerezza.commons.rdf.RDFTerm) TripleImpl(org.apache.clerezza.commons.rdf.impl.utils.TripleImpl) Annotation(org.apache.stanbol.enhancer.engines.dbpspotlight.model.Annotation)

Aggregations

IRI (org.apache.clerezza.commons.rdf.IRI): 346
BlankNodeOrIRI (org.apache.clerezza.commons.rdf.BlankNodeOrIRI): 113
Graph (org.apache.clerezza.commons.rdf.Graph): 109
TripleImpl (org.apache.clerezza.commons.rdf.impl.utils.TripleImpl): 104
Triple (org.apache.clerezza.commons.rdf.Triple): 88
RDFTerm (org.apache.clerezza.commons.rdf.RDFTerm): 84
Test (org.junit.Test): 78
PlainLiteralImpl (org.apache.clerezza.commons.rdf.impl.utils.PlainLiteralImpl): 58
HashSet (java.util.HashSet): 50
ContentItem (org.apache.stanbol.enhancer.servicesapi.ContentItem): 46
EngineException (org.apache.stanbol.enhancer.servicesapi.EngineException): 39
HashMap (java.util.HashMap): 38
IOException (java.io.IOException): 37
ArrayList (java.util.ArrayList): 37
Blob (org.apache.stanbol.enhancer.servicesapi.Blob): 36
Literal (org.apache.clerezza.commons.rdf.Literal): 35
SimpleGraph (org.apache.clerezza.commons.rdf.impl.utils.simple.SimpleGraph): 31
IndexedGraph (org.apache.stanbol.commons.indexedgraph.IndexedGraph): 29
Recipe (org.apache.stanbol.rules.base.api.Recipe): 29
Language (org.apache.clerezza.commons.rdf.Language): 24