Example 71 with TripleImpl

use of org.apache.clerezza.commons.rdf.impl.utils.TripleImpl in project stanbol by apache.

the class CeliLanguageIdentifierEnhancementEngine method computeEnhancements.

@Override
public void computeEnhancements(ContentItem ci) throws EngineException {
    Entry<IRI, Blob> contentPart = ContentItemHelper.getBlob(ci, SUPPORTED_MIMTYPES);
    if (contentPart == null) {
        throw new IllegalStateException("No ContentPart with Mimetype '" + TEXT_PLAIN_MIMETYPE + "' found for ContentItem " + ci.getUri() + ": This is also checked in the canEnhance method! -> This " + "indicates a bug in the implementation of the " + "EnhancementJobManager!");
    }
    String text = "";
    try {
        text = ContentItemHelper.getText(contentPart.getValue());
    } catch (IOException e) {
        throw new InvalidContentException(this, ci, e);
    }
    if (text.trim().length() == 0) {
        log.info("No text contained in ContentPart {" + contentPart.getKey() + "} of ContentItem {" + ci.getUri() + "}");
        return;
    }
    try {
        String[] tmps = text.split(" ");
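        List<GuessedLanguage> lista = null;
        // texts with more than five space-separated tokens go to guessLanguage,
        // shorter ones to guessQueryLanguage
        if (tmps.length > 5) {
            lista = this.client.guessLanguage(text);
        } else {
            lista = this.client.guessQueryLanguage(text);
        }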
        Graph g = ci.getMetadata();
        //in ENHANCE_ASYNC we need to use read/write locks on the ContentItem
        ci.getLock().writeLock().lock();
        try {
            GuessedLanguage gl = lista.get(0);
            IRI textEnhancement = EnhancementEngineHelper.createTextEnhancement(ci, this);
            g.add(new TripleImpl(textEnhancement, DC_LANGUAGE, new PlainLiteralImpl(gl.getLang())));
            g.add(new TripleImpl(textEnhancement, ENHANCER_CONFIDENCE, literalFactory.createTypedLiteral(gl.getConfidence())));
            g.add(new TripleImpl(textEnhancement, DC_TYPE, DCTERMS_LINGUISTIC_SYSTEM));
        } finally {
            ci.getLock().writeLock().unlock();
        }
    } catch (IOException e) {
        throw new EngineException("Error while calling the CELI language" + " identifier service (configured URL: " + serviceURL + ")!", e);
    } catch (SOAPException e) {
        throw new EngineException("Error while encoding/decoding the request/" + "response to the CELI language identifier service!", e);
    }
}
Also used : IRI(org.apache.clerezza.commons.rdf.IRI) Blob(org.apache.stanbol.enhancer.servicesapi.Blob) PlainLiteralImpl(org.apache.clerezza.commons.rdf.impl.utils.PlainLiteralImpl) EngineException(org.apache.stanbol.enhancer.servicesapi.EngineException) IOException(java.io.IOException) InvalidContentException(org.apache.stanbol.enhancer.servicesapi.InvalidContentException) Graph(org.apache.clerezza.commons.rdf.Graph) SOAPException(javax.xml.soap.SOAPException) TripleImpl(org.apache.clerezza.commons.rdf.impl.utils.TripleImpl)
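
The method above picks one of two client calls by token count and then reads the first guess with lista.get(0), without checking whether the list is empty. The following is a minimal sketch of that dispatch with an added guard, not the Stanbol implementation. LanguageClient is a hypothetical stand-in for the CELI client, guesses are plain strings instead of GuessedLanguage objects, and the sketch splits on any whitespace where the original splits on a single space.

```java
import java.util.Collections;
import java.util.List;
import java.util.Optional;

public class LanguageGuessSketch {

    /** Hypothetical stand-in for the CELI client used in the example above. */
    interface LanguageClient {
        List<String> guessLanguage(String text);
        List<String> guessQueryLanguage(String text);
    }

    /** Picks the lookup by token count and returns the top guess, if there is one. */
    static Optional<String> bestGuess(LanguageClient client, String text) {
        // Assumption: split on whitespace runs (the original uses text.split(" "))
        String[] tokens = text.trim().split("\\s+");
        List<String> guesses = tokens.length > 5
                ? client.guessLanguage(text)
                : client.guessQueryLanguage(text);
        // Guard that the original omits: no guess means no enhancement
        if (guesses == null || guesses.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(guesses.get(0));
    }

    public static void main(String[] args) {
        LanguageClient stub = new LanguageClient() {
            public List<String> guessLanguage(String text) {
                return Collections.singletonList("en");
            }
            public List<String> guessQueryLanguage(String text) {
                return Collections.emptyList();
            }
        };
        System.out.println(bestGuess(stub, "one two three four five six")); // Optional[en]
        System.out.println(bestGuess(stub, "short query"));                 // Optional.empty
    }
}
```

With a guard like this, a caller could skip writing the TextAnnotation triples when the service returns no guess instead of failing with an IndexOutOfBoundsException.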

Example 72 with TripleImpl

use of org.apache.clerezza.commons.rdf.impl.utils.TripleImpl in project stanbol by apache.

the class DBPSpotlightDisambiguateEnhancementEngine method createEnhancements.

/**
 * Adds the returned DBpedia Spotlight annotations to the content item's
 * metadata. For each DBpedia resource an EntityAnnotation is created and
 * linked to the corresponding TextAnnotation.
 *
 * @param occs
 *            a Collection of entity information
 * @param ci
 *            the content item
 * @param language
 *            the language used for the entity labels
 */
public void createEnhancements(Collection<Annotation> occs, ContentItem ci, Language language) {
    HashMap<RDFTerm, IRI> entityAnnotationMap = new HashMap<RDFTerm, IRI>();
    for (Annotation occ : occs) {
        if (textAnnotationsMap.get(occ.surfaceForm) != null) {
            IRI textAnnotation = textAnnotationsMap.get(occ.surfaceForm);
            Graph model = ci.getMetadata();
            IRI entityAnnotation = EnhancementEngineHelper.createEntityEnhancement(ci, this);
            entityAnnotationMap.put(occ.uri, entityAnnotation);
            Literal label = new PlainLiteralImpl(occ.surfaceForm.name, language);
            model.add(new TripleImpl(entityAnnotation, DC_RELATION, textAnnotation));
            model.add(new TripleImpl(entityAnnotation, ENHANCER_ENTITY_LABEL, label));
            Collection<String> t = occ.getTypeNames();
            if (t != null) {
                Iterator<String> it = t.iterator();
                while (it.hasNext()) {
                    model.add(new TripleImpl(entityAnnotation, ENHANCER_ENTITY_TYPE, new IRI(it.next())));
                }
            }
            model.add(new TripleImpl(entityAnnotation, ENHANCER_ENTITY_REFERENCE, occ.uri));
        }
    }
}
Also used : IRI(org.apache.clerezza.commons.rdf.IRI) Graph(org.apache.clerezza.commons.rdf.Graph) HashMap(java.util.HashMap) PlainLiteralImpl(org.apache.clerezza.commons.rdf.impl.utils.PlainLiteralImpl) Literal(org.apache.clerezza.commons.rdf.Literal) RDFTerm(org.apache.clerezza.commons.rdf.RDFTerm) TripleImpl(org.apache.clerezza.commons.rdf.impl.utils.TripleImpl) Annotation(org.apache.stanbol.enhancer.engines.dbpspotlight.model.Annotation)

Example 73 with TripleImpl

use of org.apache.clerezza.commons.rdf.impl.utils.TripleImpl in project stanbol by apache.

the class LocationEnhancementEngine method writeEntityEnhancement.

/**
 * Writes an entity enhancement for the content item to the passed graph,
 * based on the passed toponym.
 *
 * @param contentItemId The id of the contentItem
 * @param graph The graph used to write the triples
 * @param literalFactory the literal factory used to create literals
 * @param toponym the toponym
 * @param relatedEnhancements related enhancements
 * @param requiresEnhancements required enhancements
 * @param score the score written as confidence. This is used to pass the
 * score of the Toponym if this method is used to add a parent Toponym.
 * If <code>null</code>, no confidence is written.
 *
 * @return The IRI of the created entity enhancement
 */
private IRI writeEntityEnhancement(IRI contentItemId, Graph graph, LiteralFactory literalFactory, Toponym toponym, Collection<BlankNodeOrIRI> relatedEnhancements, Collection<BlankNodeOrIRI> requiresEnhancements, Double score) {
    IRI entityRef = new IRI("http://sws.geonames.org/" + toponym.getGeoNameId() + '/');
    FeatureClass featureClass = toponym.getFeatureClass();
    log.debug("  > featureClass " + featureClass);
    IRI entityAnnotation = EnhancementEngineHelper.createEntityEnhancement(graph, this, contentItemId);
    // first relate this entity annotation to the text annotation(s)
    if (relatedEnhancements != null) {
        for (BlankNodeOrIRI related : relatedEnhancements) {
            graph.add(new TripleImpl(entityAnnotation, DC_RELATION, related));
        }
    }
    if (requiresEnhancements != null) {
        for (BlankNodeOrIRI requires : requiresEnhancements) {
            graph.add(new TripleImpl(entityAnnotation, DC_REQUIRES, requires));
            //STANBOL-767: also add dc:relation link
            graph.add(new TripleImpl(entityAnnotation, DC_RELATION, requires));
        }
    }
    graph.add(new TripleImpl(entityAnnotation, ENHANCER_ENTITY_REFERENCE, entityRef));
    log.debug("  > name " + toponym.getName());
    graph.add(new TripleImpl(entityAnnotation, ENHANCER_ENTITY_LABEL, new PlainLiteralImpl(toponym.getName())));
    if (score != null) {
        graph.add(new TripleImpl(entityAnnotation, ENHANCER_CONFIDENCE, literalFactory.createTypedLiteral(score)));
    }
    //now get all the entity types for the results
    Set<IRI> entityTypes = new HashSet<IRI>();
    //first based on the feature class
    Collection<IRI> featureClassTypes = FEATURE_CLASS_CONCEPT_MAPPINGS.get(featureClass);
    if (featureClassTypes != null) {
        entityTypes.addAll(featureClassTypes);
    }
    //second for the feature Code
    String featureCode = toponym.getFeatureCode();
    Collection<IRI> featureCodeTypes = FEATURE_TYPE_CONCEPT_MAPPINGS.get(featureCode);
    if (featureCodeTypes != null) {
        entityTypes.addAll(featureCodeTypes);
    }
    //third add the feature Code as additional type
    entityTypes.add(new IRI(NamespaceEnum.geonames + featureClass.name() + '.' + featureCode));
    //finally add the type triples to the enhancement
    for (IRI entityType : entityTypes) {
        graph.add(new TripleImpl(entityAnnotation, ENHANCER_ENTITY_TYPE, entityType));
    }
    return entityAnnotation;
}
Also used : IRI(org.apache.clerezza.commons.rdf.IRI) BlankNodeOrIRI(org.apache.clerezza.commons.rdf.BlankNodeOrIRI) PlainLiteralImpl(org.apache.clerezza.commons.rdf.impl.utils.PlainLiteralImpl) TripleImpl(org.apache.clerezza.commons.rdf.impl.utils.TripleImpl) HashSet(java.util.HashSet)
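
The type collection at the end of writeEntityEnhancement follows three steps: types mapped from the feature class, types mapped from the feature code, and one synthetic type built from both. A rough, self-contained sketch of just that step is shown below. It uses plain strings instead of IRI objects, and the mapping contents and the ex: namespace are made-up placeholders, not the values of FEATURE_CLASS_CONCEPT_MAPPINGS, FEATURE_TYPE_CONCEPT_MAPPINGS or NamespaceEnum.geonames.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class EntityTypeSketch {

    /** Merges the mapped types and appends the "class.code" type; duplicates collapse in the set. */
    static Set<String> collectTypes(Map<String, Collection<String>> byClass,
                                    Map<String, Collection<String>> byCode,
                                    String namespace, String featureClass, String featureCode) {
        Set<String> types = new LinkedHashSet<>();
        // first: types mapped from the feature class
        Collection<String> classTypes = byClass.get(featureClass);
        if (classTypes != null) {
            types.addAll(classTypes);
        }
        // second: types mapped from the feature code
        Collection<String> codeTypes = byCode.get(featureCode);
        if (codeTypes != null) {
            types.addAll(codeTypes);
        }
        // third: the feature class and code themselves as an additional type
        types.add(namespace + featureClass + '.' + featureCode);
        return types;
    }

    public static void main(String[] args) {
        // Placeholder mappings, for illustration only
        Map<String, Collection<String>> byClass = new HashMap<>();
        byClass.put("P", Arrays.asList("ex:Place"));
        Map<String, Collection<String>> byCode = new HashMap<>();
        byCode.put("PPL", Arrays.asList("ex:Place", "ex:PopulatedPlace"));

        System.out.println(collectTypes(byClass, byCode, "ex:geo#", "P", "PPL"));
        // [ex:Place, ex:PopulatedPlace, ex:geo#P.PPL]
    }
}
```

Collecting into a set first means a type that appears in both mappings produces only one ENHANCER_ENTITY_TYPE triple.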

Example 74 with TripleImpl

use of org.apache.clerezza.commons.rdf.impl.utils.TripleImpl in project stanbol by apache.

the class ClerezzaRDFUtils method makeConnected.

public static void makeConnected(Graph model, BlankNodeOrIRI root, IRI property) {
    Set<BlankNodeOrIRI> roots = findRoots(model);
    LOG.debug("Roots: {}", roots.size());
    // the given root itself needs no link
    roots.remove(root);
    //connect all hanging roots to root by property
    for (BlankNodeOrIRI n : roots) {
        model.add(new TripleImpl(root, property, n));
    }
}
Also used : BlankNodeOrIRI(org.apache.clerezza.commons.rdf.BlankNodeOrIRI) TripleImpl(org.apache.clerezza.commons.rdf.impl.utils.TripleImpl)
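
makeConnected relies on a findRoots helper that is not shown in this example. The sketch below illustrates the overall idea on a toy graph, under the assumption that a root is a subject that never occurs as the object of any triple; the real ClerezzaRDFUtils.findRoots may use a different definition. Triples are modelled as three-element String arrays rather than Clerezza types.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class MakeConnectedSketch {

    /** Assumed root definition: a subject that never occurs as an object. */
    static Set<String> findRoots(List<String[]> triples) {
        Set<String> roots = new LinkedHashSet<>();
        for (String[] t : triples) {
            roots.add(t[0]);      // every subject is a candidate
        }
        for (String[] t : triples) {
            roots.remove(t[2]);   // anything that is pointed at is not a root
        }
        return roots;
    }

    /** Links every other root to the given root via the given property. */
    static void makeConnected(List<String[]> triples, String root, String property) {
        Set<String> roots = findRoots(triples);
        roots.remove(root);       // the root itself needs no link
        for (String n : roots) {
            triples.add(new String[] {root, property, n});
        }
    }

    public static void main(String[] args) {
        List<String[]> g = new ArrayList<>();
        g.add(new String[] {"a", "p", "b"});
        g.add(new String[] {"c", "p", "d"});
        makeConnected(g, "a", "rel");
        System.out.println(findRoots(g)); // [a]
    }
}
```

After the call, "c" is reachable from "a" through the added (a, rel, c) triple, so "a" is the only remaining root.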

Example 75 with TripleImpl

use of org.apache.clerezza.commons.rdf.impl.utils.TripleImpl in project stanbol by apache.

the class EnhancerUtils method addActiveEngines.

/**
     * Create the RDF data for the currently active EnhancementEngines.<p>
     * Note that the passed rootUrl MUST already consider offsets configured
     * for the Stanbol RESTful service. When called from within a
     * {@link BaseStanbolResource} the following code segment should be used:<p>
     * <code><pre>
     *     String rootUrl = uriInfo.getBaseUriBuilder().path(getRootUrl()).build().toString();
     * </pre></code>
     * @param activeEngines the active enhancement engines as {@link Entry entries}.
     * @param graph the RDF graph to add the triples
     * @param rootUrl the root URL used by the current request
     * @see EnhancerUtils#buildEnginesMap(EnhancementEngineManager)
     */
public static void addActiveEngines(Iterable<Entry<ServiceReference, EnhancementEngine>> activeEngines, Graph graph, String rootUrl) {
    IRI enhancerResource = new IRI(rootUrl + "enhancer");
    graph.add(new TripleImpl(enhancerResource, RDF.type, Enhancer.ENHANCER));
    for (Entry<ServiceReference, EnhancementEngine> entry : activeEngines) {
        IRI engineResource = new IRI(rootUrl + "enhancer/engine/" + entry.getValue().getName());
        graph.add(new TripleImpl(enhancerResource, Enhancer.HAS_ENGINE, engineResource));
        graph.add(new TripleImpl(engineResource, RDF.type, ENHANCEMENT_ENGINE));
        graph.add(new TripleImpl(engineResource, RDFS.label, new PlainLiteralImpl(entry.getValue().getName())));
    }
}
Also used : IRI(org.apache.clerezza.commons.rdf.IRI) PlainLiteralImpl(org.apache.clerezza.commons.rdf.impl.utils.PlainLiteralImpl) TripleImpl(org.apache.clerezza.commons.rdf.impl.utils.TripleImpl) EnhancementEngine(org.apache.stanbol.enhancer.servicesapi.EnhancementEngine) ServiceReference(org.osgi.framework.ServiceReference)

Aggregations

TripleImpl (org.apache.clerezza.commons.rdf.impl.utils.TripleImpl) 143
IRI (org.apache.clerezza.commons.rdf.IRI) 104
PlainLiteralImpl (org.apache.clerezza.commons.rdf.impl.utils.PlainLiteralImpl) 69
Graph (org.apache.clerezza.commons.rdf.Graph) 66
BlankNodeOrIRI (org.apache.clerezza.commons.rdf.BlankNodeOrIRI) 49
Triple (org.apache.clerezza.commons.rdf.Triple) 41
RDFTerm (org.apache.clerezza.commons.rdf.RDFTerm) 26
EngineException (org.apache.stanbol.enhancer.servicesapi.EngineException) 23
HashMap (java.util.HashMap) 20
Language (org.apache.clerezza.commons.rdf.Language) 20
Literal (org.apache.clerezza.commons.rdf.Literal) 20
LiteralFactory (org.apache.clerezza.rdf.core.LiteralFactory) 20
IOException (java.io.IOException) 18
SimpleGraph (org.apache.clerezza.commons.rdf.impl.utils.simple.SimpleGraph) 17
Test (org.junit.Test) 16
ContentItem (org.apache.stanbol.enhancer.servicesapi.ContentItem) 15
IndexedGraph (org.apache.stanbol.commons.indexedgraph.IndexedGraph) 14
HashSet (java.util.HashSet) 13
StringSource (org.apache.stanbol.enhancer.servicesapi.impl.StringSource) 13
BlankNode (org.apache.clerezza.commons.rdf.BlankNode) 11