
Example 1 with Representation

use of org.apache.stanbol.entityhub.servicesapi.model.Representation in project stanbol by apache.

the class TestSearcherImpl method addEntity.

public void addEntity(Representation rep) {
    entities.put(rep.getId(), rep);
    Iterator<Text> labels = rep.getText(nameField);
    while (labels.hasNext()) {
        Text label = labels.next();
        for (String token : tokenizer.tokenize(label.getText())) {
            Collection<Representation> values = data.get(token);
            if (values == null) {
                values = new ArrayList<Representation>();
                // store under the same key used for the lookup above
                data.put(token, values);
            }
            values.add(rep);
        }
    }
}
Also used : Text(org.apache.stanbol.entityhub.servicesapi.model.Text) Representation(org.apache.stanbol.entityhub.servicesapi.model.Representation)
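
The indexing pattern in addEntity can be sketched without any Stanbol types. This is a minimal illustration under stated assumptions, not the project's implementation: the class TokenIndexSketch, its addLabel and lookup methods, and the whitespace tokenizer are all hypothetical stand-ins. The point it shows is that the map is read and written under the same token key.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the test searcher; entity ids replace Representation objects.
public class TokenIndexSketch {
    // token -> ids of all entities that have a label containing this token
    private final Map<String, List<String>> data = new HashMap<>();

    /** Indexes one label under each of its whitespace-separated tokens. */
    public void addLabel(String entityId, String label) {
        for (String token : label.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // an empty label yields one empty token
            }
            // lookup and insert use the same key, so later labels extend the same list
            data.computeIfAbsent(token, k -> new ArrayList<>()).add(entityId);
        }
    }

    /** Returns the ids indexed under the token, or an empty list. */
    public List<String> lookup(String token) {
        return data.getOrDefault(token, Collections.<String>emptyList());
    }

    public static void main(String[] args) {
        TokenIndexSketch index = new TokenIndexSketch();
        index.addLabel("urn:test:paris", "Paris France");
        index.addLabel("urn:test:texas", "Paris Texas");
        System.out.println(index.lookup("Paris"));
        System.out.println(index.lookup("Texas"));
    }
}
```

Both labels share the token "Paris", so a lookup for it returns both ids, while "Texas" returns only the second one.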

Example 2 with Representation

use of org.apache.stanbol.entityhub.servicesapi.model.Representation in project stanbol by apache.

the class EntityLinker method processRedirects.

/**
 * Processes {@link EntitySearcher#getRedirectField() redirect field} values for
 * the passed suggestions based on the {@link RedirectProcessingMode}
 * as configured in the {@link #config}.<p>
 * The results of this method are stored within the passed {@link Suggestion}s.
 * @param suggestion The suggestion to process.
 */
private void processRedirects(Suggestion suggestion) {
    //if mode is IGNORE -> nothing to do
    if (config.getRedirectProcessingMode() == RedirectProcessingMode.IGNORE) {
        return;
    }
    //the suggestion keeps a small internal state that records whether
    //its redirects were already processed
    if (suggestion.isRedirectedProcessed()) {
        //redirects for this suggestion are already processed ... ignore
        return;
    }
    Representation result = suggestion.getResult();
    Iterator<Reference> redirects = result.getReferences(config.getRedirectField());
    switch(config.getRedirectProcessingMode()) {
        case ADD_VALUES:
            while (redirects.hasNext()) {
                Reference redirect = redirects.next();
                if (redirect != null) {
                    Representation redirectedEntity = entitySearcher.get(redirect.getReference(), config.getSelectedFields());
                    if (redirectedEntity != null) {
                        for (Iterator<String> fields = redirectedEntity.getFieldNames(); fields.hasNext(); ) {
                            String field = fields.next();
                            result.add(field, redirectedEntity.get(field));
                        }
                    }
                    //set that the redirects were searched for this result
                    suggestion.setRedirectProcessed(true);
                }
            }
            //the redirects iterator is exhausted here, so stop instead of falling through
            break;
        case FOLLOW:
            while (redirects.hasNext()) {
                Reference redirect = redirects.next();
                if (redirect != null) {
                    Representation redirectedEntity = entitySearcher.get(redirect.getReference(), config.getSelectedFields());
                    if (redirectedEntity != null) {
                        //copy the original result score
                        redirectedEntity.set(RdfResourceEnum.resultScore.getUri(), result.get(RdfResourceEnum.resultScore.getUri()));
                        //set the redirect
                        suggestion.setRedirect(redirectedEntity);
                    }
                }
            }
        //nothing to do
        default:
    }
}
Also used : Reference(org.apache.stanbol.entityhub.servicesapi.model.Reference) Representation(org.apache.stanbol.entityhub.servicesapi.model.Representation)
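
The dispatch on the redirect mode can be sketched in isolation. This is a simplified illustration, assuming string ids in place of Reference and Representation objects; the class RedirectModeSketch, its Mode enum, and the process method are hypothetical names. Each case consumes the shared iterator and ends with an explicit break, so no case relies on the iterator already being empty.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch; the "add:"/"follow:" strings stand in for the real side effects.
public class RedirectModeSketch {
    enum Mode { IGNORE, ADD_VALUES, FOLLOW }

    /** Records which action would be taken for each non-null redirect. */
    static List<String> process(Mode mode, List<String> redirects) {
        List<String> actions = new ArrayList<>();
        Iterator<String> it = redirects.iterator();
        switch (mode) {
            case ADD_VALUES:
                while (it.hasNext()) {
                    String redirect = it.next();
                    if (redirect != null) {
                        actions.add("add:" + redirect);
                    }
                }
                break;
            case FOLLOW:
                while (it.hasNext()) {
                    String redirect = it.next();
                    if (redirect != null) {
                        actions.add("follow:" + redirect);
                    }
                }
                break;
            default:
                // IGNORE: nothing to do
                break;
        }
        return actions;
    }

    public static void main(String[] args) {
        List<String> redirects = Arrays.asList("urn:a", null, "urn:b");
        System.out.println(process(Mode.ADD_VALUES, redirects));
        System.out.println(process(Mode.FOLLOW, redirects));
        System.out.println(process(Mode.IGNORE, redirects));
    }
}
```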

Example 3 with Representation

use of org.apache.stanbol.entityhub.servicesapi.model.Representation in project stanbol by apache.

the class ReferencedSiteSearcher method lookup.

@Override
public Collection<? extends Representation> lookup(String field, Set<String> includeFields, List<String> search, String... languages) throws IllegalStateException {
    //build the query and then return the result
    Site site = getSearchService();
    if (site == null) {
        throw new IllegalStateException("ReferencedSite " + siteId + " is currently not available");
    }
    FieldQuery query = EntitySearcherUtils.createFieldQuery(site.getQueryFactory(), field, includeFields, search, languages);
    if (limit != null) {
        query.setLimit(limit);
    }
    QueryResultList<Representation> results;
    try {
        results = site.find(query);
    } catch (SiteException e) {
        throw new IllegalStateException("Exception while searching for " + search + '@' + Arrays.toString(languages) + " in the ReferencedSite " + site.getId(), e);
    }
    return results.results();
}
Also used : Site(org.apache.stanbol.entityhub.servicesapi.site.Site) FieldQuery(org.apache.stanbol.entityhub.servicesapi.query.FieldQuery) Representation(org.apache.stanbol.entityhub.servicesapi.model.Representation) SiteException(org.apache.stanbol.entityhub.servicesapi.site.SiteException)

Example 4 with Representation

use of org.apache.stanbol.entityhub.servicesapi.model.Representation in project stanbol by apache.

the class LdpathSourceProcessor method process.

@SuppressWarnings({ "unchecked", "rawtypes" })
@Override
public Representation process(Representation source) {
    if (log.isTraceEnabled()) {
        log.trace(" - process {} (backend: {}, program: {}, append: {})", source.getId(), backend, program, appendMode);
    }
    Object context = backend.createURI(source.getId());
    Representation result = appendMode ? source : vf.createRepresentation(source.getId());
    /*
     * NOTE: LDPath will return Node instances of the RDFRepository if no
     * transformation is defined for a statement (line) in the configured
     * LDPath program (the ":: xsd:int" at the end). These Nodes need to be
     * converted to valid Entityhub Representation values.
     * As we cannot know the generic type used by the RDFRepository
     * implementation of the indexing source this is a little bit tricky.
     * What this does is:
     *   - for URIs it creates References
     *   - for plain literals it adds natural texts
     *   - for typed literals it uses the NodeTransformer registered with
     *     the LDPath (or more precisely the Configuration object passed to
     *     the LDPath in the constructor) to transform the values to
     *     Java objects. If no transformer is found or an Exception occurs
     *     then the lexical form is used and added as String to the
     *     Entityhub.
     */
    Map<String, Collection<Object>> resultMap = (Map<String, Collection<Object>>) program.execute(backend, context);
    for (Entry<String, Collection<Object>> entry : resultMap.entrySet()) {
        NodeTransformer fieldTransformer = program.getField(entry.getKey()).getTransformer();
        if (fieldTransformer == null || fieldTransformer instanceof IdentityTransformer<?>) {
            //we need to convert the RDFBackend Node to an Representation object
            for (Object value : entry.getValue()) {
                if (backend.isURI(value)) {
                    result.addReference(entry.getKey(), backend.stringValue(value));
                } else if (backend.isLiteral(value)) {
                    //literal
                    Locale locale = backend.getLiteralLanguage(value);
                    if (locale != null) {
                        //text with language
                        String lang = locale.getLanguage();
                        result.addNaturalText(entry.getKey(), backend.stringValue(value), lang.isEmpty() ? null : lang);
                    } else {
                        // no language
                        URI type = backend.getLiteralType(value);
                        if (type != null) {
                            //typed literal -> need to transform
                            NodeTransformer nt = transformer.get(type.toString());
                            if (nt != null) {
                                //add typed literal
                                try {
                                    result.add(entry.getKey(), nt.transform(backend, value, Collections.<String, String>emptyMap()));
                                } catch (RuntimeException e) {
                                    log.info("Unable to transform {} to dataType {} -> will use lexical form", value, type);
                                    result.add(entry.getKey(), backend.stringValue(value));
                                }
                            } else {
                                //no transformer
                                log.info("No transformer for type {} -> will use lexical form", type);
                                result.add(entry.getKey(), backend.stringValue(value));
                            }
                        } else {
                            //no language and no type -> plain literal without language
                            result.addNaturalText(entry.getKey(), backend.stringValue(value));
                        }
                    }
                } else {
                    //bNode
                    log.info("Ignore bNode {} (class: {})", value, value.getClass());
                }
            }
        //end for all values
        } else {
            //values are already transformed
            //just add all values
            result.add(entry.getKey(), entry.getValue());
        }
    }
    return result;
}
Also used : Locale(java.util.Locale) NodeTransformer(org.apache.marmotta.ldpath.api.transformers.NodeTransformer) Collection(java.util.Collection) Representation(org.apache.stanbol.entityhub.servicesapi.model.Representation) Map(java.util.Map) URI(java.net.URI)
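
The decision order described in the NOTE comment (URI, literal with language, typed literal, plain literal, blank node) can be sketched on its own. This is a minimal illustration under stated assumptions: the Node class below is a hypothetical stand-in with precomputed flags, not an LDPath or Stanbol type, and the Kind enum only names the branch that process would take.

```java
import java.util.Locale;

// Hypothetical sketch of the branch order only; no values are transformed here.
public class LiteralKindSketch {
    enum Kind { REFERENCE, TEXT_WITH_LANGUAGE, TYPED_VALUE, PLAIN_TEXT, IGNORED }

    /** Stand-in for a backend node; the real code queries the backend for these facts. */
    static final class Node {
        final boolean isUri;
        final boolean isLiteral;
        final Locale language;  // null if the literal has no language
        final String datatype;  // null if the literal has no datatype

        Node(boolean isUri, boolean isLiteral, Locale language, String datatype) {
            this.isUri = isUri;
            this.isLiteral = isLiteral;
            this.language = language;
            this.datatype = datatype;
        }
    }

    static Kind classify(Node node) {
        if (node.isUri) {
            return Kind.REFERENCE;           // added as a reference
        }
        if (!node.isLiteral) {
            return Kind.IGNORED;             // blank node
        }
        if (node.language != null) {
            return Kind.TEXT_WITH_LANGUAGE;  // natural text with a language
        }
        if (node.datatype != null) {
            return Kind.TYPED_VALUE;         // needs a transformer, else lexical form
        }
        return Kind.PLAIN_TEXT;              // natural text without a language
    }

    public static void main(String[] args) {
        System.out.println(classify(new Node(true, false, null, null)));
        System.out.println(classify(new Node(false, true, Locale.ENGLISH, null)));
        System.out.println(classify(new Node(false, true, null, "http://www.w3.org/2001/XMLSchema#int")));
        System.out.println(classify(new Node(false, true, null, null)));
        System.out.println(classify(new Node(false, false, null, null)));
    }
}
```

The language check comes before the datatype check, mirroring the original method, so a literal that carries a language is always treated as text.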

Example 5 with Representation

use of org.apache.stanbol.entityhub.servicesapi.model.Representation in project stanbol by apache.

the class EntityIdBasedIndexingDaemon method run.

@Override
public void run() {
    while (entityIdIterator.hasNext()) {
        Long start = Long.valueOf(System.currentTimeMillis());
        EntityScore entityScore = entityIdIterator.next();
        Float score;
        if (normaliser != null) {
            score = normaliser.normalise(entityScore.score);
        } else {
            score = entityScore.score;
        }
        //index if all entities are indexed anyway, no score is available, or score >= 0
        if (indexAllEntitiesState || score == null || score.compareTo(ScoreNormaliser.ZERO) >= 0) {
            Representation rep = dataProvider.getEntityData(entityScore.id);
            if (rep == null) {
                log.debug("unable to get Data for Entity {} (score=norm:{}|orig:{})", new Object[] { entityScore.id, score, entityScore.score });
            }
            produce(rep, score, start);
        }
    //else ignore this entity
    }
    setFinished();
}
Also used : EntityScore(org.apache.stanbol.entityhub.indexing.core.EntityIterator.EntityScore) Representation(org.apache.stanbol.entityhub.servicesapi.model.Representation)
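
The filter that decides whether an entity is indexed can be read as a small predicate. This is a sketch under an assumption: ScoreNormaliser.ZERO is taken to be a Float holding 0, which the listing above does not show. The class ScoreFilterSketch and the method shouldIndex are hypothetical names.

```java
// Hypothetical sketch of the indexing condition in run().
public class ScoreFilterSketch {
    // Assumption: stands in for ScoreNormaliser.ZERO
    static final Float ZERO = Float.valueOf(0f);

    /** True if the entity should be indexed. */
    static boolean shouldIndex(boolean indexAllEntities, Float score) {
        return indexAllEntities          // all entities are indexed anyway
                || score == null         // no score available
                || score.compareTo(ZERO) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(shouldIndex(false, 0.5f));
        System.out.println(shouldIndex(false, -1f));
        System.out.println(shouldIndex(false, null));
        System.out.println(shouldIndex(true, -1f));
    }
}
```

A negative score excludes the entity unless indexing of all entities is enabled; a missing score never excludes it.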

Aggregations

Representation (org.apache.stanbol.entityhub.servicesapi.model.Representation) 198
Test (org.junit.Test) 117
Text (org.apache.stanbol.entityhub.servicesapi.model.Text) 32
HashSet (java.util.HashSet) 31
Yard (org.apache.stanbol.entityhub.servicesapi.yard.Yard) 25
Entity (org.apache.stanbol.entityhub.servicesapi.model.Entity) 16
YardException (org.apache.stanbol.entityhub.servicesapi.yard.YardException) 15
ValueFactory (org.apache.stanbol.entityhub.servicesapi.model.ValueFactory) 14
Reference (org.apache.stanbol.entityhub.servicesapi.model.Reference) 12
FieldQuery (org.apache.stanbol.entityhub.servicesapi.query.FieldQuery) 12
ArrayList (java.util.ArrayList) 11
RdfRepresentation (org.apache.stanbol.entityhub.model.sesame.RdfRepresentation) 10
IOException (java.io.IOException) 9
IRI (org.apache.clerezza.commons.rdf.IRI) 9
ResponseBuilder (javax.ws.rs.core.Response.ResponseBuilder) 8
Graph (org.apache.clerezza.commons.rdf.Graph) 8
IndexedGraph (org.apache.stanbol.commons.indexedgraph.IndexedGraph) 8
RdfRepresentation (org.apache.stanbol.entityhub.model.clerezza.RdfRepresentation) 8
RdfValueFactory (org.apache.stanbol.entityhub.model.clerezza.RdfValueFactory) 8
EntityhubException (org.apache.stanbol.entityhub.servicesapi.EntityhubException) 8