Example 1 with NoConvertorException

use of org.apache.clerezza.rdf.core.NoConvertorException in project stanbol by apache.

the class Mapping method toResource.

/**
 * Converts the passed value to an RDF {@link RDFTerm} based on the
 * information of this mapping. Optionally validates that the passed
 * value is valid for the {@link Mapping#ontType ontology type} of this
 * mapping.
 * @param value the value
 * @param validate if <code>true</code>, the value is validated against the
 * ontology type
 * @return the {@link RDFTerm} or <code>null</code> if the passed value is
 * <code>null</code> or {@link String#isEmpty() empty}.
 */
protected RDFTerm toResource(String value, boolean validate) {
    // used for date validation
    Metadata dummy = null;
    if (value == null || value.isEmpty()) {
        // ignore null and empty values
        return null;
    }
    RDFTerm object;
    if (ontType == null) {
        object = new PlainLiteralImpl(value);
    } else if (ontType == RDFS.Resource) {
        try {
            if (validate) {
                new URI(value);
            }
            object = new IRI(value);
        } catch (URISyntaxException e) {
            log.warn("Unable to create Reference for value {} (not a valid URI)" + " -> create a literal instead", value);
            object = new PlainLiteralImpl(value);
        }
    } else {
        // typed literal
        Class<?> clazz = Mapping.ONT_TYPE_MAP.get(ontType);
        // null-safe comparison: ONT_TYPE_MAP may hold no class for ontType
        if (Date.class.equals(clazz)) {
            // parseDate(..) method
            if (dummy == null) {
                dummy = new Metadata();
            }
            // any Property with the Date type could be used here
            dummy.add(DATE.getName(), value);
            // access parseDate(..)
            Date date = dummy.getDate(DublinCore.DATE);
            if (date != null) {
                // now use the Clerezza Literal factory
                object = lf.createTypedLiteral(date);
            } else {
                // fall back to xsd:string
                object = new TypedLiteralImpl(value, XSD.string);
            }
        } else {
            object = new TypedLiteralImpl(value, ontType);
        }
        if (validate && clazz != null && !clazz.equals(Date.class)) {
            // dates were already validated by parsing them above
            try {
                lf.createObject(clazz, (Literal) object);
            } catch (NoConvertorException e) {
                log.info("Unable to validate typed literals of type {} because " + "there is no converter for Class {} registered with Clerezza", ontType, clazz);
            } catch (InvalidLiteralTypeException e) {
                log.info("The value '{}' is not valid for dataType {}! " + "Create literal with type 'xsd:string' instead", value, ontType);
                object = new TypedLiteralImpl(value, XSD.string);
            }
        }
        // else no validation needed
    }
    if (converter != null) {
        object = converter.convert(object);
    }
    return object;
}
Also used : IRI(org.apache.clerezza.commons.rdf.IRI) BlankNodeOrIRI(org.apache.clerezza.commons.rdf.BlankNodeOrIRI) PlainLiteralImpl(org.apache.clerezza.commons.rdf.impl.utils.PlainLiteralImpl) Literal(org.apache.clerezza.commons.rdf.Literal) NoConvertorException(org.apache.clerezza.rdf.core.NoConvertorException) Metadata(org.apache.tika.metadata.Metadata) InvalidLiteralTypeException(org.apache.clerezza.rdf.core.InvalidLiteralTypeException) RDFTerm(org.apache.clerezza.commons.rdf.RDFTerm) TypedLiteralImpl(org.apache.clerezza.commons.rdf.impl.utils.TypedLiteralImpl) URISyntaxException(java.net.URISyntaxException) URI(java.net.URI) Date(java.util.Date)
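
The URI branch of the method above follows a "validate, else fall back to a literal" pattern. A minimal sketch of that pattern using only java.net.URI is shown here; the Term class and the toTerm method are hypothetical stand-ins for illustration and are not Clerezza or Stanbol types.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class UriOrLiteralSketch {

    /** Hypothetical stand-in for an RDF term; not a Clerezza type. */
    static final class Term {
        public final String value;
        public final boolean isReference;

        Term(String value, boolean isReference) {
            this.value = value;
            this.isReference = isReference;
        }
    }

    /**
     * Returns a reference term if the value is accepted as a URI and a
     * plain literal term otherwise. Null and empty values are ignored.
     */
    public static Term toTerm(String value, boolean validate) {
        if (value == null || value.isEmpty()) {
            return null;
        }
        try {
            if (validate) {
                // the constructor throws URISyntaxException for invalid input
                new URI(value);
            }
            return new Term(value, true);
        } catch (URISyntaxException e) {
            // fall back to a literal instead of failing
            return new Term(value, false);
        }
    }
}
```

With validation switched off, every non-empty value is treated as a reference, which mirrors the behavior of the original branch.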

Example 2 with NoConvertorException

use of org.apache.clerezza.rdf.core.NoConvertorException in project stanbol by apache.

the class RdfRepresentation method removeTypedLiteral.

protected void removeTypedLiteral(IRI field, Object object) {
    Literal literal;
    try {
        literal = RdfResourceUtils.createLiteral(object);
    } catch (NoConvertorException e) {
        log.info("No Converter for value type " + object.getClass() + " (parsed for field " + field + ") use toString() Method to get String representation");
        literal = RdfResourceUtils.createLiteral(object.toString(), null);
    }
    graphNode.deleteProperty(field, literal);
}
Also used : Literal(org.apache.clerezza.commons.rdf.Literal) NoConvertorException(org.apache.clerezza.rdf.core.NoConvertorException)
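
The fallback in this example (use toString() when no converter is registered for the value's class) can be sketched without Clerezza. Everything in the block below is an assumption made for illustration: the converter registry, the NoConverterFoundException class, and the string encoding of literals are hypothetical and do not reproduce the LiteralFactory API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class LiteralFallbackSketch {

    /** Hypothetical unchecked exception standing in for NoConvertorException. */
    static final class NoConverterFoundException extends RuntimeException {
        NoConverterFoundException(Class<?> type) {
            super("No converter registered for " + type.getName());
        }
    }

    // hypothetical registry: Java class -> function producing a typed literal string
    private static final Map<Class<?>, Function<Object, String>> CONVERTERS = new HashMap<>();

    static {
        CONVERTERS.put(Integer.class, v -> v + "^^xsd:int");
        CONVERTERS.put(Boolean.class, v -> v + "^^xsd:boolean");
    }

    /** Throws NoConverterFoundException if the value's class is not registered. */
    static String createTypedLiteral(Object value) {
        Function<Object, String> converter = CONVERTERS.get(value.getClass());
        if (converter == null) {
            throw new NoConverterFoundException(value.getClass());
        }
        return converter.apply(value);
    }

    /** Tries the typed conversion first and falls back to toString(). */
    public static String toLiteral(Object value) {
        try {
            return createTypedLiteral(value);
        } catch (NoConverterFoundException e) {
            return value.toString();
        }
    }
}
```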

Example 3 with NoConvertorException

use of org.apache.clerezza.rdf.core.NoConvertorException in project stanbol by apache.

the class CeliSentimentAnalysisEngine method computeEnhancements.

@Override
public void computeEnhancements(ContentItem ci) throws EngineException {
    Entry<IRI, Blob> contentPart = ContentItemHelper.getBlob(ci, SUPPORTED_MIMTYPES);
    if (contentPart == null) {
        throw new IllegalStateException("No ContentPart with Mimetype '" + TEXT_PLAIN_MIMETYPE + "' found for ContentItem " + ci.getUri() + ": This is also checked in the canEnhance method! -> This " + "indicates a bug in the implementation of the " + "EnhancementJobManager!");
    }
    String text = "";
    try {
        text = ContentItemHelper.getText(contentPart.getValue());
    } catch (IOException e) {
        throw new InvalidContentException(this, ci, e);
    }
    if (text.trim().length() == 0) {
        log.info("No text contained in ContentPart {" + contentPart.getKey() + "} of ContentItem {" + ci.getUri() + "}");
        return;
    }
    String language = EnhancementEngineHelper.getLanguage(ci);
    if (language == null) {
        throw new IllegalStateException("Unable to extract Language for " + "ContentItem " + ci.getUri() + ": This is also checked in the canEnhance " + "method! -> This indicates a bug in the implementation of the " + "EnhancementJobManager!");
    }
    // used for the plain literals in TextAnnotations
    Language lang = new Language(language);
    try {
        List<SentimentExpression> lista = this.client.extractSentimentExpressions(text, language);
        LiteralFactory literalFactory = LiteralFactory.getInstance();
        Graph g = ci.getMetadata();
        for (SentimentExpression se : lista) {
            try {
                IRI textAnnotation = EnhancementEngineHelper.createTextEnhancement(ci, this);
                // add selected text as PlainLiteral in the language extracted from the text
                g.add(new TripleImpl(textAnnotation, ENHANCER_SELECTED_TEXT, new PlainLiteralImpl(se.getSnippetStr(), lang)));
                g.add(new TripleImpl(textAnnotation, DC_TYPE, CeliConstants.SENTIMENT_EXPRESSION));
                if (se.getStartSnippet() != null && se.getEndSnippet() != null) {
                    g.add(new TripleImpl(textAnnotation, ENHANCER_START, literalFactory.createTypedLiteral(se.getStartSnippet().intValue())));
                    g.add(new TripleImpl(textAnnotation, ENHANCER_END, literalFactory.createTypedLiteral(se.getEndSnippet().intValue())));
                    g.add(new TripleImpl(textAnnotation, ENHANCER_SELECTION_CONTEXT, new PlainLiteralImpl(getSelectionContext(text, se.getSnippetStr(), se.getStartSnippet()), lang)));
                    g.add(new TripleImpl(textAnnotation, CeliConstants.HAS_SENTIMENT_EXPRESSION_POLARITY, literalFactory.createTypedLiteral(se.getSentimentPolarityAsDoubleValue())));
                }
            } catch (NoConvertorException e) {
                log.error(e.getMessage(), e);
            }
        }
    } catch (IOException e) {
        throw new EngineException("Error while calling the CELI Sentiment Analysis service (configured URL: " + serviceURL + ")!", e);
    } catch (SOAPException e) {
        throw new EngineException("Error while encoding/decoding the request/response to the CELI Sentiment Analysis service!", e);
    }
}
Also used : IRI(org.apache.clerezza.commons.rdf.IRI) Blob(org.apache.stanbol.enhancer.servicesapi.Blob) PlainLiteralImpl(org.apache.clerezza.commons.rdf.impl.utils.PlainLiteralImpl) EngineException(org.apache.stanbol.enhancer.servicesapi.EngineException) IOException(java.io.IOException) LiteralFactory(org.apache.clerezza.rdf.core.LiteralFactory) InvalidContentException(org.apache.stanbol.enhancer.servicesapi.InvalidContentException) Graph(org.apache.clerezza.commons.rdf.Graph) Language(org.apache.clerezza.commons.rdf.Language) NoConvertorException(org.apache.clerezza.rdf.core.NoConvertorException) SOAPException(javax.xml.soap.SOAPException) TripleImpl(org.apache.clerezza.commons.rdf.impl.utils.TripleImpl)
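
The method above places the try/catch inside the loop, so a conversion failure for one item is logged and the remaining items are still processed. A minimal sketch of that structure follows; it uses Integer.parseInt and NumberFormatException purely as stand-ins for the literal conversion and NoConvertorException, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class PerItemCatchSketch {

    /** Converts every input that can be converted and skips the rest. */
    public static List<Integer> parseAll(List<String> inputs) {
        List<Integer> converted = new ArrayList<>();
        for (String input : inputs) {
            try {
                converted.add(Integer.parseInt(input));
            } catch (NumberFormatException e) {
                // report the failing item and continue with the next one
                System.err.println("Skipping '" + input + "': " + e.getMessage());
            }
        }
        return converted;
    }
}
```

Catching outside the loop instead would abort processing at the first failing item.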

Example 4 with NoConvertorException

use of org.apache.clerezza.rdf.core.NoConvertorException in project stanbol by apache.

the class CeliNamedEntityExtractionEnhancementEngine method computeEnhancements.

@Override
public void computeEnhancements(ContentItem ci) throws EngineException {
    Entry<IRI, Blob> contentPart = ContentItemHelper.getBlob(ci, SUPPORTED_MIMTYPES);
    if (contentPart == null) {
        throw new IllegalStateException("No ContentPart with Mimetype '" + TEXT_PLAIN_MIMETYPE + "' found for ContentItem " + ci.getUri() + ": This is also checked in the canEnhance method! -> This " + "indicates a bug in the implementation of the " + "EnhancementJobManager!");
    }
    String text = "";
    try {
        text = ContentItemHelper.getText(contentPart.getValue());
    } catch (IOException e) {
        throw new InvalidContentException(this, ci, e);
    }
    if (text.trim().length() == 0) {
        log.info("No text contained in ContentPart {" + contentPart.getKey() + "} of ContentItem {" + ci.getUri() + "}");
        return;
    }
    String language = EnhancementEngineHelper.getLanguage(ci);
    if (language == null) {
        throw new IllegalStateException("Unable to extract Language for " + "ContentItem " + ci.getUri() + ": This is also checked in the canEnhance " + "method! -> This indicates a bug in the implementation of the " + "EnhancementJobManager!");
    }
    // used for the plain literals in TextAnnotations
    Language lang = new Language(language);
    try {
        List<NamedEntity> lista = this.client.extractEntities(text, language);
        LiteralFactory literalFactory = LiteralFactory.getInstance();
        Graph g = ci.getMetadata();
        for (NamedEntity ne : lista) {
            try {
                IRI textAnnotation = EnhancementEngineHelper.createTextEnhancement(ci, this);
                // add selected text as PlainLiteral in the language extracted from the text
                g.add(new TripleImpl(textAnnotation, ENHANCER_SELECTED_TEXT, new PlainLiteralImpl(ne.getFormKind(), lang)));
                g.add(new TripleImpl(textAnnotation, DC_TYPE, getEntityRefForType(ne.type)));
                if (ne.getFrom() != null && ne.getTo() != null) {
                    g.add(new TripleImpl(textAnnotation, ENHANCER_START, literalFactory.createTypedLiteral(ne.getFrom().intValue())));
                    g.add(new TripleImpl(textAnnotation, ENHANCER_END, literalFactory.createTypedLiteral(ne.getTo().intValue())));
                    g.add(new TripleImpl(textAnnotation, ENHANCER_SELECTION_CONTEXT, new PlainLiteralImpl(getSelectionContext(text, ne.getFormKind(), ne.getFrom().intValue()), lang)));
                }
            } catch (NoConvertorException e) {
                log.error(e.getMessage(), e);
            }
        }
    } catch (IOException e) {
        throw new EngineException("Error while calling the CELI NER (Named Entity Recognition)" + " service (configured URL: " + serviceURL + ")!", e);
    } catch (SOAPException e) {
        throw new EngineException("Error while encoding/decoding the request/" + "response to the CELI NER (Named Entity Recognition) service!", e);
    }
}
Also used : IRI(org.apache.clerezza.commons.rdf.IRI) Blob(org.apache.stanbol.enhancer.servicesapi.Blob) PlainLiteralImpl(org.apache.clerezza.commons.rdf.impl.utils.PlainLiteralImpl) EngineException(org.apache.stanbol.enhancer.servicesapi.EngineException) IOException(java.io.IOException) LiteralFactory(org.apache.clerezza.rdf.core.LiteralFactory) InvalidContentException(org.apache.stanbol.enhancer.servicesapi.InvalidContentException) Graph(org.apache.clerezza.commons.rdf.Graph) Language(org.apache.clerezza.commons.rdf.Language) NoConvertorException(org.apache.clerezza.rdf.core.NoConvertorException) SOAPException(javax.xml.soap.SOAPException) TripleImpl(org.apache.clerezza.commons.rdf.impl.utils.TripleImpl)

Example 5 with NoConvertorException

use of org.apache.clerezza.rdf.core.NoConvertorException in project stanbol by apache.

the class ExecutionPlanHelper method writeEnhancementProperty.

/**
 * Writes the enhancement property value(s) for the passed node and property
 * to the execution plan graph.
 * @param ep the RDF graph holding the execution plan
 * @param epNode the execution node
 * @param property the property
 * @param value the value(s). {@link Collection} and <code>Object[]</code> are
 * supported for multiple values.
 * @throws NullPointerException if any of the passed parameters is <code>null</code>
 */
@SuppressWarnings("unchecked")
private static void writeEnhancementProperty(Graph ep, BlankNodeOrIRI epNode, IRI property, Object value) {
    Collection<Object> values;
    if (value instanceof Collection<?>) {
        values = (Collection<Object>) value;
    } else if (value instanceof Object[]) {
        values = Arrays.asList((Object[]) value);
    } else {
        values = Collections.singleton(value);
    }
    for (Object v : values) {
        if (v != null) {
            Literal literal;
            if (v instanceof String) {
                literal = new PlainLiteralImpl((String) v);
            } else {
                try {
                    literal = lf.createTypedLiteral(v);
                } catch (NoConvertorException e) {
                    log.warn("Use toString() value '{}' for EnhancementProperty " + "'{}' as no TypedLiteral converter is registered for " + "class {}", new Object[] { v, property, v.getClass().getName() });
                    literal = new PlainLiteralImpl(v.toString());
                }
            }
            ep.add(new TripleImpl(epNode, property, literal));
        }
    }
}
Also used : PlainLiteralImpl(org.apache.clerezza.commons.rdf.impl.utils.PlainLiteralImpl) Literal(org.apache.clerezza.commons.rdf.Literal) NoConvertorException(org.apache.clerezza.rdf.core.NoConvertorException) Collection(java.util.Collection) EnhancementEngineHelper.getString(org.apache.stanbol.enhancer.servicesapi.helper.EnhancementEngineHelper.getString) TripleImpl(org.apache.clerezza.commons.rdf.impl.utils.TripleImpl)
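
The first step of the method above normalizes a single value, an Object[] or a Collection into one Collection that can be iterated uniformly. The sketch below isolates that step using only java.util; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;

final class ValueNormalizerSketch {

    /** Wraps the value so that callers can always iterate over a Collection. */
    @SuppressWarnings("unchecked")
    public static Collection<Object> asCollection(Object value) {
        if (value instanceof Collection<?>) {
            return (Collection<Object>) value;
        }
        if (value instanceof Object[]) {
            return Arrays.asList((Object[]) value);
        }
        // single values, including primitive arrays such as int[], end up here
        return Collections.singleton(value);
    }
}
```

Note that a primitive array such as int[] is not an Object[], so it is treated as one single value rather than as multiple values.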

Aggregations

NoConvertorException (org.apache.clerezza.rdf.core.NoConvertorException) 6
Literal (org.apache.clerezza.commons.rdf.Literal) 4
PlainLiteralImpl (org.apache.clerezza.commons.rdf.impl.utils.PlainLiteralImpl) 4
IRI (org.apache.clerezza.commons.rdf.IRI) 3
TripleImpl (org.apache.clerezza.commons.rdf.impl.utils.TripleImpl) 3
IOException (java.io.IOException) 2
SOAPException (javax.xml.soap.SOAPException) 2
Graph (org.apache.clerezza.commons.rdf.Graph) 2
Language (org.apache.clerezza.commons.rdf.Language) 2
LiteralFactory (org.apache.clerezza.rdf.core.LiteralFactory) 2
Blob (org.apache.stanbol.enhancer.servicesapi.Blob) 2
EngineException (org.apache.stanbol.enhancer.servicesapi.EngineException) 2
InvalidContentException (org.apache.stanbol.enhancer.servicesapi.InvalidContentException) 2
URI (java.net.URI) 1
URISyntaxException (java.net.URISyntaxException) 1
Collection (java.util.Collection) 1
Date (java.util.Date) 1
BlankNodeOrIRI (org.apache.clerezza.commons.rdf.BlankNodeOrIRI) 1
RDFTerm (org.apache.clerezza.commons.rdf.RDFTerm) 1
TypedLiteralImpl (org.apache.clerezza.commons.rdf.impl.utils.TypedLiteralImpl) 1