use of org.apache.stanbol.enhancer.nlp.model.AnalysedText in project stanbol by apache.
the class NlpEngineHelper method initAnalysedText.
/**
* Retrieves, or if not present creates, the {@link AnalysedText} content
* part for the passed {@link ContentItem}. If no {@link Blob} with the
* mime type '<code>text/plain</code>' is present, this method
* throws an {@link IllegalStateException} (it internally calls
* {@link #getPlainText(EnhancementEngine, ContentItem, boolean)} with
* <code>true</code> as the third parameter). Callers should therefore call
* that method with <code>false</code> as the third parameter in their
* {@link EnhancementEngine#canEnhance(ContentItem)} implementation.<p>
* <i>NOTE:</i> This method is intended for Engines that want to create an
* empty {@link AnalysedText} content part. Engines that assume that this
* content part is already present (e.g. if they consume already existing
* annotations) should use the
* {@link #getAnalysedText(EnhancementEngine, ContentItem, boolean)}
* method instead.
* @param engine the EnhancementEngine calling this method (used for logging)
* @param analysedTextFactory the {@link AnalysedTextFactory} used to create
* the {@link AnalysedText} instance (if not present).
* @param ci the {@link ContentItem}
* @return the AnalysedText
* @throws EngineException on any exception while accessing the
* '<code>text/plain</code>' Blob
* @throws IllegalStateException if no '<code>text/plain</code>' Blob is
* present as content part of the passed {@link ContentItem} or the passed
* {@link AnalysedTextFactory} is <code>null</code>. <i>NOTE</i> that an
* {@link IllegalStateException} is only thrown if the {@link AnalysedText}
* ContentPart is not yet present in the passed {@link ContentItem}
*/
public static AnalysedText initAnalysedText(EnhancementEngine engine, AnalysedTextFactory analysedTextFactory, ContentItem ci) throws EngineException {
    AnalysedText at = AnalysedTextUtils.getAnalysedText(ci);
    if (at == null) {
        if (analysedTextFactory == null) {
            throw new IllegalStateException("Unable to initialise AnalysedText ContentPart because the passed AnalysedTextFactory is NULL");
        }
        Entry<IRI, Blob> textBlob = getPlainText(engine, ci, true);
        // the content part has to be created, which requires the write lock
        ci.getLock().writeLock().lock();
        try {
            // try again to retrieve (a concurrent thread may have created
            // the content part in the meantime)
            at = AnalysedTextUtils.getAnalysedText(ci);
            if (at == null) {
                log.debug(" ... create new AnalysedText instance for Engine {}", engine.getName());
                at = analysedTextFactory.createAnalysedText(ci, textBlob.getValue());
            }
        } catch (IOException e) {
            throw new EngineException("Unable to create AnalysedText instance for Blob " + textBlob.getKey() + " of ContentItem " + ci.getUri() + "!", e);
        } finally {
            ci.getLock().writeLock().unlock();
        }
    } else {
        log.debug(" ... use existing AnalysedText instance for Engine {}", engine.getName());
    }
    return at;
}
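The check, lock, re-check sequence used by initAnalysedText can be illustrated in isolation. The following is a minimal sketch that uses only the JDK. LazyPartHolder, getOrCreate and the Supplier-based factory are hypothetical names chosen for this illustration and are not part of Stanbol. The holder stands in for the ContentItem and its lock, and the supplier stands in for the AnalysedTextFactory.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

// Hypothetical stand-in for a ContentItem that lazily holds one content part.
final class LazyPartHolder<T> {

    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private T part; // guarded by lock

    // Reads the current part under the read lock (may return null).
    private T read() {
        lock.readLock().lock();
        try {
            return part;
        } finally {
            lock.readLock().unlock();
        }
    }

    // Returns the existing part, or creates it exactly once via the factory.
    T getOrCreate(Supplier<T> factory) {
        T current = read(); // first check, without the write lock
        if (current == null) {
            lock.writeLock().lock();
            try {
                // second check: another thread may have created the part
                // between the first check and acquiring the write lock
                current = part;
                if (current == null) {
                    current = factory.get();
                    part = current;
                }
            } finally {
                lock.writeLock().unlock();
            }
        }
        return current;
    }

    public static void main(String[] args) {
        LazyPartHolder<String> holder = new LazyPartHolder<>();
        String first = holder.getOrCreate(() -> "analysed text");
        String second = holder.getOrCreate(() -> "never used");
        System.out.println(first == second);
    }
}
```

The read lock is released before the write lock is requested, because ReentrantReadWriteLock does not support upgrading a read lock to a write lock. The second check inside the write lock is what ensures the factory runs only once when several threads race on the same holder.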
use of org.apache.stanbol.enhancer.nlp.model.AnalysedText in project stanbol by apache.
the class CorefFeatureSupportTest method testSerializationAndParse.
@Test
public void testSerializationAndParse() throws IOException {
    String serialized = getSerializedString();
    Assert.assertTrue(serialized.contains(jsonCorefCheckObama));
    Assert.assertTrue(serialized.contains(jsonCorefCheckHe));
    AnalysedText parsedAt = getParsedAnalysedText(serialized);
    assertAnalysedTextEquality(parsedAt);
}