
Example 1 with SentenceDetector

Use of opennlp.tools.sentdetect.SentenceDetector in the Apache Stanbol project.

Class OpenNlpSentenceDetectionEngine, method computeEnhancements.

/**
 * Compute enhancements for supplied ContentItem. The results of the process
 * are expected to be stored in the metadata of the content item.
 * <p/>
 * The client (usually an {@link org.apache.stanbol.enhancer.servicesapi.EnhancementJobManager}) should take care of
 * persistent storage of the enhanced {@link org.apache.stanbol.enhancer.servicesapi.ContentItem}.
 * <p/>
 * This method detects the sentences in the text of the content item and adds each one as a
 * {@link org.apache.stanbol.enhancer.nlp.model.Sentence} to its
 * {@link org.apache.stanbol.enhancer.nlp.model.AnalysedText}. The metadata is not changed.
 *
 * @throws org.apache.stanbol.enhancer.servicesapi.EngineException
 *          if the underlying process failed to work as
 *          expected
 */
@Override
public void computeEnhancements(ContentItem ci) throws EngineException {
    AnalysedText at = initAnalysedText(this, analysedTextFactory, ci);
    String language = getLanguage(this, ci, true);
    SentenceDetector sentenceDetector = getSentenceDetector(language);
    if (sentenceDetector != null) {
        for (opennlp.tools.util.Span sentSpan : sentenceDetector.sentPosDetect(at.getSpan())) {
            // Add each detected sentence to the AnalysedText.
            Sentence sentence = at.addSentence(sentSpan.getStart(), sentSpan.getEnd());
            log.trace(" > add {}", sentence);
        }
    } else {
        // Pass the language so the {} placeholder in the message is actually filled in.
        log.warn("SentenceDetector model for language {} is no longer available. "
                + "This might happen if the model becomes unavailable during enhancement. "
                + "If this happens more often it might also indicate a bug in the used "
                + "EnhancementJobManager implementation as the availability is also checked "
                + "in the canEnhance(..) method of this Enhancement Engine.", language);
    }
}
Also used : AnalysedText(org.apache.stanbol.enhancer.nlp.model.AnalysedText) NlpEngineHelper.initAnalysedText(org.apache.stanbol.enhancer.nlp.utils.NlpEngineHelper.initAnalysedText) SentenceDetector(opennlp.tools.sentdetect.SentenceDetector) Sentence(org.apache.stanbol.enhancer.nlp.model.Sentence)
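
The loop in this example relies on the detector returning one character-offset span per sentence, which is then passed to addSentence as a start and an end. The sketch below illustrates that span-based contract using only the JDK. It is not OpenNLP: java.text.BreakIterator stands in for a trained SentenceDetector, and the class name SentenceSpanSketch and the method detectSpans are hypothetical names chosen for illustration.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Stand-in for span-based sentence detection. Uses the JDK's rule-based
// BreakIterator, NOT an OpenNLP model; names here are hypothetical.
class SentenceSpanSketch {

    /** Returns [start, end) character offsets, one pair per detected sentence. */
    static List<int[]> detectSpans(String text, Locale locale) {
        BreakIterator it = BreakIterator.getSentenceInstance(locale);
        it.setText(text);
        List<int[]> spans = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            // BreakIterator keeps trailing whitespace inside a sentence;
            // trim it so each span covers only the sentence text.
            int s = start;
            int e = end;
            while (s < e && Character.isWhitespace(text.charAt(s))) {
                s++;
            }
            while (e > s && Character.isWhitespace(text.charAt(e - 1))) {
                e--;
            }
            if (s < e) {
                spans.add(new int[] { s, e });
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        String text = "Stanbol enhances content. OpenNLP detects sentences.";
        // Each span plays the role of sentSpan.getStart() / sentSpan.getEnd().
        for (int[] span : detectSpans(text, Locale.ENGLISH)) {
            System.out.println(span[0] + "-" + span[1] + ": " + text.substring(span[0], span[1]));
        }
    }
}
```

Working with offsets rather than extracted strings lets the caller attach each sentence to the original text without copying it, which is presumably why the engine passes start and end to addSentence.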

Example 2 with SentenceDetector

Use of opennlp.tools.sentdetect.SentenceDetector in the Apache Stanbol project.

Class OpenNlpPosTaggingEngine, method detectSentences.

private List<Section> detectSentences(AnalysedText at, String language) {
    SentenceDetector sentenceDetector = getSentenceDetector(language);
    List<Section> sentences;
    if (sentenceDetector != null) {
        sentences = new ArrayList<Section>();
        for (opennlp.tools.util.Span sentSpan : sentenceDetector.sentPosDetect(at.getSpan())) {
            Sentence sentence = at.addSentence(sentSpan.getStart(), sentSpan.getEnd());
            log.trace(" > add {}", sentence);
            sentences.add(sentence);
        }
    } else {
        sentences = null;
    }
    return sentences;
}
Also used : SentenceDetector(opennlp.tools.sentdetect.SentenceDetector) Section(org.apache.stanbol.enhancer.nlp.model.Section) Sentence(org.apache.stanbol.enhancer.nlp.model.Sentence)

Example 3 with SentenceDetector

Use of opennlp.tools.sentdetect.SentenceDetector in the Apache Stanbol project.

Class OpenNLPTest, method testLoadMissingSentence.

@Test
public void testLoadMissingSentence() throws IOException {
    SentenceModel model = openNLP.getSentenceModel("ru");
    Assert.assertNull(model);
    SentenceDetector sentDetector = openNLP.getSentenceDetector("ru");
    Assert.assertNull(sentDetector);
}
Also used : SentenceModel(opennlp.tools.sentdetect.SentenceModel) SentenceDetector(opennlp.tools.sentdetect.SentenceDetector) Test(org.junit.Test)

Example 4 with SentenceDetector

Use of opennlp.tools.sentdetect.SentenceDetector in the Apache Stanbol project.

Class OpenNLPTest, method testLoadEnSentence.

@Test
public void testLoadEnSentence() throws IOException {
    SentenceModel model = openNLP.getSentenceModel("en");
    Assert.assertNotNull(model);
    SentenceDetector sentDetector = openNLP.getSentenceDetector("en");
    Assert.assertNotNull(sentDetector);
}
Also used : SentenceModel(opennlp.tools.sentdetect.SentenceModel) SentenceDetector(opennlp.tools.sentdetect.SentenceDetector) Test(org.junit.Test)
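
The two tests pin down a simple contract: looking up a language with no model yields null, while a language with a model yields a non-null result. A minimal sketch of that contract follows, under stated assumptions. ModelLookupSketch, its in-memory map, and the file name "en-sent.bin" are hypothetical and do not reflect how Stanbol's OpenNLP service actually loads models.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical lookup that mirrors the null-on-missing contract the tests check.
// It does not load real OpenNLP models; values are illustrative resource names.
class ModelLookupSketch {

    // Assumed registry: language code -> model resource name.
    private final Map<String, String> models = new HashMap<>();

    ModelLookupSketch() {
        models.put("en", "en-sent.bin"); // illustrative entry only
    }

    /** Returns the model name, or null when no model is registered for the language. */
    String getSentenceModel(String language) {
        return models.get(language);
    }

    /** Optional-returning variant that makes the missing case explicit to callers. */
    Optional<String> findSentenceModel(String language) {
        return Optional.ofNullable(models.get(language));
    }

    public static void main(String[] args) {
        ModelLookupSketch lookup = new ModelLookupSketch();
        System.out.println("en present: " + (lookup.getSentenceModel("en") != null));
        System.out.println("ru present: " + lookup.findSentenceModel("ru").isPresent());
    }
}
```

Returning null keeps the lookup cheap but pushes a null check onto every caller, as the engines in the earlier examples do. The Optional variant is one alternative when that check is easy to forget.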

Aggregations

SentenceDetector (opennlp.tools.sentdetect.SentenceDetector): 4
SentenceModel (opennlp.tools.sentdetect.SentenceModel): 2
Sentence (org.apache.stanbol.enhancer.nlp.model.Sentence): 2
Test (org.junit.Test): 2
AnalysedText (org.apache.stanbol.enhancer.nlp.model.AnalysedText): 1
Section (org.apache.stanbol.enhancer.nlp.model.Section): 1
NlpEngineHelper.initAnalysedText (org.apache.stanbol.enhancer.nlp.utils.NlpEngineHelper.initAnalysedText): 1