Example 1 with LdLocale

Use of com.optimaize.langdetect.i18n.LdLocale in project languagetool by languagetool-org.

Class LanguageIdentifier, method detectLanguageCode.

/**
 * @return language or {@code null} if language could not be identified
 */
@Nullable
private String detectLanguageCode(String text) {
    TextObject textObject = textObjectFactory.forText(text);
    Optional<LdLocale> lang = languageDetector.detect(textObject);
    //System.out.println(languageDetector.getProbabilities(textObject));
    if (lang.isPresent()) {
        return lang.get().getLanguage();
    } else {
        return null;
    }
}
Also used: LdLocale (com.optimaize.langdetect.i18n.LdLocale), TextObject (com.optimaize.langdetect.text.TextObject), Nullable (org.jetbrains.annotations.Nullable)
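
The if/else around lang.isPresent() can also be collapsed into a single expression. The sketch below shows that pattern with java.util.Optional and a hypothetical FakeLocale stand-in, because the real detector and LdLocale are not reproduced here. If the library's detect method returns Guava's Optional instead (which appears to be the case for optimaize), the corresponding calls would be transform and orNull rather than map and orElse.

```java
import java.util.Optional;
import java.util.function.Function;

// Hypothetical stand-ins only; this is NOT the optimaize API.
final class NullableLanguageCode {

    /** Minimal stand-in for LdLocale: only the language part is modeled. */
    static final class FakeLocale {
        private final String language;

        FakeLocale(String language) {
            this.language = language;
        }

        String getLanguage() {
            return language;
        }
    }

    /** Returns the detected language code, or null when nothing was detected. */
    static String detectLanguageCode(String text,
                                     Function<String, Optional<FakeLocale>> detector) {
        return detector.apply(text)
                .map(FakeLocale::getLanguage)
                .orElse(null);
    }

    public static void main(String[] args) {
        // Toy detector (assumption): blank input yields no result, anything else is "en".
        Function<String, Optional<FakeLocale>> detector = t ->
                t.trim().isEmpty()
                        ? Optional.<FakeLocale>empty()
                        : Optional.of(new FakeLocale("en"));

        System.out.println(detectLanguageCode("Hello world", detector));
        System.out.println(detectLanguageCode("   ", detector));
    }
}
```

The behavior matches the original method: a present value is reduced to its language code, and an absent value becomes null.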

Example 2 with LdLocale

Use of com.optimaize.langdetect.i18n.LdLocale in project tika by apache.

Class OptimaizeLangDetector, method createDetector.

private com.optimaize.langdetect.LanguageDetector createDetector(List<LanguageProfile> languageProfiles) {
    // FUTURE currently the short text algorithm doesn't normalize probabilities until the end, which
    // means you can often get 0 probabilities. So we pick a very short length for this limit.
    LanguageDetectorBuilder builder = LanguageDetectorBuilder.create(NgramExtractors.standard())
            .shortTextAlgorithm(30)
            .withProfiles(languageProfiles);
    if (languageProbabilities != null) {
        Map<LdLocale, Double> languageWeights = new HashMap<>(languageProbabilities.size());
        for (String language : languageProbabilities.keySet()) {
            Double priority = (double) languageProbabilities.get(language);
            languageWeights.put(LdLocale.fromString(language), priority);
        }
        builder.languagePriorities(languageWeights);
    }
    return builder.build();
}
Also used: LdLocale (com.optimaize.langdetect.i18n.LdLocale), HashMap (java.util.HashMap), LanguageDetectorBuilder (com.optimaize.langdetect.LanguageDetectorBuilder)
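
The loop that builds languageWeights iterates over keySet() and then looks each value up again. A minimal sketch of the same conversion using entrySet() is shown below. The class name PriorityWeights, the method toWeights, and the parseKey parameter are hypothetical: parseKey stands in for LdLocale.fromString so that the example runs without the library, and the value type of languageProbabilities is assumed to be some Number.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper; not part of Tika or optimaize.
final class PriorityWeights {

    /**
     * Converts name -> priority entries into key -> weight entries.
     * parseKey is a stand-in for LdLocale.fromString (assumption).
     */
    static <K> Map<K, Double> toWeights(Map<String, ? extends Number> priorities,
                                        Function<String, K> parseKey) {
        Map<K, Double> weights = new HashMap<>(priorities.size());
        // entrySet() gives key and value together, avoiding a second lookup per key.
        for (Map.Entry<String, ? extends Number> entry : priorities.entrySet()) {
            weights.put(parseKey.apply(entry.getKey()), entry.getValue().doubleValue());
        }
        return weights;
    }

    public static void main(String[] args) {
        Map<String, Float> priorities = new HashMap<>();
        priorities.put("en", 0.75f);
        priorities.put("de", 0.25f);
        // String::toUpperCase is only a placeholder key parser for the demo.
        System.out.println(toWeights(priorities, String::toUpperCase));
    }
}
```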

Example 3 with LdLocale

Use of com.optimaize.langdetect.i18n.LdLocale in project tika by apache.

Class OptimaizeLangDetector, method loadModels.

@Override
public LanguageDetector loadModels(Set<String> languages) throws IOException {
    // Normalize languages.
    this.languages = new HashSet<>(languages.size());
    for (String language : languages) {
        this.languages.add(LanguageNames.normalizeName(language));
    }
    // TODO what happens if you request a language that has no profile?
    Set<LdLocale> locales = new HashSet<>();
    for (LdLocale locale : BuiltInLanguages.getLanguages()) {
        String languageName = makeLanguageName(locale);
        if (this.languages.contains(languageName)) {
            locales.add(locale);
        }
    }
    detector = createDetector(new LanguageProfileReader().readBuiltIn(locales));
    return this;
}
Also used: LdLocale (com.optimaize.langdetect.i18n.LdLocale), LanguageProfileReader (com.optimaize.langdetect.profiles.LanguageProfileReader), HashSet (java.util.HashSet)
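
The TODO in loadModels asks what happens when a requested language has no profile. As written, such a language is silently skipped, because only locales found in the built-in list are added. One possible way to make the gap visible is sketched below, under the assumption that both sets already hold names in the same normalized form. MissingProfiles and findMissing are hypothetical names, and plain strings stand in for the locales.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper; not part of Tika.
final class MissingProfiles {

    /** Returns the requested names that have no counterpart in the available set. */
    static Set<String> findMissing(Set<String> requested, Set<String> available) {
        // TreeSet keeps the result sorted, which makes log output stable.
        Set<String> missing = new TreeSet<>(requested);
        missing.removeAll(available);
        return missing;
    }

    public static void main(String[] args) {
        Set<String> requested = new HashSet<>(Arrays.asList("en", "de", "tlh"));
        Set<String> available = new HashSet<>(Arrays.asList("en", "de", "fr"));
        // A caller could log this set or throw if it is not empty.
        System.out.println(findMissing(requested, available));
    }
}
```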

Example 4 with LdLocale

Use of com.optimaize.langdetect.i18n.LdLocale in project KaellyBot by Kaysoro.

Class Translator, method getLanguageFrom.

public static Language getLanguageFrom(String source) {
    TextObject textObject = CommonTextObjectFactories.forDetectingOnLargeText().forText(source);
    Optional<LdLocale> lang = getLanguageDetector().detect(textObject);
    if (lang.isPresent()) {
        for (Language lg : Language.values()) {
            if (lang.get().getLanguage().equals(lg.getAbrev().toLowerCase())) {
                return lg;
            }
        }
    }
    return null;
}
Also used: LdLocale (com.optimaize.langdetect.i18n.LdLocale), TextObject (com.optimaize.langdetect.text.TextObject), Language (enums.Language), ChannelLanguage (data.ChannelLanguage)
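
The lookup above calls toLowerCase() without a Locale, so the result depends on the JVM's default locale (under a Turkish default, for instance, "I" lowercases to a dotless "ı"). A minimal sketch of a locale-independent variant follows. The Language enum here is hypothetical, since the project's real enum is not shown, and the method returns Optional instead of null as a design choice rather than a description of the original code.

```java
import java.util.Optional;

// Hypothetical sketch; the enum values and abbreviations are made up for illustration.
final class LanguageLookup {

    enum Language {
        FRENCH("FR"), ENGLISH("EN"), SPANISH("ES");

        private final String abrev;

        Language(String abrev) {
            this.abrev = abrev;
        }

        String getAbrev() {
            return abrev;
        }
    }

    /** Finds the enum constant whose abbreviation matches the code, ignoring case. */
    static Optional<Language> fromCode(String code) {
        if (code == null) {
            return Optional.empty();
        }
        for (Language lg : Language.values()) {
            // equalsIgnoreCase does not consult the default locale.
            if (lg.getAbrev().equalsIgnoreCase(code)) {
                return Optional.of(lg);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(fromCode("en"));
        System.out.println(fromCode("xx"));
    }
}
```

Note that equalsIgnoreCase is slightly more permissive than the original comparison, which only matched when the detected code was already lowercase.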

Example 5 with LdLocale

Use of com.optimaize.langdetect.i18n.LdLocale in project neo4j-nlp by graphaware.

Class LanguageManager, method detectLanguage.

public String detectLanguage(String text) {
    if (!initialized) {
        initialize();
    }
    if (text != null) {
        TextObject textObject = textObjectFactory.forText(text);
        Optional<LdLocale> lang = languageDetector.detect(textObject);
        if (lang.isPresent()) {
            return lang.get().getLanguage();
        }
    }
    return LANGUAGE_NA;
}
Also used: LdLocale (com.optimaize.langdetect.i18n.LdLocale), TextObject (com.optimaize.langdetect.text.TextObject)

Aggregations

LdLocale (com.optimaize.langdetect.i18n.LdLocale): 5
TextObject (com.optimaize.langdetect.text.TextObject): 3
LanguageDetectorBuilder (com.optimaize.langdetect.LanguageDetectorBuilder): 1
LanguageProfileReader (com.optimaize.langdetect.profiles.LanguageProfileReader): 1
ChannelLanguage (data.ChannelLanguage): 1
Language (enums.Language): 1
HashMap (java.util.HashMap): 1
HashSet (java.util.HashSet): 1
Nullable (org.jetbrains.annotations.Nullable): 1