Example 1 with LanguageProfileReader

Use of com.optimaize.langdetect.profiles.LanguageProfileReader in project languagetool by languagetool-org.

Class LanguageIdentifier, method loadProfiles.

private List<LanguageProfile> loadProfiles(List<String> langCodes) throws IOException {
    LanguageProfileReader profileReader = new LanguageProfileReader();
    List<LanguageProfile> profiles = profileReader.read(langCodes);
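    // externalLangCodes is not declared in this method; it is a member of the enclosing class (not shown here)
    for (String externalLangCode : externalLangCodes) {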
        String profilePath = "/" + externalLangCode + "/" + externalLangCode + ".profile";
        if (JLanguageTool.getDataBroker().resourceExists(profilePath)) {
            // not all languages are always available
            try (InputStream profile = JLanguageTool.getDataBroker().getFromResourceDirAsStream(profilePath)) {
                profiles.add(new LanguageProfileReader().read(profile));
            }
        }
    }
    return profiles;
}
Also used : LanguageProfile(com.optimaize.langdetect.profiles.LanguageProfile) InputStream(java.io.InputStream) LanguageProfileReader(com.optimaize.langdetect.profiles.LanguageProfileReader)
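
The loop in Example 1 builds each external profile path as "/" + code + "/" + code + ".profile" and only reads the profile when that resource exists. The following is a minimal sketch of the same check-then-read pattern using plain classpath lookups instead of LanguageTool's data broker. The class name ExternalProfilePaths and both method names are hypothetical and exist only for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

public class ExternalProfilePaths {

    // Mirrors the path layout used in Example 1: /<code>/<code>.profile
    static String profilePath(String langCode) {
        return "/" + langCode + "/" + langCode + ".profile";
    }

    // Hypothetical helper: returns only those profile paths that resolve to a
    // classpath resource, since not all languages are always available.
    static List<String> existingProfilePaths(List<String> langCodes) throws IOException {
        List<String> found = new ArrayList<>();
        for (String code : langCodes) {
            String path = profilePath(code);
            // getResourceAsStream returns null when the resource is missing;
            // try-with-resources skips close() for a null resource.
            try (InputStream in = ExternalProfilePaths.class.getResourceAsStream(path)) {
                if (in != null) {
                    found.add(path);
                }
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(profilePath("eo"));
    }
}
```

In the real method the stream would be handed to LanguageProfileReader.read(InputStream), as shown in Example 1, rather than just recorded.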

Example 2 with LanguageProfileReader

Use of com.optimaize.langdetect.profiles.LanguageProfileReader in project tika by apache.

Class OptimaizeLangDetector, method loadModels.

@Override
public LanguageDetector loadModels() throws IOException {
    List<LanguageProfile> languageProfiles = new LanguageProfileReader().readAllBuiltIn();
    // FUTURE when the "language-detector" project supports short profiles, check if
    // isShortText() returns true and switch to those.
    languages = new HashSet<>();
    for (LanguageProfile profile : languageProfiles) {
        languages.add(makeLanguageName(profile.getLocale()));
    }
    detector = createDetector(languageProfiles);
    return this;
}
Also used : LanguageProfile(com.optimaize.langdetect.profiles.LanguageProfile) LanguageProfileReader(com.optimaize.langdetect.profiles.LanguageProfileReader)

Example 3 with LanguageProfileReader

Use of com.optimaize.langdetect.profiles.LanguageProfileReader in project neo4j-nlp by graphaware.

Class LanguageManager, method initialize.

public void initialize() {
    if (initialized) {
        return;
    }
    LOG.info("Initializing Language Detector ...");
    try {
        List<LanguageProfile> languageProfiles = new LanguageProfileReader().readAllBuiltIn();
        // build language detector:
        languageDetector = LanguageDetectorBuilder.create(NgramExtractors.standard()).withProfiles(languageProfiles).build();
        // create a text object factory
        textObjectFactory = CommonTextObjectFactories.forDetectingOnLargeText();
        initialized = true;
    } catch (IOException ex) {
        initialized = false;
        LOG.error("Error while initializing Language Detector", ex);
    }
}
Also used : LanguageProfile(com.optimaize.langdetect.profiles.LanguageProfile) LanguageProfileReader(com.optimaize.langdetect.profiles.LanguageProfileReader) IOException(java.io.IOException)

Example 4 with LanguageProfileReader

Use of com.optimaize.langdetect.profiles.LanguageProfileReader in project tika by apache.

Class OptimaizeLangDetector, method loadModels.

@Override
public LanguageDetector loadModels(Set<String> languages) throws IOException {
    // Normalize languages.
    this.languages = new HashSet<>(languages.size());
    for (String language : languages) {
        this.languages.add(LanguageNames.normalizeName(language));
    }
    // TODO what happens if you request a language that has no profile?
    Set<LdLocale> locales = new HashSet<>();
    for (LdLocale locale : BuiltInLanguages.getLanguages()) {
        String languageName = makeLanguageName(locale);
        if (this.languages.contains(languageName)) {
            locales.add(locale);
        }
    }
    detector = createDetector(new LanguageProfileReader().readBuiltIn(locales));
    return this;
}
Also used : LdLocale(com.optimaize.langdetect.i18n.LdLocale) LanguageProfileReader(com.optimaize.langdetect.profiles.LanguageProfileReader) HashSet(java.util.HashSet)
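
The TODO in Example 4 asks what happens when a requested language has no profile. Reading the loop as written, such a name never matches any built-in locale, so it adds nothing to the locale set and is dropped without an error. The following is a minimal sketch of how the unmatched names could be collected and reported. It uses plain strings in place of LdLocale, and lower-casing stands in for LanguageNames.normalizeName; both substitutions, and the names MissingProfileCheck and findUnmatched, are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

public class MissingProfileCheck {

    // Hypothetical helper: returns the requested names (normalized) that do
    // not correspond to any available profile name.
    static Set<String> findUnmatched(Set<String> requested, Set<String> available) {
        Set<String> unmatched = new LinkedHashSet<>();
        for (String name : requested) {
            // Assumption: lower-casing stands in for the real normalization step.
            String normalized = name.toLowerCase(Locale.ROOT);
            if (!available.contains(normalized)) {
                unmatched.add(normalized);
            }
        }
        return unmatched;
    }

    public static void main(String[] args) {
        Set<String> available = new HashSet<>(Arrays.asList("en", "de", "fr"));
        Set<String> requested = new LinkedHashSet<>(Arrays.asList("en", "FR", "xx"));
        // prints [xx]
        System.out.println(findUnmatched(requested, available));
    }
}
```

A caller could log the returned set or throw an exception when it is non-empty, depending on whether a missing profile should be treated as fatal.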

Example 5 with LanguageProfileReader

Use of com.optimaize.langdetect.profiles.LanguageProfileReader in project KaellyBot by Kaysoro.

Class Translator, method getLanguageDetector.

private static LanguageDetector getLanguageDetector() {
    if (languageDetector == null) {
        try {
            List<String> languages = new ArrayList<>();
            for (Language lg : Language.values()) {
                languages.add(lg.getAbrev().toLowerCase());
            }
            List<LanguageProfile> languageProfiles = new LanguageProfileReader().read(languages);
            languageDetector = LanguageDetectorBuilder.create(NgramExtractors.standard()).withProfiles(languageProfiles).build();
        } catch (IOException e) {
            LoggerFactory.getLogger(Translator.class).error("Translator.getLanguageDetector", e);
        }
    }
    return languageDetector;
}
Also used : LanguageProfile(com.optimaize.langdetect.profiles.LanguageProfile) Language(enums.Language) ChannelLanguage(data.ChannelLanguage) LanguageProfileReader(com.optimaize.langdetect.profiles.LanguageProfileReader)

Aggregations

LanguageProfileReader (com.optimaize.langdetect.profiles.LanguageProfileReader) 5
LanguageProfile (com.optimaize.langdetect.profiles.LanguageProfile) 4
LdLocale (com.optimaize.langdetect.i18n.LdLocale) 1
ChannelLanguage (data.ChannelLanguage) 1
Language (enums.Language) 1
IOException (java.io.IOException) 1
InputStream (java.io.InputStream) 1
HashSet (java.util.HashSet) 1