Example 1 with TextObject

Use of com.optimaize.langdetect.text.TextObject in project languagetool by languagetool-org.

Class LanguageIdentifier, method detectLanguageCode.

/**
   * @return language or {@code null} if language could not be identified
   */
@Nullable
private String detectLanguageCode(String text) {
    TextObject textObject = textObjectFactory.forText(text);
    Optional<LdLocale> lang = languageDetector.detect(textObject);
    //System.out.println(languageDetector.getProbabilities(textObject));
    if (lang.isPresent()) {
        return lang.get().getLanguage();
    } else {
        return null;
    }
}
Also used: LdLocale (com.optimaize.langdetect.i18n.LdLocale), TextObject (com.optimaize.langdetect.text.TextObject), Nullable (org.jetbrains.annotations.Nullable)
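
The if/else in the method above unwraps an Optional into a nullable return value. A minimal sketch of the same pattern follows, written against java.util.Optional so that it runs without the language-detector dependency. The detect method here is a hypothetical stand-in that only looks for one hard-coded word; it is not the library's API and not how real detection works. If the Optional in the original is Guava's, the equivalent one-liner would be orNull().

```java
import java.util.Locale;
import java.util.Optional;

class NullableUnwrapSketch {

    // Hypothetical stand-in for a detector: reports "de" only when the text
    // contains the word "und". This is an assumption for illustration only.
    static Optional<String> detect(String text) {
        if (text != null && text.toLowerCase(Locale.ROOT).contains(" und ")) {
            return Optional.of("de");
        }
        return Optional.empty();
    }

    // Same contract as detectLanguageCode: a language code, or null if the
    // language could not be identified.
    static String detectLanguageCode(String text) {
        return detect(text).orElse(null);
    }

    public static void main(String[] args) {
        System.out.println(detectLanguageCode("Hund und Katze"));
        System.out.println(detectLanguageCode("cat"));
    }
}
```

Collapsing the branch into a single call keeps the nullable contract in one place, which is easier to audit next to the @Nullable annotation.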

Example 2 with TextObject

Use of com.optimaize.langdetect.text.TextObject in project languagetool by languagetool-org.

Class LanguageDetectionTrainer, method main.

public static void main(String[] args) throws IOException {
    if (args.length != 3) {
        System.out.println("Usage: " + LanguageDetectionTrainer.class.getName() + " <languageCode> <plainTextFile> <minimalFrequency>");
        System.exit(1);
    }
    String langCode = args[0];
    String fileName = args[1];
    int minimalFrequency = Integer.parseInt(args[2]);
    String text;
    // Close the reader once the whole file has been read
    try (FileReader reader = new FileReader(fileName)) {
        text = IOUtils.toString(reader);
    }
    TextObjectFactory textObjectFactory = CommonTextObjectFactories.forIndexingCleanText();
    TextObject inputText = textObjectFactory.create().append(text);
    LanguageProfile languageProfile = new LanguageProfileBuilder(langCode)
        .ngramExtractor(NgramExtractors.standard())
        .minimalFrequency(minimalFrequency)
        .addText(inputText)
        .build();
    // current dir
    File outputDir = new File(System.getProperty("user.dir"));
    new LanguageProfileWriter().writeToDirectory(languageProfile, outputDir);
    System.out.println("Language profile written to " + new File(outputDir, langCode).getAbsolutePath());
}
Also used: TextObject (com.optimaize.langdetect.text.TextObject), LanguageProfile (com.optimaize.langdetect.profiles.LanguageProfile), LanguageProfileBuilder (com.optimaize.langdetect.profiles.LanguageProfileBuilder), LanguageProfileWriter (com.optimaize.langdetect.profiles.LanguageProfileWriter), FileReader (java.io.FileReader), File (java.io.File), TextObjectFactory (com.optimaize.langdetect.text.TextObjectFactory)
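
The trainer above reads its input through FileReader, which decodes with the platform default charset (UTF-8 by default only since Java 18). The sketch below shows one way to read the training text with an explicit encoding using only the JDK. The class name, the method name, and the choice of UTF-8 are assumptions made for illustration; they are not part of LanguageDetectionTrainer.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class TrainingTextReader {

    // Reads the whole file and decodes it as UTF-8 (assumed encoding of the
    // training text), independent of the platform default charset.
    static String readTrainingText(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("training", ".txt");
        try {
            // Unicode escapes keep this source file ASCII-only.
            Files.write(tmp, "Gr\u00fc\u00dfe".getBytes(StandardCharsets.UTF_8));
            System.out.println(readTrainingText(tmp));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

An explicit charset matters for profile training in particular, because mis-decoded bytes would change the n-grams that end up in the profile.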

Example 3 with TextObject

Use of com.optimaize.langdetect.text.TextObject in project KaellyBot by Kaysoro.

Class Translator, method getLanguageFrom.

public static Language getLanguageFrom(String source) {
    TextObject textObject = CommonTextObjectFactories.forDetectingOnLargeText().forText(source);
    Optional<LdLocale> lang = getLanguageDetector().detect(textObject);
    if (lang.isPresent()) {
        for (Language lg : Language.values()) {
            if (lang.get().getLanguage().equals(lg.getAbrev().toLowerCase())) {
                return lg;
            }
        }
    }
    return null;
}
Also used: LdLocale (com.optimaize.langdetect.i18n.LdLocale), TextObject (com.optimaize.langdetect.text.TextObject), Language (enums.Language), ChannelLanguage (data.ChannelLanguage)
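
The loop above maps a detected language code onto an enum constant by comparing lowercased abbreviations. Calling toLowerCase() without a Locale depends on the JVM's default locale (under a Turkish locale, "I" lowercases to a dotless "ı"), so a case-insensitive comparison may be the safer choice. The sketch below illustrates that lookup in isolation. The Language enum shown here, its constants, and its abbreviations are hypothetical stand-ins; the real enums.Language definition is not shown on this page.

```java
import java.util.Optional;

class LanguageLookupSketch {

    // Hypothetical stand-in for the project's Language enum; the constants
    // and abbreviations are assumptions made for this example.
    enum Language {
        FRENCH("FR"), ENGLISH("EN"), SPANISH("ES");

        private final String abrev;

        Language(String abrev) {
            this.abrev = abrev;
        }

        String getAbrev() {
            return abrev;
        }
    }

    // Returns the constant whose abbreviation matches the code, ignoring
    // case, or an empty Optional when no constant matches.
    static Optional<Language> fromCode(String code) {
        for (Language lg : Language.values()) {
            if (lg.getAbrev().equalsIgnoreCase(code)) {
                return Optional.of(lg);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(fromCode("fr").isPresent());
        System.out.println(fromCode("de").isPresent());
    }
}
```

Returning an Optional instead of null is a design choice of this sketch; the original method returns null when nothing matches.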

Example 4 with TextObject

Use of com.optimaize.langdetect.text.TextObject in project neo4j-nlp by graphaware.

Class LanguageManager, method detectLanguage.

public String detectLanguage(String text) {
    if (!initialized) {
        initialize();
    }
    if (text != null) {
        TextObject textObject = textObjectFactory.forText(text);
        Optional<LdLocale> lang = languageDetector.detect(textObject);
        if (lang.isPresent()) {
            return lang.get().getLanguage();
        }
    }
    return LANGUAGE_NA;
}
Also used: LdLocale (com.optimaize.langdetect.i18n.LdLocale), TextObject (com.optimaize.langdetect.text.TextObject)

Aggregations

TextObject (com.optimaize.langdetect.text.TextObject): 4
LdLocale (com.optimaize.langdetect.i18n.LdLocale): 3
LanguageProfile (com.optimaize.langdetect.profiles.LanguageProfile): 1
LanguageProfileBuilder (com.optimaize.langdetect.profiles.LanguageProfileBuilder): 1
LanguageProfileWriter (com.optimaize.langdetect.profiles.LanguageProfileWriter): 1
TextObjectFactory (com.optimaize.langdetect.text.TextObjectFactory): 1
ChannelLanguage (data.ChannelLanguage): 1
Language (enums.Language): 1
File (java.io.File): 1
FileReader (java.io.FileReader): 1
Nullable (org.jetbrains.annotations.Nullable): 1