Example 36 with TurkishMorphology

Use of zemberek.morphology.TurkishMorphology in project zemberek-nlp by ahmetaa.

Class BasicWordSpellingCheckAndSuggestion, method main.

public static void main(String[] args) throws IOException {
    // Build the analyzer with the bundled default lexicon; the spell checker works on top of it.
    TurkishMorphology morphology = TurkishMorphology.createWithDefaults();
    TurkishSpellChecker spellChecker = new TurkishSpellChecker(morphology);
    Log.info("Check if written correctly.");
    String[] words = { "Ankara'ya", "Ankar'aya", "yapbileceksen", "yapabileceğinizden" };
    for (String word : words) {
        // check() returns a boolean: true if the word is accepted as correctly spelled.
        Log.info(word + " -> " + spellChecker.check(word));
    }
    Log.info();
    Log.info("Give suggestions.");
    String[] toSuggest = { "Kraamanda", "okumuştk", "yapbileceksen", "oukyamıyorum" };
    for (String s : toSuggest) {
        // suggestForWord() returns a list of candidate corrections for the input.
        Log.info(s + " -> " + spellChecker.suggestForWord(s));
    }
}
Also used: TurkishSpellChecker (zemberek.normalization.TurkishSpellChecker), TurkishMorphology (zemberek.morphology.TurkishMorphology)

Example 37 with TurkishMorphology

Use of zemberek.morphology.TurkishMorphology in project zemberek-nlp by ahmetaa.

Class DistanceBasedStemmer, method load.

public static DistanceBasedStemmer load(Path vector, Path distances, Path vocabFile) throws IOException {
    Log.info("Loading vector file.");
    List<WordVector> wordVectors = WordVector.loadFromBinary(vector);
    Map<String, WordVector> map = new HashMap<>(wordVectors.size());
    for (WordVector wordVector : wordVectors) {
        map.put(wordVector.word, wordVector);
    }
    Log.info("Loading distances.");
    DistanceList experiment = DistanceList.readFromBinary(distances, vocabFile);
    TurkishMorphology morphology = TurkishMorphology.createWithDefaults();
    return new DistanceBasedStemmer(map, experiment, morphology);
}
Also used: HashMap (java.util.HashMap), TurkishMorphology (zemberek.morphology.TurkishMorphology)
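
The load method above indexes the loaded vectors by word with an explicit loop. A minimal sketch of the same indexing step using the standard-library stream collector is shown below. It is not part of zemberek-nlp: VectorEntry is a hypothetical stand-in for zemberek's WordVector, and WordIndexSketch and indexByWord are names made up for this illustration. The merge function keeps the later entry on duplicate words, which matches what repeated Map.put calls do.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical stand-in for zemberek's WordVector; only the word field matters here.
final class VectorEntry {
    final String word;
    final float[] values;

    VectorEntry(String word, float[] values) {
        this.word = word;
        this.values = values;
    }
}

final class WordIndexSketch {

    // Builds a word -> entry lookup table.
    // On duplicate words the later entry replaces the earlier one,
    // the same outcome as calling map.put(word, entry) in a loop.
    static Map<String, VectorEntry> indexByWord(List<VectorEntry> entries) {
        return entries.stream().collect(Collectors.toMap(
                e -> e.word,
                Function.identity(),
                (earlier, later) -> later,
                HashMap::new));
    }

    public static void main(String[] args) {
        List<VectorEntry> entries = Arrays.asList(
                new VectorEntry("ev", new float[] { 0.1f, 0.2f }),
                new VectorEntry("kitap", new float[] { 0.3f, 0.4f }));
        Map<String, VectorEntry> index = indexByWord(entries);
        System.out.println("Indexed words: " + index.keySet());
    }
}
```

The explicit loop in the original pre-sizes the HashMap from wordVectors.size(), which the collector version does not do; for large vector files the loop may therefore still be the better choice.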

Example 38 with TurkishMorphology

Use of zemberek.morphology.TurkishMorphology in project zemberek-nlp by ahmetaa.

Class StemmingAndLemmatization, method main.

public static void main(String[] args) {
    TurkishMorphology morphology = TurkishMorphology.createWithDefaults();
    String word = "kutucuğumuz";
    Log.info("Word = " + word);
    Log.info("Results: ");
    WordAnalysis results = morphology.analyze(word);
    for (SingleAnalysis result : results) {
        Log.info(result.formatLong());
        Log.info("\tStems = " + result.getStems());
        Log.info("\tLemmas = " + result.getLemmas());
    }
}
Also used: SingleAnalysis (zemberek.morphology.analysis.SingleAnalysis), WordAnalysis (zemberek.morphology.analysis.WordAnalysis), TurkishMorphology (zemberek.morphology.TurkishMorphology)

Example 39 with TurkishMorphology

Use of zemberek.morphology.TurkishMorphology in project zemberek-nlp by ahmetaa.

Class UseNer, method main.

public static void main(String[] args) throws IOException {
    // Assumes a trained NER model has already been generated in the my-model directory.
    Path modelRoot = Paths.get("my-model");
    TurkishMorphology morphology = TurkishMorphology.createWithDefaults();
    PerceptronNer ner = PerceptronNer.loadModel(modelRoot, morphology);
    String sentence = "Ali Kaan yarın İstanbul'a gidecek.";
    NerSentence result = ner.findNamedEntities(sentence);
    List<NamedEntity> namedEntities = result.getNamedEntities();
    for (NamedEntity namedEntity : namedEntities) {
        System.out.println(namedEntity);
    }
}
Also used: Path (java.nio.file.Path), NerSentence (zemberek.ner.NerSentence), NamedEntity (zemberek.ner.NamedEntity), PerceptronNer (zemberek.ner.PerceptronNer), TurkishMorphology (zemberek.morphology.TurkishMorphology)

Example 40 with TurkishMorphology

Use of zemberek.morphology.TurkishMorphology in project zemberek-nlp by ahmetaa.

Class AmbiguityResolutionTests, method issue157ShouldNotThrowNPE.

@Test
public void issue157ShouldNotThrowNPE() {
    String input = "Yıldız Kızlar Dünya Şampiyonası FIVB'nin düzenlediği ve 18 "
            + "yaşının altındaki voleybolcuların katılabildiği bir şampiyonadır.";
    TurkishMorphology morphology = TurkishMorphology.createWithDefaults();
    SentenceAnalysis analysis = morphology.analyzeAndDisambiguate(input);
    Assert.assertEquals(TurkishTokenizer.DEFAULT.tokenize(input).size(), analysis.size());
    for (SentenceWordAnalysis sentenceWordAnalysis : analysis) {
        String token = sentenceWordAnalysis.getWordAnalysis().getInput();
        SingleAnalysis an = sentenceWordAnalysis.getBestAnalysis();
        System.out.println(token + " = " + an.formatLong());
    }
}
Also used: SingleAnalysis (zemberek.morphology.analysis.SingleAnalysis), SentenceAnalysis (zemberek.morphology.analysis.SentenceAnalysis), TurkishMorphology (zemberek.morphology.TurkishMorphology), SentenceWordAnalysis (zemberek.morphology.analysis.SentenceWordAnalysis), Test (org.junit.Test)

Aggregations

TurkishMorphology (zemberek.morphology.TurkishMorphology) 87
Test (org.junit.Test) 38
Path (java.nio.file.Path) 34
ArrayList (java.util.ArrayList) 23
SingleAnalysis (zemberek.morphology.analysis.SingleAnalysis) 23
WordAnalysis (zemberek.morphology.analysis.WordAnalysis) 23
Ignore (org.junit.Ignore) 21
DictionaryItem (zemberek.morphology.lexicon.DictionaryItem) 15
LinkedHashSet (java.util.LinkedHashSet) 13
PrintWriter (java.io.PrintWriter) 10
SentenceAnalysis (zemberek.morphology.analysis.SentenceAnalysis) 10
Stopwatch (com.google.common.base.Stopwatch) 8
Histogram (zemberek.core.collections.Histogram) 8
Token (zemberek.tokenization.Token) 8
HashSet (java.util.HashSet) 7
SentenceWordAnalysis (zemberek.morphology.analysis.SentenceWordAnalysis) 7
TurkishTokenizer (zemberek.tokenization.TurkishTokenizer) 7
ScoredItem (zemberek.core.ScoredItem) 6
IOException (java.io.IOException) 5
BlockTextLoader (zemberek.core.text.BlockTextLoader) 5