Example 51 with TurkishMorphology

Use of zemberek.morphology.TurkishMorphology in project zemberek-nlp by ahmetaa.

Class Scripts, method foobar.

static void foobar() throws IOException {
    Path path = Paths.get("/home/aaa/projects/zemberek-nlp/morphology/src/main/resources/tr/person-names.dict");
    Path path2 = Paths.get("/home/aaa/projects/zemberek-nlp/morphology/src/main/resources/tr/person-names-reduced.dict");
    List<String> bb = Files.readAllLines(path);
    TurkishMorphology morphology = TurkishMorphology.create(RootLexicon.builder().addTextDictionaryResources("tr/master-dictionary.dict", "tr/non-tdk.dict", "tr/proper.dict", "tr/proper-from-corpus.dict", "tr/abbreviations.dict").build());
    List<String> r = new ArrayList<>();
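    for (String s : bb) {
        // Collapse runs of spaces, then skip lines that are blank.
        s = s.replaceAll("[ ]+", " ").trim();
        if (s.isEmpty()) {
            continue;
        }
        DictionaryItem d = TurkishDictionaryLoader.loadFromString(s);
        // Keep only the entries that the lexicon built above does not already contain.
        if (!morphology.getLexicon().containsItem(d)) {
            r.add(s);
        }
    }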
    r.sort(Turkish.STRING_COMPARATOR_ASC);
    Files.write(path2, r);
}
Also used: Path (java.nio.file.Path), DictionaryItem (zemberek.morphology.lexicon.DictionaryItem), ArrayList (java.util.ArrayList), TurkishMorphology (zemberek.morphology.TurkishMorphology)
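
The filtering pattern in Example 51 (normalize each line, drop blanks, keep only entries the lexicon lacks, sort, write out) can be tried without Zemberek on the classpath. The sketch below is a stand-in under stated assumptions, not the project's code: the class name ReduceDictionary is hypothetical, a plain Set<String> replaces the lexicon lookup, and java.text.Collator for the tr-TR locale replaces Turkish.STRING_COMPARATOR_ASC, whose exact ordering may differ.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class ReduceDictionary {

    // Returns the normalized, non-blank lines that are not in `known`,
    // sorted with a Turkish-locale collator.
    // ASSUMPTION: `known` stands in for the Zemberek lexicon lookup.
    static List<String> reduce(List<String> lines, Set<String> known) {
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            // Collapse runs of spaces and trim, as the original loop does.
            String s = line.replaceAll("[ ]+", " ").trim();
            if (s.isEmpty()) {
                continue;
            }
            if (!known.contains(s)) {
                result.add(s);
            }
        }
        // ASSUMPTION: a JDK collator replaces Turkish.STRING_COMPARATOR_ASC.
        Collator turkish = Collator.getInstance(Locale.forLanguageTag("tr-TR"));
        result.sort(turkish);
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("Zeynep", "", "  Ali   Veli ", "Mehmet");
        Set<String> known = Set.of("Mehmet");
        System.out.println(reduce(lines, known));
    }
}
```

Keeping the filter in a method that takes and returns collections, rather than reading and writing fixed paths, makes it possible to check the logic on a few sample lines before running it on a real dictionary file.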

Example 52 with TurkishMorphology

Use of zemberek.morphology.TurkishMorphology in project zemberek-nlp by ahmetaa.

Class AddNewDictionaryItem, method main.

public static void main(String[] args) throws IOException {
    TurkishMorphology morphology = TurkishMorphology.createWithDefaults();
    AddNewDictionaryItem app = new AddNewDictionaryItem(morphology);
    Log.info("Proper Noun Test - 1 :");
    app.test("Meydan'a", new DictionaryItem("Meydan", "meydan", "meydan", PrimaryPos.Noun, SecondaryPos.ProperNoun));
    Log.info("----");
    Log.info("Proper Noun Test - 2 :");
    app.test("Meeeydan'a", new DictionaryItem("Meeeydan", "meeeydan", "meeeydan", PrimaryPos.Noun, SecondaryPos.ProperNoun));
    Log.info("----");
    Log.info("Verb Test : ");
    app.test("tweetleyeyazdım", new DictionaryItem("tweetlemek", "tweetle", "tivitle", PrimaryPos.Verb, SecondaryPos.None));
}
Also used: DictionaryItem (zemberek.morphology.lexicon.DictionaryItem), TurkishMorphology (zemberek.morphology.TurkishMorphology)

Example 53 with TurkishMorphology

Use of zemberek.morphology.TurkishMorphology in project zemberek-nlp by ahmetaa.

Class AnalyzeIgnoreDiacritics, method main.

public static void main(String[] args) throws IOException {
    TurkishMorphology morphology = TurkishMorphology.builder().ignoreDiacriticsInAnalysis().setLexicon(RootLexicon.getDefault()).build();
    morphology.analyze("kisi").forEach(System.out::println);
}
Also used: TurkishMorphology (zemberek.morphology.TurkishMorphology)
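
Example 53 analyzes the ASCII input "kisi" with diacritics ignored. As a rough illustration of what diacritics-insensitive matching means, the sketch below folds the Turkish-specific lowercase letters to ASCII and compares the folded forms. This is only a conceptual stand-in: the class AsciiFoldSketch, its methods, and the word list are hypothetical, and the source does not show how Zemberek implements ignoreDiacriticsInAnalysis internally.

```java
import java.util.List;
import java.util.stream.Collectors;

public class AsciiFoldSketch {

    // Maps Turkish-specific lowercase letters to their closest ASCII letters.
    // Other characters are returned unchanged. Unicode escapes keep the
    // source file ASCII-only.
    static String fold(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '\u00e7': sb.append('c'); break; // c with cedilla
                case '\u011f': sb.append('g'); break; // g with breve
                case '\u0131': sb.append('i'); break; // dotless i
                case '\u00f6': sb.append('o'); break; // o with diaeresis
                case '\u015f': sb.append('s'); break; // s with cedilla
                case '\u00fc': sb.append('u'); break; // u with diaeresis
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    // Returns every word whose folded form equals the folded input.
    // ASSUMPTION: `words` is a toy stand-in for a lexicon.
    static List<String> candidates(String input, List<String> words) {
        String key = fold(input);
        return words.stream()
                .filter(w -> fold(w).equals(key))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // "ki\u015fi" and "k\u0131\u015f\u0131" both fold to "kisi"; "kasa" does not.
        List<String> words = List.of("ki\u015fi", "k\u0131\u015f\u0131", "kasa");
        System.out.println(candidates("kisi", words));
    }
}
```

Because several distinct Turkish words can fold to the same ASCII string, an analyzer that ignores diacritics can be expected to return more candidate analyses for an input than a strict one would.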

Example 54 with TurkishMorphology

Use of zemberek.morphology.TurkishMorphology in project zemberek-nlp by ahmetaa.

Class ChangeStem, method main.

public static void main(String[] args) {
    TurkishMorphology morphology = TurkishMorphology.createWithDefaults();
    DictionaryItem newStem = morphology.getLexicon().getMatchingItems("poğaça").get(0);
    String word = "simidime";
    Log.info("Input Word = " + word);
    WordAnalysis results = morphology.analyze(word);
    for (SingleAnalysis result : results) {
        List<Result> generated = morphology.getWordGenerator().generate(newStem, result.getMorphemes());
        for (Result s : generated) {
            Log.info("Input analysis: " + result.formatLong());
            Log.info("After stem change, word = " + s.surface);
            Log.info("After stem change, Analysis = " + s.analysis.formatLong());
        }
    }
}
Also used: DictionaryItem (zemberek.morphology.lexicon.DictionaryItem), SingleAnalysis (zemberek.morphology.analysis.SingleAnalysis), WordAnalysis (zemberek.morphology.analysis.WordAnalysis), TurkishMorphology (zemberek.morphology.TurkishMorphology), Result (zemberek.morphology.generator.WordGenerator.Result)

Example 55 with TurkishMorphology

Use of zemberek.morphology.TurkishMorphology in project zemberek-nlp by ahmetaa.

Class FindPOS, method main.

public static void main(String[] args) {
    TurkishMorphology morphology = TurkishMorphology.createWithDefaults();
    String sentence = "Keşke yarın hava güzel olsa.";
    Log.info("Sentence  = " + sentence);
    SentenceAnalysis analysis = morphology.analyzeAndDisambiguate(sentence);
    for (SentenceWordAnalysis a : analysis) {
        PrimaryPos primaryPos = a.getBestAnalysis().getPos();
        Log.info("%s : %s ", a.getWordAnalysis().getInput(), primaryPos);
    }
}
Also used: PrimaryPos (zemberek.core.turkish.PrimaryPos), SentenceAnalysis (zemberek.morphology.analysis.SentenceAnalysis), TurkishMorphology (zemberek.morphology.TurkishMorphology), SentenceWordAnalysis (zemberek.morphology.analysis.SentenceWordAnalysis)

Aggregations

TurkishMorphology (zemberek.morphology.TurkishMorphology): 87
Test (org.junit.Test): 38
Path (java.nio.file.Path): 34
ArrayList (java.util.ArrayList): 23
SingleAnalysis (zemberek.morphology.analysis.SingleAnalysis): 23
WordAnalysis (zemberek.morphology.analysis.WordAnalysis): 23
Ignore (org.junit.Ignore): 21
DictionaryItem (zemberek.morphology.lexicon.DictionaryItem): 15
LinkedHashSet (java.util.LinkedHashSet): 13
PrintWriter (java.io.PrintWriter): 10
SentenceAnalysis (zemberek.morphology.analysis.SentenceAnalysis): 10
Stopwatch (com.google.common.base.Stopwatch): 8
Histogram (zemberek.core.collections.Histogram): 8
Token (zemberek.tokenization.Token): 8
HashSet (java.util.HashSet): 7
SentenceWordAnalysis (zemberek.morphology.analysis.SentenceWordAnalysis): 7
TurkishTokenizer (zemberek.tokenization.TurkishTokenizer): 7
ScoredItem (zemberek.core.ScoredItem): 6
IOException (java.io.IOException): 5
BlockTextLoader (zemberek.core.text.BlockTextLoader): 5