Example 1 with SuffixData

Use of zemberek.morphology.lexicon.graph.SuffixData in project zemberek-nlp by ahmetaa.

Class StemNodeGenerator, method generateModifiedRootNodes.

private StemNode[] generateModifiedRootNodes(DictionaryItem dicItem) {
    if (dicItem.hasAttribute(Special)) {
        return handleSpecialStems(dicItem);
    }
    TurkishLetterSequence modifiedSeq = new TurkishLetterSequence(dicItem.pronunciation, alphabet);
    EnumSet<PhoneticAttribute> originalAttrs = calculateAttributes(dicItem.pronunciation);
    EnumSet<PhoneticAttribute> modifiedAttrs = originalAttrs.clone();
    EnumSet<PhoneticExpectation> originalExpectations = EnumSet.noneOf(PhoneticExpectation.class);
    EnumSet<PhoneticExpectation> modifiedExpectations = EnumSet.noneOf(PhoneticExpectation.class);
    for (RootAttribute attribute : dicItem.attributes) {
        // generate other boundary attributes and modified root state.
        switch(attribute) {
            case Voicing:
                TurkicLetter last = modifiedSeq.lastLetter();
                TurkicLetter modifiedLetter = alphabet.voice(last);
                if (modifiedLetter == null) {
                    throw new LexiconException("Voicing letter is not proper in:" + dicItem);
                }
                // Stems ending in "nk" take 'g' instead of the default voiced letter.
                if (dicItem.lemma.endsWith("nk")) {
                    modifiedLetter = TurkishAlphabet.L_g;
                }
                modifiedSeq.changeLetter(modifiedSeq.length() - 1, modifiedLetter);
                modifiedAttrs.remove(PhoneticAttribute.LastLetterVoicelessStop);
                originalExpectations.add(PhoneticExpectation.ConsonantStart);
                modifiedExpectations.add(PhoneticExpectation.VowelStart);
                break;
            case Doubling:
                modifiedSeq.append(modifiedSeq.lastLetter());
                originalExpectations.add(PhoneticExpectation.ConsonantStart);
                modifiedExpectations.add(PhoneticExpectation.VowelStart);
                break;
            case LastVowelDrop:
                if (modifiedSeq.lastLetter().isVowel()) {
                    modifiedSeq.delete(modifiedSeq.length() - 1);
                    modifiedExpectations.add(PhoneticExpectation.ConsonantStart);
                } else {
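                    // The stem ends in a consonant: delete the letter before it,
                    // which is assumed to be the dropped vowel.
                    modifiedSeq.delete(modifiedSeq.length() - 2);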
                    if (!dicItem.primaryPos.equals(PrimaryPos.Verb)) {
                        originalExpectations.add(PhoneticExpectation.ConsonantStart);
                    }
                    modifiedExpectations.add(PhoneticExpectation.VowelStart);
                }
                break;
            case InverseHarmony:
                originalAttrs.add(PhoneticAttribute.LastVowelFrontal);
                originalAttrs.remove(PhoneticAttribute.LastVowelBack);
                modifiedAttrs.add(PhoneticAttribute.LastVowelFrontal);
                modifiedAttrs.remove(PhoneticAttribute.LastVowelBack);
                break;
            case ProgressiveVowelDrop:
                modifiedSeq.delete(modifiedSeq.length() - 1);
                if (modifiedSeq.hasVowel()) {
                    modifiedAttrs = calculateAttributes(modifiedSeq);
                }
                break;
            default:
                break;
        }
    }
    StemNode original = new StemNode(dicItem.root, dicItem, originalAttrs, originalExpectations);
    StemNode modified = new StemNode(modifiedSeq.toString(), dicItem, modifiedAttrs, modifiedExpectations);
    SuffixData[] roots = suffixProvider.defineSuccessorSuffixes(dicItem);
    original.exclusiveSuffixData = roots[0];
    modified.exclusiveSuffixData = roots[1];
    if (original.equals(modified)) {
        return new StemNode[] { original };
    }
    modified.setTermination(TerminationType.NON_TERMINAL);
    if (dicItem.hasAttribute(RootAttribute.CompoundP3sgRoot)) {
        original.setTermination(TerminationType.NON_TERMINAL);
    }
    return new StemNode[] { original, modified };
}
Also used: RootAttribute (zemberek.core.turkish.RootAttribute), TurkicLetter (zemberek.core.turkish.TurkicLetter), PhoneticExpectation (zemberek.core.turkish.PhoneticExpectation), TurkishLetterSequence (zemberek.core.turkish.TurkishLetterSequence), LexiconException (zemberek.morphology.lexicon.LexiconException), StemNode (zemberek.morphology.lexicon.graph.StemNode), PhoneticAttribute (zemberek.core.turkish.PhoneticAttribute), SuffixData (zemberek.morphology.lexicon.graph.SuffixData)
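
The Voicing, Doubling and LastVowelDrop branches above each rewrite the end of the stem. The sketch below replays those three rewrites on plain strings so the effect is easy to see. It is not the library's code: StemModificationSketch and all of its methods are hypothetical names, and the softening table (p to b, c-cedilla to c, t to d, k to soft g) is an assumption based on standard Turkish consonant softening rather than on the actual TurkishAlphabet.voice implementation.

```java
final class StemModificationSketch {

    // Turkish vowels, written with Unicode escapes so the file compiles under any source encoding.
    private static final String VOWELS = "ae\u0131io\u00f6u\u00fc";

    static boolean isVowel(char c) {
        return VOWELS.indexOf(c) >= 0;
    }

    // Assumed softening table: p -> b, c-cedilla -> c, t -> d, k -> soft g (U+011F).
    static char voiced(char c) {
        switch (c) {
            case 'p': return 'b';
            case '\u00e7': return 'c';
            case 't': return 'd';
            case 'k': return '\u011f';
            default: throw new IllegalArgumentException("No voiced counterpart for: " + c);
        }
    }

    // Voicing: replace the final voiceless stop. Stems ending in "nk" take 'g', as in the excerpt.
    static String voice(String stem) {
        char replacement = voiced(stem.charAt(stem.length() - 1));
        if (stem.endsWith("nk")) {
            replacement = 'g';
        }
        return stem.substring(0, stem.length() - 1) + replacement;
    }

    // Doubling: repeat the last letter.
    static String doubleLast(String stem) {
        return stem + stem.charAt(stem.length() - 1);
    }

    // LastVowelDrop: delete the final letter if it is a vowel, otherwise the letter before it.
    static String dropLastVowel(String stem) {
        int n = stem.length();
        int index = isVowel(stem.charAt(n - 1)) ? n - 1 : n - 2;
        return stem.substring(0, index) + stem.substring(index + 1);
    }

    public static void main(String[] args) {
        System.out.println(voice("kitap"));         // kitab
        System.out.println(voice("renk"));          // reng
        System.out.println(doubleLast("hak"));      // hakk
        System.out.println(dropLastVowel("burun")); // burn
    }
}
```

In the excerpt, the unmodified form is paired with a ConsonantStart expectation and the rewritten form with VowelStart, so each surface form would only be matched against suffixes that start with the corresponding kind of letter.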

Example 2 with SuffixData

Use of zemberek.morphology.lexicon.graph.SuffixData in project zemberek-nlp by ahmetaa.

Class StemNodeGenerator, method generate.

/**
 * Generates StemNode objects from the dictionary item.
 * <p>Most of the time a single StemNode is generated.
 *
 * @param item DictionaryItem
 * @return one or more StemNode objects.
 */
public StemNode[] generate(DictionaryItem item) {
    if (hasModifierAttribute(item)) {
        return generateModifiedRootNodes(item);
    } else {
        SuffixData[] roots = suffixProvider.defineSuccessorSuffixes(item);
        EnumSet<PhoneticAttribute> phoneticAttributes = calculateAttributes(item.pronunciation);
        StemNode stemNode = new StemNode(item.root, item, TerminationType.TERMINAL, phoneticAttributes, EnumSet.noneOf(PhoneticExpectation.class));
        stemNode.exclusiveSuffixData = roots[0];
        return new StemNode[] { stemNode };
    }
}
Also used: PhoneticExpectation (zemberek.core.turkish.PhoneticExpectation), StemNode (zemberek.morphology.lexicon.graph.StemNode), SuffixData (zemberek.morphology.lexicon.graph.SuffixData), PhoneticAttribute (zemberek.core.turkish.PhoneticAttribute)
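
The generate method calls hasModifierAttribute, which is not part of this excerpt. One plausible shape for such a check is a set intersection test, sketched below. Everything here is an assumption: the nested RootAttribute enum is a reduced stand-in for zemberek.core.turkish.RootAttribute, and the MODIFIERS set simply lists the attributes handled by the switch in Example 1, so the real library may use a different set.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

final class ModifierCheckSketch {

    // Reduced stand-in for the library enum; only the names that appear in the excerpts.
    enum RootAttribute {
        Voicing, Doubling, LastVowelDrop, InverseHarmony, ProgressiveVowelDrop, CompoundP3sgRoot, Special
    }

    // Assumption: the attributes that change the stem are the ones the switch in Example 1 handles.
    static final Set<RootAttribute> MODIFIERS = Collections.unmodifiableSet(EnumSet.of(
            RootAttribute.Voicing,
            RootAttribute.Doubling,
            RootAttribute.LastVowelDrop,
            RootAttribute.InverseHarmony,
            RootAttribute.ProgressiveVowelDrop));

    // True when the item carries at least one stem-modifying attribute.
    static boolean hasModifierAttribute(Set<RootAttribute> itemAttributes) {
        return !Collections.disjoint(itemAttributes, MODIFIERS);
    }

    public static void main(String[] args) {
        System.out.println(hasModifierAttribute(EnumSet.of(RootAttribute.Voicing)));          // true
        System.out.println(hasModifierAttribute(EnumSet.of(RootAttribute.CompoundP3sgRoot))); // false
        System.out.println(hasModifierAttribute(EnumSet.noneOf(RootAttribute.class)));        // false
    }
}
```

With a check like this, items without a modifier attribute take the else branch and yield a single terminal StemNode, which matches the Javadoc remark that a single node is the common case.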

Example 3 with SuffixData

Use of zemberek.morphology.lexicon.graph.SuffixData in project zemberek-nlp by ahmetaa.

Class TurkishSuffixes, method defineSuccessorSuffixes.

@Override
public SuffixData[] defineSuccessorSuffixes(DictionaryItem item) {
    SuffixData original = new SuffixData();
    SuffixData modified = new SuffixData();
    PrimaryPos primaryPos = item.primaryPos;
    switch(primaryPos) {
        case Verb:
            getForVerb(item, original, modified);
            break;
        default:
            break;
    }
    return new SuffixData[] { original, modified };
}
Also used: PrimaryPos (zemberek.core.turkish.PrimaryPos), SuffixData (zemberek.morphology.lexicon.graph.SuffixData)

Example 4 with SuffixData

Use of zemberek.morphology.lexicon.graph.SuffixData in project zemberek-nlp by ahmetaa.

Class NullSuffixForm, method copy.

public NullSuffixForm copy() {
    NullSuffixForm copy = new NullSuffixForm(index, id, template, terminationType);
    copy.connections = new SuffixData(this.connections);
    copy.indirectConnections = new SuffixData(this.indirectConnections);
    return copy;
}
Also used: SuffixData (zemberek.morphology.lexicon.graph.SuffixData)
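
The copy method above builds new SuffixData objects from the existing ones instead of sharing the references. The sketch below shows why that matters, using a hypothetical Connections class as a stand-in for SuffixData. It assumes that SuffixData is a mutable collection and that its copy constructor duplicates the contents, which the excerpt suggests but does not show.

```java
import java.util.HashSet;
import java.util.Set;

final class CopySketch {

    // Hypothetical stand-in for SuffixData: a mutable set of suffix ids.
    static final class Connections {
        private final Set<String> ids = new HashSet<>();

        Connections() {
        }

        // Copy constructor: duplicates the contents so the two objects no longer share state.
        Connections(Connections other) {
            ids.addAll(other.ids);
        }

        void add(String id) {
            ids.add(id);
        }

        boolean contains(String id) {
            return ids.contains(id);
        }
    }

    // Hypothetical stand-in for a suffix form that owns its connections.
    static final class Form {
        Connections connections = new Connections();

        Form copy() {
            Form copy = new Form();
            copy.connections = new Connections(this.connections);
            return copy;
        }
    }

    public static void main(String[] args) {
        Form original = new Form();
        original.connections.add("A3sg");

        Form copy = original.copy();
        copy.connections.add("Pl");

        System.out.println(copy.connections.contains("A3sg"));   // true
        System.out.println(original.connections.contains("Pl")); // false
    }
}
```

Assigning this.connections directly would make both forms point at the same object, so a change made through the copy would also show up in the original.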

Aggregations

SuffixData (zemberek.morphology.lexicon.graph.SuffixData): 4
PhoneticAttribute (zemberek.core.turkish.PhoneticAttribute): 2
PhoneticExpectation (zemberek.core.turkish.PhoneticExpectation): 2
StemNode (zemberek.morphology.lexicon.graph.StemNode): 2
PrimaryPos (zemberek.core.turkish.PrimaryPos): 1
RootAttribute (zemberek.core.turkish.RootAttribute): 1
TurkicLetter (zemberek.core.turkish.TurkicLetter): 1
TurkishLetterSequence (zemberek.core.turkish.TurkishLetterSequence): 1
LexiconException (zemberek.morphology.lexicon.LexiconException): 1