Example 1 with PatternTokenBuilder

Use of org.languagetool.rules.patterns.PatternTokenBuilder in project languagetool by languagetool-org.

Class SpellingCheckRule, method getTokensForSentenceStart.

private List<PatternToken> getTokensForSentenceStart(String[] parts) {
    List<PatternToken> ucPatternTokens = new ArrayList<>();
    int j = 0;
    for (String part : parts) {
        if (j == 0) {
            // at sentence start, we also need to accept a phrase that starts with an uppercase char:
            String uppercased = StringTools.uppercaseFirstChar(part);
            ucPatternTokens.add(new PatternTokenBuilder().posRegex(JLanguageTool.SENTENCE_START_TAGNAME).build());
            ucPatternTokens.add(new PatternTokenBuilder().csToken(uppercased).build());
        } else {
            ucPatternTokens.add(new PatternTokenBuilder().csToken(part).build());
        }
        j++;
    }
    return ucPatternTokens;
}
Also used: PatternToken (org.languagetool.rules.patterns.PatternToken), PatternTokenBuilder (org.languagetool.rules.patterns.PatternTokenBuilder)

Example 2 with PatternTokenBuilder

Use of org.languagetool.rules.patterns.PatternTokenBuilder in project languagetool by languagetool-org.

Class SpellingCheckRule, method acceptPhrases.

/**
 * Accept (case-sensitively, unless at the start of a sentence) the given phrases even though they
 * are not in the built-in dictionary.
 * Use this to avoid false alarms on e.g. names and technical terms. Unlike {@link #addIgnoreTokens(List)}
 * this can deal with phrases. It can be called like this:
 * <code>rule.acceptPhrases(Arrays.asList("duodenal atresia"))</code>
 * This way, checking would not create an error for "duodenal atresia", but it would still
 * create an error for "duodenal" or "atresia" if they appear on their own.
 * @since 3.3
 */
public void acceptPhrases(List<String> phrases) {
    List<List<PatternToken>> antiPatterns = new ArrayList<>();
    for (String phrase : phrases) {
        String[] parts = phrase.split(" ");
        List<PatternToken> patternTokens = new ArrayList<>();
        int i = 0;
        boolean startsLowercase = false;
        for (String part : parts) {
            if (i == 0) {
                String uppercased = StringTools.uppercaseFirstChar(part);
                if (!uppercased.equals(part)) {
                    startsLowercase = true;
                }
            }
            patternTokens.add(new PatternTokenBuilder().csToken(part).build());
            i++;
        }
        antiPatterns.add(patternTokens);
        if (startsLowercase) {
            antiPatterns.add(getTokensForSentenceStart(parts));
        }
    }
    this.antiPatterns = makeAntiPatterns(antiPatterns, language);
}
Also used: PatternToken (org.languagetool.rules.patterns.PatternToken), PatternTokenBuilder (org.languagetool.rules.patterns.PatternTokenBuilder)
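
The case handling in the two methods above may be easier to follow without the LanguageTool types. The following is a minimal, dependency-free sketch of the same idea using plain strings in place of PatternToken objects. The class name PhraseVariants, the method variantsFor, and the SENTENCE_START marker value are hypothetical names introduced for illustration only; they are not part of the LanguageTool API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: plain strings stand in for PatternToken objects.
class PhraseVariants {

    // Hypothetical stand-in for JLanguageTool.SENTENCE_START_TAGNAME (value is an assumption).
    static final String SENTENCE_START = "<SENT_START>";

    // Uppercases the first character, leaving the rest of the string untouched.
    static String uppercaseFirstChar(String s) {
        if (s.isEmpty()) {
            return s;
        }
        return Character.toUpperCase(s.charAt(0)) + s.substring(1);
    }

    // Returns the token sequences that would be accepted for one phrase:
    // always the case-sensitive phrase itself, plus a sentence-start variant
    // with an uppercased first word if the phrase starts in lowercase.
    static List<List<String>> variantsFor(String phrase) {
        String[] parts = phrase.split(" ");
        List<List<String>> variants = new ArrayList<>();
        variants.add(new ArrayList<>(Arrays.asList(parts)));

        String uppercased = uppercaseFirstChar(parts[0]);
        boolean startsLowercase = !uppercased.equals(parts[0]);
        if (startsLowercase) {
            List<String> atSentenceStart = new ArrayList<>();
            atSentenceStart.add(SENTENCE_START);
            atSentenceStart.add(uppercased);
            for (int i = 1; i < parts.length; i++) {
                atSentenceStart.add(parts[i]);
            }
            variants.add(atSentenceStart);
        }
        return variants;
    }
}
```

Under these assumptions, PhraseVariants.variantsFor("duodenal atresia") yields two sequences (the phrase as given, and the sentence-start marker followed by "Duodenal" and "atresia"), while a phrase that already starts with an uppercase letter yields only one.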

Aggregations

PatternToken (org.languagetool.rules.patterns.PatternToken): 2
PatternTokenBuilder (org.languagetool.rules.patterns.PatternTokenBuilder): 2