Examples with Recognizer - org.antlr.v4.runtime.Recognizer

Example 36 with Recognizer

use of org.antlr.v4.runtime.Recognizer in project antlr4 by antlr.

the class DefaultErrorStrategy method sync.

/**
 * The default implementation of {@link ANTLRErrorStrategy#sync} makes sure
 * that the current lookahead symbol is consistent with what we're expecting
 * at this point in the ATN. You can call this anytime but ANTLR only
 * generates code to check before subrules/loops and each iteration.
 *
 * <p>Implements Jim Idle's magic sync mechanism in closures and optional
 * subrules. E.g.,</p>
 *
 * <pre>
 * a : sync ( stuff sync )* ;
 * sync : {consume to what can follow sync} ;
 * </pre>
 *
 * At the start of a sub rule upon error, {@link #sync} performs single
 * token deletion, if possible. If it can't do that, it bails on the current
 * rule and uses the default error recovery, which consumes until the
 * resynchronization set of the current rule.
 *
 * <p>If the sub rule is optional ({@code (...)?}, {@code (...)*}, or block
 * with an empty alternative), then the expected set includes what follows
 * the subrule.</p>
 *
 * <p>During loop iteration, it consumes until it sees a token that can start a
 * sub rule or what follows loop. Yes, that is pretty aggressive. We opt to
 * stay in the loop as long as possible.</p>
 *
 * <p><strong>ORIGINS</strong></p>
 *
 * <p>Previous versions of ANTLR did a poor job of recovery within loops.
 * A single mismatched or missing token would force the parser to bail
 * out of the entire rule surrounding the loop. So, for rule</p>
 *
 * <pre>
 * classDef : 'class' ID '{' member* '}'
 * </pre>
 *
 * input with an extra token between members would force the parser to
 * consume until it found the next class definition rather than the next
 * member definition of the current class.
 *
 * <p>This functionality costs a little bit of effort because the parser has to
 * compare the token set at the start of the loop and at each iteration. If for
 * some reason speed is suffering for you, you can turn off this
 * functionality by simply overriding this method with an empty body { }.</p>
 */
@Override
public void sync(Parser recognizer) throws RecognitionException {
    ATNState s = recognizer.getInterpreter().atn.states.get(recognizer.getState());
    // If already recovering, don't try to sync
    if (inErrorRecoveryMode(recognizer)) {
        return;
    }
    TokenStream tokens = recognizer.getInputStream();
    int la = tokens.LA(1);
    // try cheaper subset first; might get lucky. seems to shave a wee bit off
    IntervalSet nextTokens = recognizer.getATN().nextTokens(s);
    if (nextTokens.contains(la)) {
        // We are sure the token matches
        nextTokensContext = null;
        nextTokensState = ATNState.INVALID_STATE_NUMBER;
        return;
    }
    if (nextTokens.contains(Token.EPSILON)) {
        if (nextTokensContext == null) {
            // It's possible the next token won't match; information tracked
            // by sync is restricted for performance.
            nextTokensContext = recognizer.getContext();
            nextTokensState = recognizer.getState();
        }
        return;
    }
    switch(s.getStateType()) {
        case ATNState.BLOCK_START:
        case ATNState.STAR_BLOCK_START:
        case ATNState.PLUS_BLOCK_START:
        case ATNState.STAR_LOOP_ENTRY:
            // report error and recover if possible
            if (singleTokenDeletion(recognizer) != null) {
                return;
            }
            throw new InputMismatchException(recognizer);
        case ATNState.PLUS_LOOP_BACK:
        case ATNState.STAR_LOOP_BACK:
            // System.err.println("at loop back: "+s.getClass().getSimpleName());
            reportUnwantedToken(recognizer);
            IntervalSet expecting = recognizer.getExpectedTokens();
            IntervalSet whatFollowsLoopIterationOrRule = expecting.or(getErrorRecoverySet(recognizer));
            consumeUntil(recognizer, whatFollowsLoopIterationOrRule);
            break;
        default:
            // do nothing if we can't identify the exact kind of ATN state
            break;
    }
}

Also used : IntervalSet(org.antlr.v4.runtime.misc.IntervalSet) ATNState(org.antlr.v4.runtime.atn.ATNState)
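
The loop-back branch of sync above consumes tokens until it reaches one that is either expected by the loop or in the error recovery set. The following is a minimal sketch of that idea in plain Java, without the ANTLR runtime. The class LoopResyncSketch, its methods, and the use of strings as token types are assumptions made for illustration; they are not part of ANTLR.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified model of loop-back recovery. Token types are plain strings here,
// not ANTLR token types, and all names are hypothetical.
class LoopResyncSketch {

    // Returns the index of the first token at or after 'start' that is in
    // 'resyncSet', or tokens.size() if there is none (the model's stand-in for EOF).
    static int consumeUntil(List<String> tokens, int start, Set<String> resyncSet) {
        int i = start;
        while (i < tokens.size() && !resyncSet.contains(tokens.get(i))) {
            i++; // discard the unwanted token
        }
        return i;
    }

    // Union of what the loop expects next and what may follow the enclosing rules.
    static Set<String> union(Set<String> expecting, Set<String> recoverySet) {
        Set<String> result = new HashSet<>(expecting);
        result.addAll(recoverySet);
        return result;
    }

    public static void main(String[] args) {
        // Models classDef : 'class' ID '{' member* '}' with two stray tokens
        // between members.
        List<String> tokens =
            Arrays.asList("class", "ID", "{", "member", "?", "?", "member", "}");
        Set<String> expecting = new HashSet<>(Arrays.asList("member", "}"));
        Set<String> recoverySet = new HashSet<>(Arrays.asList("class"));
        int resumeAt = consumeUntil(tokens, 4, union(expecting, recoverySet));
        System.out.println("resume at index " + resumeAt + ": " + tokens.get(resumeAt));
    }
}
```

In this model the parser resumes at the next member rather than skipping to the next class definition, which is the behavior the Javadoc describes as staying in the loop as long as possible.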

Example 37 with Recognizer

use of org.antlr.v4.runtime.Recognizer in project batfish by batfish.

the class BatfishANTLRErrorStrategy method sync.

@Override
public void sync(Parser recognizer) throws RecognitionException {
    /*
     * BEGIN: Copied from super
     */
    ATNState s = recognizer.getInterpreter().atn.states.get(recognizer.getState());
    if (inErrorRecoveryMode(recognizer)) {
        return;
    }
    TokenStream tokens = recognizer.getInputStream();
    int la = tokens.LA(1);
    IntervalSet nextTokens = recognizer.getATN().nextTokens(s);
    if (nextTokens.contains(Token.EPSILON) || nextTokens.contains(la)) {
        return;
    }
    /*
     * END: Copied from super
     */
    boolean topLevel = recognizer.getContext().parent == null;
    switch(s.getStateType()) {
        case ATNState.BLOCK_START:
        case ATNState.STAR_BLOCK_START:
        case ATNState.PLUS_BLOCK_START:
        case ATNState.STAR_LOOP_ENTRY:
        case ATNState.PLUS_LOOP_BACK:
        case ATNState.STAR_LOOP_BACK:
            if (topLevel) {
                /*
                 * When at top level, we cannot pop up. So consume every "line" until we have one that
                 * starts with a token acceptable at the top level.
                 */
                reportUnwantedToken(recognizer);
                consumeBlocksUntilWanted(recognizer);
                return;
            } else {
                /*
                 * If not at the top level, error out to pop up a level. This may repeat until the next
                 * token is acceptable at the given level.
                 */
                throw new InputMismatchException(recognizer);
            }
        default:
            return;
    }
}

Also used : TokenStream(org.antlr.v4.runtime.TokenStream) IntervalSet(org.antlr.v4.runtime.misc.IntervalSet) InputMismatchException(org.antlr.v4.runtime.InputMismatchException) ATNState(org.antlr.v4.runtime.atn.ATNState)
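
The top-level branch above discards whole lines until one starts with a token that is acceptable at the top level. The body of consumeBlocksUntilWanted is not shown here, so the following is only a sketch of that idea under stated assumptions: lines are plain strings, the "first token" is the first whitespace-separated word, and the class TopLevelLineSkipSketch and its method are hypothetical names, not batfish code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of line-oriented recovery at the top level of a parse.
class TopLevelLineSkipSketch {

    // Returns the index of the first line at or after 'start' whose first word
    // is in 'wanted', or lines.size() if every remaining line is unwanted.
    static int skipUntilWanted(List<String> lines, int start, Set<String> wanted) {
        int i = start;
        while (i < lines.size()) {
            String trimmed = lines.get(i).trim();
            String firstWord = trimmed.isEmpty() ? "" : trimmed.split("\\s+")[0];
            if (wanted.contains(firstWord)) {
                break; // this line can start a top-level construct
            }
            i++; // discard the whole line
        }
        return i;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "hostname r1", "garbage line", "more garbage", "interface eth0");
        Set<String> wanted = new HashSet<>(Arrays.asList("hostname", "interface"));
        int resumeAt = skipUntilWanted(lines, 1, wanted);
        System.out.println("resume at line index " + resumeAt);
    }
}
```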

Example 38 with Recognizer

use of org.antlr.v4.runtime.Recognizer in project batfish by batfish.

the class BatfishANTLRErrorStrategy method createErrorNode.

/**
 * Create an error node with the text of the current line and insert it into the parse tree.
 *
 * @param recognizer The recognizer with which to create the error node
 * @param ctx The rule context to which the error node is added
 * @param separator The token that ends the unrecognized line. This is also used to determine the
 *     index of the line to return in error messages.
 * @return The token contained in the error node
 */
private Token createErrorNode(Parser recognizer, ParserRuleContext ctx, Token separator) {
    if (_recoveredAtEof) {
        _recoveredAtEof = false;
        throw new BatfishRecognitionException(recognizer, recognizer.getInputStream(), ctx);
    }
    if (separator.getType() == Lexer.EOF) {
        _recoveredAtEof = true;
    }
    String lineText = _lines[separator.getLine() - 1] + separator.getText();
    Token lineToken = recognizer.getTokenFactory().create(new Pair<>(null, null), BatfishLexer.UNRECOGNIZED_LINE_TOKEN, lineText, Lexer.DEFAULT_TOKEN_CHANNEL, -1, -1, separator.getLine(), 0);
    ErrorNode errorNode = recognizer.createErrorNode(ctx, lineToken);
    ctx.addErrorNode(errorNode);
    return lineToken;
}

Also used : ErrorNode(org.antlr.v4.runtime.tree.ErrorNode) Token(org.antlr.v4.runtime.Token)

Example 39 with Recognizer

use of org.antlr.v4.runtime.Recognizer in project batfish by batfish.

the class BatfishANTLRErrorStrategy method recoverInCurrentNode.

/**
 * Recover from adaptive prediction failure (when more than one token is needed for rule
 * prediction, and the first token by itself is insufficient to determine an error has occurred) by
 * throwing away lines until adaptive prediction succeeds or there is nothing left to throw away.
 * Each discarded line is inserted as a child of the current rule as an {@link ErrorNode}.
 *
 * @param recognizer The {@link Parser} for whom adaptive prediction has failed
 */
public void recoverInCurrentNode(Parser recognizer) {
    beginErrorCondition(recognizer);
    lastErrorIndex = recognizer.getInputStream().index();
    if (lastErrorStates == null) {
        lastErrorStates = new IntervalSet();
    }
    lastErrorStates.add(recognizer.getState());
    consumeUntilEndOfLine(recognizer);
    // Get the line number and separator text from the separator token
    Token separatorToken = recognizer.getCurrentToken();
    ParserRuleContext ctx = recognizer.getContext();
    recognizer.consume();
    createErrorNode(recognizer, ctx, separatorToken);
    endErrorCondition(recognizer);
    if (recognizer.getInputStream().LA(1) == Lexer.EOF) {
        recover(recognizer);
    }
}

Also used : ParserRuleContext(org.antlr.v4.runtime.ParserRuleContext) IntervalSet(org.antlr.v4.runtime.misc.IntervalSet) Token(org.antlr.v4.runtime.Token)

Example 40 with Recognizer

use of org.antlr.v4.runtime.Recognizer in project batfish by batfish.

the class BatfishLexerErrorListener method syntaxError.

@Override
public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol, int line, int charPositionInLine, String msg, RecognitionException e) {
    if (!_settings.getDisableUnrecognized()) {
        return;
    }
    StringBuilder sb = new StringBuilder();
    BatfishParser parser = _combinedParser.getParser();
    BatfishLexer lexer = _combinedParser.getLexer();
    List<String> ruleNames = Arrays.asList(parser.getRuleNames());
    ParserRuleContext ctx = parser.getContext();
    String ruleStack = ctx.toString(ruleNames);
    sb.append("lexer: " + _grammarName + ": line " + line + ":" + charPositionInLine + ": " + msg + "\n");
    sb.append("Current rule stack: '" + ruleStack + "'.\n");
    if (ctx.getStart() != null) {
        sb.append("Current rule starts at: line: " + ctx.getStart().getLine() + ", col " + ctx.getStart().getCharPositionInLine() + "\n");
    }
    sb.append("Parse tree for current rule:\n");
    sb.append(ParseTreePrettyPrinter.print(ctx, _combinedParser) + "\n");
    sb.append("Lexer mode: " + lexer.getMode() + "\n");
    sb.append("Lexer state variables:\n");
    sb.append(lexer.printStateVariables());
    // collect context from text
    String text = _combinedParser.getInput();
    String[] lines = text.split("\n", -1);
    int errorLineIndex = line - 1;
    int errorContextStartLine = Math.max(errorLineIndex - _settings.getMaxParserContextLines(), 0);
    int errorContextEndLine = Math.min(errorLineIndex + _settings.getMaxParserContextLines(), lines.length);
    sb.append("Error context lines:\n");
    for (int i = errorContextStartLine; i < errorLineIndex; i++) {
        sb.append(String.format("%-11s%s\n", "   " + (i + 1) + ":", lines[i]));
    }
    sb.append(String.format("%-11s%s\n", ">>>" + (errorLineIndex + 1) + ":", lines[errorLineIndex]));
    for (int i = errorLineIndex + 1; i <= errorContextEndLine && i < lines.length; i++) {
        sb.append(String.format("%-11s%s\n", "   " + (i + 1) + ":", lines[i]));
    }
    String error = sb.toString();
    if (_settings.getThrowOnLexerError()) {
        throw new DebugBatfishException("\n" + error);
    } else {
        _combinedParser.getErrors().add(error);
    }
}

Also used : ParserRuleContext(org.antlr.v4.runtime.ParserRuleContext) DebugBatfishException(org.batfish.common.DebugBatfishException)
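
The last part of syntaxError above prints a window of source lines around the error and marks the offending line with ">>>". The following is a minimal standalone sketch of that windowing, using the same format string. The class ErrorContextSketch and the method contextLines are hypothetical names, and the sketch assumes the reported line number is within the text.

```java
// Hypothetical helper, not part of batfish: renders the lines around an error
// with a ">>>" marker on the offending line.
class ErrorContextSketch {

    // 'line' is 1-based, as passed to an ANTLR error listener. This sketch
    // assumes 1 <= line <= number of lines in 'text'.
    static String contextLines(String text, int line, int maxContextLines) {
        String[] lines = text.split("\n", -1);
        int errorLineIndex = line - 1;
        // Clamp the window to the bounds of the text.
        int first = Math.max(errorLineIndex - maxContextLines, 0);
        int last = Math.min(errorLineIndex + maxContextLines, lines.length - 1);
        StringBuilder sb = new StringBuilder();
        for (int i = first; i <= last; i++) {
            String marker = (i == errorLineIndex) ? ">>>" : "   ";
            // Left-justify the "marker + line number" column to 11 characters.
            sb.append(String.format("%-11s%s\n", marker + (i + 1) + ":", lines[i]));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "interface eth0\n ip address 10.0.0.1\n bogus command\n shutdown\nhostname r1";
        System.out.print(contextLines(text, 3, 1));
    }
}
```

Clamping both ends of the window up front lets a single loop cover the lines before, at, and after the error, where the original uses three separate steps.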

Aggregations

IntervalSet (org.antlr.v4.runtime.misc.IntervalSet) 24
Token (org.antlr.v4.runtime.Token) 22
RecognitionException (org.antlr.v4.runtime.RecognitionException) 19
CommonTokenStream (org.antlr.v4.runtime.CommonTokenStream) 15
File (java.io.File) 11
ParserRuleContext (org.antlr.v4.runtime.ParserRuleContext) 10
BaseRuntimeTest.antlrOnString (org.antlr.v4.test.runtime.BaseRuntimeTest.antlrOnString) 10
ATNState (org.antlr.v4.runtime.atn.ATNState) 9
IOException (java.io.IOException) 8
BaseErrorListener (org.antlr.v4.runtime.BaseErrorListener) 8
Parser (org.antlr.v4.runtime.Parser) 8
BaseRuntimeTest.writeFile (org.antlr.v4.test.runtime.BaseRuntimeTest.writeFile) 8
ArrayList (java.util.ArrayList) 7
ATN (org.antlr.v4.runtime.atn.ATN) 6
Pair (com.abubusoft.kripton.common.Pair) 5
InputMismatchException (org.antlr.v4.runtime.InputMismatchException) 5
TokenStream (org.antlr.v4.runtime.TokenStream) 5
BeetlException (org.beetl.core.exception.BeetlException) 5
STGroupString (org.stringtemplate.v4.STGroupString) 5
CommonToken (org.antlr.v4.runtime.CommonToken) 4