Search in sources :

Example 51 with ATNState

use of org.antlr.v4.runtime.atn.ATNState in project antlr4 by antlr.

the class Parser method isExpectedToken.

/**
 * Checks whether or not {@code symbol} can follow the current state in the
 * ATN. The behavior of this method is equivalent to the following, but is
 * implemented such that the complete context-sensitive follow set does not
 * need to be explicitly constructed.
 *
 * <pre>
 * return getExpectedTokens().contains(symbol);
 * </pre>
 *
 * @param symbol the symbol type to check
 * @return {@code true} if {@code symbol} can follow the current state in
 * the ATN, otherwise {@code false}.
 */
public boolean isExpectedToken(int symbol) {
    // return getInterpreter().atn.nextTokens(_ctx);
    ATN atn = getInterpreter().atn;
    ParserRuleContext ctx = _ctx;
    ATNState s = atn.states.get(getState());
    IntervalSet following = atn.nextTokens(s);
    if (following.contains(symbol)) {
        return true;
    }
    // System.out.println("following "+s+"="+following);
    if (!following.contains(Token.EPSILON))
        return false;
    while (ctx != null && ctx.invokingState >= 0 && following.contains(Token.EPSILON)) {
        ATNState invokingState = atn.states.get(ctx.invokingState);
        RuleTransition rt = (RuleTransition) invokingState.transition(0);
        following = atn.nextTokens(rt.followState);
        if (following.contains(symbol)) {
            return true;
        }
        ctx = (ParserRuleContext) ctx.parent;
    }
    if (following.contains(Token.EPSILON) && symbol == Token.EOF) {
        return true;
    }
    return false;
}
Also used : IntervalSet(org.antlr.v4.runtime.misc.IntervalSet) RuleTransition(org.antlr.v4.runtime.atn.RuleTransition) ATN(org.antlr.v4.runtime.atn.ATN) ATNState(org.antlr.v4.runtime.atn.ATNState)

Example 52 with ATNState

use of org.antlr.v4.runtime.atn.ATNState in project antlr4 by antlr.

the class DefaultErrorStrategy method getErrorRecoverySet.

/*  Compute the error recovery set for the current rule.  During
	 *  rule invocation, the parser pushes the set of tokens that can
	 *  follow that rule reference on the stack; this amounts to
	 *  computing FIRST of what follows the rule reference in the
	 *  enclosing rule. See LinearApproximator.FIRST().
	 *  This local follow set only includes tokens
	 *  from within the rule; i.e., the FIRST computation done by
	 *  ANTLR stops at the end of a rule.
	 *
	 *  EXAMPLE
	 *
	 *  When you find a "no viable alt exception", the input is not
	 *  consistent with any of the alternatives for rule r.  The best
	 *  thing to do is to consume tokens until you see something that
	 *  can legally follow a call to r *or* any rule that called r.
	 *  You don't want the exact set of viable next tokens because the
	 *  input might just be missing a token--you might consume the
	 *  rest of the input looking for one of the missing tokens.
	 *
	 *  Consider grammar:
	 *
	 *  a : '[' b ']'
	 *    | '(' b ')'
	 *    ;
	 *  b : c '^' INT ;
	 *  c : ID
	 *    | INT
	 *    ;
	 *
	 *  At each rule invocation, the set of tokens that could follow
	 *  that rule is pushed on a stack.  Here are the various
	 *  context-sensitive follow sets:
	 *
	 *  FOLLOW(b1_in_a) = FIRST(']') = ']'
	 *  FOLLOW(b2_in_a) = FIRST(')') = ')'
	 *  FOLLOW(c_in_b) = FIRST('^') = '^'
	 *
	 *  Upon erroneous input "[]", the call chain is
	 *
	 *  a -> b -> c
	 *
	 *  and, hence, the follow context stack is:
	 *
	 *  depth     follow set       start of rule execution
	 *    0         <EOF>                    a (from main())
	 *    1          ']'                     b
	 *    2          '^'                     c
	 *
	 *  Notice that ')' is not included, because b would have to have
	 *  been called from a different context in rule a for ')' to be
	 *  included.
	 *
	 *  For error recovery, we cannot consider FOLLOW(c)
	 *  (context-sensitive or otherwise).  We need the combined set of
	 *  all context-sensitive FOLLOW sets--the set of all tokens that
	 *  could follow any reference in the call chain.  We need to
	 *  resync to one of those tokens.  Note that FOLLOW(c)='^' and if
	 *  we resync'd to that token, we'd consume until EOF.  We need to
	 *  sync to context-sensitive FOLLOWs for a, b, and c: {']','^'}.
	 *  In this case, for input "[]", LA(1) is ']' and in the set, so we would
	 *  not consume anything. After printing an error, rule c would
	 *  return normally.  Rule b would not find the required '^' though.
	 *  At this point, it gets a mismatched token error and throws an
	 *  exception (since LA(1) is not in the viable following token
	 *  set).  The rule exception handler tries to recover, but finds
	 *  the same recovery set and doesn't consume anything.  Rule b
	 *  exits normally returning to rule a.  Now it finds the ']' (and
	 *  with the successful match exits errorRecovery mode).
	 *
	 *  So, you can see that the parser walks up the call chain looking
	 *  for the token that was a member of the recovery set.
	 *
	 *  Errors are not generated in errorRecovery mode.
	 *
	 *  ANTLR's error recovery mechanism is based upon original ideas:
	 *
	 *  "Algorithms + Data Structures = Programs" by Niklaus Wirth
	 *
	 *  and
	 *
	 *  "A note on error recovery in recursive descent parsers":
	 *  http://portal.acm.org/citation.cfm?id=947902.947905
	 *
	 *  Later, Josef Grosch had some good ideas:
	 *
	 *  "Efficient and Comfortable Error Recovery in Recursive Descent
	 *  Parsers":
	 *  ftp://www.cocolab.com/products/cocktail/doca4.ps/ell.ps.zip
	 *
	 *  Like Grosch I implement context-sensitive FOLLOW sets that are combined
	 *  at run-time upon error to avoid overhead during parsing.
	 */
protected IntervalSet getErrorRecoverySet(Parser recognizer) {
    ATN atn = recognizer.getInterpreter().atn;
    RuleContext ctx = recognizer._ctx;
    IntervalSet recoverSet = new IntervalSet();
    while (ctx != null && ctx.invokingState >= 0) {
        // compute what follows who invoked us
        ATNState invokingState = atn.states.get(ctx.invokingState);
        RuleTransition rt = (RuleTransition) invokingState.transition(0);
        IntervalSet follow = atn.nextTokens(rt.followState);
        recoverSet.addAll(follow);
        ctx = ctx.parent;
    }
    recoverSet.remove(Token.EPSILON);
    // System.out.println("recover set "+recoverSet.toString(recognizer.getTokenNames()));
    return recoverSet;
}
Also used : IntervalSet(org.antlr.v4.runtime.misc.IntervalSet) RuleTransition(org.antlr.v4.runtime.atn.RuleTransition) ATN(org.antlr.v4.runtime.atn.ATN) ATNState(org.antlr.v4.runtime.atn.ATNState)

Example 53 with ATNState

use of org.antlr.v4.runtime.atn.ATNState in project antlr4 by antlr.

the class DefaultErrorStrategy method sync.

/**
 * The default implementation of {@link ANTLRErrorStrategy#sync} makes sure
 * that the current lookahead symbol is consistent with what were expecting
 * at this point in the ATN. You can call this anytime but ANTLR only
 * generates code to check before subrules/loops and each iteration.
 *
 * <p>Implements Jim Idle's magic sync mechanism in closures and optional
 * subrules. E.g.,</p>
 *
 * <pre>
 * a : sync ( stuff sync )* ;
 * sync : {consume to what can follow sync} ;
 * </pre>
 *
 * At the start of a sub rule upon error, {@link #sync} performs single
 * token deletion, if possible. If it can't do that, it bails on the current
 * rule and uses the default error recovery, which consumes until the
 * resynchronization set of the current rule.
 *
 * <p>If the sub rule is optional ({@code (...)?}, {@code (...)*}, or block
 * with an empty alternative), then the expected set includes what follows
 * the subrule.</p>
 *
 * <p>During loop iteration, it consumes until it sees a token that can start a
 * sub rule or what follows loop. Yes, that is pretty aggressive. We opt to
 * stay in the loop as long as possible.</p>
 *
 * <p><strong>ORIGINS</strong></p>
 *
 * <p>Previous versions of ANTLR did a poor job of their recovery within loops.
 * A single mismatch token or missing token would force the parser to bail
 * out of the entire rules surrounding the loop. So, for rule</p>
 *
 * <pre>
 * classDef : 'class' ID '{' member* '}'
 * </pre>
 *
 * input with an extra token between members would force the parser to
 * consume until it found the next class definition rather than the next
 * member definition of the current class.
 *
 * <p>This functionality cost a little bit of effort because the parser has to
 * compare token set at the start of the loop and at each iteration. If for
 * some reason speed is suffering for you, you can turn off this
 * functionality by simply overriding this method as a blank { }.</p>
 */
@Override
public void sync(Parser recognizer) throws RecognitionException {
    ATNState s = recognizer.getInterpreter().atn.states.get(recognizer.getState());
    // If already recovering, don't try to sync
    if (inErrorRecoveryMode(recognizer)) {
        return;
    }
    TokenStream tokens = recognizer.getInputStream();
    int la = tokens.LA(1);
    // try cheaper subset first; might get lucky. seems to shave a wee bit off
    IntervalSet nextTokens = recognizer.getATN().nextTokens(s);
    if (nextTokens.contains(la)) {
        // We are sure the token matches
        nextTokensContext = null;
        nextTokensState = ATNState.INVALID_STATE_NUMBER;
        return;
    }
    if (nextTokens.contains(Token.EPSILON)) {
        if (nextTokensContext == null) {
            // It's possible the next token won't match; information tracked
            // by sync is restricted for performance.
            nextTokensContext = recognizer.getContext();
            nextTokensState = recognizer.getState();
        }
        return;
    }
    switch(s.getStateType()) {
        case ATNState.BLOCK_START:
        case ATNState.STAR_BLOCK_START:
        case ATNState.PLUS_BLOCK_START:
        case ATNState.STAR_LOOP_ENTRY:
            // report error and recover if possible
            if (singleTokenDeletion(recognizer) != null) {
                return;
            }
            throw new InputMismatchException(recognizer);
        case ATNState.PLUS_LOOP_BACK:
        case ATNState.STAR_LOOP_BACK:
            // System.err.println("at loop back: "+s.getClass().getSimpleName());
            reportUnwantedToken(recognizer);
            IntervalSet expecting = recognizer.getExpectedTokens();
            IntervalSet whatFollowsLoopIterationOrRule = expecting.or(getErrorRecoverySet(recognizer));
            consumeUntil(recognizer, whatFollowsLoopIterationOrRule);
            break;
        default:
            // do nothing if we can't identify the exact kind of ATN state
            break;
    }
}
Also used : IntervalSet(org.antlr.v4.runtime.misc.IntervalSet) ATNState(org.antlr.v4.runtime.atn.ATNState)

Example 54 with ATNState

use of org.antlr.v4.runtime.atn.ATNState in project antlr4 by antlr.

the class TestATNInterpreter method checkMatchedAlt.

public void checkMatchedAlt(LexerGrammar lg, final Grammar g, String inputString, int expected) {
    ATN lexatn = createATN(lg, true);
    LexerATNSimulator lexInterp = new LexerATNSimulator(lexatn, new DFA[] { new DFA(lexatn.modeToStartState.get(Lexer.DEFAULT_MODE)) }, null);
    IntegerList types = getTokenTypesViaATN(inputString, lexInterp);
    // System.out.println(types);
    g.importVocab(lg);
    ParserATNFactory f = new ParserATNFactory(g);
    ATN atn = f.createATN();
    TokenStream input = new MockIntTokenStream(types);
    // System.out.println("input="+input.types);
    ParserInterpreterForTesting interp = new ParserInterpreterForTesting(g, input);
    ATNState startState = atn.ruleToStartState[g.getRule("a").index];
    if (startState.transition(0).target instanceof BlockStartState) {
        startState = startState.transition(0).target;
    }
    DOTGenerator dot = new DOTGenerator(g);
    // System.out.println(dot.getDOT(atn.ruleToStartState[g.getRule("a").index]));
    Rule r = g.getRule("e");
    // if ( r!=null ) System.out.println(dot.getDOT(atn.ruleToStartState[r.index]));
    int result = interp.matchATN(input, startState);
    assertEquals(expected, result);
}
Also used : MockIntTokenStream(org.antlr.v4.test.runtime.MockIntTokenStream) TokenStream(org.antlr.v4.runtime.TokenStream) DOTGenerator(org.antlr.v4.tool.DOTGenerator) ATNState(org.antlr.v4.runtime.atn.ATNState) ParserATNFactory(org.antlr.v4.automata.ParserATNFactory) LexerATNSimulator(org.antlr.v4.runtime.atn.LexerATNSimulator) IntegerList(org.antlr.v4.runtime.misc.IntegerList) MockIntTokenStream(org.antlr.v4.test.runtime.MockIntTokenStream) RuntimeTestUtils.getTokenTypesViaATN(org.antlr.v4.test.runtime.RuntimeTestUtils.getTokenTypesViaATN) ATN(org.antlr.v4.runtime.atn.ATN) Rule(org.antlr.v4.tool.Rule) BlockStartState(org.antlr.v4.runtime.atn.BlockStartState) DFA(org.antlr.v4.runtime.dfa.DFA)

Example 55 with ATNState

use of org.antlr.v4.runtime.atn.ATNState in project antlr4 by antlr.

the class ParserATNFactory method star.

/**
 * From {@code (blk)*} build {@code ( blk+ )?} with *two* decisions, one for
 * entry and one for choosing alts of {@code blk}.
 *
 * <pre>
 *   |-------------|
 *   v             |
 *   o--[o-blk-o]-&gt;o  o
 *   |                ^
 *   -----------------|
 * </pre>
 *
 * Note that the optional bypass must jump outside the loop as
 * {@code (A|B)*} is not the same thing as {@code (A|B|)+}.
 */
@Override
public Handle star(GrammarAST starAST, Handle elem) {
    StarBlockStartState blkStart = (StarBlockStartState) elem.left;
    BlockEndState blkEnd = (BlockEndState) elem.right;
    preventEpsilonClosureBlocks.add(new Triple<Rule, ATNState, ATNState>(currentRule, blkStart, blkEnd));
    StarLoopEntryState entry = newState(StarLoopEntryState.class, starAST);
    entry.nonGreedy = !((QuantifierAST) starAST).isGreedy();
    atn.defineDecisionState(entry);
    LoopEndState end = newState(LoopEndState.class, starAST);
    StarLoopbackState loop = newState(StarLoopbackState.class, starAST);
    entry.loopBackState = loop;
    end.loopBackState = loop;
    BlockAST blkAST = (BlockAST) starAST.getChild(0);
    if (((QuantifierAST) starAST).isGreedy()) {
        if (expectNonGreedy(blkAST)) {
            g.tool.errMgr.grammarError(ErrorType.EXPECTED_NON_GREEDY_WILDCARD_BLOCK, g.fileName, starAST.getToken(), starAST.getToken().getText());
        }
        // loop enter edge (alt 1)
        epsilon(entry, blkStart);
        // bypass loop edge (alt 2)
        epsilon(entry, end);
    } else {
        // if not greedy, priority to exit branch; make it first
        // bypass loop edge (alt 1)
        epsilon(entry, end);
        // loop enter edge (alt 2)
        epsilon(entry, blkStart);
    }
    // block end hits loop back
    epsilon(blkEnd, loop);
    // loop back to entry/exit decision
    epsilon(loop, entry);
    // decision is to enter/exit; blk is its own decision
    starAST.atnState = entry;
    return new Handle(entry, end);
}
Also used : StarBlockStartState(org.antlr.v4.runtime.atn.StarBlockStartState) LoopEndState(org.antlr.v4.runtime.atn.LoopEndState) StarLoopbackState(org.antlr.v4.runtime.atn.StarLoopbackState) StarLoopEntryState(org.antlr.v4.runtime.atn.StarLoopEntryState) BlockAST(org.antlr.v4.tool.ast.BlockAST) QuantifierAST(org.antlr.v4.tool.ast.QuantifierAST) Rule(org.antlr.v4.tool.Rule) LeftRecursiveRule(org.antlr.v4.tool.LeftRecursiveRule) BlockEndState(org.antlr.v4.runtime.atn.BlockEndState) ATNState(org.antlr.v4.runtime.atn.ATNState)

Aggregations

ATNState (org.antlr.v4.runtime.atn.ATNState)94 IntervalSet (org.antlr.v4.runtime.misc.IntervalSet)34 ATN (org.antlr.v4.runtime.atn.ATN)25 RuleTransition (org.antlr.v4.runtime.atn.RuleTransition)23 Rule (org.antlr.v4.tool.Rule)22 Transition (org.antlr.v4.runtime.atn.Transition)21 NotNull (org.antlr.v4.runtime.misc.NotNull)19 AtomTransition (org.antlr.v4.runtime.atn.AtomTransition)18 RuleStartState (org.antlr.v4.runtime.atn.RuleStartState)17 ActionTransition (org.antlr.v4.runtime.atn.ActionTransition)16 SetTransition (org.antlr.v4.runtime.atn.SetTransition)16 NotSetTransition (org.antlr.v4.runtime.atn.NotSetTransition)15 DecisionState (org.antlr.v4.runtime.atn.DecisionState)11 DOTGenerator (org.antlr.v4.tool.DOTGenerator)10 GrammarAST (org.antlr.v4.tool.ast.GrammarAST)10 ParserATNFactory (org.antlr.v4.automata.ParserATNFactory)9 LeftRecursiveRule (org.antlr.v4.tool.LeftRecursiveRule)9 ArrayList (java.util.ArrayList)8 EpsilonTransition (org.antlr.v4.runtime.atn.EpsilonTransition)8 PrecedencePredicateTransition (org.antlr.v4.runtime.atn.PrecedencePredicateTransition)8