
Example 31 with IntervalSet

use of org.antlr.v4.runtime.misc.IntervalSet in project antlr4 by antlr.

the class DefaultErrorStrategy method getMissingSymbol.

/** Conjure up a missing token during error recovery.
	 *
	 *  The recognizer attempts to recover from single missing
	 *  symbols. But, actions might refer to that missing symbol.
	 *  For example, x=ID {f($x);}. The action clearly assumes
	 *  that there has been an identifier matched previously and that
	 *  $x points at that token. If that token is missing, but
	 *  the next token in the stream is what we want, we assume that
	 *  this token is missing and we keep going. Because we
	 *  have to return some token to replace the missing token,
	 *  we have to conjure one up. This method gives the user control
	 *  over the tokens returned for missing tokens. Mostly,
	 *  you will want to create something special for identifier
	 *  tokens. For literals such as '{' and ',', the default
	 *  action in the parser or tree parser works. It simply creates
	 *  a CommonToken of the appropriate type. The text will be the token.
	 *  If you change what tokens must be created by the lexer,
	 *  override this method to create the appropriate tokens.
	 */
protected Token getMissingSymbol(Parser recognizer) {
    Token currentSymbol = recognizer.getCurrentToken();
    IntervalSet expecting = getExpectedTokens(recognizer);
    int expectedTokenType = Token.INVALID_TYPE;
    if (!expecting.isNil()) {
        // get any element
        expectedTokenType = expecting.getMinElement();
    }
    String tokenText;
    if (expectedTokenType == Token.EOF)
        tokenText = "<missing EOF>";
    else
        tokenText = "<missing " + recognizer.getVocabulary().getDisplayName(expectedTokenType) + ">";
    Token current = currentSymbol;
    Token lookback = recognizer.getInputStream().LT(-1);
    if (current.getType() == Token.EOF && lookback != null) {
        current = lookback;
    }
    return recognizer.getTokenFactory().create(
        new Pair<TokenSource, CharStream>(current.getTokenSource(), current.getTokenSource().getInputStream()),
        expectedTokenType, tokenText, Token.DEFAULT_CHANNEL,
        -1, -1,
        current.getLine(), current.getCharPositionInLine());
}
Also used : IntervalSet(org.antlr.v4.runtime.misc.IntervalSet)
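
The selection logic in getMissingSymbol can be tried out without the ANTLR runtime. The following is a minimal sketch, assuming a plain TreeSet in place of IntervalSet and a Map in place of the recognizer's vocabulary; the class MissingSymbolSketch and its method names are hypothetical and are not part of ANTLR. The constants mirror the values of Token.INVALID_TYPE and Token.EOF.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Simplified model of picking a placeholder for a missing token; not ANTLR's API. */
final class MissingSymbolSketch {
    static final int INVALID_TYPE = 0; // mirrors Token.INVALID_TYPE
    static final int EOF = -1;         // mirrors Token.EOF

    /** Returns the smallest expected token type, or INVALID_TYPE if nothing is expected. */
    static int pickExpectedType(TreeSet<Integer> expecting) {
        return expecting.isEmpty() ? INVALID_TYPE : expecting.first();
    }

    /** Builds the "<missing X>" text that the conjured token carries. */
    static String missingText(int type, Map<Integer, String> displayNames) {
        if (type == EOF) {
            return "<missing EOF>";
        }
        // Assumption: fall back to the numeric type when no display name is known.
        return "<missing " + displayNames.getOrDefault(type, String.valueOf(type)) + ">";
    }

    public static void main(String[] args) {
        TreeSet<Integer> expecting = new TreeSet<>(List.of(7, 3, 5));
        Map<Integer, String> names = Map.of(3, "ID", 5, "'{'", 7, "','");
        int type = pickExpectedType(expecting);
        System.out.println(missingText(type, names)); // prints <missing ID>
    }
}
```

Taking the minimum element is an arbitrary but deterministic choice, which matches the "get any element" comment in the method above.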

Example 32 with IntervalSet

use of org.antlr.v4.runtime.misc.IntervalSet in project antlr4 by antlr.

the class DefaultErrorStrategy method sync.

/**
	 * The default implementation of {@link ANTLRErrorStrategy#sync} makes sure
	 * that the current lookahead symbol is consistent with what we're expecting
	 * at this point in the ATN. You can call this anytime but ANTLR only
	 * generates code to check before subrules/loops and each iteration.
	 *
	 * <p>Implements Jim Idle's magic sync mechanism in closures and optional
	 * subrules. E.g.,</p>
	 *
	 * <pre>
	 * a : sync ( stuff sync )* ;
	 * sync : {consume to what can follow sync} ;
	 * </pre>
	 *
	 * At the start of a sub rule upon error, {@link #sync} performs single
	 * token deletion, if possible. If it can't do that, it bails on the current
	 * rule and uses the default error recovery, which consumes until the
	 * resynchronization set of the current rule.
	 *
	 * <p>If the sub rule is optional ({@code (...)?}, {@code (...)*}, or block
	 * with an empty alternative), then the expected set includes what follows
	 * the subrule.</p>
	 *
	 * <p>During loop iteration, it consumes until it sees a token that can start a
	 * sub rule or what follows loop. Yes, that is pretty aggressive. We opt to
	 * stay in the loop as long as possible.</p>
	 *
	 * <p><strong>ORIGINS</strong></p>
	 *
	 * <p>Previous versions of ANTLR did a poor job of their recovery within loops.
	 * A single mismatched or missing token would force the parser to bail
	 * out of the entire rule surrounding the loop. So, for rule</p>
	 *
	 * <pre>
	 * classDef : 'class' ID '{' member* '}'
	 * </pre>
	 *
	 * input with an extra token between members would force the parser to
	 * consume until it found the next class definition rather than the next
	 * member definition of the current class.
	 *
	 * <p>This functionality costs a little bit of effort because the parser has to
	 * compare token sets at the start of the loop and at each iteration. If for
	 * some reason speed is suffering for you, you can turn off this
	 * functionality by simply overriding this method as a blank { }.</p>
	 */
@Override
public void sync(Parser recognizer) throws RecognitionException {
    ATNState s = recognizer.getInterpreter().atn.states.get(recognizer.getState());
    // If already recovering, don't try to sync
    if (inErrorRecoveryMode(recognizer)) {
        return;
    }
    TokenStream tokens = recognizer.getInputStream();
    int la = tokens.LA(1);
    // try cheaper subset first; might get lucky. seems to shave a wee bit off
    IntervalSet nextTokens = recognizer.getATN().nextTokens(s);
    if (nextTokens.contains(Token.EPSILON) || nextTokens.contains(la)) {
        return;
    }
    switch(s.getStateType()) {
        case ATNState.BLOCK_START:
        case ATNState.STAR_BLOCK_START:
        case ATNState.PLUS_BLOCK_START:
        case ATNState.STAR_LOOP_ENTRY:
            // report error and recover if possible
            if (singleTokenDeletion(recognizer) != null) {
                return;
            }
            throw new InputMismatchException(recognizer);
        case ATNState.PLUS_LOOP_BACK:
        case ATNState.STAR_LOOP_BACK:
            //			System.err.println("at loop back: "+s.getClass().getSimpleName());
            reportUnwantedToken(recognizer);
            IntervalSet expecting = recognizer.getExpectedTokens();
            IntervalSet whatFollowsLoopIterationOrRule = expecting.or(getErrorRecoverySet(recognizer));
            consumeUntil(recognizer, whatFollowsLoopIterationOrRule);
            break;
        default:
            // do nothing if we can't identify the exact kind of ATN state
            break;
    }
}
Also used : IntervalSet(org.antlr.v4.runtime.misc.IntervalSet) ATNState(org.antlr.v4.runtime.atn.ATNState)
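
The loop-back branch of sync boils down to "skip tokens until one is in the union of the expected set and the recovery set, or EOF is reached". The following is a rough sketch of that idea only, assuming tokens are plain integer types in a List; LoopResyncSketch and its methods are hypothetical names, not ANTLR classes, and error reporting is left out.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of resynchronizing inside a loop; not ANTLR's API. */
final class LoopResyncSketch {
    static final int EOF = -1; // mirrors Token.EOF

    /** Union of what the loop expects and what may follow the enclosing rules. */
    static Set<Integer> union(Set<Integer> expecting, Set<Integer> recoverySet) {
        Set<Integer> result = new HashSet<>(expecting);
        result.addAll(recoverySet);
        return result;
    }

    /**
     * Returns the index of the first token at or after {@code start} whose type
     * is in {@code resyncSet}. Stops early at EOF or at the end of the list.
     */
    static int consumeUntil(List<Integer> tokens, int start, Set<Integer> resyncSet) {
        int i = start;
        while (i < tokens.size()) {
            int type = tokens.get(i);
            if (type == EOF || resyncSet.contains(type)) {
                break;
            }
            i++; // "consume" the unwanted token
        }
        return i;
    }

    public static void main(String[] args) {
        // Hypothetical token types: 1 = member start, 2 = '}', 9 = junk.
        List<Integer> tokens = List.of(1, 9, 9, 1, 2, EOF);
        Set<Integer> resync = union(Set.of(1), Set.of(2));
        System.out.println(consumeUntil(tokens, 1, resync)); // prints 3
    }
}
```

In the usage above the two junk tokens are skipped and parsing would resume at the next member, which is the "stay in the loop as long as possible" behavior the Javadoc describes.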

Example 33 with IntervalSet

use of org.antlr.v4.runtime.misc.IntervalSet in project antlr4 by antlr.

the class Parser method isExpectedToken.

/**
	 * Checks whether or not {@code symbol} can follow the current state in the
	 * ATN. The behavior of this method is equivalent to the following, but is
	 * implemented such that the complete context-sensitive follow set does not
	 * need to be explicitly constructed.
	 *
	 * <pre>
	 * return getExpectedTokens().contains(symbol);
	 * </pre>
	 *
	 * @param symbol the symbol type to check
	 * @return {@code true} if {@code symbol} can follow the current state in
	 * the ATN, otherwise {@code false}.
	 */
public boolean isExpectedToken(int symbol) {
    //   		return getInterpreter().atn.nextTokens(_ctx);
    ATN atn = getInterpreter().atn;
    ParserRuleContext ctx = _ctx;
    ATNState s = atn.states.get(getState());
    IntervalSet following = atn.nextTokens(s);
    if (following.contains(symbol)) {
        return true;
    }
    //        System.out.println("following "+s+"="+following);
    if (!following.contains(Token.EPSILON))
        return false;
    while (ctx != null && ctx.invokingState >= 0 && following.contains(Token.EPSILON)) {
        ATNState invokingState = atn.states.get(ctx.invokingState);
        RuleTransition rt = (RuleTransition) invokingState.transition(0);
        following = atn.nextTokens(rt.followState);
        if (following.contains(symbol)) {
            return true;
        }
        ctx = (ParserRuleContext) ctx.parent;
    }
    if (following.contains(Token.EPSILON) && symbol == Token.EOF) {
        return true;
    }
    return false;
}
Also used : IntervalSet(org.antlr.v4.runtime.misc.IntervalSet) RuleTransition(org.antlr.v4.runtime.atn.RuleTransition) ATN(org.antlr.v4.runtime.atn.ATN) ATNState(org.antlr.v4.runtime.atn.ATNState)
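
The early-exit walk in isExpectedToken can be illustrated with ordinary collections. The following is a simplified sketch, assuming the ATN has already been reduced to one "local" token set for the current state plus a list of follow sets for the invoking rules, innermost first; ExpectedTokenSketch is a hypothetical name, and the constants mirror Token.EPSILON and Token.EOF.

```java
import java.util.List;
import java.util.Set;

/** Toy model of checking a symbol against the context-sensitive follow; not ANTLR's API. */
final class ExpectedTokenSketch {
    static final int EPSILON = -2; // mirrors Token.EPSILON: "the end of the rule is reachable"
    static final int EOF = -1;     // mirrors Token.EOF

    /**
     * @param symbol        token type to check
     * @param local         tokens reachable from the current state inside the current rule
     * @param callerFollows for each invoking rule, innermost first, the tokens that can
     *                      follow the rule invocation
     */
    static boolean isExpected(int symbol, Set<Integer> local, List<Set<Integer>> callerFollows) {
        Set<Integer> following = local;
        if (following.contains(symbol)) {
            return true;
        }
        for (Set<Integer> callerFollow : callerFollows) {
            if (!following.contains(EPSILON)) {
                return false; // the current rule cannot end here, so callers are irrelevant
            }
            following = callerFollow;
            if (following.contains(symbol)) {
                return true;
            }
        }
        // If even the outermost rule can end here, only EOF may follow.
        return following.contains(EPSILON) && symbol == EOF;
    }

    public static void main(String[] args) {
        Set<Integer> local = Set.of(10, EPSILON);
        List<Set<Integer>> callers = List.of(Set.of(20, EPSILON), Set.of(30));
        System.out.println(isExpected(30, local, callers)); // prints true
        System.out.println(isExpected(40, local, callers)); // prints false
    }
}
```

The walk returns as soon as the symbol is found, so no combined set is ever allocated, which is the point the Javadoc makes about not constructing the complete follow set.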

Example 34 with IntervalSet

use of org.antlr.v4.runtime.misc.IntervalSet in project antlr4 by antlr.

the class ATN method nextTokens.

/** Compute the set of valid tokens that can occur starting in state {@code s}.
	 *  If {@code ctx} is null, the set of tokens will not include what can follow
	 *  the rule surrounding {@code s}. In other words, the set will be
	 *  restricted to tokens reachable staying within {@code s}'s rule.
	 */
public IntervalSet nextTokens(ATNState s, RuleContext ctx) {
    LL1Analyzer analyzer = new LL1Analyzer(this);
    return analyzer.LOOK(s, ctx);
}
Also used : IntervalSet(org.antlr.v4.runtime.misc.IntervalSet)
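
To make the "restricted to tokens reachable staying within the rule" case concrete, here is a toy lookahead computation over a small hand-built graph. It is only a sketch under stated assumptions: the graph, the Edge class and LookSketch are hypothetical stand-ins for the ATN and LL1Analyzer, rule invocations and predicates are ignored, and reaching the rule's stop state is recorded as EPSILON, which is how the callers in the other examples on this page detect that a rule can end.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy lookahead over a graph with token and epsilon edges; not ANTLR's LL1Analyzer. */
final class LookSketch {
    static final int EPSILON = -2; // mirrors Token.EPSILON

    /** A labeled edge; a label of EPSILON means no input is consumed. */
    static final class Edge {
        final int label;
        final int target;

        Edge(int label, int target) {
            this.label = label;
            this.target = target;
        }
    }

    /** Collects the token labels reachable from {@code start} through epsilon edges only. */
    static Set<Integer> look(Map<Integer, List<Edge>> graph, int start, int stopState) {
        Set<Integer> result = new TreeSet<>();
        Set<Integer> seen = new HashSet<>();
        Deque<Integer> work = new ArrayDeque<>();
        work.push(start);
        while (!work.isEmpty()) {
            int state = work.pop();
            if (!seen.add(state)) {
                continue; // already visited; guards against epsilon cycles
            }
            if (state == stopState) {
                result.add(EPSILON); // the rule can end without consuming a token
                continue;
            }
            for (Edge e : graph.getOrDefault(state, List.of())) {
                if (e.label == EPSILON) {
                    work.push(e.target);
                } else {
                    result.add(e.label);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // State 0 branches to a token edge (type 5) or straight to the stop state 9.
        Map<Integer, List<Edge>> graph = Map.of(
            0, List.of(new Edge(EPSILON, 1), new Edge(EPSILON, 9)),
            1, List.of(new Edge(5, 2)));
        System.out.println(look(graph, 0, 9)); // prints [-2, 5]
    }
}
```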

Example 35 with IntervalSet

use of org.antlr.v4.runtime.misc.IntervalSet in project antlr4 by antlr.

the class ATN method getExpectedTokens.

/**
	 * Computes the set of input symbols which could follow ATN state number
	 * {@code stateNumber} in the specified full {@code context}. This method
	 * considers the complete parser context, but does not evaluate semantic
	 * predicates (i.e. all predicates encountered during the calculation are
	 * assumed true). If a path in the ATN exists from the starting state to the
	 * {@link RuleStopState} of the outermost context without matching any
	 * symbols, {@link Token#EOF} is added to the returned set.
	 *
	 * <p>If {@code context} is {@code null}, it is treated as {@link ParserRuleContext#EMPTY}.</p>
	 *
	 * Note that this does NOT give you the set of all tokens that could
	 * appear at a given token position in the input phrase.  In other words,
	 * it does not answer:
	 *
	 *   "Given a specific partial input phrase, return the set of all tokens
	 *    that can follow the last token in the input phrase."
	 *
	 * The big difference is that with just the input, the parser could
	 * land right in the middle of a lookahead decision. Getting
	 * all *possible* tokens given a partial input stream is a separate
	 * computation. See https://github.com/antlr/antlr4/issues/1428
	 *
	 * For this function, we are specifying an ATN state and call stack to compute
	 * what token(s) can come next and specifically: outside of a lookahead decision.
	 * That is what you want for error reporting and recovery upon parse error.
	 *
	 * @param stateNumber the ATN state number
	 * @param context the full parse context
	 * @return The set of potentially valid input symbols which could follow the
	 * specified state in the specified context.
	 * @throws IllegalArgumentException if the ATN does not contain a state with
	 * number {@code stateNumber}
	 */
public IntervalSet getExpectedTokens(int stateNumber, RuleContext context) {
    if (stateNumber < 0 || stateNumber >= states.size()) {
        throw new IllegalArgumentException("Invalid state number.");
    }
    RuleContext ctx = context;
    ATNState s = states.get(stateNumber);
    IntervalSet following = nextTokens(s);
    if (!following.contains(Token.EPSILON)) {
        return following;
    }
    IntervalSet expected = new IntervalSet();
    expected.addAll(following);
    expected.remove(Token.EPSILON);
    while (ctx != null && ctx.invokingState >= 0 && following.contains(Token.EPSILON)) {
        ATNState invokingState = states.get(ctx.invokingState);
        RuleTransition rt = (RuleTransition) invokingState.transition(0);
        following = nextTokens(rt.followState);
        expected.addAll(following);
        expected.remove(Token.EPSILON);
        ctx = ctx.parent;
    }
    if (following.contains(Token.EPSILON)) {
        expected.add(Token.EOF);
    }
    return expected;
}
Also used : ParserRuleContext(org.antlr.v4.runtime.ParserRuleContext) RuleContext(org.antlr.v4.runtime.RuleContext) IntervalSet(org.antlr.v4.runtime.misc.IntervalSet)
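
The accumulation performed by getExpectedTokens can be sketched with plain sets. This is an illustrative model only, assuming the per-state and per-caller follow sets are already known; ExpectedSetSketch is a hypothetical name rather than an ANTLR class, and the constants mirror Token.EPSILON and Token.EOF.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of building the expected-token set along the call stack; not ANTLR's API. */
final class ExpectedSetSketch {
    static final int EPSILON = -2; // mirrors Token.EPSILON
    static final int EOF = -1;     // mirrors Token.EOF

    /**
     * @param local         tokens reachable from the current state inside the current rule
     * @param callerFollows for each invoking rule, innermost first, the tokens that can
     *                      follow the rule invocation
     */
    static Set<Integer> expectedTokens(Set<Integer> local, List<Set<Integer>> callerFollows) {
        if (!local.contains(EPSILON)) {
            return new TreeSet<>(local); // the rule cannot end here; the local set is the answer
        }
        Set<Integer> expected = new TreeSet<>(local);
        expected.remove(EPSILON);
        Set<Integer> following = local;
        for (Set<Integer> callerFollow : callerFollows) {
            if (!following.contains(EPSILON)) {
                break; // this caller cannot end here, so outer callers contribute nothing
            }
            following = callerFollow;
            expected.addAll(following);
            expected.remove(EPSILON); // EPSILON is a marker, never a real expected token
        }
        if (following.contains(EPSILON)) {
            expected.add(EOF); // every enclosing rule can end, so end of input is valid
        }
        return expected;
    }

    public static void main(String[] args) {
        Set<Integer> local = Set.of(10, EPSILON);
        System.out.println(expectedTokens(local, List.of(Set.of(20, EPSILON), Set.of(30))));
        // prints [10, 20, 30]
        System.out.println(expectedTokens(local, List.of(Set.of(20, EPSILON))));
        // prints [-1, 10, 20]
    }
}
```

Unlike a membership check, this version has to visit every caller whose inner rule can end, because each one may add tokens to the result.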

Aggregations

IntervalSet (org.antlr.v4.runtime.misc.IntervalSet): 84
Test (org.junit.Test): 48
ATNState (org.antlr.v4.runtime.atn.ATNState): 11
GrammarAST (org.antlr.v4.tool.ast.GrammarAST): 10
ATN (org.antlr.v4.runtime.atn.ATN): 8
ArrayList (java.util.ArrayList): 7
Grammar (org.antlr.v4.tool.Grammar): 7
Interval (org.antlr.v4.runtime.misc.Interval): 6
SetTransition (org.antlr.v4.runtime.atn.SetTransition): 5
UnicodeSet (com.ibm.icu.text.UnicodeSet): 4
HashMap (java.util.HashMap): 4
Token (org.antlr.runtime.Token): 4
NotSetTransition (org.antlr.v4.runtime.atn.NotSetTransition): 4
BaseJavaTest (org.antlr.v4.test.runtime.java.BaseJavaTest): 4
LinkedHashMap (java.util.LinkedHashMap): 3
ParserRuleContext (org.antlr.v4.runtime.ParserRuleContext): 3
AtomTransition (org.antlr.v4.runtime.atn.AtomTransition): 3
DecisionState (org.antlr.v4.runtime.atn.DecisionState): 3
RuleTransition (org.antlr.v4.runtime.atn.RuleTransition): 3
Transition (org.antlr.v4.runtime.atn.Transition): 3