use of org.antlr.v4.runtime.misc.NotNull in project antlr4 by tunnelvisionlabs.
the class DefaultErrorStrategy method reportMissingToken.
/**
* This method is called to report a syntax error which requires the
* insertion of a missing token into the input stream. At the time this
* method is called, the missing token has not yet been inserted. When this
* method returns, {@code recognizer} is in error recovery mode.
*
* <p>This method is called when {@link #singleTokenInsertion} identifies
* single-token insertion as a viable recovery strategy for a mismatched
* input error.</p>
*
* <p>The default implementation simply returns if the handler is already in
* error recovery mode. Otherwise, it calls {@link #beginErrorCondition} to
* enter error recovery mode, followed by calling
* {@link Parser#notifyErrorListeners}.</p>
*
* @param recognizer the parser instance
*/
protected void reportMissingToken(@NotNull Parser recognizer) {
if (inErrorRecoveryMode(recognizer)) {
return;
}
beginErrorCondition(recognizer);
Token t = recognizer.getCurrentToken();
IntervalSet expecting = getExpectedTokens(recognizer);
String msg = "missing " + expecting.toString(recognizer.getVocabulary()) + " at " + getTokenErrorDisplay(t);
recognizer.notifyErrorListeners(t, msg, null);
}
use of org.antlr.v4.runtime.misc.NotNull in project antlr4 by tunnelvisionlabs.
the class DefaultErrorStrategy method singleTokenDeletion.
/**
* This method implements the single-token deletion inline error recovery
* strategy. It is called by {@link #recoverInline} to attempt to recover
* from mismatched input. If this method returns null, the parser and error
* handler state will not have changed. If this method returns non-null,
* {@code recognizer} will <em>not</em> be in error recovery mode since the
* returned token was a successful match.
*
* <p>If the single-token deletion is successful, this method calls
* {@link #reportUnwantedToken} to report the error, followed by
* {@link Parser#consume} to actually "delete" the extraneous token. Then,
* before returning {@link #reportMatch} is called to signal a successful
* match.</p>
*
* @param recognizer the parser instance
* @return the successfully matched {@link Token} instance if single-token
* deletion successfully recovers from the mismatched input, otherwise
* {@code null}
*/
@Nullable
protected Token singleTokenDeletion(@NotNull Parser recognizer) {
int nextTokenType = recognizer.getInputStream().LA(2);
IntervalSet expecting = getExpectedTokens(recognizer);
if (expecting.contains(nextTokenType)) {
reportUnwantedToken(recognizer);
/*
System.err.println("recoverFromMismatchedToken deleting "+
((TokenStream)recognizer.getInputStream()).LT(1)+
" since "+((TokenStream)recognizer.getInputStream()).LT(2)+
" is what we want");
*/
// simply delete extra token
recognizer.consume();
// we want to return the token we're actually matching
Token matchedSymbol = recognizer.getCurrentToken();
// we know current token is correct
reportMatch(recognizer);
return matchedSymbol;
}
return null;
}
use of org.antlr.v4.runtime.misc.NotNull in project antlr4 by tunnelvisionlabs.
the class DefaultErrorStrategy method getMissingSymbol.
/**
* Conjure up a missing token during error recovery.
*
* The recognizer attempts to recover from single missing
* symbols. But, actions might refer to that missing symbol.
* For example, x=ID {f($x);}. The action clearly assumes
* that there has been an identifier matched previously and that
* $x points at that token. If that token is missing, but
* the next token in the stream is what we want we assume that
* this token is missing and we keep going. Because we
* have to return some token to replace the missing token,
* we have to conjure one up. This method gives the user control
* over the tokens returned for missing tokens. Mostly,
* you will want to create something special for identifier
* tokens. For literals such as '{' and ',', the default
* action in the parser or tree parser works. It simply creates
* a CommonToken of the appropriate type. The text will be the token.
* If you change what tokens must be created by the lexer,
* override this method to create the appropriate tokens.
*/
@NotNull
protected Token getMissingSymbol(@NotNull Parser recognizer) {
Token currentSymbol = recognizer.getCurrentToken();
IntervalSet expecting = getExpectedTokens(recognizer);
int expectedTokenType = Token.INVALID_TYPE;
if (!expecting.isNil()) {
// get any element
expectedTokenType = expecting.getMinElement();
}
String tokenText;
if (expectedTokenType == Token.EOF)
tokenText = "<missing EOF>";
else
tokenText = "<missing " + recognizer.getVocabulary().getDisplayName(expectedTokenType) + ">";
Token current = currentSymbol;
Token lookback = recognizer.getInputStream().LT(-1);
if (current.getType() == Token.EOF && lookback != null) {
current = lookback;
}
return constructToken(recognizer.getInputStream().getTokenSource(), expectedTokenType, tokenText, current);
}
use of org.antlr.v4.runtime.misc.NotNull in project antlr4 by tunnelvisionlabs.
the class DefaultErrorStrategy method getErrorRecoverySet.
/* Compute the error recovery set for the current rule. During
* rule invocation, the parser pushes the set of tokens that can
* follow that rule reference on the stack; this amounts to
* computing FIRST of what follows the rule reference in the
* enclosing rule. See LinearApproximator.FIRST().
* This local follow set only includes tokens
* from within the rule; i.e., the FIRST computation done by
* ANTLR stops at the end of a rule.
*
* EXAMPLE
*
* When you find a "no viable alt exception", the input is not
* consistent with any of the alternatives for rule r. The best
* thing to do is to consume tokens until you see something that
* can legally follow a call to r *or* any rule that called r.
* You don't want the exact set of viable next tokens because the
* input might just be missing a token--you might consume the
* rest of the input looking for one of the missing tokens.
*
* Consider grammar:
*
* a : '[' b ']'
* | '(' b ')'
* ;
* b : c '^' INT ;
* c : ID
* | INT
* ;
*
* At each rule invocation, the set of tokens that could follow
* that rule is pushed on a stack. Here are the various
* context-sensitive follow sets:
*
* FOLLOW(b1_in_a) = FIRST(']') = ']'
* FOLLOW(b2_in_a) = FIRST(')') = ')'
* FOLLOW(c_in_b) = FIRST('^') = '^'
*
* Upon erroneous input "[]", the call chain is
*
* a -> b -> c
*
* and, hence, the follow context stack is:
*
* depth follow set start of rule execution
* 0 <EOF> a (from main())
* 1 ']' b
* 2 '^' c
*
* Notice that ')' is not included, because b would have to have
* been called from a different context in rule a for ')' to be
* included.
*
* For error recovery, we cannot consider FOLLOW(c)
* (context-sensitive or otherwise). We need the combined set of
* all context-sensitive FOLLOW sets--the set of all tokens that
* could follow any reference in the call chain. We need to
* resync to one of those tokens. Note that FOLLOW(c)='^' and if
* we resync'd to that token, we'd consume until EOF. We need to
* sync to context-sensitive FOLLOWs for a, b, and c: {']','^'}.
* In this case, for input "[]", LA(1) is ']' and in the set, so we would
* not consume anything. After printing an error, rule c would
* return normally. Rule b would not find the required '^' though.
* At this point, it gets a mismatched token error and throws an
* exception (since LA(1) is not in the viable following token
* set). The rule exception handler tries to recover, but finds
* the same recovery set and doesn't consume anything. Rule b
* exits normally returning to rule a. Now it finds the ']' (and
* with the successful match exits errorRecovery mode).
*
* So, you can see that the parser walks up the call chain looking
* for the token that was a member of the recovery set.
*
* Errors are not generated in errorRecovery mode.
*
* ANTLR's error recovery mechanism is based upon original ideas:
*
* "Algorithms + Data Structures = Programs" by Niklaus Wirth
*
* and
*
* "A note on error recovery in recursive descent parsers":
* http://portal.acm.org/citation.cfm?id=947902.947905
*
* Later, Josef Grosch had some good ideas:
*
* "Efficient and Comfortable Error Recovery in Recursive Descent
* Parsers":
* ftp://www.cocolab.com/products/cocktail/doca4.ps/ell.ps.zip
*
* Like Grosch I implement context-sensitive FOLLOW sets that are combined
* at run-time upon error to avoid overhead during parsing.
*/
@NotNull
protected IntervalSet getErrorRecoverySet(@NotNull Parser recognizer) {
ATN atn = recognizer.getInterpreter().atn;
RuleContext ctx = recognizer._ctx;
IntervalSet recoverSet = new IntervalSet();
while (ctx != null && ctx.invokingState >= 0) {
// compute what follows who invoked us
ATNState invokingState = atn.states.get(ctx.invokingState);
RuleTransition rt = (RuleTransition) invokingState.transition(0);
IntervalSet follow = atn.nextTokens(rt.followState);
recoverSet.addAll(follow);
ctx = ctx.parent;
}
recoverSet.remove(Token.EPSILON);
// System.out.println("recover set "+recoverSet.toString(recognizer.getTokenNames()));
return recoverSet;
}
use of org.antlr.v4.runtime.misc.NotNull in project antlr4 by tunnelvisionlabs.
the class LexerATNSimulator method execATN.
protected int execATN(@NotNull CharStream input, @NotNull DFAState ds0) {
// System.out.println("enter exec index "+input.index()+" from "+ds0.configs);
if (debug) {
System.out.format(Locale.getDefault(), "start state closure=%s\n", ds0.configs);
}
if (ds0.isAcceptState()) {
// allow zero-length tokens
captureSimState(prevAccept, input, ds0);
}
int t = input.LA(1);
@NotNull DFAState // s is current/from DFA state
s = ds0;
while (true) {
// while more work
if (debug) {
System.out.format(Locale.getDefault(), "execATN loop starting closure: %s\n", s.configs);
}
// As we move src->trg, src->trg, we keep track of the previous trg to
// avoid looking up the DFA state again, which is expensive.
// If the previous target was already part of the DFA, we might
// be able to avoid doing a reach operation upon t. If s!=null,
// it means that semantic predicates didn't prevent us from
// creating a DFA state. Once we know s!=null, we check to see if
// the DFA state has an edge already for t. If so, we can just reuse
// it's configuration set; there's no point in re-computing it.
// This is kind of like doing DFA simulation within the ATN
// simulation because DFA simulation is really just a way to avoid
// computing reach/closure sets. Technically, once we know that
// we have a previously added DFA state, we could jump over to
// the DFA simulator. But, that would mean popping back and forth
// a lot and making things more complicated algorithmically.
// This optimization makes a lot of sense for loops within DFA.
// A character will take us back to an existing DFA state
// that already has lots of edges out of it. e.g., .* in comments.
DFAState target = getExistingTargetState(s, t);
if (target == null) {
target = computeTargetState(input, s, t);
}
if (target == ERROR) {
break;
}
// end of the token.
if (t != IntStream.EOF) {
consume(input);
}
if (target.isAcceptState()) {
captureSimState(prevAccept, input, target);
if (t == IntStream.EOF) {
break;
}
}
t = input.LA(1);
// flip; current DFA target becomes new src/from state
s = target;
}
return failOrAccept(prevAccept, input, s.configs, t);
}
Aggregations