Wednesday, 28 August 2013

ANTLR: How the behavior of this grammar which recognizes suffixes of a Java code be explained?

ANTLR: How the behavior of this grammar which recognizes suffixes of a
Java code be explained?

a week ago I started the following project: a grammar which recognizes
suffixes of a Java code. I used the official ANTLR grammar for Java -
Java.g4 as a baseline and started to add some rules. However, those new
rules also introduced left recursion which I also had to deal with. So
after a couple of days of work I got the following code. When I started
testing I noticed something unusual which I still can't explain. When
given the input { } the parser tells me no viable alternative at input
'<EOF>' but when I switch the order of the terminals in the right-handed
side of the rule s2, particularly if we change the right-handed side from
v2_1 | v2_2 | v2_3 ... to v2_36 | v2_1 | v2_2 ... (the terminal v2_36 is
moved to the first position), the sequence { } gets accepted. My first
thoughts were that Antlr does not backtrack because I noticed that with
the input { } the first version of the parser starts to follow the rule
v2_3 and just reports that nothing is found and does not try to consider
other options (that's what I think but maybe it's not true) like v2_36
which give exactly the positive answer. But after some research I found
out that ANTLR actually does backtrack but only if everything else fails.
At least this is true for v3.3 (read it in official ANTLR paper) but I
guess it's also true for v4. Now I'm a little bit confused. After spending
so many hours on this project I would feel really terrible if I don't make
it work. Can someone gives some kind of tip or something? It would be
greatly appreciated, thanks.

No comments:

Post a Comment