An Example of ANTLR
ANTLR Introduction
Create a new maven project
add dependency in pom.xml
<dependencies>
<dependency>
<groupId>org.antlr</groupId>
<artifactId>antlr4-maven-plugin</artifactId>
<version>4.7.1</version>
</dependency>
</dependencies>
Create ANTLTR file LabeledExpr.g4
/** Grammars always start with a grammar header. This grammar is called LabeledExpr
* and must match the file name:LabeledExpr.
*/
grammar LabeledExpr;
/** A rule called prog */
prog: stat+ ;
// A rule called stat
stat: expr NEWLINE # printExpr // '#' indicate a label
| ID '=' expr NEWLINE # assign
| NEWLINE # blank
;
// expr rule
expr: expr op=('*' | '/') expr # MulDiv
| expr op=('+' | '-') expr # AddSub
| INT # int
| ID # id
| '(' expr ')' # parens
;
/*
* parser rules start with lower case letters, lexer rules with uppercase *
*/
MUL : '*' ; // ASSIGNS TOKEN NAME TO '*' USED ABOVE IN GRAMMAR
DIV : '/' ;
ADD : '+' ;
SUB : '-' ;
ID : [a-zA-Z]+ ;
INT : [0-9]+ ; // Define token INT as one or more digits.
NEWLINE: '\r' ? '\n' ;
WS : [ \t]+ -> skip; // Define white space rule, toss it out
Parsing Terms
This chapter introduced a number of important language recognition terms.
- Language A language is a set of valid sentences; sentences are composed of phrases,
which are composed of subphrases, and so on. - Grammar A grammar formally defines the syntax rules of a language. Each rule in
a grammar expresses the structure of a subphrase. - Syntax tree or parse tree This represents the structure of the sentence where each
subtree root gives an abstract name to the elements beneath it. The subtree roots
correspond to grammar rule names. The leaves of the tree are symbols or tokens
of the sentence. - Token A token is a vocabulary symbol in a language; these can represent a category
of symbols such as “identifier” or can represent a single operator or keyword. - Lexer or tokenizer This breaks up an input character stream into tokens. A lexer
performs lexical analysis. - Parser A parser checks sentences for membership in a specific language by checking
the sentence’s structure against the rules of a grammar. The best analogy for
parsing is traversing a maze, comparing words of a sentence to words written
along the floor to go from entrance to exit. ANTLR generates top-down parsers
called ALL(*) that can use all remaining input symbols to make decisions. Top-
down parsers are goal-oriented and start matching at the rule associated with
the coarsest construct, such as program or inputFile . - Recursive-descent parser This is a specific kind of top-down parser implemented
with a function for each rule in the grammar.
Lookahead Parsers use lookahead to make decisions by comparing the symbols that
begin each alternative.
Generate files
antlr4 LabeledExpr.g4 -visitor
Generate the following files
-rw-r--r--. 1 houzhizhen root 3884 6月 6 17:53 LabeledExprBaseListener.java
-rw-r--r--. 1 houzhizhen root 1545 6月 6 17:53 LabeledExpr.interp
-rw-r--r--. 1 houzhizhen root 2199 6月 6 17:53 LabeledExprLexer.interp
-rw-r--r--. 1 houzhizhen root 4181 6月 6 17:53 LabeledExprLexer.java
-rw-r--r--. 1 houzhizhen root 115 6月 6 17:53 LabeledExprLexer.tokens
-rw-r--r--. 1 houzhizhen root 3894 6月 6 17:53 LabeledExprListener.java
-rw-r--r--. 1 houzhizhen root 15231 6月 6 17:53 LabeledExprParser.java
-rw-r--r--. 1 houzhizhen root 115 6月 6 17:53 LabeledExpr.tokens
Parsing Process
Use grun LabeledExpr prog -tokens
to Generate tokens
After input grun LabeledExpr prog -tokens
command, you can input the expression you let ANTLR parse. After input 3 * (1 + 2)
, input CTRL + D
to terminate the input.
$ grun LabeledExpr prog -tokens
30 * ( 123 + 4567)
[@0,0:1='30',<INT>,1:0]
[@1,3:3='*',<'*'>,1:3]
[@2,5:5='(',<'('>,1:5]
[@3,7:9='123',<INT>,1:7]
[@4,11:11='+',<'+'>,1:11]
[@5,13:16='4567',<INT>,1:13]
[@6,17:17=')',<')'>,1:17]
[@7,18:18='\n',<NEWLINE>,1:18]
[@8,19:18='<EOF>',<EOF>,2:0]
The meaning of tokens
[@0,0:1='30',<INT>,1:0]
indicates that the token is first token( indexed from 0), goes from character position 0 to 1, has text ‘30’, has token type ‘INT’, is on line 1 from 1, and is at character position 0.
Use grun LabeledExpr prog -gui
to display the parse tree visually in a dialog box
Input 30 * (123 + 4567)
and CTRL + D
will launch the following diaglog box.
Input
a = 2
b = 3
c = 4
a * (b+c)
and CTRL + D
will launch the following diaglog box.
ANTLR Core Notation
Syntax Description
x Match token, rule reference, or subrule x .
x y ... z Match a sequence of rule elements.
(... | ... | ...) Subrule with multiple alternatives.
x ? Match x or skip it.
x * Match x zero or more times.
x + Match x one or more times.
r : ... ; Define rule r .
r : ... | ... | ... ; Define rule r with multiple alternatives.
Core Language Patterns
Pattern Name Description
- Sequence
This is a finite or arbitrarily long sequence of tokens or subphrases. Examples include variable declarations (type followed by identifier) and lists of integers. Here are some sample implementations:x y ... z
//'[' INT+ ']'
// Matlab vector of integers
- Sequence with terminator.
This is an arbitrarily long, potentially empty sequence of tokens or subphrases separated by a token, usually a semicolon or newline. Examples include statement lists from C-like languages and rows of data terminated with newlines. Here are some sample implementations:(statement ';')*
// Java statement list
(row '\n')*
// Lines of data
- Sequence with separator
This is a nonempty arbitrarily long sequence of tokens or subphrases separated by a token, usually a comma, semicolon, or period. Examples include function argument definition lists, function call argument lists, languages where statements are separated but not terminated, and directory names. Here are some sample implementations:expr (',' expr)*
// function call arguments( expr (',' expr)* )?
// optional function call arguments'/'? name ('/' name)*
//simplified directory namestat ('.' stat)*
// SmallTalk statement list Choice
This is a set of alternative phrases. Examples include the different kinds of types, statements, expressions, or XML tags. Here are some sample implementations:type : 'int' | 'float' ;
stat : ifstat | whilestat | 'return' expr ';' ;
expr : '(' expr ')' | INT | ID ;
tag : '<' Name attribute* '>' | '<' '/' Name '>' ;
Token dependency
The presence of one token requires the presence of one or more subsequent tokens. Examples include matching parentheses,curly braces, square bracket, and angle brackets. Here are some sample implementations:'(' expr ')'
// nested expressionID '[' expr ']'
//array index
'{' stat* '}'
//statements grouped in curlies
'<' ID (',' ID)* '>'
//generic type specifier
Nested phrase
This is a self-similar language structure. Examples include expressions, nested Java classes, nested code blocks, and nested Python function definitions. Here are some sample implementations:expr : '(' expr ')' | ID ;
classDef : 'class' ID '{' (classDef|method|field) '}' ;
Calculate values with a Visitor
import java.io.FileInputStream;
import java.io.InputStream;
import org.antlr.v4.runtime.ANTLRInputStream;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.tree.ParseTree;
public class Calc {
public static void main(String[] args) throws Exception {
String inputFile = null;
if ( args.length>0 ) inputFile = args[0];
InputStream is = System.in;
if ( inputFile!=null ) is = new FileInputStream(inputFile);
ANTLRInputStream input = new ANTLRInputStream(is);
LabeledExprLexer lexer = new LabeledExprLexer(input);
CommonTokenStream tokens = new CommonTokenStream(lexer);
LabeledExprParser parser = new LabeledExprParser(tokens);
ParseTree tree = parser.prog(); // parse
//TreeUitls.printlnTree(tree,1);
EvalVisitor eval = new EvalVisitor();
eval.visit(tree);
}
}
Visitor interface
$ antrl4 LabeledExpr.g4 -visitor
generates a visitor interface with a method for each labeled alternative name.
// Generated from LabeledExpr.g4 by ANTLR 4.7.1
import org.antlr.v4.runtime.tree.ParseTreeListener;
/**
* This interface defines a complete listener for a parse tree produced by
* {@link LabeledExprParser}.
*/
public interface LabeledExprListener extends ParseTreeListener {
/**
* Enter a parse tree produced by {@link LabeledExprParser#prog}.
* @param ctx the parse tree
*/
void enterProg(LabeledExprParser.ProgContext ctx);
/**
* Exit a parse tree produced by {@link LabeledExprParser#prog}.
* @param ctx the parse tree
*/
void exitProg(LabeledExprParser.ProgContext ctx);
/**
* Enter a parse tree produced by the {@code printExpr}
* labeled alternative in {@link LabeledExprParser#stat}.
* @param ctx the parse tree
*/
void enterPrintExpr(LabeledExprParser.PrintExprContext ctx);
/**
* Exit a parse tree produced by the {@code printExpr}
* labeled alternative in {@link LabeledExprParser#stat}.
* @param ctx the parse tree
*/
void exitPrintExpr(LabeledExprParser.PrintExprContext ctx);
/**
* Enter a parse tree produced by the {@code assign}
* labeled alternative in {@link LabeledExprParser#stat}.
* @param ctx the parse tree
*/
void enterAssign(LabeledExprParser.AssignContext ctx);
/**
* Exit a parse tree produced by the {@code assign}
* labeled alternative in {@link LabeledExprParser#stat}.
* @param ctx the parse tree
*/
void exitAssign(LabeledExprParser.AssignContext ctx);
/**
* Enter a parse tree produced by the {@code blank}
* labeled alternative in {@link LabeledExprParser#stat}.
* @param ctx the parse tree
*/
void enterBlank(LabeledExprParser.BlankContext ctx);
/**
* Exit a parse tree produced by the {@code blank}
* labeled alternative in {@link LabeledExprParser#stat}.
* @param ctx the parse tree
*/
void exitBlank(LabeledExprParser.BlankContext ctx);
/**
* Enter a parse tree produced by the {@code parens}
* labeled alternative in {@link LabeledExprParser#expr}.
* @param ctx the parse tree
*/
void enterParens(LabeledExprParser.ParensContext ctx);
/**
* Exit a parse tree produced by the {@code parens}
* labeled alternative in {@link LabeledExprParser#expr}.
* @param ctx the parse tree
*/
void exitParens(LabeledExprParser.ParensContext ctx);
/**
* Enter a parse tree produced by the {@code MulDiv}
* labeled alternative in {@link LabeledExprParser#expr}.
* @param ctx the parse tree
*/
void enterMulDiv(LabeledExprParser.MulDivContext ctx);
/**
* Exit a parse tree produced by the {@code MulDiv}
* labeled alternative in {@link LabeledExprParser#expr}.
* @param ctx the parse tree
*/
void exitMulDiv(LabeledExprParser.MulDivContext ctx);
/**
* Enter a parse tree produced by the {@code AddSub}
* labeled alternative in {@link LabeledExprParser#expr}.
* @param ctx the parse tree
*/
void enterAddSub(LabeledExprParser.AddSubContext ctx);
/**
* Exit a parse tree produced by the {@code AddSub}
* labeled alternative in {@link LabeledExprParser#expr}.
* @param ctx the parse tree
*/
void exitAddSub(LabeledExprParser.AddSubContext ctx);
/**
* Enter a parse tree produced by the {@code id}
* labeled alternative in {@link LabeledExprParser#expr}.
* @param ctx the parse tree
*/
void enterId(LabeledExprParser.IdContext ctx);
/**
* Exit a parse tree produced by the {@code id}
* labeled alternative in {@link LabeledExprParser#expr}.
* @param ctx the parse tree
*/
void exitId(LabeledExprParser.IdContext ctx);
/**
* Enter a parse tree produced by the {@code int}
* labeled alternative in {@link LabeledExprParser#expr}.
* @param ctx the parse tree
*/
void enterInt(LabeledExprParser.IntContext ctx);
/**
* Exit a parse tree produced by the {@code int}
* labeled alternative in {@link LabeledExprParser#expr}.
* @param ctx the parse tree
*/
void exitInt(LabeledExprParser.IntContext ctx);
}
Visitor Base
$ antrl4 LabeledExpr.g4 -visitor
generates a default visitor implementation called LabeledExprBaseVisitor
// Generated from LabeledExpr.g4 by ANTLR 4.7.1
import org.antlr.v4.runtime.tree.AbstractParseTreeVisitor;
/**
* This class provides an empty implementation of {@link LabeledExprVisitor},
* which can be extended to create a visitor which only needs to handle a subset
* of the available methods.
*
* @param <T> The return type of the visit operation. Use {@link Void} for
* operations with no return type.
*/
public class LabeledExprBaseVisitor<T> extends AbstractParseTreeVisitor<T> implements LabeledExprVisitor<T> {
/**
* {@inheritDoc}
*
* <p>The default implementation returns the result of calling
* {@link #visitChildren} on {@code ctx}.</p>
*/
@Override public T visitProg(LabeledExprParser.ProgContext ctx) { return visitChildren(ctx); }
// ...
}
EvalVisitor
public class EvalVisitor extends LabeledExprBaseVisitor<Integer> {
Map<String, Integer> memory = new HashMap<String, Integer>();
@Override
public Integer visitAssign(LabeledExprParser.AssignContext ctx) {
String id = ctx.ID().getText();
int value = visit(ctx.expr());
memory.put(id, value);
return value;
}
@Override
public Integer visitPrintExpr(LabeledExprParser.PrintExprContext ctx) {
Integer value = visit(ctx.expr());
System.out.println(value);
return value;
}
@Override
public Integer visitInt(LabeledExprParser.IntContext ctx) {
return Integer.valueOf(ctx.INT().getText());
}
@Override
public Integer visitId(LabeledExprParser.IdContext ctx) {
String id = ctx.ID().getText();
if (memory.containsKey(id)) {
return memory.get(id);
}
return 0;
}
@Override
public Integer visitMulDiv(LabeledExprParser.MulDivContext ctx) {
int left = visit(ctx.expr(0));
int right = visit(ctx.expr(1));
if (ctx.op.getType() == LabeledExprParser.MUL) {
return left * right;
} else {
return left / right;
}
}
@Override
public Integer visitAddSub(LabeledExprParser.AddSubContext ctx) {
int left = visit(ctx.expr(0));
int right = visit(ctx.expr(1));
if (ctx.op.getType() == LabeledExprParser.ADD) {
return left + right;
} else {
return left - right;
}
}
@Override
public Integer visitParens(LabeledExprParser.ParensContext ctx) {
return visit(ctx.expr());
}
}
Tree Listener
The biggest difference between the listener and visitor mechanisms is that listener methods are called by the ANTLR-provided walker object, whereas visitor methods must walk their children with explicit visit calls.
Antlr generates listener interface and base listener as visitor.
Generated LabeledExprBaseListener
import java.util.HashMap;
import java.util.Map;
import java.util.Stack;
import org.antlr.v4.runtime.ParserRuleContext;
import org.antlr.v4.runtime.tree.ErrorNode;
import org.antlr.v4.runtime.tree.TerminalNode;
/**
* Created by houzhizhen on 18-6-8.
*/
public class LabeledExprListenerImpl extends LabeledExprBaseListener {
Map<String, Integer> memory = new HashMap<String, Integer>();
Stack<Integer> stack = new Stack<Integer>();
@Override public void enterProg(LabeledExprParser.ProgContext ctx) {
}
/**
* {@inheritDoc}
*
* <p>The default implementation does nothing.</p>
*/
@Override public void exitProg(LabeledExprParser.ProgContext ctx) { }
/**
* {@inheritDoc}
*
* <p>The default implementation does nothing.</p>
*/
@Override public void enterPrintExpr(LabeledExprParser.PrintExprContext ctx) {
ctx.expr();
}
/**
* {@inheritDoc}
*
* <p>The default implementation does nothing.</p>
*/
@Override public void exitPrintExpr(LabeledExprParser.PrintExprContext ctx) { }
/**
* {@inheritDoc}
*
* <p>The default implementation does nothing.</p>
*/
@Override public void enterAssign(LabeledExprParser.AssignContext ctx) { }
/**
* {@inheritDoc}
*
* <p>The default implementation does nothing.</p>
*/
@Override public void exitAssign(LabeledExprParser.AssignContext ctx) {
String id = ctx.ID().getText();
int value = 0;
memory.put(id, value);
}
/**
* {@inheritDoc}
*
* <p>The default implementation does nothing.</p>
*/
@Override public void enterBlank(LabeledExprParser.BlankContext ctx) { }
/**
* {@inheritDoc}
*
* <p>The default implementation does nothing.</p>
*/
@Override public void exitBlank(LabeledExprParser.BlankContext ctx) { }
/**
* {@inheritDoc}
*
* <p>The default implementation does nothing.</p>
*/
@Override public void enterParens(LabeledExprParser.ParensContext ctx) { }
/**
* {@inheritDoc}
*
* <p>The default implementation does nothing.</p>
*/
@Override public void exitParens(LabeledExprParser.ParensContext ctx) { }
/**
* {@inheritDoc}
*
* <p>The default implementation does nothing.</p>
*/
@Override public void enterMulDiv(LabeledExprParser.MulDivContext ctx) {
}
/**
* {@inheritDoc}
*
* <p>The default implementation does nothing.</p>
*/
@Override public void exitMulDiv(LabeledExprParser.MulDivContext ctx) {
int right = stack.pop();
int left = stack.pop();
if (ctx.op.getType() == LabeledExprParser.MUL) {
stack.push(left * right);
} else {
stack.push(left / right) ;
}
}
/**
* {@inheritDoc}
*
* <p>The default implementation does nothing.</p>
*/
@Override public void enterAddSub(LabeledExprParser.AddSubContext ctx) { }
/**
* {@inheritDoc}
*
* <p>The default implementation does nothing.</p>
*/
@Override public void exitAddSub(LabeledExprParser.AddSubContext ctx) {
int right = stack.pop();
int left = stack.pop();
if (ctx.op.getType() == LabeledExprParser.ADD) {
stack.push(left + right);
} else {
stack.push(left + right) ;
}
}
/**
* {@inheritDoc}
*
* <p>The default implementation does nothing.</p>
*/
@Override public void enterId(LabeledExprParser.IdContext ctx) { }
/**
* {@inheritDoc}
*
* <p>The default implementation does nothing.</p>
*/
@Override public void exitId(LabeledExprParser.IdContext ctx) {
Integer value = memory.get(ctx.ID().getText());
if (value == null) {
throw new RuntimeException(ctx.ID().getText() + " is not specified a value yet.");
}
stack.push(value);
}
/**
* {@inheritDoc}
*
* <p>The default implementation does nothing.</p>
*/
@Override public void enterInt(LabeledExprParser.IntContext ctx) {
stack.push( Integer.valueOf(ctx.INT().getText()));
}
/**
* {@inheritDoc}
*
* <p>The default implementation does nothing.</p>
*/
@Override public void exitInt(LabeledExprParser.IntContext ctx) { }
/**
* {@inheritDoc}
*
* <p>The default implementation does nothing.</p>
*/
@Override public void enterEveryRule(ParserRuleContext ctx) { }
/**
* {@inheritDoc}
*
* <p>The default implementation does nothing.</p>
*/
@Override public void exitEveryRule(ParserRuleContext ctx) { }
/**
* {@inheritDoc}
*
* <p>The default implementation does nothing.</p>
*/
@Override public void visitTerminal(TerminalNode node) { }
/**
* {@inheritDoc}
*
* <p>The default implementation does nothing.</p>
*/
@Override public void visitErrorNode(ErrorNode node) { }
public int getValue() {
return 0;
}
}
CalculatorWithListenerTest
public static void main(String[] args) throws Exception {
String inputFile = null;
if ( args.length>0 ) inputFile = args[0];
InputStream is = System.in;
if ( inputFile!=null ) is = new FileInputStream(inputFile);
ANTLRInputStream input = new ANTLRInputStream(is);
LabeledExprLexer lexer = new LabeledExprLexer(input);
CommonTokenStream tokens = new CommonTokenStream(lexer);
LabeledExprParser parser = new LabeledExprParser(tokens);
ParseTree tree = parser.prog(); // parse
ParseTreeWalker walker = new ParseTreeWalker();
LabeledExprListenerImpl listener = new LabeledExprListenerImpl();
walker.walk(listener , tree);
System.out.println(listener.stack.pop());
}
The input and output of CalculatorWithListenerTest.
2 * (2 + 14 /2)
^D
18
ANTLR History
- https://github.com/antlr/antlr4/ is released in year 2013.
- Hive Parser uses ANTLR3.
- Spark SQL uses ANTLR4.
下一篇: OpenCV example
推荐阅读
-
注册微软账户失败无法查看验证example.com邮箱该怎么办?
-
springboot 报错 org.springframework.beans.factory.NoSuchBeanDefinitionException:No qualifying bean of type 'com.example.service.HrService' avai
-
[PHP] small example to show dynamic type change
-
2017-12-02 编程语言试验之Antlr4+JavaScript实现"圈4"
-
Example: Getting WMI Data from the Local Computer
-
Android Studio导入google training example gradle失败问题
-
Field testMapper in com.example.demo.service.TestService required a bean of type ‘com.example.demo.m
-
MyBatis Generator Example.Criteria 查询条件复制
-
An example of polybase for Oracle
-
example2.php