MiniJava Syntax


The syntactic grammar for MiniJava is given here. Its terminal symbols are the Java tokens defined by the lexical grammar of the Java Language Specification. It defines a set of productions, starting from the goal symbol Goal, that describe how sequences of tokens can form syntactically correct MiniJava programs. This is an LALR(1) grammar (apart from ambiguities that can be resolved by precedence and associativity). It is not an LL(1) grammar because it contains left recursion.


Grammar Notation

Terminal symbols are shown in a fixed-width font in the productions of the lexical and syntactic grammars, and throughout this specification wherever the text refers directly to such a terminal symbol. They must appear in a program exactly as written.

Nonterminal symbols are shown in italic type. The definition of a nonterminal is introduced by the name of the nonterminal being defined, followed by a colon. One or more alternative right-hand sides for the nonterminal then follow on succeeding lines. For example, the syntactic definition:

    IfThenStatement:
        if ( Expression ) Statement

states that the nonterminal IfThenStatement represents the token if, followed by a left parenthesis token, followed by an Expression, followed by a right parenthesis token, followed by a Statement.

As another example, the syntactic definition:

    ArgumentList:
        Argument
        ArgumentList , Argument

states that an ArgumentList may represent either a single Argument or an ArgumentList, followed by a comma token, followed by an Argument. This definition of ArgumentList is recursive; that is, it is defined in terms of itself. The result is that an ArgumentList may contain any positive number of arguments. Such recursive definitions of nonterminals are common. Moreover, this definition is left-recursive, which means it cannot be used directly with a top-down (LL) parser generator such as JavaCC. Instead, the grammar must be transformed to eliminate left recursion (and to left-factor common prefixes).
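As an illustration of eliminating the left recursion described above, the ArgumentList rule can be rewritten as one Argument followed by zero or more ", Argument" pairs, which a top-down parser can handle with a simple loop. The following is only a minimal hand-written sketch, not the course's parser: the class name ArgumentListParser, the token representation (a list of strings), and the simplification that an Argument is a single non-comma token are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch of a top-down parse of ArgumentList after rewriting the
 * left-recursive rule as: Argument followed by zero or more ", Argument".
 * Hypothetical class; tokens are plain strings for illustration only.
 */
public class ArgumentListParser {
    private final List<String> tokens;
    private int pos = 0;

    public ArgumentListParser(List<String> tokens) {
        this.tokens = tokens;
    }

    /** Parses one Argument, then repeats while the next token is a comma. */
    public List<String> parseArgumentList() {
        List<String> arguments = new ArrayList<>();
        arguments.add(parseArgument());
        while (pos < tokens.size() && tokens.get(pos).equals(",")) {
            pos++; // consume the comma token
            arguments.add(parseArgument());
        }
        return arguments;
    }

    /** Assumption: an Argument is a single token that is not a comma. */
    private String parseArgument() {
        if (pos >= tokens.size() || tokens.get(pos).equals(",")) {
            throw new IllegalArgumentException("Argument expected at token " + pos);
        }
        return tokens.get(pos++);
    }

    public static void main(String[] args) {
        ArgumentListParser parser =
                new ArgumentListParser(List.of("a", ",", "b", ",", "c"));
        System.out.println(parser.parseArgumentList()); // prints [a, b, c]
    }
}
```

The loop accepts the same token sequences as the left-recursive rule (one or more arguments separated by commas) but never calls itself before consuming a token, so it cannot recurse indefinitely.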

The grammar below uses the following BNF-style conventions:


The Syntactic Grammar

Lexical Structure

The Java lexical definition describes tokens for Identifier, and the various literals:

MiniJava accepts identifiers consisting of ASCII characters only; integer literals, including octal and hexadecimal forms; and character and string literals consisting of ASCII characters only, including escape sequences.
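The restrictions on identifiers and integer literals can be approximated with regular expressions. The sketch below is an assumption-laden illustration rather than the official token definitions: the class name MiniJavaLexemes and its methods are hypothetical, the patterns omit the long suffix (l or L) that Java's lexical grammar allows, and the identifier check does not exclude reserved words.

```java
import java.util.regex.Pattern;

/** Hypothetical helper approximating ASCII-only MiniJava lexemes. */
public class MiniJavaLexemes {
    // ASCII Java letter (letters, '_' or '$') followed by letters or digits.
    static final Pattern IDENTIFIER =
            Pattern.compile("[A-Za-z_$][A-Za-z0-9_$]*");

    // Hexadecimal (0x...), octal (leading 0), or decimal integer literal.
    // Assumption: no type suffix and no digit separators.
    static final Pattern INTEGER_LITERAL =
            Pattern.compile("0[xX][0-9a-fA-F]+|0[0-7]+|0|[1-9][0-9]*");

    static boolean isIdentifier(String s) {
        return IDENTIFIER.matcher(s).matches();
    }

    static boolean isIntegerLiteral(String s) {
        return INTEGER_LITERAL.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isIdentifier("count_1"));   // true
        System.out.println(isIdentifier("1count"));    // false: starts with a digit
        System.out.println(isIntegerLiteral("0x1F"));  // true: hexadecimal
        System.out.println(isIntegerLiteral("017"));   // true: octal
        System.out.println(isIntegerLiteral("08"));    // false: 8 is not an octal digit
    }
}
```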

Types

Names

Packages

Modifiers

Classes

Class Declaration

Field Declarations

Method Declarations

Blocks and Statements

Expressions