Care should be taken, when using colons and semicolons in the same sentence, that the reader understands how far the force of each sign carries. —Robert Graves and Alan Hodge
Keywords: array break class const def do else enum extends for if loop method new of override record ref repeat return then to type until var while
Reserved identifiers: boolean char false int nil true
Operators: || && ! == != < <= > >= + - * / % ^ := = ( ) [ ] { } ; , : .
A comment is an arbitrary character sequence opened by /* and closed by */. Comments can be nested and can extend over more than one line.
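As an illustration of the nesting rule, here is a minimal Python sketch of how a scanner might skip a comment. The function name `skip_comment` and the choice to raise `ValueError` on an unterminated comment are assumptions made for this example, not part of the language definition.

```python
def skip_comment(text: str, start: int) -> int:
    """Return the index just past the comment that opens at text[start].

    A depth counter tracks nesting, so an inner "*/" closes only the
    innermost open comment.
    """
    if not text.startswith("/*", start):
        raise ValueError("no comment opens at this position")
    depth = 0
    i = start
    while i < len(text):
        if text.startswith("/*", i):
            depth += 1          # a comment opens (possibly nested)
            i += 2
        elif text.startswith("*/", i):
            depth -= 1          # the innermost comment closes
            i += 2
            if depth == 0:
                return i
        else:
            i += 1              # any other character, including newlines
    raise ValueError("unterminated comment")


source = "/* outer /* inner */ still outer */ var x: int;"
end = skip_comment(source, 0)
print(source[end:].strip())  # prints "var x: int;"
```

A scanner that simply searched for the first `*/` would stop after `inner` and treat `still outer */` as program text.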
We use the following notation for defining syntax:
X Y      X followed by Y
X|Y      X or Y
[X]      X or empty
{X}      A possibly empty sequence of X's
X&Y      X or Y or X Y

"Followed by" has greater binding power than | or &; parentheses are used to override this precedence rule. Non-terminals begin with an upper-case letter. Terminals are either keywords or quoted operators. The symbols Id, Number, TextLiteral, and CharLiteral are defined in the token grammar. Each production is terminated by a period.
Compilation = {Decl} [ Block ].
Block = "{" {Decl} {Stmt} "}".
Decl = const ConstDecl ";"
     | type TypeDecl ";"
     | var VariableDecl ";"
     | def Id Signature ( Block | ";" ).
ConstDecl = Id [":" Type] "=" ConstExpr.
TypeDecl = Id "=" Type.
VariableDecl = IdList (":" Type & ":=" Expr).
Signature = "(" Formals ")" [":" Type].
Formals = [ Formal {";" Formal} [";"] ].
Formal = [var] IdList ":" Type.
Stmt = AssignSt | Block | CallSt | BreakSt | ForSt | IfSt | LoopSt | RepeatSt | ReturnSt | WhileSt.
AssignSt = Expr ":=" Expr ";".
CallSt = Expr "(" [Actual {"," Actual}] ")" ";".
BreakSt = break ";".
ForSt = for Id ":=" Expr to Expr do Stmt.
IfSt = if Expr then Stmt [ else Stmt ].
LoopSt = loop Stmt.
RepeatSt = repeat Stmt until Expr ";".
ReturnSt = return [Expr] ";".
WhileSt = while Expr do Stmt.
Actual = Type | Expr.
Type = TypeName | ArrayType | EnumType | RecordType | ObjectType | RefType.
ArrayType = array [ "[" Expr "]" ] of Type.
EnumType = enum "{" [ IdList ] "}".
RecordType = record "{" Fields "}".
ObjectType = class [ extends Type ] "{" Members "}".
RefType = ref Type.
Fields = [ Field {";" Field} [ ";" ] ].
Field = IdList ":" Type.
Members = [ Member {";" Member} [ ";" ] ].
Member = Field | Method | Override.
Method = Id Signature [":=" ConstExpr].
Override = Id ":=" ConstExpr.
ConstExpr = Expr.
Expr = E1 {"||" E1}.
E1 = E2 {"&&" E2}.
E2 = {"!"} E3.
E3 = E4 {Relop E4}.
E4 = E5 {Addop E5}.
E5 = E6 {Mulop E6}.
E6 = {"+" | "-"} E7.
E7 = E8 {Selector}.
E8 = Id | Number | CharLiteral | TextLiteral | "(" Expr ")" | new Type.
Relop = "==" | "!=" | "<" | "<=" | ">" | ">=".
Addop = "+" | "-".
Mulop = "*" | "/" | "%".
Selector = "^" | "." Id | "[" Expr "]" | "(" [ Actual {"," Actual} ] ")".
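The layering of Expr through E8 fixes operator precedence: each level binds tighter than the one before it. The following Python sketch is one way to turn those productions into a recursive-descent parser. It is deliberately partial: operands are limited to Id, Number, and parenthesized expressions, while selectors, character and text literals, and `new` are left out. The names `tokenize`, `Parser`, and `parse`, and the tuple-shaped output, are assumptions made for this example.

```python
import re

# Two-character operators are listed first so that "!=" wins over "!".
TOKEN = re.compile(
    r"\s*(\|\||&&|==|!=|<=|>=|[A-Za-z][A-Za-z0-9_]*|[0-9]+|[-+*/%!<>()])")

def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def binary(self, operand, ops):
        # Implements X = Operand {op Operand}, grouping to the left.
        left = operand()
        while self.peek() in ops:
            op = self.take()
            left = (op, left, operand())
        return left

    def expr(self): return self.binary(self.e1, {"||"})
    def e1(self): return self.binary(self.e2, {"&&"})

    def e2(self):                      # E2 = {"!"} E3
        if self.peek() == "!":
            self.take()
            return ("!", self.e2())
        return self.e3()

    def e3(self): return self.binary(self.e4, {"==", "!=", "<", "<=", ">", ">="})
    def e4(self): return self.binary(self.e5, {"+", "-"})
    def e5(self): return self.binary(self.e6, {"*", "/", "%"})

    def e6(self):                      # E6 = {"+" | "-"} E7
        if self.peek() in {"+", "-"}:
            op = self.take()
            return ("unary" + op, self.e6())
        return self.e8()               # selectors (E7) omitted in this sketch

    def e8(self):
        tok = self.take()
        if tok == "(":
            inner = self.expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return inner
        if tok is None or not tok[0].isalnum():
            raise SyntaxError(f"expected an operand, found {tok!r}")
        return tok

def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree

print(parse("!a == b"))    # prints ('!', ('==', 'a', 'b'))
print(parse("a + b * c"))  # prints ('+', 'a', ('*', 'b', 'c'))
```

The first result shows a consequence of the grammar that is easy to miss: because "!" is introduced at level E2, above the relational operators at E3, `!a == b` negates the whole comparison rather than just `a`.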
IdList = Id {"," Id}.
TypeName = Id.
To read a token, first skip all blanks, tabs, newlines, carriage returns, vertical tabs, form feeds, comments, and pragmas. Then read the longest sequence of characters that forms an operator or an Id or Literal.
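The longest-match rule can be sketched in Python as follows. This is a simplified illustration rather than a complete scanner: it skips only whitespace (not comments or pragmas), reads plain decimal digit runs instead of the full Number syntax, and ignores character and text literals. The names `longest_operator` and `scan` are assumptions made for this example; the operator list is copied from the list at the top of this section.

```python
import string

OPERATORS = ["||", "<", "<=", "+", "-", "{", "}", ";", ",",
             "&&", ">", ">=", "*", "/", "(", ")", ":", ".",
             "!", "==", "!=", "^", "%", "[", "]", ":=", "="]
LETTERS = string.ascii_letters
DIGITS = string.digits
BLANKS = " \t\n\r\v\f"

def longest_operator(text, pos):
    """Return the longest operator that starts at text[pos], or None."""
    best = None
    for op in OPERATORS:
        if text.startswith(op, pos) and (best is None or len(op) > len(best)):
            best = op
    return best

def scan(text):
    """Split text into tokens, always taking the longest possible token."""
    tokens, pos = [], 0
    while True:
        while pos < len(text) and text[pos] in BLANKS:
            pos += 1                      # skip blanks between tokens
        if pos == len(text):
            return tokens
        start = pos
        if text[pos] in LETTERS:          # an Id: letter, then letters/digits/_
            while pos < len(text) and text[pos] in LETTERS + DIGITS + "_":
                pos += 1
        elif text[pos] in DIGITS:         # simplified: decimal digits only
            while pos < len(text) and text[pos] in DIGITS:
                pos += 1
        else:
            op = longest_operator(text, pos)
            if op is None:
                raise SyntaxError(f"unexpected character {text[pos]!r}")
            pos += len(op)
        tokens.append(text[start:pos])

print(scan("x:=y<=10;"))  # prints ['x', ':=', 'y', '<=', '10', ';']
```

Because the longest candidate always wins, `:=` is read as one token rather than `:` followed by `=`, and `whilex` is read as a single Id rather than the keyword `while` followed by `x`.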
An Id is a case-significant sequence of letters, digits, and underscores that begins with a letter. An Id is a keyword if it appears in the list of keywords, a reserved identifier if it appears in the list of reserved identifiers, and an ordinary identifier otherwise.
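A small Python sketch of this classification follows. It assumes that the first word list at the top of this section is the keyword list and the second is the list of reserved identifiers; the function names `is_id` and `classify` are chosen for illustration only.

```python
import re

# Assumption: these sets mirror the two word lists at the top of the section.
KEYWORDS = frozenset("""
    array break class const def do else enum extends for if loop method
    new of override record ref repeat return then to type until var while
""".split())
RESERVED = frozenset("boolean char false int nil true".split())

ID_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")

def is_id(text):
    """True if text is a letter followed by letters, digits, or underscores."""
    return ID_PATTERN.fullmatch(text) is not None

def classify(ident):
    """Return 'keyword', 'reserved', or 'ordinary' for a well-formed Id."""
    if not is_id(ident):
        raise ValueError(f"{ident!r} is not an Id")
    if ident in KEYWORDS:
        return "keyword"
    if ident in RESERVED:
        return "reserved"
    return "ordinary"

print(classify("while"))  # prints keyword
print(classify("While"))  # prints ordinary
print(classify("int"))    # prints reserved
```

Since Ids are case-significant, `While` does not match the keyword `while` and is an ordinary identifier. An underscore may appear inside an Id but cannot begin one, so `is_id("_x")` is false.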
In the following grammar, terminals are characters surrounded by double-quotes and the terminal "\"" represents double-quote itself.
Id = Letter {Letter | Digit | "_"}.
Literal = Number | CharLiteral | TextLiteral.
CharLiteral = "'" (PrintingChar | Escape | "\"") "'".
TextLiteral = "\"" {PrintingChar | Escape | "'"} "\"".
Escape = "\" "n" | "\" "t" | "\" "r" | "\" "f" | "\" "\" | "\" "'" | "\" "\"" | "\" OctalDigit OctalDigit OctalDigit.
Number = Digit {Digit} | Digit {Digit} "_" HexDigit {HexDigit}.
PrintingChar = Letter | Digit | OtherChar.
HexDigit = Digit | "A" | "B" | "C" | "D" | "E" | "F" | "a" | "b" | "c" | "d" | "e" | "f".
Digit = "0" | "1" | ... | "9".
OctalDigit = "0" | "1" | ... | "7".
Letter = "A" | "B" | ... | "Z" | "a" | "b" | ... | "z".
OtherChar = " " | "!" | "#" | "$" | "%" | "&" | "(" | ")" | "*" | "+" | "," | "-" | "." | "/" | ":" | ";" | "<" | "=" | ">" | "?" | "@" | "[" | "]" | "^" | "_" | "`" | "{" | "|" | "}" | "~" | ExtendedChar.
ExtendedChar = any char with ISO-Latin-1 code in [8_240..8_377].
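The token grammar gives only the form of a Number, not its value. The Python sketch below assumes that in the form with an underscore, the digits before the underscore give the base in decimal and the digits after it give the value in that base. This reading is suggested by the range [8_240..8_377], which under it denotes 160 through 255, the upper half of ISO-Latin-1. The restriction to bases 2 through 16 and the function name `number_value` are also assumptions made for this example.

```python
import re

# Digit {Digit}  |  Digit {Digit} "_" HexDigit {HexDigit}
NUMBER = re.compile(r"([0-9]+)(?:_([0-9A-Fa-f]+))?")

def number_value(literal):
    """Return the integer value of a Number token (assumed semantics)."""
    m = NUMBER.fullmatch(literal)
    if m is None:
        raise ValueError(f"{literal!r} is not a Number")
    prefix, digits = m.groups()
    if digits is None:
        return int(prefix)            # plain decimal number
    base = int(prefix)                # assumption: the prefix is the base
    if not 2 <= base <= 16:           # assumption: hex digits imply base <= 16
        raise ValueError(f"unsupported base {base}")
    return int(digits, base)          # raises ValueError on an out-of-range digit

print(number_value("8_240"))  # prints 160
print(number_value("8_377"))  # prints 255
print(number_value("16_FF"))  # prints 255
```

Under this reading a literal such as `8_9` matches the grammar but has no value, because 9 is not a valid octal digit; the sketch reports that case as a `ValueError`.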