This document describes the lexical and syntactic grammar of Hook, using EBNF (Extended Backus-Naur Form) notation for both grammars. It covers the rules and conventions that govern the Hook language.
The lexical grammar in Hook defines the set of valid tokens, which are the building blocks of the language. This includes literals, names, keywords and other elements that make up the Hook lexicon.
Hook supports two types of literal numbers: integers and floating-point numbers. Both types of numbers are represented in the same way, with the only difference being that integers do not have a fractional part. Below you can find the EBNF grammar:
number ::= ( '0' | nonzero_digit ) digit* fraction? exponent?
nonzero_digit ::= digit - '0'
digit ::= '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'
fraction ::= '.' digit+
exponent ::= ( 'e' | 'E' ) ( '+' | '-' )? digit+
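As an illustration only, the `number` rule can be transcribed into a regular expression. The sketch below is written in Python rather than Hook, and the names `NUMBER_RE` and `is_number` are hypothetical helpers introduced for this example; they are not part of Hook or its implementation.

```python
import re

# Literal, piece-by-piece transcription of the `number` rule.
# NUMBER_RE and is_number are illustrative names, not part of Hook.
NUMBER_RE = re.compile(
    r"""
    (?:0|[1-9])            # ( '0' | nonzero_digit )
    [0-9]*                 # digit*
    (?:\.[0-9]+)?          # fraction?
    (?:[eE][+-]?[0-9]+)?   # exponent?
    """,
    re.VERBOSE,
)


def is_number(text: str) -> bool:
    """Return True if the whole of `text` matches the `number` rule."""
    return NUMBER_RE.fullmatch(text) is not None


# Note: transcribed literally, the rule also matches digit runs with
# leading zeros (for example 007), because digit* follows the '0' branch.
```

Under this transcription, `is_number("3.14")` and `is_number("2.5E-3")` hold, while `.5`, `1.` and `1e` are rejected because `fraction` and `exponent` each require at least one digit. A leading sign is not part of the literal: in the syntactic grammar, `-` is applied by `unary_expr`.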
In Hook, literal strings are sequences of characters enclosed in double quotes. They can contain any character except newline, carriage return, double quote, and backslash. The strings are encoded using UTF-8, which means they can include any Unicode code point.
To include special characters within strings, Hook provides escape sequences, each consisting of a backslash followed by a single character. The supported escape sequences are:
Escape sequence | Resulting character |
---|---|
`\"` | Double quote |
`\\` | Backslash |
`\b` | Backspace |
`\f` | Form feed |
`\n` | Newline |
`\r` | Carriage return |
`\t` | Tab |
The following EBNF grammar describes the structure of string literals:
string ::= '"' ( char | escape )* '"'
char ::= CODE_POINT - '\n' - '\r' - '"' - '\'
escape ::= '\' ( '"' | '\' | 'b' | 'f' | 'n' | 'r' | 't' )
Here, `CODE_POINT` refers to any Unicode code point.
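To make the `string`, `char` and `escape` rules concrete, here is a minimal Python sketch that validates a string literal and resolves its escape sequences. The function name `decode_string` and the table `ESCAPES` are assumptions made for this example; the actual Hook scanner may be organized differently.

```python
# Maps the character after the backslash to the character it stands for.
# ESCAPES and decode_string are illustrative names, not part of Hook.
ESCAPES = {
    '"': '"',
    '\\': '\\',
    'b': '\b',
    'f': '\f',
    'n': '\n',
    'r': '\r',
    't': '\t',
}


def decode_string(literal: str) -> str:
    """Decode a string literal following the `string` rule.

    Raises ValueError if the literal does not conform to the grammar.
    """
    if len(literal) < 2 or literal[0] != '"' or literal[-1] != '"':
        raise ValueError("string literal must be enclosed in double quotes")
    body = literal[1:-1]
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == '\\':
            # escape ::= '\' followed by one of the supported characters
            if i + 1 >= len(body) or body[i + 1] not in ESCAPES:
                raise ValueError("unsupported escape sequence")
            out.append(ESCAPES[body[i + 1]])
            i += 2
        elif ch in '\n\r"':
            # char excludes newline, carriage return and double quote
            raise ValueError("character must be written as an escape sequence")
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

For example, the seven-character literal `"a\tb"` decodes to the three characters `a`, tab, `b`, whereas `"\q"` is rejected because `\q` is not among the supported escape sequences.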
Names, also known as identifiers, are sequences of ASCII characters used to name variables, functions, and other entities in Hook. To define a name in Hook, you can use any combination of letters, digits, and underscores. However, a name must start with a letter or an underscore. Hook follows the same rules for naming variables as the C programming language.
The following EBNF grammar describes the structure of names:
name ::= ( LETTER | '_' ) ( LETTER | digit | '_' )*
Here, `LETTER` refers to any uppercase or lowercase letter in the ASCII character set, while `digit` was defined earlier in this document.
Keywords are reserved words that have a special meaning and cannot be used as names for variables, functions, or other entities. The following table lists all the keywords in Hook:
as |
break |
continue |
del |
do |
else |
false |
fn |
for |
foreach |
from |
if |
if! |
import |
in |
let |
loop |
match |
nil |
return |
struct |
true |
var |
while |
while! |
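The interaction between the `name` rule and the keyword list can be sketched as a small classifier: a word that appears in the keyword list is a keyword, and any other word that matches the `name` rule is a name. This Python sketch uses hypothetical names (`KEYWORDS`, `NAME_RE`, `classify`) and does not give the standalone `_` token any special treatment.

```python
import re

# The keywords listed in this document.
KEYWORDS = frozenset({
    "as", "break", "continue", "del", "do", "else", "false", "fn", "for",
    "foreach", "from", "if", "if!", "import", "in", "let", "loop", "match",
    "nil", "return", "struct", "true", "var", "while", "while!",
})

# name ::= ( LETTER | '_' ) ( LETTER | digit | '_' )*  with ASCII letters only
NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def classify(word: str) -> str:
    """Classify `word` as 'keyword', 'name' or 'invalid'."""
    if word in KEYWORDS:
        return "keyword"
    if NAME_RE.fullmatch(word):
        return "name"
    return "invalid"
```

With this sketch, `classify("foreach")` yields `"keyword"`, `classify("_count1")` yields `"name"`, and `classify("1abc")` yields `"invalid"` because a name cannot start with a digit. A word such as `iffy` is a name: only an exact match with a keyword is reserved.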
Besides literals, names, and keywords, Hook uses several other tokens consisting of one to three ASCII characters. These tokens have special meanings in the language.
.. |
. |
, |
: |
; |
( |
) |
[ |
] |
{ |
} |
|= |
|| |
| |
^= |
^ |
&= |
&& |
& |
=> |
== |
= |
!= |
! |
>= |
>>= |
>> |
> |
<= |
<<= |
<< |
< |
+= |
++ |
+ |
-= |
-- |
- |
*= |
* |
/= |
/ |
~/= |
~/ |
~ |
%= |
% |
_ |
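Several of these tokens are prefixes of longer ones (`>`, `>>`, `>>=`), so a scanner has to decide which one to emit. A common approach is longest match, sketched below in Python. That Hook resolves the ambiguity this way, and that whitespace merely separates tokens, are assumptions of this sketch; `SYMBOLS` and `split_symbols` are hypothetical names.

```python
# Symbol tokens used by the grammar. The single '.' appears in the
# `subsc` rule ('.' NAME); '..' appears in `range_expr`.
SYMBOLS = [
    "..", ".", ",", ":", ";", "(", ")", "[", "]", "{", "}",
    "|=", "||", "|", "^=", "^", "&=", "&&", "&", "=>", "==", "=",
    "!=", "!", ">=", ">>=", ">>", ">", "<=", "<<=", "<<", "<",
    "+=", "++", "+", "-=", "--", "-", "*=", "*", "/=", "/",
    "~/=", "~/", "~", "%=", "%", "_",
]

# Try longer symbols first so that '>>=' wins over '>>' and '>'.
BY_LENGTH = sorted(SYMBOLS, key=len, reverse=True)


def split_symbols(text: str) -> list:
    """Split a run of symbol characters into tokens using longest match.

    Whitespace is skipped (an assumption of this sketch). Anything that
    is not a symbol token raises ValueError.
    """
    tokens = []
    i = 0
    while i < len(text):
        if text[i].isspace():
            i += 1
            continue
        for sym in BY_LENGTH:
            if text.startswith(sym, i):
                tokens.append(sym)
                i += len(sym)
                break
        else:
            raise ValueError(f"no symbol token at position {i}")
    return tokens
```

Under longest match, `split_symbols(">>=")` produces the single token `>>=`, while `split_symbols(">> =")` produces `>>` followed by `=`.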
In addition, Hook uses a special token to indicate the end of a file. This token is represented by the `'\0'` character and appears as `EOF` in the syntactic grammar.
The complete syntactic grammar of Hook is defined by the following EBNF grammar:
chunk ::= stmt* EOF
stmt ::= import_stmt
| var_decl ';'
| assign_call ';'
| struct_decl
| fn_decl
| del_stmt
| if_stmt
| match_stmt
| loop_stmt
| while_stmt
| for_stmt
| break_stmt
| return_stmt
| block
import_stmt ::= 'import' NAME ( 'as' NAME )? ';'
| 'import' string 'as' NAME ';'
| 'import' '{' NAME ( ',' NAME )* '}' 'from' ( NAME | string ) ';'
var_decl ::= 'let' NAME '=' expr
| 'var' NAME ( '=' expr )?
| ( 'let' | 'var' ) '[' ( '_' | NAME ) ( ',' ( '_' | NAME ) )* ']' '=' expr
| ( 'let' | 'var' ) '{' NAME ( ',' NAME )* '}' '=' expr
assign_call ::= NAME subsc* assign_op expr
| NAME subsc* ( '++' | '--' )
| NAME subsc* '[' ']' '=' expr
| NAME subsc* subsc '=' expr
| NAME ( subsc | call )* call
struct_decl ::= 'struct' NAME '{' ( ( string | NAME ) ( ',' ( string | NAME ) )* )? '}'
fn_decl ::= 'fn' NAME '(' params? ')' ( '=>' expr ';' | block )
params ::= NAME ( ',' NAME )*
del_stmt ::= 'del' NAME subsc* '[' expr ']' ';'
if_stmt ::= ( 'if' | 'if!' ) '(' ( var_decl ';' )? expr ')' stmt ( 'else' stmt )?
match_stmt ::= 'match' '(' ( var_decl ';' )? expr ')' '{' ( expr '=>' stmt )+ ( '_' '=>' stmt )? '}'
loop_stmt ::= 'loop' stmt
while_stmt ::= ( 'while' | 'while!' ) '(' expr ')' stmt
| 'do' stmt ( 'while' | 'while!' ) '(' expr ')' ';'
for_stmt ::= 'for' '(' ( var_decl | assign_call )? ';' expr? ';' assign_call? ')' stmt
| 'foreach' '(' NAME 'in' expr ')' stmt
break_stmt ::= ( 'break' | 'continue' ) ';'
return_stmt ::= 'return' expr? ';'
block ::= '{' stmt* '}'
assign_op ::= '=' | '|=' | '^=' | '&=' | '<<=' | '>>='
| '+=' | '-=' | '*=' | '/=' | '~/=' | '%='
subsc ::= '[' expr ']' | '.' NAME
call ::= '(' ( expr ( ',' expr )* )? ')'
expr ::= and_expr ( '||' and_expr )*
and_expr ::= equal_expr ( '&&' equal_expr )*
equal_expr ::= comp_expr ( ( '==' | '!=' ) comp_expr )*
comp_expr ::= bor_expr ( ( '>' | '>=' | '<' | '<=' ) bor_expr )*
bor_expr ::= bxor_expr ( '|' bxor_expr )*
bxor_expr ::= band_expr ( '^' band_expr )*
band_expr ::= shift_expr ( '&' shift_expr )*
shift_expr ::= range_expr ( ( '<<' | '>>' ) range_expr )*
range_expr ::= add_expr ( '..' add_expr )?
add_expr ::= mul_expr ( ( '+' | '-' ) mul_expr )*
mul_expr ::= unary_expr ( ( '*' | '/' | '~/' | '%' ) unary_expr )*
unary_expr ::= ( '-' | '!' | '~' ) unary_expr | primary_expr
primary_expr ::= literal
| array_constructor
| struct_constructor
| anonymous_struct
| anonymous_fn
| if_expr
| match_expr
| subsc_call
| group_expr
literal ::= 'nil' | 'false' | 'true' | number | string
array_constructor ::= '[' ( expr ( ',' expr )* )? ']'
struct_constructor ::= '{' ( ( string | NAME ) ':' expr ( ',' ( string | NAME ) ':' expr )* )? '}'
anonymous_struct ::= 'struct' '{' ( ( string | NAME ) ( ',' ( string | NAME ) )* )? '}'
anonymous_fn ::= '|' params? '|' ( '=>' expr | block )
| '||' ( '=>' expr | block )
if_expr ::= ( 'if' | 'if!' ) '(' expr ')' expr 'else' expr
match_expr ::= 'match' '(' expr ')' '{' expr '=>' expr ( ',' expr '=>' expr )* ',' '_' '=>' expr '}'
subsc_call ::= NAME ( subsc | call )* ( '{' ( expr ( ',' expr )* )? '}' )?
group_expr ::= '(' expr ')'
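The chain of rules from `expr` down to `unary_expr` encodes operator precedence: each rule is built from the next, tighter-binding one. The following Python sketch makes that visible by fully parenthesizing an already tokenized expression. It is a simplification under stated assumptions: operands are opaque tokens, only binary operators, unary operators and `group_expr` are handled, and the names `LEVELS` and `parenthesize` are hypothetical.

```python
# Binary operator levels, from loosest (expr) to tightest (mul_expr),
# in the order the rules appear in the grammar.
LEVELS = [
    ("||",),                  # expr
    ("&&",),                  # and_expr
    ("==", "!="),             # equal_expr
    (">", ">=", "<", "<="),   # comp_expr
    ("|",),                   # bor_expr
    ("^",),                   # bxor_expr
    ("&",),                   # band_expr
    ("<<", ">>"),             # shift_expr
    ("..",),                  # range_expr
    ("+", "-"),               # add_expr
    ("*", "/", "~/", "%"),    # mul_expr
]
UNARY = ("-", "!", "~")
RESERVED = {op for ops in LEVELS for op in ops} | set(UNARY) | {"(", ")"}


def parenthesize(tokens: list) -> str:
    """Return the expression with every operator application parenthesized."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def binary(level):
        if level == len(LEVELS):
            return unary()
        left = binary(level + 1)
        ops = LEVELS[level]
        while peek() in ops:
            op = advance()
            right = binary(level + 1)
            left = f"({left} {op} {right})"
            if ops == ("..",):
                break  # range_expr allows at most one '..'
        return left

    def unary():
        if peek() in UNARY:
            op = advance()
            return f"({op}{unary()})"
        return primary()

    def primary():
        if peek() is None:
            raise SyntaxError("unexpected end of expression")
        tok = advance()
        if tok == "(":  # group_expr ::= '(' expr ')'
            inner = binary(0)
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return inner
        if tok in RESERVED:
            raise SyntaxError(f"unexpected token {tok!r}")
        return tok  # any other token is treated as an operand

    result = binary(0)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return result
```

For instance, `parenthesize(["a", "|", "b", "==", "c"])` returns `((a | b) == c)`: in this grammar the bitwise operators bind more tightly than the comparison and equality operators, which differs from C. Because `range_expr` uses `?` rather than `*`, a chained input such as `a .. b .. c` is rejected.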