Compilers

CatScript Overview

A small statically-typed programming language

Introduction

CatScript is a small programming language designed for learning compiler construction. It features static typing, functions, control flow, and lists.


Grammar

The following EBNF grammar defines the CatScript language:

catscript_program = expression | { program_statement };

program_statement = statement |
                    function_declaration;

statement = for_statement |
            while_statement |
            if_statement |
            print_statement |
            variable_statement |
            assignment_statement |
            return_statement |
            function_call_statement;

for_statement = 'for', '(', IDENTIFIER, 'in', expression, ')',
                '{', { statement }, '}';

while_statement = 'while', '(', expression, ')',
                  '{', { statement }, '}';

if_statement = 'if', '(', expression, ')', '{',
                    { statement },
               '}', [ 'else', ( if_statement | '{', { statement }, '}' ) ];

print_statement = 'print', '(', expression, ')';

variable_statement = 'var', IDENTIFIER,
                     [ ':', type_expression ], '=', expression;

function_call_statement = function_call;

assignment_statement = IDENTIFIER, [ '[', expression, ']' ], '=', expression;

function_declaration = 'function', IDENTIFIER, '(', parameter_list, ')',
                       [ ':', type_expression ], '{', { statement }, '}';

parameter_list = [ parameter, { ',', parameter } ];

parameter = IDENTIFIER, [ ':', type_expression ];

return_statement = 'return', [ expression ];

expression = equality_expression;

equality_expression = comparison_expression, { ( '!=' | '==' ), comparison_expression };

comparison_expression = additive_expression, { ( '>' | '>=' | '<' | '<=' ), additive_expression };

additive_expression = factor_expression, { ( '+' | '-' ), factor_expression };

factor_expression = unary_expression, { ( '/' | '*' | '%' ), unary_expression };

unary_expression = ( 'not' | '-' ), unary_expression | index_expression;

index_expression = primary_expression, { '[', expression, ']' };

primary_expression = IDENTIFIER | STRING | INTEGER | 'true' | 'false' | 'null' |
                     list_literal | function_call | '(', expression, ')';

list_literal = '[', [ expression, { ',', expression } ], ']';

function_call = IDENTIFIER, '(', argument_list, ')';

argument_list = [ expression, { ',', expression } ];

type_expression = 'int' | 'string' | 'bool' | 'object' | 'list', [ '<', type_expression, '>' ];
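
The expression rules above encode operator precedence by nesting: each rule calls the next, tighter-binding rule for its operands. A minimal recursive-descent sketch in Python shows how that nesting can translate into code. It is an illustration, not the reference implementation: the names `tokenize`, `Parser`, and `parse`, the tuple-based tree shape, and the token regex are all assumptions made for this example, and only the expression portion of the grammar is covered.

```python
import re

# Hypothetical token pattern: integers, identifiers/keywords, double-quoted
# strings, two-character operators, then single-character symbols.
TOKEN_RE = re.compile(r'\s*(\d+|[A-Za-z_]\w*|"[^"]*"|==|!=|>=|<=|[-+*/%<>()\[\],])')

def tokenize(source):
    tokens, pos = [], 0
    source = source.rstrip()
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

class Parser:
    def __init__(self, source):
        self.tokens = tokenize(source)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expect(self, token):
        if self.peek() != token:
            raise SyntaxError(f"expected {token!r}, found {self.peek()!r}")
        self.pos += 1

    def binary_level(self, operators, operand):
        # Shared shape of the left-associative rules:
        #   rule = operand, { operator, operand };
        left = operand()
        while self.peek() in operators:
            op = self.advance()
            left = (op, left, operand())
        return left

    def expression(self):
        return self.equality()

    def equality(self):
        return self.binary_level(("==", "!="), self.comparison)

    def comparison(self):
        return self.binary_level((">", ">=", "<", "<="), self.additive)

    def additive(self):
        return self.binary_level(("+", "-"), self.factor)

    def factor(self):
        return self.binary_level(("*", "/", "%"), self.unary)

    def unary(self):
        if self.peek() in ("not", "-"):
            op = self.advance()
            return (op, self.unary())
        return self.index()

    def index(self):
        node = self.primary()
        while self.peek() == "[":
            self.advance()
            node = ("index", node, self.expression())
            self.expect("]")
        return node

    def expression_list(self, closing):
        # Zero or more comma-separated expressions, then the closing token.
        items = []
        if self.peek() != closing:
            items.append(self.expression())
            while self.peek() == ",":
                self.advance()
                items.append(self.expression())
        self.expect(closing)
        return items

    def primary(self):
        token = self.advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token == "(":
            node = self.expression()
            self.expect(")")
            return node
        if token == "[":
            return ("list", self.expression_list("]"))
        if token.isdigit():
            return int(token)
        if token.startswith('"'):
            return ("string", token[1:-1])
        if token in ("true", "false", "null"):
            return ("literal", token)
        if token[0].isalpha() or token[0] == "_":
            if self.peek() == "(":  # function_call
                self.advance()
                return ("call", token, self.expression_list(")"))
            return ("id", token)
        raise SyntaxError(f"unexpected token {token!r}")

def parse(source):
    parser = Parser(source)
    tree = parser.expression()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree
```

With this sketch, `parse("1 + 2 * 3")` groups the multiplication first because `additive` asks `factor` for its operands, and `parse("1 - 2 - 3")` nests to the left because each loop iteration wraps the tree built so far.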

Types

CatScript is statically typed with a small type system:

Type      Description
int       A 32-bit integer
string    A Java-style string
bool      A boolean value
list<T>   A list of values with type T
null      The null type
object    Any type of value
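
One way to picture how a type checker might use these types is an assignability test. The Python sketch below is hypothetical throughout: `CatType`, `list_of`, and `is_assignable` are names invented here. Only the rule that `object` accepts any value comes from the table. The other rules are assumptions for illustration and are marked as such in the comments: `null` is assignable to every type, lists are covariant in their element type, and a bare `list` is treated as `list<object>`.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class CatType:
    name: str
    element: Optional["CatType"] = None  # only meaningful when name == "list"

    def __str__(self):
        if self.name == "list" and self.element is not None:
            return f"list<{self.element}>"
        return self.name

INT = CatType("int")
STRING = CatType("string")
BOOL = CatType("bool")
NULL = CatType("null")
OBJECT = CatType("object")

def list_of(element):
    return CatType("list", element)

def element_of(list_type):
    # ASSUMPTION: a bare `list` with no type argument behaves like list<object>.
    return list_type.element if list_type.element is not None else OBJECT

def is_assignable(target, source):
    """Return True if a value of type `source` may be stored in `target`."""
    if target == OBJECT:
        return True  # from the table: object is any type of value
    if source == NULL:
        return True  # ASSUMPTION: null may be assigned to any type
    if target.name == "list" and source.name == "list":
        # ASSUMPTION: lists are covariant in their element type
        return is_assignable(element_of(target), element_of(source))
    return target == source
```

Under these assumptions, `is_assignable(list_of(OBJECT), list_of(INT))` holds while `is_assignable(INT, OBJECT)` does not. A different design could make lists invariant, which is stricter but avoids storing a `string` through a `list<object>` reference into what is really a `list<int>`.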