Semantic Analysis of Expressions
Type checking additive, comparison, function call, and list literal expressions
The CatScript type system in code
The type system is implemented as an object hierarchy rooted in CatScriptType:
public static final CatScriptType INT = new CatScriptType("int", Integer.class);
public static final CatScriptType STRING = new CatScriptType("string", String.class);
public static final CatScriptType BOOLEAN = new CatScriptType("bool", Boolean.class);
public static final CatScriptType OBJECT = new CatScriptType("object", Object.class);
public static final CatScriptType NULL = new CatScriptType("null", Object.class);
public static final CatScriptType VOID = new CatScriptType("void", Void.class);
ListType extends CatScriptType and adds a component type (e.g. list<int> has component type int).
CatScriptFunctionType extends CatScriptType with parameter types and a return type.
Assignability rules
The isAssignableFrom method on CatScriptType encodes the core rules:
| Rule | Result |
|---|---|
Anything assigned to void |
Only void is assignable to void |
null assigned to anything |
Always true - null is the bottom type |
| Otherwise | Delegates to Java’s Class.isAssignableFrom on the backing class |
For list types, assignability is covariant on the component type: list<string> is assignable to list<object> because string is assignable to object. This is convenient but unsound.
Expressions have types
Expression extends ParseElement and adds a type field. Statement does not have a type.
Every expression resolves its type during validate. The general pattern:
- Validate children first (so they resolve their types)
- Determine own type from the children’s types
- Check type rules and add errors for violations
Additive expression
Additive expressions are overloaded - they handle both integer addition and string concatenation:
void validate(SymbolTable symbolTable) {
lhs.validate(symbolTable);
rhs.validate(symbolTable);
if (lhs.getType().equals(CatScriptType.STRING)
|| rhs.getType().equals(CatScriptType.STRING)) {
setType(CatScriptType.STRING);
} else {
setType(CatScriptType.INT);
if (!lhs.getType().equals(CatScriptType.INT)) {
lhs.addError(/* incompatible type */);
}
if (!rhs.getType().equals(CatScriptType.INT)) {
rhs.addError(/* incompatible type */);
}
}
}
If either side is a string, the result is a string (concatenation). Otherwise, both sides must be integers.
Factor (multiplication) expression
Simpler than addition - no overloading. Both sides must be int, result is always int.
Comparison expression
Both sides must be comparable types. The return type is always boolean.
Equality expression
Both sides can be anything - all types support equality comparison. The return type is boolean.
Unary expression
| Operator | Operand type | Result type |
|---|---|---|
- (negate) |
int |
int |
not |
boolean |
boolean |
Parenthesized expression
The type is simply the type of the inner expression. Parentheses affect evaluation order, not types.
Literals
| Literal | Type |
|---|---|
42 |
int |
"hello" |
string |
true / false |
boolean |
null |
null |
Literals can’t have type errors themselves - errors arise from the position they’re used in (e.g. passing a string to a function that expects int).
List literal and type unification
[1, 2, 3] // list<int>
[1, "two", 3] // list<object>
[1, null, 3] // list<int> - null is assignable to int
Validate all children first, then determine the list type by unifying the children’s types:
- If all children have the same type ->
list<that_type> - If children have mixed types -> unify up to a common supertype (ultimately
object) nullis the bottom type and doesn’t force unification upward
The type diamond: null (bottom) -> concrete types (int, string, boolean) -> object (top). Unification walks up toward object as needed.
Identifier expression
The type of an identifier comes from the symbol table:
void validate(SymbolTable symbolTable) {
CatScriptType type = symbolTable.getSymbolType(name);
if (type == null) {
addError(ErrorType.UNKNOWN_NAME);
} else {
setType(type);
}
}
This is the main place where expressions interact with the symbol table. The type was registered earlier by a var statement, function parameter, or for-loop variable.
Index expression
x[5]
| Part | Requirement |
|---|---|
x (the collection) |
Must be a ListType |
5 (the index) |
Must be of type int |
| Result type | The component type of the list (e.g. int if x is list<int>) |
Function call expression
var result = foo(1, "hello")
- Look up the function name in the symbol table
- Verify the symbol is a
CatScriptFunctionType(not an int or string - that’s an error) - Set the expression’s type to the return type of the function
- Validate each argument expression
- Check that each argument’s type is assignable to the corresponding parameter type
CatScriptType symbolType = symbolTable.getSymbolType(name);
if (symbolType instanceof CatScriptFunctionType funcType) {
setType(funcType.getReturnType());
for (int i = 0; i < arguments.size(); i++) {
arguments.get(i).validate(symbolTable);
// check: argument type assignable to parameter type at position i
}
}
The recursive pattern
Type resolution flows bottom-up through the parse tree:
- Literals know their types immediately
- Identifiers look up their types in the symbol table
- Binary expressions validate children, then derive their own type
- Function calls use the function’s declared return type
- List literals unify children’s types
- Parent expressions use their children’s resolved types
Each expression makes local decisions that globally verify correctness.
Summary
| Expression | Type resolution |
|---|---|
| Additive | String if either side is string; otherwise int (both sides must be int) |
| Factor | Always int; both sides must be int |
| Comparison | Always boolean |
| Equality | Always boolean; any types allowed |
| Unary | - -> int; not -> boolean |
| Parenthesized | Same as inner expression |
| Literals | Implied by literal type |
| List literal | Unify children’s types into list<unified_type> |
| Identifier | Looked up from symbol table |
| Index | Component type of the list being indexed |
| Function call | Return type of the function |