Programming Language Technology DAT151/DIT231 Andreas Abel
Machine perspective:
- Numeric types and arithmetic: 32/64bit word, signed/unsigned, float/integer
z = x / y;
- Avoid runtime errors (e.g. by confusion of number and memory address)
typedef int (*fun_t)();   // Pointer to function without arguments returning int

int main () {
  fun_t good = &main;     // Function pointer to main function
  int i = (*good)();
  fun_t bad = 12345678;   // Function pointer to random address
  int j = (*bad)();
}
Human perspective:
- Avoid silly mistakes (types provide some redundancy)
void foo (int x) { ... }
...
int piece;
char* peace;
...
foo (peace);
- Better code comprehension (documentation)
... divide(... x, ... y) { return x / y; }
printDouble (divide (5, 3));
Rule format:
J₁ ... Jₙ
--------- C
J
- J : conclusion
- Jᵢ: premises/hypotheses
- C : side condition
Logical interpretation: Judgement J holds if C and all of J₁ ... Jₙ hold.
Algorithmic interpretation: To check J, check J₁ ... Jₙ (in this order) and C (in a suitable place).
Judgement version 1: "expression e has type t"
e : t
Rules may be written using concrete syntax:
e₁ : int e₂ : int
--------------------
e₁ / e₂ : int
e₁ : double e₂ : double
--------------------------
e₁ / e₂ : double
Generic version, using side condition:
e₁ : t e₂ : t
---------------- t ∈ {int, double}
e₁ / e₂ : t
Using abstract syntax:
TInt. Type ::= "int";
TDouble. Type ::= "double";
EInt. Exp6 ::= Integer;
EDouble. Exp6 ::= Double;
EDiv. Exp2 ::= Exp2 "/" Exp3;
-------------    -------------------
EInt i : TInt    EDouble d : TDouble
e₁ : t e₂ : t
---------------- t ∈ {TInt, TDouble}
EDiv e₁ e₂ : t
Checking: Given e and t, check whether e : t.
Inference: Given e, compute t such that e : t.
check (Exp e, Type t): Bool
infer (Exp e) : Maybe Type
Example implementation:
check (EInt i, t):
  t == TInt
check (EDiv e₁ e₂, t):
  check (e₁, t) && check (e₂, t) && (t == TInt || t == TDouble)
infer (EInt i):
  return TInt
infer (EDiv e₁ e₂):
  t ← infer (e₁)
  check (e₂, t)
  return t
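The checking and inference functions above can be sketched in Python. This is only an illustration under assumptions: the dataclasses EInt, EDouble, EDiv and the string constants T_INT, T_DOUBLE are a hypothetical encoding of the abstract syntax, and None plays the role of a failed Maybe.

```python
from dataclasses import dataclass

# Hypothetical Python encoding of the abstract syntax EInt, EDouble, EDiv.
@dataclass
class EInt:
    value: int

@dataclass
class EDouble:
    value: float

@dataclass
class EDiv:
    left: object
    right: object

T_INT, T_DOUBLE = "int", "double"   # stand-ins for TInt and TDouble

def check(e, t):
    """Return True iff expression e has type t."""
    if isinstance(e, EInt):
        return t == T_INT
    if isinstance(e, EDouble):
        return t == T_DOUBLE
    if isinstance(e, EDiv):
        # Side condition t ∈ {int, double}, then both operands at type t.
        return t in (T_INT, T_DOUBLE) and check(e.left, t) and check(e.right, t)
    raise ValueError(f"unknown expression: {e!r}")

def infer(e):
    """Return the type of e, or None if e is ill-typed."""
    if isinstance(e, EInt):
        return T_INT
    if isinstance(e, EDouble):
        return T_DOUBLE
    if isinstance(e, EDiv):
        t = infer(e.left)                        # infer the left operand
        if t is not None and check(e.right, t):  # check the right one against it
            return t
        return None
    raise ValueError(f"unknown expression: {e!r}")

print(infer(EDiv(EInt(6), EInt(3))))        # int
print(infer(EDiv(EInt(6), EDouble(3.0))))   # None: operand types differ
```

Without subtyping, mixing an int and a double operand is rejected, which matches the generic rule with a single type t for both operands.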
int values can be coerced to double values.
int is a subtype of double, written int ≤ double.
Subtyping t₁ ≤ t₂ is a preorder, i.e., a reflexive-transitive relation.
e : t₁
------ t₁ ≤ t₂
e : t₂
Subtyping should be coherent: up-casting via an intermediate type should not make a difference.
short int i;
double x = (double)((int)i);
double y = (double)i;
In practice, it often does, e.g. with coercions to string:
int ≤ string             1 → "1"
int ≤ double ≤ string    1 → 1.0 → "1.0"
Quiz: What is the value of this expression?
1 + 2 + "hello" + 1 + 2
This ought to be a type error! The value should not depend on the associativity of "+".
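The dependence on associativity can be replayed with a small Python sketch. The function plus is hypothetical: it models an overloaded "+" that adds two ints and otherwise coerces both operands to strings and concatenates them. This is one possible reading of the int ≤ string coercion, not the semantics of a particular language.

```python
def plus(a, b):
    """Overloaded '+': int addition, else string concatenation after coercion."""
    if isinstance(a, int) and isinstance(b, int):
        return a + b
    return str(a) + str(b)   # coercion int ≤ string applied where needed

# Left-associative reading: (((1 + 2) + "hello") + 1) + 2
left = plus(plus(plus(plus(1, 2), "hello"), 1), 2)

# Right-associative reading: 1 + (2 + ("hello" + (1 + 2)))
right = plus(1, plus(2, plus("hello", plus(1, 2))))

print(left)    # 3hello12
print(right)   # 12hello3
```

The two readings disagree because the coercion is inserted at different places, which is exactly the incoherence the quiz points at.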
The type checker can insert a coercion ECoerce wherever the subtyping int ≤ double was applied.
internal ECoerce. Exp ::= "(double)" Exp;
Checking/inference: also return elaborated expression.
check (Exp e, Type t): Maybe Exp
infer (Exp e) : Maybe (Exp, Type)
Example implementation:
check (EInt i, t):
  if t == TInt then return (EInt i)
  else fail "expected int"
check (EDiv e₁ e₂, t):
  assert (t ∈ {TInt, TDouble})
  e₁' ← check (e₁, t)
  e₂' ← check (e₂, t)
  return (EDiv e₁' e₂')
infer (EInt i):
  return (EInt i, TInt)
infer (EDiv e₁ e₂):
  (e₁', t₁) ← infer (e₁)
  (e₂', t₂) ← infer (e₂)
  t ← max t₁ t₂
  e₁'' = coerce (e₁', t₁, t)
  e₂'' = coerce (e₂', t₂, t)
  return (EDiv e₁'' e₂'', t)
coerce (e, t₁, t₂):
  if t₁ == TInt and t₂ == TDouble then ECoerce(e)
  else e
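A possible Python rendering of the elaborating inference is sketched below. The dataclasses and the RANK table (which encodes int ≤ double so that max can pick the larger type) are assumptions made for this illustration only.

```python
from dataclasses import dataclass

# Hypothetical encoding of the abstract syntax, including the internal ECoerce.
@dataclass
class EInt:
    value: int

@dataclass
class EDouble:
    value: float

@dataclass
class EDiv:
    left: object
    right: object

@dataclass
class ECoerce:
    expr: object

T_INT, T_DOUBLE = "int", "double"
RANK = {T_INT: 0, T_DOUBLE: 1}   # assumed encoding of int ≤ double

def coerce(e, t1, t2):
    """Wrap e in ECoerce if it has to be lifted from int to double."""
    return ECoerce(e) if (t1, t2) == (T_INT, T_DOUBLE) else e

def infer(e):
    """Return the pair (elaborated expression, type)."""
    if isinstance(e, EInt):
        return e, T_INT
    if isinstance(e, EDouble):
        return e, T_DOUBLE
    if isinstance(e, EDiv):
        l, t1 = infer(e.left)
        r, t2 = infer(e.right)
        t = max(t1, t2, key=RANK.__getitem__)   # the larger of the two types
        return EDiv(coerce(l, t1, t), coerce(r, t2, t)), t
    raise TypeError(f"unknown expression: {e!r}")

elaborated, t = infer(EDiv(EInt(1), EDouble(2.0)))
print(t)            # double
print(elaborated)   # the left operand is now wrapped in ECoerce
```

Only the int operand gets wrapped; if both operands already have the same type, the expression is returned unchanged.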
Comparison operators return a boolean:
e₁ : t e₂ : t
---------------- t ∈ {int, double}
e₁ < e₂ : bool
e₁ : t e₂ : t
---------------- t ∈ {bool, int, double}
e₁ == e₂ : bool
Statements all have type void so we can omit it.
Judgements version 1:
⊢ s          Statement s is well-typed
⊢ s₁...sₙ    Statement sequence s₁...sₙ is well-typed
NB: ⊢ = "turnstile" (With TeX input mode: \vdash)
Rules for conditional statements:
e : bool ⊢ s
---------------
⊢ while (e) s
e : bool ⊢ s₁ ⊢ s₂
------------------------
⊢ if (e) s₁ else s₂
Statement sequences:
          ⊢ s₀   ⊢ s₁...sₙ
---       -----------------
⊢ ε       ⊢ s₀ s₁...sₙ
Expressions as statements:
e : t
-----
⊢ e;
A return statement needs to have a return expression of the correct type.
int main () {
  ...
  if (...) return 5;
  else return 1.0;  // error
}
Judgements version 1.5:
⊢ᵗ s          Statement s is well-typed, may return t
⊢ᵗ s₁...sₙ    Statement sequence s₁...sₙ is well-typed, may return t
Rule:
e : t
-----------
⊢ᵗ return e;
DFun. Def ::= Type Id "(" ")" "{" [Stm] "}";
Checking function definitions (version 1):
checkStm (Type t, Stm s)
checkStms (Type t, [Stm] ss):
  for (Stm s ∈ ss):
    checkStm(t, s)
checkDef (DFun t x ss):
  checkStms (t, ss)
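As a minimal sketch, the same structure in Python could look as follows. The statement classes SReturn and SExp, the exception TypeCheckError, and the literal-only infer are hypothetical simplifications chosen to keep the example short.

```python
from dataclasses import dataclass

class TypeCheckError(Exception):
    pass

# Hypothetical, literal-only fragment of the expression syntax.
@dataclass
class EInt:
    value: int

@dataclass
class EDouble:
    value: float

@dataclass
class SReturn:
    expr: object

@dataclass
class SExp:
    expr: object

@dataclass
class DFun:
    ret_type: str
    name: str
    body: list

def infer(e):
    if isinstance(e, EInt):
        return "int"
    if isinstance(e, EDouble):
        return "double"
    raise TypeCheckError(f"unknown expression: {e!r}")

def check_stm(t, s):
    """Check statement s inside a function with return type t."""
    if isinstance(s, SReturn):
        actual = infer(s.expr)
        if actual != t:
            raise TypeCheckError(f"return: expected {t}, got {actual}")
    elif isinstance(s, SExp):
        infer(s.expr)   # any type is acceptable for an expression statement
    else:
        raise TypeCheckError(f"unknown statement: {s!r}")

def check_stms(t, ss):
    for s in ss:
        check_stm(t, s)

def check_def(d):
    check_stms(d.ret_type, d.body)

check_def(DFun("int", "main", [SExp(EDouble(1.0)), SReturn(EInt(5))]))  # accepted
```

A body containing SReturn(EDouble(1.0)) in an int function raises TypeCheckError, since this sketch applies no subtyping.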
Typing environments (aka contexts) assign types to variables.
Example:
int f (double x) {
  int i = 2 ;
  int j = 5 ;
  // int i;      // illegal, i already declared in block
  // int x;      // illegal, x already declared in block
  {
    int x = 0;     // legal, shadows function parameter.
    double i = 3 ; // shadows the previous variable i
    {
      // Environment: (x:double,i:int,j:int).(x:int,i:double).()
    }
    i = i + j;     // the inner i becomes 8.0
    int k;
  }
  i++ ;            // the outer i becomes 3
  return i + j;
}
Environments Γ are structured as a stack of blocks.
Each block Δ is a (partial and finite) map from identifier to type.
Example: in the innermost block, the typing environment is a stack of three blocks:
- (x:double,i:int,j:int): function parameters and top local variables
- (x:int,i:double): variables declared in the first inner block
- (): variables declared in the second inner block (none)
Declaration statements like int i, j; declare a new block component Δ = (i:int, j:int).
Notation:
- Γ.ε or Γ. or Γ.(): Context Γ extended by a new empty block.
- Γ,Δ: The top block of Γ is extended by declarations Δ, assuming no clash.
E.g. (x:t),(x:...) is a clash.
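One way to represent such environments is a list of dictionaries, sketched below. The class Env, its method names, and the exception ScopeError are hypothetical; new_block corresponds to Γ. and declare to Γ,(x:t).

```python
class ScopeError(Exception):
    pass

class Env:
    """Typing environment Γ: a stack of blocks, each a dict from identifier to type."""

    def __init__(self):
        self.blocks = [{}]            # bottom block first, top block last

    def new_block(self):              # Γ.
        self.blocks.append({})

    def exit_block(self):
        self.blocks.pop()

    def declare(self, x, t):          # Γ,(x:t), failing on a clash
        top = self.blocks[-1]
        if x in top:
            raise ScopeError(f"{x} already declared in this block")
        top[x] = t

    def lookup(self, x):
        # Search from the top block downwards, so inner declarations shadow outer ones.
        for block in reversed(self.blocks):
            if x in block:
                return block[x]
        raise ScopeError(f"{x} not in scope")

# Replaying the function f from the example above.
env = Env()
for name, typ in [("x", "double"), ("i", "int"), ("j", "int")]:
    env.declare(name, typ)
env.new_block()
env.declare("x", "int")
env.declare("i", "double")
env.new_block()
print(env.blocks)        # three blocks, the last one empty
print(env.lookup("i"))   # double (the inner i shadows the outer one)
print(env.lookup("j"))   # int
```

Lookup walks the stack from the top, while the clash check only inspects the top block; that difference is what makes shadowing legal and redeclaration illegal.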
Judgements version 2:
- Γ ⊢ e : t: In context Γ, expression e has type t.
- Γ ⊢ᵗ s ⇒ Δ: In context Γ, statement s may return t and declares Δ.
- Γ ⊢ᵗ ss ⇒ Δ: (ditto)
Rules for declarations and blocks.
------------------------------------- no xᵢ ∈ Δ
Γ.Δ ⊢ᵗ⁰ t x₁,...,xₙ; ⇒ (x₁:t,...xₙ:t)
Γ.(Δ,x:t) ⊢ e : t
------------------------ x ∉ Δ
Γ.Δ ⊢ᵗ⁰ t x = e; ⇒ (x:t)
Γ.() ⊢ᵗ⁰ ss ⇒ Δ
-----------------
Γ ⊢ᵗ⁰ { ss } ⇒ ()
Valid but pointless example for initialization statement:
int main () {
  int i = i;
}
The initialization t x = e is sugar for t x; x = e;
Rules for sequences:
-----------
Γ ⊢ᵗ⁰ ε ⇒ ()
Γ ⊢ᵗ⁰ s ⇒ Δ₁    Γ,Δ₁ ⊢ᵗ⁰ ss ⇒ Δ₂
--------------------------------
Γ ⊢ᵗ⁰ s ss ⇒ Δ₁,Δ₂
Rules for conditional statements: Branches need to be in new scope.
Γ ⊢ e : bool    Γ. ⊢ᵗ⁰ s₁ ⇒ Δ₁    Γ. ⊢ᵗ⁰ s₂ ⇒ Δ₂
-------------------------------------------------
Γ ⊢ᵗ⁰ if (e) s₁ else s₂ ⇒ ()
Γ ⊢ e : bool    Γ. ⊢ᵗ⁰ s ⇒ Δ
-----------------------------
Γ ⊢ᵗ⁰ while (e) s ⇒ ()
Example:
int main () {
  if (condition) int i = 1; else int j = 2;
  return i;
}
Same as if (condition) { int i = 1; } else { int j = 2; }.
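The effect of checking branches in a new scope can be sketched in Python. Everything here is an assumption for illustration: the environment is a plain list of dicts, the class names are hypothetical, and EBool(True) stands in for the unspecified condition of the example.

```python
from dataclasses import dataclass

class TypeCheckError(Exception):
    pass

@dataclass
class EInt:
    value: int

@dataclass
class EBool:
    value: bool

@dataclass
class EVar:
    name: str

@dataclass
class SInit:          # t x = e;
    typ: str
    name: str
    expr: object

@dataclass
class SIfElse:        # if (e) s₁ else s₂
    cond: object
    then: object
    other: object

@dataclass
class SReturn:        # return e;
    expr: object

def infer(env, e):
    if isinstance(e, EInt):
        return "int"
    if isinstance(e, EBool):
        return "bool"
    if isinstance(e, EVar):
        for block in reversed(env):          # innermost block first
            if e.name in block:
                return block[e.name]
        raise TypeCheckError(f"variable {e.name} not in scope")
    raise TypeCheckError(f"unknown expression: {e!r}")

def check_stm(env, ret, s):
    """Check s in env; declarations extend the top block env[-1] in place."""
    if isinstance(s, SInit):
        if s.name in env[-1]:
            raise TypeCheckError(f"{s.name} already declared in this block")
        env[-1][s.name] = s.typ              # x is in scope in its own initializer
        if infer(env, s.expr) != s.typ:
            raise TypeCheckError(f"initializer of {s.name} has the wrong type")
    elif isinstance(s, SIfElse):
        if infer(env, s.cond) != "bool":
            raise TypeCheckError("condition must be bool")
        for branch in (s.then, s.other):
            check_stm(env + [{}], ret, branch)   # fresh block, discarded afterwards
    elif isinstance(s, SReturn):
        if infer(env, s.expr) != ret:
            raise TypeCheckError("wrong return type")
    else:
        raise TypeCheckError(f"unknown statement: {s!r}")

def check_stms(env, ret, ss):
    for s in ss:
        check_stm(env, ret, s)

body = [SIfElse(EBool(True), SInit("int", "i", EInt(1)), SInit("int", "j", EInt(2))),
        SReturn(EVar("i"))]
try:
    check_stms([{}], "int", body)
except TypeCheckError as err:
    print(err)   # variable i not in scope
```

The declaration of i lands in the temporary block of the first branch, so it is gone by the time return i; is checked.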
When calling a function, we need to provide arguments of the correct type.
Global environment Σ (for function signature) maps function names to function types.
bool foo (int x, double y) { ... }
Type of foo is bool(int,double), also written as (int,double)→bool.
Judgements version 3:
- Σ;Γ ⊢ e : t: In signature Σ and context Γ, expression e has type t.
- Σ;Γ ⊢ᵗ s ⇒ Δ: ...
- Σ;Γ ⊢ᵗ ss ⇒ Δ: ...
- Σ ⊢ d: In signature Σ, function definition d is well-formed.
Rule for function application:
Σ;Γ ⊢ e₁ : t₁ ... Σ;Γ ⊢ eₙ : tₙ
------------------------------- Σ(f) = t(t₁,...,tₙ)
Σ;Γ ⊢ f(e₁,...,eₙ) : t
Rule for function definition:
Σ; (x₁:t₁,...,xₙ:tₙ) ⊢ᵗ ss ⇒ Δ
---------------------------------
Σ ⊢ t f (t₁ x₁, ... tₙ xₙ) { ss }
The correctness of the signature, Σ(f) = t(t₁,...,tₙ), can be assumed in the last rule.
A program is a list of functions. Checking a program:
- Pass 1: Compute the signature Σ (check for duplicate function definitions!).
- Pass 2: In Σ, check the body ss of each function definition.
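The two passes, together with the rule for function application, can be sketched as follows. This is a simplified illustration under stated assumptions: a function body is reduced to a single returned expression (field result), the class names are hypothetical, and duplicate parameter names are not checked.

```python
from dataclasses import dataclass

class TypeCheckError(Exception):
    pass

@dataclass
class EInt:
    value: int

@dataclass
class EVar:
    name: str

@dataclass
class ECall:          # f(e₁,...,eₙ)
    fun: str
    args: list

@dataclass
class DFun:
    ret_type: str
    name: str
    params: list      # list of (type, name) pairs
    result: object    # simplification: the body is one returned expression

def infer(sigma, gamma, e):
    """Σ;Γ ⊢ e : t for the small expression fragment above."""
    if isinstance(e, EInt):
        return "int"
    if isinstance(e, EVar):
        if e.name not in gamma:
            raise TypeCheckError(f"variable {e.name} not in scope")
        return gamma[e.name]
    if isinstance(e, ECall):
        if e.fun not in sigma:
            raise TypeCheckError(f"unknown function {e.fun}")
        ret, param_types = sigma[e.fun]          # Σ(f) = t(t₁,...,tₙ)
        arg_types = [infer(sigma, gamma, a) for a in e.args]
        if arg_types != param_types:
            raise TypeCheckError(f"bad arguments in call to {e.fun}")
        return ret
    raise TypeCheckError(f"unknown expression: {e!r}")

def check_program(defs):
    # Pass 1: compute the signature Σ, rejecting duplicate definitions.
    sigma = {}
    for d in defs:
        if d.name in sigma:
            raise TypeCheckError(f"duplicate function {d.name}")
        sigma[d.name] = (d.ret_type, [t for t, _ in d.params])
    # Pass 2: check each body in Σ, with the parameters as initial context.
    for d in defs:
        gamma = {x: t for t, x in d.params}
        if infer(sigma, gamma, d.result) != d.ret_type:
            raise TypeCheckError(f"wrong return type in {d.name}")
    return sigma

program = [
    DFun("int", "main", [], ECall("id", [EInt(1)])),   # calls id before its definition
    DFun("int", "id", [("int", "x")], EVar("x")),
]
print(check_program(program))
```

Because Σ is complete before any body is checked, main may call id even though id is defined later in the program.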
The elaborating type checker will compute a new body ss' for each function definition, resulting in an elaborated program.
The result of type checking is a map from function names to their types and elaborated definitions.
Elaboration:
- insert coercions (already discussed)
- disambiguate overloaded operators +, *, ..., <, <=, ..., ==, ...
- ...
Type-annotating checker: annotate each subexpression with its type.
- Alternative 1 (book): put an annotation at each subexpression
  internal ECast. Exp ::= "(" Type ")" Exp;
  infer (Context Γ, Exp e): Maybe Exp
  infer (Γ, EDiv e₁ e₂):
    ECast t₁ e₁' ← infer (Γ, e₁)
    ECast t₂ e₂' ← infer (Γ, e₂)
    t = max t₁ t₂
    return (ECast t (EDiv (ECast t₁ e₁') (ECast t₂ e₂')))
- Alternative 2: design a new abstract syntax with type annotations in the needed places.
  ...
  TEDiv.    TExp ::= "div" Type TExp TExp;
  TSExp.    TStm ::= Type TExp ";" ;
  TSReturn. TStm ::= "return" Type TExp;
  ...