Aidan Ewart
Updated 04/06/2020
Please Note: The following has been deprecated due to the possibility of me being wrong.
The λ-Calculus, in Set Theory, in Coq
I've recently been reading about Peter Aczel's interpretation of 'Constructive Zermelo-Fraenkel Set Theory' (CZF) in Martin-Löf's Type Theory (MLTT). This got me thinking about the inverse interpretation: how would one go about creating a model for a simple type theory within a set theory like CZF? I resolved to experiment with constructing models of various logics in set theory; however, to do this, I first needed a set theory to experiment in.
Axiomatic Set Theory
CZF is an 'axiomatic' set theory, meaning that one may only 'construct' sets by using the particular rules, called axioms, which make up the theory. The reason for this is twofold; firstly, the axiomatization of the set theory avoids the possibility of subtle inconsistencies which can invalidate proofs - the commonly-cited example is Russell's paradox.
Secondly, we can limit the axioms of our system to those that are constructive, meaning that each axiom (abstractly) represents a 'computation' or 'algorithm' of some sort; such a system of axioms is called a constructive theory. This has the interesting property that proofs in constructive theories can be thought of as descriptions of computations: for example, a proof of A → B can be thought of as an algorithm for constructing a proof of B given a proof of A.
Using non-constructive theories, such as Zermelo-Fraenkel Set Theory (ZF), mathematicians can prove results which seem counterintuitive, such as the Banach-Tarski paradox (which relies on the non-constructive 'axiom of choice'). The distinction between constructive and non-constructive axioms and theories is mostly philosophical, as some non-constructive axioms (such as the law of the excluded middle or the powerset) may seem perfectly intuitive on the surface.
Most set theories are but an extension of an 'underlying language' of logic (classical logic for ZF, intuitionistic logic for CZF) with the aforementioned axioms for set construction. This underlying language usually includes the normal logical operators and quantifiers, and a statement in this language is called a predicate. Luckily for us, many logics have already been implemented as programs for computers. This means we need not check our proofs manually; the program automatically checks the soundness of our reasoning for us!
These systems are called 'automated theorem provers' and are quite a mature technology; I will extend one, Coq, with axioms for the construction of sets. The actual set theory I will use is called 'Intuitionistic Zermelo-Fraenkel Set Theory' (IZF); as the name suggests, it is similar to CZF, but is slightly 'less' constructive. It uses the following axioms:
- The 'axiom of extensional equality'. This says that if all the members of two sets are shared, then the two sets are equal.
- The 'axiom of the union set'. This says that if x is a set, then the union of x (written as ⋃x in Coq) is also a set.
- The 'axiom of the empty set'. This says that there is a set with no elements, which I write as ∅, both here and in Coq.
- The 'axiom of the pair'; if x and y are sets, then {x, y} is a set. When used in conjunction with the above two axioms, this means that we can write sets by listing their elements.
- The 'axiom of infinity'. Commonly used as a 'superset' of the natural numbers, this is a set ω such that ∅ ∈ ω and if x ∈ ω, then Suc x ∈ ω, where Suc x = x ∪ {x}. I write the sets in ω like the natural numbers they correspond to (for example writing 1 to mean Suc ∅), but it is important to be aware of the distinction to avoid confusion with actual natural numbers.
- The 'axiom of separation'. If X is a set, then we can construct a new set of all the members x ∈ X such that P(x) holds for some property P. Informally, this is commonly written {x ∈ X | P(x)}.
- The 'axiom of replacement'. This is similar to the above axiom, except instead of selecting the members of X that satisfy some property, we construct a new set of all the y where P(x, y) holds for some x ∈ X, for some functional predicate P (such a predicate is a property where for each x, a unique y with P(x, y) exists).
- The 'axiom of the powerset'. This states that for all , there is a set consisting of the subsets of . This axiom is non-constructive as there is no algorithm that (even given an infinite amount of time) can produce all the subsets of any infinite set; this is an immediate consequence of Cantor's diagonalization theorem.
- The 'axiom of ∈-induction'. This is essentially well-founded induction on the ∈ relation; if a property holding of all y ∈ x implies that the property holds of x, then the property holds of all sets.
I describe the process of creating a model (an 'encoding') for this set theory in Coq in addendum I. You can view the axiomatization of IZF in Coq here.
Somewhat amazingly, using only these axioms, one is able to 'construct' a decent amount of mathematics, as we shall soon see!
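To get a feel for how these axioms behave as 'computations', here is a small Python sketch of my own (not part of the Coq development) that models hereditarily finite sets as frozensets; the helper names `pair`, `union` and `suc` are mine, chosen to mirror the axioms of the pair, the union set and infinity.

```python
# Toy model of hereditarily finite sets as Python frozensets.
# Each helper mirrors one of the axioms described above; all names
# here are illustrative, not part of the Coq development.

empty_set = frozenset()  # axiom of the empty set

def pair(x, y):
    """Axiom of the pair: {x, y} is a set."""
    return frozenset({x, y})

def union(x):
    """Axiom of the union set: ⋃x collects the members of the members of x."""
    return frozenset(e for member in x for e in member)

def suc(x):
    """Suc x = x ∪ {x}, the successor operation from the axiom of infinity."""
    return union(pair(x, frozenset({x})))

# The finite 'numerals': 0 = ∅, 1 = {∅}, 2 = {∅, {∅}}, ...
one = suc(empty_set)
two = suc(one)
```

Note how the successor is itself built only from pairing and union, exactly as in the axiom of infinity.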
Set-Theoretic Constructions
As a first example construction, let us consider the Cartesian product of two sets, A × B. For any sets A and B, A × B should have the property that for any x ∈ A and y ∈ B, ⟨x,y⟩ ∈ A × B, where ⟨x,y⟩ denotes the ordered pair of x and y, constructed in the usual way as {{x},{x,y}}.
For a given x, we can construct the set {⟨x,y⟩ | y ∈ B} by the axiom of replacement and a predicate P defined as P(y, z) := z ≡ ⟨x,y⟩. Clearly, this predicate is functional. Call this process f; we can then apply replacement again, this time using f itself, to generate the indexed set {f(x) | x ∈ A}. Taking the union ⋃{f(x) | x ∈ A}, we have the Cartesian product A × B:
Definition cartesian_product (A B: set) :=
⋃(replacement (fun x (_: x ∈ A) => replacement (fun y (_: y ∈ B) => ⟨x,y⟩))).
Notation "A × B" := (cartesian_product A B) (at level 60, right associativity).
Lemma is_cartesian_product : forall A B x y, x ∈ A -> y ∈ B -> ⟨x,y⟩ ∈ A × B.
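The same construction can be replayed in a small Python toy model of sets as frozensets (my own sketch; `kpair`, `replacement` and `union` are names I chose to echo the Coq definitions):

```python
# Cartesian product via two nested replacements and a union, echoing
# the Coq definition above. Toy frozenset model; names are my own.

def kpair(x, y):
    """Kuratowski ordered pair ⟨x,y⟩ = {{x},{x,y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})

def replacement(f, s):
    """Axiom of replacement: the image {f(x) | x ∈ s}."""
    return frozenset(f(x) for x in s)

def union(s):
    """Axiom of the union set: ⋃s."""
    return frozenset(e for member in s for e in member)

def cartesian_product(a, b):
    """⋃{ {⟨x,y⟩ | y ∈ b} | x ∈ a }."""
    return union(replacement(lambda x: replacement(lambda y: kpair(x, y), b), a))
```

The inner replacement builds {⟨x,y⟩ | y ∈ b} for a fixed x, the outer one indexes that over a, and the union flattens the result, just as in the definition above.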
The disjoint union, or coproduct, of an indexed set f is similar to the union of f, except every element is 'tagged' with the index of the set which it is a member of. For example, if x ∈ f(i) for some index i, then ⟨i,x⟩ ∈ ∐f, where ∐ is the disjoint union operation.
The disjoint union of an indexed set can be constructed using the axioms of IZF, similarly to the construction of the Cartesian product. First, for a given i, take the replacement of f(i) with y ↦ ⟨i,y⟩, and call this operation g. The disjoint union is then the union of the replacement of g over the elements of the indexing set I. In set notation, this is ⋃{{⟨i,y⟩ | y ∈ f(i)} | i ∈ I}, which is easily represented in Coq (representing an indexed set as a function forall i, i ∈ I -> set):
Definition disjoint_union {I} (f: forall i, i ∈ I -> set) :=
⋃(replacement (fun j H => replacement (fun y (_: y ∈ f j H) => ⟨j,y⟩))).
Lemma is_disjoint_union :
forall {S f x y}
(p: x ∈ S),
y ∈ f x p -> ⟨x,y⟩ ∈ disjoint_union f.
That derivation is almost identical to our derivation for the Cartesian product! In fact, the Cartesian product is definitionally equal to the case of the disjoint union where f(x) is always B for any x ∈ A:
Lemma cartesian_product_is_extreme_disjoint_union :
forall {A B}, A × B ≡ disjoint_union (fun x (_: x ∈ A) => B).
Proof.
intros. reflexivity.
Qed.
This is a pretty good example of how a constructive approach can allow you to discover nuances about the properties of operations that would remain undiscovered otherwise.
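The same coincidence is easy to observe in a Python toy model of sets as frozensets (my own sketch with my own helper names): building the disjoint union with a constant family produces literally the same set as the Cartesian product.

```python
# Disjoint union of an indexed family in the toy frozenset model;
# a constant family recovers the Cartesian product exactly.

def kpair(x, y):
    """Kuratowski ordered pair ⟨x,y⟩ = {{x},{x,y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})

def replacement(f, s):
    """Image {f(x) | x ∈ s} (axiom of replacement)."""
    return frozenset(f(x) for x in s)

def union(s):
    """⋃s (axiom of the union set)."""
    return frozenset(e for member in s for e in member)

def disjoint_union(index_set, f):
    """⋃{ {⟨i,y⟩ | y ∈ f(i)} | i ∈ I }: each element tagged with its index."""
    return union(replacement(lambda i: replacement(lambda y: kpair(i, y), f(i)),
                             index_set))

def cartesian_product(a, b):
    """The disjoint union of a constant family is exactly A × B."""
    return disjoint_union(a, lambda _: b)
```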
We can also define a binary version of the union and disjoint union operations, where the binary disjoint union of two sets is written A + B, similarly to how the cartesian product of two sets is written A × B. It is defined by letting the indexing set be the set {∅, Suc ∅}, the set containing two distinct elements (in this case ∅ and Suc ∅). Elements of A + B are therefore 'tagged', either existing as ⟨∅,a⟩ for some a ∈ A or as ⟨Suc ∅,b⟩ for some b ∈ B.
Definition union2 x y :=
⋃(x +> (y +> ∅)).
Notation "x ∪ y" := (union2 x y) (at level 60).
Lemma union2_prop : forall {A B x},
x ∈ A ∪ B <-> x ∈ A \/ x ∈ B.
Definition left x := ⟨∅,x⟩.
Definition right x := ⟨Suc ∅,x⟩.
Definition disjoint_union2 x y :=
(replacement (fun a (_: a ∈ x) => left a)) ∪ (replacement (fun b (_: b ∈ y) => right b)).
Notation "x + y" := (disjoint_union2 x y) (at level 60).
Lemma disjoint_union2_prop : forall {A B x},
x ∈ A + B <-> (exists a, x ≡ left a /\ a ∈ A) \/ (exists b, x ≡ right b /\ b ∈ B).
We can derive a stronger form of our axiom of infinity by separating out the elements of ω that satisfy the inductive property. The inductive property, which is written Ind(n), says that 'if P(∅) and for all x we can derive P(x) → P(Suc x), then we can derive P(n), for any property P'; formally, Ind(n) := ∀P, (P(∅) ∧ ∀x, P(x) → P(Suc x)) → P(n).
If a given n ∈ ω has the inductive property, then we say that n is in the standard model of arithmetic - this is pretty much the same as saying n can be expressed by repeating the Suc operation over the empty set a finite number of times.
It is important to note that the inductive property cannot be written in 'first-order' logic, where only quantification over sets is allowed; in order to express the inductive property correctly, we must quantify over all propositional functions P, which requires 'second-order' logic. It is possible to construct the standard model of arithmetic in ZF using only first-order logic, but not to express the inductive property in it.
We can construct the standard model of arithmetic (which we call N) by selecting the elements of ω that have the inductive property: N := {n ∈ ω | Ind(n)}, where Ind is the inductive property above.
As, by definition, all elements of N have the inductive property, we can prove that one can perform mathematical induction over N itself; if, for any property P, both P(∅) and ∀n, P(n) → P(Suc n) hold, then P(n) holds for all n ∈ N. This makes N so useful for further constructions that it is sometimes even stated axiomatically, as the axiom of strong infinity.
The exponential set of functions from B to A has a pretty obvious construction; all functions from B to A are subsets of B × A, and so to construct the exponential we can simply restrict the powerset of B × A to only those subsets R which are functional (for all a, if ⟨a,b⟩ ∈ R for some b, then b is unique) and left-total (for all a ∈ B, there exists a b such that ⟨a,b⟩ ∈ R), using separation:
Definition functional (R: set) :=
forall {a b}, ⟨a,b⟩ ∈ R -> forall {c}, ⟨a,c⟩ ∈ R -> b ≡ c.
Definition left_total (R D: set) :=
forall {a}, a ∈ D -> exists {b}, ⟨a,b⟩ ∈ R.
Definition exponential (A B: set) :=
separation (fun R => functional R /\ left_total R B) (powerset (B × A)).
Lemma member_exp_function :
forall {A B R}, R ∈ exponential A B -> functional R /\ left_total R B.
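In the Python toy model this construction can be enumerated directly (a sketch under my own names; for readability I use plain Python tuples in place of Kuratowski pairs, and `combinations` to enumerate the powerset, which is exactly the non-constructive step):

```python
# Exponential set: functions from b to a, carved out of powerset(b × a)
# by separation. Toy model with tuples as ordered pairs; names are mine.
from itertools import combinations

def powerset(s):
    """All subsets of s, as frozensets (the non-constructive step)."""
    elems = list(s)
    return frozenset(frozenset(c) for r in range(len(elems) + 1)
                     for c in combinations(elems, r))

def functional(rel):
    """Each input is related to at most one output."""
    outs = {}
    for x, y in rel:
        if x in outs and outs[x] != y:
            return False
        outs[x] = y
    return True

def left_total(rel, dom):
    """Every element of dom is related to something."""
    return all(any(x == d for x, _ in rel) for d in dom)

def exponential(a, b):
    """Functions from b to a: separation over powerset(b × a)."""
    prod = frozenset((x, y) for x in b for y in a)
    return frozenset(r for r in powerset(prod)
                     if functional(r) and left_total(r, b))
```

With two-element domain and codomain there should be exactly 2² = 4 functions.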
From the exponential set and the principle of ∈-induction, we can actually derive some useful and surprising tools for constructing more complex sets, such as the principle of ∈-recursion. The principle of ∈-recursion states that for every formula F of the underlying language such that for any x and g there is a unique y with F(x, g, y), there exists a formula G such that G(x) = F(x, G restricted to the members of x). Expressed as a function, this says that the value of G at x may be computed from the values of G at the members of x.
However, in order to prove ∈-recursion, we must first prove that the 'transitive closure' of any set is itself a set. The transitive closure of any set x, TC(x), is defined by the properties x ⊆ TC(x) and ∀y ∈ TC(x), y ⊆ TC(x) - that is, if for any z and x there is a chain of the form z ∈ y₁ ∈ ⋯ ∈ yₙ ∈ x, then z ∈ TC(x). Notice that TC(x) is itself a transitive set. This is provable inside our model using ∈-induction and collection:
Lemma TC : forall x, exists TCx,
(forall y, y ∈ x -> y ∈ TCx)
/\ (forall y, y ∈ TCx -> forall z, z ∈ y -> z ∈ TCx).
Using the transitive closure, we can prove the existence of -recursion as follows:
For any formula F as above, and a set x which we are recursing over, consider 'partial recursions' over x: functions g whose domain is the transitive closure TC(x) together with x itself, and which satisfy the recursion equation at every element of that domain. Using ∈-induction, one can prove that any two partial recursions agree wherever both are defined, so the partial recursion over a given domain, if it exists, is unique. A second application of ∈-induction, together with replacement, then shows that a partial recursion exists for every x; by the property of the transitive closure, the partial recursions for the members of x can be collected and combined into one for x itself. Taking G(x) to be the value at x of the partial recursion over TC(x) ∪ {x}, the uniqueness property above guarantees that G is the function we want. The transitive closure is handled once and for all inside this construction, meaning we can safely ignore the added complexity of TC from now on.
Formalizing the reasoning above is somewhat out of the scope of this project, so I just add an axiom implementing the functionality of ∈-recursion:
Axiom ϵ_rec : (forall x, (forall y, y ∈ x -> set) -> set) -> set -> set.
Axiom ϵ_rec_prop : forall {F x},
ϵ_rec F x ≡ F x (fun y _ => ϵ_rec F y).
This is the last axiom I will add to the system; everything else is derived from the Coq model described in addendum I.
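To see what ϵ_rec buys us, here is an analogue in a Python toy model of sets as frozensets (my own sketch): recursion on membership terminates because ∈ is well-founded, and a classic example is computing the rank of a set.

```python
# ∈-recursion in the toy frozenset model: eps_rec f x = f(x, recursive
# results for the members of x), mirroring the ϵ_rec axiom. Names mine.

def eps_rec(f, x):
    """Terminates because membership in hereditarily finite
    frozensets is well-founded (no infinite descending ∈-chains)."""
    return f(x, {y: eps_rec(f, y) for y in x})

def rank_step(x, recs):
    """rank(x) = sup { rank(y) + 1 | y ∈ x }, and 0 for the empty set."""
    return max((recs[y] + 1 for y in x), default=0)

empty = frozenset()
one = frozenset({empty})
two = frozenset({empty, one})
```

Notice the shape of `eps_rec` is exactly the shape of the ϵ_rec_prop axiom: the step function receives the set together with the recursive results on its members.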
Inductively Defined Sets
We now have enough theory to begin constructing some interesting sets!
A set is 'inductively defined' if all members of the set can be finitely described in terms of operations on other members of that set - a classic example is the set of natural numbers:
- 0 is a natural number
- Suc n is a natural number if n is a natural number
You can see that the set of λ-calculus terms is also inductively defined - each term is either a constant or 'base case' (0 for the natural numbers, or variables for λ-terms) or a composition of terms already in the set (Suc n in the example above, abstraction and application for λ-terms). We can formalize this notion of inductive definability by introducing the concept of a 'description' of an inductively defined set.
The description of a set S is a class function Γ such that S is the 'least fixed point' of Γ - that is, S is the smallest set such that Γ(S) ≡ S.
As an example, consider the description of the natural numbers, Γ_nat(X) = X + 1. Note that here the meaning of + is the disjoint union, as X and (confusingly) 1 are both sets: 1 is the 'unit set' {∅} containing a single element, ∅. By definition of the disjoint union, the least fixed point of Γ_nat must contain right ∅; iterating, we see that it must also contain left (right ∅), as well as left (left (right ∅)), and so on. If we define zero as right ∅ and suc n as left n, then the similarities with the inductive definition we gave above become clear.
It is often possible to gain an understanding of what the least fixed point of a given description will 'look' like by inspecting the description itself. Let a 'polynomial description' be a description in the form Γ(X) = C₀ + (C₁ × X) + (C₂ × X × X) + ⋯, where the Cᵢ are expressions independent of X. Considering only these descriptions, each 'term' of the polynomial - Cᵢ × Xⁱ - is an operation combining i other elements of X (and an element of Cᵢ) to construct a new element of Γ(X). When i = 0, the 'operations' are constant; these are the base cases of the inductive definition.
For example, we can write the description for the terms of the λ-calculus as Γ(X) = V + (V × X) + (X × X), given a set of variables V. The constant term V is the base case - we can construct a λ-term from a variable. The second term states that we can construct a λ-term from another λ-term and a variable - this is the abstraction λx.M. Finally, with two λ-terms - M and N - we can construct the application M N, which is itself a λ-term.
We can 'isolate' the base cases of any polynomial by taking Γ(∅); any other, varying term must be in the form Cᵢ × ∅ⁱ for i > 0. Since ∅ has no members, the Cartesian product of any set with ∅ must also have no members. As these constant terms are always 'included' in Γ(X) for all X, we have Γ(∅) ⊆ Γ(X). We can use this property to algorithmically construct the least fixed point for Γ.
The least fixed point of any Γ should contain all base cases - Γ(∅) - all terms constructed from the base cases - Γ(Γ(∅)) - and so on. This gives us a candidate for the least fixed point, ⋃_{n ∈ ω} Γⁿ⁺¹(∅), alternatively expressed as Γ(∅) ∪ Γ(Γ(∅)) ∪ Γ(Γ(Γ(∅))) ∪ ⋯. In order to prove that this actually is a fixed point of Γ, we need to prove Γ(⋃ₙ Γⁿ⁺¹(∅)) ≡ ⋃ₙ Γⁿ⁺¹(∅).
The two operators used in polynomial descriptions are both distributive over the binary union operator, meaning that A × (B ∪ C) ≡ (A × B) ∪ (A × C) and A + (B ∪ C) ≡ (A + B) ∪ (A + C). This in turn means that polynomial descriptions are distributive over the binary union: Γ(A ∪ B) ≡ Γ(A) ∪ Γ(B). We now have Γ(⋃ₙ Γⁿ⁺¹(∅)) ≡ ⋃ₙ Γⁿ⁺²(∅).
The property that Γ(∅) ⊆ Γ(X) for all X means that Γ(∅) ∪ ⋃ₙ Γⁿ⁺²(∅) ≡ ⋃ₙ Γⁿ⁺¹(∅), completing the proof.
We can use ∈-recursion to construct infinite sets like the least fixed point above. I define a function ϵ_iter to iterate a function over a set recursively:
Definition ϵ_iter Γ := ϵ_rec (fun _ f => ⋃(replacement (fun x H => Γ(f x H)))).
Lemma ϵ_iter_prop : forall {Γ X},
ϵ_iter Γ X = ⋃ (replacement (fun Y _ => Γ (ϵ_iter Γ Y))).
We can then define the least fixed point of any polynomial Γ as that polynomial iterated over ω. As ω contains arbitrarily-nested sets, ϵ_iter Γ ω must contain arbitrary iterations of Γ. We can, for instance, use this to construct the naturals and binary trees (trees where every non-leaf node has two children).
Definition lfp Γ := ϵ_iter Γ ω.
Definition Γ_nat X := X + (Suc ∅).
Definition zero := right ∅.
Definition suc n := left n.
Definition Γ_tree X := (X × X) + (Suc ∅).
Definition leaf := right ∅.
Definition branch l r := left ⟨l,r⟩.
Definition nats := lfp Γ_nat.
Definition trees := lfp Γ_tree.
We can then use the same methodology as before to prove that nats
and trees
are fixed points for their respective signatures, and use this fact to prove properties expected of them:
Lemma nat_lfp : Γ_nat nats ≡ nats.
Lemma zero_in_nats : zero ∈ nats.
Lemma suc_in_nats : forall {n}, n ∈ nats -> suc n ∈ nats.
Lemma tree_lfp : Γ_tree trees ≡ trees.
Lemma leaf_in_trees : leaf ∈ trees.
Lemma branch_in_trees : forall {l r},
l ∈ trees -> r ∈ trees -> branch l r ∈ trees.
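The least-fixed-point construction can be imitated in plain Python by iterating a polynomial description finitely many times (a sketch with my own names; the 'L'/'R' tags stand in for the left/right injections of the disjoint union, and bounded iteration approximates ϵ_iter over ω):

```python
# Approximating lfp Γ_nat by bounded iteration of the description.
# Tags stand in for the disjoint-union injections; names are mine.

def left(x):
    return ('L', x)   # stands in for ⟨∅, x⟩

def right(x):
    return ('R', x)   # stands in for ⟨Suc ∅, x⟩

def gamma_nat(X):
    """Γ_nat X = X + 1: a wrapped predecessor, or the single base case."""
    return frozenset(left(x) for x in X) | frozenset({right('unit')})

def lfp_approx(gamma, n):
    """Union of the first n iterations: Γ(∅) ∪ Γ²(∅) ∪ ⋯ ∪ Γⁿ(∅)."""
    stage, acc = frozenset(), frozenset()
    for _ in range(n):
        stage = gamma(stage)
        acc |= stage
    return acc

zero, suc = right('unit'), left
nats5 = lfp_approx(gamma_nat, 5)
```

The first iteration contributes only the base case, and each further iteration wraps everything built so far in one more successor, just as argued above.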
This approach generalizes quite well and so can be used to implement the syntax of the λ-calculus.
Terms of the λ-Calculus
For an explanation of the λ-calculus see addendum II. From now on I'll go ahead and assume you already know what it is, so be warned!
To keep complexity to a minimum, I will create a model of the λ-calculus using de Bruijn indices instead of the normal variables. Problems can arise when normalizing λ-terms, meaning you have to rename bound variables in order to avoid 'name conflicts'. Consider the evaluation of (λx.λy.x) y. Naively performing a β-reduction on this term yields λy.y, which is probably not what was intended - the free y has suddenly become bound.
To avoid this issue, de Bruijn instead used natural numbers to represent variables, and removed variables from abstractions altogether; when a natural number variable occurs in an expression, the 'value' of the variable directly refers to the abstraction to which it is bound. Specifically, a variable n has n other abstractions between it and the abstraction to which it is bound. For example, using de Bruijn indices, the expression λx.x becomes λ.0 and the expression λx.λy.x becomes λ.λ.1.
Unfortunately, de Bruijn indices are not an entirely free lunch. When substituting using de Bruijn indices, we need to increment the free variables in the expression being substituted every time we pass a λ-abstraction, so they do not mistakenly become bound. For example, we would not like to β-reduce (λ.λ.1) 0 to λ.0; instead we have to 'lift' the free variable in the substituted expression to reduce the term to λ.1.
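The lifting and substitution rules just described are mechanical enough to sketch directly in Python (a hypothetical tuple encoding of my own, not the set-theoretic one used below):

```python
# De Bruijn terms as tuples: ('var', n), ('lam', body), ('app', f, a).
# lift and subst follow the rules described in the text; names are mine.

def lift(i, t):
    """Increment free variables (indices >= i) by one."""
    if t[0] == 'var':
        return t if t[1] < i else ('var', t[1] + 1)
    if t[0] == 'lam':
        return ('lam', lift(i + 1, t[1]))          # one more binder passed
    return ('app', lift(i, t[1]), lift(i, t[2]))

def subst(i, st, t):
    """Replace variable i in t with st, lifting st under each binder."""
    if t[0] == 'var':
        return st if t[1] == i else t
    if t[0] == 'lam':
        return ('lam', subst(i + 1, lift(0, st), t[1]))
    return ('app', subst(i, st, t[1]), subst(i, st, t[2]))
```

Substituting the free variable 0 into the body λ.1 of (λ.λ.1) correctly yields λ.1, because the substituted variable is lifted when passing the inner binder; a naive substitution would have produced the capturing λ.0.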
λ-terms using de Bruijn indices have the following syntax:
- any natural number (member of ω) is a term
- if M is a term, then the abstraction λ M is a term
- if M and N are terms, then the application M N is a term
From this we can derive Γ_lam(X) = ω + X + (X × X) as a description for terms of the λ-calculus:
Definition Γ_lam X := ω + X + X × X.
Definition var n := left (left n).
Definition lam m := left (right m).
Definition app m n := right ⟨m,n⟩.
Since Γ_lam
is a polynomial function, we can create the set of terms in exactly the same way as we did for nats
and trees
above:
Definition terms := lfp Γ_lam.
Lemma lam_lfp : Γ_lam terms ≡ terms.
Lemma var_term : forall {n}, n ∈ ω -> var n ∈ terms.
Lemma lam_term : forall {m}, m ∈ terms -> lam m ∈ terms.
Lemma app_term : forall {m n},
m ∈ terms -> n ∈ terms -> app m n ∈ terms.
We can use this to construct, for example, the identity function λ.0 and prove that it is a term:
Definition id := lam (var ∅).
Lemma id_term : id ∈ terms.
Inductively Defined Functions
Similarly to how we can use inductive definitions to define sets, we can use inductive definitions to describe and construct recursive functions. Consider the recursive function f for converting from our inductive definition of the naturals to ω: f(zero) = ∅ and f(suc n) = Suc (f(n)).
To construct a description Γ_f for this function (such that lfp Γ_f ≡ f), notice that the zero case is non-recursive; therefore, ⟨zero,∅⟩ should be a base case and an element of Γ_f(∅). Then, for the recursive case, with every iteration of Γ_f, we can 'add' a new mapping ⟨suc n, Suc m⟩ to f, provided that f(n) = m; given the set-theoretic encoding of functions, this is the same as saying ⟨n,m⟩ ∈ f. Iterating like we did for inductive definitions should then cover all cases of f.
Unlike in a normal inductive definition, we do not want to be able to discriminate between the base and recursive cases of Γ_f, so we should use the union instead of the disjoint union: Γ_f(X) = {⟨zero,∅⟩} ∪ {⟨suc n, Suc m⟩ | ⟨n,m⟩ ∈ X}.
Essentially, given an equation for a recursive function f, one can express the 'term' of the description corresponding to that equation by replacing each recursive application f(x) = y in the right-hand side of the equation with the membership condition ⟨x,y⟩ ∈ X; a case may then be 'added' to the function once the recursive applications it depends on are already members of the function. We can write definitions for (both total and partial) general recursive functions using this method - a general recursive function being one that you can program an algorithm to compute. For a nontrivial example, see addendum III.
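In a Python toy model, iterating such a description really does build up the graph of the conversion function one mapping per step (a sketch with my own names; I use Python ints in place of the elements of ω, and 'L'/'R' tags in place of the disjoint-union injections):

```python
# Building the graph of f : nats → ω by iterating its description Γ_f.
# Each pass adds the mappings whose recursive premise is already present.

def left(x):
    return ('L', x)          # suc n is encoded as left n

def right(x):
    return ('R', x)          # zero is encoded as right ∅

zero = right('unit')
suc = left

def gamma_f(X):
    """One step of the description: the base case f(zero) = 0, plus
    the mapping (suc n, m + 1) for every (n, m) already in X."""
    return frozenset({(zero, 0)}) | frozenset((suc(n), m + 1) for n, m in X)

graph = frozenset()
for _ in range(4):           # four iterations of Γ_f
    graph |= gamma_f(graph)
```

After n iterations the graph contains the conversion for every natural built from at most n - 1 successors, mirroring how the union over ω covers all cases.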
An Evaluator for the λ-Calculus
With the ability to create arbitrary recursive functions we can finally encode an evaluator for the λ-calculus within the set theory. I will be translating a minimal Haskell evaluator for λ-terms, with each Haskell function corresponding to a set-theoretic one.
I wanted to keep my evaluator as simple as possible, so I opted to write a 'call-by-name weak-head-normal-form' (CBN WHNF) evaluator; 'call-by-name' means that a function's argument is not evaluated to normal form before being substituted into the function's body, while 'weak-head-normal-form' means that any abstraction λ.M is in normal form, even if M is not.
First off, the Haskell implementation defines a function, lift
, to handle the lifting of free variables in de Bruijn-indexed λ-terms as described above. I have assigned new variables to each function call to simplify the translation process, and to better display the similarities between the Haskell and IZF implementations:
data Term
= Var Int
| Lam Term
| App Term Term
lift :: (Int, Term) -> Term
lift (i, Var v) | v < i = Var v
lift (i, Var v) | v >= i = Var (v+1)
lift (i, Lam b) =
let c = lift (i+1, b) in Lam c
lift (i, App f a) =
let g = lift (i, f)
b = lift (i, a)
in App g b
On ordinals (such as the members of ω), the ∈ relation corresponds to <, while the ⊆ relation corresponds to ≤. Using this, we can write a description for the above function, using ω instead of Int:
We can now use this description to construct the function itself.
To express this in our Coq model of set theory, I attempt to replicate the usual set-builder notation. Unfortunately, the notation {x | P}
is already used in Coq, so I have to make do with a textual form; for x, y, z ∈ S, exp
is simply the repeated application of the axioms of replacement on exp
, introducing x
, y
and z
as variables bound within the replacement. The expression given P, exp
is a strange one; if a proof of P
cannot be provided, it is the empty set. Otherwise, it is the set exp
. I use this to emulate the conditional part of set-builder notation:
Notation "'for' x , .. , y ∈ X , Z" :=
(⋃ (@replacement X (fun x _ => .. (⋃ (@replacement X (fun y _ => Z))) ..)))
(at level 200, x closed binder, y closed binder).
Notation "'given' P , Z" := (⋃ (selection (fun _ => P) (Z+>∅))) (at level 200, P at level 200).
Definition liftΓ X :=
(for x, y ∈ ω, given y ∈ x, ⟨⟨x,var y⟩,var y⟩+>∅)
∪ (for x, y ∈ ω, given x ⊆ y, ⟨⟨x,var y⟩,var (Suc y)⟩+>∅)
∪ (for b, c ∈ terms, for x ∈ ω,
given ⟨⟨Suc x,b⟩,c⟩ ∈ X,
⟨⟨x,lam b⟩,lam c⟩+>∅)
∪ (for f, a, g, b ∈ terms, for x ∈ ω,
given ⟨⟨x,f⟩,g⟩ ∈ X,
given ⟨⟨x,a⟩,b⟩ ∈ X,
⟨⟨x,app f a⟩,app g b⟩+>∅).
Definition lift := lfp liftΓ.
We can use the same process to construct the evaluator's subst
function:
subst :: (Int, Term, Term) -> Term
subst (i, st, Var v) | i == v = st
subst (i, st, Var v) | i /= v = Var v
subst (i, st, App f a) =
let g = subst (i, st, f)
b = subst (i, st, a)
in App g b
subst (i, st, Lam b) =
let c = subst (i+1, lift (0, st), b)
in Lam c
When we want to 'call' an already-defined function f from within our set theory, we simply need to check that ⟨x,y⟩ ∈ f for the appropriate x and y, which can be seen in the lam
branch of the subst
function:
Definition substΓ X :=
(for v ∈ ω, for st ∈ terms, ⟨⟨⟨v,st⟩,var v⟩,st⟩+>∅)
∪ (for i, v ∈ ω, for st ∈ terms, given i <> v,
⟨⟨⟨i,st⟩,var v⟩,var v⟩+>∅)
∪ (for i ∈ ω, for st, f, g, a, b ∈ terms,
given ⟨⟨⟨i,st⟩,f⟩,g⟩ ∈ X,
given ⟨⟨⟨i,st⟩,a⟩,b⟩ ∈ X,
⟨⟨⟨i,st⟩,app f a⟩,app g b⟩+>∅)
∪ (for i ∈ ω, for st, st', b, c ∈ terms,
given ⟨⟨∅,st⟩,st'⟩ ∈ lift,
given ⟨⟨⟨Suc i,st'⟩,b⟩,c⟩ ∈ X,
⟨⟨⟨i,st⟩,lam b⟩,lam c⟩+>∅).
Definition subst := lfp substΓ.
The CBN WHNF evaluation function has only two cases; Lam b
is already in normal form, so evaluates to itself, while App f a
first evaluates f
to an abstraction Lam b
, before substituting a
in b
. A free variable has no computation rules and is never evaluated. This means that the eval
function has only two cases and is extremely simple:
eval :: Term -> Term
eval (Lam b) = Lam b
eval (App f a) =
let (Lam b) = eval f
sb = subst (0, a, b)
vb = eval sb
in vb
This is easily translated into IZF:
Definition evalΓ X :=
(for b ∈ terms, ⟨lam b,lam b⟩+>∅)
∪ (for f, a, b, sb, vb ∈ terms,
given ⟨f,lam b⟩ ∈ X,
given ⟨⟨⟨∅,a⟩,b⟩,sb⟩ ∈ subst,
given ⟨sb,vb⟩ ∈ X,
⟨app f a,vb⟩+>∅).
Definition eval := lfp evalΓ.
Now that we have an eval
function encoded within our set theory, we can finally try it out! Perhaps the simplest evaluation that this system can express is id id, the identity function applied to itself, which simply evaluates to id. To prove this (i.e. that ⟨app id id,id⟩ ∈ eval) we first need to prove that ⟨id,id⟩ ∈ eval. This comes from the first part of our definition of evalΓ:
Lemma eval_lam : forall {b}, ⟨lam b,lam b⟩ ∈ eval.
Lemma eval_id : ⟨id,id⟩ ∈ eval.
We can also prove that id substituted into the variable ∅ (which is the body of id) is just id, and that id is constant under the lift operation:
Lemma id_lift : ⟨⟨∅,id⟩,id⟩ ∈ lift.
Lemma id_subst : ⟨⟨⟨∅,id⟩,id⟩,id⟩ ∈ subst.
Using these intermediate proofs, we can prove that id id evaluates to id in our model!
Lemma eval_id_id : ⟨app id id,id⟩ ∈ eval.
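The whole pipeline can be replayed outside the set theory with a small, self-contained Python sketch of the same CBN WHNF evaluator (my own tuple encoding of de Bruijn terms, mirroring the Haskell code above):

```python
# CBN weak-head evaluator over de Bruijn terms encoded as tuples:
# ('var', n), ('lam', body), ('app', f, a). Names are my own.

def lift(i, t):
    """Increment free variables (indices >= i) by one."""
    if t[0] == 'var':
        return t if t[1] < i else ('var', t[1] + 1)
    if t[0] == 'lam':
        return ('lam', lift(i + 1, t[1]))
    return ('app', lift(i, t[1]), lift(i, t[2]))

def subst(i, st, t):
    """Replace variable i in t with st, lifting st under each binder."""
    if t[0] == 'var':
        return st if t[1] == i else t
    if t[0] == 'lam':
        return ('lam', subst(i + 1, lift(0, st), t[1]))
    return ('app', subst(i, st, t[1]), subst(i, st, t[2]))

def eval_whnf(t):
    """Abstractions are values; applications evaluate the function,
    substitute the unevaluated argument, and continue."""
    if t[0] == 'lam':
        return t
    if t[0] == 'app':
        body = eval_whnf(t[1])[1]       # must reduce to ('lam', body)
        return eval_whnf(subst(0, t[2], body))
    raise ValueError('free variables have no evaluation rule')

ident = ('lam', ('var', 0))             # the identity function λ.0
```

Applying the identity to itself reduces to the identity, which is exactly the fact proved set-theoretically in eval_id_id.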
Verifying the Evaluator
We can prove that our evaluator is correct by proving that it is equivalent to a 'trusted' Coq implementation of the λ-calculus. First, then, we need to create such an implementation. We'll start with λ-terms using de Bruijn indices:
Inductive lambda_term : Set :=
| Lam : lambda_term -> lambda_term
| Var : nat -> lambda_term
| App : lambda_term -> lambda_term -> lambda_term.
We can then define the lifting function using an inductively defined relation:
Inductive lift_lambda_term (n: nat) :
lambda_term -> lambda_term -> Prop :=
| lift_Var_lt (v: nat) :
v < n -> lift_lambda_term n (Var v) (Var v)
| lift_Var_ge (v: nat) :
v >= n -> lift_lambda_term n (Var v) (Var (v + 1))
| lift_Lam (a b: lambda_term) :
lift_lambda_term (n + 1) a b -> lift_lambda_term n (Lam a) (Lam b)
| lift_App (f g x y: lambda_term) :
lift_lambda_term n f g
-> lift_lambda_term n x y
-> lift_lambda_term n (App f x) (App g y).
Substitution and evaluation can then be implemented similarly:
Inductive subst_lambda_term (x: nat) (m: lambda_term) :
lambda_term -> lambda_term -> Prop :=
| subst_Var_eq :
subst_lambda_term x m (Var x) m
| subst_Var_neq (y: nat) :
x <> y -> subst_lambda_term x m (Var y) (Var y)
| subst_Lam (o p q: lambda_term) :
lift_lambda_term 0 m o
-> subst_lambda_term (x + 1) o p q
-> subst_lambda_term x m (Lam p) (Lam q)
| subst_App (n o p q: lambda_term) :
subst_lambda_term x m n o
-> subst_lambda_term x m p q
-> subst_lambda_term x m (App n p) (App o q).
Inductive eval_lambda_term :
lambda_term -> lambda_term -> Prop :=
| eval_Lam (a: lambda_term) :
eval_lambda_term (Lam a) (Lam a)
| eval_App (f a b sb vb: lambda_term) :
eval_lambda_term f (Lam b)
-> subst_lambda_term 0 a b sb
-> eval_lambda_term sb vb
-> eval_lambda_term (App f a) vb.
In order to prove some sort of equivalence between these different models of the λ-calculus, we need to be able to convert between lambda_term
and set
. However, not every set
is a valid -term, so we make do with a one-way conversion:
Fixpoint nat_to_set (n: nat) : set :=
match n with
| 0 => zero
| S x => suc (nat_to_set x)
end.
Fixpoint lambda_term_to_set (l: lambda_term) : set :=
match l with
| Lam b => lam (lambda_term_to_set b)
| App f a => app (lambda_term_to_set f) (lambda_term_to_set a)
| Var x => var (nat_to_set x)
end.
The two models are equivalent if any λ-term a evaluates to b in the type-theoretic model if and only if lambda_term_to_set a evaluates to lambda_term_to_set b in the set-theoretic model. The 'only if' direction is the easiest to prove:
Lemma all_evaluations_hold :
forall (a b: lambda_term),
eval_lambda_term a b
-> ⟨lambda_term_to_set a, lambda_term_to_set b⟩ ∈ eval.
When writing the 'if' direction, however, we run into a problem: the irreversible nature of the lambda_term_to_set
function means we cannot simply flip the arrows of the above proposition. Doing so would mean that the set-theoretic model could still include evaluations that do not correspond to any -term. Instead we prove the following proposition:
Lemma only_evaluations_hold :
forall (a b : set),
⟨a, b⟩ ∈ eval
-> exists (x y: lambda_term),
eval_lambda_term x y
/\ lambda_term_to_set x = a
/\ lambda_term_to_set y = b.
And with that we have proved the set-theoretic model of the λ-calculus correct!
It would be nice to prove ∈-recursion correct within the Coq model of IZF, but that would require some considerable proof automation on my part; even the proofs for simple evaluations were quite long! In any case, this was fun to mess around with, and it served as a nice introduction to constructive set theory, which I hadn't encountered previously.