# Modeling Languages

## Languages are a solved problem
* They were solved in the 2010s with the emergence of JavaScript and Rust and various frameworks.
* They were solved in the 2000s with the emergence of Python and Perl and C#.
* They were solved in the 1990s with the emergence of C++ and Java.
* They were solved in the 1980s with the emergence of C. We have C. We wrote Unix and Linux with C. We don't need anything else.
* They were solved in the 1970s with the emergence of Pascal and Simula. Some really smart people told me that's all I would need to know and I learned them well. I never wrote a line of deliverable code in either language.
* They were solved in the 1960s with the emergence of Lisp, COBOL and FORTRAN. We still use all three.
* They were solved in the 1950s with ENIAC and MANIAC and JOHNNIAC.
* They were solved in the 1930s with the Universal A-Machine and Lambda calculus. We still use both.

Hopefully you get the picture.

## Every application of any reasonable size has a built in customization language:
* Emacs - elisp
* AutoCAD - Lisp
* Word - Visual Basic
* Excel - scripting language
* Unix - shells and shell scripts
* browsers - HTML
* all the things that read XML
* gaming engines
* protocols
* yacc, lex, and friends
* any program that reads a config file

Studying languages makes us better programmers and gives us insight into the code we write.
Language is how we represent information and knowledge.
## How we think of languages
When we learn or use languages as programmers, we tend to think about them in several ways:
* syntax
* behavior associated with syntax (semantics)
* libraries
* programming idioms

Syntax is a solved problem and largely social. Syntax tells us very little about a language:
```text
a[25]              -- Java: access of array element 25
(vector-ref a 25)  -- Scheme: access of vector element 25
a[25]              -- C: access of array element 25 with no boundary checks
a [25]             -- Haskell: call of a on a list of length 1
```
Syntax is a fickle friend.
Libraries help us program, but can actually make language study harder.
Idioms, like syntax, are a study of sociological issues.
Semantics (what programs mean) is what we're interested in: precisely defining and implementing what a program means.
## Describing Meaning
Describe the meaning of each syntactic element of a language. This is critical for developing compilers and interpreters. We have to know what a language is *supposed* to do before we can determine if our tools are implemented correctly.
* Define the new language's concrete syntax
* Define the meaning of each syntactic element using a known language
* Evaluation semantics tells us what it does (execution)
* Static semantics tells us what we can predict (type checking)

Three ways of doing this (EECS 762):
* denotational - map each language structure to a mathematical function
* operational - define how legal strings in a language are evaluated
* axiomatic - define pre- and postconditions on execution of language constructs

We're going to do something very close to operational semantics. Shriram Krishnamurthi calls this *interpreter semantics*: we define a golden interpreter for our language. This is how many languages, including Verilog and OCaml, are defined.
## Compilers and Interpreters
Two primary styles for language processing:
* *Compilers* translate language structures into an executable form and throw the rest away
  * The *source* language is the language being translated
  * The *target* language is the language being targeted
* *Interpreters* define a function that executes language syntax directly
  * The *embedded* language is the language being interpreted
  * The *host* language is the known language defining the interpreter

We will be building interpreters. Our *host* language will be Haskell while our *embedded* language will evolve over the course of the semester. Most real languages are neither purely interpreted nor purely compiled.
## Defining Syntax
_Programs are data structures._
```haskell
AE ::= num
     | AE + AE
     | AE - AE
     | (AE)
```
AE is a set:
* How big? Infinite
* Recursive!
* Inductive

Examples:
```haskell
4
1 + 3
(2 + 2) + (5 - 7)
1 + 3 - (5 + (8 - 4))
```
* *concrete syntax* - what programmers write
* *abstract syntax* - what the interpreter operates over
```haskell
data AE where
  Num :: Int -> AE
  Plus :: AE -> AE -> AE
  Minus :: AE -> AE -> AE
  deriving (Show,Eq)
```
* `AE` - type name
* `Num`, `Plus`, ... - Constructors construct elements of the type. ALL elements of the type.
* `AE -> AE ...` - Signature

We'll write an interpreter over AE.
This is not the standard syntax for a Haskell algebraic type, but instead uses the GADT form. It is equivalent to:
```haskell
data AE =
    Num Int
  | Plus AE AE
  | Minus AE AE
  deriving (Show,Eq)
```
* *parser* - concrete syntax -> abstract syntax

`"1+3" == (Plus (Num 1) (Num 3))`
```haskell
expr :: Parser AE
expr = buildExpressionParser operators term

operators = [ [ inFix "+" Plus AssocLeft
              , inFix "-" Minus AssocLeft ]
            ]

numExpr :: Parser AE
numExpr = do i <- integer lexer
             return (Num (fromInteger i))

term = parens lexer expr
       <|> numExpr

-- Parser invocation
parseAE = parseString expr
```
_Note_: This is not the abstract syntax we will actually use
Examples
```haskell
(parse "3") == (Num 3)
(parse "3 + 4") == (Plus (Num 3) (Num 4))
(parse "((3 - 4) + 7)") == (Plus (Minus (Num 3) (Num 4)) (Num 7))
```
Parsers are solved problems and this is the last we will speak of them in detail. We're going to skip the parser because the abstract syntax will be as easy to read as the concrete syntax. What a bonus.
# Interpreters

## Monadic Interpreters
_We will learn about languages by building interpreters for them in Haskell_
The general notion of an interpreter maps a _language_ to a _value_. Mathematically:
$E: L\rightarrow V$
$E$ is our interpreter, $L$ is our language, and $V$ is our value.
Values are good results. They cannot be evaluated further.
Let's start with the simplest language ever:
```haskell
AE ::= num
```
```haskell
data AE where
  Nat :: Int -> AE
  deriving (Eq,Show)
```
`Nat` - constructor
```haskell
eval :: AE -> Int
```
Some names for symbols we will see:
* `!` - Bang
* `?` - Hook
* `*` - Splat
* `#!` - Shebang

A parser will translate numbers into `AE`:
```haskell
parse "1" == (Nat 1)
parse "2" == (Nat 2)
parse "a" == !
parse "1+2" == !
```
An interpreter will translate `AE` into values:
```haskell
eval :: AE -> Int
eval (Nat x) = x
```
all together now:
```haskell
interp x = eval (parse x)
interp "1" == 1
interp "3" == 3
```
or
```haskell
interp = eval . parse
```
This is goofy. In and out and that's it.
Now let's add addition to our language. Just another term:
```haskell
data AE where
  Nat :: Int -> AE
  Plus :: AE -> AE -> AE
  deriving (Eq,Show)
```
This is not much harder:
```haskell
eval :: AE -> Int
eval (Nat x) = x
eval (Plus x y) = (eval x) + (eval y)
```
`x` and `y` in `Plus` are bound to the constructor's arguments.
```haskell
eval (Plus (Nat 1) (Nat 3))
== (eval (Nat 1)) + (eval (Nat 3))
== 1 + 3
== 4
```
* Do programs in AE terminate? Yes, and that's okay.
* Do programs in AE ever crash? No.

But you can't do anything powerful.
Let's add another operator, `Minus`
```haskell
data AE where
  Nat :: Int -> AE
  Plus :: AE -> AE -> AE
  Minus :: AE -> AE -> AE
```
and extend `eval` with a new case:
```haskell
eval :: AE -> Int
eval (Nat x) = x
eval (Plus l r) = (eval l) + (eval r)
eval (Minus l r) = (eval l) - (eval r)  -- Not good: could be negative
```
What does `Minus` force us to deal with?  Errors
Simple error handling using `error`:
```haskell
eval (Minus l r) = let x = (eval l) - (eval r) in
                   if x<0 then error "!" else x
```
What if I don't want to crash? Return an error value:
```haskell
eval (Minus l r) = let x = (eval l) - (eval r) in
                   if x < 0 then -1 else x
```
Whatever we choose, it must be of type `Int`. Why is that a problem?
A magic value like `-1` works, but it easily introduces errors: it is an ordinary `Int`, so nothing forces a caller to check for it.
## Maybe
Two constructors:
* `Just x` - where `x` is the result of a computation
* `Nothing` - is not the result of a successful computation

`Maybe` is parameterized over type:
```haskell
data Maybe a where
  Just :: a -> Maybe a
  Nothing :: Maybe a
```
What does `Maybe` do to `a`?
Using the `Maybe` in a traditional way:
```haskell
eval :: AE -> Maybe Int
eval (Nat x) = Just x
eval (Plus l r) = case (eval l) of
                    Nothing -> Nothing
                    (Just l') -> case (eval r) of
                                   (Just r') -> Just (l'+r')
                                   Nothing -> Nothing
eval (Minus l r) = case (eval l) of
                     Nothing -> Nothing
                     (Just l') -> case (eval r) of
                                    Nothing -> Nothing
                                    (Just r') -> if (l'<r') then Nothing else Just (l'-r')
```
Consider `((2-3) + 4)`.
How does `Maybe` help here?
## Maybe the Monad
Using `Maybe` as a Monad
```haskell
eval :: AE -> Maybe Int
eval (Nat x) = Just x
eval (Plus l r) = do { x <- eval l;
                       y <- eval r;
                       Just (x+y) }
eval (Minus l r) = do { x <- eval l;
                        y <- eval r;
                        if x<y then Nothing else Just (x-y) }
```
`x <- e` is called *bind* and we're binding the result of evaluating `e` to `x`. Bind only works when `e` is a monadic data structure.
The bind arrow does this with `Maybe`:
1. Evaluates the right side
2. If the right side is `Just a`, assign `a` to `x` and go to the next line
3. If the right side is `Nothing`, fall through and return `Nothing`

`return a` == `Just a`
```haskell
eval :: AE -> Maybe Int
eval (Nat x) = return x
eval (Plus l r) = do { x <- eval l;
                       y <- eval r;
                       return (x+y) }
eval (Minus l r) = do { x <- eval l;
                        y <- eval r;
                        if x<y then Nothing else return (x-y) }
```
This is pretty cool. The Monad and the `do` notation capture the shunting of control around a case when `Nothing` appears. We don't have to worry about it anymore.
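To see what the `do` notation is hiding, here is a minimal sketch that writes the same interpreter twice: once with `do`, and once with each `x <- e` replaced by an explicit bind. The names `evalDo`, `evalBind`, and `bindMaybe` are made up for this comparison; `bindMaybe` is a hand-written stand-in for what `>>=` does on `Maybe`.

```haskell
-- AE abstract syntax as used in this section.
data AE = Nat Int
        | Plus AE AE
        | Minus AE AE
        deriving (Show, Eq)

-- The do-notation interpreter from above.
evalDo :: AE -> Maybe Int
evalDo (Nat x) = return x
evalDo (Plus l r) = do { x <- evalDo l;
                         y <- evalDo r;
                         return (x + y) }
evalDo (Minus l r) = do { x <- evalDo l;
                          y <- evalDo r;
                          if x < y then Nothing else return (x - y) }

-- Hypothetical helper: the behavior of (>>=) for Maybe, written by hand.
bindMaybe :: Maybe a -> (a -> Maybe b) -> Maybe b
bindMaybe Nothing  _ = Nothing   -- fall through
bindMaybe (Just a) k = k a       -- hand a to the rest of the computation

-- The same interpreter with every bind arrow made explicit.
evalBind :: AE -> Maybe Int
evalBind (Nat x) = Just x
evalBind (Plus l r) =
  evalBind l `bindMaybe` \x ->
  evalBind r `bindMaybe` \y ->
  Just (x + y)
evalBind (Minus l r) =
  evalBind l `bindMaybe` \x ->
  evalBind r `bindMaybe` \y ->
  if x < y then Nothing else Just (x - y)
```

Both versions should agree on every term. For `Plus (Minus (Nat 2) (Nat 3)) (Nat 4)` the `Nothing` from the inner `Minus` skips the rest of the `Plus` case in both.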
It gets cooler:
```haskell
eval :: AE -> Maybe AE
```
What changed?
```haskell
eval (Nat x) = return (Nat x)
eval (Plus l r) = do { (Nat x) <- eval l;
                       (Nat y) <- eval r;
                       return (Nat (x+y)) }
eval (Minus l r) = do { (Nat x) <- eval l;
                        (Nat y) <- eval r;
                        if x<y then Nothing else return (Nat (x-y)) }
```
Pattern matching _inside the bind_. This will come in very handy later, but file it away for now.
```haskell
(Boolean Bool)
```
If patterns do not match, then `Nothing` is returned.
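A small sketch of that last point. The `Val` type and the `addOne` function are hypothetical, invented only to have two constructors to match against. In the `Maybe` monad a failed pattern match inside a bind produces `Nothing` rather than a crash.

```haskell
-- Hypothetical value type with two kinds of results.
data Val = NumV Int
         | BoolV Bool
         deriving (Show, Eq)

-- The pattern (NumV x) only matches numeric results.
-- If m is Nothing, or holds a BoolV, the whole do block is Nothing.
addOne :: Maybe Val -> Maybe Val
addOne m = do { (NumV x) <- m;
                return (NumV (x + 1)) }
```

So `addOne (Just (NumV 3))` gives `Just (NumV 4)`, while `addOne (Just (BoolV True))` and `addOne Nothing` both give `Nothing`.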

## Language Properties
* Completeness - every wff that we put into `eval` will get evaluated
* Determinacy - every wff we put into `eval` will produce only one value
* Normalizing - every wff we put into `eval` will terminate in a value
* Value - a good computation result

wff - Well Formed Formula ("woof")
## Inference Rules and Axioms
* Axioms - Things we know. Givens.
* Inference Rules - Things we can deduce from what we know in one step.
* Derivations - Sequences of inference rule applications.

An inference rule is a set of _antecedents_ and a _consequent_. If the antecedents are true, the consequent follows immediately:
$\begin{prooftree}\AXC{\(A\)}\AXC{\(B\)}\RLS{Inference Rule}\BIC{\(C\)}\end{prooftree}$
Means *if we know $A$ and we know $B$ then we know $C$*
An _axiom_ is an inference rule with no antecedents.
$\begin{prooftree}\AXC{}\RLS{Axiom}\UIC{\(A\)}\end{prooftree}$
Means *if we know nothing then we know $A$*. So $A$ is always true with no need for proof.
We can define all kinds of things with inference rules:
$\begin{prooftree}\AXC{\(t_1\in L\)}\AXC{\(t_2\in L\)}\RLS{Syntax}\BIC{\(t_1\)+``-''+\(t_2\in L\)}
\end{prooftree}$
`t ::= Nat | t1 - t2`
$\begin{prooftree}\AXC{\(A\Rightarrow B\)}\AXC{\(A\)}\RLS{Logic}\BIC{\(B\)}\end{prooftree}$
$\begin{prooftree}\AXC{\(A\wedge B\)}\RLS{Logic}\UIC{\(B\)}\end{prooftree}$
And we can build trees that define proofs.
$\begin{prooftree}
\AXC{\(A\Rightarrow B\)}
\AXC{\(A\wedge A\)}\UIC{\(A\)}\RLS{Proofs}
\BIC{\(B\)}
\end{prooftree}$
Hilbert defined this system to define all of mathematics.
It did not go well.
However, Hilbert Systems and inference rules are pretty cool creatures with great utility.
We will use them to define languages mathematically where they are the dominant definition mechanism.
First let's define some notational conventions:
* $v$ is a variable representing _values_
* $t$ is a variable representing _terms_
* $\underline{+}$ is an operation in our concrete syntax while $+$ is an operation in Haskell

$t_1\Downarrow t_2$ is an evaluation relation and is read "$t_1$ evaluates to $t_2$ in one step"
The Haskell function we've defined called `eval` corresponds with $\Downarrow$:
* `eval` is a function, not a relation
* $t_1\Downarrow t_2$ == `eval t1 = t2`

Let's walk through some inference rules for our first little language AE:
Values evaluate to themselves. Note the underline.
$\begin{prooftree}\AXC{}\RLS{NumE}\UIC{\(\underline{v} \Downarrow v\)}\end{prooftree}$
Addition in AE is addition in Haskell.
$\begin{prooftree}\AXC{\(t_1 \Downarrow v_1\)}\AXC{\(t_2 \Downarrow v_2\)}\RLS{PlusE}\BIC{\(t_1 \underline{+} t_2 \Downarrow v_1+v_2\)}\end{prooftree}$
If $t_1$ evaluates to $v_1$ and $t_2$ evaluates to $v_2$, then $t_1\underline{+}t_2$ evaluates to $v_1+v_2$. For example:
* $3\underline{+}5\Downarrow 3+5$
* $3\Downarrow 3$
* $5\Downarrow 5$
* $3+5 = 8$

Subtraction in AE is subtraction in Haskell.
$\begin{prooftree}\AXC{\(t_1 \Downarrow v_1\)}\AXC{\(t_2 \Downarrow v_2\)}\RLS{MinusE}\BIC{\(t_1 \underline{-} t_2 \Downarrow v_1-v_2\)}\end{prooftree}$
If $t_1$ evaluates to $v_1$ and $t_2$ evaluates to $v_2$, then $t_1\underline{-}t_2$ evaluates to $v_1-v_2$.
But should it be?
What does this say?
$\begin{prooftree}\AXC{\(t_1 \Downarrow v_1\)}\AXC{\(t_2 \Downarrow v_2\)}\AXC{\(v_1\geq v_2\)}\RLS{MinusE+}\TIC{\(t_1 \underline{-} t_2 \Downarrow v_1-v_2\)}\end{prooftree}$
What happens if we add this rule to what we already have?
$\begin{prooftree}\AXC{\(t_1 \Downarrow v_1\)}\AXC{\(t_2 \Downarrow v_2\)}\AXC{\(v_1 < v_2\)}\RLS{MinusEZero}\TIC{\(t_1 \underline{-} t_2 \Downarrow 0\)}\end{prooftree}$
Another alternative.
$\begin{prooftree}\AXC{\(t_1 \Downarrow v_1\)}\AXC{\(t_2 \Downarrow v_2\)}\AXC{\(v_1 < v_2\)}\RLS{MinusEBottom}\TIC{\(t_1 \underline{-} t_2 \Downarrow \bot\)}\end{prooftree}$
This definitional style is called _Big Step Semantics_ or _Natural Semantics_
But there's more:
We can define evaluation with our rules:
$\begin{prooftree}\AXC{\(\underline{5}\Downarrow 5\)}\AXC{\(\underline{2}\Downarrow 2\)}\RLS{PlusE}\BIC{\(\underline{5+2} \Downarrow 5+2 \)}\AXC{\(\underline{3} \Downarrow 3\)}\RLS{PlusE}\BIC{\(\underline{5 + 2+ 3} \Downarrow 10\)}\end{prooftree}$
So is this a proof or an evaluation?
## Our First Language
A complete definition that allows:
 Parsing
 Evaluation
 Reasoning
### Concrete Syntax
```other
AE ::= num
     | AE + AE
     | AE - AE
     | (AE)
```
### Inference Rules
$\begin{prooftree}\AXC{}\RLS{NumE}\UIC{\(\underline{v} \Downarrow v\)}\end{prooftree}$
$\begin{prooftree}\AXC{\(t_1 \Downarrow v_1\)}\AXC{\(t_2 \Downarrow v_2\)}\RLS{PlusE}\BIC{\(t_1 \underline{+} t_2 \Downarrow v_1+v_2\)}\end{prooftree}$
$\begin{prooftree}\AXC{\(t_1 \Downarrow v_1\)}\AXC{\(t_2 \Downarrow v_2\)}\AXC{\(v_1\geq v_2\)}\RLS{MinusE+}\TIC{\(t_1 \underline{-} t_2 \Downarrow v_1-v_2\)}\end{prooftree}$
$\begin{prooftree}\AXC{\(t_1 \Downarrow v_1\)}\AXC{\(t_2 \Downarrow v_2\)}\AXC{\(v_1 < v_2\)}\RLS{MinusEBottom}\TIC{\(t_1 \underline{-} t_2 \Downarrow \bot\)}\end{prooftree}$
### Abstract Syntax
```haskell
data AE =
    Nat Int
  | Plus AE AE
  | Minus AE AE
  deriving (Show,Eq)
```
### Interpreter
```haskell
eval (Nat x) = return (Nat x)
eval (Plus l r) = do { (Nat x) <- eval l;
                       (Nat y) <- eval r;
                       return (Nat (x+y)) }
eval (Minus l r) = do { (Nat x) <- eval l;
                        (Nat y) <- eval r;
                        if x<y then Nothing else return (Nat (x-y)) }
```
## Adding Booleans to AE
Definition of ABE is AE with Booleans added:
```other
ABE ::= Nat | ABE + ABE | ABE - ABE | (ABE)
      | true | false | if ABE then ABE else ABE
      | ABE <= ABE | ABE && ABE | isZero ABE
v ::= Nat | true | false
Nat ::= 0 | succ Nat
```
And the abstract syntax.
```haskell
data ABE where
  Num :: Int -> ABE
  Plus :: ABE -> ABE -> ABE
  Minus :: ABE -> ABE -> ABE
  Boolean :: Bool -> ABE
  And :: ABE -> ABE -> ABE
  Leq :: ABE -> ABE -> ABE
  IsZero :: ABE -> ABE
  If :: ABE -> ABE -> ABE -> ABE
  deriving (Show,Eq)
```
What changed?
This is Project 0.
# Adding Identifiers

## bind and identifiers
Things we need to do:
1. Concrete Syntax (`t::=t...`)
2. Inference Rules (antecedents and consequents)
3. Abstract Syntax (`data ABE ...`)
4. Interpreter (`eval t`)
`bind` creates a _binding_ between an _identifier_ and a _value_. It is normally called `let`.
Some examples to ponder when defining a variable using `bind`:
```other
bind x=5 in x+x
== 5+5
== 10
```
```other
bind x=5 in
bind y=6 in x+y
== bind y=6 in 5+y
== 5+6
== 11
```
```other
bind x=5 in
bind x=6 in x+x
== bind x=6 in x+x
== 6+6
== 12
```
```other
bind x=5 in
bind x=6+x in x+x
== bind x=6+5 in x+x
== bind x=11 in x+x
== 11+11
== 22
```
```other
bind x=5 in
x + bind y=6 in x+y
== 5 + bind y=6 in 5+y
== 5 + 5 + 6
== 5 + 11
== 16
```
```other
bind x=5 in
x + bind x=6 in x+x
== 5 + bind x=6 in x+x
== 5 + 6 + 6
== 5 + 12
== 17
```
```other
bind x=5 in
x + y
== 5 + y
== BOOM
```
```other
bind x=x+1 in x
== BOOM
```
## Concrete Syntax
```
BAE ::= num
      | BAE + BAE
      | BAE - BAE
      | (BAE)
      | bind ID = BAE in BAE
      | ID
ID ::= string
```
### Useful Definitions
* instance - occurrence of an identifier
```other
bind >x = >x+5
in >y-4
```
* binding instance - where an identifier is declared and given a value
```other
bind >x = x+5
in y-4
```
* bound value - value given to an identifier in a binding instance
```other
bind x = >(x+5)
in x-4
```
* scope - the region where an identifier is defined and can be used
```other
bind x = x+5
in [x-4]
```
* bound instance - where an identifier is used _in scope_
```other
bind x = x+5
in [>x-4]
```
* free instance - where an identifier is used _outside scope_
```other
bind x = >x+5
in x-4
```
```other
bind x=5 in
bind y=6 in
x+y+z
```
What is the scope for a variable defined with `bind`? Everything after `in`
## Inference Rules for `bind` and Identifiers
First definition will use substitution.
### Substitution operator
* $[x \rightarrow v]t$ - Replace all _free_ instances of $x$ in $t$ with $v$
* $[x\rightarrow 5]3 == 3$
* $[x\rightarrow 5]x == 5$
* $[x\rightarrow 5]5+5 ==$
* $[x\rightarrow 7]$ `bind x=7 in bind y=5 in` $x+y$ $==$ `bind x=7 in bind y=5 in` $x+y$
* $[x\rightarrow 7]$ `bind y=5 in` $x+y$ $==$ `bind y=5 in` $7+y$
* $[x\rightarrow 5]$ `x + bind x=10 in x` $==$ `5 + bind x=10 in x`

Substitution is a common mathematical operator that we will assume exists.
### Inference Rules - `bind`
$\begin{prooftree}
\AXC{\(a\Downarrow v_a\)}\AXC{\([i\rightarrow v_a]s\Downarrow v_s\)}\RLS{BindE}
\BIC{\(\mathsf{bind}\ i=a\ \mathsf{in}\ s\Downarrow v_s\)}
\end{prooftree}$
`bind x=3+2 in x+x`
#### Explanation
* $a$ evaluates to $v_a$
* Substitute $v_a$ for $i$ in the body of `bind`, then evaluate to $v_s$
* The result is $v_s$

$\begin{prooftree}
\AXC{\(a\Downarrow v_a\)}\AXC{\([i\rightarrow v_a]s\Downarrow v_s\)}\RLS{BindE}
\BIC{\(\mathsf{bind}\ i=a\ \mathsf{in}\ s\Downarrow v_s\)}
\end{prooftree}$
### Inference Rules - Identifiers
$\begin{prooftree}
\AXC{}\RLS{IDE}
\UIC{\(x\Downarrow\bot\)}
\end{prooftree}$
#### Explanation
* Evaluating an identifier means the identifier was not replaced
* No `bind` defined the identifier or it would no longer be there
```
bind x = 3 in y
== ???
```
## Abstract Syntax
Add constructors for new constructs in concrete syntax
```haskell
data BAE where
  Nat :: Int -> BAE
  Id :: String -> BAE
  Plus :: BAE -> BAE -> BAE
  Minus :: BAE -> BAE -> BAE
  Bind :: String -> BAE -> BAE -> BAE
  deriving (Show,Eq)
```
## Evaluation
```haskell
eval (Nat x) = return (Nat x)
eval (Id s) = Nothing
eval (Plus l r) = do { (Nat x) <- eval l;
                       (Nat y) <- eval r;
                       return (Nat (x+y)) }
eval (Minus l r) = do { (Nat x) <- eval l;
                        (Nat y) <- eval r;
                        if x<y then Nothing else return (Nat (x-y)) }
-- eval already returns a Maybe, so it is not wrapped in Just again
eval (Bind i a b) = do { a' <- eval a;
                         eval (subst i a' b) }
```
We will need to define substitution for this to work:
```haskell
subst :: String -> BAE -> BAE -> BAE
subst _ _ (Nat n) = (Nat n)
subst x v (Id x') = if x==x' then v else (Id x')
subst x v (Plus l r) = (Plus (subst x v l) (subst x v r))
subst x v (Minus l r) = (Minus (subst x v l) (subst x v r))
-- Always substitute into the bound expression v'; skip the body t' when x' shadows x
subst x v (Bind x' v' t') = if x==x'
                            then (Bind x' (subst x v v') t')
                            else (Bind x' (subst x v v') (subst x v t'))
```
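Putting the pieces together, here is a self-contained sketch of the substitution interpreter that can be loaded and run against the `bind` examples from earlier. One assumption to flag: this `subst` also substitutes into the bound expression of a nested `bind`, which the `bind x=5 in bind x=6+x in x+x` example needs in order to reach `22`.

```haskell
data BAE = Nat Int
         | Id String
         | Plus BAE BAE
         | Minus BAE BAE
         | Bind String BAE BAE
         deriving (Show, Eq)

-- [x -> v]t : replace free instances of x in t with v.
subst :: String -> BAE -> BAE -> BAE
subst _ _ (Nat n) = Nat n
subst x v (Id x') = if x == x' then v else Id x'
subst x v (Plus l r) = Plus (subst x v l) (subst x v r)
subst x v (Minus l r) = Minus (subst x v l) (subst x v r)
subst x v (Bind x' a t)
  | x == x'   = Bind x' (subst x v a) t              -- x is shadowed in the body
  | otherwise = Bind x' (subst x v a) (subst x v t)

eval :: BAE -> Maybe BAE
eval (Nat x) = return (Nat x)
eval (Id _) = Nothing                                -- free identifier
eval (Plus l r) = do { (Nat x) <- eval l;
                       (Nat y) <- eval r;
                       return (Nat (x + y)) }
eval (Minus l r) = do { (Nat x) <- eval l;
                        (Nat y) <- eval r;
                        if x < y then Nothing else return (Nat (x - y)) }
eval (Bind i a b) = do { a' <- eval a;
                         eval (subst i a' b) }
```

With these definitions, `eval (Bind "x" (Nat 5) (Plus (Id "x") (Id "x")))` is `Just (Nat 10)`, and both BOOM examples (`x + y` with `y` unbound, and `bind x=x+1 in x`) come back as `Nothing`.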
```haskell
eval (Bind "x" (Nat 5)
       (Bind "y" (Nat 6)
         (Plus (Id "x") (Id "y"))))
== (Bind "y" (Nat 6) (Plus (Nat 5) (Id "y")))
== (Plus (Nat 5) (Nat 6))
== (Nat 11)
```
```haskell
eval (Bind "x" (Nat 5) (Plus (Id "x") (Id "x")))
== (Plus (Nat 5) (Nat 5))
== (Nat 10)
```
```haskell
eval (Bind "x" (Nat 5)
       (Bind "x" (Nat 6)
         (Plus (Id "x") (Id "x"))))
== (Bind "x" (Nat 6) (Plus (Id "x") (Id "x")))
== (Plus (Nat 6) (Nat 6))
== (Nat 12)
```
```haskell
eval (Bind "x" (Nat 5)
       (Bind "y" (Nat 5)
         (Bind "x" (Nat 6)
           (Bind "y" (Nat 4)
             (Bind "z" (Nat 7)
               (Id "z"))))))
==
```
Anything wrong here?
This is a _reference interpreter_ that defines what we want BAE to be, but not necessarily its implementation.
* Natural language
* Inference rules
* Reference interpreter

# Deferring Substitution
How would you build this interpreter "for real"?
Instead of immediately substituting, let's remember bindings of identifiers to values.
`[("x",5),("y",7)]`
Old Way:
```other
eval bind x = 5 in
bind y = 6 in
x + y
== bind y=6 in 5+y
== 5+6
== 11
```
New Way:
```other
eval bind x = 5 in    [("x",5)]  -- "x" is 5 in this scope
       bind y = 6 in  [("x",5),("y",6)]
         5 + 6        [("x",5),("y",6)]
== 11
```
* Environment - list of identifiers and values currently in scope
```haskell
eval (Bind "x" (Num 5)                    [("x",5)]
       (Bind "y" (Num 6)                  [("x",5),("y",6)]
         (Plus (Id "x") (Id "y"))))       [("x",5),("y",6)]
```
```haskell
eval (Bind "x" (Num 5)                    [("x",5)]  -- Environment
       (Bind "x" (Num 6)                  [("x",6),("x",5)]  -- Shadowing "x"
         (Plus 6 6)))                     [("x",6),("x",5)]
```
```haskell
eval (Bind "x" (Num 5)                    [("x",5)]
       (Bind "x" (Num 6)                  [("x",6),("x",5)]
         (Plus 6 (Id "y"))))              [("x",6),("x",5)]  -- Lookup of y fails
== BOOM
```
```haskell
eval (Bind "x" (Num 5)                    [("x",5)]
       (Plus 5                            [("x",5)]
         (Bind "x" (Num 6)                [("x",6),("x",5)]
           (Plus 6 6))))                  [("x",6),("x",5)]
== 17
```
```haskell
eval (Bind "x" (Num 5)                    [("x",5)]
       (Plus
         (Bind "x" (Num 6)                [("x",6),("x",5)]
           (Plus 6 6))                    [("x",6),("x",5)]
         5))                              [("x",5)]
== 17
```
Now more inference rules?
$\begin{prooftree}
\AXC{}\AXC{}
\BIC{\(\mathsf{bind}\; x=v\;\mathsf{in}\ t\Downarrow v\)}
\end{prooftree}$
Why or why not?
## eval with env
Haskell has some cool built-in functions:
```haskell
type Env = [(String,BAE)]
lookup :: Eq a => a -> [(a,b)] -> Maybe b
x:xs  -- Adds x to front of xs
      -- Also pattern matches a nonempty list (that's kinda cool)
```
Let's define `eval`:
`return` == `Just`
```haskell
eval :: Env -> BAE -> Maybe BAE
eval _ (Num n) = Just (Num n)
eval e (Plus l r) = do { (Num l') <- (eval e l) ;
                         (Num r') <- (eval e r) ;
                         return (Num (l'+r'))
                       }
eval e (Minus l r) = do { (Num l') <- (eval e l) ;
                          (Num r') <- (eval e r) ;
                          if r'<=l' then return (Num (l'-r')) else Nothing
                        }
eval e (Bind x a s) = do { a' <- (eval e a) ;
                           eval ((x,a'):e) s
                         }
eval e (Id x) = (lookup x e)
```
Good things about `eval` using deferred substitution:
* No more `subst`
* No more walking the code

We will say our original eval function performs _direct substitution_ and will use the name `evals`.
We will say our new eval function performs _deferred substitution_ and will use the name `eval`.
How do we know our new `eval` is correct?
$\forall t:BAE\cdot \mathsf{eval}\; []\; t = \mathsf{evals}\; t$
What does this say?
Can we do it?
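Short of a proof, we can at least test the property on sample terms. The sketch below is a self-contained version of both interpreters plus a check that they agree. `samples`, `agree`, and `allAgree` are hypothetical names introduced here, and agreement on a handful of terms is evidence, not a proof of the $\forall$.

```haskell
data BAE = Num Int
         | Id String
         | Plus BAE BAE
         | Minus BAE BAE
         | Bind String BAE BAE
         deriving (Show, Eq)

type Env = [(String, BAE)]

-- Direct substitution.
subst :: String -> BAE -> BAE -> BAE
subst _ _ (Num n) = Num n
subst x v (Id x') = if x == x' then v else Id x'
subst x v (Plus l r) = Plus (subst x v l) (subst x v r)
subst x v (Minus l r) = Minus (subst x v l) (subst x v r)
subst x v (Bind x' a t)
  | x == x'   = Bind x' (subst x v a) t
  | otherwise = Bind x' (subst x v a) (subst x v t)

evals :: BAE -> Maybe BAE
evals (Num n) = return (Num n)
evals (Id _) = Nothing
evals (Plus l r) = do { (Num x) <- evals l;
                        (Num y) <- evals r;
                        return (Num (x + y)) }
evals (Minus l r) = do { (Num x) <- evals l;
                         (Num y) <- evals r;
                         if x < y then Nothing else return (Num (x - y)) }
evals (Bind i a s) = do { a' <- evals a;
                          evals (subst i a' s) }

-- Deferred substitution.
eval :: Env -> BAE -> Maybe BAE
eval _ (Num n) = return (Num n)
eval e (Id x) = lookup x e
eval e (Plus l r) = do { (Num x) <- eval e l;
                         (Num y) <- eval e r;
                         return (Num (x + y)) }
eval e (Minus l r) = do { (Num x) <- eval e l;
                          (Num y) <- eval e r;
                          if x < y then Nothing else return (Num (x - y)) }
eval e (Bind i a s) = do { a' <- eval e a;
                           eval ((i, a') : e) s }

-- A few closed and open terms, including shadowing and failures.
samples :: [BAE]
samples =
  [ Num 4
  , Id "y"
  , Minus (Num 2) (Num 3)
  , Bind "x" (Num 5) (Plus (Id "x") (Id "x"))
  , Bind "x" (Num 5) (Bind "x" (Plus (Num 6) (Id "x")) (Plus (Id "x") (Id "x")))
  , Bind "x" (Num 5) (Plus (Id "x") (Bind "x" (Num 6) (Plus (Id "x") (Id "x"))))
  , Bind "x" (Plus (Id "x") (Num 1)) (Id "x")
  ]

-- The property from above, for a single term.
agree :: BAE -> Bool
agree t = eval [] t == evals t

allAgree :: Bool
allAgree = all agree samples
```

A property-based testing library could generate the terms instead of listing them by hand. A proof would go by induction on the structure of `t`.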
# Predicting Behavior
Start with a tiny language with representative terms:
```haskell
data BAE where
  Nat :: Int -> BAE
  Plus :: BAE -> BAE -> BAE
  Bind :: String -> BAE -> BAE -> BAE
  Id :: String -> BAE
```
## Abstract Interpretation
Can we predict when a returned value is odd or even?
```haskell
data EOVal = Even | Odd
  deriving (Show,Eq)
```
```haskell
predict :: [(String,EOVal)] -> BAE -> Maybe EOVal
predict _ (Nat n) = return (if even n then Even else Odd)  -- Just Even means n is even
predict c (Plus l r) = do { l' <- predict c l ;
                            r' <- predict c r ;
                            return (if (l'==r') then Even else Odd)
                          }
predict c (Bind x a s) = do { a' <- (predict c a) ;  -- c is the environment
                              predict ((x,a'):c) s
                            }
predict c (Id x) = (lookup x c)
```
The `predict` function implements _abstract interpretation_ that makes predictions without running code.
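One way to gain confidence in a predictor is to compare it with what actually happens when the code runs. A minimal sketch, assuming a plain integer evaluator `run` and helpers `parity` and `sound` (all three names are invented here for the comparison):

```haskell
data BAE = Nat Int
         | Plus BAE BAE
         | Bind String BAE BAE
         | Id String
         deriving (Show, Eq)

data EOVal = Even | Odd
  deriving (Show, Eq)

-- Abstract interpreter: computes with Even/Odd instead of numbers.
predict :: [(String, EOVal)] -> BAE -> Maybe EOVal
predict _ (Nat n) = return (if even n then Even else Odd)
predict c (Plus l r) = do { l' <- predict c l;
                            r' <- predict c r;
                            return (if l' == r' then Even else Odd) }
predict c (Bind x a s) = do { a' <- predict c a;
                              predict ((x, a') : c) s }
predict c (Id x) = lookup x c

-- Concrete interpreter: computes the actual number.
run :: [(String, Int)] -> BAE -> Maybe Int
run _ (Nat n) = return n
run e (Plus l r) = do { x <- run e l;
                        y <- run e r;
                        return (x + y) }
run e (Bind x a s) = do { a' <- run e a;
                          run ((x, a') : e) s }
run e (Id x) = lookup x e

-- Abstraction of a concrete result.
parity :: Int -> EOVal
parity n = if even n then Even else Odd

-- The prediction matches the parity of the real answer.
sound :: BAE -> Bool
sound t = predict [] t == fmap parity (run [] t)
```

For `Bind "x" (Nat 3) (Plus (Id "x") (Nat 4))`, `run` produces `Just 7` and `predict` produces `Just Odd` without ever adding `3` and `4`.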
Let's add some new structures to our language and look at a different kind of prediction:
```haskell
data BAE where
  Nat :: Int -> BAE
  Boolean :: Bool -> BAE
  Plus :: BAE -> BAE -> BAE
  And :: BAE -> BAE -> BAE
  If :: BAE -> BAE -> BAE -> BAE
  Bind :: String -> BAE -> BAE -> BAE
  Id :: String -> BAE
  deriving (Show,Eq)
```
Instead of `Odd` and `Even` let's use `TNum` and `TBool` for numbers and booleans:
```haskell
data TVal = TNum | TBool
  deriving (Show,Eq)
```
Can we calculate the numberness or booleanness of a term?
```haskell
predict :: [(String,TVal)] -> BAE -> Maybe TVal
predict _ (Nat _) = return TNum
predict _ (Boolean _) = return TBool
predict g (Plus l r) = do { TNum <- predict g l ;
                            TNum <- predict g r ;
                            return TNum
                          }
predict g (And l r) = do { TBool <- predict g l ;
                           TBool <- predict g r ;
                           return TBool
                         }
predict g (If c t e) = do { TBool <- predict g c ;
                            t' <- predict g t ;
                            e' <- predict g e ;
                            if t'==e' then return t' else Nothing
                          }
predict g (Bind x a s) = do { a' <- (predict g a) ;
                              predict ((x,a'):g) s
                            }
predict g (Id x) = (lookup x g)
```
What is `predict` in this case?
## Type Checking
As you might have guessed, we just wrote a type checker for our little language. Let's formalize this idea using inference rules. First, a few definitions:
* $x : T$ - typing relation saying `x` is of type `T`. Just like $x\Downarrow t$
$
\begin{prooftree}
\AXC{\(\)}\RLS{NumT}
\UIC{\(N:TNum\)}
\end{prooftree}
$
$
\begin{prooftree}
\AXC{\(l:TNum\)}\AXC{\(r:TNum\)}\RLS{PlusT}
\BIC{\(l+r : TNum\)}
\end{prooftree}
$
$
\begin{prooftree}
\AXC{\(l:TBool\)}\AXC{\(r:TBool\)}\RLS{AndT}
\BIC{\(l \wedge r : TBool\)}
\end{prooftree}
$
$
\begin{prooftree}
\AXC{\(c:TBool\)}\AXC{\(t:T\)}\AXC{\(e:T\)}\RLS{IfT}
\TIC{\(if\ c\ then\ t\ else\ e : T\)}
\end{prooftree}
$
* _context_ - a list of identifiers and _types_ in scope represented by $\Gamma$
* $\Gamma\vdash x:T$ - $\Gamma$ derives that $x$ has type $T$, that is, $x$ has type $T$ given context $\Gamma$.

_Context_ is the same as _Environment_ except it contains types.
$
\begin{prooftree}
\AXC{\((x:T)\in \Gamma\)}\RLS{Name}
\UIC{\(\Gamma\vdash x : T\)}
\end{prooftree}
$
$
\begin{prooftree}
\AXC{\(\Gamma\vdash a:T_a\)}\AXC{\((x:T_a):\Gamma\vdash t:T\)}\RLS{BindT}
\BIC{\(\Gamma\vdash bind\ x=a\ in\ t : T\)}
\end{prooftree}
$
## Optimization
We know several identities over our calculations:
* $0+x==x$
* $True \wedge x == x$
* `if` $True$ `then` $t$ `else` $e == t$

Can we optimize such expressions out of a program?
```haskell
optimize :: BAE > BAE
optimize (Nat n) = (Nat n)
optimize (Boolean b) = (Boolean b)
optimize (Plus l r) = if (optimize l)==(Nat 0) then (optimize r) else (Plus (optimize l) (optimize r))
optimize (And l r) = if (optimize l)==(Boolean True) then (optimize r) else (And (optimize l) (optimize r))
optimize (If c t e) = if c==(Boolean True) then (optimize t) else (If (optimize c) (optimize t) (optimize e))
optimize (Bind x a s) = (Bind x (optimize a) (optimize s))
optimize (Id x) = (Id x)
```
## General Purpose Evaluators
Let's change our data structure just a bit:
```haskell
data BAE a where
  Nat :: a -> BAE a
  Plus :: BAE a -> BAE a -> BAE a
  Bind :: String -> BAE a -> BAE a -> BAE a
  Id :: String -> BAE a
```
```
predict :: ???
predict _ (Nat n) =
predict c (Plus l r) =
predict c (Bind x a s) =
```
# Lambda The Ultimate!
Now things get serious...
```other
inc x = x + 1
```
```other
inc 3
== 3 + 1
== 4
```
```other
inc ((5+1)-3)
== ((5+1)-3)+1
== 4
or
== inc 3
== 3+1
== 4
```
Some nifty definitions:
`inc x = x + 1`
* `x` - formal parameter
* `x + 1` - body
```other
inc 3
```
* `3` - actual parameter
* `inc 3` - application of `inc` to `3`

How do we evaluate function application?
* $inc\ 3 == [x\rightarrow 3]x+1$
## Kinds of Functions in Languages
* First order functions - Cannot take other functions as arguments. Have a special representation in the language that is not accessible.
```
int foo(int x){ return x+1; }
```
* Higher order functions - Can take functions as arguments. Can return functions.
```other
map foo [1,2,3] = [2,3,4]
```
* First class functions - Functions are values like any other value in the language.
```haskell
foo = \x -> x+1
foo x = x + 1
```
* We will look at languages with first-class functions
* First-class functions are the current trajectory of modern languages
* ... it's been a long time coming.
## Concrete Syntax
* `lambda x in x + x`
  * Defines a function value over _formal parameter_ `x`
  * Called a `lambda` or an `abstraction`
  * Lambdas are values
  * From a logical perspective, `lambda` introduces a variable
* `(l a)`
  * Applies a function `l` to _actual parameter_ or _argument_ `a`
  * Called an _application_ or simply an _app_
  * From a logical perspective, `app` eliminates a variable
```other
((lambda x in x+x) 3)
== 3+3
== 6
```
Have you seen this kind of evaluation before?
```
((lambda x in x+x) 3) == bind x=3 in x+x
```
```other
FBAE ::= V
       | FBAE + FBAE
       | FBAE - FBAE
       | bind id = FBAE in FBAE
       | lambda id in FBAE
       | (FBAE FBAE)
       | id
V ::= Nat | lambda id in FBAE
```
```
(lambda x in x)
== (lambda x in x)
```
```
((lambda x in x) 5)
== [x->5]x
== 5
```
```
((lambda x in x) (lambda x in x))
== [x->(lambda x in x)]x
== (lambda x in x)
```
```
(((lambda x in (lambda y in x+y)) 3) 2)
== ([x->3](lambda y in x+y) 2)
== ([y->2](3+y))
== 3+2
== 5
x->3
(lambda y in x+y)
y
x+y
(lambda x y z ...) == (lambda x in (lambda y in (lambda z in ...)))
```
```
((lambda x in (lambda y in x+y)) 1)
== [x->1](lambda y in x+y)
== (lambda y in 1+y)
Nat->Nat->Nat
Nat->Nat
```
```
((lambda x in (x 3)) (lambda x in x))
== [x->(lambda x in x)](x 3)
== ((lambda x in x) 3)
== [x->3]x
== 3
```
```
bind inc=(lambda x in x+1) in (inc 3)
== [inc->(lambda x in x+1)](inc 3)
== ((lambda x in x+1) 3)
...
== 3+1
== 4
```
```
bind inc=(lambda x in x+1) in
bind dec=(lambda x in x-1) in
bind sqr=(lambda x in x*x) in
(inc (sqr (sqr 3)))
...
```
Alternate concrete syntax:
$\lambda x.s$
More common in mathematical presentations
## Abstract Syntax
```haskell
data FBAE where
  Num :: Int -> FBAE
  Plus :: FBAE -> FBAE -> FBAE
  Minus :: FBAE -> FBAE -> FBAE
  Bind :: String -> FBAE -> FBAE -> FBAE
  Lambda :: String -> FBAE -> FBAE
  App :: FBAE -> FBAE -> FBAE
  Id :: String -> FBAE
  deriving (Show,Eq)
```
## Inference Rules
$
\begin{prooftree}
\AXC{\(f\Downarrow (\mathsf{lambda}\ i\ \mathsf{in}\ s)\)}
\AXC{\(a\Downarrow v_a\)}
\AXC{\([i\rightarrow v_a]s\Downarrow v_s\)}\RLS{Beta}
\TIC{\((f\ a)\Downarrow v_s\)}
\end{prooftree}
$
$
\begin{prooftree}
\AXC{}\RLS{LambdaE}
\UIC{\(\mathsf{lambda}\ i\ \mathsf{in}\ s\Downarrow \mathsf{lambda}\ i\ \mathsf{in}\ s\)}
\end{prooftree}
$
Remember bind?
$\begin{prooftree}
\AXC{\(a\Downarrow v_a\)}\AXC{\([i\rightarrow v_a]s\Downarrow v_s\)}\RLS{BindE}
\BIC{\(\mathsf{bind}\ i=a\ \mathsf{in}\ s\Downarrow v_s\)}
\end{prooftree}$
What's the difference here?
We can define `bind` using `lambda`
```
bind i=a in s == ((lambda i in s) a)
```
* This is called a _derived form_
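
A derived form can be removed before evaluation by rewriting the abstract syntax. The following is a sketch only: `elab` is a hypothetical name for a pass that replaces every `Bind` with an application of a `Lambda`, so that an interpreter would need no `Bind` case at all.

```haskell
data FBAE = Num Int
          | Plus FBAE FBAE
          | Minus FBAE FBAE
          | Bind String FBAE FBAE
          | Lambda String FBAE
          | App FBAE FBAE
          | Id String
          deriving (Show, Eq)

-- Rewrite  bind i=a in s  as  ((lambda i in s) a)  everywhere in a term.
elab :: FBAE -> FBAE
elab (Num n) = Num n
elab (Plus l r) = Plus (elab l) (elab r)
elab (Minus l r) = Minus (elab l) (elab r)
elab (Bind i a s) = App (Lambda i (elab s)) (elab a)
elab (Lambda i s) = Lambda i (elab s)
elab (App f a) = App (elab f) (elab a)
elab (Id x) = Id x
```

For example, `elab (Bind "x" (Num 3) (Plus (Id "x") (Id "x")))` is `App (Lambda "x" (Plus (Id "x") (Id "x"))) (Num 3)`.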
## Church's Lambda Calculus
* Can represent _any_ computable function
* Equivalent to Turing Machines
* Basis for functional programming
  * Turing Machines give rise to _imperative programming_
  * Lambda Calculus gives rise to _functional programming_
### Concrete Syntax
```other
LC ::= id
     | lambda id in LC
     | (LC LC)
V ::= lambda id in LC
```
$
\begin{prooftree}
\AXC{\(f\Downarrow \mathsf{lambda}\ i\ \mathsf{in}\ s\)}
\AXC{\(a\Downarrow v_a\)}
\AXC{\([i\rightarrow v_a]s\Downarrow v_s\)}\RLS{CBV Beta}
\TIC{\((f\ a)\Downarrow v_s\)}
\end{prooftree}
$
$
\begin{prooftree}
\AXC{\(f\Downarrow \mathsf{lambda}\ i\ \mathsf{in}\ s\)}
\AXC{\([i\rightarrow a]s\Downarrow v_s\)}\RLS{CBN Beta}
\BIC{\((f\ a)\Downarrow v_s\)}
\end{prooftree}
$
That's it!
### Fun with Lambda
```
(lambda x in x) is a value
```
```
(lambda y in y)(lambda x in x)
== (lambda x in x)
```
```
(lambda x in x x) 3
== [x->3](x x)
== (3 3)
```
```
(lambda x in x x)(lambda y in y)
== [x->(lambda y in y)](x x)
== ((lambda y in y) (lambda y in y))
== [y->(lambda y in y)]y
== (lambda y in y)
```
```
(lambda x in x x)(lambda y in y y)
== [x->(lambda y in y y)](x x)
== ((lambda y in y y) (lambda y in y y))
== [y->(lambda y in y y)](y y)
== ((lambda y in y y) (lambda y in y y))
```
Omega combinator
## Curried Functions
```
f = (lambda x in (lambda y in x + y))
f 3 4 == ((f 3) 4)
==
```
## Evaluation
First let's add to our direct substitution interpreter. Call-by-value:
```haskell
eval (Lambda i s) = return (Lambda i s)
eval (Id _) = Nothing
eval (App f a) = do { (Lambda i s) <- eval f ;
                      a' <- eval a ;
                      eval (subst i a' s) }
eval (Bind i a s) = do { a' <- eval a ;
                         eval (subst i a' s) }
```
==Midterm Stops Here==
[[20231003MidtermDiscussionMidterm Discussion]]
```
((Lambda i s) a) == bind i a s
```
```
bind f = (lambda x in x) in
(f 2)
==
```
```
bind n = 1 in
bind f = (lambda x in x + n) in
bind n = 2 in
f 1
== bind f = (lambda x in x + 1) in
bind n = 2 in
f 1
== bind n = 2 in
(lambda x in x + 1) 1
== (lambda x in x+1) 1
== [x->1]x+1
== 1+1
== 2
```
`(f a) -> (App f a)`
```
f(x)=x+1 == bind f = (lambda x in x+1)
f(x,y)=x+y == bind f = (lambda x in (lambda y in x+y))
== f(x)=(lambda y in x+y)
```
Same problem as before: inefficient and kind of clumsy.
Try again with environments:
```
type Env = [(String,FBAE)]
eval e (Lambda i s) = return (Lambda i s)
eval e (App f a) = do { (Lambda i s) <- eval e f;
                        a' <- eval e a;
                        eval ((i,a'):e) s }
eval e (Id i) = (lookup i e)
```
```
bind f = (lambda x in x) in [(f,(lambda x in x))]
(f f) ''
== ((lambda x in x)(lambda x in x)) [(x,(lambda x in x)),(f,(lambda x in x))]
== (lambda x in x)
```
```
bind n = 1 in [(n,1)]
bind f = (lambda x in x + n) in [(f,(lambda x in x + n)),(n,1)]
bind n = 2 in [(n,2),(f,(lambda x in x + n)),(n,1)]
f 1
== (lambda x in x + n) 1 [(x,1),(n,2),(f,(lambda x in x + n)),(n,1)]
== x + n [(x,1),(n,2),(f,(lambda x in x + n)),(n,1)]
== 1 + 2
== 3
```
Oops...
## Static and Dynamic Scoping
- Static Scoping - Identifier scope in a lambda is the scope where the lambda is _defined_.
- Dynamic Scoping - Identifier scope in a lambda is the scope where the lambda is _used_.
```
bind n = 1 in []
bind f = (lambda x in x + n) in []
bind n = 2 in []
f 1
```
How do we fix this problem?
*Closures* implement static scoping by including an environment in the function value. Naively:
```
data FBAE where
  ...
  Closure :: String -> FBAE -> Env -> FBAE
  ...
```
```
(lambda x in x+1) [(n,3)]
```
We will keep a copy of the definition environment in the closure: the `Env` in the argument list.
Closures are now function values.
Introducing a Value type returned by the interpreter
```
data FBAEVal where
  NumV :: Int -> FBAEVal
  ClosureV :: String -> FBAE -> Env -> FBAEVal
eval :: Env -> FBAE -> (Maybe FBAEVal)
```
How do we:
1. Use the return value
2. Use the closure for static scoping
```
eval e (Num n) = return (NumV n)
eval e (Plus l r) = do { (NumV l') <- (eval e l) ;
                         (NumV r') <- (eval e r) ;
                         return (NumV (l'+r')) }
eval e (Lambda i s) = return (ClosureV i s e)
eval e (App f a) = do { (ClosureV i s e') <- (eval e f) ;
                        a' <- (eval e a) ;
                        eval ((i,a'):e') s }
```
Where are closures in the evaluator?
Let's try the evaluation one more time.
```
bind n = 1 in [(n,1)]
bind f = (lambda x in x + n) in [(f,(ClosureV x x+n [(n,1)])),(n,1)]
bind n = 2 in [(n,2),(f,(ClosureV x x+n [(n,1)])),(n,1)]
f 1
== (ClosureV x x+n [(n,1)]) 1
== x+n [(x,1),(n,1)]
== 1+1
== 2
```
- Immediate substitution $\Rightarrow$ static scoping
- Deferred substitution $\Rightarrow$ dynamic scoping
- Deferred substitution + closures $\Rightarrow$ static scoping
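
The difference between the last two rows can be seen by running one evaluator in both modes. This is a sketch under assumptions: the `Scoping` flag, the helper `pickEnv` and the example term `scopeExample` are invented for illustration, and the AST is cut down to what the running example needs. The only place the two modes differ is which environment the function body runs in.

```haskell
data FBAE = Num Int
          | Plus FBAE FBAE
          | Id String
          | Lambda String FBAE
          | App FBAE FBAE
          | Bind String FBAE FBAE
          deriving (Show, Eq)

data FBAEVal = NumV Int
             | ClosureV String FBAE Env
             deriving (Show, Eq)

type Env = [(String, FBAEVal)]

-- Hypothetical flag selecting the scoping discipline.
data Scoping = Static | Dynamic

-- Static: use the closure's saved environment. Dynamic: use the caller's.
pickEnv :: Scoping -> Env -> Env -> Env
pickEnv Static  defEnv _      = defEnv
pickEnv Dynamic _      useEnv = useEnv

eval :: Scoping -> Env -> FBAE -> Maybe FBAEVal
eval _ _ (Num n) = return (NumV n)
eval m e (Plus l r) = do { (NumV l') <- eval m e l ;
                           (NumV r') <- eval m e r ;
                           return (NumV (l' + r')) }
eval _ e (Id i) = lookup i e
eval _ e (Lambda i s) = return (ClosureV i s e)
eval m e (App f a) = do { (ClosureV i s e') <- eval m e f ;
                          a' <- eval m e a ;
                          eval m ((i, a') : pickEnv m e' e) s }
eval m e (Bind i a b) = do { a' <- eval m e a ;
                             eval m ((i, a') : e) b }

-- bind n = 1 in bind f = (lambda x in x + n) in bind n = 2 in f 1
scopeExample :: FBAE
scopeExample =
  Bind "n" (Num 1)
    (Bind "f" (Lambda "x" (Plus (Id "x") (Id "n")))
      (Bind "n" (Num 2)
        (App (Id "f") (Num 1))))
```

With `Static` the body of `f` sees `n = 1` from its definition site; with `Dynamic` it sees `n = 2` from the call site.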
## Derived Forms
Defining new language constructs in terms of existing language constructs
`unless` as an example
```
unless ...
1. unless c e or e unless c
2. Unless :: BBAE -> BBAE -> BBAE
3. unless c e == if c then false else e
4.
```
$
\begin{prooftree}
\AXC{\(c\Downarrow False\)}\AXC{\(e\Downarrow v\)}\RLS{UnlessF}
\BIC{\(\mathsf{unless}\ c\ e\Downarrow v\)}
\end{prooftree}
$
$
\begin{prooftree}
\AXC{\(c\Downarrow True\)}\RLS{UnlessT}
\UIC{\(\mathsf{unless}\ c\ e\Downarrow False\)}
\end{prooftree}
$
```haskell
5.
-- assumes BBAE has a Boolean constructor for Boolean values
eval (Unless c e) = do { (Boolean c') <- eval c;
                         if c' then return (Boolean False) else eval e }
6.
```
$
\begin{prooftree}
\AXC{\(\Gamma\vdash c:TBool\)}
\AXC{\(\Gamma\vdash e:TBool\)}
\RLS{TUnless}
\BIC{\(\Gamma\vdash\mathsf{unless}\ c\ e:TBool\)}
\end{prooftree}
$
- _Elaboration_ - Defining new language constructs in terms of existing constructs
1. Define concrete syntax
2. Define abstract syntax
3. _Define elaboration function_
4. Define type rules
5. Extend type inference function
- Embedded Language - New language with extensions
- Host Language - Target language for translation
```haskell
elab :: Embedded Language AST -> Host Language AST
```
To evaluate, translate the embedded language to the host language and execute as usual:
```haskell
evale t = eval [] (elab t)
```
or
```haskell
evale = eval [] . elab
```
Let's introduce a new expression in FBAE that performs increment. Sort of like ++ in C. In fact, we could use the notation `t++` to represent this if we wanted. To keep things simple, let's use `inc t`.
Here is the abstract syntax for the host language, FAE:
```haskell
data FAE where
  Num :: Int -> FAE
  Plus :: FAE -> FAE -> FAE
  Lambda :: String -> FAE -> FAE
  App :: FAE -> FAE -> FAE
  Id :: String -> FAE
```
This is just FBAE without `bind`
Here is the abstract syntax for our embedded language
```haskell
data FAEX where
  NumX :: Int -> FAEX
  PlusX :: FAEX -> FAEX -> FAEX
  LambdaX :: String -> FAEX -> FAEX
  AppX :: FAEX -> FAEX -> FAEX
  IdX :: String -> FAEX
  IncX :: FAEX -> FAEX
```
What changed?
Concrete syntax for the new `inc` function:
```haskell
inc x == x+1
```
Now an inference rule for `inc`:
$
\begin{prooftree}
\AXC{\(x\Downarrow v-1\)}
\UIC{\(inc\ x\Downarrow v\)}
\end{prooftree}
$
Does this work?
Do we even need it?
Now our elaborator:
```haskell
elab :: FAEX -> FAE
elab (NumX n) = Num n
elab (PlusX l r) = Plus (elab l) (elab r)
elab (LambdaX i b) = Lambda i (elab b)
elab (AppX f a) = App (elab f) (elab a)
elab (IdX s) = Id s
elab (IncX x) = Plus (elab x) (Num 1)
```
And finally `eval` that calls `elab` before evaluating `t`:
```haskell
evalX t = eval [] (elab t)
```
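Putting the pieces together gives a runnable pipeline. This is a sketch, not the course implementation: it uses ordinary `data` declarations instead of the `where` syntax above, and the value type `FAEVal` with its closure-based `eval` is an assumption filled in so the example stands alone.

```haskell
-- Host language
data FAE = Num Int
         | Plus FAE FAE
         | Lambda String FAE
         | App FAE FAE
         | Id String
         deriving (Show, Eq)

-- Embedded language: the host language plus IncX
data FAEX = NumX Int
          | PlusX FAEX FAEX
          | LambdaX String FAEX
          | AppX FAEX FAEX
          | IdX String
          | IncX FAEX
          deriving (Show, Eq)

-- Assumed value type for this sketch
data FAEVal = NumV Int
            | ClosureV String FAE Env
            deriving (Show, Eq)

type Env = [(String, FAEVal)]

-- Translate every embedded term to a host term; only IncX changes shape.
elab :: FAEX -> FAE
elab (NumX n) = Num n
elab (PlusX l r) = Plus (elab l) (elab r)
elab (LambdaX i b) = Lambda i (elab b)
elab (AppX f a) = App (elab f) (elab a)
elab (IdX s) = Id s
elab (IncX x) = Plus (elab x) (Num 1)

-- Host evaluator: knows nothing about inc.
eval :: Env -> FAE -> Maybe FAEVal
eval _ (Num n) = return (NumV n)
eval e (Plus l r) = do { (NumV l') <- eval e l ;
                         (NumV r') <- eval e r ;
                         return (NumV (l' + r')) }
eval e (Lambda i b) = return (ClosureV i b e)
eval e (App f a) = do { (ClosureV i b e') <- eval e f ;
                        a' <- eval e a ;
                        eval ((i, a') : e') b }
eval e (Id i) = lookup i e

-- Elaborate, then evaluate in the empty environment.
evalX :: FAEX -> Maybe FAEVal
evalX = eval [] . elab
```

For example, `evalX (IncX (NumX 2))` elaborates to `Plus (Num 2) (Num 1)` before the host evaluator ever sees it.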
What about type checking? Anything wrong with this?
```haskell
typeofX t = typeof [] (elab t) ???
```
We really do need a type checker for our embedded language.
```haskell
type Con = [(String,FAEType)]
typeofX :: Con -> FAEX -> Maybe FAEType
typeofX _ (NumX _) = return TNum
typeofX c (PlusX l r) = do { TNum <- typeofX c l;
                             TNum <- typeofX c r;
                             return TNum }
...
typeofX c (IncX t) = do { TNum <- typeofX c t ;
                          return TNum }
typeofX c (IdX s) = lookup s c
interpX t = do { typeofX [] t;
                 evalX t }
```
What about `bind`?
`bind i = v in t == (app (lambda i t) v)`
1. Concrete Syntax - already defined for FBAE
2. Abstract Syntax - already defined for FBAE
3. Evaluation Rules - already defined for FBAE
4. Elaboration - some work required
5. Type Rules - already defined for FBAE
6. `typeof` - already defined for FBAE
Let's define the `elab` case for `bind`:
```
elab (Bind i a t) = (App (Lambda i (elab t)) (elab a))
```
# Recursion
## Dynamically Scoped Recursion
```
bind fact = []
lambda x in
if x=0 then 1 else x * (fact x-1) in
(fact 3)
== (fact 3) [(fact,(lambda x in ...))]
== ((lambda x in if x=0 then 1 else x * (fact x-1)) 3)
== if x=0 then 1 else x * (fact x-1) [(x,3),(fact,(lambda x in ...))]
== if 3=0 then 1 else 3 * ((lambda x in ...) 3-1)
== 3 * ((lambda x in ...) 2)
== 3 * if x=0 then 1 else x * (fact x-1) [(x,2),(x,3),(fact,(lambda x in ...))]
== 3 * if 2=0 then 1 else 2 * ((lambda x in ...) 2-1)
== 3 * 2 * ((lambda x in ...) 1)
== 3 * 2 * if x=0 then 1 else x * (fact x-1) [(x,1),(x,2),(x,3),(fact,(lambda x in ...))]
== 3 * 2 * if 1=0 then 1 else 1 * ((lambda x in ...) 0)
== 3 * 2 * 1 * ((lambda x in ...) 0)
== 3 * 2 * 1 * if 0=0 then 1 else 0 * (fact x-1) [(x,0),(x,1),(x,2),(x,3),(fact,(lambda x in ...))]
== 3 * 2 * 1 * 1
== 6
```
### Execution Sequence
((lambda x in if x=0 then 1 else x * (fact x-1)) 3) [(fact,(lambda x in ...))]
Recursion works like recursion should. Cool
## Statically Scoped Recursion
```
bind fact = []
lambda x in
if x=0 then 1 else x * (fact x-1) in [(fact,closure x ... [])]
(fact 3)
== (fact 3) [(fact,closure x ... [])]
== ((closure x ... []) 3) [(fact,closure x ... [])]
== if 3=0 then 1 else 3 * (fact 3-1) [(x,3),(fact,closure x ... [])]
== 3 * (fact 2) [(x,3),(fact,closure x ... [(fact,closure x ... [])])]
== 3 * ((closure x ... []) 2) [] -- CAN'T HAPPEN: fact is not in the environment
==
```
### Execution Sequence
((closure x (if x=0 then 1 else x*(fact x-1))) 3) [(fact,closure x ... [])]
== if x=0 then 1 else x*(fact x-1) [(x,3)]
== 3 * fact 2 [(x,3)]
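
Both execution sequences can be reproduced with a small evaluator. This is only a sketch: the `If0`, `Minus` and `Mult` constructors, the `Bool` flag and the term `fact3` are assumptions introduced so the factorial example can be written down, not part of the FBAE definition used in class.

```haskell
data FBAE = Num Int
          | Minus FBAE FBAE
          | Mult FBAE FBAE
          | If0 FBAE FBAE FBAE      -- hypothetical: if first term is 0 then ... else ...
          | Id String
          | Lambda String FBAE
          | App FBAE FBAE
          | Bind String FBAE FBAE
          deriving (Show, Eq)

data FBAEVal = NumV Int
             | ClosureV String FBAE Env
             deriving (Show, Eq)

type Env = [(String, FBAEVal)]

-- True:  static scoping, the body runs in the closure's environment.
-- False: dynamic scoping, the body runs in the caller's environment.
eval :: Bool -> Env -> FBAE -> Maybe FBAEVal
eval _ _ (Num n) = return (NumV n)
eval st e (Minus l r) = do { (NumV l') <- eval st e l ;
                             (NumV r') <- eval st e r ;
                             return (NumV (l' - r')) }
eval st e (Mult l r) = do { (NumV l') <- eval st e l ;
                            (NumV r') <- eval st e r ;
                            return (NumV (l' * r')) }
eval st e (If0 c t f) = do { (NumV c') <- eval st e c ;
                             if c' == 0 then eval st e t else eval st e f }
eval _ e (Id i) = lookup i e
eval _ e (Lambda i b) = return (ClosureV i b e)
eval st e (App f a) = do { (ClosureV i b e') <- eval st e f ;
                           a' <- eval st e a ;
                           eval st ((i, a') : (if st then e' else e)) b }
eval st e (Bind i a b) = do { a' <- eval st e a ;
                              eval st ((i, a') : e) b }

-- bind fact = lambda x in if x=0 then 1 else x * (fact (x-1)) in (fact 3)
fact3 :: FBAE
fact3 =
  Bind "fact"
    (Lambda "x"
      (If0 (Id "x")
           (Num 1)
           (Mult (Id "x") (App (Id "fact") (Minus (Id "x") (Num 1))))))
    (App (Id "fact") (Num 3))
```

Under dynamic scoping `fact` is found in the caller's environment on every recursive call. Under static scoping the closure was built in the empty environment, so the lookup of `fact` inside the body fails and `eval` returns `Nothing`.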
## Omega (redux)
```
bind o = lambda x in x x in
o o
== (lambda x in x x)(lambda x in x x)
== (x x) [(x,(lambda x in x x))]
== (lambda x in x x)(lambda x in x x)
```
Does this work with closures?
```
bind o = lambda x in x x in
o o
== o o [(o,(closure x in x x []))]
== (closure x in x x [])(closure x in x x []) [(o,(closure x in x x []))]
== (x x) [(x,(closure x in x x []))]
== (closure x in x x [])(closure x in x x []) [(o,(closure x in x x []))]
```
## The Y Combinator
```
bind Y = (lambda f in ((lambda x in (f (x x)))
                       (lambda x in (f (x x)))))
in (Y F)
== ((lambda x in (F (x x))) (lambda x in (F (x x))))
== (F (x x)) [(x,(lambda x in (F (x x))))]
== (F ((closure x in (F (x x)) []) (closure x in (F (x x)) [])))
== (F (F (x x))) [(x,(lambda x in (F (x x))))]
== (F (F (F (x x)))) [(x,(lambda x in (F (x x))))]
```
- `f` is the function being called recursively
- `F` is the function we're applying recursively, not an identifier
The Y Combinator calculates a _fixed point_ for F
Calculating `sum` as an example.
```
bind F = (lambda g in (lambda z in if z=0 then z else z + (g (z-1)))) in
bind Y = (lambda f in ((lambda x in (f (x x)))
                       (lambda x in (f (x x)))))
in ((Y F) 5)
```
Ummm. Where is `sum` defined here?
```
== (((lambda x in (F (x x))) (lambda x in (F (x x)))) 5)
Apply the first lambda to the second lambda
== ((F (x x)) 5)
[(x,(lambda x in (F (x x))))]
Expand F
== (((lambda g in (lambda z in if z=0 then ...)) (x x)) 5)
[(x,(lambda x in (F (x x))))]
Bind g to (x x) and evaluate the body
== ((lambda z in if z=0 then z else z + (g (z-1))) 5)
[(g,(x x)),(x,(lambda x in (F (x x))))]
Bind 5 to z
== (z + (g (z-1)))
[(z,5),(g,(x x)),(x,(lambda x in (F (x x))))]
Substitute z
== (5 + ((x x) 4))
[(z,5),(g,(x x)),(x,(lambda x in (F (x x))))]
Substitute x
== (5 + (((lambda x in (F (x x))) (lambda x in (F (x x)))) 4))
repeat...
== (5+(4+(3+(2+(1+(((lambda x in (F (x x))) (lambda x in (F (x x)))) 0))))))
[(z,1),(z,2),(z,3),(z,4),(z,5),(g,(x x)),(x,(lambda x in (F (x x))))]
== (5+(4+(3+(2+(1+((lambda z in if z=0 then 0 else z + (g (z-1))) 0))))))
== (5+(4+(3+(2+(1+0)))))
== 15
```
# Midterm Review
- Interpreters
- Concrete and Abstract Syntax
- Parsing and Interpretation
- Values and terms
- Predicting types
- Optimization
- Extending Languages
- Simple Inference Rules
- Identifiers and Substitution
- Binding identifiers to values
- Substitution and Interpretation
- Environments and Deferring Substitution
- Identifier Scoping
- Static and Dynamic Scoping
- Functions
- Taxonomy of functions - first order, higher order, first class
- Interpreting functions
- Closures for functions
There will be very little if any Haskell on the exam. You will be asked to analyze code written in our little languages like FBAE, BAE and friends. I will provide those language definitions for you, so there is no need to memorize them.
# Typing functions
```
bind inc = lambda x in x + 1 in
(inc inc)
```
This is a problem, but why?
```
bind x=1 in [(x,1)] Evaluation
x+1
```
```
bind x=1 in [(x,TNum)] Type checking
x+1
== x+1 [(x,TNum)]
== TNum + TNum
== TNum
```
Where do 1 and `TNum` come from?
Let's do the same with lambda expressions:
```
(lambda x in [(x,???)] Evaluation
x+1)
== x+1 [(x,???)]
(lambda x in [(x,???)] Type checking
x+1)
((lambda x in x+1) 1) Now what???
```
What's the issue here?
- `T->T` is a _function type_ or _signature_
Our concrete syntax for types becomes:
```
T ::= TNum | TBool | T -> T
```
`->` can be thought of as a type constructor.
We will often write `D->R` where:
- D - domain type
- R - range type
What's different about a function type?
Function types represent promises:
- `TBool->TNum` - Input a Boolean and get back a number
- `TNum->TNum->TNum` - Input a number and get back a function from number to number
- `TNum->TNum->TBool` - Input a number and get back a function from number to Boolean
Curry-Howard says function types are also theorems. We'll come back to that.
Unsurprisingly, `lambda` has a function type:
`lambda x in t : D->R`
## Finding D and R
Let's start with `bind`. What is the type of `bind`?
```
[] |- bind x=1 in
x+1 : T
```
$\Gamma=[\ ]$ derives `bind...` is of some type `T`
`T` is the type of the _body_ of bind. That's what `bind` returns.
To find the type of the body, add type of `x` to the empty context:
```
== [(x,TNum)] |- x+1 : T
== TNum + TNum : T
== TNum
```
Where did we get that type?
The type of `bind` is the type of the _body_ with the type of the _identifier_ added to its context.
Does the same thing work for lambda?
Let's look at the type of `bind` as `app`
`bind x = 3 in x+1 == ((lambda x in x+1) 3)`
- Where do we get `D`?
- Where do we get `R`?
- What happens when we pull the `lambda` out of the `app`?
`(lambda x in x+1):D->R`
- Where do we get `D`?
- Where do we get `R`?
```
[] |- lambda x in x+1 : D->R
== [(x,???)] |- x+1 :
==
```
- We need the type of `x` to compute the type of the function body.
- Where do we get it?
- `D` is the type of the input parameter that we don't have.
- `R` is the type of the body _given the parameter_ type. Just like `bind`.
Let's assume the type of `x`:
```
[] |- lambda x:TNat in x+1 : TNat->TNat
== [(x,TNat)] |- x+1 : TNat
```
Can we find `D` and `R`?
- The type of `x` is `D` because we assumed it so
- The type of the body assuming `x:D` is `R`
```
[]  lambda x:TNum in [(x,TNum)]
x+1 [(x,TNum)]
== TNum + TNum
== TNum
== TNum -> TNum
```
If you give the lambda `x:D` you will get `z:R`
What is the scope of `x`?
`f::Int->Int->Int`
`f x::Int->Int`
`f x y::Int`
```
lambda x:TNum in
lambda y:TNum in
x+y
lambda x:TNum in [(x,TNum)]
lambda y:TNum in [(y,TNum),(x,TNum)]
x+y:TNum
== TNum -> TNum
== TNum -> TNum -> TNum
```
```
lambda f:TBool->TNum in [(f,TBool->TNum)]
lambda a:TBool in [(a,TBool),(f,TBool->TNum)]
f a : TNum
== TBool->TNum
== (TBool -> TNum) -> TBool -> TNum
```
## Type Inference for Lambda
The Haskell code:
```haskell
typeof c (Lambda x d t) = do { r <- typeof ((x,d):c) t;
                               return (d:->:r) }
```
Type Rule:
$
\begin{prooftree}
\AXC{\((x:D):\Gamma\vdash t:R\)}\RLS{LambdaT}
\UIC{\(\Gamma\vdash(\mathsf{lambda}\ x:D\ \mathsf{in}\ t):D\rightarrow R\)}
\end{prooftree}
$
`app` makes us keep our promise
What is the type of:
`((lambda x:TNum in x+1) 2) : T`
We know the type of the lambda and the actual parameter:
- `(lambda x:TNum in x+1) : TNum -> TNum`
- `2 : TNum`
What is the type of the `App`?
What is the type of:
`((lambda x:TNum in x+1) True) : T`
We know the type of the lambda and the actual parameter:
- `(lambda x:TNum in x+1) : TNum -> TNum`
- `True : TBool`
What is the type of the `App`? Did we keep our promise?
## Type Inference for App
The Haskell code:
```haskell
typeof c (App f a) = do { (d :->: r) <- typeof c f;
                          a' <- typeof c a;
                          if a'==d then return r else Nothing }
```
Type rule:
$
\begin{prooftree}
\AXC{\(\Gamma\vdash f:D\rightarrow R\)}\AXC{\(\Gamma\vdash a:D\)}\RLS{TApp}
\BIC{\(\Gamma\vdash (f\ a):R\)}
\end{prooftree}
$
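The two rules, LambdaT and TApp, fit together as in the following sketch. It is self-contained under some assumptions: the type `TFBAE`, the `Boolean` constructor, the fixity declaration for `:->:` and the example term `incFn` are filled in here for illustration, and the AST carries only the terms the examples need.

```haskell
-- Types: numbers, Booleans and function types D :->: R.
data TFBAE = TNum
           | TBool
           | TFBAE :->: TFBAE
           deriving (Show, Eq)

infixr 5 :->:   -- so TNum :->: TNum :->: TNum reads as TNum :->: (TNum :->: TNum)

data FBAE = Num Int
          | Boolean Bool
          | Plus FBAE FBAE
          | Id String
          | Lambda String TFBAE FBAE   -- lambda x:D in t
          | App FBAE FBAE

type Cont = [(String, TFBAE)]

typeof :: Cont -> FBAE -> Maybe TFBAE
typeof _ (Num _) = return TNum
typeof _ (Boolean _) = return TBool
typeof c (Plus l r) = do { TNum <- typeof c l ;
                           TNum <- typeof c r ;
                           return TNum }
typeof c (Id i) = lookup i c
-- LambdaT: assume x:D, find the body type R, promise D :->: R.
typeof c (Lambda x d t) = do { r <- typeof ((x, d) : c) t ;
                               return (d :->: r) }
-- TApp: the argument type must match the promised domain.
typeof c (App f a) = do { (d :->: r) <- typeof c f ;
                          a' <- typeof c a ;
                          if a' == d then return r else Nothing }

-- lambda x:TNum in x+1
incFn :: FBAE
incFn = Lambda "x" TNum (Plus (Id "x") (Num 1))
```

Applying `incFn` to `Num 2` gets type `TNum`; applying it to `Boolean True`, or to itself, is rejected because the argument type does not equal the domain `TNum`.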
Examples:
```
bind inc = lambda x:TNum in x+1 in [(inc,TNum->TNum)]
inc 3
== inc : TNum->TNum, 3:TNum
== TNum
```
```
bind plus = (lambda x:TNum in [(x,TNum)]
(lambda y:TNum in [(y,TNum),(x,TNum)]
x+y)) in
plus 3 4 : TNum [(plus,TNum->TNum->TNum)]
plus 4 : (TNum->TNum)
(plus 3) 4 :
(TNum->TNum)->TNum
TNum->(TNum->TNum)
```
```
(bind app = lambda f:(TBool->TNum) in [(f,(TBool->TNum))]
lambda a:TBool in [(a,TBool),(f,(TBool->TNum))]
f a:TNum in [(app,(TBool->TNum)->TBool->TNum)]
app (lambda x:TBool in
if x then 1 else 0) : TBool -> TNum
3 :TNum) : TNum
```
# Simply Typed Lambda Calculus
Functions and function types. Nothing else.
```
STC ::= id | lambda id in STC | STC STC
T ::= T -> T
```
```
lambda x:T in lambda y:T in x : T->T->T -- BAD No T
```
```
lambda x:T in lambda y:T in ((x false) y) -- BAD No Booleans
```
```
omega = (lambda x:T->T->T->T in (x x))(lambda x:T in (x x))
```
We cannot define types for recursive functions.
Actually, we cannot write functions at all.
A perfectly fine language that we can write no programs in. Mathematical oddities abound...
What would fix this problem?
# Recursion (again)
What we know so far about recursion
1. Recursion works in untyped, dynamically scoped FBAE (Lambda, Bind, AE)
2. Recursion does not work in untyped, statically scoped FBAE
3. Recursion does not work in typed FBAE
```
bind f=(lambda x in if x=0 then 1 else x*(f (x-1))) in [(f,(ClosureV "x" (if ...) []))]
f 0
== (ClosureV "x" (if x=0 then 1 else x*(f (x-1))) []) 0
== if 0=0 then 1 else ...
== 1
f 1
== (ClosureV "x" (if x=0 then 1 else x*(f (x-1))) []) 1
== if 1=0 then 1 else 1*(f 0) []
```
### Solution
```
== (if x=0 then 1 else x*(f (x-1))) [(x,1),(f,(ClosureV ... []))]
== 1*(f 0) [(x,1),(f,(ClosureV ... []))]
== 1*if 0=0 then 1 else ...
== 1
```
## How to _fix_ this problem?
```
bind f=lambda x in if x=0 then 1 else x*(f (x-1))
```
Interpret the lambda and get:
```
(ClosureV "x" TNum (if x=0 then 1 else x * (f (x-1))) [])
```
Apply to 0:
```
(ClosureV "x" TNum (if x=0 then 1 else x * (f (x-1))) []) 0 e
== 1
```
Now apply to 1:
```
(ClosureV "x" TNum (if x=0 then 1 else x * (f (x-1))) []) 1 e
==
```
Let's spike the closure with its own definition!
```
((ClosureV "x" TNum (if x=0 then 1 else x*(fact (x-1)))
[(fact,(ClosureV "x" TNum (if ...) []))]) 1 e)
== if 1=0 then 1 else 1*(fact 0)
== 1*((ClosureV "x" TNum (if ...) []) 0)
== 1*1
((ClosureV ...) 2 e) [(fact,(ClosureV "x" TNum (if ...) [(fact,(ClosureV "x" TNum (if ...) []))]))]
==
```
Let's spike the closure's closure with its own definition!
```
((ClosureV "x" TNum (if x=0 then 1 else x*(fact (x-1)))
[(fact,(ClosureV "x" TNum (if ...)
[(fact,(ClosureV "x" TNum (if ...) []))]))]) 2 e)
==
```
What about 3?
Pre-seeding the environment doesn't work. Why?
What's the problem?
## The Fix
Let's define a thing called `fix` that takes a function as its argument.
When evaluated, `fix` will substitute its function argument into itself _before_ evaluation. What might that look like?
```
bind f = (lambda g in
           (lambda x in
             (if x=0 then 1 else x * (g (x-1))))) in
((fix f) 2)
==
```
`f` takes two arguments; the first is a function `g` that defines the recursive call. Have we seen this?
`fix` will replace `g` with `(fix f)`.
`(fix f)` - not `f` - then evaluate.
What does that do?
Let's look at an inference rule:
$
\begin{prooftree}
\AXC{\([g\rightarrow(\mathsf{fix}\ (\mathsf{lambda}\ g\ \mathsf{in}\ t))]t\Downarrow v\)}\RLS{FixE}
\UIC{\((\mathsf{fix}\ (\mathsf{lambda}\ g\ \mathsf{in}\ t))\Downarrow v\)}
\end{prooftree}
$
`fix` is copying itself _before_ evaluating. Remember substitution!
Not quite as simple as throwing `fix` in front of a function.
- We are effectively passing in the recursive call as a parameter
- $g$ is the recursive function, not the data parameter
- Need to extend the actual function to account for this new parameter
```
bind f = (lambda g in
           (lambda x in
             (if x=0 then 1 else x * (g (x-1))))) in
((fix f) 2)
== ((lambda x in
      (if x=0 then 1 else x * ((fix f) (x-1)))) 2)
== (if 2=0 then 1 else 2 * ((fix f) (2-1)))
== 2 * ((fix f) 1)
== 2 * ((lambda x in
          (if x=0 then 1 else x * ((fix f) (x-1)))) 1)
== 2 * (if 1=0 then 1 else 1 * ((fix f) (1-1)))
== 2 * 1 * ((fix f) 0)
== 2 * 1 * ((lambda x in
              (if x=0 then 1 else x * ((fix f) (x-1)))) 0)
== 2 * 1 * (if 0=0 then 1 else 0 * ((fix f) (0-1)))
== 2 * 1 * 1
```
- `fix f` creates one recursive step at a time.
- nothing is recursive? Really?
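
The same trick can be tried in Haskell itself. A minimal sketch: `fix'` is written by hand here (it behaves like `fix` from `Data.Function`), and `factF` plays the role of the extended function `f` above, with `g` as the extra parameter standing for the recursive call. The names are chosen for this example only.

```haskell
-- fix' f unrolls to f (fix' f): one recursive step at a time.
-- This relies on Haskell's lazy evaluation; the inner (fix' f) is only
-- expanded when the body actually calls g.
fix' :: (a -> a) -> a
fix' f = f (fix' f)

-- The non-recursive form: g is the recursive call, x is the data parameter.
factF :: (Integer -> Integer) -> (Integer -> Integer)
factF g x = if x == 0 then 1 else x * g (x - 1)

-- The recursive form is the fixed point of factF.
fact :: Integer -> Integer
fact = fix' factF
```

Note that `factF` never mentions itself; all of the recursion comes from `fix'`.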
Apply fix to the extended function and we get...
### Old Example
```
bind f = (lambda g in
           (lambda x in
             (if x=0 then 1 else x * (g (x-1))))) in
((fix f) 2)
== [g->fix (lambda g b)]b
== (lambda x in
     (if x=0 then 1 else x * (g (x-1))))
== (lambda x in
     (if x=0 then 1 else x * ((fix (lambda g b)) (x-1)))) 2
== 2 * ((fix (lambda g b)) 1)
== 2 * ((fix (lambda g (lambda x in
          if x=0 then 1 else x * (g (x-1))))) 1)
== 2 * (lambda x in
     if x=0 then 1 else x * ((fix (lambda g b)) (x-1))) 1
== 2*1
== 2*1*1
== 2
```
## Extending FBAE for `fix`
We have everything we need. First, the eval case:
```haskell
eval e (Fix f) = do { (ClosureV g b e') <- (eval e f);
                      eval e' (subst g (Fix (Lambda g b)) b) }
```
This is the same `subst` that we defined before
Look carefully at what is being substituted. What is it??
## Typing Fix
`f` == the non-recursive form of `f`
`(fix f)` == the recursive form of function `f`
What is the type of `(fix f)`?
```
(fix f):D->R
```
What is the type of the lambda for factorial?
```
f = (lambda g:(D->R)->D->R in
      (lambda x:TNat in
        if x=0 then 1 else x*(g (x-1))))
```
And the fix?
```
(fix f):D->R
```
$
\begin{prooftree}
\AXC{\(\Gamma\vdash f:D\rightarrow R\)}
\UIC{\(\Gamma\vdash (\textsf{fix}\ f):R\)}
\end{prooftree}
$
What is the type of the lambda for factorial?
```
f = (lambda g:TNat->TNat in
      (lambda x:TNat in
        if x=0 then 1 else x*(g (x-1))))
(fix f):TNat->TNat
f:(TNat->TNat)->(TNat->TNat)
```
And now for something completely different...
# Mutable State
The big difference between Haskell and C
```
... x=x+3 ; y=x++ ...
```
What does the `“;”` do?
```
bind x=(f a) in
(g 3)
== ((lambda _ in (g 3)) (f a))
```
What’s happening besides substitution?
```
do { l' <- eval l ;
     r' <- eval r ;
     return (op l' r') }
```
What’s happening besides binding?
Order! We tend to think of two paradigms:
- functional - ordering is achieved via calling convention
- imperative - ordering is achieved via sequencing mutable state
What’s happening in this sequence? Where does x get a value?
```
x := 1; x := x + 1; x := x + 1; ...
```
`:=` is an assignment that updates a variable in the next state
```
x := x+1 == x’ = x+1
```
The tick notation traditionally means next.
The next `x` is the current `x` plus 1
## Adding State to FBAE
How is _state_ different than _environment_?
1. Add a `Seq` sequence operator
2. Define a store
3. Update eval to maintain state in addition to environment
New concrete syntax: `t ; t`
New abstract syntax: `Seq::FBAE->FBAE->FBAE`
Define sequence by elaboration: `Seq l r` == `(App (Lambda x r) l)`
```
eval q (Seq l r) = do { s <- eval q l; eval s r }
```
Interesting syntax!
We'll skip typing for now.
## New operations for state
```
FBAE ::= new FBAE | deref FBAE | set loc FBAE
```
- `loc` - new type of value for locations
- `new t` - creates a new location, puts t there, returns the location
- `deref l` - retrieves a value from l and returns it
- `set l t` - stores t in location l and returns l
```
deref (new 5)
== 5
```
```
deref (set (new 5) 6)
== 6
```
```
bind l = new 2+3 in deref l
== 5
```
```
bind l = new 2+3 in
set l ((deref l) + 1) ; (deref l)
==
```
```
l := r == (set l (deref r))
```
```
bind m = new 5 in
bind n = m in
set m 6 ; deref n
== 6
```
```
bind m = new 5 in
bind n = m in
bind n = new 5 in
set m 6 ; deref n
== 5
```
```
bind inc = (lambda l in (set l ((deref l) + 1))) in
bind n = new 5 in
inc n ; deref n
== 6
```
## Implementing State
Store as a function with location as a number
We could use an array or a sequence, but let's try a technique used in formal modeling.
```
type Sto = Loc -> Maybe FBAEVal
type Loc = Int
```
`Sto` is the store and is a function from `Loc` to `Maybe FBAEVal`.
Why a `Maybe`?
- `Just x` - a good location is accessed.
- `Nothing` - a bad location is accessed.
What are good and bad memory locations?
What does this have to do with language design?
Memory contains `FBAEVal`.
- How is this different from a C pointer?
- How is this different from a Java reference?
What does this have to do with language design?
Dereferencing is just using the store as a function:
```
derefSto s l = (s l)
```
The initial store is a store with nothing in it.
```
initSto :: Sto
initSto x = Nothing
```
- What is `initSto 3`?
- What is `initSto` for any value?
Updating the store is a bit trickier...
`m0 = \l -> if l=3 then 1 else Nothing`
`m1 = \l -> if l=1 then 2 else (m0 l)`
`m2 = \l -> if l=2 then 0 else (m1 l)`
`m3 = \l -> if l=3 then 4 else (m2 l)`
```
setSto :: Sto -> Loc -> FBAEVal -> Sto
setSto s l v = \m -> if m==l then (Just v) else (s m)
```
- `\x -> t` is `lambda x in t` in FBAE
- Given that, how does this work?
- Why would I ever do this?
The store becomes a collection of nested if statements. Not the most efficient, but it does what it’s supposed to for our purposes.
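Here is a small runnable sketch of the store-as-function idea. The one-constructor `FBAEVal` is a stand-in assumed for this example, and `m3` rebuilds the chain of updates shown above using `setSto`.

```haskell
-- Hypothetical stand-in for the interpreter's value type.
data FBAEVal = NumV Int deriving (Show, Eq)

type Loc = Int
type Sto = Loc -> Maybe FBAEVal

-- The empty store: every location is bad.
initSto :: Sto
initSto _ = Nothing

-- Dereferencing is applying the store to a location.
derefSto :: Sto -> Loc -> Maybe FBAEVal
derefSto s l = s l

-- Updating wraps the old store in one more test: answer for l directly,
-- otherwise ask the old store.
setSto :: Sto -> Loc -> FBAEVal -> Sto
setSto s l v = \m -> if m == l then Just v else s m

-- Location 3 gets 1, location 1 gets 2, location 2 gets 0, then 3 is overwritten with 4.
m3 :: Sto
m3 = setSto (setSto (setSto (setSto initSto 3 (NumV 1)) 1 (NumV 2)) 2 (NumV 0)) 3 (NumV 4)
```

The newest update is tested first, so `derefSto m3 3` finds the overwritten value and never reaches the older binding for location 3.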
Putting things together to model memory:
```
type Store = (Loc,Sto)
```
A location, store pair.
- `loc` is the next memory element
- `sto` is the current memory
```
derefStore (_,s) l = derefSto s l
```
Dereferencing is accessing the memory location
```
initStore = (0,initSto)
```
The initial memory is the empty store with a next location of 0.
```
setStore (m,s) l v = (m,setSto s l v)
```
Storing a value
```
newStore (l,s) = (l+1,s)
```
Allocating memory location.
- Now we know what `l` is for
- Keep track of where the next memory location should be.
## Integrating Mutable Store
How does store differ from environment?
- Changes to store persist across scopes
- Locations are values
Let's add a constructor for locations
```
data FBAEVal where
  ...
  Loc :: Int -> FBAEVal
```
Seems simple enough
- `eval` can return a location
- Locations can be calculated? Maybe?
- Locations can be stored? Maybe?
Our choices are important. Even the tiny ones.
How to implement mutability of `Store`??
Start with a new return value:
```
type Retval = (FBAEVal,Store)
```
`Retval` is the return value for our interpreter
- A value from FBAE
- The resulting store
What's this for??
```
eval :: Env -> Store -> FBAE -> Maybe Retval
```
`eval` now takes an environment _and_ a store.
- `Env` - identifiers and values in scope
- `Store` - contents and current location in memory.
What should an initial call to `eval` look like?
- environment
- store
- term
`x:=x+1 ; x:=x+1`
```
eval e s (Seq l r) = do { (v,s') <- (eval e s l) ;
                          (v',s'') <- (eval e s' r) ;
                          return (v',s'') }
```
- One thing, then the other....
```
eval e s (Set l t) = do { ((Loc l'),s') <- (eval e s l) ;
                          (v,s'') <- (eval e s' t) ;
                          return ((Loc l'),(setStore s'' l' v)) }
```
- All that work finally pays off - `Set` calls `setStore`
- What is the pattern matching in the `do` clause doing?
```
eval e s (Plus l r) = do { ((NumV l'),s') <- (eval e s l) ;
                           ((NumV r'),s'') <- (eval e s' r) ;
                           return ((NumV (l'+r')),s'') }
```
This passing of state is the new idiom:
```
eval e s (Deref t) = do { ((Loc l'),s') <- (eval e s t) ;
                          v <- (derefStore s' l') ;
                          return (v,s') }
```
Let's unpack `New` and figure out what's going on:
```
eval e s (New t) = do { (t',(l,s')) <- eval e s t ;
                        return ((Loc l),(setStore (newStore (l,s')) l t')) }
```
Wow. What the heck?
- `(t',(l,s'))` - `t'` value to be stored in `s'` in `l`
- `(Loc l+1)` - next location after `l`
- `(newStore (l,s'))` - new store with `l` allocated
- `(setStore ...)` - stores `t'`
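
To see the state-passing idiom run end to end, here is a self-contained sketch. It is an assumption-laden miniature rather than the course interpreter: the AST has only the constructors the examples need, locations are plain `Int`s, `Set` evaluates its first argument to a location and returns that location, and the helper `run` and the terms `aliasExample` and `freshExample` are invented here.

```haskell
data FBAE = Num Int
          | Plus FBAE FBAE
          | Id String
          | Bind String FBAE FBAE
          | Seq FBAE FBAE
          | New FBAE
          | Deref FBAE
          | Set FBAE FBAE

data FBAEVal = NumV Int
             | Loc Int
             deriving (Show, Eq)

type Env = [(String, FBAEVal)]
type Sto = Int -> Maybe FBAEVal
type Store = (Int, Sto)            -- (next free location, memory)
type Retval = (FBAEVal, Store)

initStore :: Store
initStore = (0, \_ -> Nothing)

derefStore :: Store -> Int -> Maybe FBAEVal
derefStore (_, s) l = s l

setStore :: Store -> Int -> FBAEVal -> Store
setStore (n, s) l v = (n, \m -> if m == l then Just v else s m)

newStore :: Store -> Store
newStore (n, s) = (n + 1, s)

-- Every case threads the store: each sub-evaluation receives the store
-- produced by the one before it.
eval :: Env -> Store -> FBAE -> Maybe Retval
eval _ s (Num n) = return (NumV n, s)
eval e s (Plus l r) = do { (NumV l', s') <- eval e s l ;
                           (NumV r', s'') <- eval e s' r ;
                           return (NumV (l' + r'), s'') }
eval e s (Id i) = do { v <- lookup i e ;
                       return (v, s) }
eval e s (Bind i a b) = do { (a', s') <- eval e s a ;
                             eval ((i, a') : e) s' b }
eval e s (Seq l r) = do { (_, s') <- eval e s l ;
                          eval e s' r }
eval e s (New t) = do { (v, (l, sto)) <- eval e s t ;
                        return (Loc l, setStore (newStore (l, sto)) l v) }
eval e s (Deref t) = do { (Loc l, s') <- eval e s t ;
                          v <- derefStore s' l ;
                          return (v, s') }
eval e s (Set l t) = do { (Loc l', s') <- eval e s l ;
                          (v, s'') <- eval e s' t ;
                          return (Loc l', setStore s'' l' v) }

-- Run a closed term from the empty environment and store; keep only the value.
run :: FBAE -> Maybe FBAEVal
run t = fmap fst (eval [] initStore t)

-- bind m = new 5 in bind n = m in set m 6 ; deref n
aliasExample :: FBAE
aliasExample =
  Bind "m" (New (Num 5))
    (Bind "n" (Id "m")
      (Seq (Set (Id "m") (Num 6)) (Deref (Id "n"))))

-- bind m = new 5 in bind n = new 5 in set m 6 ; deref n
freshExample :: FBAE
freshExample =
  Bind "m" (New (Num 5))
    (Bind "n" (New (Num 5))
      (Seq (Set (Id "m") (Num 6)) (Deref (Id "n"))))
```

In `aliasExample` both identifiers hold the same location, so the update through `m` is visible through `n`. In `freshExample` the second `new` allocates a different location and `n` still dereferences to 5.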
# Variables and Assignment
Let's do this together...
We have utilities for mutable store.
What things do we need?
- Variable declaration
- Variable dereference
- Variable assignment
We want to control the way that memory gets accessed.
- No more `set`, `deref`, `new`
## Variable Declaration
- What tools might we have?
`var x:=t1` == `bind x=new t1 in t2`
```
var x:=0
var y:=1
var z:=3
```
## Variable Dereference
- It would be nice to just say `x+1`
- Is that possible if `x` is a variable?
1. Get the variable's location from `e`
2. Get the value from `s` by dereferencing the location
`(deref (lookup x e) s)` - value stored in the location stored in `x`
## Variable Assignment
- What tools might we have?
`x := t == (set x t)`
```
Asn :: FBAE -> FBAE -> FBAE
x := x+y+z
```
- Evaluate the right side
- Already declared variable
- Variable - Identifier whose value can be changed
`(set x t)`
## Type Checking Variables
- What tools might we have?
- `bind x=new 3 in x:=x+1`

`var x:=3; [x:NumT]`
`x:=x+1`
```
(LocT tl) <- typeof c l
tr <- typeof c r
(LocT tv) <- lookup x c
```
`LocT :: TFBAE -> TFBAE`
## Throwing Errors
- What tools might we have?
`eval _ bang = Nothing`
## While Loop
- What tools might we have?
`while c do t == if c then (t ; while c do t) else skip`
`while c do t == lambda f in lambda `
```
while x<5 do x:=x+1
== w = lambda _ in if x<5 then x:=x+1 ; w _ else skip
== lambda w in lambda _ in if x<5 then x:=x+1 ; w _ else skip
== fix (lambda w in (lambda _ in if x<5 then x:=x+1 ; w _ else skip))
```
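The last line of that derivation can be mimicked in Haskell with `fix` from `Data.Function`. This is a sketch under a simplifying assumption: the mutable store is modeled as a plain state value `s`, the condition and body are ordinary functions over that state, and the names `while` and `countToFive` are chosen for this example.

```haskell
import Data.Function (fix)

-- while cond body: keep applying body to the state while cond holds.
-- w is the recursive call that fix passes back in, like w in the derivation.
while :: (s -> Bool) -> (s -> s) -> s -> s
while cond body = fix (\w s -> if cond s then w (body s) else s)

-- while x<5 do x:=x+1, starting from x = 0
countToFive :: Int
countToFive = while (< 5) (+ 1) 0
```

When the condition is already false the state is returned unchanged, which plays the role of `skip`.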
## Goto
- Should we bother?
## Lists
## Objects & Object Oriented
```
let F=(lambda f in (lambda x in if x=0 then 1 else x*(f (x-1)))) in
let fact=(fix F) in
(fact 3)
```
```
let fact = (lambda x in if x=0 then 1 else x*(fact (x-1))) in
(fact 3)
```
```
typeofM c (Plus t1 t2) = do {NumT <- typeofM c t1;
                             NumT <- typeofM c t2;
                             return NumT
}
```
`D -> R`
```
typeofM c (Lambda x d b) = do { r <- typeofM ((x,d):c) b;
                                return (d :->: r) }
```
```
typeofM c (Bind x t1 t2) = do { tx <- (typeofM c t1);
                                (typeofM ((x,tx):c) t2) }
```
```
bind n=3 in [(n,3)]
bind f=(lambda x in x+n) in [(f,lambda...),(n,3)]
bind n=1 in [(n,1),(f,lambda...),(n,3)]
(f n)
== x+n [(x,1),(n,1),(f,lambda...)(n,3)]
== 2
Dynamic Scoping
```
```
bind n=3 in [(n,3)]
bind f=(lambda x in x+n) in [(f,(ClosureV x x+n [(n,3)]),(n,3)]
bind n=1 in [(n,1),(f,(ClosureV x x+n [(n,3)]),(n,3)]
(f n)
== x+n [(x,1),(n,3)]
== 4
Static Scoping
```
```
bind x=v in b == ((lambda x in b) v)
```
```
bind z=3+4 in z+z
match with left side of elaboration rule
x=z
v=3+4
b=z+z
substitute into right side
((lambda z in z+z) 3+4)
```