If R is a relation on a set A, R is of course not necessarily transitive.
At the other extreme, R is contained in A x A, and A x A is a transitive relation on A.
But A x A, as a relation on A that contains R, is too big to be of much use, since by definition (x,y) is in this relation for every x and y in A.
Instead what is often useful is the smallest transitive relation that contains R.
If R and S are relations on a set A, the product of R and S, denoted by RS, is defined by
RS = { (x,y) | there is some z in A such that (x,z) is in R and (z,y) is in S }
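As a small illustration, here is a sketch in Python, representing a relation as a set of pairs (the function name compose is mine, not from the notes):

    def compose(R, S):
        # RS: all (x, y) such that (x, z) is in R and (z, y) is in S for some z
        return {(x, y) for (x, z) in R for (w, y) in S if z == w}

    # Example: R relates 1 -> 2 and S relates 2 -> 3, so RS relates 1 -> 3.
    print(compose({(1, 2)}, {(2, 3)}))   # prints {(1, 3)}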
Lemma (Associativity): If R, S, and T are relations on a set A, then (RS)T = R(ST).
Proof: (x,y) in (RS)T => there is a u in A such that (x,u) is in RS and (u,y) is in T. But (x,u) in RS => there is a v in A such that (x,v) is in R and (v,u) is in S.
Now, (v,u) in S and (u,y) in T => (v,y) is in ST.
Since (x,v) is in R and (v,y) is in ST, (x,y) is in R(ST). This shows (RS)T is contained in R(ST); the reverse containment is proved the same way.
If R is a relation on a set A, and k is any integer >= 0, define a relation R^k inductively by:
R^0 = { (x,x) | x is in A }
R^1 = R
R^k = R^(k-1)R for k > 1
Note that
R^2 = R^1 R = RR
R^3 = R^2 R = (RR)R
But by the associativity lemma, (RR)R = R(RR).
The associativity lemma means it doesn't matter in what order you compute the products when computing R^k. So, for instance, we can unambiguously write
R^3 = RRR since (RR)R = R(RR)
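Continuing the Python sketch from above (power is an assumed helper name; it relies on compose):

    def power(R, k, A):
        # R^0 is the identity relation on A; R^k = R^(k-1) R for k >= 1
        result = {(x, x) for x in A}
        for _ in range(k):
            result = compose(result, R)
        return result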
For any relation R on a set A with n elements, define relations R+ and R* on A by
R+ = R^1 ∪ R^2 ∪ ... ∪ R^(n-1) ∪ ...
R* = R^0 ∪ R^1 ∪ R^2 ∪ ... ∪ R^(n-1) ∪ ...
R+ is called the transitive closure of R
R* is called the transitive, reflexive closure of R
Lemma: If R is a relation on a finite set A that has n elements, and for some integer m > 0, (x,y) is in R^m, then (x,y) is in R^k for some k such that 1 <= k <= n
Example: R is the relation on the set A = {q0, q1, q2, q3} whose graph (not reproduced here) has directed edges including q0 -> q1, q1 -> q2, q2 -> q1, and q2 -> q3. That is, (qi, qj) is in R if there is a directed edge from qi to qj.
Note: In this example,
1. n = 4.
2. (q0, q3) is in R^5 (so m = 5): q0, q1, q2, q1, q2, q3 is a path of length 5 in the graph of R; each consecutive pair in this path is in R. But q1 is the first element in this path that is repeated later, so we can remove the elements between the first occurrence of q1 and the last occurrence of q1.
3. q0, q1, q2, q3 is then a path of length 3 in the graph of R. So (q0, q3) is also in R^3 (so k = 3 <= n = 4).
Lemma: If R is a relation on a finite set A that has n elements, and for some integer m > 0, (x,y) is in R^m, then (x,y) is in R^k for some k such that 1 <= k <= n
Proof: The proof is by contradiction. Let k be the smallest integer >= 1 such that (x,y) is in R^k. Then we know k <= m. Assume k > n.
Since (x,y) is in R^k, there is a sequence u0 = x, u1, u2, ..., uk = y of k+1 elements of A such that (ui-1, ui) is in R for each i = 1, 2, ..., k.
By assumption k > n, so the first k elements u0, u1, ..., uk-1 (not counting uk) number at least n+1. Since A has only n elements, there must be a repetition among them.
Let r be the subscript of the first element among u0, u1, ..., uk-1 that is repeated later in the same sequence.
Let s be the subscript of the last occurrence of this same element in u0, u1, ..., uk-1.
Then ur = us with 0 <= r < s <= k-1, and the sequence looks like
x = u0, ..., ur, ur+1, ..., us (= ur), us+1, ..., uk-1, uk = y
Since (ui-1, ui) is in R for 1 <= i <= k, this says (x,ur) = (u0,ur) is in R^r and (ur,y) = (us,y) = (us,uk) is in R^(k-s).
And so (x,y) is in R^r R^(k-s) = R^(k-(s-r)).
But
1 <= k - (s - r) < k (why? because 1 <= s - r <= s <= k - 1), which contradicts the assumption that k was the smallest integer >= 1 such that (x,y) is in R^k.
Proposition: If R is a relation on a finite set A with n elements, then
R+ = R^1 ∪ R^2 ∪ ... ∪ R^n
R* = R^0 ∪ R^1 ∪ R^2 ∪ ... ∪ R^n
Proof: Follows from the previous lemma.
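Because the unions stop at R^n, the closures of a relation on a finite set are directly computable. A sketch using compose from above (the function names are mine):

    def transitive_closure(R, A):
        # R+ = R^1 ∪ R^2 ∪ ... ∪ R^n, where n = |A|
        closure, Rk = set(), R
        for _ in range(len(A)):
            closure |= Rk
            Rk = compose(Rk, R)
        return closure

    def reflexive_transitive_closure(R, A):
        # R* = R^0 ∪ R+
        return {(x, x) for x in A} | transitive_closure(R, A)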
Proposition: If R is a relation on A, then R+ is a transitive relation on A and is the smallest transitive relation that contains R.
Proof: First, show that R+ is transitive.
(x,y) in R+ and (y,z) in R+ => (x,y) is in R^i and (y,z) is in R^j for some i, j >= 1. But this says that (x,z) is in R^i R^j = R^(i+j), which is contained in R+.
A similar argument shows R* is transitive. For minimality, suppose T is any transitive relation on A that contains R. An easy induction on k shows R^k is contained in T for every k >= 1, so R+ is contained in T. Thus R+ is the smallest transitive relation containing R.
Example 1: Let G be the following context-free grammar
S -> E $
E -> E '+' T | T
T -> T '*' F | F
F -> 'id' | '(' E ')'
Define a relation R on { S, E, T, F, '+', '*', 'id', '(', ')' } by
R = { (S,E), (E,E), (E,T), (T,T), (T,F), (F,'id'), (F,'(') }
Is R transitive? No.
What is the transitive closure R+? What is S related to by R+? That is, what are all the values of w such that (S,w) is in R+?
Answer: (S,E), (S,T), (S,F), (S,'id'), (S,'(')
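The answer can be checked with the transitive_closure sketch from above:

    A = {'S', 'E', 'T', 'F', '+', '*', 'id', '(', ')'}
    R = {('S','E'), ('E','E'), ('E','T'), ('T','T'), ('T','F'),
         ('F','id'), ('F','(')}
    print(sorted(y for (x, y) in transitive_closure(R, A) if x == 'S'))
    # prints ['(', 'E', 'F', 'T', 'id']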
Example 2: Let G = (V, T, S, P) be a context-free grammar. Define a relation R on (V ∪ T)* by: (α, β) is in R if
1. α = uXv for some u, v in (V ∪ T)* and X in V, and
2. β = uwv, and
3. X -> w is a rule in P.
Then L(G), the language of G, is { w in T* | (S, w) is in R+ }. Note also that (S, w) is in R+ if and only if (S, w) is in R^i for some i >= 1.
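A sketch of the one-step relation in Python, representing strings over (V ∪ T) as tuples of symbols and P as a dict mapping each variable to a list of right-hand sides (this representation and the name derive_one_step are my assumptions):

    def derive_one_step(alpha, P):
        # yields every beta with alpha => beta: pick an occurrence of a
        # variable X in alpha and replace it with w for some rule X -> w
        for i, sym in enumerate(alpha):
            if sym in P:                  # sym is a variable X
                for w in P[sym]:          # rule X -> w
                    yield alpha[:i] + w + alpha[i+1:]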
The following notation will also be used for the relation R of Example 2.
Notation:
For (x,y) in R, write x => y
For (x,y) in R^k, write x =>^k y
For (x,y) in R*, write x =>* y
For (x,y) in R+, write x =>+ y
Proposition: For any relation R on a set A, R+ and R* are transitive relations on A.
Proof: (x,y) and (y,z) in R+ => there is an integer i >= 1 such that (x,y) is in R^i and also an integer j >= 1 such that (y,z) is in R^j.
So (x,z) is in R^i R^j = R^(i+j), where i + j >= 2, which implies (x,z) is in R+.
The proof for R* is essentially the same, except that i and j need only be >= 0.
Proposition: For any relation R on a set A, R* is reflexive.
Proof: R* contains R^0 = { (x,x) | x is in A }, so R* is reflexive.
Recall the relation R defined for a context-free grammar in Example 2:
Let G = (V, T, S, P) be a context-free grammar. R is the relation on (V ∪ T)* defined by: (α, β) is in R, or equivalently, using the alternate notation, α => β, if
1. α = uXv for some u, v in (V ∪ T)* and X in V, and
2. β = uwv, and
3. X -> w is a rule in P.
If S =>+ w and w is in T*, then w is a sentence for the grammar G; w is in the language generated by G.
If S =>* w, where w is in (V ∪ T)*, then w is called a sentential form for G.
Let G be the following context-free grammar
S -> E $
E -> E '+' T | T
T -> T '*' F | F
F -> 'id' | '(' E ')'
Then
(a) S =>^3 T '+' T $
(b) S =>+ T '+' T $
(c) T '+' T $ is a sentential form
We can define a restricted version of the relation => for context-free grammars for left-most derivations.
For a context-free grammar G = (V, T, S, P), define RL to be the relation on (V ∪ T)* by: (α, β) is in RL if
1. α = uXv for some u in T*, v in (V ∪ T)*, and X in V, and
2. β = uwv, and
3. X -> w is a rule in P.
Note: The only difference between the relation R of Example 2 and RL is in condition 1. For RL, u must consist of only terminal symbols, so that X must be the left-most non-terminal of α.
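Under the same representation as the earlier sketch, the only change is to stop after the first (left-most) variable:

    def derive_leftmost(alpha, P):
        # yields every beta with alpha => beta by a left-most step
        for i, sym in enumerate(alpha):
            if sym in P:                  # the left-most variable X
                for w in P[sym]:
                    yield alpha[:i] + w + alpha[i+1:]
                return                    # u = alpha[:i] has no variables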
Definition: A context-free grammar G = (V, T, S, P) is in Greibach normal form (GNF) if every rule is of the form A -> tw, where t is in T and w is in (V - {S})*.
Examples:
S -> a S B | a B
B -> b

S -> a A | ε
A -> a A | b B
B -> b B | ε

S -> i A | i A B | x
A -> i A | i A B | x
B -> e A
Every context-free grammar can be converted to an equivalent grammar in Greibach normal form. The first step is to convert the grammar to Chomsky normal form!
If a sentence w for a grammar G has length n > 0 and G is in GNF, then any derivation of w must have length n, since each derivation step produces exactly one terminal symbol.
Can you describe an algorithm that accepts any context-free grammar G and a string w of terminals and determines whether w is in the language of G? It doesn't have to be efficient, but ε grammar rules cause some problems.
Greibach normal form is useful in showing that any context-free language is the language accepted by some (nondeterministic) pushdown automaton (PDA).
Rules in context-free grammars are not required to meet the Greibach normal form restrictions.
So to convert a given grammar G to Greibach normal form, we will need a way of replacing grammar rules while keeping the new grammar equivalent to G.
Lemma (Substitution): Suppose a variable B in a grammar G = (V, T, S, P) has productions
B -> w1 | w2 | ... | wk.
If A is a variable in G that has a production A -> uBv, then let G1 = (V, T, S, P1) be the grammar obtained from G by adding the rules
A -> uw1v | uw2v | ... | uwkv
and deleting the rule A -> uBv from the rules of G. Then G1 is equivalent to G.
Proof: L(G) is contained in L(G1).
If S =>* w for w in L(G) and the derivation doesn't use the rule A -> uBv, then the same derivation shows w is in L(G1).
If S =>* w uses the rule A -> uBv, then it must be of the form:
S =>* r1As1 => r1uBvs1 =>* r2uBvs2 => r2uwivs2 =>* w.
Then the following is a derivation for w in G1:
S =>* r1As1 => r1uwivs1 =>* r2uwivs2 =>* w
L(G1) is contained in L(G).
Suppose w is in L(G1). If a derivation for w in G1 doesn't use any of the new rules A -> uwiv, then the same derivation shows w is in L(G).
If any of the rules A -> uwiv is used in a derivation of w in G1, the derivation only has to be modified to use two steps for each use of such a rule to get a derivation in grammar G:
If S =>* ...A... => ...uwiv... =>* w in G1, replace the step ...A... => ...uwiv... by the two steps
...A... => ...uBv... => ...uwiv...
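The substitution lemma, as an operation on the rule dictionary used in the earlier sketches (the name substitute is mine):

    def substitute(P, A, u, B, v):
        # replace the rule A -> uBv by A -> u w v for every rule B -> w
        P[A].remove(u + (B,) + v)
        P[A].extend(u + w + v for w in P[B])
        return P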
Eliminating Left Recursion
Lemma (Left Recursion): Suppose a variable A in a grammar G = (V, T, S, P) has left recursive rules
A -> Au1 | Au2 | ... | Aum
and non-left recursive rules
A -> v1 | v2 | ... | vp
Let G1 = (V ∪ { B }, T, S, P1), where B is a new variable not in V and P1 is obtained from P by replacing the rules for A by:
A -> v1B | v2B | ... | vpB |
v1 | v2 | ... | vp, and
introducing new rules for B:
B -> u1B | u2B | ... | umB | u1 | u2 | ... | um
Then G1 is equivalent to G.
Proof: In a left-most derivation, any use of a sequence of the left-recursive rules A -> Aui must end with one of the non-left-recursive rules A -> vj.
The sequence of replacements
A => Aui1 => Aui2ui1 => ... => Auipuip-1...ui1 => vjuipuip-1...ui1
in G can be replaced in G1 by
A => vjB => vjuipB => vjuipuip-1B => ... => vjuipuip-1 ... ui2B => vjuipuip-1 ... ui2ui1
This shows that L(G) is contained in L(G1).
And conversely, if the rules for B are used to derive a string w in G1, then we can replace the sequence above in reverse to get a derivation for w in G.
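The left-recursion lemma in the same sketch style (the caller supplies the new variable B):

    def eliminate_left_recursion(P, A, B):
        # split the rules for A into left-recursive A -> Au and the rest A -> v
        us = [rhs[1:] for rhs in P[A] if rhs[:1] == (A,)]
        vs = [rhs for rhs in P[A] if rhs[:1] != (A,)]
        P[A] = [v + (B,) for v in vs] + vs   # A -> v1B | ... | vpB | v1 | ... | vp
        P[B] = [u + (B,) for u in us] + us   # B -> u1B | ... | umB | u1 | ... | um
        return P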
Any grammar can be converted to Chomsky normal form. So G is assumed to already be in Chomsky normal form.
Rename the variables of the grammar, if necessary, as A1, A2, ..., Am.
After the process below, every remaining rule for Ak will either begin with a terminal or have the form Ak -> Aju with j > k.
for k = 1 to m {
    for j = 1 to k - 1 {
        for each rule Ak -> Aju {
            for all rules Aj -> v {
                add Ak -> vu
            }
            remove Ak -> Aju
        }
    }
    for each rule Ak -> Aku {
        add rules Bk -> u and Bk -> uBk
        remove Ak -> Aku
    }
    if ( Bk was added ) {
        for each Ak -> v where v does not start with Ak {
            add Ak -> vBk
        }
    }
}
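A Python rendering of this loop under the same grammar representation, assuming the variables are named 'A1', ..., 'Am' and the new variables 'B1', ..., 'Bm' (the naming scheme is mine):

    def gnf_first_step(P, m):
        for k in range(1, m + 1):
            Ak = f"A{k}"
            for j in range(1, k):
                Aj = f"A{j}"
                # replace each Ak -> Aju by Ak -> vu for every rule Aj -> v
                for rhs in [r for r in P[Ak] if r[:1] == (Aj,)]:
                    P[Ak].remove(rhs)
                    P[Ak].extend(v + rhs[1:] for v in P[Aj])
            # eliminate left recursion Ak -> Aku with the new variable Bk
            us = [r[1:] for r in P[Ak] if r[:1] == (Ak,)]
            if us:
                Bk = f"B{k}"
                vs = [r for r in P[Ak] if r[:1] != (Ak,)]
                P[Bk] = [u + (Bk,) for u in us] + us
                P[Ak] = [v + (Bk,) for v in vs] + vs
        return P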
Greibach: First Step
The rules for the variables A1, A2, ..., A5 will be modified one at a time, so that once the rules for Ak have been processed, every rule for Ak begins either with a terminal or with some Aj where j > k.
Afterwards, working backwards from A5 to A1, the rules for each Ai are put into the correct form: a terminal followed by 0 or more variables.
The rules for the Bi's will then either be in the correct form or be a string of variables beginning with some Ai; substitute all the right-hand-side choices for Ai into each such rule for Bi.
A1 -> A2A3 | x
A2 -> A2A3 | x
A3 -> A4A5
A4 -> +
A5 -> x
The rules for A1 are not left recursive, and there is no index j < 1, so there is nothing to do for A1.
A2 does not require substitution since there is no rule of the form A2 -> A1..., but A2 does have a left-recursive rule. So use left-recursion elimination to get:
A1 -> A2A3 | x
A2 -> xB2 | x      // replaced A2 -> A2A3 | x
B2 -> A3B2 | A3    // new rules for B2
A3 -> A4A5
A4 -> +
A5 -> x
Rules for A3, A4, and A5 require no replacements.
A1 -> A2A3 | x
A2 -> xB2 | x
B2 -> A3B2 | A3
A3 -> A4A5
A4 -> +
A5 -> x
A4 and A5 rules are in Greibach normal form.
For A3, replace leading A4 on the right side.
A3 -> +A5.
Skip B2.
A2 rules are in Greibach normal form.
For A1, replace the leading A2 on the right side with all choices for A2:
A1 -> xB2A3 | xA3 | x
Result:
A1 -> xB2A3 | xA3 | x
A2 -> xB2 | x
B2 -> A3B2 | A3
A3 -> +A5
A4 -> +
A5 -> x
Finally, replace the leading A3 in the rules for B2:
B2 -> +A5B2 | +A5
A1 -> xB2A3 | xA3 | x
A2 -> xB2 | x
B2 -> +A5B2 | +A5
A3 -> +A5
A4 -> +
A5 -> x
The string x + x + x has length 5.
Give a derivation for x + x + x for the Greibach normal form grammar. How many derivation steps?
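Answer: A1 => xB2A3 => x+A5A3 => x+xA3 => x+x+A5 => x+x+x. There are 5 derivation steps: each GNF step produces exactly one terminal, so a sentence of length 5 needs exactly 5 steps.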
For a GNF grammar, every sentential form will be uv where u is a string of 0 or more terminals and v is a string of 0 or more variables!
For a PDA (Q, Σ, Γ, δ, q0, F), we will consider triples (q, w, v) where q is in Q, w is in Σ*, and v is in Γ*. That is, we consider the set A = Q x Σ* x Γ*.
We define a relation R on A by:
Define: (q, aw, Xv) is R-related to (p, w, uv) if δ(q, a, X) contains (p, u). Note that a may be ε or an input symbol; that is, a is in Σε.
The language recognized by the PDA can then be defined in terms of the reflexive, transitive closure R* of R.
w is recognized by the PDA if (q0, w, ε) is R*-related to (p, ε, u) for some final state p. Note that the stack doesn't have to be empty, although the PDA may enforce this by not entering a final state unless the stack is empty.
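A sketch of the one-step relation R in Python, representing w and v as strings, ε as the empty string, and δ as a dict mapping (state, a, X) to a set of (state, pushed string) pairs (this representation is my assumption):

    def pda_step(config, delta):
        # from (q, aw, Xv), move to (p, w, uv) whenever delta(q, a, X) contains (p, u)
        q, w, v = config
        for a in ([w[0]] if w else []) + ['']:      # consume one input symbol, or ε
            for X in ([v[0]] if v else []) + ['']:  # pop the top stack symbol, or ε
                for (p, u) in delta.get((q, a, X), ()):
                    yield (p, w[len(a):], u + v[len(X):])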