Converting an FSA to a Regular Grammar

Last update : May 12, 2004

We will let each state in the NFA correspond to a non-terminal in the grammar. Each symbol used on the transitions corresponds to a terminal symbol.

Let S_idenote the non-terminal corresponding to state i of the NFA. Then

The start symbol of the grammar is S₀, the non-terminal corresponding to the start state of the NFA
For each transition from state i to state j on some symbol `a', create a production rule of the form: S_{i à} S_j
For each state i of the NFA which is a finish state, create a production rule of the form: S_{i à} l

Note that l -transitions between states would generate production rules of the form S_{i à} S_j

, which says that ``wherever we see Si we can replace it with Sj '' - another indication that l -transitions denote equivalence.

This process will work equally well with a DFA, and can be applied in reverse to a regular grammar to produce the corresponding NFA.

Making a Regular Grammar l -Free

Algorithm

Recall that a regular grammar is said to be l -free if it has no l -productions except possibly for the production Sà l , where S is the start symbol, and does not appear on the right hand side of the production rules.

1. copy all non l -productions from G to G'. Let S be the start symbol in G';

2. for any non-terminal N which can become l (N à l is a production), we copy every rule in which N appears on the right hand side both with and without N;

3. if S à l was in the original set of rules, add a new start symbol S1 in G', add the rule S1à l and copy all the production rules with S on the left hand side to ones with S' on the left hand side.

Example :
Give an equivalent l -free regular grammar to:

S à aA | bB | l

A à aA | a | l

B à bB | b | l

Where S is the starting symbol.

1. copy all the l -free production :

S à aA | bB

A à aA | a

B à bB | b

where S is the starting symbol.

2. if Nà l is in G, copy every rule in which N appears on the right hand side both with and without N

S à aA | bB | a | b

A à aA | a

B à bB | b

3. Since Sà l was a production in the original grammar, and S1

S1 à l

S à aA | bB | a | b

A à aA | a

B à bB | b

where S1 is the starting symbol …. and copy all rules from S to S1 and remove S

S 1à aA | bB | a | b

A à aA | a

B à bB | b

S1 is the starting symbol.

Constructing am equivalent NFA from a given Right Linear Grammar

Right linear grammar NFA

1. starting symbol S starting state S

2. each non-terminal Ai each state Ai

3. Create a Accepting state F

4. each production Ai à aAj transition from state Ai to Aj upon input a

5. if Ai à a then transition from state Ai to F upon input a

example :

Given a right linear grammar G : S à bS | aC C à bC | b

The corresponding NFA :

Set of states : { S , C } with S as the starting state and C is the accepting state

Set of Input symbols : { a, b }

Transition function :

State\input

a

b

S

C

S

C

Not defined

{ C , F}

F

Not defined

Not Defined

The Closure of Context-Free Languages

We have seen that the regular languages are closed under common set-theoretic operations; the same, however, does not hold true for context-free languages.

Lemma: The context-free languages are closed under union, concatenation and Kleene closure.

That is, if L₁and L₂ are context free, so are L₁ Ç L₂ , L₁L₂ (concatenation) and L₁* .

Proof:
We will prove that the languages are closed by creating the appropriate grammars.
Suppose we have two context-free languages, represented by grammars with start symbols

S₁ and S₂ respectively.

First of all, rename all the terminal symbols in the second grammar so that they don't conflict with those in the first. Then:

To get the union, add the rule S à S₁ | S₂

To get the concatenation, add the rule S à S₁ S₂

To get the Kleene Closure of L₁ , add the rule S à S₁ S | l to the grammar for L₁.

Lemma: The context-free languages are not closed under intersection

That is, if L₁ and L₂ are context free, but L₁ Ç L₂ is not necessarily be context free.

Proof:
We will prove the non-closure of intersection by exhibiting a counter-example.
Consider the following two languages:

L₁ = { aⁱb^jc^k : i < j } L₂ = { aⁱb^jc^k : i < k }

It can show that L₁ and L₂ are context free as follow :

L1 : S à ABC A à aAb | l B à bB | b C à cC | l

The intersection L₁ Ç L₂ is { aⁱb^jc^k : i < j and i < k} which is not context free.