Converting an FSA to a Regular Grammar

Last update : May 12, 2004

We will let each state in the NFA correspond to a non-terminal in the grammar. Each symbol used on the transitions corresponds to a terminal symbol.

Let Si denote the non-terminal corresponding to state i of the NFA. Then

  1. The start symbol of the grammar is S0 , the non-terminal corresponding to the start state of the NFA
  2. For each transition from state i to state j on some symbol `a', create a production rule of the form: Si à Sj
  3. For each state i of the NFA which is a finish state, create a production rule of the form: Si à l

Note that l -transitions between states would generate production rules of the form Si à Sj

, which says that ``wherever we see Si we can replace it with Sj '' - another indication that l -transitions denote equivalence.

This process will work equally well with a DFA, and can be applied in reverse to a regular grammar to produce the corresponding NFA.

 

Making a Regular Grammar l -Free

Algorithm

Recall that a regular grammar is said to be l -free if it has no l -productions except possibly for the production Sà l , where S is the start symbol, and does not appear on the right hand side of the production rules.

1. copy all non l -productions from G to G'. Let S be the start symbol in G';

2. for any non-terminal N which can become l (N à l is a production), we copy every rule in which N appears on the right hand side both with and without N;

3. if S à l was in the original set of rules, add a new start symbol S1 in G', add the rule S1à l and copy all the production rules with S on the left hand side to ones with S' on the left hand side.

 

Example :
Give an equivalent l -free regular grammar to:

S à aA | bB | l

A à aA | a | l

B à bB | b | l

Where S is the starting symbol.

 

1. copy all the l -free production :

S à aA | bB

A à aA | a

B à bB | b

where S is the starting symbol.

2. if Nà l is in G, copy every rule in which N appears on the right hand side both with and without N

 

S à aA | bB | a | b

A à aA | a

B à bB | b

3. Since Sà l was a production in the original grammar, and S1

S1 à l

S à aA | bB | a | b

A à aA | a

B à bB | b

 

where S1 is the starting symbol …. and copy all rules from S to S1 and remove S

S 1à aA | bB | a | b

A à aA | a

B à bB | b

S1 is the starting symbol.

 

 

 

 

 

Constructing am equivalent NFA from a given Right Linear Grammar

 

Right linear grammar NFA

1. starting symbol S starting state S

2. each non-terminal Ai each state Ai

3. Create a Accepting state F

4. each production Ai à aAj transition from state Ai to Aj upon input a

5. if Ai à a then transition from state Ai to F upon input a

example :

Given a right linear grammar G : S à bS | aC C à bC | b

The corresponding NFA :

Set of states : { S , C } with S as the starting state and C is the accepting state

Set of Input symbols : { a, b }

Transition function :

 

State\input

a

b

S

C

S

C

Not defined

{ C , F}

F

Not defined

Not Defined

 

The Closure of Context-Free Languages

We have seen that the regular languages are closed under common set-theoretic operations; the same, however, does not hold true for context-free languages.

Lemma: The context-free languages are closed under union, concatenation and Kleene closure.

That is, if L1 and L2 are context free, so are L1 Ç L2 , L1L2 (concatenation) and L1* .

Proof:
We will prove that the languages are closed by creating the appropriate grammars.
Suppose we have two context-free languages, represented by grammars with start symbols

S1 and S2 respectively.

First of all, rename all the terminal symbols in the second grammar so that they don't conflict with those in the first. Then:

 

Lemma: The context-free languages are not closed under intersection

That is, if L1 and L2 are context free, but L1 Ç L2 is not necessarily be context free.

 

Proof:
We will prove the non-closure of intersection by exhibiting a counter-example.
Consider the following two languages:

L1 = { aibjck : i < j } L2 = { aibjck : i < k }

It can show that L1 and L2 are context free as follow :

L1 : S à ABC A à aAb | l B à bB | b C à cC | l

The intersection L1 Ç L2 is { aibjck : i < j and i < k} which is not context free.