417 lecture note #8
[J: 10.3,4] Automata, Grammars and Languages (3)
6. Grammars and Languages
6.1 Basics
- A grammar is a set of (syntax) rules
used to specify a language.
- Formal definition:
A grammar G = (N, T, P, s) where
- N is a set of nonterminal symbols
- T is a set of terminal symbols
- P is a set of productions (or "rewrite rules") of the form [(N ∪ T)* - T*] → (N ∪ T)*
- s is the start symbol
- Example:
- N = {s, A, B}
- T = {a, b, c}
- P = {s → AB, AB → BA, A → aA, B → Bb, A → a, B → b}
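A minimal sketch of how the 4-tuple G = (N, T, P, s) could be held in code. The class name `Grammar`, its field names, and the method `is_well_formed` are illustrative assumptions, not part of the notes; symbols are assumed to be single characters, and the empty string stands for the null string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    nonterminals: frozenset  # N
    terminals: frozenset     # T
    productions: tuple       # P, as (lhs, rhs) pairs of strings; "" is the null string
    start: str               # s

    def is_well_formed(self) -> bool:
        symbols = self.nonterminals | self.terminals
        # s must be a nonterminal, and N and T must not share symbols.
        if self.start not in self.nonterminals or (self.nonterminals & self.terminals):
            return False
        for lhs, rhs in self.productions:
            # Every symbol used must come from N or T.
            if any(ch not in symbols for ch in lhs + rhs):
                return False
            # The LHS lies in (N ∪ T)* - T*: it needs at least one nonterminal.
            if not any(ch in self.nonterminals for ch in lhs):
                return False
        return True


# The example grammar from the notes.
G = Grammar(
    nonterminals=frozenset("sAB"),
    terminals=frozenset("abc"),
    productions=(("s", "AB"), ("AB", "BA"), ("A", "aA"),
                 ("B", "Bb"), ("A", "a"), ("B", "b")),
    start="s",
)
print(G.is_well_formed())  # True
```

A production such as ab → s would be rejected by this check, because its left-hand side contains no nonterminal.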
- A grammar G is also considered to generate
a language L(G), through a notion of derivation. L(G) consists of all strings over T
derivable from s.
- Formal definition:
If α → β is a production in G and xαy is in (N ∪ T)*, then xβy is directly derivable from xαy, written xαy ⇒ xβy.
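The "directly derivable" relation can be sketched as a function that applies every production at every position where its left-hand side occurs. The function name `directly_derivable` and the string encoding of productions are assumptions made for illustration; symbols are single characters.

```python
def directly_derivable(form, productions):
    """Return every string w such that form => w in exactly one step."""
    results = set()
    for lhs, rhs in productions:
        # Try each occurrence of lhs inside the sentential form.
        start = form.find(lhs)
        while start != -1:
            results.add(form[:start] + rhs + form[start + len(lhs):])
            start = form.find(lhs, start + 1)
    return results


# Productions of the example grammar with N = {s, A, B}, T = {a, b, c}.
P = [("s", "AB"), ("AB", "BA"), ("A", "aA"),
     ("B", "Bb"), ("A", "a"), ("B", "b")]

print(sorted(directly_derivable("AB", P)))
# ['ABb', 'Ab', 'BA', 'aAB', 'aB']
```

Each result corresponds to one choice of production and position, so AB ⇒ BA, AB ⇒ aAB, AB ⇒ ABb, AB ⇒ aB and AB ⇒ Ab all hold in this grammar.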
- Derivation Examples:
- Grammar G = (N, T, P, s) where
- N = {s, S}
- T = {a, b}
- P = {s → bs, s → aS, S → bS, S → b}
Some derivations:
- for "bab"
s ⇒ bs ⇒ baS ⇒ bab
- for "bbabb"
s ⇒
Then L(G) =
- Grammar for integers in Backus-Naur Form (BNF)
<integer> ::= + <unsigned integer> | - <unsigned integer> |
0 <digits> | 1 <digits> | ... | 9 <digits>
<unsigned integer> ::= 0 <digits> | 1 <digits> | ... | 9 <digits>
<digits> ::= 0 <digits> | 1 <digits> | ... | 9 <digits> | λ
Some derivations:
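As one possible way to find derivations mechanically, the sketch below runs a breadth-first search over sentential forms for the grammar with N = {s, S} and T = {a, b} given earlier. The function name `find_derivation` is an assumption. The pruning step relies on the fact that no production in that grammar shortens a string, so it would not be valid for grammars with null productions such as the BNF grammar for integers.

```python
from collections import deque


def find_derivation(target, productions, start):
    """Search breadth-first for start => ... => target.

    Assumes no production shrinks a string, so any sentential form
    longer than the target can be discarded.
    """
    parent = {start: None}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        if form == target:
            steps = []
            while form is not None:   # walk back to the start symbol
                steps.append(form)
                form = parent[form]
            return steps[::-1]
        for lhs, rhs in productions:
            i = form.find(lhs)
            while i != -1:
                nxt = form[:i] + rhs + form[i + len(lhs):]
                if len(nxt) <= len(target) and nxt not in parent:
                    parent[nxt] = form
                    queue.append(nxt)
                i = form.find(lhs, i + 1)
    return None  # target is not derivable


P = [("s", "bs"), ("s", "aS"), ("S", "bS"), ("S", "b")]
print(" => ".join(find_derivation("bab", P, "s")))
# s => bs => baS => bab
```

Calling `find_derivation` with another string over T works the same way, and a result of `None` means the string is not in L(G).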
6.2 Classes of Grammars
- There are 4 classes of grammars (the Chomsky hierarchy), listed in decreasing order of expressiveness. Three of them are:
- Context-sensitive grammar
(type 1)
- Every production is of the form αAβ → αδβ, where α, β ∈ (N ∪ T)*, A ∈ N, and δ ∈ (N ∪ T)* - {λ}.
- In other words,
- The left-hand side (LHS) is a sequence of terminals or nonterminals, but it must have at least one nonterminal.
- The right-hand side (RHS) is a sequence of terminals or nonterminals, but it cannot be the null string.
- Context-free grammar
(type 2)
- Every production is of the form A → δ, where A ∈ N and δ ∈ (N ∪ T)*.
- In other words,
- The LHS is a single nonterminal.
- The RHS is a sequence of terminals or nonterminals, possibly the null string.
- Regular grammar
(type 3)
- Every production is of the form A → a or A → aB or A → λ, where A, B ∈ N and a ∈ T.
- In other words,
- The LHS is a single nonterminal.
- The RHS is a single terminal, or a terminal followed by one nonterminal, or the null string.
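The three production forms above can be checked mechanically. The sketch below is one possible classifier, not a standard routine: the function name `classify` and the sample grammars are hypothetical, symbols are single characters, and "" stands for the null string. It reports the most restrictive class whose form every production satisfies, and falls back to "unrestricted (type 0)", the usual name for the fourth class, which the list above does not spell out.

```python
def classify(nonterminals, terminals, productions):
    def is_regular(lhs, rhs):
        # A -> a, A -> aB, or A -> null string
        return len(lhs) == 1 and lhs in nonterminals and (
            rhs == ""
            or (len(rhs) == 1 and rhs in terminals)
            or (len(rhs) == 2 and rhs[0] in terminals and rhs[1] in nonterminals)
        )

    def is_context_free(lhs, rhs):
        # A -> delta, with a single nonterminal on the left
        return len(lhs) == 1 and lhs in nonterminals

    def is_context_sensitive(lhs, rhs):
        # Look for a split lhs = alpha A beta with rhs = alpha delta beta
        # and delta non-empty.
        for i, sym in enumerate(lhs):
            if sym in nonterminals:
                alpha, beta = lhs[:i], lhs[i + 1:]
                if (len(rhs) > len(alpha) + len(beta)
                        and rhs.startswith(alpha) and rhs.endswith(beta)):
                    return True
        return False

    checks = [("regular (type 3)", is_regular),
              ("context-free (type 2)", is_context_free),
              ("context-sensitive (type 1)", is_context_sensitive)]
    for name, check in checks:
        if all(check(lhs, rhs) for lhs, rhs in productions):
            return name
    return "unrestricted (type 0)"


# Hypothetical sample grammars, chosen only to exercise each branch.
print(classify({"S"}, {"a"}, [("S", "aS"), ("S", "a")]))
# regular (type 3)
print(classify({"S"}, {"a", "b"}, [("S", "aSb"), ("S", "ab")]))
# context-free (type 2)
print(classify({"S", "A"}, {"a", "b"}, [("S", "aAb"), ("aAb", "aab")]))
# context-sensitive (type 1)
```

Note that under the definitions as stated, a context-free grammar with a null production satisfies the type 2 form but not the type 1 form, which is why the checks run from most to least restrictive.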
- Relation between classes

- More example grammars
- N = {S, A}
T = {a, b}
s = S
P = {S → aS, S → aA, A → bA, A → λ}
Type of the grammar?
L(G) =
- N = {S, D}
T = {a, b, c}
s = S
P = {S → aDc, D → a, D → b, D → c, D → DD}
Type of the grammar?
L(G) =
- N = {s, A, B}
T = {a, b, c}
P = {s → AB, AB → BA, A → aA, B → Bb, A → a, B → b}
- Languages in different classes:
- Context-sensitive
- Can recognize strings of the form a^n b^n c^n (which is not CF)
- Example languages include some natural
languages such as Swiss-German.
- Context-free
- Can recognize strings of the form a^n b^n (which is not Regular)
- Example languages include most natural
languages such as English, Japanese, Spanish.
- Also almost all programming languages such as C,
C++, Pascal, Java.
- Regular
- Can recognize strings of the form a^n b^m, where n and m are independent (n need not equal m); this requires no memory for counting.
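The difference in memory requirements can be illustrated with two small recognizers. This is a sketch under the assumption that n and m may be zero; the function names are hypothetical. The first scans with a fixed number of states, while the second needs a counter that can grow without bound, which is what a finite-state machine lacks.

```python
def accepts_anbm(s):
    """a^n b^m with n, m independent: a finite-state scan, no counter."""
    state = "in_a"
    for ch in s:
        if state == "in_a" and ch == "a":
            continue            # still reading the block of a's
        if ch == "b":
            state = "in_b"      # from now on only b's are allowed
        else:
            return False
    return True


def accepts_anbn(s):
    """a^n b^n: needs an unbounded counter (a one-symbol stack)."""
    count = 0
    seen_b = False
    for ch in s:
        if ch == "a" and not seen_b:
            count += 1          # push one marker per a
        elif ch == "b" and count > 0:
            seen_b = True
            count -= 1          # pop one marker per b
        else:
            return False
    return count == 0


print(accepts_anbm("aaab"), accepts_anbn("aaab"))   # True False
print(accepts_anbm("aabb"), accepts_anbn("aabb"))   # True True
```

Recognizing a^n b^n c^n would require more than a single stack, which is why that language falls outside the context-free class.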
6.3 Equivalency of Grammars
- Grammars G and G' are equivalent if L(G) = L(G').
- Example (10.3.14): context-free grammar for integer
6.4 FSA's and Grammars (from FSA to Regular Grammar)
- Every FSA has a corresponding regular grammar
which generates the language that the FSA accepts
(Ac(A) = L(G)).
- Given an FSA, construct the grammar as follows:
- Invent a nonterminal symbol for each state; the nonterminal for the start state is the start symbol
- Write a production for each edge. If the edge is labeled a from s_i to s_j, the production is s_i → a s_j
- For each accepting state s_a, add the production s_a → ε
- Examples:
- Automaton which accepts strings that contain an odd number of a's (where I = {a, b}).
- Automaton which accepts a*b*c* (where I = {a, b, c})
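The construction can be sketched in a few lines for the first example, the automaton accepting strings with an odd number of a's. The automaton's diagram is not reproduced in these notes, so the state names E (even number of a's seen, start state) and O (odd, accepting) are assumptions, as are the function name `fsa_to_grammar` and the dictionary encoding of edges.

```python
def fsa_to_grammar(transitions, accepting):
    """Build regular-grammar productions from an FSA.

    transitions maps (state, symbol) to the next state; each state name
    doubles as a nonterminal, and "" stands for the null string.
    """
    # One production per edge: state -> symbol next_state
    productions = [(src, sym + dst) for (src, sym), dst in transitions.items()]
    # One null production per accepting state
    productions += [(state, "") for state in sorted(accepting)]
    return productions


# Assumed automaton: E = even number of a's so far (start), O = odd (accepting).
transitions = {
    ("E", "a"): "O", ("E", "b"): "E",
    ("O", "a"): "E", ("O", "b"): "O",
}
for lhs, rhs in fsa_to_grammar(transitions, {"O"}):
    print(lhs, "->", rhs if rhs else "ε")
```

With E as the start symbol, the resulting productions are E → aO, E → bE, O → aE, O → bO and O → ε.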
6.5 NDFSA's and Regular Grammars (from Regular Grammar
to NDFSA)
- Every regular grammar has a corresponding NDFSA
which accepts L(G).
- Given a regular grammar G = (N, T, P, s), build the NDFSA as follows:
- Invent a state for each nonterminal symbol; the state for s is the start state
- Add one additional state, s_t, as an accepting state.
- For each production in P of the form A → xB, add an edge from A to B labeled x.
- For each production in P of the form A → x, add an edge from A to s_t, labeled x.
- For each production in P of the form B → ε, make B an accepting state.
- Examples:
- Grammar G = (N, T, P, s) where
- N = {S,B,C}
- T = {a,b,c}
- s = S
- P = {S → aB, B → aB, B → bB, B → cB, B → aC, B → bC, B → cC, C → c}
Corresponding NDFSA:
- Grammar for integer
<integer> ::= + <unsigned integer> | - <unsigned integer> |
0 <digits> | 1 <digits> | ... | 9 <digits>
<unsigned integer> ::= 0 <digits> | 1 <digits> | ... | 9 <digits>
<digits> ::= 0 <digits> | 1 <digits> | ... | 9 <digits> | λ
Corresponding NDFSA:
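The NDFSA diagrams are not reproduced here, but the construction rules of this section can be sketched in code and applied to the first example grammar (N = {S, B, C}). The function names `grammar_to_ndfsa` and `accepts`, the edge encoding, and the use of the string "st" as the name of the additional accepting state are assumptions made for illustration; symbols are single characters and "" stands for the null string.

```python
def grammar_to_ndfsa(productions, final="st"):
    """Turn regular-grammar productions into NDFSA edges and accepting states."""
    edges = {}              # (state, symbol) -> set of possible next states
    accepting = {final}     # the one additional accepting state
    for lhs, rhs in productions:
        if rhs == "":                       # B -> null string: B accepts
            accepting.add(lhs)
        elif len(rhs) == 1:                 # A -> x: edge from A to the final state
            edges.setdefault((lhs, rhs), set()).add(final)
        else:                               # A -> xB: edge from A to B
            edges.setdefault((lhs, rhs[0]), set()).add(rhs[1])
    return edges, accepting


def accepts(edges, accepting, start, word):
    """Simulate the NDFSA by tracking every state it could be in."""
    current = {start}
    for ch in word:
        current = {nxt for st in current for nxt in edges.get((st, ch), ())}
    return bool(current & accepting)


P = [("S", "aB"), ("B", "aB"), ("B", "bB"), ("B", "cB"),
     ("B", "aC"), ("B", "bC"), ("B", "cC"), ("C", "c")]
edges, accepting = grammar_to_ndfsa(P)

print(accepts(edges, accepting, "S", "abc"))  # True
print(accepts(edges, accepting, "S", "ac"))   # False
```

The machine is nondeterministic because B has two outgoing edges for each input symbol, one back to B and one to C, so the simulation keeps a set of current states rather than a single state.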