A language is regular if it can be expressed in terms of regular expression.
UNIT II: Regular Expressions and Languages
Syllabus
Regular
expression - Regular Languages- Equivalence of Finite Automata and regular
expressions Proving languages to be not regular (Pumping Lemma) - Closure
properties of regular languages.
Regular Expression and Regular Languages
AU
May-13, 16, Marks 8
Let
Σ be an alphabet which is used to denote the input set. The regular expression
over Σ can be defined as follows.
1.
ϕ is a regular expression which denotes the empty set.
2.
ε is a regular expression and denotes the set {ε} and it is a null string.
3.
For each 'a' in Σ 'a' is a regular expression and denotes the set {a}.
4.
If r and s are regular expressions denoting the languages L1 and L2
respectively, then
r+s
is equivalent to L1 U L2 i.e. union.
rs
is equivalent to L1 L2 i.e. concatenation
r*
is equivalent to L1* i.e. closure.
The
r* is known as kleen closure or closure which indicates occurrence of r for ∞
number of times.
For
example if Σ = {a} and we have regular expression R = a*, then R is a set
denoted by R= {ε, a, aa, aaa, aaaa, ...}
That
is R includes any number of a's as well as empty string which indicates zero
number of a's appearing, denoted by & character.
Similarly
there is a positive closure of L which can be shown as L+. The L+
denotes set of all the strings except the ε or null string. The null string can
be denoted by ε or ^ If Σ = {a} and if we have regular expression R = a+
then R is a set denoted by
R
= {a, aa, aaa, aaaa, ....)
We
can construct L* as
L*
= ε L+
Let
us try to use regular expressions with the help of some examples.
Definition
of Regular Language: A language is regular if it can be
expressed in terms of regular expression.
Example
2.1.1 Design regular expression for the language
containing all the strings containing any number of a's and b's.
Solution:
The regular expression will be
r.e.
= (a + b)*
This
will give the set as L = {ε, a, aa, ab, b, ba, bab, abab, ..... any combination
of a and b}.
The
(a + b)* means any combination with a and b even a null string.
Example
2.1.2 Write r.e. to denote a language L which accepts
all the strings which begin or end with either 00 or 11.
Solution:
The r.e. can be categorized into two subparts.
R
= L1 + L2
L1
= The strings which begin with 00 or 11.
L2
= The strings which end with 00 or 11.
Let
us find out L1 and L2.
L1
= (0011) (any number of O's and 1's)
L1
= (0011) (0+1)*
Similarly,
L2
= (any number of 0's and 1's) (00 + 11)
=
(0+1)* (00 + 11)
Hence
R
= [(00+11) (0+1)*] + [(0+1)* (00+11)]
Example
2.1.3 Write the r.e. to denote the language L over Σ =
{a, b} such that all the strings do not contain the substring "ab".
Solution:
The L = {ε, a, b, bb, aa, ba, baa ...)
The r.e. = (b* a*)
In
this regular expression if we substitute a* = ε we get all
combination of b's and similarly if we substitute b* = ε then we
will get all combinations of a's. We have strictly maintained a condition for
not to have ab as substring by giving the regular expression as b*
a*.
Example
2.1.4 What is regular expression? Write a regular
expression for set of strings that consists of alternating 0's and 1's. AU
May-16, Marks 8
Solution:
Regular expression - Refer section 2.1.
Regular
expression
Let
Σ be an alphabet which is used to denote the input set. The regular expression
over Σ can be defined as follows.
1.
ϕ is a regular expression which denotes the empty set.
2.
ε is a regular expression and denotes the set {ε} and it is a null string.
3.
For each 'a' in Σ 'a' is a regular expression and denotes the set {a}.
4.
If r and s are regular expressions denoting the languages L1 and L2
respectively, then
r+s
is equivalent to L1 U L2 i.e. union.
rs
is equivalent to L1 L2 i.e. concatenation
r*
is equivalent to L1* i.e. closure.
The
r* is known as kleen closure or closure which indicates occurrence of r for ∞
number of times.
For
example if Σ = {a} and we have regular expression R = a*, then R is a set
denoted by R= {ε, a, aa, aaa, aaaa, ...}
r.e.
= (0+ 1+ 0+ 1+)*
Example
2.1.5 Write regular expression to describe following
languages:
i)
{w = {0, 1}*}: w corresponds to binary encoding, without leading 0's,
of natural numbers that are evenly divisible by 4}
ii)
{w = {0, 1}}*: w corresponds to binary encoding, without leading 0's
of natural numbers that are powers of 4}
iii)
{w = {0-9}*: w corresponds to the decimal encoding, without leading
0's of an odd natural number}.
Solution:
i)
The valid strings for given language are (100, 1000, 1100, 11100,...)
For
ex: 100 = 4
Hence
regular expression will be
r.e.
= (1(0+1)* 00) + 0
ii)
The vaild strings for given language are (100, 10000, 1000000,...}
Since
(i) 100 in binary = 1×22 + 0×21 + 0×20 = 4 + 0
+ 0 = 4 in decimal
(ii)
10000 in binary = 1×24 + 0×23 + 0×22 + 0×21
+ 0×20 = 16 in decimal
Thus
we want 4, 16, 64... in binary representation as valid string.
Hence
the regular expression is
r.e.
= 1(00)*
iii)
The odd natural numbers are those numbers that end with either 1, 3, 5 or 9.
Hence regular expression using the set (0-9)* will be as follows
r.e.
= (ε +((1-9)(0-9)* ))(1+3+5+7+9)
Example
2.1.6 Write regular expressions for the following
languages over the alphabet Σ = (a, b)
i)
All strings that do not end with 'aa'.
ii)
All strings that contain an even number of 'b's.
iii)
All strings which do not contain the substring 'ba'.
Solution:
i)
The language contains all the strings that do not end with aa means it may end
with bb, ba or ab. Hence r. e. will be
(a
+ b) * ( (ab)* + (ba)* + b*)
ii)
r. e. = (a* b a* b)*
iii)
The L = {ε, a, b, bb, aa, ab, abb, ... }
r.
e. = (a*b*)
Example
2.1.7 Give the Regular Expression for the following
languages:
i)
L = { x|x ϵ {a,b}* and x contains exactly two a's}
ii)
L = (a, c, ab, cb, abb, cbb, abbb, cbbb, ...}
iii)
L = { x|x ϵ {a, b}* and x is any string that begins in
"abb" or a}
Solution:
i)
b* ab* ab*
ii)
The set contains all the string that either begin with a or c but always end
with any number of b's.
r.e.
= (a+c) b*
iii)
r.e. = (abb + a) (a + b)*
Review Question
1. Discuss on regular
expressions. AU: May-13, Marks 8
Theory of Computation: Unit II: Regular Expressions and Languages : Tag: : Regular Expressions and Languages - Theory of Computation - Regular Expression and Regular Languages
Theory of Computation
CS3452 4th Semester CSE Dept | 2021 Regulation | 4th Semester CSE Dept 2021 Regulation