Regular expressions

5.3 Regular expressions

Regular expressions are a classical and convenient way to describe, for example, the structure of terminal words. This section defines regular expressions, defines the language of a regular expression, and shows that regular expressions and regular grammars are equally expressive formalisms. We do not discuss implementations of (datatypes and functions for matching) regular expressions; implementations can be found in the literature, see [9, 6].

Definition 14: RE_T, regular expressions over alphabet T

The set RE_T of regular expressions over alphabet T is inductively defined as follows: for regular expressions R, S

  Ø ∈ RE_T
  ε ∈ RE_T
  a ∈ RE_T
  R + S ∈ RE_T
  R S ∈ RE_T
  R∗ ∈ RE_T
  (R) ∈ RE_T

where a ∈ T. The operator + is associative, commutative, and idempotent; the concatenation operator, written as juxtaposition (so x concatenated with y is denoted by xy), is associative, and ε is the unit of it. In formulae this reads, for all regular expressions R, S, and V,

  R + (S + V) = (R + S) + V
  R + S = S + R
  R + R = R
  R (S V) = (R S) V
  R ε = R = ε R

Furthermore, the star operator, ∗, binds stronger than concatenation, and concatenation binds stronger than +. Examples of regular expressions are:

  (bc)∗ + Ø
  ε + b(ε∗)

The language (i.e. the “semantics”) of a regular expression over T is a set of T-sequences, defined compositionally on the structure of regular expressions as follows.

Definition 15: Language of a regular expression

Function Lre :: RE_T → {T∗} returns the language of a regular expression. It is defined inductively by:

  Lre(Ø) = Ø
  Lre(ε) = {ε}
  Lre(b) = {b}
  Lre(x + y) = Lre(x) ∪ Lre(y)
  Lre(xy) = Lre(x) Lre(y)
  Lre(x∗) = (Lre(x))∗
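The defining equations of Lre can be read almost directly as a recursive function. The following is a minimal sketch in Python, not an implementation taken from this text: the class names `Empty`, `Eps`, `Sym`, `Alt`, `Cat`, `Star` and the functions `lre` and `concat` are hypothetical, and because Lre(x∗) is in general infinite, the sketch only returns the words of length at most `max_len`.

```python
from dataclasses import dataclass

# Hypothetical syntax tree for regular expressions (names are assumptions).
@dataclass(frozen=True)
class Empty:      # Ø
    pass

@dataclass(frozen=True)
class Eps:        # ε
    pass

@dataclass(frozen=True)
class Sym:        # a terminal a ∈ T, here a single character
    char: str

@dataclass(frozen=True)
class Alt:        # R + S
    left: object
    right: object

@dataclass(frozen=True)
class Cat:        # R S
    left: object
    right: object

@dataclass(frozen=True)
class Star:       # R∗
    body: object


def concat(xs, ys, max_len):
    """Set concatenation, keeping only words of length <= max_len."""
    return {x + y for x in xs for y in ys if len(x + y) <= max_len}


def lre(r, max_len):
    """Words of Lre(r) of length <= max_len; the empty string stands for ε."""
    if isinstance(r, Empty):
        return set()
    if isinstance(r, Eps):
        return {""}
    if isinstance(r, Sym):
        return {r.char} if max_len >= 1 else set()
    if isinstance(r, Alt):
        return lre(r.left, max_len) | lre(r.right, max_len)
    if isinstance(r, Cat):
        return concat(lre(r.left, max_len), lre(r.right, max_len), max_len)
    if isinstance(r, Star):
        base = lre(r.body, max_len)
        result = {""}            # zero concatenations
        frontier = {""}
        while frontier:          # stops once no new bounded word appears
            frontier = concat(frontier, base, max_len) - result
            result |= frontier
        return result
    raise TypeError(f"not a regular expression: {r!r}")


# The two example expressions given earlier: (bc)∗ + Ø and ε + b(ε∗).
example1 = Alt(Star(Cat(Sym("b"), Sym("c"))), Empty())
example2 = Alt(Eps(), Cat(Sym("b"), Star(Eps())))
```

With these definitions, `lre(example1, 4)` contains exactly the empty word, `bc`, and `bcbc`, and `lre(example2, 4)` contains the empty word and `b`. The length bound is a design choice of the sketch; it is what makes the star case terminate.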

Since ∪ is associative, commutative, and idempotent, and set concatenation is associative with {ε} as its unit, function Lre is well defined. Note that the language Lre(b∗) is the set consisting of zero, one or more concatenations of b, i.e., Lre(b∗) = ({b})∗. As an example of a language of a regular expression, we compute the language of the regular expression (ε + bc)d.

  Lre((ε + bc)d)
= (Lre(ε + bc)) (Lre(d))
= (Lre(ε) ∪ Lre(bc)) {d}
= ({ε} ∪ (Lre(b))(Lre(c))) {d}
= {ε, bc} {d}
= {d, bcd}

Regular expressions are used to describe the tokens of a language. For example, the list

  if p then e1 else e2

contains six tokens, three of which are identifiers. An identifier is an element in the language of the regular expression

  letter (letter + digit)∗

where

  letter = a + b + ... + z + A + B + ... + Z
  digit  = 0 + 1 + ... + 9

see subsection 2.3.1.
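The identifier expression can be tried out with Python's `re` module, where the choice operator + is written as a character class. This is only an illustrative sketch under two assumptions that are not stated in the text: tokens are separated by whitespace, and the words `if`, `then`, and `else` are reserved and therefore not counted as identifiers. The names `IDENTIFIER`, `KEYWORDS`, `tokens`, and `identifiers` are hypothetical.

```python
import re

# letter (letter + digit)∗ in Python's regular expression syntax
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9]*")

# Assumption: these words are reserved and are not identifiers.
KEYWORDS = {"if", "then", "else"}


def tokens(text):
    """Split the input into tokens (assumption: whitespace-separated)."""
    return text.split()


def identifiers(text):
    """Tokens that match the identifier expression and are not keywords."""
    return [t for t in tokens(text)
            if IDENTIFIER.fullmatch(t) and t not in KEYWORDS]
```

For the list `if p then e1 else e2`, `tokens` yields six tokens and `identifiers` keeps `p`, `e1`, and `e2`. Note that `fullmatch` is used so that a token such as `1e`, which merely contains an identifier, is rejected.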

At the beginning of this section we claimed that regular expressions and regular grammars are equivalent formalisms. We will prove this claim later, but first we illustrate the construction of a regular grammar out of a regular expression in an example. Consider the following regular expression.

  R = a∗ + ε + (a + b)∗

We aim at a regular grammar G such that Lre(R) = L(G) and again we take a top-down approach.

Suppose that nonterminal A generates the language Lre(a∗), nonterminal B generates the language Lre(ε), and nonterminal C generates the language Lre((a + b)∗). Suppose furthermore that the productions for A, B, and C satisfy the conditions imposed upon regular grammars. Then we obtain a regular grammar G with L(G) = Lre(R) by defining

  S → A
  S → B
  S → C

where S is the start-symbol of G. It remains to construct productions for nonterminals A, B, and C.

• The nonterminal A with productions

  A → aA
  A → ε

  generates the language Lre(a∗).

• Since Lre(ε) = {ε}, the nonterminal B with production

  B → ε

  generates the language {ε}.

• Nonterminal C with productions

  C → aC
  C → bC
  C → ε

  generates the language Lre((a + b)∗).

For a specific example it is not difficult to construct a regular grammar for a regular expression. We now give the general result.

Theorem 16: Regular Grammar for Regular Expression

For each regular expression R there exists a regular grammar G such that

Lre(R) = L(G)
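For the example R = a∗ + ε + (a + b)∗ and the grammar constructed for it, the equality can be spot-checked on all words up to a fixed length. The sketch below is a bounded test, not a proof: it enumerates the words derivable from S and compares them with the words over {a, b} accepted by a Python `re` pattern for R. The encoding of productions as pairs and the names `PRODUCTIONS`, `generated`, and `matched` are assumptions made for this illustration.

```python
import re
from itertools import product

# Productions of the example grammar. A right-hand side is a pair
# (terminals, next nonterminal); "" stands for ε and None for "no nonterminal".
PRODUCTIONS = {
    "S": [("", "A"), ("", "B"), ("", "C")],
    "A": [("a", "A"), ("", None)],
    "B": [("", None)],
    "C": [("a", "C"), ("b", "C"), ("", None)],
}


def generated(start, max_len):
    """Words of length <= max_len derivable from the nonterminal start."""
    words, seen, todo = set(), set(), [("", start)]
    while todo:
        prefix, nonterminal = todo.pop()
        if (prefix, nonterminal) in seen:
            continue
        seen.add((prefix, nonterminal))
        for terminals, nxt in PRODUCTIONS[nonterminal]:
            word = prefix + terminals
            if len(word) > max_len:
                continue
            if nxt is None:
                words.add(word)          # derivation finished
            else:
                todo.append((word, nxt))  # continue deriving
    return words


# R = a∗ + ε + (a + b)∗; the empty group (?:) plays the role of ε.
R = re.compile(r"(?:a*|(?:)|(?:a|b)*)")


def matched(max_len):
    """Words over {a, b} of length <= max_len that are in Lre(R)."""
    candidates = ("".join(p) for n in range(max_len + 1)
                  for p in product("ab", repeat=n))
    return {w for w in candidates if R.fullmatch(w)}
```

Comparing `generated("S", n)` with `matched(n)` for small n gives the same set of words. Such a check can expose a mistake in a hand-built grammar, but only the proof establishes the equality for all words.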

The proof of this theorem is given in Section 5.4.

To obtain a regular expression that generates the same language as a given regular grammar we go via an automaton. Given a regular grammar G, we can use the theorems from the previous sections to obtain a DFA D such that

L(G) = Ldfa(D)


So if we can obtain a regular expression for a DFA D, we have found a regular expression for a regular grammar. To obtain a regular expression for a DFA D, we interpret each state of D as a regular expression defined as the sum of the concatenation of outgoing terminal symbols with the resulting state. For our example DFA we obtain:

  C = ε

It is easy to merge these four regular expressions into a single regular expression, partly because this is a simple example. Merging the regular expressions obtained from a DFA that may loop is more complicated, as we will briefly explain in the proof of the following theorem. In general, we have:

Theorem 17: Regular Expression for Regular Grammar

For each regular grammar G there exists a regular expression R such that

L(G) = Lre(R)

The proof of this theorem is given in Section 5.4.
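The reading of DFA states as regular expressions, described before Theorem 17, can be sketched as follows. The DFA used here is hypothetical and is not the example DFA of the text; the names `DFA`, `ACCEPTING`, `state_equations`, and `accepts` are likewise assumptions. The sketch also assumes the usual convention that an accepting state contributes ε to its sum, and that a state with no terms denotes Ø.

```python
import re
from itertools import product

# Hypothetical DFA: state -> {terminal: target state}; start state is "S".
DFA = {
    "S": {"a": "A", "b": "S"},
    "A": {"a": "F"},
    "F": {},
}
ACCEPTING = {"F"}


def state_equations(transitions, accepting):
    """For each state, the sum of 'terminal followed by target state'."""
    equations = {}
    for state in sorted(transitions):
        terms = [f"{symbol}{target}"
                 for symbol, target in sorted(transitions[state].items())]
        if state in accepting:
            terms.append("ε")    # assumption: accepting states add ε
        equations[state] = " + ".join(terms) if terms else "Ø"
    return equations


def accepts(word, start="S"):
    """Run the DFA on word; a missing transition rejects."""
    state = start
    for symbol in word:
        state = DFA[state].get(symbol)
        if state is None:
            return False
    return state in ACCEPTING


def agrees_with(pattern, max_len):
    """Compare the DFA with a Python regex on all words up to max_len."""
    words = ("".join(p) for n in range(max_len + 1)
             for p in product("ab", repeat=n))
    return all(accepts(w) == bool(re.fullmatch(pattern, w)) for w in words)
```

Here `state_equations(DFA, ACCEPTING)` yields F = ε, A = aF, and S = aA + bS. Substituting F and A is straightforward; the loop on S is what requires a star, and solving the remaining equation by hand gives b∗aa, which `agrees_with("b*aa", 5)` confirms on all words of length at most five.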
