An equivalence between a class of languages and a class of machines.
This post depends on earlier posts about NFAs and Regular Expressions. It will not make sense without them.
The following process creates an NFA that accepts any string from the language defined by a given regular expression. I’ll describe it informally, since the formal version is kind of long.
Work backwards from the end of the regular expression. We start with just a single accepting state in the NFA. We’ll keep track of which nodes are “live”; initially this is the single accepting state.
a
we’ll add a new node which has an edge labeled a
pointing to each live node, and then make it the only live node.?
we’ll add what we normally would for the character before it
but keep all of the current live nodes as well as the new ones.)
we’ll create an entire NFA for the stuff inside the parentheses,
except we’ll use our current live nodes
instead of the single accepting state.|
we add any current live nodes to a set of start nodes
and reset the live nodes to it’s initial state.*
we’ll add what we normally would for the character before it
but we’ll make it’s final nodes also have edges to it’s first nodes
and we’ll keep all of the current live nodes as well as the new ones.When we’re all done we add the live nodes to a set of starting nodes.
That’s it. All that’s left is to show an example.
Let’s use a(b*c|d)ef?g
,
which we’ll read backwards:
g
,
f?
,
e
,
d)
,
|
,
c
,
b*
,
(
(which takes two steps in the image),
a
,
and a final step to finish.
There are many other ways to create an NFA from a regular expression; many texts present the “gadget” technique instead where each symbol has some set of nodes connected with “epsilon transitions”, or edges that don’t require any input to follow. I’ve always found the technique I’ve presented much simpler. Either way, the purpose remains clear: given a regular expression we can create an NFA that recognizes its language. And, since we talked recently about how to turn an NFA into a DFA, we can create a DFA too.
In part 2 we’ll see that we can go the other direction as well.
Looking for comments…