The simplest of information machines.
In a previous post I discussed the idea of machines that read symbols off a tape and recognize a language, defining briefly each of those terms. One of the simplest such machines is the Deterministic Finite Automaton, or DFA.
We are all familiar with the idea that objects can be in several states. A car engine might be running or not running. A traffic semaphore might be green, amber, red, flashing amber, flashing red, or dead. In the natural world things are more often on a continuum than in distinct states: we might say an piece of fruit is green, ripe, or rotten, but we all know the boundaries are fuzzy. Still, discrete states are easy to think about and are almost universally used in computation. I may, at some point, talk about some continuous-state Turing-complete systems, but they are rarely used or discussed in computing.
Automata work based on state transitions which are precipitated by events. Turning the ignition cases a car engine to go from not running to running; turning it the other way switches it back to not running. Timers and/or vehicle sensors switch semaphores from green to amber, after which another timer switches them to red, etc.
A DFA is a machine that has a set of distinct states.
It is given a sequence of symbols from some alphabet;
read a symbol is an event that might change the machine’s state.
We’ll use an alphabet with just two symbols, a
and b
, for now.
After reading all the symbols the DFA either outputs “yes”
or “no”, based on which state it is in.
States that cause it to say “yes” are called accept states.
Let’s consider a machine with two states,
one of which is an accept state and the other is not.
Let’s say this machine starts in the accept state.
When it reads an a
in the accept state nothing happens.
When it reads a b
in the accept state it changes to the other state.
Once it enter the other state it never goes back.
Thus, if we give the machine aaba
it will start in the accept state,
read an a
and remain in the accept state,
read an a
and remain in the accept state,
read a b
and transition to the reject state,
read an a
and remain in the reject state,
and conclude by answering “no”, also called “rejecting the input”.
DFAs are often displayed using circles and arrows, like so:
Accept states are indicated using doubled circles. It is traditional to put self-edges for events that do not cause a state change. The starting state is indicated with an arrow excident nothing.
The language of a machine
is the set of all inputs that will cause it to end in an accept state.
A string is a sequence of symbols; each input is a string.
The language of the DFA in the previous section is the set of all strings
that do not have a b
in them, only zero or more a
s.
Sometimes the languages a DFA describes are harder to articulate; for example, the following DFA
has a language that consists of one or more blurbs,
where a blurb is one of ab
, aaab
, or abbb
.
In later posts we shall investigate a number of questions about DFAs that we will also ask about other machines later. Is there a readable way of describing languages DFAs recognize? Are there languages DFAs can’t recognize, and if so what are they? If we make DFAs more complicated in various ways, does their expressive power change?
Deterministic Finite Automata are useful in a variety of settings, mostly because they are simple and don’t admit many bugs. They are also our first step into the world of computability theory.
Looking for comments…