Friday 21 June 2013

Applying State Design Pattern to Code an FSM


A finite-state machine (FSM) or finite-state automaton (plural: automata), or simply a state machine, is a mathematical model of computation used to design both computer programs and sequential logic circuits. The behavior of state machines can be observed in many devices in modern society which perform a predetermined sequence of actions depending on a sequence of events with which they are presented. 

A programmer often needs to model states of classes implementing such real life behavior. States of  a class can be implemented using some class variables denoting the a particular state the class is in. While handling the events the class is sensitive to, a check can be done to find the current state and appropriate action can be taken. For example 'if(state ==ON) do this else do that'. But this approach gets complicated if the number of states or events to which class has to react increase.


State Pattern is an  elegant design patterns that can be used to implement states and event in a seamless way. But the problem one often face is to map an FSM to state design patter. This post is aimed at explaining the concept of mapping an FSM to state design patter with an example. 

The example we take to illustrate State Pattern is a string tokenizer that converts an expression into a vector of tokens. The tokenizer reads one character at a time from a string stream and based on the state it is in it creates token for them and updates it state if required. For example once it finds a alphabet the it start collecting alphanumeric characters until it find an operator or space.The FSM of a tokenizer is listed below.



Mapping FSM to Class Design

To implement an FSM using state design pattern first step is to identify all the events. The FSM reads the characters of input string and change its state based on inputs. Hence for the given FSM the events are:
  • A space
  • A digit
  • A number
  • A letter
Then we need to define a base class consisting of virtual functions representing all the event as shown below. We can also optionally add two boundary function onstateEntry and onstateExit to take action on entry and exit to a particular state.

class fsm_base_state
{
public:
      virtual state_enum getStateType(){return STATE_INIT;}
      virtual void onstateEntry(char ch){}
      virtual void onstateExit(){}
      virtual void nextcharNumber(char num){}
      virtual void nextcharOperator(char op){}
      virtual void nextcharAlphabet(char alpha){}
      virtual void nextcharSpace(){}
protected:
      string nextToken;
};

Now since the FSM has 4 states hence we need to define four concrete class implementing the base class functions. There should be one class for each state in given FSM. The concrete state classes for our examples are:
WaitingForNextToken   çè fsm_state_init
AcceptingNumbers      çè fsm_state_acceptnumber
AcceptingLetters      çè fsm_state_acceptid
AcceptingOperator     çè fsm_state_acceptOperator

We also need to implement a context class(fsm_state_context) that creates all the states and also provides an API to change the current state to given state. I have provided one variation from traditional state patter where in every time a new state is entered a new object of state class is created. In the implementation provided I am creating all the state class object in advance and use them again and again.

For scanning a given string expression a new object of fsm_state_context is created that iterates trough all the characters of string and based on type of character calls one of fsm_base_state functions on the current state. After that cal lands into implementation of the current class and it takes necessary action as per behavior.
Complete class diagram for tokenizer state machine is as shown below.


Implementation

Each class is implemented taking in mind only the current state. Hence we will not find any instance where we have to check the current state and take any special action, like.. 
if(state == <SomeState>)
{
//do some thing
}
That’s the beauty of using a state pattern. While implementing a particular state we only need to worry about all the events/functions that are allowed in current state. If a state change is required, the state classes request the context class to change to desired next state. 

Entry and Exit conditions
We have defined stateEntry and Exit functions for handling the boundary conditions. For example while we are scanning a number once we find end of number we need to move to appropriate state but before doing so we also need to store the current number as a new toke found. Hence we will push it to the vector in the context class.

Handling Error Condition
There are some condition which are not valid for example 10*/20 is not a valid expression. This can also be handled quite easily. Once we find operator ‘*’ the state moves to 'accept operator'. So in implementation of nextCharOperator() function for fsm_state_acceptOperator class we can put a check that we cannot accept a new operator unless it is ‘(‘ or ‘)’ or this operator is after a ‘)’. See implementation of
void fsm_state_acceptOperator::nextcharOperator(char op)
function in tokenizer.cpp to understand how error is handled.

More about shared Example
The example privided also gives implementation of an evaluator class that inputs a tokenized vector and converts it into a post fix expression. If the conversion is success full then it evaluates the expression using a stack. The pseudo code for evaluation is available in this document.pseudoCode.pdf

More details on what is an infix, prefix and post fix expression can be found ath this wiki link Reverse Polish Notation

 Example Source Files Details

File
Details
 Header file defining all the classes required for modleing a tokenizer
 Source file implementing all the classes required for modleing a tokenizer
 Header file defining all the evaluator class
 Souce file implementing all the evaluator class
 A VC workspace with all the above files and testbench


Summing Up:
State design pattern is a very useful pattern to handle complexity of modeling a complex state machine. However mapping an FSM to this pattern is not straight forward.  This post explains how that can be achieved and also shares an example. The example also implements an expression evaluator that can be used to evaluate any expression given in string form. This is what most of the compilers do to evaluate complex expression we write in our source code. Hope I was able to explain the intent properly. In case any thing is not clear, feel free to let me know.

No comments:

Post a Comment