Push Down Automaton to detect RNA hairpins
Clash Royale CLAN TAG#URR8PPP
.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;
up vote
2
down vote
favorite
For one of my assignments I am supposed to write a Push Down Automaton. It's supposed to receive a string that has to be checked if it's an RNA Hairpin.
For example: gacgcaaguc would be one, since the gac is the opposite of the guc, and the four in the middle have to be either gcaa or gaaa.
I already have a working program which, however, checks each possible case. It has 15 if
statements, one for each case, and I am sure there is a better way.
Now I am looking for a more elegant way in C++ to check these cases:
S -> aW1u | uW1a | gW1c | cW1g
W1-> aW2u | uW2a | gW2c | cW2g
W2-> aW3u | uW3a | gW3c | cW3g
W3-> gW4a
W4-> ca | aa
As you probably know the right side gets filled into a Stack, and once I reach half the String, I check if the Stack and the other half of my String align. I didn't understand PDAs too well, so maybe you have some help for a guy who is new to this.
HPP:
#include <stack>
class PDA
public:
enum Language
HAIRPIN, // accepts RNA Hairpins
BRACKETS // Zusatzaufgabe
;
enum State
IN_PROGRESS = 0, // sequence is valid so far, but not complete
SUCCESS = 1, // currently in accepting state, i.e. stack is empty after reading # symbol
FAIL = 2 // sequence is not in language
;
/// Constructor, which specifies which language to check (HAIRPIN by default)
/// and internally builds up the production rules
PDA(const Language l = Language::HAIRPIN);
/// Read the next symbol and internally traverse states
/// If a fail state was reached in the previous call, the fail state will be maintained
/// @param a Symbol to read
/// @return current state of PDA
State next(const char a);
/// Reset the automaton to a state as if newly constructed
/// I.e. init stack and set state to s0.
void reset();
protected:
/// YOUR Member variables and functions
State transitionfunction(char b, State cur, char stacktop);
State current;
std::stack<char> stark;
int statechange;
;
CPP:
#include <stack>
#include "PDA.hpp"
#include <iostream>
using namespace std;
PDA::PDA(const Language l)
current = IN_PROGRESS;
stark.push('$');
statechange = 0;
PDA::State PDA::next(const char a)
if (statechange == 1)
if (stark.top() == '$')
return PDA::SUCCESS;
else if (stark.top() == a)
stark.pop();
return PDA::IN_PROGRESS;
else
return PDA::FAIL;
else
return transitionfunction(a, current, stark.top());
void PDA::reset()
current = IN_PROGRESS;
while (!stark.empty())
stark.pop();
stark.push('$');
statechange = 0;
PDA::State PDA::transitionfunction(const char b, PDA::State cur, const char stacktop)
if (cur == PDA::FAIL)
return PDA::FAIL;
else if (stacktop =='$')
if (b == 'a')
stark.push('u');
stark.push('1');
return PDA::IN_PROGRESS;
else if (b == 'c')
stark.push('g');
stark.push('1');
return PDA::IN_PROGRESS;
else if (b == 'g')
stark.push('c');
stark.push('1');
return PDA::IN_PROGRESS;
else if (b == 'u')
stark.push('a');
stark.push('1');
return PDA::IN_PROGRESS;
else
return PDA::FAIL;
else if (stacktop =='1')
if (b == 'a')
stark.pop();
stark.push('u');
stark.push('2');
return PDA::IN_PROGRESS;
else if (b == 'c')
stark.pop();
stark.push('g');
stark.push('2');
return PDA::IN_PROGRESS;
else if (b == 'g')
stark.pop();
stark.push('c');
stark.push('2');
return PDA::IN_PROGRESS;
else if (b == 'u')
stark.pop();
stark.push('a');
stark.push('2');
return PDA::IN_PROGRESS;
else
return PDA::FAIL;
else if (stacktop =='2')
if (b == 'a')
stark.pop();
stark.push('u');
stark.push('3');
return PDA::IN_PROGRESS;
else if (b == 'c')
stark.pop();
stark.push('g');
stark.push('3');
return PDA::IN_PROGRESS;
else if (b == 'g')
stark.pop();
stark.push('c');
stark.push('3');
return PDA::IN_PROGRESS;
else if (b == 'u')
stark.pop();
stark.push('a');
stark.push('3');
return PDA::IN_PROGRESS;
else
return PDA::FAIL;
else if (stacktop == '3')
if(b == 'g')
stark.pop();
stark.push('a');
stark.push('4');
return PDA::IN_PROGRESS;
else
return PDA::FAIL;
else if (stacktop == '4')
if(b == 'c')
stark.pop();
stark.push('a');
statechange = 1;
return PDA::IN_PROGRESS;
else if(b == 'a')
stark.pop();
stark.push('a');
statechange = 1;
return PDA::IN_PROGRESS;
else
return PDA::FAIL;
Main:
#include <iostream>
#include "PDA.hpp"
using namespace std;
int main(int argc, char* argv)
if (argc != 2)
cout << "Please only enter one string" << endl;
return 1;
string a = argv[1];
if (a.length() != 10)
cout << "The string should have length 10" << endl;
return 1;
PDA::State final = PDA::IN_PROGRESS;
PDA testpin;
for (uint i = 0; i <= a.length(); i++)
final = testpin.next(a[i]);
if (final == PDA::FAIL)
cout << "FAIL" << endl;
return 1;
else if (final == PDA::SUCCESS)
cout << "ACCEPT" << endl;
return 0;
c++ bioinformatics state-machine
add a comment |Â
up vote
2
down vote
favorite
For one of my assignments I am supposed to write a Push Down Automaton. It's supposed to receive a string that has to be checked if it's an RNA Hairpin.
For example: gacgcaaguc would be one, since the gac is the opposite of the guc, and the four in the middle have to be either gcaa or gaaa.
I already have a working program which, however, checks each possible case. It has 15 if
statements, one for each case, and I am sure there is a better way.
Now I am looking for a more elegant way in C++ to check these cases:
S -> aW1u | uW1a | gW1c | cW1g
W1-> aW2u | uW2a | gW2c | cW2g
W2-> aW3u | uW3a | gW3c | cW3g
W3-> gW4a
W4-> ca | aa
As you probably know the right side gets filled into a Stack, and once I reach half the String, I check if the Stack and the other half of my String align. I didn't understand PDAs too well, so maybe you have some help for a guy who is new to this.
HPP:
#include <stack>
class PDA
public:
enum Language
HAIRPIN, // accepts RNA Hairpins
BRACKETS // Zusatzaufgabe
;
enum State
IN_PROGRESS = 0, // sequence is valid so far, but not complete
SUCCESS = 1, // currently in accepting state, i.e. stack is empty after reading # symbol
FAIL = 2 // sequence is not in language
;
/// Constructor, which specifies which language to check (HAIRPIN by default)
/// and internally builds up the production rules
PDA(const Language l = Language::HAIRPIN);
/// Read the next symbol and internally traverse states
/// If a fail state was reached in the previous call, the fail state will be maintained
/// @param a Symbol to read
/// @return current state of PDA
State next(const char a);
/// Reset the automaton to a state as if newly constructed
/// I.e. init stack and set state to s0.
void reset();
protected:
/// YOUR Member variables and functions
State transitionfunction(char b, State cur, char stacktop);
State current;
std::stack<char> stark;
int statechange;
;
CPP:
#include <stack>
#include "PDA.hpp"
#include <iostream>
using namespace std;
PDA::PDA(const Language l)
current = IN_PROGRESS;
stark.push('$');
statechange = 0;
PDA::State PDA::next(const char a)
if (statechange == 1)
if (stark.top() == '$')
return PDA::SUCCESS;
else if (stark.top() == a)
stark.pop();
return PDA::IN_PROGRESS;
else
return PDA::FAIL;
else
return transitionfunction(a, current, stark.top());
void PDA::reset()
current = IN_PROGRESS;
while (!stark.empty())
stark.pop();
stark.push('$');
statechange = 0;
PDA::State PDA::transitionfunction(const char b, PDA::State cur, const char stacktop)
if (cur == PDA::FAIL)
return PDA::FAIL;
else if (stacktop =='$')
if (b == 'a')
stark.push('u');
stark.push('1');
return PDA::IN_PROGRESS;
else if (b == 'c')
stark.push('g');
stark.push('1');
return PDA::IN_PROGRESS;
else if (b == 'g')
stark.push('c');
stark.push('1');
return PDA::IN_PROGRESS;
else if (b == 'u')
stark.push('a');
stark.push('1');
return PDA::IN_PROGRESS;
else
return PDA::FAIL;
else if (stacktop =='1')
if (b == 'a')
stark.pop();
stark.push('u');
stark.push('2');
return PDA::IN_PROGRESS;
else if (b == 'c')
stark.pop();
stark.push('g');
stark.push('2');
return PDA::IN_PROGRESS;
else if (b == 'g')
stark.pop();
stark.push('c');
stark.push('2');
return PDA::IN_PROGRESS;
else if (b == 'u')
stark.pop();
stark.push('a');
stark.push('2');
return PDA::IN_PROGRESS;
else
return PDA::FAIL;
else if (stacktop =='2')
if (b == 'a')
stark.pop();
stark.push('u');
stark.push('3');
return PDA::IN_PROGRESS;
else if (b == 'c')
stark.pop();
stark.push('g');
stark.push('3');
return PDA::IN_PROGRESS;
else if (b == 'g')
stark.pop();
stark.push('c');
stark.push('3');
return PDA::IN_PROGRESS;
else if (b == 'u')
stark.pop();
stark.push('a');
stark.push('3');
return PDA::IN_PROGRESS;
else
return PDA::FAIL;
else if (stacktop == '3')
if(b == 'g')
stark.pop();
stark.push('a');
stark.push('4');
return PDA::IN_PROGRESS;
else
return PDA::FAIL;
else if (stacktop == '4')
if(b == 'c')
stark.pop();
stark.push('a');
statechange = 1;
return PDA::IN_PROGRESS;
else if(b == 'a')
stark.pop();
stark.push('a');
statechange = 1;
return PDA::IN_PROGRESS;
else
return PDA::FAIL;
Main:
#include <iostream>
#include "PDA.hpp"
using namespace std;
int main(int argc, char* argv)
if (argc != 2)
cout << "Please only enter one string" << endl;
return 1;
string a = argv[1];
if (a.length() != 10)
cout << "The string should have length 10" << endl;
return 1;
PDA::State final = PDA::IN_PROGRESS;
PDA testpin;
for (uint i = 0; i <= a.length(); i++)
final = testpin.next(a[i]);
if (final == PDA::FAIL)
cout << "FAIL" << endl;
return 1;
else if (final == PDA::SUCCESS)
cout << "ACCEPT" << endl;
return 0;
c++ bioinformatics state-machine
add a comment |Â
up vote
2
down vote
favorite
up vote
2
down vote
favorite
For one of my assignments I am supposed to write a Push Down Automaton. It's supposed to receive a string that has to be checked if it's an RNA Hairpin.
For example: gacgcaaguc would be one, since the gac is the opposite of the guc, and the four in the middle have to be either gcaa or gaaa.
I already have a working program which, however, checks each possible case. It has 15 if
statements, one for each case, and I am sure there is a better way.
Now I am looking for a more elegant way in C++ to check these cases:
S -> aW1u | uW1a | gW1c | cW1g
W1-> aW2u | uW2a | gW2c | cW2g
W2-> aW3u | uW3a | gW3c | cW3g
W3-> gW4a
W4-> ca | aa
As you probably know the right side gets filled into a Stack, and once I reach half the String, I check if the Stack and the other half of my String align. I didn't understand PDAs too well, so maybe you have some help for a guy who is new to this.
HPP:
#include <stack>
class PDA
public:
enum Language
HAIRPIN, // accepts RNA Hairpins
BRACKETS // Zusatzaufgabe
;
enum State
IN_PROGRESS = 0, // sequence is valid so far, but not complete
SUCCESS = 1, // currently in accepting state, i.e. stack is empty after reading # symbol
FAIL = 2 // sequence is not in language
;
/// Constructor, which specifies which language to check (HAIRPIN by default)
/// and internally builds up the production rules
PDA(const Language l = Language::HAIRPIN);
/// Read the next symbol and internally traverse states
/// If a fail state was reached in the previous call, the fail state will be maintained
/// @param a Symbol to read
/// @return current state of PDA
State next(const char a);
/// Reset the automaton to a state as if newly constructed
/// I.e. init stack and set state to s0.
void reset();
protected:
/// YOUR Member variables and functions
State transitionfunction(char b, State cur, char stacktop);
State current;
std::stack<char> stark;
int statechange;
;
CPP:
#include <stack>
#include "PDA.hpp"
#include <iostream>
using namespace std;
PDA::PDA(const Language l)
current = IN_PROGRESS;
stark.push('$');
statechange = 0;
PDA::State PDA::next(const char a)
if (statechange == 1)
if (stark.top() == '$')
return PDA::SUCCESS;
else if (stark.top() == a)
stark.pop();
return PDA::IN_PROGRESS;
else
return PDA::FAIL;
else
return transitionfunction(a, current, stark.top());
void PDA::reset()
current = IN_PROGRESS;
while (!stark.empty())
stark.pop();
stark.push('$');
statechange = 0;
PDA::State PDA::transitionfunction(const char b, PDA::State cur, const char stacktop)
if (cur == PDA::FAIL)
return PDA::FAIL;
else if (stacktop =='$')
if (b == 'a')
stark.push('u');
stark.push('1');
return PDA::IN_PROGRESS;
else if (b == 'c')
stark.push('g');
stark.push('1');
return PDA::IN_PROGRESS;
else if (b == 'g')
stark.push('c');
stark.push('1');
return PDA::IN_PROGRESS;
else if (b == 'u')
stark.push('a');
stark.push('1');
return PDA::IN_PROGRESS;
else
return PDA::FAIL;
else if (stacktop =='1')
if (b == 'a')
stark.pop();
stark.push('u');
stark.push('2');
return PDA::IN_PROGRESS;
else if (b == 'c')
stark.pop();
stark.push('g');
stark.push('2');
return PDA::IN_PROGRESS;
else if (b == 'g')
stark.pop();
stark.push('c');
stark.push('2');
return PDA::IN_PROGRESS;
else if (b == 'u')
stark.pop();
stark.push('a');
stark.push('2');
return PDA::IN_PROGRESS;
else
return PDA::FAIL;
else if (stacktop =='2')
if (b == 'a')
stark.pop();
stark.push('u');
stark.push('3');
return PDA::IN_PROGRESS;
else if (b == 'c')
stark.pop();
stark.push('g');
stark.push('3');
return PDA::IN_PROGRESS;
else if (b == 'g')
stark.pop();
stark.push('c');
stark.push('3');
return PDA::IN_PROGRESS;
else if (b == 'u')
stark.pop();
stark.push('a');
stark.push('3');
return PDA::IN_PROGRESS;
else
return PDA::FAIL;
else if (stacktop == '3')
if(b == 'g')
stark.pop();
stark.push('a');
stark.push('4');
return PDA::IN_PROGRESS;
else
return PDA::FAIL;
else if (stacktop == '4')
if(b == 'c')
stark.pop();
stark.push('a');
statechange = 1;
return PDA::IN_PROGRESS;
else if(b == 'a')
stark.pop();
stark.push('a');
statechange = 1;
return PDA::IN_PROGRESS;
else
return PDA::FAIL;
Main:
#include <iostream>
#include "PDA.hpp"
using namespace std;
int main(int argc, char* argv)
if (argc != 2)
cout << "Please only enter one string" << endl;
return 1;
string a = argv[1];
if (a.length() != 10)
cout << "The string should have length 10" << endl;
return 1;
PDA::State final = PDA::IN_PROGRESS;
PDA testpin;
for (uint i = 0; i <= a.length(); i++)
final = testpin.next(a[i]);
if (final == PDA::FAIL)
cout << "FAIL" << endl;
return 1;
else if (final == PDA::SUCCESS)
cout << "ACCEPT" << endl;
return 0;
c++ bioinformatics state-machine
For one of my assignments I am supposed to write a Push Down Automaton. It's supposed to receive a string that has to be checked if it's an RNA Hairpin.
For example: gacgcaaguc would be one, since the gac is the opposite of the guc, and the four in the middle have to be either gcaa or gaaa.
I already have a working program which, however, checks each possible case. It has 15 if
statements, one for each case, and I am sure there is a better way.
Now I am looking for a more elegant way in C++ to check these cases:
S -> aW1u | uW1a | gW1c | cW1g
W1-> aW2u | uW2a | gW2c | cW2g
W2-> aW3u | uW3a | gW3c | cW3g
W3-> gW4a
W4-> ca | aa
As you probably know the right side gets filled into a Stack, and once I reach half the String, I check if the Stack and the other half of my String align. I didn't understand PDAs too well, so maybe you have some help for a guy who is new to this.
HPP:
#include <stack>
class PDA
public:
enum Language
HAIRPIN, // accepts RNA Hairpins
BRACKETS // Zusatzaufgabe
;
enum State
IN_PROGRESS = 0, // sequence is valid so far, but not complete
SUCCESS = 1, // currently in accepting state, i.e. stack is empty after reading # symbol
FAIL = 2 // sequence is not in language
;
/// Constructor, which specifies which language to check (HAIRPIN by default)
/// and internally builds up the production rules
PDA(const Language l = Language::HAIRPIN);
/// Read the next symbol and internally traverse states
/// If a fail state was reached in the previous call, the fail state will be maintained
/// @param a Symbol to read
/// @return current state of PDA
State next(const char a);
/// Reset the automaton to a state as if newly constructed
/// I.e. init stack and set state to s0.
void reset();
protected:
/// YOUR Member variables and functions
State transitionfunction(char b, State cur, char stacktop);
State current;
std::stack<char> stark;
int statechange;
;
CPP:
#include <stack>
#include "PDA.hpp"
#include <iostream>
using namespace std;
PDA::PDA(const Language l)
current = IN_PROGRESS;
stark.push('$');
statechange = 0;
PDA::State PDA::next(const char a)
if (statechange == 1)
if (stark.top() == '$')
return PDA::SUCCESS;
else if (stark.top() == a)
stark.pop();
return PDA::IN_PROGRESS;
else
return PDA::FAIL;
else
return transitionfunction(a, current, stark.top());
void PDA::reset()
current = IN_PROGRESS;
while (!stark.empty())
stark.pop();
stark.push('$');
statechange = 0;
PDA::State PDA::transitionfunction(const char b, PDA::State cur, const char stacktop)
if (cur == PDA::FAIL)
return PDA::FAIL;
else if (stacktop =='$')
if (b == 'a')
stark.push('u');
stark.push('1');
return PDA::IN_PROGRESS;
else if (b == 'c')
stark.push('g');
stark.push('1');
return PDA::IN_PROGRESS;
else if (b == 'g')
stark.push('c');
stark.push('1');
return PDA::IN_PROGRESS;
else if (b == 'u')
stark.push('a');
stark.push('1');
return PDA::IN_PROGRESS;
else
return PDA::FAIL;
else if (stacktop =='1')
if (b == 'a')
stark.pop();
stark.push('u');
stark.push('2');
return PDA::IN_PROGRESS;
else if (b == 'c')
stark.pop();
stark.push('g');
stark.push('2');
return PDA::IN_PROGRESS;
else if (b == 'g')
stark.pop();
stark.push('c');
stark.push('2');
return PDA::IN_PROGRESS;
else if (b == 'u')
stark.pop();
stark.push('a');
stark.push('2');
return PDA::IN_PROGRESS;
else
return PDA::FAIL;
else if (stacktop =='2')
if (b == 'a')
stark.pop();
stark.push('u');
stark.push('3');
return PDA::IN_PROGRESS;
else if (b == 'c')
stark.pop();
stark.push('g');
stark.push('3');
return PDA::IN_PROGRESS;
else if (b == 'g')
stark.pop();
stark.push('c');
stark.push('3');
return PDA::IN_PROGRESS;
else if (b == 'u')
stark.pop();
stark.push('a');
stark.push('3');
return PDA::IN_PROGRESS;
else
return PDA::FAIL;
else if (stacktop == '3')
if(b == 'g')
stark.pop();
stark.push('a');
stark.push('4');
return PDA::IN_PROGRESS;
else
return PDA::FAIL;
else if (stacktop == '4')
if(b == 'c')
stark.pop();
stark.push('a');
statechange = 1;
return PDA::IN_PROGRESS;
else if(b == 'a')
stark.pop();
stark.push('a');
statechange = 1;
return PDA::IN_PROGRESS;
else
return PDA::FAIL;
Main:
#include <iostream>
#include "PDA.hpp"
using namespace std;
int main(int argc, char* argv)
if (argc != 2)
cout << "Please only enter one string" << endl;
return 1;
string a = argv[1];
if (a.length() != 10)
cout << "The string should have length 10" << endl;
return 1;
PDA::State final = PDA::IN_PROGRESS;
PDA testpin;
for (uint i = 0; i <= a.length(); i++)
final = testpin.next(a[i]);
if (final == PDA::FAIL)
cout << "FAIL" << endl;
return 1;
else if (final == PDA::SUCCESS)
cout << "ACCEPT" << endl;
return 0;
c++ bioinformatics state-machine
edited Mar 4 at 7:42
200_success
123k14143401
123k14143401
asked Jan 27 at 22:53
Daniel GieÃing
112
112
add a comment |Â
add a comment |Â
1 Answer
1
active
oldest
votes
up vote
1
down vote
I think you're on the right track with a PDA! It seems like a good way to approach this problem. I see 2 big issues with this: 1)the naming of your variables, and 2) the high number of magic values. Here are my thoughts.
Naming
In main()
for example, the user enters a string
. I assume the string
is intended to represent a sequence of bases? If so, it should be called something like base_sequence
.
Similarly, the argument names in your functions tend to be a single character which conveys no meaning whatsoever to a reader. In the constructor to PDA
, you have an argument named l
. A single lowercase letter L is a really bad variable name because it looks a lot like a number 1 at first glance. Someone reading the code could easily confuse one for the other, so at least call it lng
or lang
, or better yet, language
. (Or, since it's not actually used ever anywhere, get rid of it!)
The next
method of PDA
takes a single char
named a
. Looking at the interface for this class, I have no idea what a
represents. The comment says symbol to be read.
Why not name it that way? Something like nextSymbol
, or nextInput
, or better yet, next_base
since it represents a base from the base sequence. Also, the name next
conveys very little. Next what? How about nextState()
or processNextBase()
? And the transitionfunction()
function has both of those problems as well. Again, b
is a base, so name it that way. You don't usually need to put the word function
in the name of a function (unless it's returning a function pointer, for example).
And why is your stack
called stark
?
Magic Values
In transitionfunction()
, what does it mean that the top of the stack contains the character $
? Or 1
or 2
, or any other value? You should either make named constants for them or define an enumerated type for them. That way, someone reading your code will understand the purpose of them.
Simplify
I think your transitionfunction()
function could be simplified in a number of ways. I'll break them down in order of easiest to hardest.
First, you don't need any of the top-level else
s since once you enter one of them, the function returns. So the structure could be more like this:
if (cur == PDA::FAIL)
return cur;
if (stacktop == '$')
//... internal "if"s
if (stacktop == '1')
// ... internal "if"s
etc.
Next, it might make sense to break the body of the top-most if
s into separate functions. So something like this:
if (stacktop == '$')
return handleS(b);
if (stacktop == '1')
return handle1(b);
if (stacktop == '2')
return handle2(b);
... etc.
Of course, I'd name the handlers something clearer than handleX
to make it clear what those states represent.
You can then rearrange the above into a switch
statement like this:
switch (stacktop)
case '$':
return handleS(b);
break;
case '1':
return handle1(b);
break;
case '2':
return handle2(b);
break;
...etc.
;
You can do the same in the handleX()
functions:
switch (b)
case 'a':
stark.push('u');
stark.push('1');
return PDA::IN_PROGRESS;
break;
case 'c':
//... code for the 'c' base
break;
... etc.
But do you notice something? Every case in the handleX()
functions will consist of up to 4 operations:
- optionally pop the stack
- push a base
- optionally push a number
- optionally set the state change to 1
And then they return PDA::IN_PROGRESS
. This seems like a potential candidate for a table-driven design. You could define a struct
that holds these things. Something like this:
struct StateChange
bool popStack;
char base;
char number; // use a better name here
bool setStateChange;
StateChange(bool popIt, char newBase, char newNumber, bool shouldChangeState) : popStack(popIt), base(newBase), number (newNumber), setStateChange(shouldChangeState);
;
Then you can create std::map
of these. Let's look at the $
case:
using StateChangeMap = std::map<char, StateChange>;
StateChangeMap dollarStates;
dollarStates [ 'a' ] = StateChange(false, 'u', '1', false);
dollarStates [ 'c' ] = StateChange(false, 'g', '1', false);
// ... etc. for the rest of the '$' states
Then you can put each of those into another map
where the key is the stacktop
value that's passed into transitionfunction()
. Your transition function now looks something like:
PDA::State PDA::transitionfunction(const char b, PDA::State cur, const char stacktop)
if (cur == PDA::FAIL)
return PDA::FAIL;
StateChange nextState = transitionMap [ stacktop ][ b ];
if (nextState.popStack)
stark.pop();
stark.push(nextState.base);
if (nextState.number != '0') // Here I'm using '0' to represent not pushing a number
stark.push(nextState.number);
if (nextState.setStateChange)
statechange = 1;
return PDA::IN_PROGRESS;
Note that you'll either need to use std::map::find()
rather than the operator
that I used if you haven't sanitized the input, or you'll need to sanitize the input so you never get an invalid state value.
add a comment |Â
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
1
down vote
I think you're on the right track with a PDA! It seems like a good way to approach this problem. I see 2 big issues with this: 1)the naming of your variables, and 2) the high number of magic values. Here are my thoughts.
Naming
In main()
for example, the user enters a string
. I assume the string
is intended to represent a sequence of bases? If so, it should be called something like base_sequence
.
Similarly, the argument names in your functions tend to be a single character which conveys no meaning whatsoever to a reader. In the constructor to PDA
, you have an argument named l
. A single lowercase letter L is a really bad variable name because it looks a lot like a number 1 at first glance. Someone reading the code could easily confuse one for the other, so at least call it lng
or lang
, or better yet, language
. (Or, since it's not actually used ever anywhere, get rid of it!)
The next
method of PDA
takes a single char
named a
. Looking at the interface for this class, I have no idea what a
represents. The comment says symbol to be read.
Why not name it that way? Something like nextSymbol
, or nextInput
, or better yet, next_base
since it represents a base from the base sequence. Also, the name next
conveys very little. Next what? How about nextState()
or processNextBase()
? And the transitionfunction()
function has both of those problems as well. Again, b
is a base, so name it that way. You don't usually need to put the word function
in the name of a function (unless it's returning a function pointer, for example).
And why is your stack
called stark
?
Magic Values
In transitionfunction()
, what does it mean that the top of the stack contains the character $
? Or 1
or 2
, or any other value? You should either make named constants for them or define an enumerated type for them. That way, someone reading your code will understand the purpose of them.
Simplify
I think your transitionfunction()
function could be simplified in a number of ways. I'll break them down in order of easiest to hardest.
First, you don't need any of the top-level else
s since once you enter one of them, the function returns. So the structure could be more like this:
if (cur == PDA::FAIL)
return cur;
if (stacktop == '$')
//... internal "if"s
if (stacktop == '1')
// ... internal "if"s
etc.
Next, it might make sense to break the body of the top-most if
s into separate functions. So something like this:
if (stacktop == '$')
return handleS(b);
if (stacktop == '1')
return handle1(b);
if (stacktop == '2')
return handle2(b);
... etc.
Of course, I'd name the handlers something clearer than handleX
to make it clear what those states represent.
You can then rearrange the above into a switch
statement like this:
switch (stacktop)
case '$':
return handleS(b);
break;
case '1':
return handle1(b);
break;
case '2':
return handle2(b);
break;
...etc.
;
You can do the same in the handleX()
functions:
switch (b)
case 'a':
stark.push('u');
stark.push('1');
return PDA::IN_PROGRESS;
break;
case 'c':
//... code for the 'c' base
break;
... etc.
But do you notice something? Every case in the handleX()
functions will consist of up to 4 operations:
- optionally pop the stack
- push a base
- optionally push a number
- optionally set the state change to 1
And then they return PDA::IN_PROGRESS
. This seems like a potential candidate for a table-driven design. You could define a struct
that holds these things. Something like this:
struct StateChange
bool popStack;
char base;
char number; // use a better name here
bool setStateChange;
StateChange(bool popIt, char newBase, char newNumber, bool shouldChangeState) : popStack(popIt), base(newBase), number (newNumber), setStateChange(shouldChangeState);
;
Then you can create std::map
of these. Let's look at the $
case:
using StateChangeMap = std::map<char, StateChange>;
StateChangeMap dollarStates;
dollarStates [ 'a' ] = StateChange(false, 'u', '1', false);
dollarStates [ 'c' ] = StateChange(false, 'g', '1', false);
// ... etc. for the rest of the '$' states
Then you can put each of those into another map
where the key is the stacktop
value that's passed into transitionfunction()
. Your transition function now looks something like:
PDA::State PDA::transitionfunction(const char b, PDA::State cur, const char stacktop)
if (cur == PDA::FAIL)
return PDA::FAIL;
StateChange nextState = transitionMap [ stacktop ][ b ];
if (nextState.popStack)
stark.pop();
stark.push(nextState.base);
if (nextState.number != '0') // Here I'm using '0' to represent not pushing a number
stark.push(nextState.number);
if (nextState.setStateChange)
statechange = 1;
return PDA::IN_PROGRESS;
Note that you'll either need to use std::map::find()
rather than the operator
that I used if you haven't sanitized the input, or you'll need to sanitize the input so you never get an invalid state value.
add a comment |Â
up vote
1
down vote
I think you're on the right track with a PDA! It seems like a good way to approach this problem. I see 2 big issues with this: 1)the naming of your variables, and 2) the high number of magic values. Here are my thoughts.
Naming
In main()
for example, the user enters a string
. I assume the string
is intended to represent a sequence of bases? If so, it should be called something like base_sequence
.
Similarly, the argument names in your functions tend to be a single character which conveys no meaning whatsoever to a reader. In the constructor to PDA
, you have an argument named l
. A single lowercase letter L is a really bad variable name because it looks a lot like a number 1 at first glance. Someone reading the code could easily confuse one for the other, so at least call it lng
or lang
, or better yet, language
. (Or, since it's not actually used ever anywhere, get rid of it!)
The next
method of PDA
takes a single char
named a
. Looking at the interface for this class, I have no idea what a
represents. The comment says symbol to be read.
Why not name it that way? Something like nextSymbol
, or nextInput
, or better yet, next_base
since it represents a base from the base sequence. Also, the name next
conveys very little. Next what? How about nextState()
or processNextBase()
? And the transitionfunction()
function has both of those problems as well. Again, b
is a base, so name it that way. You don't usually need to put the word function
in the name of a function (unless it's returning a function pointer, for example).
And why is your stack
called stark
?
Magic Values
In transitionfunction()
, what does it mean that the top of the stack contains the character $
? Or 1
or 2
, or any other value? You should either make named constants for them or define an enumerated type for them. That way, someone reading your code will understand the purpose of them.
Simplify
I think your transitionfunction()
function could be simplified in a number of ways. I'll break them down in order of easiest to hardest.
First, you don't need any of the top-level else
s since once you enter one of them, the function returns. So the structure could be more like this:
if (cur == PDA::FAIL)
return cur;
if (stacktop == '$')
//... internal "if"s
if (stacktop == '1')
// ... internal "if"s
etc.
Next, it might make sense to break the body of the top-most if
s into separate functions. So something like this:
if (stacktop == '$')
return handleS(b);
if (stacktop == '1')
return handle1(b);
if (stacktop == '2')
return handle2(b);
... etc.
Of course, I'd name the handlers something clearer than handleX
to make it clear what those states represent.
You can then rearrange the above into a switch
statement like this:
switch (stacktop)
case '$':
return handleS(b);
break;
case '1':
return handle1(b);
break;
case '2':
return handle2(b);
break;
...etc.
;
You can do the same in the handleX()
functions:
switch (b)
case 'a':
stark.push('u');
stark.push('1');
return PDA::IN_PROGRESS;
break;
case 'c':
//... code for the 'c' base
break;
... etc.
But do you notice something? Every case in the handleX()
functions will consist of up to 4 operations:
- optionally pop the stack
- push a base
- optionally push a number
- optionally set the state change to 1
And then they return PDA::IN_PROGRESS
. This seems like a potential candidate for a table-driven design. You could define a struct
that holds these things. Something like this:
struct StateChange
bool popStack;
char base;
char number; // use a better name here
bool setStateChange;
StateChange(bool popIt, char newBase, char newNumber, bool shouldChangeState) : popStack(popIt), base(newBase), number (newNumber), setStateChange(shouldChangeState);
;
Then you can create std::map
of these. Let's look at the $
case:
using StateChangeMap = std::map<char, StateChange>;
StateChangeMap dollarStates;
dollarStates [ 'a' ] = StateChange(false, 'u', '1', false);
dollarStates [ 'c' ] = StateChange(false, 'g', '1', false);
// ... etc. for the rest of the '$' states
Then you can put each of those into another map
where the key is the stacktop
value that's passed into transitionfunction()
. Your transition function now looks something like:
PDA::State PDA::transitionfunction(const char b, PDA::State cur, const char stacktop)
if (cur == PDA::FAIL)
return PDA::FAIL;
StateChange nextState = transitionMap [ stacktop ][ b ];
if (nextState.popStack)
stark.pop();
stark.push(nextState.base);
if (nextState.number != '0') // Here I'm using '0' to represent not pushing a number
stark.push(nextState.number);
if (nextState.setStateChange)
statechange = 1;
return PDA::IN_PROGRESS;
Note that you'll either need to use std::map::find()
rather than the operator
that I used if you haven't sanitized the input, or you'll need to sanitize the input so you never get an invalid state value.
add a comment |Â
up vote
1
down vote
up vote
1
down vote
I think you're on the right track with a PDA! It seems like a good way to approach this problem. I see 2 big issues with this: 1)the naming of your variables, and 2) the high number of magic values. Here are my thoughts.
Naming
In main()
for example, the user enters a string
. I assume the string
is intended to represent a sequence of bases? If so, it should be called something like base_sequence
.
Similarly, the argument names in your functions tend to be a single character which conveys no meaning whatsoever to a reader. In the constructor to PDA
, you have an argument named l
. A single lowercase letter L is a really bad variable name because it looks a lot like a number 1 at first glance. Someone reading the code could easily confuse one for the other, so at least call it lng
or lang
, or better yet, language
. (Or, since it's not actually used ever anywhere, get rid of it!)
The next
method of PDA
takes a single char
named a
. Looking at the interface for this class, I have no idea what a
represents. The comment says symbol to be read.
Why not name it that way? Something like nextSymbol
, or nextInput
, or better yet, next_base
since it represents a base from the base sequence. Also, the name next
conveys very little. Next what? How about nextState()
or processNextBase()
? And the transitionfunction()
function has both of those problems as well. Again, b
is a base, so name it that way. You don't usually need to put the word function
in the name of a function (unless it's returning a function pointer, for example).
And why is your stack
called stark
?
Magic Values
In transitionfunction()
, what does it mean that the top of the stack contains the character $
? Or 1
or 2
, or any other value? You should either make named constants for them or define an enumerated type for them. That way, someone reading your code will understand the purpose of them.
Simplify
I think your transitionfunction()
function could be simplified in a number of ways. I'll break them down in order of easiest to hardest.
First, you don't need any of the top-level else
s since once you enter one of them, the function returns. So the structure could be more like this:
if (cur == PDA::FAIL)
return cur;
if (stacktop == '$')
//... internal "if"s
if (stacktop == '1')
// ... internal "if"s
etc.
Next, it might make sense to break the body of the top-most if
s into separate functions. So something like this:
if (stacktop == '$')
return handleS(b);
if (stacktop == '1')
return handle1(b);
if (stacktop == '2')
return handle2(b);
... etc.
Of course, I'd name the handlers something clearer than handleX
to make it clear what those states represent.
You can then rearrange the above into a switch
statement like this:
switch (stacktop)
case '$':
return handleS(b);
break;
case '1':
return handle1(b);
break;
case '2':
return handle2(b);
break;
...etc.
;
You can do the same in the handleX()
functions:
switch (b)
case 'a':
stark.push('u');
stark.push('1');
return PDA::IN_PROGRESS;
break;
case 'c':
//... code for the 'c' base
break;
... etc.
But do you notice something? Every case in the handleX()
functions will consist of up to 4 operations:
- optionally pop the stack
- push a base
- optionally push a number
- optionally set the state change to 1
And then they return PDA::IN_PROGRESS
. This seems like a potential candidate for a table-driven design. You could define a struct
that holds these things. Something like this:
struct StateChange
bool popStack;
char base;
char number; // use a better name here
bool setStateChange;
StateChange(bool popIt, char newBase, char newNumber, bool shouldChangeState) : popStack(popIt), base(newBase), number (newNumber), setStateChange(shouldChangeState);
;
Then you can create std::map
of these. Let's look at the $
case:
using StateChangeMap = std::map<char, StateChange>;
StateChangeMap dollarStates;
dollarStates [ 'a' ] = StateChange(false, 'u', '1', false);
dollarStates [ 'c' ] = StateChange(false, 'g', '1', false);
// ... etc. for the rest of the '$' states
Then you can put each of those into another map
where the key is the stacktop
value that's passed into transitionfunction()
. Your transition function now looks something like:
PDA::State PDA::transitionfunction(const char b, PDA::State cur, const char stacktop)
if (cur == PDA::FAIL)
return PDA::FAIL;
StateChange nextState = transitionMap [ stacktop ][ b ];
if (nextState.popStack)
stark.pop();
stark.push(nextState.base);
if (nextState.number != '0') // Here I'm using '0' to represent not pushing a number
stark.push(nextState.number);
if (nextState.setStateChange)
statechange = 1;
return PDA::IN_PROGRESS;
Note that you'll either need to use std::map::find()
rather than the operator
that I used if you haven't sanitized the input, or you'll need to sanitize the input so you never get an invalid state value.
I think you're on the right track with a PDA! It seems like a good way to approach this problem. I see 2 big issues with this: 1)the naming of your variables, and 2) the high number of magic values. Here are my thoughts.
Naming
In main()
for example, the user enters a string
. I assume the string
is intended to represent a sequence of bases? If so, it should be called something like base_sequence
.
Similarly, the argument names in your functions tend to be a single character which conveys no meaning whatsoever to a reader. In the constructor to PDA
, you have an argument named l
. A single lowercase letter L is a really bad variable name because it looks a lot like a number 1 at first glance. Someone reading the code could easily confuse one for the other, so at least call it lng
or lang
, or better yet, language
. (Or, since it's not actually used ever anywhere, get rid of it!)
The next
method of PDA
takes a single char
named a
. Looking at the interface for this class, I have no idea what a
represents. The comment says symbol to be read.
Why not name it that way? Something like nextSymbol
, or nextInput
, or better yet, next_base
since it represents a base from the base sequence. Also, the name next
conveys very little. Next what? How about nextState()
or processNextBase()
? And the transitionfunction()
function has both of those problems as well. Again, b
is a base, so name it that way. You don't usually need to put the word function
in the name of a function (unless it's returning a function pointer, for example).
And why is your stack
called stark
?
Magic Values
In transitionfunction()
, what does it mean that the top of the stack contains the character $
? Or 1
or 2
, or any other value? You should either make named constants for them or define an enumerated type for them. That way, someone reading your code will understand the purpose of them.
Simplify
I think your transitionfunction()
function could be simplified in a number of ways. I'll break them down in order of easiest to hardest.
First, you don't need any of the top-level else
s since once you enter one of them, the function returns. So the structure could be more like this:
if (cur == PDA::FAIL)
return cur;
if (stacktop == '$')
//... internal "if"s
if (stacktop == '1')
// ... internal "if"s
etc.
Next, it might make sense to break the body of the top-most if
s into separate functions. So something like this:
if (stacktop == '$')
return handleS(b);
if (stacktop == '1')
return handle1(b);
if (stacktop == '2')
return handle2(b);
... etc.
Of course, I'd name the handlers something clearer than handleX
to make it clear what those states represent.
You can then rearrange the above into a switch
statement like this:
switch (stacktop)
case '$':
return handleS(b);
break;
case '1':
return handle1(b);
break;
case '2':
return handle2(b);
break;
...etc.
;
You can do the same in the handleX()
functions:
switch (b)
case 'a':
stark.push('u');
stark.push('1');
return PDA::IN_PROGRESS;
break;
case 'c':
//... code for the 'c' base
break;
... etc.
But do you notice something? Every case in the handleX()
functions will consist of up to 4 operations:
- optionally pop the stack
- push a base
- optionally push a number
- optionally set the state change to 1
And then they return PDA::IN_PROGRESS
. This seems like a potential candidate for a table-driven design. You could define a struct
that holds these things. Something like this:
struct StateChange
bool popStack;
char base;
char number; // use a better name here
bool setStateChange;
StateChange(bool popIt, char newBase, char newNumber, bool shouldChangeState) : popStack(popIt), base(newBase), number (newNumber), setStateChange(shouldChangeState);
;
Then you can create std::map
of these. Let's look at the $
case:
using StateChangeMap = std::map<char, StateChange>;
StateChangeMap dollarStates;
dollarStates [ 'a' ] = StateChange(false, 'u', '1', false);
dollarStates [ 'c' ] = StateChange(false, 'g', '1', false);
// ... etc. for the rest of the '$' states
Then you can put each of those into another map
where the key is the stacktop
value that's passed into transitionfunction()
. Your transition function now looks something like:
PDA::State PDA::transitionfunction(const char b, PDA::State cur, const char stacktop)
if (cur == PDA::FAIL)
return PDA::FAIL;
StateChange nextState = transitionMap [ stacktop ][ b ];
if (nextState.popStack)
stark.pop();
stark.push(nextState.base);
if (nextState.number != '0') // Here I'm using '0' to represent not pushing a number
stark.push(nextState.number);
if (nextState.setStateChange)
statechange = 1;
return PDA::IN_PROGRESS;
Note that you'll either need to use std::map::find()
rather than the operator
that I used if you haven't sanitized the input, or you'll need to sanitize the input so you never get an invalid state value.
answered Feb 2 at 6:53
user1118321
10.2k11144
10.2k11144
add a comment |Â
add a comment |Â
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f186161%2fpush-down-automaton-to-detect-rna-hairpins%23new-answer', 'question_page');
);
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password