Python - basis of a parser library, followup

.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;

up vote
6
down vote

favorite

This is a followup to this question: link. As advised there, I've rewritten the State class, which also simplified some of the other code.

I'm writing a parser library. I would like to know if the basis of it is sound and whether it can be improved in any way. The entire code can be seen in this repo: link, in the frozen branch review-11-02-2018 (file core.py). I'll post relevant portions below.

The library is written around two things: a State class, and a notion of an effect. State objects represent current state of a parser chain, and keep track of the input that's left to parse and a portion of input parsed by the last parser. A parser is just a callable that takes a State object and returns a new one (State objects are immutable). Output of one parser is passed to the next parser in a chain. A parser may also fail by throwing a ParsingFailure exception. Any parser in the chain may register an effect - a callable that takes an arbitrary value as the first argument and a State object as the second. If the chain succeeds, all effects registered during the parsing run are applied sequentially to a seed (with the seed or return value of the previous effect being the first argument, and the state of the chain at the moment of effect registration being the second), and the return value of the last effect becomes the output of entire chain, along with the final state. Is the concept sane? It works, but is it a reasonable way to do this?

State class is just a named tuple with some extra methods and is defined as follows:

class State(namedtuple("State", "string effect left_start left_end parsed_start parsed_end")):
 """
 State objects represent current state of a parser chain (or an individual
 parser).

 State objects provide two views over the input string: 'left', which spans
 a substring between 'left_start' and 'left_end' and represents unparsed
 input left from the previous parser, and 'parsed', which spans a substring
 between 'parsed_start' and 'parsed_end' and represents a portion of input
 the previous parser has parsed. Windows may overlap, and usually
 'parsed_end' and 'left_start' are the same, but not always.

 A State object is immutable and has following fields:
 * string (str): the input the parser chain is supposed to parse.
 * effect ((value, state) -> value): if the chain is successful, this will
 be called in sequence with other effects from the chain to form the
 chain's output value.
 * left_start, left_end (int): see above about the 'left' window.
 * parsed_start, parser_end (int): see above about the 'parsed' window.

 State objects are just named tuples, so they support a very convenient
 '_replace' method. !Note!: to avoid duplicating effects accidentally,
 '_replace' treats lack of 'effect' in its arguments as 'effect=None'. So if
 you want to copy an effect from another parser, you have to do it
 explicitly.

 State objects' constructor takes the following arguments:
 1. string - the input.
 2. effect=None - the effect, transformation to be performed on success of
 the last parser.
 3. start=0 - will be translated into 'left_start'
 4. end=None - will be translated into 'left_end'. If set to None,
 'left_end' will be set to the length of the input.
 State objects created via this constructor have both 'parsed_start' and
 'parsed_end' set to 'left_start'.

 State objects have several properties:
 * left - returns a slice of input that's left to parse.
 * left_len - returns the length of the above slice without computing the
 slice itself.
 * parsed - returns a slice of input that's been parsed.
 * parsed_len - returns the length of the above slice, again without
 computing the slice.

 Finally, State objects have following public methods:
 * consume(how_many) - move 'how_many' characters from the left window into
 the parsed window. Raise ValueError if more input was consumed than left.
 * split(at) - split the State in two (and return them). The first keeps
 the input up to, but not including, 'at' as its 'left' window, the second
 gets the rest. Both have their 'parsed' windows reset to an empty string.
 The first gets 'effect' of the original, the second gets None.
 """

 __slots__ = 

 def __new__(cls, string, effect=None, start=0, end=None):
 if end is None:
 end = len(string)
 assert 0 <= start <= end <= len(string)
 return super().__new__(cls, string, effect, start, end, start, start)

 def _replace(self, **kwargs):
 if "effect" not in kwargs:
 return super()._replace(effect=None, **kwargs)
 return super()._replace(**kwargs)

 def consume(self, how_many):
 """
 Return a new State object with 'how_many' characters consumed and moved
 to the 'parsed' window.

 Raise ValueError if 'how_many' is negative or if consuming more
 characters than left in the 'left' window.
 """
 if how_many < 0:
 raise ValueError("Negative number of consumed characters")
 left_start = self.left_start + how_many
 parsed_start = self.left_start
 parsed_end = parsed_start + how_many
 if left_start > self.left_end:
 raise ValueError("Consumed more characters than fits in the 'left' window")
 return self._replace(left_start=left_start, parsed_start=parsed_start,
 parsed_end=parsed_end)

 def split(self, at):
 """
 Split the State in two. The first one keeps a portion of input up to
 'at'th character (exclusive), the second one gets the rest. Both have
 'parsed' window reset to an empty string. First one gets the effect of
 the original, the second one gets None.
 """
 split_point = self.left_start + at
 first = self._replace(effect=self.effect,
 left_end=split_point,
 parsed_start=self.left_start,
 parsed_end=self.left_start)
 second = self._replace(effect=None,
 left_start=split_point,
 parsed_start=split_point,
 parsed_end=split_point)
 return first, second

 @property
 def left(self):
 """
 Return the portion of input the last parser hasn't consumed.
 """
 return self.string[self.left_start:self.left_end]

 @property
 def left_len(self):
 """
 Return the length of the portion of input the last parser hasn't
 consumed.
 """
 return self.left_end - self.left_start

 @property
 def parsed(self):
 """
 Return the string parsed by the last parser.
 """
 return self.string[self.parsed_start:self.parsed_end]

 @property
 def parsed_len(self):
 """
 Return the length of the string parsed by the last parser.
 """
 return self.parsed_end - self.parsed_start

Hopefully docstrings are descriptive enough. If any further clarification is needed, please tell me, I'll edit it in.

Another important thing is parse function, which the user is supposed to call with a parser instead of calling parsers directly. Here it is:

def parse(seed, state_or_string, parser, verbose=False):
 """
 Run a given parser on a given state object or a string, then apply combined
 chain or parser's effects to 'seed' and return a tuple
 (seed after effects, final state).

 On failure, return None unless 'verbose' is truthy, in which case return
 the ParsingFailure exception that has terminated the parsing process.
 """
 if isinstance(state_or_string, str):
 state = State(state_or_string)
 else:
 state = state_or_string
 try:
 after = parser(state)
 if after.effect is not None:
 return after.effect(seed, after), after
 return seed, after
 except ParsingFailure as failure:
 if verbose:
 return failure
 return None
 except ParsingEnd as end:
 if end.state.effect is not None:
 return end.state.effect(seed, end.state), end.state
 return seed, end.state

Another important thing is chain parser generator, which performs the chaining logic described above, but I don't want to post it here, because a) the question is bloated already, b) it also deals with lookahead, which seems to me a matter worth asking a separate question.

If you've read this far, thank you! Any suggestions on improving the library?

asked Feb 11 at 15:14

Michail

2648

add a commentÂ |Â

up vote
6
down vote

favorite

This is a followup to this question: link. As advised there, I've rewritten the State class, which also simplified some of the other code.

State class is just a named tuple with some extra methods and is defined as follows:

class State(namedtuple("State", "string effect left_start left_end parsed_start parsed_end")):
 """
 State objects represent current state of a parser chain (or an individual
 parser).

 State objects provide two views over the input string: 'left', which spans
 a substring between 'left_start' and 'left_end' and represents unparsed
 input left from the previous parser, and 'parsed', which spans a substring
 between 'parsed_start' and 'parsed_end' and represents a portion of input
 the previous parser has parsed. Windows may overlap, and usually
 'parsed_end' and 'left_start' are the same, but not always.

 A State object is immutable and has following fields:
 * string (str): the input the parser chain is supposed to parse.
 * effect ((value, state) -> value): if the chain is successful, this will
 be called in sequence with other effects from the chain to form the
 chain's output value.
 * left_start, left_end (int): see above about the 'left' window.
 * parsed_start, parser_end (int): see above about the 'parsed' window.

 State objects are just named tuples, so they support a very convenient
 '_replace' method. !Note!: to avoid duplicating effects accidentally,
 '_replace' treats lack of 'effect' in its arguments as 'effect=None'. So if
 you want to copy an effect from another parser, you have to do it
 explicitly.

 State objects' constructor takes the following arguments:
 1. string - the input.
 2. effect=None - the effect, transformation to be performed on success of
 the last parser.
 3. start=0 - will be translated into 'left_start'
 4. end=None - will be translated into 'left_end'. If set to None,
 'left_end' will be set to the length of the input.
 State objects created via this constructor have both 'parsed_start' and
 'parsed_end' set to 'left_start'.

 State objects have several properties:
 * left - returns a slice of input that's left to parse.
 * left_len - returns the length of the above slice without computing the
 slice itself.
 * parsed - returns a slice of input that's been parsed.
 * parsed_len - returns the length of the above slice, again without
 computing the slice.

 Finally, State objects have following public methods:
 * consume(how_many) - move 'how_many' characters from the left window into
 the parsed window. Raise ValueError if more input was consumed than left.
 * split(at) - split the State in two (and return them). The first keeps
 the input up to, but not including, 'at' as its 'left' window, the second
 gets the rest. Both have their 'parsed' windows reset to an empty string.
 The first gets 'effect' of the original, the second gets None.
 """

 __slots__ = 

 def __new__(cls, string, effect=None, start=0, end=None):
 if end is None:
 end = len(string)
 assert 0 <= start <= end <= len(string)
 return super().__new__(cls, string, effect, start, end, start, start)

 def _replace(self, **kwargs):
 if "effect" not in kwargs:
 return super()._replace(effect=None, **kwargs)
 return super()._replace(**kwargs)

 def consume(self, how_many):
 """
 Return a new State object with 'how_many' characters consumed and moved
 to the 'parsed' window.

 Raise ValueError if 'how_many' is negative or if consuming more
 characters than left in the 'left' window.
 """
 if how_many < 0:
 raise ValueError("Negative number of consumed characters")
 left_start = self.left_start + how_many
 parsed_start = self.left_start
 parsed_end = parsed_start + how_many
 if left_start > self.left_end:
 raise ValueError("Consumed more characters than fits in the 'left' window")
 return self._replace(left_start=left_start, parsed_start=parsed_start,
 parsed_end=parsed_end)

 def split(self, at):
 """
 Split the State in two. The first one keeps a portion of input up to
 'at'th character (exclusive), the second one gets the rest. Both have
 'parsed' window reset to an empty string. First one gets the effect of
 the original, the second one gets None.
 """
 split_point = self.left_start + at
 first = self._replace(effect=self.effect,
 left_end=split_point,
 parsed_start=self.left_start,
 parsed_end=self.left_start)
 second = self._replace(effect=None,
 left_start=split_point,
 parsed_start=split_point,
 parsed_end=split_point)
 return first, second

 @property
 def left(self):
 """
 Return the portion of input the last parser hasn't consumed.
 """
 return self.string[self.left_start:self.left_end]

 @property
 def left_len(self):
 """
 Return the length of the portion of input the last parser hasn't
 consumed.
 """
 return self.left_end - self.left_start

 @property
 def parsed(self):
 """
 Return the string parsed by the last parser.
 """
 return self.string[self.parsed_start:self.parsed_end]

 @property
 def parsed_len(self):
 """
 Return the length of the string parsed by the last parser.
 """
 return self.parsed_end - self.parsed_start

Hopefully docstrings are descriptive enough. If any further clarification is needed, please tell me, I'll edit it in.

Another important thing is parse function, which the user is supposed to call with a parser instead of calling parsers directly. Here it is:

def parse(seed, state_or_string, parser, verbose=False):
 """
 Run a given parser on a given state object or a string, then apply combined
 chain or parser's effects to 'seed' and return a tuple
 (seed after effects, final state).

 On failure, return None unless 'verbose' is truthy, in which case return
 the ParsingFailure exception that has terminated the parsing process.
 """
 if isinstance(state_or_string, str):
 state = State(state_or_string)
 else:
 state = state_or_string
 try:
 after = parser(state)
 if after.effect is not None:
 return after.effect(seed, after), after
 return seed, after
 except ParsingFailure as failure:
 if verbose:
 return failure
 return None
 except ParsingEnd as end:
 if end.state.effect is not None:
 return end.state.effect(seed, end.state), end.state
 return seed, end.state

If you've read this far, thank you! Any suggestions on improving the library?

asked Feb 11 at 15:14

Michail

2648

add a commentÂ |Â

up vote
6
down vote

favorite

This is a followup to this question: link. As advised there, I've rewritten the State class, which also simplified some of the other code.

State class is just a named tuple with some extra methods and is defined as follows:

class State(namedtuple("State", "string effect left_start left_end parsed_start parsed_end")):
 """
 State objects represent current state of a parser chain (or an individual
 parser).

 State objects provide two views over the input string: 'left', which spans
 a substring between 'left_start' and 'left_end' and represents unparsed
 input left from the previous parser, and 'parsed', which spans a substring
 between 'parsed_start' and 'parsed_end' and represents a portion of input
 the previous parser has parsed. Windows may overlap, and usually
 'parsed_end' and 'left_start' are the same, but not always.

 A State object is immutable and has following fields:
 * string (str): the input the parser chain is supposed to parse.
 * effect ((value, state) -> value): if the chain is successful, this will
 be called in sequence with other effects from the chain to form the
 chain's output value.
 * left_start, left_end (int): see above about the 'left' window.
 * parsed_start, parser_end (int): see above about the 'parsed' window.

 State objects are just named tuples, so they support a very convenient
 '_replace' method. !Note!: to avoid duplicating effects accidentally,
 '_replace' treats lack of 'effect' in its arguments as 'effect=None'. So if
 you want to copy an effect from another parser, you have to do it
 explicitly.

 State objects' constructor takes the following arguments:
 1. string - the input.
 2. effect=None - the effect, transformation to be performed on success of
 the last parser.
 3. start=0 - will be translated into 'left_start'
 4. end=None - will be translated into 'left_end'. If set to None,
 'left_end' will be set to the length of the input.
 State objects created via this constructor have both 'parsed_start' and
 'parsed_end' set to 'left_start'.

 State objects have several properties:
 * left - returns a slice of input that's left to parse.
 * left_len - returns the length of the above slice without computing the
 slice itself.
 * parsed - returns a slice of input that's been parsed.
 * parsed_len - returns the length of the above slice, again without
 computing the slice.

 Finally, State objects have following public methods:
 * consume(how_many) - move 'how_many' characters from the left window into
 the parsed window. Raise ValueError if more input was consumed than left.
 * split(at) - split the State in two (and return them). The first keeps
 the input up to, but not including, 'at' as its 'left' window, the second
 gets the rest. Both have their 'parsed' windows reset to an empty string.
 The first gets 'effect' of the original, the second gets None.
 """

 __slots__ = 

 def __new__(cls, string, effect=None, start=0, end=None):
 if end is None:
 end = len(string)
 assert 0 <= start <= end <= len(string)
 return super().__new__(cls, string, effect, start, end, start, start)

 def _replace(self, **kwargs):
 if "effect" not in kwargs:
 return super()._replace(effect=None, **kwargs)
 return super()._replace(**kwargs)

 def consume(self, how_many):
 """
 Return a new State object with 'how_many' characters consumed and moved
 to the 'parsed' window.

 Raise ValueError if 'how_many' is negative or if consuming more
 characters than left in the 'left' window.
 """
 if how_many < 0:
 raise ValueError("Negative number of consumed characters")
 left_start = self.left_start + how_many
 parsed_start = self.left_start
 parsed_end = parsed_start + how_many
 if left_start > self.left_end:
 raise ValueError("Consumed more characters than fits in the 'left' window")
 return self._replace(left_start=left_start, parsed_start=parsed_start,
 parsed_end=parsed_end)

 def split(self, at):
 """
 Split the State in two. The first one keeps a portion of input up to
 'at'th character (exclusive), the second one gets the rest. Both have
 'parsed' window reset to an empty string. First one gets the effect of
 the original, the second one gets None.
 """
 split_point = self.left_start + at
 first = self._replace(effect=self.effect,
 left_end=split_point,
 parsed_start=self.left_start,
 parsed_end=self.left_start)
 second = self._replace(effect=None,
 left_start=split_point,
 parsed_start=split_point,
 parsed_end=split_point)
 return first, second

 @property
 def left(self):
 """
 Return the portion of input the last parser hasn't consumed.
 """
 return self.string[self.left_start:self.left_end]

 @property
 def left_len(self):
 """
 Return the length of the portion of input the last parser hasn't
 consumed.
 """
 return self.left_end - self.left_start

 @property
 def parsed(self):
 """
 Return the string parsed by the last parser.
 """
 return self.string[self.parsed_start:self.parsed_end]

 @property
 def parsed_len(self):
 """
 Return the length of the string parsed by the last parser.
 """
 return self.parsed_end - self.parsed_start

Hopefully docstrings are descriptive enough. If any further clarification is needed, please tell me, I'll edit it in.

Another important thing is parse function, which the user is supposed to call with a parser instead of calling parsers directly. Here it is:

def parse(seed, state_or_string, parser, verbose=False):
 """
 Run a given parser on a given state object or a string, then apply combined
 chain or parser's effects to 'seed' and return a tuple
 (seed after effects, final state).

 On failure, return None unless 'verbose' is truthy, in which case return
 the ParsingFailure exception that has terminated the parsing process.
 """
 if isinstance(state_or_string, str):
 state = State(state_or_string)
 else:
 state = state_or_string
 try:
 after = parser(state)
 if after.effect is not None:
 return after.effect(seed, after), after
 return seed, after
 except ParsingFailure as failure:
 if verbose:
 return failure
 return None
 except ParsingEnd as end:
 if end.state.effect is not None:
 return end.state.effect(seed, end.state), end.state
 return seed, end.state

If you've read this far, thank you! Any suggestions on improving the library?

asked Feb 11 at 15:14

Michail

2648

This is a followup to this question: link. As advised there, I've rewritten the State class, which also simplified some of the other code.

State class is just a named tuple with some extra methods and is defined as follows:

class State(namedtuple("State", "string effect left_start left_end parsed_start parsed_end")):
 """
 State objects represent current state of a parser chain (or an individual
 parser).

 State objects provide two views over the input string: 'left', which spans
 a substring between 'left_start' and 'left_end' and represents unparsed
 input left from the previous parser, and 'parsed', which spans a substring
 between 'parsed_start' and 'parsed_end' and represents a portion of input
 the previous parser has parsed. Windows may overlap, and usually
 'parsed_end' and 'left_start' are the same, but not always.

 A State object is immutable and has following fields:
 * string (str): the input the parser chain is supposed to parse.
 * effect ((value, state) -> value): if the chain is successful, this will
 be called in sequence with other effects from the chain to form the
 chain's output value.
 * left_start, left_end (int): see above about the 'left' window.
 * parsed_start, parser_end (int): see above about the 'parsed' window.

 State objects are just named tuples, so they support a very convenient
 '_replace' method. !Note!: to avoid duplicating effects accidentally,
 '_replace' treats lack of 'effect' in its arguments as 'effect=None'. So if
 you want to copy an effect from another parser, you have to do it
 explicitly.

 State objects' constructor takes the following arguments:
 1. string - the input.
 2. effect=None - the effect, transformation to be performed on success of
 the last parser.
 3. start=0 - will be translated into 'left_start'
 4. end=None - will be translated into 'left_end'. If set to None,
 'left_end' will be set to the length of the input.
 State objects created via this constructor have both 'parsed_start' and
 'parsed_end' set to 'left_start'.

 State objects have several properties:
 * left - returns a slice of input that's left to parse.
 * left_len - returns the length of the above slice without computing the
 slice itself.
 * parsed - returns a slice of input that's been parsed.
 * parsed_len - returns the length of the above slice, again without
 computing the slice.

 Finally, State objects have following public methods:
 * consume(how_many) - move 'how_many' characters from the left window into
 the parsed window. Raise ValueError if more input was consumed than left.
 * split(at) - split the State in two (and return them). The first keeps
 the input up to, but not including, 'at' as its 'left' window, the second
 gets the rest. Both have their 'parsed' windows reset to an empty string.
 The first gets 'effect' of the original, the second gets None.
 """

 __slots__ = 

 def __new__(cls, string, effect=None, start=0, end=None):
 if end is None:
 end = len(string)
 assert 0 <= start <= end <= len(string)
 return super().__new__(cls, string, effect, start, end, start, start)

 def _replace(self, **kwargs):
 if "effect" not in kwargs:
 return super()._replace(effect=None, **kwargs)
 return super()._replace(**kwargs)

 def consume(self, how_many):
 """
 Return a new State object with 'how_many' characters consumed and moved
 to the 'parsed' window.

 Raise ValueError if 'how_many' is negative or if consuming more
 characters than left in the 'left' window.
 """
 if how_many < 0:
 raise ValueError("Negative number of consumed characters")
 left_start = self.left_start + how_many
 parsed_start = self.left_start
 parsed_end = parsed_start + how_many
 if left_start > self.left_end:
 raise ValueError("Consumed more characters than fits in the 'left' window")
 return self._replace(left_start=left_start, parsed_start=parsed_start,
 parsed_end=parsed_end)

 def split(self, at):
 """
 Split the State in two. The first one keeps a portion of input up to
 'at'th character (exclusive), the second one gets the rest. Both have
 'parsed' window reset to an empty string. First one gets the effect of
 the original, the second one gets None.
 """
 split_point = self.left_start + at
 first = self._replace(effect=self.effect,
 left_end=split_point,
 parsed_start=self.left_start,
 parsed_end=self.left_start)
 second = self._replace(effect=None,
 left_start=split_point,
 parsed_start=split_point,
 parsed_end=split_point)
 return first, second

 @property
 def left(self):
 """
 Return the portion of input the last parser hasn't consumed.
 """
 return self.string[self.left_start:self.left_end]

 @property
 def left_len(self):
 """
 Return the length of the portion of input the last parser hasn't
 consumed.
 """
 return self.left_end - self.left_start

 @property
 def parsed(self):
 """
 Return the string parsed by the last parser.
 """
 return self.string[self.parsed_start:self.parsed_end]

 @property
 def parsed_len(self):
 """
 Return the length of the string parsed by the last parser.
 """
 return self.parsed_end - self.parsed_start

Hopefully docstrings are descriptive enough. If any further clarification is needed, please tell me, I'll edit it in.

Another important thing is parse function, which the user is supposed to call with a parser instead of calling parsers directly. Here it is:

def parse(seed, state_or_string, parser, verbose=False):
 """
 Run a given parser on a given state object or a string, then apply combined
 chain or parser's effects to 'seed' and return a tuple
 (seed after effects, final state).

 On failure, return None unless 'verbose' is truthy, in which case return
 the ParsingFailure exception that has terminated the parsing process.
 """
 if isinstance(state_or_string, str):
 state = State(state_or_string)
 else:
 state = state_or_string
 try:
 after = parser(state)
 if after.effect is not None:
 return after.effect(seed, after), after
 return seed, after
 except ParsingFailure as failure:
 if verbose:
 return failure
 return None
 except ParsingEnd as end:
 if end.state.effect is not None:
 return end.state.effect(seed, end.state), end.state
 return seed, end.state

If you've read this far, thank you! Any suggestions on improving the library?

asked Feb 11 at 15:14

Michail

2648

asked Feb 11 at 15:14

Michail

2648

asked Feb 11 at 15:14

Michail

2648

asked Feb 11 at 15:14

Michail

2648

add a commentÂ |Â

1 Answer
1

active

oldest

votes

up vote
1
down vote

accepted

Following up on the previous comments they made about the parser, I will focus on code legibility

You may consider moving some paragraphs of the class docstring to the relevant part of the code, like the paragraph about A State object is immutable and has following fields may suit better on the __new__ method, documenting as well the parameter

Also, don't forget to document the parameters of methods/functions, will help a lot knowing what your parser is doing when invoking them

Parameters like at might be better named as index which is universally used for it.

Parameters like how_many may be better named as characters_count or consumed_characters_count or similar, simply by reading it gives better insight on what the parameter refers to

if verbose:
 return failure
return None

This can be translated to ternary operator

return failure if verbose else None

For the following small methods, I would consider better naming as well, like this one

def left(self):
 """
 Return the portion of input the last parser hasn't consumed.
 """

Why not calling it portion_not_consumed or something on that way?

Just imagine you don't know anything about a library, and you find a method called left_len. Unless you read the docstring, is hard to tell what is doing

You can as well consider putting some spacing between the lines of a same function, to improve a bit readability between the sections of it

Keep the good work!

answered Feb 17 at 21:11

A. Romeu

949313

add a commentÂ |Â

Your Answer

StackExchange.ifUsing("editor", function ()
return StackExchange.using("mathjaxEditing", function ()
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
);
);
, "mathjax-editing");

StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "196"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
convertImagesToLinks: false,
noModals: false,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f187315%2fpython-basis-of-a-parser-library-followup%23new-answer', 'question_page');

);

Post as a guest

Name

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

up vote
1
down vote

accepted

Following up on the previous comments they made about the parser, I will focus on code legibility

Also, don't forget to document the parameters of methods/functions, will help a lot knowing what your parser is doing when invoking them

Parameters like at might be better named as index which is universally used for it.

Parameters like how_many may be better named as characters_count or consumed_characters_count or similar, simply by reading it gives better insight on what the parameter refers to

if verbose:
 return failure
return None

This can be translated to ternary operator

return failure if verbose else None

For the following small methods, I would consider better naming as well, like this one

def left(self):
 """
 Return the portion of input the last parser hasn't consumed.
 """

Why not calling it portion_not_consumed or something on that way?

Just imagine you don't know anything about a library, and you find a method called left_len. Unless you read the docstring, is hard to tell what is doing

You can as well consider putting some spacing between the lines of a same function, to improve a bit readability between the sections of it

Keep the good work!

answered Feb 17 at 21:11

A. Romeu

949313

add a commentÂ |Â

up vote
1
down vote

accepted

Following up on the previous comments they made about the parser, I will focus on code legibility

Also, don't forget to document the parameters of methods/functions, will help a lot knowing what your parser is doing when invoking them

Parameters like at might be better named as index which is universally used for it.

Parameters like how_many may be better named as characters_count or consumed_characters_count or similar, simply by reading it gives better insight on what the parameter refers to

if verbose:
 return failure
return None

This can be translated to ternary operator

return failure if verbose else None

For the following small methods, I would consider better naming as well, like this one

def left(self):
 """
 Return the portion of input the last parser hasn't consumed.
 """

Why not calling it portion_not_consumed or something on that way?

Just imagine you don't know anything about a library, and you find a method called left_len. Unless you read the docstring, is hard to tell what is doing

You can as well consider putting some spacing between the lines of a same function, to improve a bit readability between the sections of it

Keep the good work!

answered Feb 17 at 21:11

A. Romeu

949313

add a commentÂ |Â

up vote
1
down vote

accepted

Following up on the previous comments they made about the parser, I will focus on code legibility

Also, don't forget to document the parameters of methods/functions, will help a lot knowing what your parser is doing when invoking them

Parameters like at might be better named as index which is universally used for it.

Parameters like how_many may be better named as characters_count or consumed_characters_count or similar, simply by reading it gives better insight on what the parameter refers to

if verbose:
 return failure
return None

This can be translated to ternary operator

return failure if verbose else None

For the following small methods, I would consider better naming as well, like this one

def left(self):
 """
 Return the portion of input the last parser hasn't consumed.
 """

Why not calling it portion_not_consumed or something on that way?

Just imagine you don't know anything about a library, and you find a method called left_len. Unless you read the docstring, is hard to tell what is doing

You can as well consider putting some spacing between the lines of a same function, to improve a bit readability between the sections of it

Keep the good work!

answered Feb 17 at 21:11

A. Romeu

949313

Following up on the previous comments they made about the parser, I will focus on code legibility

Also, don't forget to document the parameters of methods/functions, will help a lot knowing what your parser is doing when invoking them

Parameters like at might be better named as index which is universally used for it.

Parameters like how_many may be better named as characters_count or consumed_characters_count or similar, simply by reading it gives better insight on what the parameter refers to

if verbose:
 return failure
return None

This can be translated to ternary operator

return failure if verbose else None

For the following small methods, I would consider better naming as well, like this one

def left(self):
 """
 Return the portion of input the last parser hasn't consumed.
 """

Why not calling it portion_not_consumed or something on that way?

Just imagine you don't know anything about a library, and you find a method called left_len. Unless you read the docstring, is hard to tell what is doing

You can as well consider putting some spacing between the lines of a same function, to improve a bit readability between the sections of it

Keep the good work!

answered Feb 17 at 21:11

A. Romeu

949313

answered Feb 17 at 21:11

A. Romeu

949313

answered Feb 17 at 21:11

A. Romeu

949313

answered Feb 17 at 21:11

A. Romeu

949313

add a commentÂ |Â

draft saved

draft discarded

draft saved

draft discarded

Post as a guest

Name

搜尋此網誌

trjhtr