Finding the best terms to use
Clash Royale CLAN TAG#URR8PPP
.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;
up vote
5
down vote
favorite
I have a real-world problem involving finding the best known term from a string containing sets of terms. We have Term
objects that are comparable in that they have a method a_term.is_better_than(other_term)
. This method is Boolean -- if it returns False
, then a_term
may be as good as other_term
, but isn't better.
Using this method, I need to pare down a set of terms to the best ones. The best way I've been able to think about this is to compare pairs of terms and eliminate any terms when one isn't as good as the other, something like this:
def get_best_terms(terms: Set[Term]): Set[Term]
best_terms = set()
m_terms = list(terms)
while m_terms:
term = m_terms.pop()
for i in reversed(range(len(m_terms))):
if term.is_better_than(m_terms[i]):
del m_terms[i]
if m_terms[i].is_better_than(term):
break
else: # We never encountered the `break` statement
best_terms.add(term)
break
best_terms |= a_term for a_term in m_terms
if not (term.is_better_than(a_term) or
a_term.is_better_than(term))
return best_terms
This approach is functional, but pretty gross in my opinion. For one thing, there's a lot of mutation going on here. For another, once I've identified a single best term, finding all the terms that are equally good is messy. The whole thing is hard to read.
What do you suggest?
Examples
A Term is a string with some metadata attached. For example,
a = AutoTerm(term='honda', term_span=1, term_range=(1, 2),
original_term='CAR HONDA ACC...')
b = AutoTerm(term='2015 honda accord', term_span=2, term_range=(1, 3),
original_term='2015 HONDA ACC...')
c = AutoTerm(term='2015 accord', ...)
Sometimes a term is a better choice because it's more specific, but there are other nuances, such as a match in a database. Let's say, arbitrarily, that b and c are equally good, but a
is not as good. Then
assert get_best_terms(a, b, c) == b, c
should pass.
python algorithm
add a comment |Â
up vote
5
down vote
favorite
I have a real-world problem involving finding the best known term from a string containing sets of terms. We have Term
objects that are comparable in that they have a method a_term.is_better_than(other_term)
. This method is Boolean -- if it returns False
, then a_term
may be as good as other_term
, but isn't better.
Using this method, I need to pare down a set of terms to the best ones. The best way I've been able to think about this is to compare pairs of terms and eliminate any terms when one isn't as good as the other, something like this:
def get_best_terms(terms: Set[Term]): Set[Term]
best_terms = set()
m_terms = list(terms)
while m_terms:
term = m_terms.pop()
for i in reversed(range(len(m_terms))):
if term.is_better_than(m_terms[i]):
del m_terms[i]
if m_terms[i].is_better_than(term):
break
else: # We never encountered the `break` statement
best_terms.add(term)
break
best_terms |= a_term for a_term in m_terms
if not (term.is_better_than(a_term) or
a_term.is_better_than(term))
return best_terms
This approach is functional, but pretty gross in my opinion. For one thing, there's a lot of mutation going on here. For another, once I've identified a single best term, finding all the terms that are equally good is messy. The whole thing is hard to read.
What do you suggest?
Examples
A Term is a string with some metadata attached. For example,
a = AutoTerm(term='honda', term_span=1, term_range=(1, 2),
original_term='CAR HONDA ACC...')
b = AutoTerm(term='2015 honda accord', term_span=2, term_range=(1, 3),
original_term='2015 HONDA ACC...')
c = AutoTerm(term='2015 accord', ...)
Sometimes a term is a better choice because it's more specific, but there are other nuances, such as a match in a database. Let's say, arbitrarily, that b and c are equally good, but a
is not as good. Then
assert get_best_terms(a, b, c) == b, c
should pass.
python algorithm
Can you provide an example ofTerm
and some example IO?
â Peilonrayz
Jul 31 at 13:57
@Peilonrayz done, thanks
â kojiro
Jul 31 at 14:25
Is the ranking of Terms transitive? If Term A is better than Term B, and Term B is better than Term C, is Term A guaranteed to be better than Term C?
â David Conrad
Jul 31 at 20:01
1
@DavidConrad yes.
â kojiro
Jul 31 at 20:06
add a comment |Â
up vote
5
down vote
favorite
up vote
5
down vote
favorite
I have a real-world problem involving finding the best known term from a string containing sets of terms. We have Term
objects that are comparable in that they have a method a_term.is_better_than(other_term)
. This method is Boolean -- if it returns False
, then a_term
may be as good as other_term
, but isn't better.
Using this method, I need to pare down a set of terms to the best ones. The best way I've been able to think about this is to compare pairs of terms and eliminate any terms when one isn't as good as the other, something like this:
def get_best_terms(terms: Set[Term]): Set[Term]
best_terms = set()
m_terms = list(terms)
while m_terms:
term = m_terms.pop()
for i in reversed(range(len(m_terms))):
if term.is_better_than(m_terms[i]):
del m_terms[i]
if m_terms[i].is_better_than(term):
break
else: # We never encountered the `break` statement
best_terms.add(term)
break
best_terms |= a_term for a_term in m_terms
if not (term.is_better_than(a_term) or
a_term.is_better_than(term))
return best_terms
This approach is functional, but pretty gross in my opinion. For one thing, there's a lot of mutation going on here. For another, once I've identified a single best term, finding all the terms that are equally good is messy. The whole thing is hard to read.
What do you suggest?
Examples
A Term is a string with some metadata attached. For example,
a = AutoTerm(term='honda', term_span=1, term_range=(1, 2),
original_term='CAR HONDA ACC...')
b = AutoTerm(term='2015 honda accord', term_span=2, term_range=(1, 3),
original_term='2015 HONDA ACC...')
c = AutoTerm(term='2015 accord', ...)
Sometimes a term is a better choice because it's more specific, but there are other nuances, such as a match in a database. Let's say, arbitrarily, that b and c are equally good, but a
is not as good. Then
assert get_best_terms(a, b, c) == b, c
should pass.
python algorithm
I have a real-world problem involving finding the best known term from a string containing sets of terms. We have Term
objects that are comparable in that they have a method a_term.is_better_than(other_term)
. This method is Boolean -- if it returns False
, then a_term
may be as good as other_term
, but isn't better.
Using this method, I need to pare down a set of terms to the best ones. The best way I've been able to think about this is to compare pairs of terms and eliminate any terms when one isn't as good as the other, something like this:
def get_best_terms(terms: Set[Term]): Set[Term]
best_terms = set()
m_terms = list(terms)
while m_terms:
term = m_terms.pop()
for i in reversed(range(len(m_terms))):
if term.is_better_than(m_terms[i]):
del m_terms[i]
if m_terms[i].is_better_than(term):
break
else: # We never encountered the `break` statement
best_terms.add(term)
break
best_terms |= a_term for a_term in m_terms
if not (term.is_better_than(a_term) or
a_term.is_better_than(term))
return best_terms
This approach is functional, but pretty gross in my opinion. For one thing, there's a lot of mutation going on here. For another, once I've identified a single best term, finding all the terms that are equally good is messy. The whole thing is hard to read.
What do you suggest?
Examples
A Term is a string with some metadata attached. For example,
a = AutoTerm(term='honda', term_span=1, term_range=(1, 2),
original_term='CAR HONDA ACC...')
b = AutoTerm(term='2015 honda accord', term_span=2, term_range=(1, 3),
original_term='2015 HONDA ACC...')
c = AutoTerm(term='2015 accord', ...)
Sometimes a term is a better choice because it's more specific, but there are other nuances, such as a match in a database. Let's say, arbitrarily, that b and c are equally good, but a
is not as good. Then
assert get_best_terms(a, b, c) == b, c
should pass.
python algorithm
edited Jul 31 at 14:25
asked Jul 31 at 13:17
kojiro
1,30211021
1,30211021
Can you provide an example ofTerm
and some example IO?
â Peilonrayz
Jul 31 at 13:57
@Peilonrayz done, thanks
â kojiro
Jul 31 at 14:25
Is the ranking of Terms transitive? If Term A is better than Term B, and Term B is better than Term C, is Term A guaranteed to be better than Term C?
â David Conrad
Jul 31 at 20:01
1
@DavidConrad yes.
â kojiro
Jul 31 at 20:06
add a comment |Â
Can you provide an example ofTerm
and some example IO?
â Peilonrayz
Jul 31 at 13:57
@Peilonrayz done, thanks
â kojiro
Jul 31 at 14:25
Is the ranking of Terms transitive? If Term A is better than Term B, and Term B is better than Term C, is Term A guaranteed to be better than Term C?
â David Conrad
Jul 31 at 20:01
1
@DavidConrad yes.
â kojiro
Jul 31 at 20:06
Can you provide an example of
Term
and some example IO?â Peilonrayz
Jul 31 at 13:57
Can you provide an example of
Term
and some example IO?â Peilonrayz
Jul 31 at 13:57
@Peilonrayz done, thanks
â kojiro
Jul 31 at 14:25
@Peilonrayz done, thanks
â kojiro
Jul 31 at 14:25
Is the ranking of Terms transitive? If Term A is better than Term B, and Term B is better than Term C, is Term A guaranteed to be better than Term C?
â David Conrad
Jul 31 at 20:01
Is the ranking of Terms transitive? If Term A is better than Term B, and Term B is better than Term C, is Term A guaranteed to be better than Term C?
â David Conrad
Jul 31 at 20:01
1
1
@DavidConrad yes.
â kojiro
Jul 31 at 20:06
@DavidConrad yes.
â kojiro
Jul 31 at 20:06
add a comment |Â
2 Answers
2
active
oldest
votes
up vote
5
down vote
accepted
- You have a bug, if you remove an item in your
for
loop without breaking you get anIndexError
.
Instead use aworse
set so that you know what to skip. - What you seem to be doing is performing a sort, and so if you make a
cmp
function you can usesorted
.
To do this you need to usefunctools.cmp_to_key
and then you canitertools.takewhile
.
And so I'd use:
import functools
import itertools
from typing import Set, List
class Term(int):
def is_better_than(self, other):
return self // 2 > other // 2
def cmp(self: Term, other: Term):
if self.is_better_than(other):
return 1
if other.is_better_than(self):
return -1
return 0
def get_best_terms(terms: Set[Term]) -> List[Term]:
if not terms:
return
terms = sorted(terms, key=functools.cmp_to_key(cmp), reverse=True)
best = terms[0]
return list(itertools.takewhile(lambda i: not best.is_better_than(i), terms))
print(get_best_terms(map(Term, [1, 2, 3, 4, 5])))
print(get_best_terms(map(Term, [1, 2, 3, 4])))
print(get_best_terms(map(Term, [5, 1, 2, 4, 3])))
print(get_best_terms(map(Term, [5, 1, 4, 3])))
This can however be simplified if Term
s are comparable, as you won't need to use key
and can remove the need for a lambda
.
@functools.total_ordering
class Term(int):
def __gt__(self, other: Term) -> bool:
return self // 2 > other // 2
def __eq__(self, other: Term) -> bool:
return self // 2 == other // 2
def get_best_terms(terms: Set[Term]) -> List[Term]:
if not terms:
return
terms = sorted(terms, reverse=True)
return list(itertools.takewhile(terms[0].__eq__, terms))
Yeah, this is great, thanks. I had thought about using total_ordering and sorting, but I was thinking I didn't actually need to sort. But in looking again, I think the gain in readability is worth every penny.
â kojiro
Jul 31 at 15:19
"in your for without breaking" for loop?
â Acccumulation
Jul 31 at 15:26
@Acccumulation Yes, I'll try make that clearer
â Peilonrayz
Jul 31 at 15:27
@kojiro You didn't say in your question but isTerm.is_better_than
costly? Because you could make the code run in $O(n)$ time, with a worst case $2n$ calls toTerm.is_best_terms
if they are all equal.
â Peilonrayz
Jul 31 at 15:34
It's not costly afaict. It is impossible to understand, though, which is why I am still having trouble writing a satisfactory__eq__
for total ordering. I'm also considering usingmax
and a filter instead oftakewhile
, which trades thesorted
for max 2x passes over the set.
â kojiro
Jul 31 at 15:36
 |Â
show 2 more comments
up vote
3
down vote
If all you want are the best terms, then it seems to me that you can just keep a list of the current best terms. Every time you find a term that at least as good as any one of the previous best terms, then, assuming transitivity, it must be at least as good as all the previous best terms. And since the previous best terms were at least as good as all the terms tested so far, the new term is at least as good as all the terms tested so for. Next you check whether the new term is strictly better than one of the current best terms. If so, then you can replace the list of best terms with a list containing just the new term. If not, then the current term is just as good as the current best terms, so you should append it to the list of best terms.
best_terms =
for term in terms:
if not best_terms:
best_terms = [term]
continue
if not best_terms[0].is_better_than(term):
if term.is_better_than(best_terms[0]):
best_terms = [term]
else
best_terms.append(term)
1
Please tell me you changed your account name just to give this answer, because that'd be about perfect.
â kojiro
Jul 31 at 15:47
Except "is_better_than" returns false if the terms are equally good. You seem to be considering otherwise...
â coconochao
Jul 31 at 20:14
add a comment |Â
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
5
down vote
accepted
- You have a bug, if you remove an item in your
for
loop without breaking you get anIndexError
.
Instead use aworse
set so that you know what to skip. - What you seem to be doing is performing a sort, and so if you make a
cmp
function you can usesorted
.
To do this you need to usefunctools.cmp_to_key
and then you canitertools.takewhile
.
And so I'd use:
import functools
import itertools
from typing import Set, List
class Term(int):
def is_better_than(self, other):
return self // 2 > other // 2
def cmp(self: Term, other: Term):
if self.is_better_than(other):
return 1
if other.is_better_than(self):
return -1
return 0
def get_best_terms(terms: Set[Term]) -> List[Term]:
if not terms:
return
terms = sorted(terms, key=functools.cmp_to_key(cmp), reverse=True)
best = terms[0]
return list(itertools.takewhile(lambda i: not best.is_better_than(i), terms))
print(get_best_terms(map(Term, [1, 2, 3, 4, 5])))
print(get_best_terms(map(Term, [1, 2, 3, 4])))
print(get_best_terms(map(Term, [5, 1, 2, 4, 3])))
print(get_best_terms(map(Term, [5, 1, 4, 3])))
This can however be simplified if Term
s are comparable, as you won't need to use key
and can remove the need for a lambda
.
@functools.total_ordering
class Term(int):
def __gt__(self, other: Term) -> bool:
return self // 2 > other // 2
def __eq__(self, other: Term) -> bool:
return self // 2 == other // 2
def get_best_terms(terms: Set[Term]) -> List[Term]:
if not terms:
return
terms = sorted(terms, reverse=True)
return list(itertools.takewhile(terms[0].__eq__, terms))
Yeah, this is great, thanks. I had thought about using total_ordering and sorting, but I was thinking I didn't actually need to sort. But in looking again, I think the gain in readability is worth every penny.
â kojiro
Jul 31 at 15:19
"in your for without breaking" for loop?
â Acccumulation
Jul 31 at 15:26
@Acccumulation Yes, I'll try make that clearer
â Peilonrayz
Jul 31 at 15:27
@kojiro You didn't say in your question but isTerm.is_better_than
costly? Because you could make the code run in $O(n)$ time, with a worst case $2n$ calls toTerm.is_best_terms
if they are all equal.
â Peilonrayz
Jul 31 at 15:34
It's not costly afaict. It is impossible to understand, though, which is why I am still having trouble writing a satisfactory__eq__
for total ordering. I'm also considering usingmax
and a filter instead oftakewhile
, which trades thesorted
for max 2x passes over the set.
â kojiro
Jul 31 at 15:36
 |Â
show 2 more comments
up vote
5
down vote
accepted
- You have a bug, if you remove an item in your
for
loop without breaking you get anIndexError
.
Instead use aworse
set so that you know what to skip. - What you seem to be doing is performing a sort, and so if you make a
cmp
function you can usesorted
.
To do this you need to usefunctools.cmp_to_key
and then you canitertools.takewhile
.
And so I'd use:
import functools
import itertools
from typing import Set, List
class Term(int):
def is_better_than(self, other):
return self // 2 > other // 2
def cmp(self: Term, other: Term):
if self.is_better_than(other):
return 1
if other.is_better_than(self):
return -1
return 0
def get_best_terms(terms: Set[Term]) -> List[Term]:
if not terms:
return
terms = sorted(terms, key=functools.cmp_to_key(cmp), reverse=True)
best = terms[0]
return list(itertools.takewhile(lambda i: not best.is_better_than(i), terms))
print(get_best_terms(map(Term, [1, 2, 3, 4, 5])))
print(get_best_terms(map(Term, [1, 2, 3, 4])))
print(get_best_terms(map(Term, [5, 1, 2, 4, 3])))
print(get_best_terms(map(Term, [5, 1, 4, 3])))
This can however be simplified if Term
s are comparable, as you won't need to use key
and can remove the need for a lambda
.
@functools.total_ordering
class Term(int):
def __gt__(self, other: Term) -> bool:
return self // 2 > other // 2
def __eq__(self, other: Term) -> bool:
return self // 2 == other // 2
def get_best_terms(terms: Set[Term]) -> List[Term]:
if not terms:
return
terms = sorted(terms, reverse=True)
return list(itertools.takewhile(terms[0].__eq__, terms))
Yeah, this is great, thanks. I had thought about using total_ordering and sorting, but I was thinking I didn't actually need to sort. But in looking again, I think the gain in readability is worth every penny.
â kojiro
Jul 31 at 15:19
"in your for without breaking" for loop?
â Acccumulation
Jul 31 at 15:26
@Acccumulation Yes, I'll try make that clearer
â Peilonrayz
Jul 31 at 15:27
@kojiro You didn't say in your question but isTerm.is_better_than
costly? Because you could make the code run in $O(n)$ time, with a worst case $2n$ calls toTerm.is_best_terms
if they are all equal.
â Peilonrayz
Jul 31 at 15:34
It's not costly afaict. It is impossible to understand, though, which is why I am still having trouble writing a satisfactory__eq__
for total ordering. I'm also considering usingmax
and a filter instead oftakewhile
, which trades thesorted
for max 2x passes over the set.
â kojiro
Jul 31 at 15:36
 |Â
show 2 more comments
up vote
5
down vote
accepted
up vote
5
down vote
accepted
- You have a bug, if you remove an item in your
for
loop without breaking you get anIndexError
.
Instead use aworse
set so that you know what to skip. - What you seem to be doing is performing a sort, and so if you make a
cmp
function you can usesorted
.
To do this you need to usefunctools.cmp_to_key
and then you canitertools.takewhile
.
And so I'd use:
import functools
import itertools
from typing import Set, List
class Term(int):
def is_better_than(self, other):
return self // 2 > other // 2
def cmp(self: Term, other: Term):
if self.is_better_than(other):
return 1
if other.is_better_than(self):
return -1
return 0
def get_best_terms(terms: Set[Term]) -> List[Term]:
if not terms:
return
terms = sorted(terms, key=functools.cmp_to_key(cmp), reverse=True)
best = terms[0]
return list(itertools.takewhile(lambda i: not best.is_better_than(i), terms))
print(get_best_terms(map(Term, [1, 2, 3, 4, 5])))
print(get_best_terms(map(Term, [1, 2, 3, 4])))
print(get_best_terms(map(Term, [5, 1, 2, 4, 3])))
print(get_best_terms(map(Term, [5, 1, 4, 3])))
This can however be simplified if Term
s are comparable, as you won't need to use key
and can remove the need for a lambda
.
@functools.total_ordering
class Term(int):
def __gt__(self, other: Term) -> bool:
return self // 2 > other // 2
def __eq__(self, other: Term) -> bool:
return self // 2 == other // 2
def get_best_terms(terms: Set[Term]) -> List[Term]:
if not terms:
return
terms = sorted(terms, reverse=True)
return list(itertools.takewhile(terms[0].__eq__, terms))
- You have a bug, if you remove an item in your
for
loop without breaking you get anIndexError
.
Instead use aworse
set so that you know what to skip. - What you seem to be doing is performing a sort, and so if you make a
cmp
function you can usesorted
.
To do this you need to usefunctools.cmp_to_key
and then you canitertools.takewhile
.
And so I'd use:
import functools
import itertools
from typing import Set, List
class Term(int):
def is_better_than(self, other):
return self // 2 > other // 2
def cmp(self: Term, other: Term):
if self.is_better_than(other):
return 1
if other.is_better_than(self):
return -1
return 0
def get_best_terms(terms: Set[Term]) -> List[Term]:
if not terms:
return
terms = sorted(terms, key=functools.cmp_to_key(cmp), reverse=True)
best = terms[0]
return list(itertools.takewhile(lambda i: not best.is_better_than(i), terms))
print(get_best_terms(map(Term, [1, 2, 3, 4, 5])))
print(get_best_terms(map(Term, [1, 2, 3, 4])))
print(get_best_terms(map(Term, [5, 1, 2, 4, 3])))
print(get_best_terms(map(Term, [5, 1, 4, 3])))
This can however be simplified if Term
s are comparable, as you won't need to use key
and can remove the need for a lambda
.
@functools.total_ordering
class Term(int):
def __gt__(self, other: Term) -> bool:
return self // 2 > other // 2
def __eq__(self, other: Term) -> bool:
return self // 2 == other // 2
def get_best_terms(terms: Set[Term]) -> List[Term]:
if not terms:
return
terms = sorted(terms, reverse=True)
return list(itertools.takewhile(terms[0].__eq__, terms))
edited Jul 31 at 15:27
answered Jul 31 at 14:41
Peilonrayz
24.3k336101
24.3k336101
Yeah, this is great, thanks. I had thought about using total_ordering and sorting, but I was thinking I didn't actually need to sort. But in looking again, I think the gain in readability is worth every penny.
â kojiro
Jul 31 at 15:19
"in your for without breaking" for loop?
â Acccumulation
Jul 31 at 15:26
@Acccumulation Yes, I'll try make that clearer
â Peilonrayz
Jul 31 at 15:27
@kojiro You didn't say in your question but isTerm.is_better_than
costly? Because you could make the code run in $O(n)$ time, with a worst case $2n$ calls toTerm.is_best_terms
if they are all equal.
â Peilonrayz
Jul 31 at 15:34
It's not costly afaict. It is impossible to understand, though, which is why I am still having trouble writing a satisfactory__eq__
for total ordering. I'm also considering usingmax
and a filter instead oftakewhile
, which trades thesorted
for max 2x passes over the set.
â kojiro
Jul 31 at 15:36
 |Â
show 2 more comments
Yeah, this is great, thanks. I had thought about using total_ordering and sorting, but I was thinking I didn't actually need to sort. But in looking again, I think the gain in readability is worth every penny.
â kojiro
Jul 31 at 15:19
"in your for without breaking" for loop?
â Acccumulation
Jul 31 at 15:26
@Acccumulation Yes, I'll try make that clearer
â Peilonrayz
Jul 31 at 15:27
@kojiro You didn't say in your question but isTerm.is_better_than
costly? Because you could make the code run in $O(n)$ time, with a worst case $2n$ calls toTerm.is_best_terms
if they are all equal.
â Peilonrayz
Jul 31 at 15:34
It's not costly afaict. It is impossible to understand, though, which is why I am still having trouble writing a satisfactory__eq__
for total ordering. I'm also considering usingmax
and a filter instead oftakewhile
, which trades thesorted
for max 2x passes over the set.
â kojiro
Jul 31 at 15:36
Yeah, this is great, thanks. I had thought about using total_ordering and sorting, but I was thinking I didn't actually need to sort. But in looking again, I think the gain in readability is worth every penny.
â kojiro
Jul 31 at 15:19
Yeah, this is great, thanks. I had thought about using total_ordering and sorting, but I was thinking I didn't actually need to sort. But in looking again, I think the gain in readability is worth every penny.
â kojiro
Jul 31 at 15:19
"in your for without breaking" for loop?
â Acccumulation
Jul 31 at 15:26
"in your for without breaking" for loop?
â Acccumulation
Jul 31 at 15:26
@Acccumulation Yes, I'll try make that clearer
â Peilonrayz
Jul 31 at 15:27
@Acccumulation Yes, I'll try make that clearer
â Peilonrayz
Jul 31 at 15:27
@kojiro You didn't say in your question but is
Term.is_better_than
costly? Because you could make the code run in $O(n)$ time, with a worst case $2n$ calls to Term.is_best_terms
if they are all equal.â Peilonrayz
Jul 31 at 15:34
@kojiro You didn't say in your question but is
Term.is_better_than
costly? Because you could make the code run in $O(n)$ time, with a worst case $2n$ calls to Term.is_best_terms
if they are all equal.â Peilonrayz
Jul 31 at 15:34
It's not costly afaict. It is impossible to understand, though, which is why I am still having trouble writing a satisfactory
__eq__
for total ordering. I'm also considering using max
and a filter instead of takewhile
, which trades the sorted
for max 2x passes over the set.â kojiro
Jul 31 at 15:36
It's not costly afaict. It is impossible to understand, though, which is why I am still having trouble writing a satisfactory
__eq__
for total ordering. I'm also considering using max
and a filter instead of takewhile
, which trades the sorted
for max 2x passes over the set.â kojiro
Jul 31 at 15:36
 |Â
show 2 more comments
up vote
3
down vote
If all you want are the best terms, then it seems to me that you can just keep a list of the current best terms. Every time you find a term that at least as good as any one of the previous best terms, then, assuming transitivity, it must be at least as good as all the previous best terms. And since the previous best terms were at least as good as all the terms tested so far, the new term is at least as good as all the terms tested so for. Next you check whether the new term is strictly better than one of the current best terms. If so, then you can replace the list of best terms with a list containing just the new term. If not, then the current term is just as good as the current best terms, so you should append it to the list of best terms.
best_terms =
for term in terms:
if not best_terms:
best_terms = [term]
continue
if not best_terms[0].is_better_than(term):
if term.is_better_than(best_terms[0]):
best_terms = [term]
else
best_terms.append(term)
1
Please tell me you changed your account name just to give this answer, because that'd be about perfect.
â kojiro
Jul 31 at 15:47
Except "is_better_than" returns false if the terms are equally good. You seem to be considering otherwise...
â coconochao
Jul 31 at 20:14
add a comment |Â
up vote
3
down vote
If all you want are the best terms, then it seems to me that you can just keep a list of the current best terms. Every time you find a term that at least as good as any one of the previous best terms, then, assuming transitivity, it must be at least as good as all the previous best terms. And since the previous best terms were at least as good as all the terms tested so far, the new term is at least as good as all the terms tested so for. Next you check whether the new term is strictly better than one of the current best terms. If so, then you can replace the list of best terms with a list containing just the new term. If not, then the current term is just as good as the current best terms, so you should append it to the list of best terms.
best_terms =
for term in terms:
if not best_terms:
best_terms = [term]
continue
if not best_terms[0].is_better_than(term):
if term.is_better_than(best_terms[0]):
best_terms = [term]
else
best_terms.append(term)
1
Please tell me you changed your account name just to give this answer, because that'd be about perfect.
â kojiro
Jul 31 at 15:47
Except "is_better_than" returns false if the terms are equally good. You seem to be considering otherwise...
â coconochao
Jul 31 at 20:14
add a comment |Â
up vote
3
down vote
up vote
3
down vote
If all you want are the best terms, then it seems to me that you can just keep a list of the current best terms. Every time you find a term that at least as good as any one of the previous best terms, then, assuming transitivity, it must be at least as good as all the previous best terms. And since the previous best terms were at least as good as all the terms tested so far, the new term is at least as good as all the terms tested so for. Next you check whether the new term is strictly better than one of the current best terms. If so, then you can replace the list of best terms with a list containing just the new term. If not, then the current term is just as good as the current best terms, so you should append it to the list of best terms.
best_terms =
for term in terms:
if not best_terms:
best_terms = [term]
continue
if not best_terms[0].is_better_than(term):
if term.is_better_than(best_terms[0]):
best_terms = [term]
else
best_terms.append(term)
If all you want are the best terms, then it seems to me that you can just keep a list of the current best terms. Every time you find a term that at least as good as any one of the previous best terms, then, assuming transitivity, it must be at least as good as all the previous best terms. And since the previous best terms were at least as good as all the terms tested so far, the new term is at least as good as all the terms tested so for. Next you check whether the new term is strictly better than one of the current best terms. If so, then you can replace the list of best terms with a list containing just the new term. If not, then the current term is just as good as the current best terms, so you should append it to the list of best terms.
best_terms =
for term in terms:
if not best_terms:
best_terms = [term]
continue
if not best_terms[0].is_better_than(term):
if term.is_better_than(best_terms[0]):
best_terms = [term]
else
best_terms.append(term)
edited Jul 31 at 21:50
answered Jul 31 at 15:46
Acccumulation
9395
9395
1
Please tell me you changed your account name just to give this answer, because that'd be about perfect.
â kojiro
Jul 31 at 15:47
Except "is_better_than" returns false if the terms are equally good. You seem to be considering otherwise...
â coconochao
Jul 31 at 20:14
add a comment |Â
1
Please tell me you changed your account name just to give this answer, because that'd be about perfect.
â kojiro
Jul 31 at 15:47
Except "is_better_than" returns false if the terms are equally good. You seem to be considering otherwise...
â coconochao
Jul 31 at 20:14
1
1
Please tell me you changed your account name just to give this answer, because that'd be about perfect.
â kojiro
Jul 31 at 15:47
Please tell me you changed your account name just to give this answer, because that'd be about perfect.
â kojiro
Jul 31 at 15:47
Except "is_better_than" returns false if the terms are equally good. You seem to be considering otherwise...
â coconochao
Jul 31 at 20:14
Except "is_better_than" returns false if the terms are equally good. You seem to be considering otherwise...
â coconochao
Jul 31 at 20:14
add a comment |Â
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f200663%2ffinding-the-best-terms-to-use%23new-answer', 'question_page');
);
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Can you provide an example of
Term
and some example IO?â Peilonrayz
Jul 31 at 13:57
@Peilonrayz done, thanks
â kojiro
Jul 31 at 14:25
Is the ranking of Terms transitive? If Term A is better than Term B, and Term B is better than Term C, is Term A guaranteed to be better than Term C?
â David Conrad
Jul 31 at 20:01
1
@DavidConrad yes.
â kojiro
Jul 31 at 20:06