Checking if two strings are anagrams in Python
Clash Royale CLAN TAG#URR8PPP
.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;
up vote
14
down vote
favorite
The code written in Python 3.6 mostly using Python's built in functions sorted and len
. First I'm checking for the edge case that the two given strings are not of the same length obviously. Then I'm sorting the strings and comparing if their sorted value are equal to each other with a boolean expression. An anagram is a word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once.
Just for the sake of practising Big-O this function runs in $O(1)$ constant complexity time because regardless of the strings given the function will always return a Boolean value. Is this correct? I realized my assumption was not correct since the function will run in the time complexity of the sorted built-in function which is $nlog(n)$.
If you were an interviewer would you prefer the candidate to not use Python's built in functions and to resort to a more manual way of solving the problem?
def is_anagram(string1, string2):
while len(string1) == len(string2):
# Testing sorted values equality
if sorted(string1) == sorted(string2):
return True
return False
return 'The strings are not anagrams they have differing lengths'
print(is_anagram('cat', 'cra'))
python beginner python-3.x interview-questions
add a comment |Â
up vote
14
down vote
favorite
The code written in Python 3.6 mostly using Python's built in functions sorted and len
. First I'm checking for the edge case that the two given strings are not of the same length obviously. Then I'm sorting the strings and comparing if their sorted value are equal to each other with a boolean expression. An anagram is a word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once.
Just for the sake of practising Big-O this function runs in $O(1)$ constant complexity time because regardless of the strings given the function will always return a Boolean value. Is this correct? I realized my assumption was not correct since the function will run in the time complexity of the sorted built-in function which is $nlog(n)$.
If you were an interviewer would you prefer the candidate to not use Python's built in functions and to resort to a more manual way of solving the problem?
def is_anagram(string1, string2):
while len(string1) == len(string2):
# Testing sorted values equality
if sorted(string1) == sorted(string2):
return True
return False
return 'The strings are not anagrams they have differing lengths'
print(is_anagram('cat', 'cra'))
python beginner python-3.x interview-questions
add a comment |Â
up vote
14
down vote
favorite
up vote
14
down vote
favorite
The code written in Python 3.6 mostly using Python's built in functions sorted and len
. First I'm checking for the edge case that the two given strings are not of the same length obviously. Then I'm sorting the strings and comparing if their sorted value are equal to each other with a boolean expression. An anagram is a word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once.
Just for the sake of practising Big-O this function runs in $O(1)$ constant complexity time because regardless of the strings given the function will always return a Boolean value. Is this correct? I realized my assumption was not correct since the function will run in the time complexity of the sorted built-in function which is $nlog(n)$.
If you were an interviewer would you prefer the candidate to not use Python's built in functions and to resort to a more manual way of solving the problem?
def is_anagram(string1, string2):
while len(string1) == len(string2):
# Testing sorted values equality
if sorted(string1) == sorted(string2):
return True
return False
return 'The strings are not anagrams they have differing lengths'
print(is_anagram('cat', 'cra'))
python beginner python-3.x interview-questions
The code written in Python 3.6 mostly using Python's built in functions sorted and len
. First I'm checking for the edge case that the two given strings are not of the same length obviously. Then I'm sorting the strings and comparing if their sorted value are equal to each other with a boolean expression. An anagram is a word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once.
Just for the sake of practising Big-O this function runs in $O(1)$ constant complexity time because regardless of the strings given the function will always return a Boolean value. Is this correct? I realized my assumption was not correct since the function will run in the time complexity of the sorted built-in function which is $nlog(n)$.
If you were an interviewer would you prefer the candidate to not use Python's built in functions and to resort to a more manual way of solving the problem?
def is_anagram(string1, string2):
while len(string1) == len(string2):
# Testing sorted values equality
if sorted(string1) == sorted(string2):
return True
return False
return 'The strings are not anagrams they have differing lengths'
print(is_anagram('cat', 'cra'))
python beginner python-3.x interview-questions
edited Jun 9 at 3:04
Jamalâ¦
30.1k11114225
30.1k11114225
asked Jun 7 at 15:31
nemo
736
736
add a comment |Â
add a comment |Â
4 Answers
4
active
oldest
votes
up vote
22
down vote
accepted
- You can change your
if
to just bereturn
. - You should change your
while
to anif
, as it makes no sense for it to bewhile
. - You shouldn't return a string on invalid input, instead you could
raise
aValueError
.
This can get:
def is_anagram(string1, string2):
if len(string1) == len(string2):
return sorted(string1) == sorted(string2)
raise ValueError('The strings are not anagrams they have differing lengths')
However I wouldn't raise
, and so you can just use:
def is_anagram(string1, string2):
return sorted(string1) == sorted(string2)
To answer your questions:
The $O$ complexity for your function is not $O(1)$. Lets take the different aspects of your function:
len(n)
is $O(1)$int == int
is $O(1)$sorted(n)
is $nlogn$str == str
is $O(n)$len(a) == len(b)
is $O(1)$sorted(a) == sorted(b)
is $O(min(a, b) + aloga + blogb)$
Since the function will short circuit if
len(a) == len(b)
we know that $a = b = n$.
And so the complexity becomes $O(n + nlogn)$. This may be ok to simplify to $O(nlogn)$, as $O(n + nlogn) = O(n(1 + logn))$You can however use
collections.Counter
to reduce yoursorted
complexity to $O(n)$. And so keeping the short circuiting your function would be $O(n)$, otherwise it would be $O(a + b)$. And so to use this you can use:from collections import Counter
def is_anagram(string1, string2):
return Counter(string1) == Counter(string2)I would prefer you use the builtin functions, as it should lead to less code to maintain.
Oh wow that was a really fast to the point answer thank you ! I didn't realize the time complexity would be that of the sorted function. Awesome good to know someone would be okay with getting an answer with builtin functions :D
â nemo
Jun 7 at 16:09
Just a minor point on the complexity: the length check at the beginning will short circuit unless a equals b. Because of this, and the fact that the big-O eats any constant coefficients, $mathcalO(aloga + blogb)$ is actually equal to just $mathcalO(aloga)$ in this case.
â Bass
Jun 8 at 7:13
@Bass You're right, you can do that.
â Peilonrayz
Jun 8 at 8:30
About thecollections.Counter
, could you work that out please? It might be useful for OP to see another solution.
â JAD
Jun 8 at 13:19
@JAD what do you mean "work it out?" Have you tried swapping them out?
â Peilonrayz
Jun 8 at 13:38
 |Â
show 3 more comments
up vote
6
down vote
Nice to see someone even think about efficiency. So many coders (especially young/new ones) just write things the simple way. Comparing the sort strings is simple. It's not necessarily efficient.
As an interviewer I would be happy to see you use built-in functions. There is no point in re-inventing the wheel. But you have to use the right functions at the right time.
As an interviewer, the first question I want you to ask before you optimize is about the problem space. What lengths of strings are you likely to be comparing? If the answer is 10 then it isn't worth optimizing. If the answer is 2048 then you'll want to think about it.
"Premature optimization is the root of all evil [...] in programming"
- Donald Knuth1
A second question about the problem space is "how often will this program be used and in what context?" If you are writing a one-time use program then just do it the faster simplest way you can think of. There is little point in expending brain cells on optimizing something that will barely be used. If the answer is that this will be a frequently used component in a life support system then you darn well better plan to optimize, and get it right.
Test your resulting program. In an interview I would want you to at least describe some sample strings for comparison. What's the worst case pair of strings? (I think it's a pair that are different by one character and the two characters are the ones that would be last in the sorted string within your element space. That is, if you are comparing lower case letters the difference is a "y" in one string instead of a "z" in the other.) In real world test your "optimized" program against the simple one to see how they compare on time use. Some results are bound to surprise you. There are lots of factors that will surprise you about implementation details.
1https://en.wikiquote.org/wiki/Donald_Knuth#Computer_Programming_as_an_Art_(1974)
Thank you Sinc for appreciating newbies thinking about the run time of a problem! I figured it's important to let people know how my brain is thinking about the problem . I make mistakes someone can point them out then and I understand things better! Definitely thinking about the scenario in which the solution will run is important. In interviews I've had mixed reactions from interviewers wanting either a quick solution or optimize for the best solution . So I hope to practice to get better! Cheers Thank you for your advice :D
â nemo
Jun 7 at 23:16
Good point about the length of the strings. However, you should only add the additional step of comparing the string lengths if bench-marking shows there is a bottleneck there. Maybe 2048 can be done really quickly and the code is slower in other places that are more valuable to improve first.
â CJ Dennis
Jun 8 at 1:55
@Nemo Your comment about mixed reactions is exactly why you need to ask about the problem space. Many programmers will go for a simple working solution first, and then will decide whether to optimize. If you ask the interviewer about their expectations on that point you should gain points for the question. Depending on the interview(er) you may or may not be expected to actually take the time to try to optimize, but I would think some discussion about whether, when and how to optimize is better than actually trying it. (But I'm like a 1000 years old so I've never done a coding interview.)
â Sinc
Jun 8 at 14:14
add a comment |Â
up vote
4
down vote
while len(string1) == len(string2):
Why are you using while
here instead of if
?
I think it's "ugly" to have a function that has mixed return types, in your case booleans or a string.
You don't need to test the length before you check if the sorted lists are the same.
Big-o of this would be the Big-o of whatever sorting function sorted are using, I think it is timsort in python so the big-o would be $n log n$.
If they are asking for a python dev it's good to use builltins because that shows you know the language.
Instead of useing an if
, you could just return the value of the boolean operation.
I would probably implement it something like this.
def is_anagram(s1,s2):
return sorted(s1) == sorted(s2)
Thanks for pointing out mixing returns types seems messy baot ! I used a while statement to first check if the to strings are of equal lengths since an anagram is a word whose letters have been rearranged to form a new word. So if the word lengths are not the same they should not even be sorted.
â nemo
Jun 7 at 15:58
ok, but then you should use a if insted, because len only returns a slingle digit so there isnt anything for the while to iterate over.
â baot
Jun 7 at 16:10
add a comment |Â
up vote
4
down vote
I personally would not be thrilled with that response in an interview. I mean, it's perfectly effective for exact matching, but some assumptions were made here.
An anagram is a word or phrase formed by rearranging the letters of a
different word or phrase, typically using all the original letters
exactly once
The definition explicitly states that it matches all the original letters exactly once. Not all characters are letters, however. Within the function, a match is made by sorting the strings and seeing if they are equal. The assumptions that are not great are that each letter will have the same capitalisation, and that whitespace or even punctuation matters. This may indeed be the case, but it is not explicitly stated in the definiton.
Anagrams that will fail:
"funeral" "real fun"
"Madam Curie" "Radium came"
"election results" "Lies. Let's recount"
I would prefer something like the following. I am also making some assumptions here about whether the characters will match in isalpha without checking if that is actually the case in certain character sets. But hey, I speak English so that's all I personally care about until a customer complains about a bug about my code not working in their language, right?
def is_anagram(string1, string2):
x=
for c in string1.lower():
if c.isalpha():
try:
x[c]+=1
except:
x[c]=1
for c in string2.lower():
if c.isalpha():
try:
x[c]-=1
except:
return False
for i in x.values():
if i != 0:
return False
return True
As @graipher pointed out, this can be done in a pythonier way with the following:
Counter(filter(str.isalpha, map(str.lower, s1))) == Counter(filter(str.isalpha, map(str.lower, s2)))
Still learning python, so that's my lesson on collections.Counter.
And in action:
user@computer: ~$ python3.5 anagram2.py "cat" "tra"
False
user@computer: ~$ python3.5 anagram2.py "cat" "tac"
True
user@computer: ~$ python3.5 anagram2.py "funeral" "real fun"
True
user@computer: ~$ python3.5 anagram2.py "Madam Curie" "Radium came"
True
user@computer: ~$ python3.5 anagram2.py "Election results" "Lies. Let's recount"
True
Oh snap good point. . . you are right what if the strings are separated by spaced, have special characters etc all important factors to clarify before starting to thinking about the solution. Good catch !
â nemo
Jun 7 at 23:17
1
While I like the idea behind your proposed function, it's readability is not the best (not using PEP8, one-line variable names) and has some other problems (bare excepts). I would probably just compare twocollections.Counter
objects withCounter(filter(str.isalpha, map(str.lower, s1))) == Counter(filter(str.isalpha, map(str.lower, s2)))
.
â Graipher
Jun 8 at 8:21
Falsenames I'm not a Python coder. Why isn't theexcept: return false
in the string2 handling going to cause your last example to fail because of the period? Also, what will happen if you put the period in the first string instead? Won't thex[c]=1
cause a failure?
â Sinc
Jun 8 at 14:03
Not a python code here either... as evidenced by not knowing the super awesome shortcuts that @Graipher listed. The full stops are scrubbed out with the .isalpha bits. That will only return characters that are letters. And the string2 exception kicks out a false if a character shows up that is not in string1 at all.
â Falsenames
Jun 12 at 18:15
add a comment |Â
4 Answers
4
active
oldest
votes
4 Answers
4
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
22
down vote
accepted
- You can change your
if
to just bereturn
. - You should change your
while
to anif
, as it makes no sense for it to bewhile
. - You shouldn't return a string on invalid input, instead you could
raise
aValueError
.
This can get:
def is_anagram(string1, string2):
if len(string1) == len(string2):
return sorted(string1) == sorted(string2)
raise ValueError('The strings are not anagrams they have differing lengths')
However I wouldn't raise
, and so you can just use:
def is_anagram(string1, string2):
return sorted(string1) == sorted(string2)
To answer your questions:
The $O$ complexity for your function is not $O(1)$. Lets take the different aspects of your function:
len(n)
is $O(1)$int == int
is $O(1)$sorted(n)
is $nlogn$str == str
is $O(n)$len(a) == len(b)
is $O(1)$sorted(a) == sorted(b)
is $O(min(a, b) + aloga + blogb)$
Since the function will short circuit if
len(a) == len(b)
we know that $a = b = n$.
And so the complexity becomes $O(n + nlogn)$. This may be ok to simplify to $O(nlogn)$, as $O(n + nlogn) = O(n(1 + logn))$You can however use
collections.Counter
to reduce yoursorted
complexity to $O(n)$. And so keeping the short circuiting your function would be $O(n)$, otherwise it would be $O(a + b)$. And so to use this you can use:from collections import Counter
def is_anagram(string1, string2):
return Counter(string1) == Counter(string2)I would prefer you use the builtin functions, as it should lead to less code to maintain.
Oh wow that was a really fast to the point answer thank you ! I didn't realize the time complexity would be that of the sorted function. Awesome good to know someone would be okay with getting an answer with builtin functions :D
â nemo
Jun 7 at 16:09
Just a minor point on the complexity: the length check at the beginning will short circuit unless a equals b. Because of this, and the fact that the big-O eats any constant coefficients, $mathcalO(aloga + blogb)$ is actually equal to just $mathcalO(aloga)$ in this case.
â Bass
Jun 8 at 7:13
@Bass You're right, you can do that.
â Peilonrayz
Jun 8 at 8:30
About thecollections.Counter
, could you work that out please? It might be useful for OP to see another solution.
â JAD
Jun 8 at 13:19
@JAD what do you mean "work it out?" Have you tried swapping them out?
â Peilonrayz
Jun 8 at 13:38
 |Â
show 3 more comments
up vote
22
down vote
accepted
- You can change your
if
to just bereturn
. - You should change your
while
to anif
, as it makes no sense for it to bewhile
. - You shouldn't return a string on invalid input, instead you could
raise
aValueError
.
This can get:
def is_anagram(string1, string2):
if len(string1) == len(string2):
return sorted(string1) == sorted(string2)
raise ValueError('The strings are not anagrams they have differing lengths')
However I wouldn't raise
, and so you can just use:
def is_anagram(string1, string2):
return sorted(string1) == sorted(string2)
To answer your questions:
The $O$ complexity for your function is not $O(1)$. Lets take the different aspects of your function:
len(n)
is $O(1)$int == int
is $O(1)$sorted(n)
is $nlogn$str == str
is $O(n)$len(a) == len(b)
is $O(1)$sorted(a) == sorted(b)
is $O(min(a, b) + aloga + blogb)$
Since the function will short circuit if
len(a) == len(b)
we know that $a = b = n$.
And so the complexity becomes $O(n + nlogn)$. This may be ok to simplify to $O(nlogn)$, as $O(n + nlogn) = O(n(1 + logn))$You can however use
collections.Counter
to reduce yoursorted
complexity to $O(n)$. And so keeping the short circuiting your function would be $O(n)$, otherwise it would be $O(a + b)$. And so to use this you can use:from collections import Counter
def is_anagram(string1, string2):
return Counter(string1) == Counter(string2)I would prefer you use the builtin functions, as it should lead to less code to maintain.
Oh wow that was a really fast to the point answer thank you ! I didn't realize the time complexity would be that of the sorted function. Awesome good to know someone would be okay with getting an answer with builtin functions :D
â nemo
Jun 7 at 16:09
Just a minor point on the complexity: the length check at the beginning will short circuit unless a equals b. Because of this, and the fact that the big-O eats any constant coefficients, $mathcalO(aloga + blogb)$ is actually equal to just $mathcalO(aloga)$ in this case.
â Bass
Jun 8 at 7:13
@Bass You're right, you can do that.
â Peilonrayz
Jun 8 at 8:30
About thecollections.Counter
, could you work that out please? It might be useful for OP to see another solution.
â JAD
Jun 8 at 13:19
@JAD what do you mean "work it out?" Have you tried swapping them out?
â Peilonrayz
Jun 8 at 13:38
 |Â
show 3 more comments
up vote
22
down vote
accepted
up vote
22
down vote
accepted
- You can change your
if
to just bereturn
. - You should change your
while
to anif
, as it makes no sense for it to bewhile
. - You shouldn't return a string on invalid input, instead you could
raise
aValueError
.
This can get:
def is_anagram(string1, string2):
if len(string1) == len(string2):
return sorted(string1) == sorted(string2)
raise ValueError('The strings are not anagrams they have differing lengths')
However I wouldn't raise
, and so you can just use:
def is_anagram(string1, string2):
return sorted(string1) == sorted(string2)
To answer your questions:
The $O$ complexity for your function is not $O(1)$. Lets take the different aspects of your function:
len(n)
is $O(1)$int == int
is $O(1)$sorted(n)
is $nlogn$str == str
is $O(n)$len(a) == len(b)
is $O(1)$sorted(a) == sorted(b)
is $O(min(a, b) + aloga + blogb)$
Since the function will short circuit if
len(a) == len(b)
we know that $a = b = n$.
And so the complexity becomes $O(n + nlogn)$. This may be ok to simplify to $O(nlogn)$, as $O(n + nlogn) = O(n(1 + logn))$You can however use
collections.Counter
to reduce yoursorted
complexity to $O(n)$. And so keeping the short circuiting your function would be $O(n)$, otherwise it would be $O(a + b)$. And so to use this you can use:from collections import Counter
def is_anagram(string1, string2):
return Counter(string1) == Counter(string2)I would prefer you use the builtin functions, as it should lead to less code to maintain.
- You can change your
if
to just bereturn
. - You should change your
while
to anif
, as it makes no sense for it to bewhile
. - You shouldn't return a string on invalid input, instead you could
raise
aValueError
.
This can get:
def is_anagram(string1, string2):
if len(string1) == len(string2):
return sorted(string1) == sorted(string2)
raise ValueError('The strings are not anagrams they have differing lengths')
However I wouldn't raise
, and so you can just use:
def is_anagram(string1, string2):
return sorted(string1) == sorted(string2)
To answer your questions:
The $O$ complexity for your function is not $O(1)$. Lets take the different aspects of your function:
len(n)
is $O(1)$int == int
is $O(1)$sorted(n)
is $nlogn$str == str
is $O(n)$len(a) == len(b)
is $O(1)$sorted(a) == sorted(b)
is $O(min(a, b) + aloga + blogb)$
Since the function will short circuit if
len(a) == len(b)
we know that $a = b = n$.
And so the complexity becomes $O(n + nlogn)$. This may be ok to simplify to $O(nlogn)$, as $O(n + nlogn) = O(n(1 + logn))$You can however use
collections.Counter
to reduce yoursorted
complexity to $O(n)$. And so keeping the short circuiting your function would be $O(n)$, otherwise it would be $O(a + b)$. And so to use this you can use:from collections import Counter
def is_anagram(string1, string2):
return Counter(string1) == Counter(string2)I would prefer you use the builtin functions, as it should lead to less code to maintain.
edited Jun 8 at 15:08
answered Jun 7 at 15:54
Peilonrayz
24.3k336102
24.3k336102
Oh wow that was a really fast to the point answer thank you ! I didn't realize the time complexity would be that of the sorted function. Awesome good to know someone would be okay with getting an answer with builtin functions :D
â nemo
Jun 7 at 16:09
Just a minor point on the complexity: the length check at the beginning will short circuit unless a equals b. Because of this, and the fact that the big-O eats any constant coefficients, $mathcalO(aloga + blogb)$ is actually equal to just $mathcalO(aloga)$ in this case.
â Bass
Jun 8 at 7:13
@Bass You're right, you can do that.
â Peilonrayz
Jun 8 at 8:30
About thecollections.Counter
, could you work that out please? It might be useful for OP to see another solution.
â JAD
Jun 8 at 13:19
@JAD what do you mean "work it out?" Have you tried swapping them out?
â Peilonrayz
Jun 8 at 13:38
 |Â
show 3 more comments
Oh wow that was a really fast to the point answer thank you ! I didn't realize the time complexity would be that of the sorted function. Awesome good to know someone would be okay with getting an answer with builtin functions :D
â nemo
Jun 7 at 16:09
Just a minor point on the complexity: the length check at the beginning will short circuit unless a equals b. Because of this, and the fact that the big-O eats any constant coefficients, $mathcalO(aloga + blogb)$ is actually equal to just $mathcalO(aloga)$ in this case.
â Bass
Jun 8 at 7:13
@Bass You're right, you can do that.
â Peilonrayz
Jun 8 at 8:30
About thecollections.Counter
, could you work that out please? It might be useful for OP to see another solution.
â JAD
Jun 8 at 13:19
@JAD what do you mean "work it out?" Have you tried swapping them out?
â Peilonrayz
Jun 8 at 13:38
Oh wow that was a really fast to the point answer thank you ! I didn't realize the time complexity would be that of the sorted function. Awesome good to know someone would be okay with getting an answer with builtin functions :D
â nemo
Jun 7 at 16:09
Oh wow that was a really fast to the point answer thank you ! I didn't realize the time complexity would be that of the sorted function. Awesome good to know someone would be okay with getting an answer with builtin functions :D
â nemo
Jun 7 at 16:09
Just a minor point on the complexity: the length check at the beginning will short circuit unless a equals b. Because of this, and the fact that the big-O eats any constant coefficients, $mathcalO(aloga + blogb)$ is actually equal to just $mathcalO(aloga)$ in this case.
â Bass
Jun 8 at 7:13
Just a minor point on the complexity: the length check at the beginning will short circuit unless a equals b. Because of this, and the fact that the big-O eats any constant coefficients, $mathcalO(aloga + blogb)$ is actually equal to just $mathcalO(aloga)$ in this case.
â Bass
Jun 8 at 7:13
@Bass You're right, you can do that.
â Peilonrayz
Jun 8 at 8:30
@Bass You're right, you can do that.
â Peilonrayz
Jun 8 at 8:30
About the
collections.Counter
, could you work that out please? It might be useful for OP to see another solution.â JAD
Jun 8 at 13:19
About the
collections.Counter
, could you work that out please? It might be useful for OP to see another solution.â JAD
Jun 8 at 13:19
@JAD what do you mean "work it out?" Have you tried swapping them out?
â Peilonrayz
Jun 8 at 13:38
@JAD what do you mean "work it out?" Have you tried swapping them out?
â Peilonrayz
Jun 8 at 13:38
 |Â
show 3 more comments
up vote
6
down vote
Nice to see someone even think about efficiency. So many coders (especially young/new ones) just write things the simple way. Comparing the sort strings is simple. It's not necessarily efficient.
As an interviewer I would be happy to see you use built-in functions. There is no point in re-inventing the wheel. But you have to use the right functions at the right time.
As an interviewer, the first question I want you to ask before you optimize is about the problem space. What lengths of strings are you likely to be comparing? If the answer is 10 then it isn't worth optimizing. If the answer is 2048 then you'll want to think about it.
"Premature optimization is the root of all evil [...] in programming"
- Donald Knuth1
A second question about the problem space is "how often will this program be used and in what context?" If you are writing a one-time use program then just do it the faster simplest way you can think of. There is little point in expending brain cells on optimizing something that will barely be used. If the answer is that this will be a frequently used component in a life support system then you darn well better plan to optimize, and get it right.
Test your resulting program. In an interview I would want you to at least describe some sample strings for comparison. What's the worst case pair of strings? (I think it's a pair that are different by one character and the two characters are the ones that would be last in the sorted string within your element space. That is, if you are comparing lower case letters the difference is a "y" in one string instead of a "z" in the other.) In real world test your "optimized" program against the simple one to see how they compare on time use. Some results are bound to surprise you. There are lots of factors that will surprise you about implementation details.
1https://en.wikiquote.org/wiki/Donald_Knuth#Computer_Programming_as_an_Art_(1974)
Thank you Sinc for appreciating newbies thinking about the run time of a problem! I figured it's important to let people know how my brain is thinking about the problem . I make mistakes someone can point them out then and I understand things better! Definitely thinking about the scenario in which the solution will run is important. In interviews I've had mixed reactions from interviewers wanting either a quick solution or optimize for the best solution . So I hope to practice to get better! Cheers Thank you for your advice :D
â nemo
Jun 7 at 23:16
Good point about the length of the strings. However, you should only add the additional step of comparing the string lengths if bench-marking shows there is a bottleneck there. Maybe 2048 can be done really quickly and the code is slower in other places that are more valuable to improve first.
â CJ Dennis
Jun 8 at 1:55
@Nemo Your comment about mixed reactions is exactly why you need to ask about the problem space. Many programmers will go for a simple working solution first, and then will decide whether to optimize. If you ask the interviewer about their expectations on that point you should gain points for the question. Depending on the interview(er) you may or may not be expected to actually take the time to try to optimize, but I would think some discussion about whether, when and how to optimize is better than actually trying it. (But I'm like a 1000 years old so I've never done a coding interview.)
â Sinc
Jun 8 at 14:14
add a comment |Â
up vote
6
down vote
Nice to see someone even think about efficiency. So many coders (especially young/new ones) just write things the simple way. Comparing the sort strings is simple. It's not necessarily efficient.
As an interviewer I would be happy to see you use built-in functions. There is no point in re-inventing the wheel. But you have to use the right functions at the right time.
As an interviewer, the first question I want you to ask before you optimize is about the problem space. What lengths of strings are you likely to be comparing? If the answer is 10 then it isn't worth optimizing. If the answer is 2048 then you'll want to think about it.
"Premature optimization is the root of all evil [...] in programming"
- Donald Knuth1
A second question about the problem space is "how often will this program be used and in what context?" If you are writing a one-time use program then just do it the faster simplest way you can think of. There is little point in expending brain cells on optimizing something that will barely be used. If the answer is that this will be a frequently used component in a life support system then you darn well better plan to optimize, and get it right.
Test your resulting program. In an interview I would want you to at least describe some sample strings for comparison. What's the worst case pair of strings? (I think it's a pair that are different by one character and the two characters are the ones that would be last in the sorted string within your element space. That is, if you are comparing lower case letters the difference is a "y" in one string instead of a "z" in the other.) In real world test your "optimized" program against the simple one to see how they compare on time use. Some results are bound to surprise you. There are lots of factors that will surprise you about implementation details.
1https://en.wikiquote.org/wiki/Donald_Knuth#Computer_Programming_as_an_Art_(1974)
Thank you Sinc for appreciating newbies thinking about the run time of a problem! I figured it's important to let people know how my brain is thinking about the problem . I make mistakes someone can point them out then and I understand things better! Definitely thinking about the scenario in which the solution will run is important. In interviews I've had mixed reactions from interviewers wanting either a quick solution or optimize for the best solution . So I hope to practice to get better! Cheers Thank you for your advice :D
â nemo
Jun 7 at 23:16
Good point about the length of the strings. However, you should only add the additional step of comparing the string lengths if bench-marking shows there is a bottleneck there. Maybe 2048 can be done really quickly and the code is slower in other places that are more valuable to improve first.
â CJ Dennis
Jun 8 at 1:55
@Nemo Your comment about mixed reactions is exactly why you need to ask about the problem space. Many programmers will go for a simple working solution first, and then will decide whether to optimize. If you ask the interviewer about their expectations on that point you should gain points for the question. Depending on the interview(er) you may or may not be expected to actually take the time to try to optimize, but I would think some discussion about whether, when and how to optimize is better than actually trying it. (But I'm like a 1000 years old so I've never done a coding interview.)
â Sinc
Jun 8 at 14:14
add a comment |Â
up vote
6
down vote
up vote
6
down vote
Nice to see someone even think about efficiency. So many coders (especially young/new ones) just write things the simple way. Comparing the sort strings is simple. It's not necessarily efficient.
As an interviewer I would be happy to see you use built-in functions. There is no point in re-inventing the wheel. But you have to use the right functions at the right time.
As an interviewer, the first question I want you to ask before you optimize is about the problem space. What lengths of strings are you likely to be comparing? If the answer is 10 then it isn't worth optimizing. If the answer is 2048 then you'll want to think about it.
"Premature optimization is the root of all evil [...] in programming"
- Donald Knuth1
A second question about the problem space is "how often will this program be used and in what context?" If you are writing a one-time use program then just do it the faster simplest way you can think of. There is little point in expending brain cells on optimizing something that will barely be used. If the answer is that this will be a frequently used component in a life support system then you darn well better plan to optimize, and get it right.
Test your resulting program. In an interview I would want you to at least describe some sample strings for comparison. What's the worst case pair of strings? (I think it's a pair that are different by one character and the two characters are the ones that would be last in the sorted string within your element space. That is, if you are comparing lower case letters the difference is a "y" in one string instead of a "z" in the other.) In real world test your "optimized" program against the simple one to see how they compare on time use. Some results are bound to surprise you. There are lots of factors that will surprise you about implementation details.
1https://en.wikiquote.org/wiki/Donald_Knuth#Computer_Programming_as_an_Art_(1974)
Nice to see someone even think about efficiency. So many coders (especially young/new ones) just write things the simple way. Comparing the sort strings is simple. It's not necessarily efficient.
As an interviewer I would be happy to see you use built-in functions. There is no point in re-inventing the wheel. But you have to use the right functions at the right time.
As an interviewer, the first question I want you to ask before you optimize is about the problem space. What lengths of strings are you likely to be comparing? If the answer is 10 then it isn't worth optimizing. If the answer is 2048 then you'll want to think about it.
"Premature optimization is the root of all evil [...] in programming"
- Donald Knuth1
A second question about the problem space is "how often will this program be used and in what context?" If you are writing a one-time use program then just do it the faster simplest way you can think of. There is little point in expending brain cells on optimizing something that will barely be used. If the answer is that this will be a frequently used component in a life support system then you darn well better plan to optimize, and get it right.
Test your resulting program. In an interview I would want you to at least describe some sample strings for comparison. What's the worst case pair of strings? (I think it's a pair that are different by one character and the two characters are the ones that would be last in the sorted string within your element space. That is, if you are comparing lower case letters the difference is a "y" in one string instead of a "z" in the other.) In real world test your "optimized" program against the simple one to see how they compare on time use. Some results are bound to surprise you. There are lots of factors that will surprise you about implementation details.
1https://en.wikiquote.org/wiki/Donald_Knuth#Computer_Programming_as_an_Art_(1974)
edited Jun 7 at 23:59
Sam Onela
5,76961543
5,76961543
answered Jun 7 at 21:45
Sinc
1612
1612
Thank you Sinc for appreciating newbies thinking about the run time of a problem! I figured it's important to let people know how my brain is thinking about the problem . I make mistakes someone can point them out then and I understand things better! Definitely thinking about the scenario in which the solution will run is important. In interviews I've had mixed reactions from interviewers wanting either a quick solution or optimize for the best solution . So I hope to practice to get better! Cheers Thank you for your advice :D
â nemo
Jun 7 at 23:16
Good point about the length of the strings. However, you should only add the additional step of comparing the string lengths if bench-marking shows there is a bottleneck there. Maybe 2048 can be done really quickly and the code is slower in other places that are more valuable to improve first.
â CJ Dennis
Jun 8 at 1:55
@Nemo Your comment about mixed reactions is exactly why you need to ask about the problem space. Many programmers will go for a simple working solution first, and then will decide whether to optimize. If you ask the interviewer about their expectations on that point you should gain points for the question. Depending on the interview(er) you may or may not be expected to actually take the time to try to optimize, but I would think some discussion about whether, when and how to optimize is better than actually trying it. (But I'm like a 1000 years old so I've never done a coding interview.)
â Sinc
Jun 8 at 14:14
add a comment |Â
Thank you Sinc for appreciating newbies thinking about the run time of a problem! I figured it's important to let people know how my brain is thinking about the problem . I make mistakes someone can point them out then and I understand things better! Definitely thinking about the scenario in which the solution will run is important. In interviews I've had mixed reactions from interviewers wanting either a quick solution or optimize for the best solution . So I hope to practice to get better! Cheers Thank you for your advice :D
â nemo
Jun 7 at 23:16
Good point about the length of the strings. However, you should only add the additional step of comparing the string lengths if bench-marking shows there is a bottleneck there. Maybe 2048 can be done really quickly and the code is slower in other places that are more valuable to improve first.
â CJ Dennis
Jun 8 at 1:55
@Nemo Your comment about mixed reactions is exactly why you need to ask about the problem space. Many programmers will go for a simple working solution first, and then will decide whether to optimize. If you ask the interviewer about their expectations on that point you should gain points for the question. Depending on the interview(er) you may or may not be expected to actually take the time to try to optimize, but I would think some discussion about whether, when and how to optimize is better than actually trying it. (But I'm like a 1000 years old so I've never done a coding interview.)
â Sinc
Jun 8 at 14:14
Thank you Sinc for appreciating newbies thinking about the run time of a problem! I figured it's important to let people know how my brain is thinking about the problem . I make mistakes someone can point them out then and I understand things better! Definitely thinking about the scenario in which the solution will run is important. In interviews I've had mixed reactions from interviewers wanting either a quick solution or optimize for the best solution . So I hope to practice to get better! Cheers Thank you for your advice :D
â nemo
Jun 7 at 23:16
Thank you Sinc for appreciating newbies thinking about the run time of a problem! I figured it's important to let people know how my brain is thinking about the problem . I make mistakes someone can point them out then and I understand things better! Definitely thinking about the scenario in which the solution will run is important. In interviews I've had mixed reactions from interviewers wanting either a quick solution or optimize for the best solution . So I hope to practice to get better! Cheers Thank you for your advice :D
â nemo
Jun 7 at 23:16
Good point about the length of the strings. However, you should only add the additional step of comparing the string lengths if bench-marking shows there is a bottleneck there. Maybe 2048 can be done really quickly and the code is slower in other places that are more valuable to improve first.
â CJ Dennis
Jun 8 at 1:55
Good point about the length of the strings. However, you should only add the additional step of comparing the string lengths if bench-marking shows there is a bottleneck there. Maybe 2048 can be done really quickly and the code is slower in other places that are more valuable to improve first.
â CJ Dennis
Jun 8 at 1:55
@Nemo Your comment about mixed reactions is exactly why you need to ask about the problem space. Many programmers will go for a simple working solution first, and then will decide whether to optimize. If you ask the interviewer about their expectations on that point you should gain points for the question. Depending on the interview(er) you may or may not be expected to actually take the time to try to optimize, but I would think some discussion about whether, when and how to optimize is better than actually trying it. (But I'm like a 1000 years old so I've never done a coding interview.)
â Sinc
Jun 8 at 14:14
@Nemo Your comment about mixed reactions is exactly why you need to ask about the problem space. Many programmers will go for a simple working solution first, and then will decide whether to optimize. If you ask the interviewer about their expectations on that point you should gain points for the question. Depending on the interview(er) you may or may not be expected to actually take the time to try to optimize, but I would think some discussion about whether, when and how to optimize is better than actually trying it. (But I'm like a 1000 years old so I've never done a coding interview.)
â Sinc
Jun 8 at 14:14
add a comment |Â
up vote
4
down vote
while len(string1) == len(string2):
Why are you using while
here instead of if
?
I think it's "ugly" to have a function that has mixed return types, in your case booleans or a string.
You don't need to test the length before you check if the sorted lists are the same.
Big-o of this would be the Big-o of whatever sorting function sorted are using, I think it is timsort in python so the big-o would be $n log n$.
If they are asking for a python dev it's good to use builltins because that shows you know the language.
Instead of useing an if
, you could just return the value of the boolean operation.
I would probably implement it something like this.
def is_anagram(s1,s2):
return sorted(s1) == sorted(s2)
Thanks for pointing out mixing returns types seems messy baot ! I used a while statement to first check if the to strings are of equal lengths since an anagram is a word whose letters have been rearranged to form a new word. So if the word lengths are not the same they should not even be sorted.
â nemo
Jun 7 at 15:58
ok, but then you should use a if insted, because len only returns a slingle digit so there isnt anything for the while to iterate over.
â baot
Jun 7 at 16:10
add a comment |Â
up vote
4
down vote
while len(string1) == len(string2):
Why are you using while
here instead of if
?
I think it's "ugly" to have a function that has mixed return types, in your case booleans or a string.
You don't need to test the length before you check if the sorted lists are the same.
Big-o of this would be the Big-o of whatever sorting function sorted are using, I think it is timsort in python so the big-o would be $n log n$.
If they are asking for a python dev it's good to use builltins because that shows you know the language.
Instead of useing an if
, you could just return the value of the boolean operation.
I would probably implement it something like this.
def is_anagram(s1,s2):
return sorted(s1) == sorted(s2)
Thanks for pointing out mixing returns types seems messy baot ! I used a while statement to first check if the to strings are of equal lengths since an anagram is a word whose letters have been rearranged to form a new word. So if the word lengths are not the same they should not even be sorted.
â nemo
Jun 7 at 15:58
ok, but then you should use a if insted, because len only returns a slingle digit so there isnt anything for the while to iterate over.
â baot
Jun 7 at 16:10
add a comment |Â
up vote
4
down vote
up vote
4
down vote
while len(string1) == len(string2):
Why are you using while
here instead of if
?
I think it's "ugly" to have a function that has mixed return types, in your case booleans or a string.
You don't need to test the length before you check if the sorted lists are the same.
Big-o of this would be the Big-o of whatever sorting function sorted are using, I think it is timsort in python so the big-o would be $n log n$.
If they are asking for a python dev it's good to use builltins because that shows you know the language.
Instead of useing an if
, you could just return the value of the boolean operation.
I would probably implement it something like this.
def is_anagram(s1,s2):
return sorted(s1) == sorted(s2)
while len(string1) == len(string2):
Why are you using while
here instead of if
?
I think it's "ugly" to have a function that has mixed return types, in your case booleans or a string.
You don't need to test the length before you check if the sorted lists are the same.
Big-o of this would be the Big-o of whatever sorting function sorted are using, I think it is timsort in python so the big-o would be $n log n$.
If they are asking for a python dev it's good to use builltins because that shows you know the language.
Instead of useing an if
, you could just return the value of the boolean operation.
I would probably implement it something like this.
def is_anagram(s1,s2):
return sorted(s1) == sorted(s2)
edited Jun 7 at 16:45
Null
8571920
8571920
answered Jun 7 at 15:52
baot
1456
1456
Thanks for pointing out mixing returns types seems messy baot ! I used a while statement to first check if the to strings are of equal lengths since an anagram is a word whose letters have been rearranged to form a new word. So if the word lengths are not the same they should not even be sorted.
â nemo
Jun 7 at 15:58
ok, but then you should use a if insted, because len only returns a slingle digit so there isnt anything for the while to iterate over.
â baot
Jun 7 at 16:10
add a comment |Â
Thanks for pointing out mixing returns types seems messy baot ! I used a while statement to first check if the to strings are of equal lengths since an anagram is a word whose letters have been rearranged to form a new word. So if the word lengths are not the same they should not even be sorted.
â nemo
Jun 7 at 15:58
ok, but then you should use a if insted, because len only returns a slingle digit so there isnt anything for the while to iterate over.
â baot
Jun 7 at 16:10
Thanks for pointing out mixing returns types seems messy baot ! I used a while statement to first check if the to strings are of equal lengths since an anagram is a word whose letters have been rearranged to form a new word. So if the word lengths are not the same they should not even be sorted.
â nemo
Jun 7 at 15:58
Thanks for pointing out mixing returns types seems messy baot ! I used a while statement to first check if the to strings are of equal lengths since an anagram is a word whose letters have been rearranged to form a new word. So if the word lengths are not the same they should not even be sorted.
â nemo
Jun 7 at 15:58
ok, but then you should use a if insted, because len only returns a slingle digit so there isnt anything for the while to iterate over.
â baot
Jun 7 at 16:10
ok, but then you should use a if insted, because len only returns a slingle digit so there isnt anything for the while to iterate over.
â baot
Jun 7 at 16:10
add a comment |Â
up vote
4
down vote
I personally would not be thrilled with that response in an interview. I mean, it's perfectly effective for exact matching, but some assumptions were made here.
An anagram is a word or phrase formed by rearranging the letters of a
different word or phrase, typically using all the original letters
exactly once
The definition explicitly states that it matches all the original letters exactly once. Not all characters are letters, however. Within the function, a match is made by sorting the strings and seeing if they are equal. The assumptions that are not great are that each letter will have the same capitalisation, and that whitespace or even punctuation matters. This may indeed be the case, but it is not explicitly stated in the definiton.
Anagrams that will fail:
"funeral" "real fun"
"Madam Curie" "Radium came"
"election results" "Lies. Let's recount"
I would prefer something like the following. I am also making some assumptions here about whether the characters will match in isalpha without checking if that is actually the case in certain character sets. But hey, I speak English so that's all I personally care about until a customer complains about a bug about my code not working in their language, right?
def is_anagram(string1, string2):
x=
for c in string1.lower():
if c.isalpha():
try:
x[c]+=1
except:
x[c]=1
for c in string2.lower():
if c.isalpha():
try:
x[c]-=1
except:
return False
for i in x.values():
if i != 0:
return False
return True
As @graipher pointed out, this can be done in a pythonier way with the following:
Counter(filter(str.isalpha, map(str.lower, s1))) == Counter(filter(str.isalpha, map(str.lower, s2)))
Still learning python, so that's my lesson on collections.Counter.
And in action:
user@computer: ~$ python3.5 anagram2.py "cat" "tra"
False
user@computer: ~$ python3.5 anagram2.py "cat" "tac"
True
user@computer: ~$ python3.5 anagram2.py "funeral" "real fun"
True
user@computer: ~$ python3.5 anagram2.py "Madam Curie" "Radium came"
True
user@computer: ~$ python3.5 anagram2.py "Election results" "Lies. Let's recount"
True
Oh snap good point. . . you are right what if the strings are separated by spaced, have special characters etc all important factors to clarify before starting to thinking about the solution. Good catch !
â nemo
Jun 7 at 23:17
1
While I like the idea behind your proposed function, it's readability is not the best (not using PEP8, one-line variable names) and has some other problems (bare excepts). I would probably just compare twocollections.Counter
objects withCounter(filter(str.isalpha, map(str.lower, s1))) == Counter(filter(str.isalpha, map(str.lower, s2)))
.
â Graipher
Jun 8 at 8:21
Falsenames I'm not a Python coder. Why isn't theexcept: return false
in the string2 handling going to cause your last example to fail because of the period? Also, what will happen if you put the period in the first string instead? Won't thex[c]=1
cause a failure?
â Sinc
Jun 8 at 14:03
Not a python code here either... as evidenced by not knowing the super awesome shortcuts that @Graipher listed. The full stops are scrubbed out with the .isalpha bits. That will only return characters that are letters. And the string2 exception kicks out a false if a character shows up that is not in string1 at all.
â Falsenames
Jun 12 at 18:15
add a comment |Â
up vote
4
down vote
I personally would not be thrilled with that response in an interview. I mean, it's perfectly effective for exact matching, but some assumptions were made here.
An anagram is a word or phrase formed by rearranging the letters of a
different word or phrase, typically using all the original letters
exactly once
The definition explicitly states that it matches all the original letters exactly once. Not all characters are letters, however. Within the function, a match is made by sorting the strings and seeing if they are equal. The assumptions that are not great are that each letter will have the same capitalisation, and that whitespace or even punctuation matters. This may indeed be the case, but it is not explicitly stated in the definiton.
Anagrams that will fail:
"funeral" "real fun"
"Madam Curie" "Radium came"
"election results" "Lies. Let's recount"
I would prefer something like the following. I am also making some assumptions here about whether the characters will match in isalpha without checking if that is actually the case in certain character sets. But hey, I speak English so that's all I personally care about until a customer complains about a bug about my code not working in their language, right?
def is_anagram(string1, string2):
x=
for c in string1.lower():
if c.isalpha():
try:
x[c]+=1
except:
x[c]=1
for c in string2.lower():
if c.isalpha():
try:
x[c]-=1
except:
return False
for i in x.values():
if i != 0:
return False
return True
As @graipher pointed out, this can be done in a pythonier way with the following:
Counter(filter(str.isalpha, map(str.lower, s1))) == Counter(filter(str.isalpha, map(str.lower, s2)))
Still learning python, so that's my lesson on collections.Counter.
And in action:
user@computer: ~$ python3.5 anagram2.py "cat" "tra"
False
user@computer: ~$ python3.5 anagram2.py "cat" "tac"
True
user@computer: ~$ python3.5 anagram2.py "funeral" "real fun"
True
user@computer: ~$ python3.5 anagram2.py "Madam Curie" "Radium came"
True
user@computer: ~$ python3.5 anagram2.py "Election results" "Lies. Let's recount"
True
Oh snap good point. . . you are right what if the strings are separated by spaced, have special characters etc all important factors to clarify before starting to thinking about the solution. Good catch !
â nemo
Jun 7 at 23:17
1
While I like the idea behind your proposed function, it's readability is not the best (not using PEP8, one-line variable names) and has some other problems (bare excepts). I would probably just compare twocollections.Counter
objects withCounter(filter(str.isalpha, map(str.lower, s1))) == Counter(filter(str.isalpha, map(str.lower, s2)))
.
â Graipher
Jun 8 at 8:21
Falsenames I'm not a Python coder. Why isn't theexcept: return false
in the string2 handling going to cause your last example to fail because of the period? Also, what will happen if you put the period in the first string instead? Won't thex[c]=1
cause a failure?
â Sinc
Jun 8 at 14:03
Not a python code here either... as evidenced by not knowing the super awesome shortcuts that @Graipher listed. The full stops are scrubbed out with the .isalpha bits. That will only return characters that are letters. And the string2 exception kicks out a false if a character shows up that is not in string1 at all.
â Falsenames
Jun 12 at 18:15
add a comment |Â
up vote
4
down vote
up vote
4
down vote
I personally would not be thrilled with that response in an interview. I mean, it's perfectly effective for exact matching, but some assumptions were made here.
An anagram is a word or phrase formed by rearranging the letters of a
different word or phrase, typically using all the original letters
exactly once
The definition explicitly states that it matches all the original letters exactly once. Not all characters are letters, however. Within the function, a match is made by sorting the strings and seeing if they are equal. The assumptions that are not great are that each letter will have the same capitalisation, and that whitespace or even punctuation matters. This may indeed be the case, but it is not explicitly stated in the definiton.
Anagrams that will fail:
"funeral" "real fun"
"Madam Curie" "Radium came"
"election results" "Lies. Let's recount"
I would prefer something like the following. I am also making some assumptions here about whether the characters will match in isalpha without checking if that is actually the case in certain character sets. But hey, I speak English so that's all I personally care about until a customer complains about a bug about my code not working in their language, right?
def is_anagram(string1, string2):
x=
for c in string1.lower():
if c.isalpha():
try:
x[c]+=1
except:
x[c]=1
for c in string2.lower():
if c.isalpha():
try:
x[c]-=1
except:
return False
for i in x.values():
if i != 0:
return False
return True
As @graipher pointed out, this can be done in a pythonier way with the following:
Counter(filter(str.isalpha, map(str.lower, s1))) == Counter(filter(str.isalpha, map(str.lower, s2)))
Still learning python, so that's my lesson on collections.Counter.
And in action:
user@computer: ~$ python3.5 anagram2.py "cat" "tra"
False
user@computer: ~$ python3.5 anagram2.py "cat" "tac"
True
user@computer: ~$ python3.5 anagram2.py "funeral" "real fun"
True
user@computer: ~$ python3.5 anagram2.py "Madam Curie" "Radium came"
True
user@computer: ~$ python3.5 anagram2.py "Election results" "Lies. Let's recount"
True
I personally would not be thrilled with that response in an interview. I mean, it's perfectly effective for exact matching, but some assumptions were made here.
An anagram is a word or phrase formed by rearranging the letters of a
different word or phrase, typically using all the original letters
exactly once
The definition explicitly states that it matches all the original letters exactly once. Not all characters are letters, however. Within the function, a match is made by sorting the strings and seeing if they are equal. The assumptions that are not great are that each letter will have the same capitalisation, and that whitespace or even punctuation matters. This may indeed be the case, but it is not explicitly stated in the definiton.
Anagrams that will fail:
"funeral" "real fun"
"Madam Curie" "Radium came"
"election results" "Lies. Let's recount"
I would prefer something like the following. I am also making some assumptions here about whether the characters will match in isalpha without checking if that is actually the case in certain character sets. But hey, I speak English so that's all I personally care about until a customer complains about a bug about my code not working in their language, right?
def is_anagram(string1, string2):
x=
for c in string1.lower():
if c.isalpha():
try:
x[c]+=1
except:
x[c]=1
for c in string2.lower():
if c.isalpha():
try:
x[c]-=1
except:
return False
for i in x.values():
if i != 0:
return False
return True
As @graipher pointed out, this can be done in a pythonier way with the following:
Counter(filter(str.isalpha, map(str.lower, s1))) == Counter(filter(str.isalpha, map(str.lower, s2)))
Still learning python, so that's my lesson on collections.Counter.
And in action:
user@computer: ~$ python3.5 anagram2.py "cat" "tra"
False
user@computer: ~$ python3.5 anagram2.py "cat" "tac"
True
user@computer: ~$ python3.5 anagram2.py "funeral" "real fun"
True
user@computer: ~$ python3.5 anagram2.py "Madam Curie" "Radium came"
True
user@computer: ~$ python3.5 anagram2.py "Election results" "Lies. Let's recount"
True
edited Jun 8 at 18:10
answered Jun 7 at 22:53
Falsenames
1413
1413
Oh snap good point. . . you are right what if the strings are separated by spaced, have special characters etc all important factors to clarify before starting to thinking about the solution. Good catch !
â nemo
Jun 7 at 23:17
1
While I like the idea behind your proposed function, it's readability is not the best (not using PEP8, one-line variable names) and has some other problems (bare excepts). I would probably just compare twocollections.Counter
objects withCounter(filter(str.isalpha, map(str.lower, s1))) == Counter(filter(str.isalpha, map(str.lower, s2)))
.
â Graipher
Jun 8 at 8:21
Falsenames I'm not a Python coder. Why isn't theexcept: return false
in the string2 handling going to cause your last example to fail because of the period? Also, what will happen if you put the period in the first string instead? Won't thex[c]=1
cause a failure?
â Sinc
Jun 8 at 14:03
Not a python code here either... as evidenced by not knowing the super awesome shortcuts that @Graipher listed. The full stops are scrubbed out with the .isalpha bits. That will only return characters that are letters. And the string2 exception kicks out a false if a character shows up that is not in string1 at all.
â Falsenames
Jun 12 at 18:15
add a comment |Â
Oh snap good point. . . you are right what if the strings are separated by spaced, have special characters etc all important factors to clarify before starting to thinking about the solution. Good catch !
â nemo
Jun 7 at 23:17
1
While I like the idea behind your proposed function, it's readability is not the best (not using PEP8, one-line variable names) and has some other problems (bare excepts). I would probably just compare twocollections.Counter
objects withCounter(filter(str.isalpha, map(str.lower, s1))) == Counter(filter(str.isalpha, map(str.lower, s2)))
.
â Graipher
Jun 8 at 8:21
Falsenames I'm not a Python coder. Why isn't theexcept: return false
in the string2 handling going to cause your last example to fail because of the period? Also, what will happen if you put the period in the first string instead? Won't thex[c]=1
cause a failure?
â Sinc
Jun 8 at 14:03
Not a python code here either... as evidenced by not knowing the super awesome shortcuts that @Graipher listed. The full stops are scrubbed out with the .isalpha bits. That will only return characters that are letters. And the string2 exception kicks out a false if a character shows up that is not in string1 at all.
â Falsenames
Jun 12 at 18:15
Oh snap good point. . . you are right what if the strings are separated by spaced, have special characters etc all important factors to clarify before starting to thinking about the solution. Good catch !
â nemo
Jun 7 at 23:17
Oh snap good point. . . you are right what if the strings are separated by spaced, have special characters etc all important factors to clarify before starting to thinking about the solution. Good catch !
â nemo
Jun 7 at 23:17
1
1
While I like the idea behind your proposed function, it's readability is not the best (not using PEP8, one-line variable names) and has some other problems (bare excepts). I would probably just compare two
collections.Counter
objects with Counter(filter(str.isalpha, map(str.lower, s1))) == Counter(filter(str.isalpha, map(str.lower, s2)))
.â Graipher
Jun 8 at 8:21
While I like the idea behind your proposed function, it's readability is not the best (not using PEP8, one-line variable names) and has some other problems (bare excepts). I would probably just compare two
collections.Counter
objects with Counter(filter(str.isalpha, map(str.lower, s1))) == Counter(filter(str.isalpha, map(str.lower, s2)))
.â Graipher
Jun 8 at 8:21
Falsenames I'm not a Python coder. Why isn't the
except: return false
in the string2 handling going to cause your last example to fail because of the period? Also, what will happen if you put the period in the first string instead? Won't the x[c]=1
cause a failure?â Sinc
Jun 8 at 14:03
Falsenames I'm not a Python coder. Why isn't the
except: return false
in the string2 handling going to cause your last example to fail because of the period? Also, what will happen if you put the period in the first string instead? Won't the x[c]=1
cause a failure?â Sinc
Jun 8 at 14:03
Not a python code here either... as evidenced by not knowing the super awesome shortcuts that @Graipher listed. The full stops are scrubbed out with the .isalpha bits. That will only return characters that are letters. And the string2 exception kicks out a false if a character shows up that is not in string1 at all.
â Falsenames
Jun 12 at 18:15
Not a python code here either... as evidenced by not knowing the super awesome shortcuts that @Graipher listed. The full stops are scrubbed out with the .isalpha bits. That will only return characters that are letters. And the string2 exception kicks out a false if a character shows up that is not in string1 at all.
â Falsenames
Jun 12 at 18:15
add a comment |Â
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f196045%2fchecking-if-two-strings-are-anagrams-in-python%23new-answer', 'question_page');
);
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password