Functions to convert camelCase strings to snake_case
Clash Royale CLAN TAG#URR8PPP
.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;
up vote
22
down vote
favorite
I have written two functions, one actual and one helper, to convert camelCase strings into snake_case strings (I call it joint-lower case). Any ideas for improvement are welcome.
def _cc2jl(string):
"""Camel case to joint-lower helper."""
for index, current in enumerate(string):
if is_upper(current):
if index > 0:
previous = string[index-1]
if is_upper(previous):
try:
next = string[index+1]
except IndexError:
yield current.lower()
continue
else:
if is_upper(next):
yield current.lower()
continue
yield '_' + current.lower()
continue
yield current.lower()
continue
yield current
def cc2jl(string):
"""Converts a camel case string to joint-lower."""
return ''.join(_cc2jl(string))
Expected behaviour:
>>> cc2jl('m')
'm'
>>> cc2jl('AA')
'aa'
>>> cc2jl('MySQLDatabase')
'my_sql_database'
>>> cc2jl('GarbageCollection')
'garbage_collection'
>>> cc2jl('AAAAAAAAAAAAAAAAA')
'aaaaaaaaaaaaaaaaa'
python strings python-3.x
add a comment |Â
up vote
22
down vote
favorite
I have written two functions, one actual and one helper, to convert camelCase strings into snake_case strings (I call it joint-lower case). Any ideas for improvement are welcome.
def _cc2jl(string):
"""Camel case to joint-lower helper."""
for index, current in enumerate(string):
if is_upper(current):
if index > 0:
previous = string[index-1]
if is_upper(previous):
try:
next = string[index+1]
except IndexError:
yield current.lower()
continue
else:
if is_upper(next):
yield current.lower()
continue
yield '_' + current.lower()
continue
yield current.lower()
continue
yield current
def cc2jl(string):
"""Converts a camel case string to joint-lower."""
return ''.join(_cc2jl(string))
Expected behaviour:
>>> cc2jl('m')
'm'
>>> cc2jl('AA')
'aa'
>>> cc2jl('MySQLDatabase')
'my_sql_database'
>>> cc2jl('GarbageCollection')
'garbage_collection'
>>> cc2jl('AAAAAAAAAAAAAAAAA')
'aaaaaaaaaaaaaaaaa'
python strings python-3.x
add a comment |Â
up vote
22
down vote
favorite
up vote
22
down vote
favorite
I have written two functions, one actual and one helper, to convert camelCase strings into snake_case strings (I call it joint-lower case). Any ideas for improvement are welcome.
def _cc2jl(string):
"""Camel case to joint-lower helper."""
for index, current in enumerate(string):
if is_upper(current):
if index > 0:
previous = string[index-1]
if is_upper(previous):
try:
next = string[index+1]
except IndexError:
yield current.lower()
continue
else:
if is_upper(next):
yield current.lower()
continue
yield '_' + current.lower()
continue
yield current.lower()
continue
yield current
def cc2jl(string):
"""Converts a camel case string to joint-lower."""
return ''.join(_cc2jl(string))
Expected behaviour:
>>> cc2jl('m')
'm'
>>> cc2jl('AA')
'aa'
>>> cc2jl('MySQLDatabase')
'my_sql_database'
>>> cc2jl('GarbageCollection')
'garbage_collection'
>>> cc2jl('AAAAAAAAAAAAAAAAA')
'aaaaaaaaaaaaaaaaa'
python strings python-3.x
I have written two functions, one actual and one helper, to convert camelCase strings into snake_case strings (I call it joint-lower case). Any ideas for improvement are welcome.
def _cc2jl(string):
"""Camel case to joint-lower helper."""
for index, current in enumerate(string):
if is_upper(current):
if index > 0:
previous = string[index-1]
if is_upper(previous):
try:
next = string[index+1]
except IndexError:
yield current.lower()
continue
else:
if is_upper(next):
yield current.lower()
continue
yield '_' + current.lower()
continue
yield current.lower()
continue
yield current
def cc2jl(string):
"""Converts a camel case string to joint-lower."""
return ''.join(_cc2jl(string))
Expected behaviour:
>>> cc2jl('m')
'm'
>>> cc2jl('AA')
'aa'
>>> cc2jl('MySQLDatabase')
'my_sql_database'
>>> cc2jl('GarbageCollection')
'garbage_collection'
>>> cc2jl('AAAAAAAAAAAAAAAAA')
'aaaaaaaaaaaaaaaaa'
python strings python-3.x
edited Jan 25 at 23:06
Jamalâ¦
30.1k11114225
30.1k11114225
asked Jan 25 at 11:56
Richard Neumann
1,761620
1,761620
add a comment |Â
add a comment |Â
4 Answers
4
active
oldest
votes
up vote
12
down vote
accepted
I think regex would be the easiest to do this with. You just need to find each word in the string.
- The name starts with an uppercase letter.
[A-Z]
. However due to the way the rest of the regex works, we can change this to.
, so that we match all words, even ones that start with_
. The word will either contain uppercase or lowercase letters, both with other characters - excluding
_
.Uppercase:
- The word won't be lowercase or have a
_
.[^a-z_]+
- You don't want the last uppercase letter or a
_
.(?=[A-Z_])
- You want the above if it's the last in the string.
(?=[A-Z_]|$)
- The word won't be lowercase or have a
Lowercase:
- The word won't be uppercase or have a
_
.[^A-Z_]+
- The word won't be uppercase or have a
And so you can use:
(.(?:[^a-z_]+(?=[A-Z_]|$)|[^A-Z_]+))
You then want to apply the following to these:
Prepend the name with a single
_
, unless:- It's the first word in the name
- The word already starts with a
_
Convert the word to lowercase
Making:
def _jl_match(match):
group = match.group()
prefix = bool(match.start() and not group.startswith('_'))
return '_' * prefix + group.lower()
And so I'd use:
def _jl_match(match):
group = match.group()
prefix = bool(match.start() and not group.startswith('_'))
return '_' * prefix + group.lower()
REGEX = r'(.(?:[^a-z_]+(?=[A-Z_]|$)|[^A-Z_]+))'
def _cc2jl(string):
return re.subn(REGEX, _jl_match, string)[0]
Nice, I knew there should be a way to do it with a single regex, but my regex-fu was too weak to come up with one...I added your function to the timing plot, and it is the fastest (so far). I did modify it so that the regex is compiled first.
â Graipher
Jan 25 at 19:50
2
Here is a comparison of various regexes that could be used.
â 200_success
Jan 25 at 19:52
I went with your solution because of its simplicity and performance: github.com/HOMEINFO/strflib/commit/â¦
â Richard Neumann
Jan 26 at 9:02
1
@RichardNeumann Thank you for that. But it wasn't the BY part you violated - and I'm happy you didn't :) It was the SA part. If SE used CC-BY-SA 4.0 and you relicensed to GPLv3, then there wouldn't be any violation. but they use CC-BY-SA 3.0, ):
â Peilonrayz
Jan 26 at 11:33
2
@RichardNeumann That's a good point. I wonder if you can do that... IANAL too, so I guess it's time to go to Law.SE, ;P
â Peilonrayz
Jan 26 at 12:07
 |Â
show 7 more comments
up vote
16
down vote
A couple of things:
- Why 2 functions?
- Now that we established (1), you can get rid of the
yield
s.yield
is useful when you want to process stuff in chunks and do not want to wait for the entire population to be created first. This is not the case here. - There are some
continue
s that do make sense and some that are redundant. Yours are of the latter type. - Nested
if
statements without correspondingelse
clauses can be merged withand
if readability is not reduced.
Putting all that together, we get the following:
def cc2jl(my_str):
"""Camel case to joint-lower"""
r = my_str[0].lower()
for i, letter in enumerate(my_str[1:], 1):
if letter.isupper():
try:
if my_str[i-1].islower() or my_str[i+1].islower():
r += '_'
except IndexError:
pass
r += letter.lower()
return r
print(cc2jl('m')) #-> m
print(cc2jl('AA')) #-> aa
print(cc2jl('MySQLDatabase')) #-> my_sql_database
print(cc2jl('GarbageCollection')) #-> garbage_collection
print(cc2jl('AAAAAAAAAAAAAAAAA')) #-> aaaaaaaaaaaaaaaaa
Finally, some thoughts on avoiding the try
block that bothers me personally.
We can leverage from the short-circuiting used in Python's logical expressions and (re-)write the following:
def cc2jl(my_str):
"""Camel case to joint-lower"""
r = my_str[0].lower()
for i, letter in enumerate(my_str[1:], 1):
if letter.isupper():
if my_str[i-1].islower() or (i != len(my_str)-1 and my_str[i+1].islower()):
r += '_'
r += letter.lower()
return r
Note the i != len(my_str)-1
on the left side of the and
. If this returns False
the my_str[i+1]
is not evaluated at all and thus cannot raise
the IndexError
.
Nice job cleaning up OP's code! I added your function to the timing plot in my answer, it is faster than my regex function and the OP's function (probably due to getting rid of the overhead of the generator).
â Graipher
Jan 25 at 14:07
@Graipher The thing that I do not like the most about my approach is thetry-except
block but couldn't think of something more elegant.
â Ev. Kounis
Jan 25 at 14:11
You could replace it withif not 0 < i < len(my_str) - 1
, but I'm not sure that is any better, because it needs to check it every iteration, instead of only failing at the beginning or end of the string.
â Graipher
Jan 25 at 14:15
2
@Graipher the0
I do not need since I am iterating overmy_str[1:]
.I tried going overmy_str[1:-1]
instead and handling the last letter in thereturn
line but that wasn't any better either. Anyway, thanks!
â Ev. Kounis
Jan 25 at 14:17
add a comment |Â
up vote
12
down vote
First a note on naming: cc2jl
is a very cryptic name. Give the public function a clearer name, like to_snake_case
or something similar (note that the function does not do anything to a string that already is in snake_case
).
While I quite like the fact that you used a generator approach, which guarantees that you make only on pass over the string, this seems like the perfect place for a regular expression. Some quick googling finds this rather simple function:
import re
def convert(name):
s1 = re.sub('(.)([A-Z][a-z]+)', r'1_2', name)
return re.sub('([a-z0-9])([A-Z])', r'1_2', s1).lower()
This is definitely shorter, even though that cryptic regex is not very easy to understand. But following your code, while written quite straightforward, is also not easy, especially with the deep nesting.
It works by first splitting runs of upper-case letters followed by one or more lowercase letters so that the last upper-case letter goes to the lower-case letter run. So "ABCdef"
becomes "AB_Cdef"
.
Then it separates runs of not-upper-case letters, followed by a single upper-case letter with a "_"
, so "abcD"
becomes "abc_D"
.
To get a bit more performance out of this, you should pre-compile the regexes:
first_cap_re = re.compile('(.)([A-Z][a-z]+)')
all_cap_re = re.compile('([a-z0-9])([A-Z])')
def convert(name):
s1 = first_cap_re.sub(r'1_2', name)
return all_cap_re.sub(r'1_2', s1).lower()
Performance wise, all algorithms are similar:
Not quite surprising, they all give a linear behavior (as a function of string length).
My regex version needs to make two passes over the string and is therefore consistently slower. The function written by @Ev.Kounis in his answer outperforms both of ours, but the regex approach by @Peilonrayz is even faster, because it manages to do only one pass, but with a regex.
Note that all functions are very fast, so as long as you need this less than a few thousand times per second, any is fine.
The test strings were generated with this code:
import random
import string
strings = [''.join(random.choice(string.ascii_letters) for _ in range(n))
for n in range(10, 2000, 20)]
In Python 3 it could have been:
strings = [''.join(random.choices(string.ascii_letters, k=k))
for k in range(10, 2000, 20)]
I have the feeling that I would love this answer if I understoodregex
. My mind is too small for it though. +1
â Ev. Kounis
Jan 25 at 14:08
@Ev.Kounis I totally agree (since I did not come up with it myself, see the link for the source). Here is at least some explanation: The first substitution takes care of continuous runs of caps (so it splits of the last capital letter from a run of capital letters), while the last substitution splits runs of lower case letters followed by an upper case letter.
â Graipher
Jan 25 at 14:13
Thank you for this insightful analysis and performance measurement. Because of the latter, I accepted @peilonrayz solution.
â Richard Neumann
Jan 26 at 9:02
add a comment |Â
up vote
1
down vote
It can be a bit shorter:
def cc2jl(s):
return "".join(["_"+l if i and l.isupper() and not s[i-1:i+2].isupper() else l for i, l in enumerate(s)]).lower()
Regular Expression Alternative:
rx = re.compile(r"(?<=.)(((?<![A-Z])[A-Z])|([A-Z](?=[a-z])))")
def cc2jl(s):
return rx.sub("_\1", s).lower()
My observation is the fact that the code is longer than it is necessary, my answer is the alternate solution. Readability is subjective, compared to regex I think it is easier to read.
â M. Utku ALTINKAYA
Jan 26 at 6:14
1
@Graipher fixed
â M. Utku ALTINKAYA
Jan 26 at 8:06
1
Added your function to the timings
â Graipher
Jan 26 at 9:12
"(...) compared to regex I think it is easier to read". Nope
â ÃÂïàú
Jan 26 at 11:24
Thanks @Graipher, I've revised that code and added a regex alternative also
â M. Utku ALTINKAYA
Jan 26 at 12:11
add a comment |Â
4 Answers
4
active
oldest
votes
4 Answers
4
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
12
down vote
accepted
I think regex would be the easiest to do this with. You just need to find each word in the string.
- The name starts with an uppercase letter.
[A-Z]
. However due to the way the rest of the regex works, we can change this to.
, so that we match all words, even ones that start with_
. The word will either contain uppercase or lowercase letters, both with other characters - excluding
_
.Uppercase:
- The word won't be lowercase or have a
_
.[^a-z_]+
- You don't want the last uppercase letter or a
_
.(?=[A-Z_])
- You want the above if it's the last in the string.
(?=[A-Z_]|$)
- The word won't be lowercase or have a
Lowercase:
- The word won't be uppercase or have a
_
.[^A-Z_]+
- The word won't be uppercase or have a
And so you can use:
(.(?:[^a-z_]+(?=[A-Z_]|$)|[^A-Z_]+))
You then want to apply the following to these:
Prepend the name with a single
_
, unless:- It's the first word in the name
- The word already starts with a
_
Convert the word to lowercase
Making:
def _jl_match(match):
group = match.group()
prefix = bool(match.start() and not group.startswith('_'))
return '_' * prefix + group.lower()
And so I'd use:
def _jl_match(match):
group = match.group()
prefix = bool(match.start() and not group.startswith('_'))
return '_' * prefix + group.lower()
REGEX = r'(.(?:[^a-z_]+(?=[A-Z_]|$)|[^A-Z_]+))'
def _cc2jl(string):
return re.subn(REGEX, _jl_match, string)[0]
Nice, I knew there should be a way to do it with a single regex, but my regex-fu was too weak to come up with one...I added your function to the timing plot, and it is the fastest (so far). I did modify it so that the regex is compiled first.
â Graipher
Jan 25 at 19:50
2
Here is a comparison of various regexes that could be used.
â 200_success
Jan 25 at 19:52
I went with your solution because of its simplicity and performance: github.com/HOMEINFO/strflib/commit/â¦
â Richard Neumann
Jan 26 at 9:02
1
@RichardNeumann Thank you for that. But it wasn't the BY part you violated - and I'm happy you didn't :) It was the SA part. If SE used CC-BY-SA 4.0 and you relicensed to GPLv3, then there wouldn't be any violation. but they use CC-BY-SA 3.0, ):
â Peilonrayz
Jan 26 at 11:33
2
@RichardNeumann That's a good point. I wonder if you can do that... IANAL too, so I guess it's time to go to Law.SE, ;P
â Peilonrayz
Jan 26 at 12:07
 |Â
show 7 more comments
up vote
12
down vote
accepted
I think regex would be the easiest to do this with. You just need to find each word in the string.
- The name starts with an uppercase letter.
[A-Z]
. However due to the way the rest of the regex works, we can change this to.
, so that we match all words, even ones that start with_
. The word will either contain uppercase or lowercase letters, both with other characters - excluding
_
.Uppercase:
- The word won't be lowercase or have a
_
.[^a-z_]+
- You don't want the last uppercase letter or a
_
.(?=[A-Z_])
- You want the above if it's the last in the string.
(?=[A-Z_]|$)
- The word won't be lowercase or have a
Lowercase:
- The word won't be uppercase or have a
_
.[^A-Z_]+
- The word won't be uppercase or have a
And so you can use:
(.(?:[^a-z_]+(?=[A-Z_]|$)|[^A-Z_]+))
You then want to apply the following to these:
Prepend the name with a single
_
, unless:- It's the first word in the name
- The word already starts with a
_
Convert the word to lowercase
Making:
def _jl_match(match):
group = match.group()
prefix = bool(match.start() and not group.startswith('_'))
return '_' * prefix + group.lower()
And so I'd use:
def _jl_match(match):
group = match.group()
prefix = bool(match.start() and not group.startswith('_'))
return '_' * prefix + group.lower()
REGEX = r'(.(?:[^a-z_]+(?=[A-Z_]|$)|[^A-Z_]+))'
def _cc2jl(string):
return re.subn(REGEX, _jl_match, string)[0]
Nice, I knew there should be a way to do it with a single regex, but my regex-fu was too weak to come up with one...I added your function to the timing plot, and it is the fastest (so far). I did modify it so that the regex is compiled first.
â Graipher
Jan 25 at 19:50
2
Here is a comparison of various regexes that could be used.
â 200_success
Jan 25 at 19:52
I went with your solution because of its simplicity and performance: github.com/HOMEINFO/strflib/commit/â¦
â Richard Neumann
Jan 26 at 9:02
1
@RichardNeumann Thank you for that. But it wasn't the BY part you violated - and I'm happy you didn't :) It was the SA part. If SE used CC-BY-SA 4.0 and you relicensed to GPLv3, then there wouldn't be any violation. but they use CC-BY-SA 3.0, ):
â Peilonrayz
Jan 26 at 11:33
2
@RichardNeumann That's a good point. I wonder if you can do that... IANAL too, so I guess it's time to go to Law.SE, ;P
â Peilonrayz
Jan 26 at 12:07
 |Â
show 7 more comments
up vote
12
down vote
accepted
up vote
12
down vote
accepted
I think regex would be the easiest to do this with. You just need to find each word in the string.
- The name starts with an uppercase letter.
[A-Z]
. However due to the way the rest of the regex works, we can change this to.
, so that we match all words, even ones that start with_
. The word will either contain uppercase or lowercase letters, both with other characters - excluding
_
.Uppercase:
- The word won't be lowercase or have a
_
.[^a-z_]+
- You don't want the last uppercase letter or a
_
.(?=[A-Z_])
- You want the above if it's the last in the string.
(?=[A-Z_]|$)
- The word won't be lowercase or have a
Lowercase:
- The word won't be uppercase or have a
_
.[^A-Z_]+
- The word won't be uppercase or have a
And so you can use:
(.(?:[^a-z_]+(?=[A-Z_]|$)|[^A-Z_]+))
You then want to apply the following to these:
Prepend the name with a single
_
, unless:- It's the first word in the name
- The word already starts with a
_
Convert the word to lowercase
Making:
def _jl_match(match):
group = match.group()
prefix = bool(match.start() and not group.startswith('_'))
return '_' * prefix + group.lower()
And so I'd use:
def _jl_match(match):
group = match.group()
prefix = bool(match.start() and not group.startswith('_'))
return '_' * prefix + group.lower()
REGEX = r'(.(?:[^a-z_]+(?=[A-Z_]|$)|[^A-Z_]+))'
def _cc2jl(string):
return re.subn(REGEX, _jl_match, string)[0]
I think regex would be the easiest to do this with. You just need to find each word in the string.
- The name starts with an uppercase letter.
[A-Z]
. However due to the way the rest of the regex works, we can change this to.
, so that we match all words, even ones that start with_
. The word will either contain uppercase or lowercase letters, both with other characters - excluding
_
.Uppercase:
- The word won't be lowercase or have a
_
.[^a-z_]+
- You don't want the last uppercase letter or a
_
.(?=[A-Z_])
- You want the above if it's the last in the string.
(?=[A-Z_]|$)
- The word won't be lowercase or have a
Lowercase:
- The word won't be uppercase or have a
_
.[^A-Z_]+
- The word won't be uppercase or have a
And so you can use:
(.(?:[^a-z_]+(?=[A-Z_]|$)|[^A-Z_]+))
You then want to apply the following to these:
Prepend the name with a single
_
, unless:- It's the first word in the name
- The word already starts with a
_
Convert the word to lowercase
Making:
def _jl_match(match):
group = match.group()
prefix = bool(match.start() and not group.startswith('_'))
return '_' * prefix + group.lower()
And so I'd use:
def _jl_match(match):
group = match.group()
prefix = bool(match.start() and not group.startswith('_'))
return '_' * prefix + group.lower()
REGEX = r'(.(?:[^a-z_]+(?=[A-Z_]|$)|[^A-Z_]+))'
def _cc2jl(string):
return re.subn(REGEX, _jl_match, string)[0]
edited Jan 25 at 16:24
answered Jan 25 at 16:17
Peilonrayz
24.4k336102
24.4k336102
Nice, I knew there should be a way to do it with a single regex, but my regex-fu was too weak to come up with one...I added your function to the timing plot, and it is the fastest (so far). I did modify it so that the regex is compiled first.
â Graipher
Jan 25 at 19:50
2
Here is a comparison of various regexes that could be used.
â 200_success
Jan 25 at 19:52
I went with your solution because of its simplicity and performance: github.com/HOMEINFO/strflib/commit/â¦
â Richard Neumann
Jan 26 at 9:02
1
@RichardNeumann Thank you for that. But it wasn't the BY part you violated - and I'm happy you didn't :) It was the SA part. If SE used CC-BY-SA 4.0 and you relicensed to GPLv3, then there wouldn't be any violation. but they use CC-BY-SA 3.0, ):
â Peilonrayz
Jan 26 at 11:33
2
@RichardNeumann That's a good point. I wonder if you can do that... IANAL too, so I guess it's time to go to Law.SE, ;P
â Peilonrayz
Jan 26 at 12:07
 |Â
show 7 more comments
Nice, I knew there should be a way to do it with a single regex, but my regex-fu was too weak to come up with one...I added your function to the timing plot, and it is the fastest (so far). I did modify it so that the regex is compiled first.
â Graipher
Jan 25 at 19:50
2
Here is a comparison of various regexes that could be used.
â 200_success
Jan 25 at 19:52
I went with your solution because of its simplicity and performance: github.com/HOMEINFO/strflib/commit/â¦
â Richard Neumann
Jan 26 at 9:02
1
@RichardNeumann Thank you for that. But it wasn't the BY part you violated - and I'm happy you didn't :) It was the SA part. If SE used CC-BY-SA 4.0 and you relicensed to GPLv3, then there wouldn't be any violation. but they use CC-BY-SA 3.0, ):
â Peilonrayz
Jan 26 at 11:33
2
@RichardNeumann That's a good point. I wonder if you can do that... IANAL too, so I guess it's time to go to Law.SE, ;P
â Peilonrayz
Jan 26 at 12:07
Nice, I knew there should be a way to do it with a single regex, but my regex-fu was too weak to come up with one...I added your function to the timing plot, and it is the fastest (so far). I did modify it so that the regex is compiled first.
â Graipher
Jan 25 at 19:50
Nice, I knew there should be a way to do it with a single regex, but my regex-fu was too weak to come up with one...I added your function to the timing plot, and it is the fastest (so far). I did modify it so that the regex is compiled first.
â Graipher
Jan 25 at 19:50
2
2
Here is a comparison of various regexes that could be used.
â 200_success
Jan 25 at 19:52
Here is a comparison of various regexes that could be used.
â 200_success
Jan 25 at 19:52
I went with your solution because of its simplicity and performance: github.com/HOMEINFO/strflib/commit/â¦
â Richard Neumann
Jan 26 at 9:02
I went with your solution because of its simplicity and performance: github.com/HOMEINFO/strflib/commit/â¦
â Richard Neumann
Jan 26 at 9:02
1
1
@RichardNeumann Thank you for that. But it wasn't the BY part you violated - and I'm happy you didn't :) It was the SA part. If SE used CC-BY-SA 4.0 and you relicensed to GPLv3, then there wouldn't be any violation. but they use CC-BY-SA 3.0, ):
â Peilonrayz
Jan 26 at 11:33
@RichardNeumann Thank you for that. But it wasn't the BY part you violated - and I'm happy you didn't :) It was the SA part. If SE used CC-BY-SA 4.0 and you relicensed to GPLv3, then there wouldn't be any violation. but they use CC-BY-SA 3.0, ):
â Peilonrayz
Jan 26 at 11:33
2
2
@RichardNeumann That's a good point. I wonder if you can do that... IANAL too, so I guess it's time to go to Law.SE, ;P
â Peilonrayz
Jan 26 at 12:07
@RichardNeumann That's a good point. I wonder if you can do that... IANAL too, so I guess it's time to go to Law.SE, ;P
â Peilonrayz
Jan 26 at 12:07
 |Â
show 7 more comments
up vote
16
down vote
A couple of things:
- Why 2 functions?
- Now that we established (1), you can get rid of the
yield
s.yield
is useful when you want to process stuff in chunks and do not want to wait for the entire population to be created first. This is not the case here. - There are some
continue
s that do make sense and some that are redundant. Yours are of the latter type. - Nested
if
statements without correspondingelse
clauses can be merged withand
if readability is not reduced.
Putting all that together, we get the following:
def cc2jl(my_str):
"""Camel case to joint-lower"""
r = my_str[0].lower()
for i, letter in enumerate(my_str[1:], 1):
if letter.isupper():
try:
if my_str[i-1].islower() or my_str[i+1].islower():
r += '_'
except IndexError:
pass
r += letter.lower()
return r
print(cc2jl('m')) #-> m
print(cc2jl('AA')) #-> aa
print(cc2jl('MySQLDatabase')) #-> my_sql_database
print(cc2jl('GarbageCollection')) #-> garbage_collection
print(cc2jl('AAAAAAAAAAAAAAAAA')) #-> aaaaaaaaaaaaaaaaa
Finally, some thoughts on avoiding the try
block that bothers me personally.
We can leverage from the short-circuiting used in Python's logical expressions and (re-)write the following:
def cc2jl(my_str):
"""Camel case to joint-lower"""
r = my_str[0].lower()
for i, letter in enumerate(my_str[1:], 1):
if letter.isupper():
if my_str[i-1].islower() or (i != len(my_str)-1 and my_str[i+1].islower()):
r += '_'
r += letter.lower()
return r
Note the i != len(my_str)-1
on the left side of the and
. If this returns False
the my_str[i+1]
is not evaluated at all and thus cannot raise
the IndexError
.
Nice job cleaning up OP's code! I added your function to the timing plot in my answer, it is faster than my regex function and the OP's function (probably due to getting rid of the overhead of the generator).
â Graipher
Jan 25 at 14:07
@Graipher The thing that I do not like the most about my approach is thetry-except
block but couldn't think of something more elegant.
â Ev. Kounis
Jan 25 at 14:11
You could replace it withif not 0 < i < len(my_str) - 1
, but I'm not sure that is any better, because it needs to check it every iteration, instead of only failing at the beginning or end of the string.
â Graipher
Jan 25 at 14:15
2
@Graipher the0
I do not need since I am iterating overmy_str[1:]
.I tried going overmy_str[1:-1]
instead and handling the last letter in thereturn
line but that wasn't any better either. Anyway, thanks!
â Ev. Kounis
Jan 25 at 14:17
add a comment |Â
up vote
16
down vote
A couple of things:
- Why 2 functions?
- Now that we established (1), you can get rid of the
yield
s.yield
is useful when you want to process stuff in chunks and do not want to wait for the entire population to be created first. This is not the case here. - There are some
continue
s that do make sense and some that are redundant. Yours are of the latter type. - Nested
if
statements without correspondingelse
clauses can be merged withand
if readability is not reduced.
Putting all that together, we get the following:
def cc2jl(my_str):
"""Camel case to joint-lower"""
r = my_str[0].lower()
for i, letter in enumerate(my_str[1:], 1):
if letter.isupper():
try:
if my_str[i-1].islower() or my_str[i+1].islower():
r += '_'
except IndexError:
pass
r += letter.lower()
return r
print(cc2jl('m')) #-> m
print(cc2jl('AA')) #-> aa
print(cc2jl('MySQLDatabase')) #-> my_sql_database
print(cc2jl('GarbageCollection')) #-> garbage_collection
print(cc2jl('AAAAAAAAAAAAAAAAA')) #-> aaaaaaaaaaaaaaaaa
Finally, some thoughts on avoiding the try
block that bothers me personally.
We can leverage from the short-circuiting used in Python's logical expressions and (re-)write the following:
def cc2jl(my_str):
"""Camel case to joint-lower"""
r = my_str[0].lower()
for i, letter in enumerate(my_str[1:], 1):
if letter.isupper():
if my_str[i-1].islower() or (i != len(my_str)-1 and my_str[i+1].islower()):
r += '_'
r += letter.lower()
return r
Note the i != len(my_str)-1
on the left side of the and
. If this returns False
the my_str[i+1]
is not evaluated at all and thus cannot raise
the IndexError
.
Nice job cleaning up OP's code! I added your function to the timing plot in my answer, it is faster than my regex function and the OP's function (probably due to getting rid of the overhead of the generator).
â Graipher
Jan 25 at 14:07
@Graipher The thing that I do not like the most about my approach is thetry-except
block but couldn't think of something more elegant.
â Ev. Kounis
Jan 25 at 14:11
You could replace it withif not 0 < i < len(my_str) - 1
, but I'm not sure that is any better, because it needs to check it every iteration, instead of only failing at the beginning or end of the string.
â Graipher
Jan 25 at 14:15
2
@Graipher the0
I do not need since I am iterating overmy_str[1:]
.I tried going overmy_str[1:-1]
instead and handling the last letter in thereturn
line but that wasn't any better either. Anyway, thanks!
â Ev. Kounis
Jan 25 at 14:17
add a comment |Â
up vote
16
down vote
up vote
16
down vote
A couple of things:
- Why 2 functions?
- Now that we established (1), you can get rid of the
yield
s.yield
is useful when you want to process stuff in chunks and do not want to wait for the entire population to be created first. This is not the case here. - There are some
continue
s that do make sense and some that are redundant. Yours are of the latter type. - Nested
if
statements without correspondingelse
clauses can be merged withand
if readability is not reduced.
Putting all that together, we get the following:
def cc2jl(my_str):
"""Camel case to joint-lower"""
r = my_str[0].lower()
for i, letter in enumerate(my_str[1:], 1):
if letter.isupper():
try:
if my_str[i-1].islower() or my_str[i+1].islower():
r += '_'
except IndexError:
pass
r += letter.lower()
return r
print(cc2jl('m')) #-> m
print(cc2jl('AA')) #-> aa
print(cc2jl('MySQLDatabase')) #-> my_sql_database
print(cc2jl('GarbageCollection')) #-> garbage_collection
print(cc2jl('AAAAAAAAAAAAAAAAA')) #-> aaaaaaaaaaaaaaaaa
Finally, some thoughts on avoiding the try
block that bothers me personally.
We can leverage from the short-circuiting used in Python's logical expressions and (re-)write the following:
def cc2jl(my_str):
"""Camel case to joint-lower"""
r = my_str[0].lower()
for i, letter in enumerate(my_str[1:], 1):
if letter.isupper():
if my_str[i-1].islower() or (i != len(my_str)-1 and my_str[i+1].islower()):
r += '_'
r += letter.lower()
return r
Note the i != len(my_str)-1
on the left side of the and
. If this returns False
the my_str[i+1]
is not evaluated at all and thus cannot raise
the IndexError
.
A couple of things:
- Why 2 functions?
- Now that we established (1), you can get rid of the
yield
s.yield
is useful when you want to process stuff in chunks and do not want to wait for the entire population to be created first. This is not the case here. - There are some
continue
s that do make sense and some that are redundant. Yours are of the latter type. - Nested
if
statements without correspondingelse
clauses can be merged withand
if readability is not reduced.
Putting all that together, we get the following:
def cc2jl(my_str):
"""Camel case to joint-lower"""
r = my_str[0].lower()
for i, letter in enumerate(my_str[1:], 1):
if letter.isupper():
try:
if my_str[i-1].islower() or my_str[i+1].islower():
r += '_'
except IndexError:
pass
r += letter.lower()
return r
print(cc2jl('m')) #-> m
print(cc2jl('AA')) #-> aa
print(cc2jl('MySQLDatabase')) #-> my_sql_database
print(cc2jl('GarbageCollection')) #-> garbage_collection
print(cc2jl('AAAAAAAAAAAAAAAAA')) #-> aaaaaaaaaaaaaaaaa
Finally, some thoughts on avoiding the try
block that bothers me personally.
We can leverage from the short-circuiting used in Python's logical expressions and (re-)write the following:
def cc2jl(my_str):
"""Camel case to joint-lower"""
r = my_str[0].lower()
for i, letter in enumerate(my_str[1:], 1):
if letter.isupper():
if my_str[i-1].islower() or (i != len(my_str)-1 and my_str[i+1].islower()):
r += '_'
r += letter.lower()
return r
Note the i != len(my_str)-1
on the left side of the and
. If this returns False
the my_str[i+1]
is not evaluated at all and thus cannot raise
the IndexError
.
edited Jan 25 at 14:23
answered Jan 25 at 14:04
Ev. Kounis
620313
620313
Nice job cleaning up OP's code! I added your function to the timing plot in my answer, it is faster than my regex function and the OP's function (probably due to getting rid of the overhead of the generator).
â Graipher
Jan 25 at 14:07
@Graipher The thing that I do not like the most about my approach is thetry-except
block but couldn't think of something more elegant.
â Ev. Kounis
Jan 25 at 14:11
You could replace it withif not 0 < i < len(my_str) - 1
, but I'm not sure that is any better, because it needs to check it every iteration, instead of only failing at the beginning or end of the string.
â Graipher
Jan 25 at 14:15
2
@Graipher the0
I do not need since I am iterating overmy_str[1:]
.I tried going overmy_str[1:-1]
instead and handling the last letter in thereturn
line but that wasn't any better either. Anyway, thanks!
â Ev. Kounis
Jan 25 at 14:17
add a comment |Â
Nice job cleaning up OP's code! I added your function to the timing plot in my answer, it is faster than my regex function and the OP's function (probably due to getting rid of the overhead of the generator).
â Graipher
Jan 25 at 14:07
@Graipher The thing that I do not like the most about my approach is thetry-except
block but couldn't think of something more elegant.
â Ev. Kounis
Jan 25 at 14:11
You could replace it withif not 0 < i < len(my_str) - 1
, but I'm not sure that is any better, because it needs to check it every iteration, instead of only failing at the beginning or end of the string.
â Graipher
Jan 25 at 14:15
2
@Graipher the0
I do not need since I am iterating overmy_str[1:]
.I tried going overmy_str[1:-1]
instead and handling the last letter in thereturn
line but that wasn't any better either. Anyway, thanks!
â Ev. Kounis
Jan 25 at 14:17
Nice job cleaning up OP's code! I added your function to the timing plot in my answer, it is faster than my regex function and the OP's function (probably due to getting rid of the overhead of the generator).
â Graipher
Jan 25 at 14:07
Nice job cleaning up OP's code! I added your function to the timing plot in my answer, it is faster than my regex function and the OP's function (probably due to getting rid of the overhead of the generator).
â Graipher
Jan 25 at 14:07
@Graipher The thing that I do not like the most about my approach is the
try-except
block but couldn't think of something more elegant.â Ev. Kounis
Jan 25 at 14:11
@Graipher The thing that I do not like the most about my approach is the
try-except
block but couldn't think of something more elegant.â Ev. Kounis
Jan 25 at 14:11
You could replace it with
if not 0 < i < len(my_str) - 1
, but I'm not sure that is any better, because it needs to check it every iteration, instead of only failing at the beginning or end of the string.â Graipher
Jan 25 at 14:15
You could replace it with
if not 0 < i < len(my_str) - 1
, but I'm not sure that is any better, because it needs to check it every iteration, instead of only failing at the beginning or end of the string.â Graipher
Jan 25 at 14:15
2
2
@Graipher the
0
I do not need since I am iterating over my_str[1:]
.I tried going over my_str[1:-1]
instead and handling the last letter in the return
line but that wasn't any better either. Anyway, thanks!â Ev. Kounis
Jan 25 at 14:17
@Graipher the
0
I do not need since I am iterating over my_str[1:]
.I tried going over my_str[1:-1]
instead and handling the last letter in the return
line but that wasn't any better either. Anyway, thanks!â Ev. Kounis
Jan 25 at 14:17
add a comment |Â
up vote
12
down vote
First a note on naming: cc2jl
is a very cryptic name. Give the public function a clearer name, like to_snake_case
or something similar (note that the function does not do anything to a string that already is in snake_case
).
While I quite like the fact that you used a generator approach, which guarantees that you make only on pass over the string, this seems like the perfect place for a regular expression. Some quick googling finds this rather simple function:
import re
def convert(name):
s1 = re.sub('(.)([A-Z][a-z]+)', r'1_2', name)
return re.sub('([a-z0-9])([A-Z])', r'1_2', s1).lower()
This is definitely shorter, even though that cryptic regex is not very easy to understand. But following your code, while written quite straightforward, is also not easy, especially with the deep nesting.
It works by first splitting runs of upper-case letters followed by one or more lowercase letters so that the last upper-case letter goes to the lower-case letter run. So "ABCdef"
becomes "AB_Cdef"
.
Then it separates runs of not-upper-case letters, followed by a single upper-case letter with a "_"
, so "abcD"
becomes "abc_D"
.
To get a bit more performance out of this, you should pre-compile the regexes:
first_cap_re = re.compile('(.)([A-Z][a-z]+)')
all_cap_re = re.compile('([a-z0-9])([A-Z])')
def convert(name):
s1 = first_cap_re.sub(r'1_2', name)
return all_cap_re.sub(r'1_2', s1).lower()
Performance wise, all algorithms are similar:
Not quite surprising, they all give a linear behavior (as a function of string length).
My regex version needs to make two passes over the string and is therefore consistently slower. The function written by @Ev.Kounis in his answer outperforms both of ours, but the regex approach by @Peilonrayz is even faster, because it manages to do only one pass, but with a regex.
Note that all functions are very fast, so as long as you need this less than a few thousand times per second, any is fine.
The test strings were generated with this code:
import random
import string
strings = [''.join(random.choice(string.ascii_letters) for _ in range(n))
for n in range(10, 2000, 20)]
In Python 3 it could have been:
strings = [''.join(random.choices(string.ascii_letters, k=k))
for k in range(10, 2000, 20)]
I have the feeling that I would love this answer if I understoodregex
. My mind is too small for it though. +1
â Ev. Kounis
Jan 25 at 14:08
@Ev.Kounis I totally agree (since I did not come up with it myself, see the link for the source). Here is at least some explanation: The first substitution takes care of continuous runs of caps (so it splits of the last capital letter from a run of capital letters), while the last substitution splits runs of lower case letters followed by an upper case letter.
â Graipher
Jan 25 at 14:13
Thank you for this insightful analysis and performance measurement. Because of the latter, I accepted @peilonrayz solution.
â Richard Neumann
Jan 26 at 9:02
add a comment |Â
up vote
12
down vote
First a note on naming: cc2jl
is a very cryptic name. Give the public function a clearer name, like to_snake_case
or something similar (note that the function does not do anything to a string that already is in snake_case
).
While I quite like the fact that you used a generator approach, which guarantees that you make only on pass over the string, this seems like the perfect place for a regular expression. Some quick googling finds this rather simple function:
import re
def convert(name):
s1 = re.sub('(.)([A-Z][a-z]+)', r'1_2', name)
return re.sub('([a-z0-9])([A-Z])', r'1_2', s1).lower()
This is definitely shorter, even though that cryptic regex is not very easy to understand. But following your code, while written quite straightforward, is also not easy, especially with the deep nesting.
It works by first splitting runs of upper-case letters followed by one or more lowercase letters so that the last upper-case letter goes to the lower-case letter run. So "ABCdef"
becomes "AB_Cdef"
.
Then it separates runs of not-upper-case letters, followed by a single upper-case letter with a "_"
, so "abcD"
becomes "abc_D"
.
To get a bit more performance out of this, you should pre-compile the regexes:
first_cap_re = re.compile('(.)([A-Z][a-z]+)')
all_cap_re = re.compile('([a-z0-9])([A-Z])')
def convert(name):
s1 = first_cap_re.sub(r'1_2', name)
return all_cap_re.sub(r'1_2', s1).lower()
Performance wise, all algorithms are similar:
Not quite surprising, they all give a linear behavior (as a function of string length).
My regex version needs to make two passes over the string and is therefore consistently slower. The function written by @Ev.Kounis in his answer outperforms both of ours, but the regex approach by @Peilonrayz is even faster, because it manages to do only one pass, but with a regex.
Note that all functions are very fast, so as long as you need this less than a few thousand times per second, any is fine.
The test strings were generated with this code:
import random
import string
strings = [''.join(random.choice(string.ascii_letters) for _ in range(n))
for n in range(10, 2000, 20)]
In Python 3 it could have been:
strings = [''.join(random.choices(string.ascii_letters, k=k))
for k in range(10, 2000, 20)]
I have the feeling that I would love this answer if I understoodregex
. My mind is too small for it though. +1
â Ev. Kounis
Jan 25 at 14:08
@Ev.Kounis I totally agree (since I did not come up with it myself, see the link for the source). Here is at least some explanation: The first substitution takes care of continuous runs of caps (so it splits of the last capital letter from a run of capital letters), while the last substitution splits runs of lower case letters followed by an upper case letter.
â Graipher
Jan 25 at 14:13
Thank you for this insightful analysis and performance measurement. Because of the latter, I accepted @peilonrayz solution.
â Richard Neumann
Jan 26 at 9:02
add a comment |Â
up vote
12
down vote
up vote
12
down vote
First a note on naming: cc2jl
is a very cryptic name. Give the public function a clearer name, like to_snake_case
or something similar (note that the function does not do anything to a string that already is in snake_case
).
While I quite like the fact that you used a generator approach, which guarantees that you make only on pass over the string, this seems like the perfect place for a regular expression. Some quick googling finds this rather simple function:
import re
def convert(name):
s1 = re.sub('(.)([A-Z][a-z]+)', r'1_2', name)
return re.sub('([a-z0-9])([A-Z])', r'1_2', s1).lower()
This is definitely shorter, even though that cryptic regex is not very easy to understand. But following your code, while written quite straightforward, is also not easy, especially with the deep nesting.
It works by first splitting runs of upper-case letters followed by one or more lowercase letters so that the last upper-case letter goes to the lower-case letter run. So "ABCdef"
becomes "AB_Cdef"
.
Then it separates runs of not-upper-case letters, followed by a single upper-case letter with a "_"
, so "abcD"
becomes "abc_D"
.
To get a bit more performance out of this, you should pre-compile the regexes:
first_cap_re = re.compile('(.)([A-Z][a-z]+)')
all_cap_re = re.compile('([a-z0-9])([A-Z])')
def convert(name):
s1 = first_cap_re.sub(r'1_2', name)
return all_cap_re.sub(r'1_2', s1).lower()
Performance wise, all algorithms are similar:
Not quite surprising, they all give a linear behavior (as a function of string length).
My regex version needs to make two passes over the string and is therefore consistently slower. The function written by @Ev.Kounis in his answer outperforms both of ours, but the regex approach by @Peilonrayz is even faster, because it manages to do only one pass, but with a regex.
Note that all functions are very fast, so as long as you need this less than a few thousand times per second, any is fine.
The test strings were generated with this code:
import random
import string
strings = [''.join(random.choice(string.ascii_letters) for _ in range(n))
for n in range(10, 2000, 20)]
In Python 3 it could have been:
strings = [''.join(random.choices(string.ascii_letters, k=k))
for k in range(10, 2000, 20)]
First a note on naming: cc2jl
is a very cryptic name. Give the public function a clearer name, like to_snake_case
or something similar (note that the function does not do anything to a string that already is in snake_case
).
While I quite like the fact that you used a generator approach, which guarantees that you make only on pass over the string, this seems like the perfect place for a regular expression. Some quick googling finds this rather simple function:
import re
def convert(name):
s1 = re.sub('(.)([A-Z][a-z]+)', r'1_2', name)
return re.sub('([a-z0-9])([A-Z])', r'1_2', s1).lower()
This is definitely shorter, even though that cryptic regex is not very easy to understand. But following your code, while written quite straightforward, is also not easy, especially with the deep nesting.
It works by first splitting runs of upper-case letters followed by one or more lowercase letters so that the last upper-case letter goes to the lower-case letter run. So "ABCdef"
becomes "AB_Cdef"
.
Then it separates runs of not-upper-case letters, followed by a single upper-case letter with a "_"
, so "abcD"
becomes "abc_D"
.
To get a bit more performance out of this, you should pre-compile the regexes:
first_cap_re = re.compile('(.)([A-Z][a-z]+)')
all_cap_re = re.compile('([a-z0-9])([A-Z])')
def convert(name):
s1 = first_cap_re.sub(r'1_2', name)
return all_cap_re.sub(r'1_2', s1).lower()
Performance wise, all algorithms are similar:
Not quite surprising, they all give a linear behavior (as a function of string length).
My regex version needs to make two passes over the string and is therefore consistently slower. The function written by @Ev.Kounis in his answer outperforms both of ours, but the regex approach by @Peilonrayz is even faster, because it manages to do only one pass, but with a regex.
Note that all functions are very fast, so as long as you need this less than a few thousand times per second, any is fine.
The test strings were generated with this code:
import random
import string
strings = [''.join(random.choice(string.ascii_letters) for _ in range(n))
for n in range(10, 2000, 20)]
In Python 3 it could have been:
strings = [''.join(random.choices(string.ascii_letters, k=k))
for k in range(10, 2000, 20)]
edited Jan 26 at 9:11
answered Jan 25 at 14:04
Graipher
20.5k43081
20.5k43081
I have the feeling that I would love this answer if I understoodregex
. My mind is too small for it though. +1
â Ev. Kounis
Jan 25 at 14:08
@Ev.Kounis I totally agree (since I did not come up with it myself, see the link for the source). Here is at least some explanation: The first substitution takes care of continuous runs of caps (so it splits of the last capital letter from a run of capital letters), while the last substitution splits runs of lower case letters followed by an upper case letter.
â Graipher
Jan 25 at 14:13
Thank you for this insightful analysis and performance measurement. Because of the latter, I accepted @peilonrayz solution.
â Richard Neumann
Jan 26 at 9:02
add a comment |Â
I have the feeling that I would love this answer if I understoodregex
. My mind is too small for it though. +1
â Ev. Kounis
Jan 25 at 14:08
@Ev.Kounis I totally agree (since I did not come up with it myself, see the link for the source). Here is at least some explanation: The first substitution takes care of continuous runs of caps (so it splits of the last capital letter from a run of capital letters), while the last substitution splits runs of lower case letters followed by an upper case letter.
â Graipher
Jan 25 at 14:13
Thank you for this insightful analysis and performance measurement. Because of the latter, I accepted @peilonrayz solution.
â Richard Neumann
Jan 26 at 9:02
I have the feeling that I would love this answer if I understood
regex
. My mind is too small for it though. +1â Ev. Kounis
Jan 25 at 14:08
I have the feeling that I would love this answer if I understood
regex
. My mind is too small for it though. +1â Ev. Kounis
Jan 25 at 14:08
@Ev.Kounis I totally agree (since I did not come up with it myself, see the link for the source). Here is at least some explanation: The first substitution takes care of continuous runs of caps (so it splits of the last capital letter from a run of capital letters), while the last substitution splits runs of lower case letters followed by an upper case letter.
â Graipher
Jan 25 at 14:13
@Ev.Kounis I totally agree (since I did not come up with it myself, see the link for the source). Here is at least some explanation: The first substitution takes care of continuous runs of caps (so it splits of the last capital letter from a run of capital letters), while the last substitution splits runs of lower case letters followed by an upper case letter.
â Graipher
Jan 25 at 14:13
Thank you for this insightful analysis and performance measurement. Because of the latter, I accepted @peilonrayz solution.
â Richard Neumann
Jan 26 at 9:02
Thank you for this insightful analysis and performance measurement. Because of the latter, I accepted @peilonrayz solution.
â Richard Neumann
Jan 26 at 9:02
add a comment |Â
up vote
1
down vote
It can be a bit shorter:
def cc2jl(s):
return "".join(["_"+l if i and l.isupper() and not s[i-1:i+2].isupper() else l for i, l in enumerate(s)]).lower()
Regular Expression Alternative:
rx = re.compile(r"(?<=.)(((?<![A-Z])[A-Z])|([A-Z](?=[a-z])))")
def cc2jl(s):
return rx.sub("_\1", s).lower()
My observation is the fact that the code is longer than it is necessary, my answer is the alternate solution. Readability is subjective, compared to regex I think it is easier to read.
â M. Utku ALTINKAYA
Jan 26 at 6:14
1
@Graipher fixed
â M. Utku ALTINKAYA
Jan 26 at 8:06
1
Added your function to the timings
â Graipher
Jan 26 at 9:12
"(...) compared to regex I think it is easier to read". Nope
â ÃÂïàú
Jan 26 at 11:24
Thanks @Graipher, I've revised that code and added a regex alternative also
â M. Utku ALTINKAYA
Jan 26 at 12:11
add a comment |Â
up vote
1
down vote
It can be a bit shorter:
def cc2jl(s):
return "".join(["_"+l if i and l.isupper() and not s[i-1:i+2].isupper() else l for i, l in enumerate(s)]).lower()
Regular Expression Alternative:
rx = re.compile(r"(?<=.)(((?<![A-Z])[A-Z])|([A-Z](?=[a-z])))")
def cc2jl(s):
return rx.sub("_\1", s).lower()
My observation is the fact that the code is longer than it is necessary, my answer is the alternate solution. Readability is subjective, compared to regex I think it is easier to read.
â M. Utku ALTINKAYA
Jan 26 at 6:14
1
@Graipher fixed
â M. Utku ALTINKAYA
Jan 26 at 8:06
1
Added your function to the timings
â Graipher
Jan 26 at 9:12
"(...) compared to regex I think it is easier to read". Nope
â ÃÂïàú
Jan 26 at 11:24
Thanks @Graipher, I've revised that code and added a regex alternative also
â M. Utku ALTINKAYA
Jan 26 at 12:11
add a comment |Â
up vote
1
down vote
up vote
1
down vote
It can be a bit shorter:
def cc2jl(s):
return "".join(["_"+l if i and l.isupper() and not s[i-1:i+2].isupper() else l for i, l in enumerate(s)]).lower()
Regular Expression Alternative:
rx = re.compile(r"(?<=.)(((?<![A-Z])[A-Z])|([A-Z](?=[a-z])))")
def cc2jl(s):
return rx.sub("_\1", s).lower()
It can be a bit shorter:
def cc2jl(s):
return "".join(["_"+l if i and l.isupper() and not s[i-1:i+2].isupper() else l for i, l in enumerate(s)]).lower()
Regular Expression Alternative:
rx = re.compile(r"(?<=.)(((?<![A-Z])[A-Z])|([A-Z](?=[a-z])))")
def cc2jl(s):
return rx.sub("_\1", s).lower()
edited Jan 26 at 12:10
answered Jan 25 at 22:38
M. Utku ALTINKAYA
1193
1193
My observation is the fact that the code is longer than it is necessary, my answer is the alternate solution. Readability is subjective, compared to regex I think it is easier to read.
â M. Utku ALTINKAYA
Jan 26 at 6:14
1
@Graipher fixed
â M. Utku ALTINKAYA
Jan 26 at 8:06
1
Added your function to the timings
â Graipher
Jan 26 at 9:12
"(...) compared to regex I think it is easier to read". Nope
â ÃÂïàú
Jan 26 at 11:24
Thanks @Graipher, I've revised that code and added a regex alternative also
â M. Utku ALTINKAYA
Jan 26 at 12:11
add a comment |Â
My observation is the fact that the code is longer than it is necessary, my answer is the alternate solution. Readability is subjective, compared to regex I think it is easier to read.
â M. Utku ALTINKAYA
Jan 26 at 6:14
1
@Graipher fixed
â M. Utku ALTINKAYA
Jan 26 at 8:06
1
Added your function to the timings
â Graipher
Jan 26 at 9:12
"(...) compared to regex I think it is easier to read". Nope
â ÃÂïàú
Jan 26 at 11:24
Thanks @Graipher, I've revised that code and added a regex alternative also
â M. Utku ALTINKAYA
Jan 26 at 12:11
My observation is the fact that the code is longer than it is necessary, my answer is the alternate solution. Readability is subjective, compared to regex I think it is easier to read.
â M. Utku ALTINKAYA
Jan 26 at 6:14
My observation is the fact that the code is longer than it is necessary, my answer is the alternate solution. Readability is subjective, compared to regex I think it is easier to read.
â M. Utku ALTINKAYA
Jan 26 at 6:14
1
1
@Graipher fixed
â M. Utku ALTINKAYA
Jan 26 at 8:06
@Graipher fixed
â M. Utku ALTINKAYA
Jan 26 at 8:06
1
1
Added your function to the timings
â Graipher
Jan 26 at 9:12
Added your function to the timings
â Graipher
Jan 26 at 9:12
"(...) compared to regex I think it is easier to read". Nope
â ÃÂïàú
Jan 26 at 11:24
"(...) compared to regex I think it is easier to read". Nope
â ÃÂïàú
Jan 26 at 11:24
Thanks @Graipher, I've revised that code and added a regex alternative also
â M. Utku ALTINKAYA
Jan 26 at 12:11
Thanks @Graipher, I've revised that code and added a regex alternative also
â M. Utku ALTINKAYA
Jan 26 at 12:11
add a comment |Â
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f185966%2ffunctions-to-convert-camelcase-strings-to-snake-case%23new-answer', 'question_page');
);
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password