Functions to convert camelCase strings to snake

Functions to convert camelCase strings to snake_case

.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;

up vote
22
down vote

favorite

I have written two functions, one actual and one helper, to convert camelCase strings into snake_case strings (I call it joint-lower case). Any ideas for improvement are welcome.

def _cc2jl(string):
 """Camel case to joint-lower helper."""

 for index, current in enumerate(string):

 if is_upper(current):
 if index > 0:
 previous = string[index-1]

 if is_upper(previous):
 try:
 next = string[index+1]
 except IndexError:
 yield current.lower()
 continue
 else:
 if is_upper(next):
 yield current.lower()
 continue

 yield '_' + current.lower()
 continue

 yield current.lower()
 continue

 yield current


def cc2jl(string):
 """Converts a camel case string to joint-lower."""

 return ''.join(_cc2jl(string))

Expected behaviour:

>>> cc2jl('m')
'm'
>>> cc2jl('AA')
'aa'
>>> cc2jl('MySQLDatabase')
'my_sql_database'
>>> cc2jl('GarbageCollection')
'garbage_collection'
>>> cc2jl('AAAAAAAAAAAAAAAAA')
'aaaaaaaaaaaaaaaaa'

edited Jan 25 at 23:06

Jamalâ™¦

30.1k11114225

asked Jan 25 at 11:56

Richard Neumann

1,761620

add a commentÂ |Â

up vote
22
down vote

favorite

I have written two functions, one actual and one helper, to convert camelCase strings into snake_case strings (I call it joint-lower case). Any ideas for improvement are welcome.

def _cc2jl(string):
 """Camel case to joint-lower helper."""

 for index, current in enumerate(string):

 if is_upper(current):
 if index > 0:
 previous = string[index-1]

 if is_upper(previous):
 try:
 next = string[index+1]
 except IndexError:
 yield current.lower()
 continue
 else:
 if is_upper(next):
 yield current.lower()
 continue

 yield '_' + current.lower()
 continue

 yield current.lower()
 continue

 yield current


def cc2jl(string):
 """Converts a camel case string to joint-lower."""

 return ''.join(_cc2jl(string))

Expected behaviour:

>>> cc2jl('m')
'm'
>>> cc2jl('AA')
'aa'
>>> cc2jl('MySQLDatabase')
'my_sql_database'
>>> cc2jl('GarbageCollection')
'garbage_collection'
>>> cc2jl('AAAAAAAAAAAAAAAAA')
'aaaaaaaaaaaaaaaaa'

edited Jan 25 at 23:06

Jamalâ™¦

30.1k11114225

asked Jan 25 at 11:56

Richard Neumann

1,761620

add a commentÂ |Â

up vote
22
down vote

favorite

I have written two functions, one actual and one helper, to convert camelCase strings into snake_case strings (I call it joint-lower case). Any ideas for improvement are welcome.

def _cc2jl(string):
 """Camel case to joint-lower helper."""

 for index, current in enumerate(string):

 if is_upper(current):
 if index > 0:
 previous = string[index-1]

 if is_upper(previous):
 try:
 next = string[index+1]
 except IndexError:
 yield current.lower()
 continue
 else:
 if is_upper(next):
 yield current.lower()
 continue

 yield '_' + current.lower()
 continue

 yield current.lower()
 continue

 yield current


def cc2jl(string):
 """Converts a camel case string to joint-lower."""

 return ''.join(_cc2jl(string))

Expected behaviour:

>>> cc2jl('m')
'm'
>>> cc2jl('AA')
'aa'
>>> cc2jl('MySQLDatabase')
'my_sql_database'
>>> cc2jl('GarbageCollection')
'garbage_collection'
>>> cc2jl('AAAAAAAAAAAAAAAAA')
'aaaaaaaaaaaaaaaaa'

edited Jan 25 at 23:06

Jamalâ™¦

30.1k11114225

asked Jan 25 at 11:56

Richard Neumann

1,761620

I have written two functions, one actual and one helper, to convert camelCase strings into snake_case strings (I call it joint-lower case). Any ideas for improvement are welcome.

def _cc2jl(string):
 """Camel case to joint-lower helper."""

 for index, current in enumerate(string):

 if is_upper(current):
 if index > 0:
 previous = string[index-1]

 if is_upper(previous):
 try:
 next = string[index+1]
 except IndexError:
 yield current.lower()
 continue
 else:
 if is_upper(next):
 yield current.lower()
 continue

 yield '_' + current.lower()
 continue

 yield current.lower()
 continue

 yield current


def cc2jl(string):
 """Converts a camel case string to joint-lower."""

 return ''.join(_cc2jl(string))

Expected behaviour:

>>> cc2jl('m')
'm'
>>> cc2jl('AA')
'aa'
>>> cc2jl('MySQLDatabase')
'my_sql_database'
>>> cc2jl('GarbageCollection')
'garbage_collection'
>>> cc2jl('AAAAAAAAAAAAAAAAA')
'aaaaaaaaaaaaaaaaa'

edited Jan 25 at 23:06

Jamalâ™¦

30.1k11114225

asked Jan 25 at 11:56

Richard Neumann

1,761620

edited Jan 25 at 23:06

Jamalâ™¦

30.1k11114225

edited Jan 25 at 23:06

Jamalâ™¦

30.1k11114225

edited Jan 25 at 23:06

Jamalâ™¦

30.1k11114225

asked Jan 25 at 11:56

Richard Neumann

1,761620

asked Jan 25 at 11:56

Richard Neumann

1,761620

asked Jan 25 at 11:56

Richard Neumann

1,761620

add a commentÂ |Â

4 Answers
4

active

oldest

votes

up vote
12
down vote

accepted

I think regex would be the easiest to do this with. You just need to find each word in the string.

The name starts with an uppercase letter. [A-Z]. However due to the way the rest of the regex works, we can change this to ., so that we match all words, even ones that start with _.

The word will either contain uppercase or lowercase letters, both with other characters - excluding _.
- Uppercase:
  1. The word won't be lowercase or have a _. [^a-z_]+
  2. You don't want the last uppercase letter or a _. (?=[A-Z_])
  3. You want the above if it's the last in the string. (?=[A-Z_]|$)
- Lowercase:
  1. The word won't be uppercase or have a _. [^A-Z_]+

And so you can use:

(.(?:[^a-z_]+(?=[A-Z_]|$)|[^A-Z_]+))

You then want to apply the following to these:

Prepend the name with a single _, unless:
- It's the first word in the name
- The word already starts with a _

Convert the word to lowercase

Making:

def _jl_match(match):
 group = match.group()
 prefix = bool(match.start() and not group.startswith('_'))
 return '_' * prefix + group.lower()

And so I'd use:

def _jl_match(match):
 group = match.group()
 prefix = bool(match.start() and not group.startswith('_'))
 return '_' * prefix + group.lower()

REGEX = r'(.(?:[^a-z_]+(?=[A-Z_]|$)|[^A-Z_]+))'

def _cc2jl(string):
 return re.subn(REGEX, _jl_match, string)[0]

edited Jan 25 at 16:24

answered Jan 25 at 16:17

Peilonrayz

24.4k336102

Nice, I knew there should be a way to do it with a single regex, but my regex-fu was too weak to come up with one...I added your function to the timing plot, and it is the fastest (so far). I did modify it so that the regex is compiled first.
â€“Â Graipher
Jan 25 at 19:50

2

Here is a comparison of various regexes that could be used.
â€“Â 200_success
Jan 25 at 19:52

I went with your solution because of its simplicity and performance: github.com/HOMEINFO/strflib/commit/â€¦
â€“Â Richard Neumann
Jan 26 at 9:02

1

@RichardNeumann Thank you for that. But it wasn't the BY part you violated - and I'm happy you didn't :) It was the SA part. If SE used CC-BY-SA 4.0 and you relicensed to GPLv3, then there wouldn't be any violation. but they use CC-BY-SA 3.0, ):
â€“Â Peilonrayz
Jan 26 at 11:33

2

@RichardNeumann That's a good point. I wonder if you can do that... IANAL too, so I guess it's time to go to Law.SE, ;P
â€“Â Peilonrayz
Jan 26 at 12:07

Â |Â
show 7 more comments

up vote
16
down vote

A couple of things:

Why 2 functions?

Now that we established (1), you can get rid of the yields.
yield is useful when you want to process stuff in chunks and do not want to wait for the entire population to be created first. This is not the case here.

There are some continues that do make sense and some that are redundant. Yours are of the latter type.

Nested if statements without corresponding else clauses can be merged with and if readability is not reduced.

Putting all that together, we get the following:

def cc2jl(my_str):
 """Camel case to joint-lower"""

 r = my_str[0].lower()
 for i, letter in enumerate(my_str[1:], 1):
 if letter.isupper():
 try:
 if my_str[i-1].islower() or my_str[i+1].islower():
 r += '_'
 except IndexError:
 pass
 r += letter.lower()
 return r

print(cc2jl('m')) #-> m
print(cc2jl('AA')) #-> aa
print(cc2jl('MySQLDatabase')) #-> my_sql_database
print(cc2jl('GarbageCollection')) #-> garbage_collection
print(cc2jl('AAAAAAAAAAAAAAAAA')) #-> aaaaaaaaaaaaaaaaa

Finally, some thoughts on avoiding the try block that bothers me personally.

We can leverage from the short-circuiting used in Python's logical expressions and (re-)write the following:

def cc2jl(my_str):
 """Camel case to joint-lower"""

 r = my_str[0].lower()
 for i, letter in enumerate(my_str[1:], 1):
 if letter.isupper():
 if my_str[i-1].islower() or (i != len(my_str)-1 and my_str[i+1].islower()):
 r += '_'
 r += letter.lower()
 return r

Note the i != len(my_str)-1 on the left side of the and. If this returns False the my_str[i+1] is not evaluated at all and thus cannot raise the IndexError.

edited Jan 25 at 14:23

answered Jan 25 at 14:04

Ev. Kounis

620313

Nice job cleaning up OP's code! I added your function to the timing plot in my answer, it is faster than my regex function and the OP's function (probably due to getting rid of the overhead of the generator).
â€“Â Graipher
Jan 25 at 14:07

@Graipher The thing that I do not like the most about my approach is the try-except block but couldn't think of something more elegant.
â€“Â Ev. Kounis
Jan 25 at 14:11

You could replace it with if not 0 < i < len(my_str) - 1, but I'm not sure that is any better, because it needs to check it every iteration, instead of only failing at the beginning or end of the string.
â€“Â Graipher
Jan 25 at 14:15

2

@Graipher the 0 I do not need since I am iterating over my_str[1:].I tried going over my_str[1:-1] instead and handling the last letter in the return line but that wasn't any better either. Anyway, thanks!
â€“Â Ev. Kounis
Jan 25 at 14:17

add a commentÂ |Â

up vote
12
down vote

First a note on naming: cc2jl is a very cryptic name. Give the public function a clearer name, like to_snake_case or something similar (note that the function does not do anything to a string that already is in snake_case).

While I quite like the fact that you used a generator approach, which guarantees that you make only on pass over the string, this seems like the perfect place for a regular expression. Some quick googling finds this rather simple function:

import re

def convert(name):
 s1 = re.sub('(.)([A-Z][a-z]+)', r'1_2', name)
 return re.sub('([a-z0-9])([A-Z])', r'1_2', s1).lower()

This is definitely shorter, even though that cryptic regex is not very easy to understand. But following your code, while written quite straightforward, is also not easy, especially with the deep nesting.

It works by first splitting runs of upper-case letters followed by one or more lowercase letters so that the last upper-case letter goes to the lower-case letter run. So "ABCdef" becomes "AB_Cdef".

Then it separates runs of not-upper-case letters, followed by a single upper-case letter with a "_", so "abcD" becomes "abc_D".

To get a bit more performance out of this, you should pre-compile the regexes:

first_cap_re = re.compile('(.)([A-Z][a-z]+)')
all_cap_re = re.compile('([a-z0-9])([A-Z])')
def convert(name):
 s1 = first_cap_re.sub(r'1_2', name)
 return all_cap_re.sub(r'1_2', s1).lower()

Performance wise, all algorithms are similar:

enter image description here

Not quite surprising, they all give a linear behavior (as a function of string length).

My regex version needs to make two passes over the string and is therefore consistently slower. The function written by @Ev.Kounis in his answer outperforms both of ours, but the regex approach by @Peilonrayz is even faster, because it manages to do only one pass, but with a regex.

Note that all functions are very fast, so as long as you need this less than a few thousand times per second, any is fine.

The test strings were generated with this code:

import random
import string

strings = [''.join(random.choice(string.ascii_letters) for _ in range(n))
 for n in range(10, 2000, 20)]

In Python 3 it could have been:

strings = [''.join(random.choices(string.ascii_letters, k=k))
 for k in range(10, 2000, 20)]

edited Jan 26 at 9:11

answered Jan 25 at 14:04

Graipher

20.5k43081

I have the feeling that I would love this answer if I understood regex. My mind is too small for it though. +1
â€“Â Ev. Kounis
Jan 25 at 14:08

@Ev.Kounis I totally agree (since I did not come up with it myself, see the link for the source). Here is at least some explanation: The first substitution takes care of continuous runs of caps (so it splits of the last capital letter from a run of capital letters), while the last substitution splits runs of lower case letters followed by an upper case letter.
â€“Â Graipher
Jan 25 at 14:13

Thank you for this insightful analysis and performance measurement. Because of the latter, I accepted @peilonrayz solution.
â€“Â Richard Neumann
Jan 26 at 9:02

add a commentÂ |Â

up vote
1
down vote

It can be a bit shorter:

def cc2jl(s):
 return "".join(["_"+l if i and l.isupper() and not s[i-1:i+2].isupper() else l for i, l in enumerate(s)]).lower()

Regular Expression Alternative:

rx = re.compile(r"(?<=.)(((?<![A-Z])[A-Z])|([A-Z](?=[a-z])))")
def cc2jl(s):
 return rx.sub("_\1", s).lower()

edited Jan 26 at 12:10

answered Jan 25 at 22:38

M. Utku ALTINKAYA

1193

My observation is the fact that the code is longer than it is necessary, my answer is the alternate solution. Readability is subjective, compared to regex I think it is easier to read.
â€“Â M. Utku ALTINKAYA
Jan 26 at 6:14

1

@Graipher fixed
â€“Â M. Utku ALTINKAYA
Jan 26 at 8:06

1

Added your function to the timings
â€“Â Graipher
Jan 26 at 9:12

"(...) compared to regex I think it is easier to read". Nope
â€“Â Ã‘ÂÃ’Â¯ÃÂ…ÃÂº
Jan 26 at 11:24

Thanks @Graipher, I've revised that code and added a regex alternative also
â€“Â M. Utku ALTINKAYA
Jan 26 at 12:11

add a commentÂ |Â

Your Answer

StackExchange.ifUsing("editor", function ()
return StackExchange.using("mathjaxEditing", function ()
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
);
);
, "mathjax-editing");

StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "196"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
convertImagesToLinks: false,
noModals: false,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f185966%2ffunctions-to-convert-camelcase-strings-to-snake-case%23new-answer', 'question_page');

);

Post as a guest

Name

4 Answers
4

active

oldest

votes

4 Answers
4

active

oldest

votes

up vote
12
down vote

accepted

I think regex would be the easiest to do this with. You just need to find each word in the string.

The name starts with an uppercase letter. [A-Z]. However due to the way the rest of the regex works, we can change this to ., so that we match all words, even ones that start with _.

The word will either contain uppercase or lowercase letters, both with other characters - excluding _.
- Uppercase:
  1. The word won't be lowercase or have a _. [^a-z_]+
  2. You don't want the last uppercase letter or a _. (?=[A-Z_])
  3. You want the above if it's the last in the string. (?=[A-Z_]|$)
- Lowercase:
  1. The word won't be uppercase or have a _. [^A-Z_]+

And so you can use:

(.(?:[^a-z_]+(?=[A-Z_]|$)|[^A-Z_]+))

You then want to apply the following to these:

Prepend the name with a single _, unless:
- It's the first word in the name
- The word already starts with a _

Convert the word to lowercase

Making:

def _jl_match(match):
 group = match.group()
 prefix = bool(match.start() and not group.startswith('_'))
 return '_' * prefix + group.lower()

And so I'd use:

def _jl_match(match):
 group = match.group()
 prefix = bool(match.start() and not group.startswith('_'))
 return '_' * prefix + group.lower()

REGEX = r'(.(?:[^a-z_]+(?=[A-Z_]|$)|[^A-Z_]+))'

def _cc2jl(string):
 return re.subn(REGEX, _jl_match, string)[0]

edited Jan 25 at 16:24

answered Jan 25 at 16:17

Peilonrayz

24.4k336102

Nice, I knew there should be a way to do it with a single regex, but my regex-fu was too weak to come up with one...I added your function to the timing plot, and it is the fastest (so far). I did modify it so that the regex is compiled first.
â€“Â Graipher
Jan 25 at 19:50

2

Here is a comparison of various regexes that could be used.
â€“Â 200_success
Jan 25 at 19:52

I went with your solution because of its simplicity and performance: github.com/HOMEINFO/strflib/commit/â€¦
â€“Â Richard Neumann
Jan 26 at 9:02

1

@RichardNeumann Thank you for that. But it wasn't the BY part you violated - and I'm happy you didn't :) It was the SA part. If SE used CC-BY-SA 4.0 and you relicensed to GPLv3, then there wouldn't be any violation. but they use CC-BY-SA 3.0, ):
â€“Â Peilonrayz
Jan 26 at 11:33

2

@RichardNeumann That's a good point. I wonder if you can do that... IANAL too, so I guess it's time to go to Law.SE, ;P
â€“Â Peilonrayz
Jan 26 at 12:07

Â |Â
show 7 more comments

up vote
12
down vote

accepted

I think regex would be the easiest to do this with. You just need to find each word in the string.

The name starts with an uppercase letter. [A-Z]. However due to the way the rest of the regex works, we can change this to ., so that we match all words, even ones that start with _.

The word will either contain uppercase or lowercase letters, both with other characters - excluding _.
- Uppercase:
  1. The word won't be lowercase or have a _. [^a-z_]+
  2. You don't want the last uppercase letter or a _. (?=[A-Z_])
  3. You want the above if it's the last in the string. (?=[A-Z_]|$)
- Lowercase:
  1. The word won't be uppercase or have a _. [^A-Z_]+

And so you can use:

(.(?:[^a-z_]+(?=[A-Z_]|$)|[^A-Z_]+))

You then want to apply the following to these:

Prepend the name with a single _, unless:
- It's the first word in the name
- The word already starts with a _

Convert the word to lowercase

Making:

def _jl_match(match):
 group = match.group()
 prefix = bool(match.start() and not group.startswith('_'))
 return '_' * prefix + group.lower()

And so I'd use:

def _jl_match(match):
 group = match.group()
 prefix = bool(match.start() and not group.startswith('_'))
 return '_' * prefix + group.lower()

REGEX = r'(.(?:[^a-z_]+(?=[A-Z_]|$)|[^A-Z_]+))'

def _cc2jl(string):
 return re.subn(REGEX, _jl_match, string)[0]

edited Jan 25 at 16:24

answered Jan 25 at 16:17

Peilonrayz

24.4k336102

Nice, I knew there should be a way to do it with a single regex, but my regex-fu was too weak to come up with one...I added your function to the timing plot, and it is the fastest (so far). I did modify it so that the regex is compiled first.
â€“Â Graipher
Jan 25 at 19:50

2

Here is a comparison of various regexes that could be used.
â€“Â 200_success
Jan 25 at 19:52

I went with your solution because of its simplicity and performance: github.com/HOMEINFO/strflib/commit/â€¦
â€“Â Richard Neumann
Jan 26 at 9:02

1

@RichardNeumann Thank you for that. But it wasn't the BY part you violated - and I'm happy you didn't :) It was the SA part. If SE used CC-BY-SA 4.0 and you relicensed to GPLv3, then there wouldn't be any violation. but they use CC-BY-SA 3.0, ):
â€“Â Peilonrayz
Jan 26 at 11:33

2

@RichardNeumann That's a good point. I wonder if you can do that... IANAL too, so I guess it's time to go to Law.SE, ;P
â€“Â Peilonrayz
Jan 26 at 12:07

Â |Â
show 7 more comments

up vote
12
down vote

accepted

I think regex would be the easiest to do this with. You just need to find each word in the string.

The name starts with an uppercase letter. [A-Z]. However due to the way the rest of the regex works, we can change this to ., so that we match all words, even ones that start with _.

The word will either contain uppercase or lowercase letters, both with other characters - excluding _.
- Uppercase:
  1. The word won't be lowercase or have a _. [^a-z_]+
  2. You don't want the last uppercase letter or a _. (?=[A-Z_])
  3. You want the above if it's the last in the string. (?=[A-Z_]|$)
- Lowercase:
  1. The word won't be uppercase or have a _. [^A-Z_]+

And so you can use:

(.(?:[^a-z_]+(?=[A-Z_]|$)|[^A-Z_]+))

You then want to apply the following to these:

Prepend the name with a single _, unless:
- It's the first word in the name
- The word already starts with a _

Convert the word to lowercase

Making:

def _jl_match(match):
 group = match.group()
 prefix = bool(match.start() and not group.startswith('_'))
 return '_' * prefix + group.lower()

And so I'd use:

def _jl_match(match):
 group = match.group()
 prefix = bool(match.start() and not group.startswith('_'))
 return '_' * prefix + group.lower()

REGEX = r'(.(?:[^a-z_]+(?=[A-Z_]|$)|[^A-Z_]+))'

def _cc2jl(string):
 return re.subn(REGEX, _jl_match, string)[0]

edited Jan 25 at 16:24

answered Jan 25 at 16:17

Peilonrayz

24.4k336102

I think regex would be the easiest to do this with. You just need to find each word in the string.

The name starts with an uppercase letter. [A-Z]. However due to the way the rest of the regex works, we can change this to ., so that we match all words, even ones that start with _.

The word will either contain uppercase or lowercase letters, both with other characters - excluding _.
- Uppercase:
  1. The word won't be lowercase or have a _. [^a-z_]+
  2. You don't want the last uppercase letter or a _. (?=[A-Z_])
  3. You want the above if it's the last in the string. (?=[A-Z_]|$)
- Lowercase:
  1. The word won't be uppercase or have a _. [^A-Z_]+

And so you can use:

(.(?:[^a-z_]+(?=[A-Z_]|$)|[^A-Z_]+))

You then want to apply the following to these:

Prepend the name with a single _, unless:
- It's the first word in the name
- The word already starts with a _

Convert the word to lowercase

Making:

def _jl_match(match):
 group = match.group()
 prefix = bool(match.start() and not group.startswith('_'))
 return '_' * prefix + group.lower()

And so I'd use:

def _jl_match(match):
 group = match.group()
 prefix = bool(match.start() and not group.startswith('_'))
 return '_' * prefix + group.lower()

REGEX = r'(.(?:[^a-z_]+(?=[A-Z_]|$)|[^A-Z_]+))'

def _cc2jl(string):
 return re.subn(REGEX, _jl_match, string)[0]

edited Jan 25 at 16:24

answered Jan 25 at 16:17

Peilonrayz

24.4k336102

edited Jan 25 at 16:24

answered Jan 25 at 16:17

Peilonrayz

24.4k336102

answered Jan 25 at 16:17

Peilonrayz

24.4k336102

answered Jan 25 at 16:17

Peilonrayz

24.4k336102

Nice, I knew there should be a way to do it with a single regex, but my regex-fu was too weak to come up with one...I added your function to the timing plot, and it is the fastest (so far). I did modify it so that the regex is compiled first.
â€“Â Graipher
Jan 25 at 19:50

2

Here is a comparison of various regexes that could be used.
â€“Â 200_success
Jan 25 at 19:52

I went with your solution because of its simplicity and performance: github.com/HOMEINFO/strflib/commit/â€¦
â€“Â Richard Neumann
Jan 26 at 9:02

1

@RichardNeumann Thank you for that. But it wasn't the BY part you violated - and I'm happy you didn't :) It was the SA part. If SE used CC-BY-SA 4.0 and you relicensed to GPLv3, then there wouldn't be any violation. but they use CC-BY-SA 3.0, ):
â€“Â Peilonrayz
Jan 26 at 11:33

2

@RichardNeumann That's a good point. I wonder if you can do that... IANAL too, so I guess it's time to go to Law.SE, ;P
â€“Â Peilonrayz
Jan 26 at 12:07

Â |Â
show 7 more comments

Nice, I knew there should be a way to do it with a single regex, but my regex-fu was too weak to come up with one...I added your function to the timing plot, and it is the fastest (so far). I did modify it so that the regex is compiled first.
â€“Â Graipher
Jan 25 at 19:50

2

Here is a comparison of various regexes that could be used.
â€“Â 200_success
Jan 25 at 19:52

I went with your solution because of its simplicity and performance: github.com/HOMEINFO/strflib/commit/â€¦
â€“Â Richard Neumann
Jan 26 at 9:02

1

@RichardNeumann Thank you for that. But it wasn't the BY part you violated - and I'm happy you didn't :) It was the SA part. If SE used CC-BY-SA 4.0 and you relicensed to GPLv3, then there wouldn't be any violation. but they use CC-BY-SA 3.0, ):
â€“Â Peilonrayz
Jan 26 at 11:33

2

@RichardNeumann That's a good point. I wonder if you can do that... IANAL too, so I guess it's time to go to Law.SE, ;P
â€“Â Peilonrayz
Jan 26 at 12:07

Nice, I knew there should be a way to do it with a single regex, but my regex-fu was too weak to come up with one...I added your function to the timing plot, and it is the fastest (so far). I did modify it so that the regex is compiled first.
â€“Â Graipher
Jan 25 at 19:50

Here is a comparison of various regexes that could be used.
â€“Â 200_success
Jan 25 at 19:52

I went with your solution because of its simplicity and performance: github.com/HOMEINFO/strflib/commit/â€¦
â€“Â Richard Neumann
Jan 26 at 9:02

@RichardNeumann Thank you for that. But it wasn't the BY part you violated - and I'm happy you didn't :) It was the SA part. If SE used CC-BY-SA 4.0 and you relicensed to GPLv3, then there wouldn't be any violation. but they use CC-BY-SA 3.0, ):
â€“Â Peilonrayz
Jan 26 at 11:33

@RichardNeumann That's a good point. I wonder if you can do that... IANAL too, so I guess it's time to go to Law.SE, ;P
â€“Â Peilonrayz
Jan 26 at 12:07

Â |Â
show 7 more comments

up vote
16
down vote

A couple of things:

Why 2 functions?

Now that we established (1), you can get rid of the yields.
yield is useful when you want to process stuff in chunks and do not want to wait for the entire population to be created first. This is not the case here.

There are some continues that do make sense and some that are redundant. Yours are of the latter type.

Nested if statements without corresponding else clauses can be merged with and if readability is not reduced.

Putting all that together, we get the following:

def cc2jl(my_str):
 """Camel case to joint-lower"""

 r = my_str[0].lower()
 for i, letter in enumerate(my_str[1:], 1):
 if letter.isupper():
 try:
 if my_str[i-1].islower() or my_str[i+1].islower():
 r += '_'
 except IndexError:
 pass
 r += letter.lower()
 return r

print(cc2jl('m')) #-> m
print(cc2jl('AA')) #-> aa
print(cc2jl('MySQLDatabase')) #-> my_sql_database
print(cc2jl('GarbageCollection')) #-> garbage_collection
print(cc2jl('AAAAAAAAAAAAAAAAA')) #-> aaaaaaaaaaaaaaaaa

Finally, some thoughts on avoiding the try block that bothers me personally.

We can leverage from the short-circuiting used in Python's logical expressions and (re-)write the following:

def cc2jl(my_str):
 """Camel case to joint-lower"""

 r = my_str[0].lower()
 for i, letter in enumerate(my_str[1:], 1):
 if letter.isupper():
 if my_str[i-1].islower() or (i != len(my_str)-1 and my_str[i+1].islower()):
 r += '_'
 r += letter.lower()
 return r

Note the i != len(my_str)-1 on the left side of the and. If this returns False the my_str[i+1] is not evaluated at all and thus cannot raise the IndexError.

edited Jan 25 at 14:23

answered Jan 25 at 14:04

Ev. Kounis

620313

Nice job cleaning up OP's code! I added your function to the timing plot in my answer, it is faster than my regex function and the OP's function (probably due to getting rid of the overhead of the generator).
â€“Â Graipher
Jan 25 at 14:07

@Graipher The thing that I do not like the most about my approach is the try-except block but couldn't think of something more elegant.
â€“Â Ev. Kounis
Jan 25 at 14:11

You could replace it with if not 0 < i < len(my_str) - 1, but I'm not sure that is any better, because it needs to check it every iteration, instead of only failing at the beginning or end of the string.
â€“Â Graipher
Jan 25 at 14:15

2

@Graipher the 0 I do not need since I am iterating over my_str[1:].I tried going over my_str[1:-1] instead and handling the last letter in the return line but that wasn't any better either. Anyway, thanks!
â€“Â Ev. Kounis
Jan 25 at 14:17

add a commentÂ |Â

up vote
16
down vote

A couple of things:

Why 2 functions?

Now that we established (1), you can get rid of the yields.
yield is useful when you want to process stuff in chunks and do not want to wait for the entire population to be created first. This is not the case here.

There are some continues that do make sense and some that are redundant. Yours are of the latter type.

Nested if statements without corresponding else clauses can be merged with and if readability is not reduced.

Putting all that together, we get the following:

def cc2jl(my_str):
 """Camel case to joint-lower"""

 r = my_str[0].lower()
 for i, letter in enumerate(my_str[1:], 1):
 if letter.isupper():
 try:
 if my_str[i-1].islower() or my_str[i+1].islower():
 r += '_'
 except IndexError:
 pass
 r += letter.lower()
 return r

print(cc2jl('m')) #-> m
print(cc2jl('AA')) #-> aa
print(cc2jl('MySQLDatabase')) #-> my_sql_database
print(cc2jl('GarbageCollection')) #-> garbage_collection
print(cc2jl('AAAAAAAAAAAAAAAAA')) #-> aaaaaaaaaaaaaaaaa

Finally, some thoughts on avoiding the try block that bothers me personally.

We can leverage from the short-circuiting used in Python's logical expressions and (re-)write the following:

def cc2jl(my_str):
 """Camel case to joint-lower"""

 r = my_str[0].lower()
 for i, letter in enumerate(my_str[1:], 1):
 if letter.isupper():
 if my_str[i-1].islower() or (i != len(my_str)-1 and my_str[i+1].islower()):
 r += '_'
 r += letter.lower()
 return r

Note the i != len(my_str)-1 on the left side of the and. If this returns False the my_str[i+1] is not evaluated at all and thus cannot raise the IndexError.

edited Jan 25 at 14:23

answered Jan 25 at 14:04

Ev. Kounis

620313

Nice job cleaning up OP's code! I added your function to the timing plot in my answer, it is faster than my regex function and the OP's function (probably due to getting rid of the overhead of the generator).
â€“Â Graipher
Jan 25 at 14:07

@Graipher The thing that I do not like the most about my approach is the try-except block but couldn't think of something more elegant.
â€“Â Ev. Kounis
Jan 25 at 14:11

You could replace it with if not 0 < i < len(my_str) - 1, but I'm not sure that is any better, because it needs to check it every iteration, instead of only failing at the beginning or end of the string.
â€“Â Graipher
Jan 25 at 14:15

2

@Graipher the 0 I do not need since I am iterating over my_str[1:].I tried going over my_str[1:-1] instead and handling the last letter in the return line but that wasn't any better either. Anyway, thanks!
â€“Â Ev. Kounis
Jan 25 at 14:17

add a commentÂ |Â

up vote
16
down vote

A couple of things:

Why 2 functions?

Now that we established (1), you can get rid of the yields.
yield is useful when you want to process stuff in chunks and do not want to wait for the entire population to be created first. This is not the case here.

There are some continues that do make sense and some that are redundant. Yours are of the latter type.

Nested if statements without corresponding else clauses can be merged with and if readability is not reduced.

Putting all that together, we get the following:

def cc2jl(my_str):
 """Camel case to joint-lower"""

 r = my_str[0].lower()
 for i, letter in enumerate(my_str[1:], 1):
 if letter.isupper():
 try:
 if my_str[i-1].islower() or my_str[i+1].islower():
 r += '_'
 except IndexError:
 pass
 r += letter.lower()
 return r

print(cc2jl('m')) #-> m
print(cc2jl('AA')) #-> aa
print(cc2jl('MySQLDatabase')) #-> my_sql_database
print(cc2jl('GarbageCollection')) #-> garbage_collection
print(cc2jl('AAAAAAAAAAAAAAAAA')) #-> aaaaaaaaaaaaaaaaa

Finally, some thoughts on avoiding the try block that bothers me personally.

We can leverage from the short-circuiting used in Python's logical expressions and (re-)write the following:

def cc2jl(my_str):
 """Camel case to joint-lower"""

 r = my_str[0].lower()
 for i, letter in enumerate(my_str[1:], 1):
 if letter.isupper():
 if my_str[i-1].islower() or (i != len(my_str)-1 and my_str[i+1].islower()):
 r += '_'
 r += letter.lower()
 return r

Note the i != len(my_str)-1 on the left side of the and. If this returns False the my_str[i+1] is not evaluated at all and thus cannot raise the IndexError.

edited Jan 25 at 14:23

answered Jan 25 at 14:04

Ev. Kounis

620313

A couple of things:

Why 2 functions?

Now that we established (1), you can get rid of the yields.
yield is useful when you want to process stuff in chunks and do not want to wait for the entire population to be created first. This is not the case here.

There are some continues that do make sense and some that are redundant. Yours are of the latter type.

Nested if statements without corresponding else clauses can be merged with and if readability is not reduced.

Putting all that together, we get the following:

def cc2jl(my_str):
 """Camel case to joint-lower"""

 r = my_str[0].lower()
 for i, letter in enumerate(my_str[1:], 1):
 if letter.isupper():
 try:
 if my_str[i-1].islower() or my_str[i+1].islower():
 r += '_'
 except IndexError:
 pass
 r += letter.lower()
 return r

print(cc2jl('m')) #-> m
print(cc2jl('AA')) #-> aa
print(cc2jl('MySQLDatabase')) #-> my_sql_database
print(cc2jl('GarbageCollection')) #-> garbage_collection
print(cc2jl('AAAAAAAAAAAAAAAAA')) #-> aaaaaaaaaaaaaaaaa

Finally, some thoughts on avoiding the try block that bothers me personally.

We can leverage from the short-circuiting used in Python's logical expressions and (re-)write the following:

def cc2jl(my_str):
 """Camel case to joint-lower"""

 r = my_str[0].lower()
 for i, letter in enumerate(my_str[1:], 1):
 if letter.isupper():
 if my_str[i-1].islower() or (i != len(my_str)-1 and my_str[i+1].islower()):
 r += '_'
 r += letter.lower()
 return r

Note the i != len(my_str)-1 on the left side of the and. If this returns False the my_str[i+1] is not evaluated at all and thus cannot raise the IndexError.

edited Jan 25 at 14:23

answered Jan 25 at 14:04

Ev. Kounis

620313

edited Jan 25 at 14:23

answered Jan 25 at 14:04

Ev. Kounis

620313

answered Jan 25 at 14:04

Ev. Kounis

620313

answered Jan 25 at 14:04

Ev. Kounis

620313

Nice job cleaning up OP's code! I added your function to the timing plot in my answer, it is faster than my regex function and the OP's function (probably due to getting rid of the overhead of the generator).
â€“Â Graipher
Jan 25 at 14:07

@Graipher The thing that I do not like the most about my approach is the try-except block but couldn't think of something more elegant.
â€“Â Ev. Kounis
Jan 25 at 14:11

You could replace it with if not 0 < i < len(my_str) - 1, but I'm not sure that is any better, because it needs to check it every iteration, instead of only failing at the beginning or end of the string.
â€“Â Graipher
Jan 25 at 14:15

2

@Graipher the 0 I do not need since I am iterating over my_str[1:].I tried going over my_str[1:-1] instead and handling the last letter in the return line but that wasn't any better either. Anyway, thanks!
â€“Â Ev. Kounis
Jan 25 at 14:17

add a commentÂ |Â

Nice job cleaning up OP's code! I added your function to the timing plot in my answer, it is faster than my regex function and the OP's function (probably due to getting rid of the overhead of the generator).
â€“Â Graipher
Jan 25 at 14:07

@Graipher The thing that I do not like the most about my approach is the try-except block but couldn't think of something more elegant.
â€“Â Ev. Kounis
Jan 25 at 14:11

You could replace it with if not 0 < i < len(my_str) - 1, but I'm not sure that is any better, because it needs to check it every iteration, instead of only failing at the beginning or end of the string.
â€“Â Graipher
Jan 25 at 14:15

2

@Graipher the 0 I do not need since I am iterating over my_str[1:].I tried going over my_str[1:-1] instead and handling the last letter in the return line but that wasn't any better either. Anyway, thanks!
â€“Â Ev. Kounis
Jan 25 at 14:17

Nice job cleaning up OP's code! I added your function to the timing plot in my answer, it is faster than my regex function and the OP's function (probably due to getting rid of the overhead of the generator).
â€“Â Graipher
Jan 25 at 14:07

@Graipher The thing that I do not like the most about my approach is the try-except block but couldn't think of something more elegant.
â€“Â Ev. Kounis
Jan 25 at 14:11

You could replace it with if not 0 < i < len(my_str) - 1, but I'm not sure that is any better, because it needs to check it every iteration, instead of only failing at the beginning or end of the string.
â€“Â Graipher
Jan 25 at 14:15

@Graipher the 0 I do not need since I am iterating over my_str[1:].I tried going over my_str[1:-1] instead and handling the last letter in the return line but that wasn't any better either. Anyway, thanks!
â€“Â Ev. Kounis
Jan 25 at 14:17

add a commentÂ |Â

up vote
12
down vote

import re

def convert(name):
 s1 = re.sub('(.)([A-Z][a-z]+)', r'1_2', name)
 return re.sub('([a-z0-9])([A-Z])', r'1_2', s1).lower()

Then it separates runs of not-upper-case letters, followed by a single upper-case letter with a "_", so "abcD" becomes "abc_D".

To get a bit more performance out of this, you should pre-compile the regexes:

first_cap_re = re.compile('(.)([A-Z][a-z]+)')
all_cap_re = re.compile('([a-z0-9])([A-Z])')
def convert(name):
 s1 = first_cap_re.sub(r'1_2', name)
 return all_cap_re.sub(r'1_2', s1).lower()

Performance wise, all algorithms are similar:

enter image description here

Not quite surprising, they all give a linear behavior (as a function of string length).

Note that all functions are very fast, so as long as you need this less than a few thousand times per second, any is fine.

The test strings were generated with this code:

import random
import string

strings = [''.join(random.choice(string.ascii_letters) for _ in range(n))
 for n in range(10, 2000, 20)]

In Python 3 it could have been:

strings = [''.join(random.choices(string.ascii_letters, k=k))
 for k in range(10, 2000, 20)]

edited Jan 26 at 9:11

answered Jan 25 at 14:04

Graipher

20.5k43081

I have the feeling that I would love this answer if I understood regex. My mind is too small for it though. +1
â€“Â Ev. Kounis
Jan 25 at 14:08

@Ev.Kounis I totally agree (since I did not come up with it myself, see the link for the source). Here is at least some explanation: The first substitution takes care of continuous runs of caps (so it splits of the last capital letter from a run of capital letters), while the last substitution splits runs of lower case letters followed by an upper case letter.
â€“Â Graipher
Jan 25 at 14:13

Thank you for this insightful analysis and performance measurement. Because of the latter, I accepted @peilonrayz solution.
â€“Â Richard Neumann
Jan 26 at 9:02

add a commentÂ |Â

up vote
12
down vote

import re

def convert(name):
 s1 = re.sub('(.)([A-Z][a-z]+)', r'1_2', name)
 return re.sub('([a-z0-9])([A-Z])', r'1_2', s1).lower()

Then it separates runs of not-upper-case letters, followed by a single upper-case letter with a "_", so "abcD" becomes "abc_D".

To get a bit more performance out of this, you should pre-compile the regexes:

first_cap_re = re.compile('(.)([A-Z][a-z]+)')
all_cap_re = re.compile('([a-z0-9])([A-Z])')
def convert(name):
 s1 = first_cap_re.sub(r'1_2', name)
 return all_cap_re.sub(r'1_2', s1).lower()

Performance wise, all algorithms are similar:

enter image description here

Not quite surprising, they all give a linear behavior (as a function of string length).

Note that all functions are very fast, so as long as you need this less than a few thousand times per second, any is fine.

The test strings were generated with this code:

import random
import string

strings = [''.join(random.choice(string.ascii_letters) for _ in range(n))
 for n in range(10, 2000, 20)]

In Python 3 it could have been:

strings = [''.join(random.choices(string.ascii_letters, k=k))
 for k in range(10, 2000, 20)]

edited Jan 26 at 9:11

answered Jan 25 at 14:04

Graipher

20.5k43081

I have the feeling that I would love this answer if I understood regex. My mind is too small for it though. +1
â€“Â Ev. Kounis
Jan 25 at 14:08

@Ev.Kounis I totally agree (since I did not come up with it myself, see the link for the source). Here is at least some explanation: The first substitution takes care of continuous runs of caps (so it splits of the last capital letter from a run of capital letters), while the last substitution splits runs of lower case letters followed by an upper case letter.
â€“Â Graipher
Jan 25 at 14:13

Thank you for this insightful analysis and performance measurement. Because of the latter, I accepted @peilonrayz solution.
â€“Â Richard Neumann
Jan 26 at 9:02

add a commentÂ |Â

up vote
12
down vote

import re

def convert(name):
 s1 = re.sub('(.)([A-Z][a-z]+)', r'1_2', name)
 return re.sub('([a-z0-9])([A-Z])', r'1_2', s1).lower()

Then it separates runs of not-upper-case letters, followed by a single upper-case letter with a "_", so "abcD" becomes "abc_D".

To get a bit more performance out of this, you should pre-compile the regexes:

first_cap_re = re.compile('(.)([A-Z][a-z]+)')
all_cap_re = re.compile('([a-z0-9])([A-Z])')
def convert(name):
 s1 = first_cap_re.sub(r'1_2', name)
 return all_cap_re.sub(r'1_2', s1).lower()

Performance wise, all algorithms are similar:

enter image description here

Not quite surprising, they all give a linear behavior (as a function of string length).

Note that all functions are very fast, so as long as you need this less than a few thousand times per second, any is fine.

The test strings were generated with this code:

import random
import string

strings = [''.join(random.choice(string.ascii_letters) for _ in range(n))
 for n in range(10, 2000, 20)]

In Python 3 it could have been:

strings = [''.join(random.choices(string.ascii_letters, k=k))
 for k in range(10, 2000, 20)]

edited Jan 26 at 9:11

answered Jan 25 at 14:04

Graipher

20.5k43081

import re

def convert(name):
 s1 = re.sub('(.)([A-Z][a-z]+)', r'1_2', name)
 return re.sub('([a-z0-9])([A-Z])', r'1_2', s1).lower()

Then it separates runs of not-upper-case letters, followed by a single upper-case letter with a "_", so "abcD" becomes "abc_D".

To get a bit more performance out of this, you should pre-compile the regexes:

first_cap_re = re.compile('(.)([A-Z][a-z]+)')
all_cap_re = re.compile('([a-z0-9])([A-Z])')
def convert(name):
 s1 = first_cap_re.sub(r'1_2', name)
 return all_cap_re.sub(r'1_2', s1).lower()

Performance wise, all algorithms are similar:

enter image description here

Not quite surprising, they all give a linear behavior (as a function of string length).

Note that all functions are very fast, so as long as you need this less than a few thousand times per second, any is fine.

The test strings were generated with this code:

import random
import string

strings = [''.join(random.choice(string.ascii_letters) for _ in range(n))
 for n in range(10, 2000, 20)]

In Python 3 it could have been:

strings = [''.join(random.choices(string.ascii_letters, k=k))
 for k in range(10, 2000, 20)]

edited Jan 26 at 9:11

answered Jan 25 at 14:04

Graipher

20.5k43081

edited Jan 26 at 9:11

answered Jan 25 at 14:04

Graipher

20.5k43081

answered Jan 25 at 14:04

Graipher

20.5k43081

answered Jan 25 at 14:04

Graipher

20.5k43081

I have the feeling that I would love this answer if I understood regex. My mind is too small for it though. +1
â€“Â Ev. Kounis
Jan 25 at 14:08

@Ev.Kounis I totally agree (since I did not come up with it myself, see the link for the source). Here is at least some explanation: The first substitution takes care of continuous runs of caps (so it splits of the last capital letter from a run of capital letters), while the last substitution splits runs of lower case letters followed by an upper case letter.
â€“Â Graipher
Jan 25 at 14:13

Thank you for this insightful analysis and performance measurement. Because of the latter, I accepted @peilonrayz solution.
â€“Â Richard Neumann
Jan 26 at 9:02

add a commentÂ |Â

I have the feeling that I would love this answer if I understood regex. My mind is too small for it though. +1
â€“Â Ev. Kounis
Jan 25 at 14:08

@Ev.Kounis I totally agree (since I did not come up with it myself, see the link for the source). Here is at least some explanation: The first substitution takes care of continuous runs of caps (so it splits of the last capital letter from a run of capital letters), while the last substitution splits runs of lower case letters followed by an upper case letter.
â€“Â Graipher
Jan 25 at 14:13

Thank you for this insightful analysis and performance measurement. Because of the latter, I accepted @peilonrayz solution.
â€“Â Richard Neumann
Jan 26 at 9:02

I have the feeling that I would love this answer if I understood regex. My mind is too small for it though. +1
â€“Â Ev. Kounis
Jan 25 at 14:08

@Ev.Kounis I totally agree (since I did not come up with it myself, see the link for the source). Here is at least some explanation: The first substitution takes care of continuous runs of caps (so it splits of the last capital letter from a run of capital letters), while the last substitution splits runs of lower case letters followed by an upper case letter.
â€“Â Graipher
Jan 25 at 14:13

Thank you for this insightful analysis and performance measurement. Because of the latter, I accepted @peilonrayz solution.
â€“Â Richard Neumann
Jan 26 at 9:02

add a commentÂ |Â

up vote
1
down vote

It can be a bit shorter:

def cc2jl(s):
 return "".join(["_"+l if i and l.isupper() and not s[i-1:i+2].isupper() else l for i, l in enumerate(s)]).lower()

Regular Expression Alternative:

rx = re.compile(r"(?<=.)(((?<![A-Z])[A-Z])|([A-Z](?=[a-z])))")
def cc2jl(s):
 return rx.sub("_\1", s).lower()

edited Jan 26 at 12:10

answered Jan 25 at 22:38

M. Utku ALTINKAYA

1193

My observation is the fact that the code is longer than it is necessary, my answer is the alternate solution. Readability is subjective, compared to regex I think it is easier to read.
â€“Â M. Utku ALTINKAYA
Jan 26 at 6:14

1

@Graipher fixed
â€“Â M. Utku ALTINKAYA
Jan 26 at 8:06

1

Added your function to the timings
â€“Â Graipher
Jan 26 at 9:12

"(...) compared to regex I think it is easier to read". Nope
â€“Â Ã‘ÂÃ’Â¯ÃÂ…ÃÂº
Jan 26 at 11:24

Thanks @Graipher, I've revised that code and added a regex alternative also
â€“Â M. Utku ALTINKAYA
Jan 26 at 12:11

add a commentÂ |Â

up vote
1
down vote

It can be a bit shorter:

def cc2jl(s):
 return "".join(["_"+l if i and l.isupper() and not s[i-1:i+2].isupper() else l for i, l in enumerate(s)]).lower()

Regular Expression Alternative:

rx = re.compile(r"(?<=.)(((?<![A-Z])[A-Z])|([A-Z](?=[a-z])))")
def cc2jl(s):
 return rx.sub("_\1", s).lower()

edited Jan 26 at 12:10

answered Jan 25 at 22:38

M. Utku ALTINKAYA

1193

My observation is the fact that the code is longer than it is necessary, my answer is the alternate solution. Readability is subjective, compared to regex I think it is easier to read.
â€“Â M. Utku ALTINKAYA
Jan 26 at 6:14

1

@Graipher fixed
â€“Â M. Utku ALTINKAYA
Jan 26 at 8:06

1

Added your function to the timings
â€“Â Graipher
Jan 26 at 9:12

"(...) compared to regex I think it is easier to read". Nope
â€“Â Ã‘ÂÃ’Â¯ÃÂ…ÃÂº
Jan 26 at 11:24

Thanks @Graipher, I've revised that code and added a regex alternative also
â€“Â M. Utku ALTINKAYA
Jan 26 at 12:11

add a commentÂ |Â

up vote
1
down vote

It can be a bit shorter:

def cc2jl(s):
 return "".join(["_"+l if i and l.isupper() and not s[i-1:i+2].isupper() else l for i, l in enumerate(s)]).lower()

Regular Expression Alternative:

rx = re.compile(r"(?<=.)(((?<![A-Z])[A-Z])|([A-Z](?=[a-z])))")
def cc2jl(s):
 return rx.sub("_\1", s).lower()

edited Jan 26 at 12:10

answered Jan 25 at 22:38

M. Utku ALTINKAYA

1193

It can be a bit shorter:

def cc2jl(s):
 return "".join(["_"+l if i and l.isupper() and not s[i-1:i+2].isupper() else l for i, l in enumerate(s)]).lower()

Regular Expression Alternative:

rx = re.compile(r"(?<=.)(((?<![A-Z])[A-Z])|([A-Z](?=[a-z])))")
def cc2jl(s):
 return rx.sub("_\1", s).lower()

edited Jan 26 at 12:10

answered Jan 25 at 22:38

M. Utku ALTINKAYA

1193

edited Jan 26 at 12:10

answered Jan 25 at 22:38

M. Utku ALTINKAYA

1193

answered Jan 25 at 22:38

M. Utku ALTINKAYA

1193

answered Jan 25 at 22:38

M. Utku ALTINKAYA

1193

My observation is the fact that the code is longer than it is necessary, my answer is the alternate solution. Readability is subjective, compared to regex I think it is easier to read.
â€“Â M. Utku ALTINKAYA
Jan 26 at 6:14

1

@Graipher fixed
â€“Â M. Utku ALTINKAYA
Jan 26 at 8:06

1

Added your function to the timings
â€“Â Graipher
Jan 26 at 9:12

"(...) compared to regex I think it is easier to read". Nope
â€“Â Ã‘ÂÃ’Â¯ÃÂ…ÃÂº
Jan 26 at 11:24

Thanks @Graipher, I've revised that code and added a regex alternative also
â€“Â M. Utku ALTINKAYA
Jan 26 at 12:11

add a commentÂ |Â

My observation is the fact that the code is longer than it is necessary, my answer is the alternate solution. Readability is subjective, compared to regex I think it is easier to read.
â€“Â M. Utku ALTINKAYA
Jan 26 at 6:14

1

@Graipher fixed
â€“Â M. Utku ALTINKAYA
Jan 26 at 8:06

1

Added your function to the timings
â€“Â Graipher
Jan 26 at 9:12

"(...) compared to regex I think it is easier to read". Nope
â€“Â Ã‘ÂÃ’Â¯ÃÂ…ÃÂº
Jan 26 at 11:24

Thanks @Graipher, I've revised that code and added a regex alternative also
â€“Â M. Utku ALTINKAYA
Jan 26 at 12:11

My observation is the fact that the code is longer than it is necessary, my answer is the alternate solution. Readability is subjective, compared to regex I think it is easier to read.
â€“Â M. Utku ALTINKAYA
Jan 26 at 6:14

@Graipher fixed
â€“Â M. Utku ALTINKAYA
Jan 26 at 8:06

Added your function to the timings
â€“Â Graipher
Jan 26 at 9:12

"(...) compared to regex I think it is easier to read". Nope
â€“Â Ã‘ÂÃ’Â¯ÃÂ…ÃÂº
Jan 26 at 11:24

Thanks @Graipher, I've revised that code and added a regex alternative also
â€“Â M. Utku ALTINKAYA
Jan 26 at 12:11

add a commentÂ |Â

draft saved

draft discarded

draft saved

draft discarded

Post as a guest

Name

搜尋此網誌

trjhtr