-
Notifications
You must be signed in to change notification settings - Fork 69
Performance tweaks for to_camel_case #131
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WORD_RE = re.compile(r'^([a-z]+\d*)$', re.IGNORECASE) | ||
|
||
SNAKE_CASE_RE = re.compile(r'^([a-zA-Z]+\d*_[a-zA-Z\d_]*|_+[a-zA-Z\d]+[a-zA-Z\d_]*)$') | ||
WORD_RE = re.compile(r'^([a-zA-Z]+\d*)$') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that both of these could be simplified/combined into a single shorter regular expression, (_[a-zA-Z\d]|[a-zA-Z])\w*
, but that would be a larger change and perhaps there's a reason for wanting to separate them. I'm also unclear about whether digits are allowed in the middle of letters, the WORD_RE seems to indicate they're not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @pamelafox.
@YunchuWang - We can take this experience to simplify out regexes.
Codecov Report
@@ Coverage Diff @@
## dev #131 +/- ##
==========================================
- Coverage 90.09% 86.04% -4.05%
==========================================
Files 51 50 -1
Lines 2959 2903 -56
Branches 396 391 -5
==========================================
- Hits 2666 2498 -168
- Misses 218 329 +111
- Partials 75 76 +1
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
WORD_RE = re.compile(r'^([a-z]+\d*)$', re.IGNORECASE) | ||
|
||
SNAKE_CASE_RE = re.compile(r'^([a-zA-Z]+\d*_[a-zA-Z\d_]*|_+[a-zA-Z\d]+[a-zA-Z\d_]*)$') | ||
WORD_RE = re.compile(r'^([a-zA-Z]+\d*)$') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @pamelafox.
@YunchuWang - We can take this experience to simplify out regexes.
This PR includes two performance tweaks for the
to_camel_case
function:It replaces the generator expression passed to
str.join()
with a list comprehension. The list comprehension is actually 10-15% faster asstr.join()
turns its arguments into a list anyway (if they're not already). More details in this benchmark test: Add benchmark comparing join of gen expression vs join of list comp tonybaloney/anti-patterns#4It removes the re.IGNORECASE flag and instead specifies a-zA-Z instead of a-z. That seems to be 20-25% faster according to benchmark in Add a benchmark for regex related practices tonybaloney/anti-patterns#5