Skip to content

Performance tweaks for to_camel_case #131

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jun 23, 2022
Merged

Conversation

pamelafox
Copy link
Member

This PR includes two performance tweaks for the to_camel_case function:

  1. It replaces the generator expression passed to str.join() with a list comprehension. The list comprehension is actually 10-15% faster as str.join() turns its arguments into a list anyway (if they're not already). More details in this benchmark test: Add benchmark comparing join of gen expression vs join of list comp tonybaloney/anti-patterns#4

  2. It removes the re.IGNORECASE flag and instead specifies a-zA-Z instead of a-z. That seems to be 20-25% faster according to benchmark in Add a benchmark for regex related practices tonybaloney/anti-patterns#5

WORD_RE = re.compile(r'^([a-z]+\d*)$', re.IGNORECASE)

SNAKE_CASE_RE = re.compile(r'^([a-zA-Z]+\d*_[a-zA-Z\d_]*|_+[a-zA-Z\d]+[a-zA-Z\d_]*)$')
WORD_RE = re.compile(r'^([a-zA-Z]+\d*)$')
Copy link
Member Author

@pamelafox pamelafox May 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that both of these could be simplified/combined into a single shorter regular expression, (_[a-zA-Z\d]|[a-zA-Z])\w*, but that would be a larger change and perhaps there's a reason for wanting to separate them. I'm also unclear about whether digits are allowed in the middle of letters, the WORD_RE seems to indicate they're not.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @pamelafox.
@YunchuWang - We can take this experience to simplify out regexes.

@codecov
Copy link

codecov bot commented May 25, 2022

Codecov Report

Merging #131 (cc1101d) into dev (6271020) will decrease coverage by 4.04%.
The diff coverage is 100.00%.

❗ Current head cc1101d differs from pull request most recent head e283dd1. Consider uploading reports for the commit e283dd1 to get more accurate results

@@            Coverage Diff             @@
##              dev     #131      +/-   ##
==========================================
- Coverage   90.09%   86.04%   -4.05%     
==========================================
  Files          51       50       -1     
  Lines        2959     2903      -56     
  Branches      396      391       -5     
==========================================
- Hits         2666     2498     -168     
- Misses        218      329     +111     
- Partials       75       76       +1     
Flag Coverage Δ
unittests 86.01% <100.00%> (-4.05%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
azure/functions/decorators/utils.py 97.46% <100.00%> (ø)
azure/functions/queue.py 50.79% <0.00%> (-44.45%) ⬇️
azure/functions/http.py 53.94% <0.00%> (-30.27%) ⬇️
azure/functions/cosmosdb.py 74.35% <0.00%> (-25.65%) ⬇️
azure/functions/eventhub.py 69.46% <0.00%> (-19.09%) ⬇️
azure/functions/kafka.py 79.85% <0.00%> (-11.95%) ⬇️
azure/functions/durable_functions.py 83.33% <0.00%> (-3.04%) ⬇️
azure/functions/servicebus.py 83.85% <0.00%> (-2.09%) ⬇️
azure/functions/_http.py 91.30% <0.00%> (-1.74%) ⬇️
azure/functions/decorators/function_app.py 98.67% <0.00%> (-0.73%) ⬇️
... and 4 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6271020...e283dd1. Read the comment docs.

@pamelafox pamelafox marked this pull request as ready for review May 25, 2022 21:13
WORD_RE = re.compile(r'^([a-z]+\d*)$', re.IGNORECASE)

SNAKE_CASE_RE = re.compile(r'^([a-zA-Z]+\d*_[a-zA-Z\d_]*|_+[a-zA-Z\d]+[a-zA-Z\d_]*)$')
WORD_RE = re.compile(r'^([a-zA-Z]+\d*)$')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @pamelafox.
@YunchuWang - We can take this experience to simplify out regexes.

@vrdmr vrdmr merged commit 45e16bc into Azure:dev Jun 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants