Skip to content

Remove test broken by Python CVE-2021-23336 fix #165

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

andersk
Copy link

@andersk andersk commented Mar 1, 2021

In Python 3.6.13, 3.7.10, 3.8.8, and 3.9.2, urllib.parse.parse_qsl no longer treats ; as a separator by default.

Fixes #164.

In Python 3.6.13, 3.7.10, 3.8.8, and 3.9.2, urllib.parse.parse_qsl no
longer treats ; as a separator by default
(https://bugs.python.org/issue42967).

Fixes scrapy#164.

Signed-off-by: Anders Kaseorg <[email protected]>
@codecov
Copy link

codecov bot commented Mar 1, 2021

Codecov Report

Merging #165 (78054f1) into master (910ff63) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master     #165   +/-   ##
=======================================
  Coverage   95.44%   95.44%           
=======================================
  Files           7        7           
  Lines         483      483           
  Branches       99       99           
=======================================
  Hits          461      461           
  Misses         15       15           
  Partials        7        7           

@Gallaecio
Copy link
Member

I think it would be best to keep the test with updated expectations depending on the Python version being used, and include a link to https://python-security.readthedocs.io/vuln/urllib-query-string-semicolon-separator.html in a comment.

Based on https://bugs.python.org/issue42967#msg387756:

url = 'http://domain/test?arg1=v1;arg2=v2'
# https://python-security.readthedocs.io/vuln/urllib-query-string-semicolon-separator.html
if 'separator' in inspect.signature(urllib.parse.parse_qs).parameters:
    expected_url = 'http://domain/test?arg1=v3'
else:
    expected_url = 'http://domain/test?arg1=v3&arg2=v2'
self.assertEqual(add_or_replace_parameter(url, 'arg1', 'v3'), expected_url)

@andersk
Copy link
Author

andersk commented Mar 2, 2021

TBH I don’t see the advantage in asserting that the insecure version is insecure, but if you do, feel free to modify.

@wRAR
Copy link
Member

wRAR commented Mar 11, 2021

I'm in favor of just removing the test, especially as we won't run it with non-patched Python versions on CI anyway. I see #166 instead moves it to an xfail test which is fine for me too.

@Gallaecio
Copy link
Member

I don’t think tests should fail if run on a Python version without the security patch. We could force an unpatched Python version in CI to make sure we test both behaviors.

I don’t have a strong opinion on it, though. I’m fine with xfail or removing.

And just to be clear, I think we should eventually extend w3lib to support the old behavior, because I expect corner cases where users may need it. Not just the new option, which allows to choose between & and ;, but support for interpreting either as a parameter separator.

In fact, I expect w3lib users like Scrapy to go for the old behavior by default. I don’t think Scrapy is affected by the vulnerability, but not supporting the flexibility of the standard can potentially require a custom middleware to deal with these characters in order to crawl some websites.

@wRAR
Copy link
Member

wRAR commented Mar 11, 2021

It's an interesting point about supporting old behavior.

@Gallaecio
Copy link
Member

This was technically fixed by #166 with the xfail approach.

@Gallaecio Gallaecio closed this Mar 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

test_add_or_replace_parameter fails on Python 3.6.13, 3.7.10, 3.8.8, 3.9.2 due to CVE-2021-23336 fix
3 participants