-
-
Notifications
You must be signed in to change notification settings - Fork 52
Fix/issue765 SPDX is_compound_expression does not strictly check for compound expression #773
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -18,60 +18,149 @@ | |
|
||
__all__ = [ | ||
'is_supported_id', 'fixup_id', | ||
'is_compound_expression' | ||
'is_simple_expression', 'is_compound_expression', 'is_spdx_license_id', 'is_spdx_expression' | ||
] | ||
|
||
from json import load as json_load | ||
from typing import TYPE_CHECKING, Dict, Optional, Set | ||
|
||
from license_expression import get_spdx_licensing # type:ignore[import-untyped] | ||
|
||
from .schema._res import SPDX_JSON as __SPDX_JSON_SCHEMA | ||
from boolean.boolean import Expression # type:ignore[import-untyped] | ||
from license_expression import ( # type:ignore[import-untyped] | ||
AND, | ||
OR, | ||
ExpressionError, | ||
LicenseSymbol, | ||
LicenseWithExceptionSymbol, | ||
get_license_index, | ||
get_spdx_licensing, | ||
) | ||
|
||
if TYPE_CHECKING: # pragma: no cover | ||
from license_expression import Licensing | ||
|
||
# region init | ||
# python's internal module loader will assure that this init-part runs only once. | ||
|
||
# !!! this requires to ship the actual schema data with the package. | ||
with open(__SPDX_JSON_SCHEMA) as schema: | ||
__IDS: Set[str] = set(json_load(schema).get('enum', [])) | ||
assert len(__IDS) > 0, 'known SPDX-IDs should be non-empty set' | ||
|
||
__IDS_LOWER_MAP: Dict[str, str] = dict((id_.lower(), id_) for id_ in __IDS) | ||
|
||
__SPDX_EXPRESSION_LICENSING: 'Licensing' = get_spdx_licensing() | ||
__KNOWN_IDS = ([entry['spdx_license_key'] for entry in get_license_index() | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. do i see this right? Instead of using CycloneDX own well-known list of allowed values for a SPDX license ID, There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You are using the license-expression module for license validation in the is_compound _expression. And this module internally takes already the data I use for the SPDX IDs to do this verification for official and specified LicenseRef-scancode-*. So I did not introduce something new. To use different data from different sources is inconsistent for me an leds to problems if one of the modules, cyclonedx-python-lib and license-expression are not most up to date and correct at the same time. Depending on usage on a single or compound the use of same licenses may result in different IDs. This is what we currently already have. It could be fine to have a text (for the target application) to check license-expressions SPDX license list IDs data. Furthermore, the BSI explicitly recommends the usage of the AboutCode/ScanCode database in their SBOM TR (for the LicenseRef-scancode-* custom IDs that also is incorporated in the license-expression module. And the module also receives updates for the SPDX license list. If your point to use both at the same time, this is your decision as a core maintainer. But for me I say that I do not want to rely on two different lists burned manually into each module version. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
This is a complete behavioral change.
there were no different data. there was just one data: the official list of well-known SPDX license IDs from the official CycloneDX spec. please see below.
it is not my decision. it is adherence to the spec.
This is the BSI's very own narrow profile/view/flavor of SBOM. and it is a recommendation that nobody is forced to follow. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Let me say just two things. If you would extend the tests as I did you would see that it is not working as documented in the SPDX spec. Please look into the function description of Licensing.parse and the code of license-expression in general to see how it works. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
thank you for pointing this out. 👍 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. created the needed tests (#782), but cannot simply fix. |
||
if entry['spdx_license_key'] and not entry['is_exception']] | ||
+ [item for license_entry in get_license_index() | ||
for item in license_entry['other_spdx_license_keys'] if not license_entry['is_exception']]) | ||
__IDS: Set[str] = set(__KNOWN_IDS) | ||
__IDS_LOWER_MAP: Dict[str, str] = {**{entry['spdx_license_key'].lower(): entry['spdx_license_key'] | ||
for entry in get_license_index() | ||
if entry['spdx_license_key'] and not entry['is_exception']}, | ||
**{item.lower(): item for license_entry in get_license_index() | ||
for item in license_entry['other_spdx_license_keys'] | ||
if not license_entry['is_exception']}} | ||
|
||
# endregion | ||
|
||
|
||
def is_supported_id(value: str) -> bool: | ||
"""Validate a SPDX-ID according to current spec.""" | ||
"""Validate an SPDX-ID according to current spec.""" | ||
return value in __IDS | ||
|
||
|
||
def fixup_id(value: str) -> Optional[str]: | ||
"""Fixup a SPDX-ID. | ||
"""Fixup an SPDX-ID. | ||
|
||
:returns: repaired value string, or `None` if fixup was unable to help. | ||
""" | ||
return __IDS_LOWER_MAP.get(value.lower()) | ||
|
||
|
||
def is_simple_expression(value: str, validate: bool = False) -> bool: | ||
"""Indicates an SPDX simple expression (SPDX license identifier or license ref). | ||
|
||
.. note:: | ||
Utilizes `license-expression library`_ to | ||
validate SPDX simple expression according to `SPDX license expression spec`_. | ||
DocumentRef- references are not in scope for CycloneDX. | ||
|
||
.. _SPDX license expression spec: https://spdx.github.io/spdx-spec/v2.3/SPDX-license-expressions/ | ||
.. _license-expression library: https://github.com/nexB/license-expression | ||
""" | ||
if not value: | ||
return False | ||
try: | ||
expression = __SPDX_EXPRESSION_LICENSING.parse(value, strict=True, validate=validate) | ||
except (NameError, ExpressionError): | ||
return False | ||
if type(expression) in [OR, AND]: | ||
return False | ||
if str(expression).startswith('LicenseRef-'): | ||
# It is a custom license ref | ||
return True | ||
# It should be an official SPDX license identifier | ||
result = __SPDX_EXPRESSION_LICENSING.validate(value, strict=True) | ||
if result.errors: | ||
# The value was not understood | ||
return False | ||
if result.original_expression == result.normalized_expression: | ||
# The given value is identical to normalized, so it is a valid identifier | ||
return True | ||
if result.original_expression.upper() != result.normalized_expression.upper(): | ||
# It is not a capitalization issue, ID was normalized to another valid ID, so it is OK. | ||
return True | ||
return False | ||
|
||
|
||
def is_compound_expression(value: str) -> bool: | ||
"""Validate compound expression. | ||
"""Indicates whether value is an SPDX compound expression. | ||
|
||
.. note:: | ||
Utilizes `license-expression library`_ to | ||
validate SPDX compound expression according to `SPDX license expression spec`_. | ||
DocumentRef- references are not in scope for CycloneDX. | ||
|
||
.. _SPDX license expression spec: https://spdx.github.io/spdx-spec/v2.3/SPDX-license-expressions/ | ||
.. _license-expression library: https://github.com/nexB/license-expression | ||
""" | ||
def is_valid_item(expression: Expression) -> bool: | ||
if type(expression) in [OR, AND]: | ||
for item in expression.args: | ||
if not is_valid_item(item): | ||
return False | ||
return True | ||
elif type(expression) in [LicenseSymbol, LicenseWithExceptionSymbol]: | ||
return is_simple_expression(str(expression)) | ||
return False | ||
|
||
if not value: | ||
return False | ||
try: | ||
res = __SPDX_EXPRESSION_LICENSING.validate(value) | ||
except Exception: | ||
# the throw happens when internals crash due to unexpected input characters. | ||
parsed_expression = __SPDX_EXPRESSION_LICENSING.parse(value) | ||
if type(parsed_expression) in [OR, AND] or isinstance(parsed_expression, LicenseWithExceptionSymbol): | ||
return is_valid_item(parsed_expression) | ||
else: | ||
return False | ||
except (NameError, ExpressionError): | ||
return False | ||
return 0 == len(res.errors) | ||
|
||
|
||
def is_spdx_license_id(value: str) -> bool: | ||
"""Indicates whether value is an SPDX license identifier from official list. | ||
|
||
.. note:: | ||
Utilizes `license-expression library`_ to | ||
validate SPDX compound expression according to `SPDX license expression spec`_. | ||
DocumentRef- references are not in scope for CycloneDX. | ||
|
||
.. _SPDX license expression spec: https://spdx.github.io/spdx-spec/v2.3/SPDX-license-expressions/ | ||
.. _license-expression library: https://github.com/nexB/license-expression | ||
""" | ||
return is_simple_expression(value, validate=True) and not value.startswith('LicenseRef-') | ||
|
||
|
||
def is_spdx_expression(value: str) -> bool: | ||
"""Indicates whether value is an SPDX simple or compound expression. | ||
|
||
.. note:: | ||
Utilizes `license-expression library`_ to | ||
validate SPDX compound expression according to `SPDX license expression spec`_. | ||
DocumentRef- references are not in scope for CycloneDX. | ||
|
||
.. _SPDX license expression spec: https://spdx.github.io/spdx-spec/v2.3/SPDX-license-expressions/ | ||
.. _license-expression library: https://github.com/nexB/license-expression | ||
""" | ||
return is_simple_expression(value) or is_compound_expression(value) |
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Superseded by #783