From 2e12b6539c172de01756b613f7e4c39980ca9158 Mon Sep 17 00:00:00 2001 From: Aaron Meurer Date: Tue, 23 Jan 2024 17:39:57 -0700 Subject: [PATCH 1/6] Update the README --- README.md | 186 ++++++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 182 insertions(+), 4 deletions(-) diff --git a/README.md b/README.md index 2a15cac..84c024f 100644 --- a/README.md +++ b/README.md @@ -1,13 +1,191 @@ -array-api-strict -================ +# array-api-strict -A strict, minimal implementation of the [Python array +`array_api_strict` is a strict, minimal implementation of the [Python array API](https://data-apis.org/array-api/latest/) +The purpose of array-api-strict is to provide an implementation of the array +API for consuming libraries to test against so they can be completely sure +their usage of the array API is portable. + +It is *not* intended to be used by end-users. End-users of the array API +should just use their favorite array library (NumPy, CuPy, PyTorch, etc.) as +usual. It is also not intended to be used as a dependency by consuming +libraries. Consuming library code should use the +[array-api-compat](https://github.com/data-apis/array-api-compat) package to +support the array API. Rather, it is intended to be used in the test suites of +consuming libraries to test their array API usage. + +## Install + +`array-api-strict` is available on both +[PyPI](https://pypi.org/project/array-api-strict/) + +``` +python -m pip install array-api-strict +``` + +and [Conda-forge](https://anaconda.org/conda-forge/array-api-strict) + +``` +conda install --channel conda-forge array-api-strict +``` + +array-api-strict supports NumPy 1.26 and (the upcoming) NumPy 2.0. + +## Rationale + +The array API has many functions and behaviors that are required to be +implemented by conforming libraries, but it does not, in most cases, disallow +implementing additional functions, keyword arguments, and behaviors that +aren't explicitly required by the standard. 
+ +However, this poses a problem for consumers of the array API, as they may +accidentally use a function or rely on a behavior which just happens to be +implemented in every array library they test against (e.g., NumPy and +PyTorch), but isn't required by the standard and may not be included in other +libraries. + +array-api-strict solves this problem by providing a strict, minimal +implementation of the array API standard. Only those functions and behaviors +that are explicitly *required* by the standard are implemented. For example, +most NumPy functions accept Python scalars as inputs: + +```py +>>> import numpy as np +>>> np.sin(0.0) +0.0 +``` + +However, the standard only specifies function inputs on `Array` objects. And +indeed, some libraries, such as PyTorch, do not allow this: + +```py +>>> import torch +>>> torch.sin(0.0) +Traceback (most recent call last): + File "<stdin>", line 1, in <module> +TypeError: sin(): argument 'input' (position 1) must be Tensor, not float +``` + +In array-api-strict, this is also an error: + +```py +>>> import array_api_strict as xp +>>> xp.sin(0.0) +Traceback (most recent call last): +... +AttributeError: 'float' object has no attribute 'dtype' +``` + +Here is an (incomplete) list of the sorts of ways that array-api-strict is +strict/minimal: + +- Only those functions and methods that are [defined in the + standard](https://data-apis.org/array-api/draft/API_specification/index.html) + are included. + +- In those functions, only the keyword-arguments that are defined by the + standard are included. All signatures in array-api-strict use + [positional-only + arguments](https://data-apis.org/array-api/draft/API_specification/function_and_method_signatures.html#function-and-method-signatures). + As noted above, only array_api_strict array objects are accepted by + functions, except in the places where the standard allows Python scalars + (i.e., functions to not automatically call `asarray` on their inputs). 
+ +- Only those [dtypes that are defined in the + standard](https://data-apis.org/array-api/draft/API_specification/data_types.html) + are included. + +- All functions and methods reject inputs if the standard does not *require* + the input dtype(s) to be supported. This is one of the most restrictive + aspects of the library. For example, in NumPy, most transcendental functions + like `sin` will accept integer array inputs, but the [standard only requires + them to accept floating-point + inputs](https://data-apis.org/array-api/draft/API_specification/generated/array_api.sin.html#array_api.sin), + so in array-api-strict, `sin(integer_array)` will raise an exception. + +- The + [indexing](https://data-apis.org/array-api/draft/API_specification/indexing.html) + semantics required by the standard are not + +- There are no distinct "scalar" objects as in NumPy. There are only 0-D + arrays. + +- Dtype objects are just empty objects that only implement [equality + comparison](https://data-apis.org/array-api/draft/API_specification/generated/array_api.data_types.__eq__.html). + The way to access dtype objects in the standard is by name, like + `xp.float32`. + +- The array object type itself is private and should not be accessed. + Subclassing or otherwise trying to directly initialize this object is not + supported. Arrays should created with one of the [array creation + functions](https://data-apis.org/array-api/draft/API_specification/creation_functions.html) + such as `asarray`. + +## Caveats + +array-api-strict is a thin pure Python wrapper around NumPy. NumPy 2.0 fully +supports the array API but NumPy 1.26 does not, so many behaviors are wrapped +in NumPy 1.26 to provide array API compatible behavior. Although it is based +on NumPy, mixing NumPy arrays with array-api-strict arrays is not supported. +This should generally raise an error, as it indicates a potential portability +issue, but this hasn't necessarily been tested thoroughly. + +1. 
array-api-strict is validated against the [array API test + suite](https://github.com/data-apis/array-api-tests). However, there may be + a few minor instances where NumPy deviates from the standard in a way that + is inconvenient to workaround in array-api-strict, since it aims to remain + pure Python. You can see the full list of tests that are known to fail in + the [xfails + file](https://github.com/data-apis/array-api-strict/blob/main/array-api-tests-xfails.txt). + + The most notable of these is that in NumPy 1.26, the `copy=False` flag is + not implemented for `asarray` and therefore `array_api_strict` raises + `NotImplementedError` in that case. + +2. Since NumPy is a CPU-only library, the [device + support](https://data-apis.org/array-api/draft/design_topics/device_support.html) + in array-api-strict is superficial only. `x.device` is always a (private) + `_CPU_DEVICE` object, and `device` keywords to creation functions only + accept either this object or `None`. A future version of array-api-strict + [may add support for a CuPy + backend](https://github.com/data-apis/array-api-strict/issues/5) so that + more significant device support can be tested. + +3. Although only array types are expected in array-api-strict functions, + currently most functions do not do extensive type checking on their inputs, + so a sufficiently duck-typed object may pass through silently (or at best, + you may get `AttributeError` instead of `TypeError`). However, all type + signatures have type annotations (based on those from the standard), so + this deviation may be tested with type checking. This [behavior may improve + in the future](https://github.com/data-apis/array-api-strict/issues/6). + +4. There are some behaviors in the standard that are not required to be + implemented by libraries that cannot support [data dependent + shapes](https://data-apis.org/array-api/draft/design_topics/data_dependent_output_shapes.html). 
+ This includes [the `unique_*` + functions](https://data-apis.org/array-api/draft/API_specification/set_functions.html), + [boolean array + indexing](https://data-apis.org/array-api/draft/API_specification/indexing.html#boolean-array-indexing), + and the + [`nonzero`](https://data-apis.org/array-api/draft/API_specification/generated/array_api.nonzero.html) + function. array-api-strict currently implements all of these. In the + future, [there may be a way to disable them](https://github.com/data-apis/array-api-strict/issues/7). + +5. array-api-strict currently only supports the latest version of the array + API standard. [This may change in the future depending on + need](https://github.com/data-apis/array-api-strict/issues/8). + +## Usage + +TODO: Add a sample CI script here. + +## Relationship to `numpy.array_api` + Previously this implementation was available as `numpy.array_api`, but it was moved to a separate package for NumPy 2.0. -Note: the history of this repo prior to commit +Note that the history of this repo prior to commit fbefd42e4d11e9be20e0a4785f2619fc1aef1e7c was generated automatically from the numpy git history, using the following [git-filter-repo](https://github.com/newren/git-filter-repo) command: From 07d482fbe3bec07da26a7f5e335e01cefbbdb015 Mon Sep 17 00:00:00 2001 From: Aaron Meurer Date: Tue, 23 Jan 2024 17:42:06 -0700 Subject: [PATCH 2/6] Update the module docstring --- array_api_strict/__init__.py | 127 ++++------------------------------- 1 file changed, 14 insertions(+), 113 deletions(-) diff --git a/array_api_strict/__init__.py b/array_api_strict/__init__.py index b3c22c8..326f55d 100644 --- a/array_api_strict/__init__.py +++ b/array_api_strict/__init__.py @@ -1,117 +1,18 @@ """ -A NumPy sub-namespace that conforms to the Python array API standard. - -This submodule accompanies NEP 47, which proposes its inclusion in NumPy. It -is still considered experimental, and will issue a warning when imported. 
- -This is a proof-of-concept namespace that wraps the corresponding NumPy -functions to give a conforming implementation of the Python array API standard -(https://data-apis.github.io/array-api/latest/). The standard is currently in -an RFC phase and comments on it are both welcome and encouraged. Comments -should be made either at https://github.com/data-apis/array-api or at -https://github.com/data-apis/consortium-feedback/discussions. - -NumPy already follows the proposed spec for the most part, so this module -serves mostly as a thin wrapper around it. However, NumPy also implements a -lot of behavior that is not included in the spec, so this serves as a -restricted subset of the API. Only those functions that are part of the spec -are included in this namespace, and all functions are given with the exact -signature given in the spec, including the use of position-only arguments, and -omitting any extra keyword arguments implemented by NumPy but not part of the -spec. The behavior of some functions is also modified from the NumPy behavior -to conform to the standard. Note that the underlying array object itself is -wrapped in a wrapper Array() class, but is otherwise unchanged. This submodule -is implemented in pure Python with no C extensions. - -The array API spec is designed as a "minimal API subset" and explicitly allows -libraries to include behaviors not specified by it. But users of this module -that intend to write portable code should be aware that only those behaviors -that are listed in the spec are guaranteed to be implemented across libraries. -Consequently, the NumPy implementation was chosen to be both conforming and -minimal, so that users can use this implementation of the array API namespace -and be sure that behaviors that it defines will be available in conforming -namespaces from other libraries. 
- -A few notes about the current state of this submodule: - -- There is a test suite that tests modules against the array API standard at - https://github.com/data-apis/array-api-tests. The test suite is still a work - in progress, but the existing tests pass on this module, with a few - exceptions: - - - DLPack support (see https://github.com/data-apis/array-api/pull/106) is - not included here, as it requires a full implementation in NumPy proper - first. - - The test suite is not yet complete, and even the tests that exist are not - guaranteed to give a comprehensive coverage of the spec. Therefore, when - reviewing and using this submodule, you should refer to the standard - documents themselves. There are some tests in array_api_strict.tests, but - they primarily focus on things that are not tested by the official array API - test suite. - -- There is a custom array object, array_api_strict.Array, which is returned by - all functions in this module. All functions in the array API namespace - implicitly assume that they will only receive this object as input. The only - way to create instances of this object is to use one of the array creation - functions. It does not have a public constructor on the object itself. The - object is a small wrapper class around numpy.ndarray. The main purpose of it - is to restrict the namespace of the array object to only those dtypes and - only those methods that are required by the spec, as well as to limit/change - certain behavior that differs in the spec. In particular: - - - The array API namespace does not have scalar objects, only 0-D arrays. - Operations on Array that would create a scalar in NumPy create a 0-D - array. - - - Indexing: Only a subset of indices supported by NumPy are required by the - spec. The Array object restricts indexing to only allow those types of - indices that are required by the spec. See the docstring of the - array_api_strict.Array._validate_indices helper function for more - information. 
- - - Type promotion: Some type promotion rules are different in the spec. In - particular, the spec does not have any value-based casting. The spec also - does not require cross-kind casting, like integer -> floating-point. Only - those promotions that are explicitly required by the array API - specification are allowed in this module. See NEP 47 for more info. - - - Functions do not automatically call asarray() on their input, and will not - work if the input type is not Array. The exception is array creation - functions, and Python operators on the Array object, which accept Python - scalars of the same type as the array dtype. - -- All functions include type annotations, corresponding to those given in the - spec (see _typing.py for definitions of some custom types). These do not - currently fully pass mypy due to some limitations in mypy. - -- Dtype objects are just the NumPy dtype objects, e.g., float64 = - np.dtype('float64'). The spec does not require any behavior on these dtype - objects other than that they be accessible by name and be comparable by - equality, but it was considered too much extra complexity to create custom - objects to represent dtypes. - -- All places where the implementations in this submodule are known to deviate - from their corresponding functions in NumPy are marked with "# Note:" - comments. - -Still TODO in this module are: - -- DLPack support for numpy.ndarray is still in progress. See - https://github.com/numpy/numpy/pull/19083. - -- The copy=False keyword argument to asarray() is not yet implemented. This - requires support in numpy.asarray() first. - -- Some functions are not yet fully tested in the array API test suite, and may - require updates that are not yet known until the tests are written. - -- The spec is still in an RFC phase and may still have minor updates, which - will need to be reflected here. 
- -- Complex number support in array API spec is planned but not yet finalized, - as are the fft extension and certain linear algebra functions such as eig - that require complex dtypes. +array_api_strict is a strict, minimal implementation of the Python array +API (https://data-apis.org/array-api/latest/) + +The purpose of array-api-strict is to provide an implementation of the array +API for consuming libraries to test against so they can be completely sure +their usage of the array API is portable. + +It is *not* intended to be used by end-users. End-users of the array API +should just use their favorite array library (NumPy, CuPy, PyTorch, etc.) as +usual. It is also not intended to be used as a dependency by consuming +libraries. Consuming library code should use the +array-api-compat (https://github.com/data-apis/array-api-compat) package to +support the array API. Rather, it is intended to be used in the test suites of +consuming libraries to test their array API usage. """ From e4d079885288d4982efd47ac00214dd4bd10d2c2 Mon Sep 17 00:00:00 2001 From: Aaron Meurer Date: Tue, 23 Jan 2024 17:53:17 -0700 Subject: [PATCH 3/6] Fixes in the README --- README.md | 26 +++++++++++++------------- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/README.md b/README.md index 84c024f..f34548d 100644 --- a/README.md +++ b/README.md @@ -81,19 +81,19 @@ Here is an (incomplete) list of the sorts of ways that array-api-strict is strict/minimal: - Only those functions and methods that are [defined in the - standard](https://data-apis.org/array-api/draft/API_specification/index.html) + standard](https://data-apis.org/array-api/latest/API_specification/index.html) are included. - In those functions, only the keyword-arguments that are defined by the standard are included. All signatures in array-api-strict use [positional-only - arguments](https://data-apis.org/array-api/draft/API_specification/function_and_method_signatures.html#function-and-method-signatures). 
+ arguments](https://data-apis.org/array-api/latest/API_specification/function_and_method_signatures.html#function-and-method-signatures). As noted above, only array_api_strict array objects are accepted by functions, except in the places where the standard allows Python scalars (i.e., functions to not automatically call `asarray` on their inputs). - Only those [dtypes that are defined in the - standard](https://data-apis.org/array-api/draft/API_specification/data_types.html) + standard](https://data-apis.org/array-api/latest/API_specification/data_types.html) are included. - All functions and methods reject inputs if the standard does not *require* @@ -101,25 +101,25 @@ strict/minimal: aspects of the library. For example, in NumPy, most transcendental functions like `sin` will accept integer array inputs, but the [standard only requires them to accept floating-point - inputs](https://data-apis.org/array-api/draft/API_specification/generated/array_api.sin.html#array_api.sin), + inputs](https://data-apis.org/array-api/latest/API_specification/generated/array_api.sin.html#array_api.sin), so in array-api-strict, `sin(integer_array)` will raise an exception. - The - [indexing](https://data-apis.org/array-api/draft/API_specification/indexing.html) + [indexing](https://data-apis.org/array-api/latest/API_specification/indexing.html) semantics required by the standard are not - There are no distinct "scalar" objects as in NumPy. There are only 0-D arrays. - Dtype objects are just empty objects that only implement [equality - comparison](https://data-apis.org/array-api/draft/API_specification/generated/array_api.data_types.__eq__.html). + comparison](https://data-apis.org/array-api/latest/API_specification/generated/array_api.data_types.__eq__.html). The way to access dtype objects in the standard is by name, like `xp.float32`. - The array object type itself is private and should not be accessed. 
Subclassing or otherwise trying to directly initialize this object is not supported. Arrays should created with one of the [array creation - functions](https://data-apis.org/array-api/draft/API_specification/creation_functions.html) + functions](https://data-apis.org/array-api/latest/API_specification/creation_functions.html) such as `asarray`. ## Caveats @@ -144,9 +144,9 @@ issue, but this hasn't necessarily been tested thoroughly. `NotImplementedError` in that case. 2. Since NumPy is a CPU-only library, the [device - support](https://data-apis.org/array-api/draft/design_topics/device_support.html) + support](https://data-apis.org/array-api/latest/design_topics/device_support.html) in array-api-strict is superficial only. `x.device` is always a (private) - `_CPU_DEVICE` object, and `device` keywords to creation functions only + `CPU_DEVICE` object, and `device` keywords to creation functions only accept either this object or `None`. A future version of array-api-strict [may add support for a CuPy backend](https://github.com/data-apis/array-api-strict/issues/5) so that @@ -162,13 +162,13 @@ issue, but this hasn't necessarily been tested thoroughly. 4. There are some behaviors in the standard that are not required to be implemented by libraries that cannot support [data dependent - shapes](https://data-apis.org/array-api/draft/design_topics/data_dependent_output_shapes.html). + shapes](https://data-apis.org/array-api/latest/design_topics/data_dependent_output_shapes.html). 
This includes [the `unique_*` - functions](https://data-apis.org/array-api/draft/API_specification/set_functions.html), + functions](https://data-apis.org/array-api/latest/API_specification/set_functions.html), [boolean array - indexing](https://data-apis.org/array-api/draft/API_specification/indexing.html#boolean-array-indexing), + indexing](https://data-apis.org/array-api/latest/API_specification/indexing.html#boolean-array-indexing), and the - [`nonzero`](https://data-apis.org/array-api/draft/API_specification/generated/array_api.nonzero.html) + [`nonzero`](https://data-apis.org/array-api/latest/API_specification/generated/array_api.nonzero.html) function. array-api-strict currently implements all of these. In the future, [there may be a way to disable them](https://github.com/data-apis/array-api-strict/issues/7). From 2bc5edd39958cdb2f87475694f2da3e5fbe1e1ba Mon Sep 17 00:00:00 2001 From: Aaron Meurer Date: Tue, 23 Jan 2024 17:53:25 -0700 Subject: [PATCH 4/6] Add a CHANGELOG --- CHANGELOG.md | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+) create mode 100644 CHANGELOG.md diff --git a/CHANGELOG.md b/CHANGELOG.md new file mode 100644 index 0000000..0f92694 --- /dev/null +++ b/CHANGELOG.md @@ -0,0 +1,30 @@ +# array-api-strict Changelog + +## 1.0 (????) + +This is the first release of `array_api_strict`. It is extracted from +`numpy.array_api`, which was included as an experimental submodule in NumPy +versions prior to 2.0. Note that the commit history in this repository is +extracted from the git history of numpy/array_api/ (see the [README](README.md)). + +Additionally, the following changes are new to `array_api_strict` from +`numpy.array_api` in NumPy 1.26 (the last major NumPy release to include +`numpy.array_api`): + +- ``array_api_strict`` was made more portable. In particular: + + - ``array_api_strict`` no longer uses ``"cpu"`` as its "device", but rather a + separate ``CPU_DEVICE`` object (which is not accessible in the namespace). 
+ This is because "cpu" is not part of the array API standard. + + - ``array_api_strict`` now uses separate wrapped objects for dtypes. + Previously it reused the ``numpy`` dtype objects. This makes it clear + which behaviors on dtypes are part of the array API standard (effectively, + the standard only requires ``==`` on dtype objects). + +- ``numpy.array_api.nonzero`` now errors on zero-dimensional arrays, as + required by the array API standard. + +- Support for the optional [fft + extension](https://data-apis.org/array-api/latest/extensions/fourier_transform_functions.html) + was added. From c027af9e5ad8abff38e9c323f61e7538f7725f60 Mon Sep 17 00:00:00 2001 From: Aaron Meurer Date: Wed, 24 Jan 2024 13:15:29 -0700 Subject: [PATCH 5/6] Wording updates from review --- CHANGELOG.md | 2 +- README.md | 10 ++++++---- 2 files changed, 7 insertions(+), 5 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 0f92694..be40bd6 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -8,7 +8,7 @@ versions prior to 2.0. Note that the commit history in this repository is extracted from the git history of numpy/array_api/ (see the [README](README.md)). Additionally, the following changes are new to `array_api_strict` from -`numpy.array_api` in NumPy 1.26 (the last major NumPy release to include +`numpy.array_api` in NumPy 1.26 (the last NumPy feature release to include `numpy.array_api`): - ``array_api_strict`` was made more portable. In particular: diff --git a/README.md b/README.md index f34548d..dcf6b13 100644 --- a/README.md +++ b/README.md @@ -88,9 +88,9 @@ strict/minimal: standard are included. All signatures in array-api-strict use [positional-only arguments](https://data-apis.org/array-api/latest/API_specification/function_and_method_signatures.html#function-and-method-signatures). 
- As noted above, only array_api_strict array objects are accepted by + As noted above, only `array_api_strict` array objects are accepted by functions, except in the places where the standard allows Python scalars - (i.e., functions to not automatically call `asarray` on their inputs). + (i.e., functions do not automatically call `asarray` on their inputs). - Only those [dtypes that are defined in the standard](https://data-apis.org/array-api/latest/API_specification/data_types.html) @@ -106,7 +106,9 @@ strict/minimal: - The [indexing](https://data-apis.org/array-api/latest/API_specification/indexing.html) - semantics required by the standard are not + semantics required by the standard are limited compared to those implemented + by NumPy (e.g., out-of-bounds slices are not supported, integer array + indexing is not supported, only a single boolean array index is supported). - There are no distinct "scalar" objects as in NumPy. There are only 0-D arrays. @@ -118,7 +120,7 @@ strict/minimal: - The array object type itself is private and should not be accessed. Subclassing or otherwise trying to directly initialize this object is not - supported. Arrays should created with one of the [array creation + supported. Arrays should be created with one of the [array creation functions](https://data-apis.org/array-api/latest/API_specification/creation_functions.html) such as `asarray`. From c0f153460b1881115ceccfad854dc5d9ae346f55 Mon Sep 17 00:00:00 2001 From: Aaron Meurer Date: Wed, 24 Jan 2024 13:16:13 -0700 Subject: [PATCH 6/6] Add .DS_Store to .gitignore --- .gitignore | 2 ++ 1 file changed, 2 insertions(+) diff --git a/.gitignore b/.gitignore index 68bc17f..dbce267 100644 --- a/.gitignore +++ b/.gitignore @@ -158,3 +158,5 @@ cython_debug/ # and can be added to the global gitignore or merged into this file. For a more nuclear # option (not recommended) you can uncomment the following to ignore the entire idea folder. #.idea/ + +.DS_Store