Update to Model Builders page for DAALL-7867 #6

Merged
merged 2 commits into from
May 7, 2024
2 changes: 1 addition & 1 deletion daal4py/.buildinfo
@@ -1,4 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: b3f9dd5f9dffbf1afeba6d7177901aa9
config: e1589d1aed29dd35620ebcc85aa22a53
tags: 645f666f9bcd5a90fca523b33c5a78b7
Empty file added daal4py/.nojekyll
Empty file.
20 changes: 14 additions & 6 deletions daal4py/_modules/index.html
@@ -1,15 +1,20 @@
<!DOCTYPE html>
<html class="writer-html5" lang="en" >
<html class="writer-html5" lang="en">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Overview: module code &mdash; daal4py 2021 documentation</title>
<link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
<link rel="stylesheet" href="../_static/css/theme.css" type="text/css" />
<title>Overview: module code &mdash; daal4py 2021.1 documentation</title>
<link rel="stylesheet" type="text/css" href="../_static/pygments.css" />
<link rel="stylesheet" type="text/css" href="../_static/css/theme.css" />
<link rel="stylesheet" type="text/css" href="../_static/style.css" />


<!--[if lt IE 9]>
<script src="../_static/js/html5shiv.min.js"></script>
<![endif]-->

<script src="../_static/jquery.js"></script>
<script src="../_static/_sphinx_javascript_frameworks_compat.js"></script>
<script data-url_root="../" id="documentation_options" src="../_static/documentation_options.js"></script>
<script src="../_static/doctools.js"></script>
<script src="../_static/sphinx_highlight.js"></script>
@@ -44,6 +49,9 @@
<a href="../contents.html" class="icon icon-home">
daal4py
</a>
<div class="version">
2021
</div>
<div role="search">
<form id="rtd-search-form" class="wy-form" action="../search.html" method="get">
<input type="text" name="q" placeholder="Search docs" aria-label="Search docs" />
@@ -52,7 +60,7 @@
</form>
</div>
</div><div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="Navigation menu">
<p class="caption" role="heading"><span class="caption-text">Contents</span></p>
<p class="caption" role="heading"><span class="caption-text">Contents:</span></p>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../index.html">About daal4py</a></li>
<li class="toctree-l1"><a class="reference internal" href="../data.html">Data</a></li>
@@ -98,7 +106,7 @@ <h1>All modules for which code is available</h1>
<hr/>

<div role="contentinfo">
<p>&#169; Copyright 2023, Intel.</p>
<p>&#169; Copyright Intel.</p>
</div>

Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
157 changes: 78 additions & 79 deletions daal4py/_sources/algorithms.rst.txt

Large diffs are not rendered by default.

28 changes: 24 additions & 4 deletions daal4py/_sources/contents.rst.txt
@@ -1,10 +1,30 @@
.. _contents::
.. ******************************************************************************
.. * Copyright 2020 Intel Corporation
.. *
.. * Licensed under the Apache License, Version 2.0 (the "License");
.. * you may not use this file except in compliance with the License.
.. * You may obtain a copy of the License at
.. *
.. * http://www.apache.org/licenses/LICENSE-2.0
.. *
.. * Unless required by applicable law or agreed to in writing, software
.. * distributed under the License is distributed on an "AS IS" BASIS,
.. * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
.. * See the License for the specific language governing permissions and
.. * limitations under the License.
.. *******************************************************************************/

.. include:: note.rst
.. _contents:

########
Contents
########

.. include:: note.rst

.. toctree::
:maxdepth: 2
:caption: Contents
:caption: Contents:

About daal4py <index>
Data <data>
@@ -13,4 +33,4 @@
Distributed Mode <scaling>
Streaming Mode <streaming>
Examples <examples>
Scikit-Learn API <sklearn>
Scikit-Learn API <sklearn>
2 changes: 1 addition & 1 deletion daal4py/_sources/data.rst.txt
@@ -21,7 +21,7 @@ Input Data
##########

.. include:: note.rst

All array arguments to compute functions and to algorithm constructors can be
provided in different formats. daal4py will automatically do its best to work on
the provided data with minimal overhead, most notably without copying the data.
116 changes: 58 additions & 58 deletions daal4py/_sources/examples.rst.txt

Large diffs are not rendered by default.

12 changes: 6 additions & 6 deletions daal4py/_sources/index.rst.txt
@@ -21,7 +21,7 @@ Fast, Scalable and Easy Machine Learning With DAAL4PY
#####################################################

.. include:: note.rst

Daal4py makes your Machine Learning algorithms in Python lightning fast and easy to use. It provides
highly configurable Machine Learning kernels, some of which support streaming input data and/or can
be easily and efficiently scaled out to clusters of workstations. Internally it uses Intel(R)
@@ -102,7 +102,7 @@ Last but not least, daal4py allows :ref:`getting input data from streams <stream

oneAPI and GPU support in daal4py
---------------------------------
daal4py oneAPI and GPU support is deprecated. Use `scikit-learn-intelex <https://intel.github.io/scikit-learn-intelex/oneapi-gpu.html#>`_
daal4py oneAPI and GPU support is deprecated. Use `scikit-learn-intelex <https://intel.github.io/scikit-learn-intelex/latest/oneapi-gpu.html#>`_
instead.


@@ -146,11 +146,11 @@ daal4py is available at the `Python Package Index <https://pypi.org/project/daal
on Anaconda Cloud in `Conda Forge channel <https://anaconda.org/conda-forge/daal4py>`_
and in `Intel channel <https://anaconda.org/intel/daal4py>`_.
Sources and build instructions are available in
`daal4py repository <https://github.com/intel/scikit-learn-intelex/tree/master/daal4py>`_.
`daal4py repository <https://github.com/intel/scikit-learn-intelex/tree/main/daal4py>`_.

The daal4py package is available via the same distribution channels and platforms as scikit-learn-intelex.
See
`scikit-learn-intelex requirements <https://intel.github.io/scikit-learn-intelex/system-requirements.html>` _
`scikit-learn-intelex requirements <https://intel.github.io/scikit-learn-intelex/latest/system-requirements.html>`_

- Install from PyPI::

@@ -194,11 +194,11 @@ Scikit-Learn API and patching
-----------------------------
.. tip::
We recommend using
the 'scikit-learn-intelex package patching <https://intel.github.io/scikit-learn-intelex/what-is-patching.html>' _ for the scikit-learn patching. daal4py exposes some oneDAL solvers using a scikit-learn compatible API.
`scikit-learn-intelex package patching <https://intel.github.io/scikit-learn-intelex/latest/what-is-patching.html>`_ to patch scikit-learn.
daal4py exposes some oneDAL solvers using a scikit-learn compatible API.

daal4py can furthermore monkey-patch the ``sklearn`` package to use the DAAL
solvers as drop-in replacement without any code change.

Please refer to the section on :ref:`scikit-learn API and patching <sklearn>`
for more details.

77 changes: 61 additions & 16 deletions daal4py/_sources/model-builders.rst.txt
@@ -24,20 +24,26 @@ Model Builders for the Gradient Boosting Frameworks

Introduction
------------------
Gradient boosting on decision trees is one of the most accurate and efficient
machine learning algorithms for classification and regression.
The most popular implementations of it are:
Gradient boosting on decision trees is one of the most accurate and efficient
machine learning algorithms for classification and regression.
The most popular implementations of it are:

* XGBoost*
* LightGBM*
* CatBoost*

daal4py Model Builders deliver the accelerated
models inference of those frameworks. The inference is performed by the oneDAL GBT implementation tuned
for the best performance on the Intel(R) Architecture.
models inference of those frameworks. The inference is performed by the oneDAL GBT implementation tuned
for the best performance on the Intel(R) Architecture.

.. note::

Categorical data support in XGBoost* and LightGBM* is experimental and is currently not supported by Model Builders.
For the model conversion to work with daal4py, convert non-numeric data to numeric data
before training and converting the model.
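
One way to convert non-numeric data to numeric data is ordinal encoding, sketched below in plain Python. This is only an illustration, not part of daal4py: the helper names ``fit_ordinal_codes`` and ``encode`` and the sample column are hypothetical, and whether ordinal codes are an appropriate encoding depends on the data and the model.

```python
def fit_ordinal_codes(values):
    """Map each distinct category to an integer code, in first-seen order."""
    codes = {}
    for value in values:
        if value not in codes:
            codes[value] = len(codes)
    return codes


def encode(values, codes):
    """Replace each category with its integer code."""
    return [codes[value] for value in values]


# Hypothetical categorical column from a training set.
colors = ["red", "green", "red", "blue"]

color_codes = fit_ordinal_codes(colors)
encoded = encode(colors, color_codes)

print(color_codes)
print(encoded)
```

The same mapping has to be reused on the data passed to ``predict()``, so that a given category always maps to the number the model was trained on.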

Conversion
---------
----------
The first step is to convert an already trained model. The
API usage for different frameworks is the same:

@@ -61,37 +67,76 @@ CatBoost::
Classification and Regression Inference
----------------------------------------

The API is the same for classification and regression inference.
Based on the original model passed to the ``convert_model``, ``d4p_prediction`` is either the classification or regression output.
The API is the same for classification and regression inference.
Based on the original model passed to the ``convert_model()``, ``d4p_prediction`` is either the classification or regression output.

::

d4p_prediction = d4p_model.predict(test_data)

Here, the ``predict()`` method of ``d4p_model`` is being used to make predictions on the ``test_data`` dataset.
The ``d4p_prediction`` variable stores the predictions made by the ``predict()`` method.
The ``d4p_prediction`` variable stores the predictions made by the ``predict()`` method.

SHAP Value Calculation for Regression Models
------------------------------------------------------------

SHAP contribution and interaction value calculation are natively supported by models created with daal4py Model Builders.
For these models, the ``predict()`` method takes additional keyword arguments:

::

d4p_model.predict(test_data, pred_contribs=True) # for SHAP contributions
d4p_model.predict(test_data, pred_interactions=True) # for SHAP interactions

The returned prediction has the shape:

* ``(n_rows, n_features + 1)`` for SHAP contributions
* ``(n_rows, n_features + 1, n_features + 1)`` for SHAP interactions

Here, ``n_rows`` is the number of rows (i.e., observations) in
``test_data``, and ``n_features`` is the number of features in the dataset.

The prediction result for SHAP contributions includes a feature attribution value for each feature and a bias term for each observation.

The prediction result for SHAP interactions comprises ``(n_features + 1) x (n_features + 1)`` values for all possible
feature combinations, along with their corresponding bias terms.

.. note:: The shapes of SHAP contributions and interactions are consistent with the XGBoost results.
In contrast, the `SHAP Python package <https://shap.readthedocs.io/en/latest/>`_ drops bias terms, resulting
in SHAP contributions (SHAP interactions) with one fewer column (one fewer column and row) per observation.
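
The relationship between the two shapes can be sketched with plain nested lists. This is not daal4py code: ``contribs`` is a hand-written stand-in for the result of ``predict(test_data, pred_contribs=True)``, the helper name ``drop_bias`` is hypothetical, and the sketch assumes that the bias term occupies the last column, as it does in XGBoost output.

```python
# Hand-written stand-in for d4p_model.predict(test_data, pred_contribs=True):
# 2 observations, 3 features, plus one bias term per observation.
# Assumption: the bias term is the last column, as in XGBoost output.
contribs = [
    [0.5, -1.0, 0.25, 2.0],
    [-0.5, 0.75, 0.0, 2.0],
]


def drop_bias(rows):
    """Return the per-feature attributions without the trailing bias column."""
    return [row[:-1] for row in rows]


# Layout used by the SHAP Python package: one column fewer per observation.
shap_style = drop_bias(contribs)

n_rows = len(contribs)
n_features = len(shap_style[0])

# For a regression model, each row of contributions (bias included) sums to
# the raw model output for that observation.
raw_predictions = [sum(row) for row in contribs]

print(n_rows, n_features)
print(raw_predictions)
```

If the returned prediction is a NumPy array, the same bias removal would be the slice ``contribs[:, :-1]`` for contributions and ``interactions[:, :-1, :-1]`` for interactions, under the same assumption about where the bias terms sit.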

Scikit-learn-style Estimators
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You can also use the scikit-learn-style classes ``GBTDAALClassifier`` and ``GBTDAALRegressor`` to convert and infer your models. For example:

::
::

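import xgboost as xgb
from daal4py.sklearn.ensemble import GBTDAALRegressor

# X and y are the training features and targets, prepared beforehand.
reg = xgb.XGBRegressor()
reg.fit(X, y)
# Convert the trained XGBoost model and run inference with daal4py.
d4p_predt = GBTDAALRegressor.convert_model(reg).predict(X)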


Limitations
------------------
Model Builders currently support only basic inference: prediction and probability prediction.
This functionality is to be extended. For now, the following limitations apply:

- Categorical features are not supported for conversion and prediction.
- Multioutput models are not supported for conversion and prediction.
- SHAP values can be calculated for regression models only.


Examples
---------------------------------
Model Builders models conversion

- `XGBoost model conversion <https://github.com/intel/scikit-learn-intelex/blob/master/examples/daal4py/model_builders_xgboost.py>`_
- `LightGBM model conversion <https://github.com/intel/scikit-learn-intelex/blob/master/examples/daal4py/model_builders_lightgbm.py>`_
- `CatBoost model conversion <https://github.com/intel/scikit-learn-intelex/blob/master/examples/daal4py/model_builders_catboost.py>`_
- `XGBoost model conversion <https://github.com/intel/scikit-learn-intelex/blob/main/examples/daal4py/model_builders_xgboost.py>`_
- `SHAP value prediction from an XGBoost model <https://github.com/intel/scikit-learn-intelex/blob/main/examples/daal4py/model_builders_xgboost_shap.py>`_
- `LightGBM model conversion <https://github.com/intel/scikit-learn-intelex/blob/main/examples/daal4py/model_builders_lightgbm.py>`_
- `CatBoost model conversion <https://github.com/intel/scikit-learn-intelex/blob/main/examples/daal4py/model_builders_catboost.py>`_

Articles and Blog Posts
---------------------------------

- `Improving the Performance of XGBoost and LightGBM Inference <https://medium.com/intel-analytics-software/improving-the-performance-of-xgboost-and-lightgbm-inference-3b542c03447e>`_

4 changes: 0 additions & 4 deletions daal4py/_sources/note.rst.txt

This file was deleted.

18 changes: 9 additions & 9 deletions daal4py/_sources/scaling.rst.txt
@@ -76,36 +76,36 @@ The following algorithms support distribution:

- PCA (pca)

- `PCA <https://github.com/intel/scikit-learn-intelex/tree/master/examples/daal4py/pca_spmd.py>`_
- `PCA <https://github.com/intel/scikit-learn-intelex/tree/main/examples/daal4py/pca_spmd.py>`_

- SVD (svd)

- `SVD <https://github.com/intel/scikit-learn-intelex/tree/master/examples/daal4py/svd_spmd.py>`_
- `SVD <https://github.com/intel/scikit-learn-intelex/tree/main/examples/daal4py/svd_spmd.py>`_

- Linear Regression Training (linear_regression_training)

- `Linear Regression <https://github.com/intel/scikit-learn-intelex/tree/master/examples/daal4py/linear_regression_spmd.py>`_
- `Linear Regression <https://github.com/intel/scikit-learn-intelex/tree/main/examples/daal4py/linear_regression_spmd.py>`_

- Ridge Regression Training (ridge_regression_training)

- `Ridge Regression <https://github.com/intel/scikit-learn-intelex/tree/master/examples/daal4py/ridge_regression_spmd.py>`_
- `Ridge Regression <https://github.com/intel/scikit-learn-intelex/tree/main/examples/daal4py/ridge_regression_spmd.py>`_

- Multinomial Naive Bayes Training (multinomial_naive_bayes_training)

- `Naive Bayes <https://github.com/intel/scikit-learn-intelex/tree/master/examples/daal4py/naive_bayes_spmd.py>`_
- `Naive Bayes <https://github.com/intel/scikit-learn-intelex/tree/main/examples/daal4py/naive_bayes_spmd.py>`_

- K-Means (kmeans_init and kmeans)

- `K-Means <https://github.com/intel/scikit-learn-intelex/tree/master/examples/daal4py/kmeans_spmd.py>`_
- `K-Means <https://github.com/intel/scikit-learn-intelex/tree/main/examples/daal4py/kmeans_spmd.py>`_

- Correlation and Variance-Covariance Matrices (covariance)

- `Covariance <https://github.com/intel/scikit-learn-intelex/tree/master/examples/daal4py/covariance_spmd.py>`_
- `Covariance <https://github.com/intel/scikit-learn-intelex/tree/main/examples/daal4py/covariance_spmd.py>`_

- Moments of Low Order (low_order_moments)

- `Low Order Moments <https://github.com/intel/scikit-learn-intelex/tree/master/examples/daal4py/low_order_moms_spmd.py>`_
- `Low Order Moments <https://github.com/intel/scikit-learn-intelex/tree/main/examples/daal4py/low_order_moms_spmd.py>`_

- QR Decomposition (qr)

- `QR <https://github.com/intel/scikit-learn-intelex/tree/master/examples/daal4py/qr_spmd.py>`_
- `QR <https://github.com/intel/scikit-learn-intelex/tree/main/examples/daal4py/qr_spmd.py>`_
3 changes: 1 addition & 2 deletions daal4py/_sources/sklearn.rst.txt
@@ -20,7 +20,6 @@
Scikit-Learn API and patching
#############################

.. include:: note.rst

Python interface to efficient Intel(R) oneAPI Data Analytics Library provided by daal4py allows one
to create scikit-learn compatible estimators, transformers, clusterers, etc. powered by oneDAL which
@@ -160,7 +159,7 @@ algorithms:

Monkey-patched scikit-learn classes and functions pass scikit-learn's own test
suite, with a few exceptions, specified in `deselected_tests.yaml
<https://github.com/IntelPython/daal4py/blob/master/deselected_tests.yaml>`__.
<https://github.com/IntelPython/daal4py/blob/main/deselected_tests.yaml>`__.

In particular the tests execute `check_estimator
<https://scikit-learn.org/stable/modules/generated/sklearn.utils.estimator_checks.check_estimator.html>`__
16 changes: 8 additions & 8 deletions daal4py/_sources/streaming.rst.txt
@@ -48,36 +48,36 @@ daal4py's streaming mode is as easy as follows:
The streaming algorithms also accept arrays and DataFrames as input, e.g. the
data can come from a stream rather than from multiple files. Here is an example
which simulates a data stream using a generator which reads a file in chunks:
`SVD reading stream of data <https://github.com/intel/scikit-learn-intelex/tree/master/examples/daal4py/stream.py>`_
`SVD reading stream of data <https://github.com/intel/scikit-learn-intelex/tree/main/examples/daal4py/stream.py>`_

Supported Algorithms and Examples
---------------------------------
The following algorithms support streaming:

- SVD (svd)

- `SVD <https://github.com/intel/scikit-learn-intelex/tree/master/examples/daal4py/svd_streaming.py>`_
- `SVD <https://github.com/intel/scikit-learn-intelex/tree/main/examples/daal4py/svd_streaming.py>`_

- Linear Regression Training (linear_regression_training)

- `Linear Regression <https://github.com/intel/scikit-learn-intelex/tree/master/examples/daal4py/linear_regression_streaming.py>`_
- `Linear Regression <https://github.com/intel/scikit-learn-intelex/tree/main/examples/daal4py/linear_regression_streaming.py>`_

- Ridge Regression Training (ridge_regression_training)

- `Ridge Regression <https://github.com/intel/scikit-learn-intelex/tree/master/examples/daal4py/ridge_regression_streaming.py>`_
- `Ridge Regression <https://github.com/intel/scikit-learn-intelex/tree/main/examples/daal4py/ridge_regression_streaming.py>`_

- Multinomial Naive Bayes Training (multinomial_naive_bayes_training)

- `Naive Bayes <https://github.com/intel/scikit-learn-intelex/tree/master/examples/daal4py/naive_bayes_streaming.py>`_
- `Naive Bayes <https://github.com/intel/scikit-learn-intelex/tree/main/examples/daal4py/naive_bayes_streaming.py>`_

- Moments of Low Order

- `Low Order Moments <https://github.com/intel/scikit-learn-intelex/tree/master/examples/daal4py/low_order_moms_streaming.py>`_
- `Low Order Moments <https://github.com/intel/scikit-learn-intelex/tree/main/examples/daal4py/low_order_moms_streaming.py>`_

- Covariance

- `Covariance <https://github.com/intel/scikit-learn-intelex/tree/master/examples/daal4py/covariance_streaming.py>`_
- `Covariance <https://github.com/intel/scikit-learn-intelex/tree/main/examples/daal4py/covariance_streaming.py>`_

- QR

- `QR <https://github.com/intel/scikit-learn-intelex/tree/master/examples/daal4py/qr_streaming.py>`_
- `QR <https://github.com/intel/scikit-learn-intelex/tree/main/examples/daal4py/qr_streaming.py>`_