
Releases: uxlfoundation/oneDAL

Intel® oneAPI Data Analytics Library 2021.4

14 Oct 09:56
6d8ea7e

The release introduces the following changes:

📚 Support Materials

The following additional materials were created:

🛠️ Library Engineering

  • Introduced new functionality for Intel® Extension for Scikit-learn*:
    • Enabled patching for all Scikit-learn applications at once.
    • Added support for Python 3.9 in both Intel® Extension for Scikit-learn and daal4py. The packages are available from PyPI and the Intel Channel on Anaconda Cloud.
  • Introduced new oneDAL functionality:
    • Added pkg-config support for Linux, macOS, Windows and for static/dynamic, thread/sequential configurations of oneDAL applications.
    • Reduced the size of the oneDAL library by approximately 30%.

🚨 What's New

Introduced new oneDAL functionality:

  • General:
    • Basic statistics (Low order moments) algorithm in oneDAL interfaces
    • Result options for kNN Brute-force in oneDAL interfaces: using a single function call to return any combination of responses, indices, and distances
  • CPU:
    • Sigmoid kernel of SVM algorithm
    • Model converter from CatBoost to oneDAL representation
    • Louvain Community Detection algorithm technical preview
    • Connected Components algorithm technical preview
    • Search task and cosine distance for kNN Brute-force
  • GPU:
    • The full range support of Minkowski distances in kNN Brute-force
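
The result options for kNN Brute-force and the Minkowski distance family listed above can be illustrated with a small pure-Python sketch. This is not the oneDAL API: the function names `minkowski` and `knn_brute_force` and the `result_options` tuple are hypothetical and only show the idea of one call returning any combination of responses, indices, and distances.

```python
from collections import Counter


def minkowski(a, b, p):
    """Minkowski distance of order p; p = float('inf') gives the Chebyshev distance."""
    diffs = [abs(x - y) for x, y in zip(a, b)]
    if p == float("inf"):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)


def knn_brute_force(train, labels, query, k, p=2.0, result_options=("responses",)):
    """Hypothetical helper: return only the outputs named in result_options."""
    # Brute force: measure the distance from the query to every training point,
    # then keep the k closest ones.
    ranked = sorted((minkowski(row, query, p), i) for i, row in enumerate(train))[:k]

    result = {}
    if "indices" in result_options:
        result["indices"] = [i for _, i in ranked]
    if "distances" in result_options:
        result["distances"] = [d for d, _ in ranked]
    if "responses" in result_options:
        # Uniform majority vote among the k nearest neighbors.
        votes = Counter(labels[i] for _, i in ranked)
        result["responses"] = votes.most_common(1)[0][0]
    return result


train = [(0, 0), (1, 0), (0, 3), (5, 5)]
labels = ["a", "a", "b", "b"]
print(knn_brute_force(train, labels, (0, 0), k=2, p=1.0,
                      result_options=("indices", "distances")))
```

Setting `p=1.0` gives the Manhattan distance, `p=2.0` the Euclidean distance, and `p=float("inf")` the Chebyshev distance, so a single parameter covers the whole family.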

Improved oneDAL performance for the following algorithms:

  • CPU:
    • Decision Forest training and prediction
    • Brute-force kNN
    • KMeans
    • NuSVMs and SVR training

Introduced new functionality in Intel® Extension for Scikit-learn:

  • General:
    • Enabled the global patching of all Scikit-learn applications
    • Provided an integration with dpctl for heterogeneous computing (the support of dpctl.tensor.usm_ndarray for input and output)
    • Extended API with set_config and get_config methods. Added the support of target_offload and allow_fallback_to_host options for device offloading scenarios
    • Added the support of predict_proba in RandomForestClassifier estimator
  • CPU:
    • Added the support of Sigmoid kernel in SVM algorithms
  • GPU:
    • Added binary SVC support with Linear and RBF kernels

Improved the performance of the following scikit-learn estimators via scikit-learn patching:

  • SVR algorithm training
  • NuSVC and NuSVR algorithms training
  • RandomForestRegressor and RandomForestClassifier algorithms training and prediction
  • KMeans

🐛 Bug Fixes

  • General:
    • Fixed an incorrectly raised exception during the patching of Random Forest algorithm when the number of trees was more than 7000.
  • CPU:
    • Fixed an accuracy issue in Random Forest algorithm caused by the exclusion of constant features.
    • Fixed an issue in NuSVC Multiclass.
    • Fixed an issue with KMeans convergence inconsistency.
    • Fixed incorrect behavior of train_test_split with specific subset sizes.
  • GPU:
    • Fixed incorrect bias calculation in SVM.

❗ Known Issues

  • GPU:
    • For most algorithms, performance degradations were observed when the 2021.4 version of Intel® oneAPI DPC++ Compiler was used.
    • Examples fail when run from Visual Studio solutions on hardware that does not support double-precision floating-point operations.

Intel® oneAPI Data Analytics Library 2021.3

02 Jul 15:36
1f5dcc0

The release introduces the following changes:

📚 Support Materials

The following additional materials were created:

🛠️ Library Engineering

  • Introduced a new Python package, Intel® Extension for Scikit-learn*. The scikit-learn-intelex package contains the scikit-learn patching functionality that was originally available in the daal4py package. All future updates for the patches will be available only in Intel® Extension for Scikit-learn. We recommend using the scikit-learn-intelex package instead of daal4py.
    • Download the extension using one of the following commands:
      • pip install scikit-learn-intelex
      • conda install scikit-learn-intelex -c conda-forge
    • Enable Scikit-learn patching:
      • from sklearnex import patch_sklearn
      • patch_sklearn()
  • Made the DPC++ runtime an optional dependency of daal4py. To enable the DPC++ backend, install the dpcpp_cpp_rt package. This change reduces the default package size with all dependencies from 1.2 GB to 400 MB.
  • Added the support of building oneDAL-based applications with /MD and /MDd options on Windows. The -d suffix is used in the names of oneDAL libraries that are built with debug run-time (/MDd).
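
The patching call listed above affects scikit-learn imports that follow it, so the order of the two steps matters. A minimal sketch, assuming scikit-learn and optionally scikit-learn-intelex are installed; the helper name `load_svc` is hypothetical, and both imports are guarded so that the snippet degrades gracefully when a package is missing.

```python
def load_svc(use_extension=True):
    """Hypothetical helper: return (SVC class or None, whether patching was applied)."""
    patched = False
    if use_extension:
        try:
            from sklearnex import patch_sklearn
            patch_sklearn()  # must run before the estimator is imported
            patched = True
        except ImportError:
            # Extension not installed: fall back to stock scikit-learn.
            pass
    try:
        from sklearn.svm import SVC  # imported only after patching
    except ImportError:
        return None, patched
    return SVC, patched


svc_class, patched = load_svc()
print("patched:", patched)
```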

🚨 What's New

Introduced new oneDAL and daal4py functionality:

  • CPU:
    • SVM Regression algorithm
    • NuSVM algorithm for both Classification and Regression tasks
    • Polynomial kernel support for all SVM algorithms (SVC, SVR, NuSVC, NuSVR)
    • Minkowski and Chebyshev distances for kNN Brute-force
    • The brute-force method and the voting mode support for kNN algorithm in oneDAL interfaces
    • Multiclass support for SVM algorithms in oneDAL interfaces
    • CSR-matrix support for SVM algorithms in oneDAL interfaces
    • Subgraph Isomorphism algorithm technical preview
    • Single Source Shortest Path (SSSP) algorithm technical preview

Improved oneDAL and daal4py performance for the following algorithms:

  • CPU:
    • Support Vector Machines training and prediction
    • Linear, Ridge, ElasticNet, and LASSO regressions prediction
  • GPU:
    • Decision Forest training and prediction
    • Principal Components Analysis training

Introduced the support of scikit-learn 1.0 version in Intel Extension for Scikit-learn.

  • The 2021.3 release of Intel Extension for Scikit-learn supports the latest scikit-learn releases: 0.22.X, 0.23.X, 0.24.X and 1.0.X.

Introduced new functionality for Intel Extension for Scikit-learn:

  • General:
    • The support of patch_sklearn for all algorithms
  • CPU:
    • Acceleration of SVR estimator
    • Acceleration of NuSVC and NuSVR estimators
    • Polynomial kernel support in SVM algorithms

Improved the performance of the following scikit-learn estimators via scikit-learn patching:

  • SVM algorithms training and prediction
  • Linear, Ridge, ElasticNet, and Lasso regressions prediction

Fixed the following issues:

  • General:
    • Fixed binary incompatibility for the versions of numpy earlier than 1.19.4
    • Fixed an issue with a very large number of trees (> 7000) for Random Forest algorithm.
    • Fixed patch_sklearn to patch both fit and predict methods of Logistic Regression when the algorithm is given as a single parameter to patch_sklearn
  • CPU:
    • Improved numerical stability of training for Alternating Least Squares (ALS) and Linear and Ridge regressions with Normal Equations method
    • Reduced the memory consumption of SVM prediction
  • GPU:
    • Fixed an issue with kernel compilation on the platforms without hardware FP64 support

❗ Known Issues

  • Intel® Extension for Scikit-learn and daal4py packages installed from the PyPI repository can’t be found on Debian systems (including Google Colab). Mitigation: add the “site-packages” folder to the Python package search path before importing the packages:
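# Add the "site-packages" folder that sits next to the first reported
# package directory to the module search path.
import os
import site
import sys

sys.path.append(os.path.join(os.path.dirname(site.getsitepackages()[0]), "site-packages"))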

Intel® oneAPI Data Analytics Library 2021.2

31 Mar 22:05
481f859

The release introduces the following changes:

Library Engineering:

  • Enabled new PyPI distribution channel for daal4py:
    • The four latest Python versions (3.6, 3.7, 3.8, 3.9) are supported on Linux, Windows, and macOS.
    • Support of both CPU and GPU is included in the package.
    • You can download daal4py using the following command: pip install daal4py
  • Introduced CMake support for oneDAL examples

Support Materials

The following additional materials were created:

What's New

Introduced new oneDAL and daal4py functionality:

  • CPU:
    • Hist method for Decision Forest Classification and Regression, which outperforms the existing exact method
    • Bit-to-bit results reproducibility for: Linear and Ridge regressions, LASSO and ElasticNet, KMeans training and initialization, PCA, SVM, kNN Brute Force method, Decision Forest Classification and Regression
  • GPU:
    • Multi-node multi-GPU algorithms: KMeans (batch), Covariance (batch and online), Low order moments (batch and online) and PCA
    • Sparsity support for SVM algorithm

Improved oneDAL and daal4py performance for the following algorithms:

  • CPU:
    • Decision Forest training Classification and Regression
    • Support Vector Machines training and prediction
    • Logistic Regression, Logistic Loss and Cross Entropy for non-homogeneous input types
  • GPU:
    • Decision Forest training Classification and Regression
    • All algorithms with GPU kernels (as a result of migration to Unified Shared Memory data management)
    • Reduced performance overhead for oneAPI C++ interfaces on CPU and oneAPI DPC++ interfaces on GPU

Added technical preview features in Graph Analytics:

  • CPU:
    • Local and Global Triangle Counting

Introduced new functionality for scikit-learn patching through daal4py:

  • CPU:
    • Patches for four latest scikit-learn releases: 0.21.X, 0.22.X, 0.23.X and 0.24.X
    • Acceleration of roc_auc_score function
    • Bit-to-bit results reproducibility for: LinearRegression, Ridge, SVC, KMeans, PCA, Lasso, ElasticNet, tSNE, KNeighborsClassifier, KNeighborsRegressor, NearestNeighbors, RandomForestClassifier, RandomForestRegressor

Improved performance of the following scikit-learn estimators via scikit-learn patching:

  • CPU
    • RandomForestClassifier and RandomForestRegressor scikit-learn estimators: training and prediction
    • Principal Component Analysis (PCA) scikit-learn estimator: training
    • Support Vector Classification (SVC) scikit-learn estimators: training and prediction
    • Support Vector Classification (SVC) scikit-learn estimator with the probability=True parameter: training and prediction

Fixed the following issues:

  • Scikit-learn patching:

    • Improved accuracy of RandomForestClassifier and RandomForestRegressor scikit-learn estimators
    • Fixed patching issues with pairwise_distances
    • Fixed the behavior of the patch_sklearn and unpatch_sklearn functions
    • Fixed unexpected behavior that made accelerated functionality unavailable through scikit-learn patching if the input was not of float32 or float64 data types. Scikit-learn patching now works with all numpy data types.
    • Fixed a memory leak that appeared when DataFrame from pandas was used as an input type
    • Fixed performance issue for interoperability with Modin
  • daal4py:

    • Fixed the crash of SVM and kNN algorithms on Windows on GPU
  • oneDAL:

    • Improved accuracy of Decision Forest Classification and Regression on CPU
    • Improved accuracy of KMeans algorithm on GPU
    • Improved stability of Linear Regression and Logistic Regression algorithms on GPU

Known Issues

  • The oneDAL vars.sh script does not support KornShell

Intel® oneAPI Data Analytics Library 2021.1

14 Dec 12:01
e15be9b

The release contains all functionality of Intel® DAAL. See Intel® DAAL release notes for more details.

What's New

Library Engineering:

  • Renamed the library from Intel® Data Analytics Acceleration Library to Intel® oneAPI Data Analytics Library and changed the package names to reflect this.
  • Deprecated 32-bit version of the library.
  • Introduced Intel GPU support for both OpenCL and Level Zero backends.
  • Introduced Unified Shared Memory (USM) support

Introduced new Intel® oneDAL and daal4py functionality:

  • GPU:
    • Batch algorithms: K-means, Covariance, PCA, Logistic Regression, Linear Regression, Random Forest Classification and Regression, Gradient Boosting Classification and Regression, kNN, SVM, DBSCAN and Low-order moments
    • Online algorithms: Covariance, PCA, Linear Regression and Low-order moments
    • Added Data Management functionality to support DPC++ APIs: a new table type for representation of SYCL-based numeric tables (SyclNumericTable) and an optimized CSV data source

Improved Intel® oneDAL and daal4py performance for the following algorithms:

  • CPU:
    • Logistic Regression training and prediction
    • k-Nearest Neighbors prediction with Brute Force method
    • Logistic Loss and Cross Entropy objective functions

Added Technical Preview Features in Graph Analytics:

  • CPU:
    • Undirected graph without edge and vertex weights (undirected_adjacency_array_graph), where vertex indices can only be of type int32
    • Jaccard Similarity Coefficients for all pairs of vertices, a batch algorithm that processes the graph by blocks

Aligned the library with Intel® oneDAL Specification 1.0 for the following algorithms:

  • CPU/GPU:
    • K-means, PCA, kNN

Introduced new functionality for scikit-learn patching through daal4py:

  • CPU:
    • Acceleration of NearestNeighbors and KNeighborsRegressor scikit-learn estimators with Brute Force and K-D tree methods
    • Acceleration of TSNE scikit-learn estimator
  • GPU:
    • Intel GPU support in scikit-learn for DBSCAN, K-means, Linear and Logistic Regression

Improved performance of the following scikit-learn estimators via scikit-learn patching:

  • CPU:
    • LogisticRegression fit, predict and predict_proba methods
    • KNeighborsClassifier predict, predict_proba and kneighbors methods with “brute” method

Known Issues

  • Intel® oneDAL DPC++ APIs do not work on GEN12 graphics with the OpenCL backend. Use the Level Zero backend in such cases.
  • train_test_split in daal4py patches for Scikit-learn can produce incorrect shuffling on Windows*

Intel® DAAL 2020 Update 3

03 Nov 20:49
d148c71

What's New in Intel® DAAL 2020 Update 3:

Introduced new Intel® DAAL and daal4py functionality:

  • Brute Force method for k-Nearest Neighbors classification algorithm, which for datasets with more than 13 features demonstrates a better performance than the existing K-D tree method
  • k-Nearest Neighbors search for K-D tree and Brute Force methods with computation of distances to nearest neighbors and their indices

Extended existing Intel® DAAL and daal4py functionality:

  • Voting methods for prediction in k-Nearest Neighbors classification and search: based on inverse-distance and uniform weighting
  • New parameters in Decision Forest classification and regression: minObservationsInSplitNode, minWeightFractionInLeafNode, minImpurityDecreaseInSplitNode, maxLeafNodes with best-first strategy and sample weights
  • Support of Support Vector Machine (SVM) decision function for Multi-class Classifier
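
The two voting methods named above, uniform and inverse-distance weighting, differ only in how much each neighbor contributes to the class score. A minimal pure-Python sketch of that difference follows; the function name `vote` and the `weights` values are hypothetical and are not the Intel® DAAL or daal4py parameter names. Treating a zero distance as an exact match is a choice made for this sketch.

```python
from collections import defaultdict


def vote(neighbor_labels, neighbor_distances, weights="uniform"):
    """Hypothetical helper: pick a class from the k nearest neighbors."""
    scores = defaultdict(float)
    for label, dist in zip(neighbor_labels, neighbor_distances):
        if weights == "uniform":
            # Every neighbor counts equally.
            scores[label] += 1.0
        elif weights == "distance":
            if dist == 0.0:
                # Exact match: avoid division by zero and take its label.
                return label
            # Closer neighbors contribute more.
            scores[label] += 1.0 / dist
        else:
            raise ValueError("weights must be 'uniform' or 'distance'")
    return max(scores, key=scores.get)


labels = ["a", "b", "b"]
distances = [0.1, 1.0, 2.0]
print(vote(labels, distances, "uniform"), vote(labels, distances, "distance"))
```

With these inputs the uniform vote follows the two more distant "b" neighbors, while inverse-distance weighting favors the single very close "a" neighbor.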

Improved Intel® DAAL and daal4py performance for the following algorithms:

  • SVM training and prediction
  • Decision Forest classification training
  • RBF and Linear kernel functions

Introduced new daal4py functionality:

  • Conversion of trained XGBoost* and LightGBM* models into a daal4py Gradient Boosted Trees model for fast prediction
  • Support of Modin* DataFrame as an input

Introduced new functionality for scikit-learn patching through daal4py:

  • Acceleration of KNeighborsClassifier scikit-learn estimator with Brute Force and K-D tree methods
  • Acceleration of RandomForestClassifier and RandomForestRegressor scikit-learn estimators
  • Sparse input support for KMeans and Support Vector Classification (SVC) scikit-learn estimators
  • Prediction of probabilities for SVC scikit-learn estimator
  • Support of ‘normalize’ parameter for Lasso and ElasticNet scikit-learn estimators

Improved performance of the following functionality for scikit-learn patching through daal4py:

  • train_test_split()
  • Support Vector Classification (SVC) fit and prediction

Dependencies

14 Nov 10:54
970c25b
fix one-algorithm build and specific prediction case after probabilit…

DAAL 2020

25 Sep 14:27
31f6c5a
DAAL 2020 Pre-release

Updated bzip2 from version 1.0.4 to 1.0.8 in mkl-fpk (mklfpk_lnx_20180112_10 / mklfpk_mac_20180112_10 / mklfpk_win_20180112_10)