Releases: bsc-wdc/dislib
v0.4.3
v0.4.0
Dependencies
- PyCOMPSs == 2.5
- Scikit-learn >= 0.19.2
- NumPy >= 1.15.4
- Scipy >= 1.0.0
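The pins in the dependency list above can be checked against an environment before upgrading. The following is a minimal sketch, assuming plain dotted numeric version strings; the helper names `parse_version`, `satisfies`, and `unmet` are hypothetical and are not part of dislib. Version strings with pre-release suffixes (such as release candidates) would need a full PEP 440 parser instead.

```python
import operator

# Requirements copied from the v0.4.0 dependency list above.
REQUIREMENTS = {
    "PyCOMPSs": "== 2.5",
    "Scikit-learn": ">= 0.19.2",
    "NumPy": ">= 1.15.4",
    "Scipy": ">= 1.0.0",
}

_OPERATORS = {"==": operator.eq, ">=": operator.ge}


def parse_version(text):
    """Turn a dotted numeric version such as '1.15.4' into a list of ints."""
    return [int(part) for part in text.strip().split(".")]


def satisfies(installed, requirement):
    """Return True if `installed` meets a requirement such as '>= 0.19.2'."""
    op_symbol, required = requirement.split()
    have, want = parse_version(installed), parse_version(required)
    # Pad the shorter version with zeros so '2.5' compares equal to '2.5.0'.
    width = max(len(have), len(want))
    have += [0] * (width - len(have))
    want += [0] * (width - len(want))
    return _OPERATORS[op_symbol](have, want)


def unmet(installed_versions):
    """List the packages whose installed version violates REQUIREMENTS."""
    return [
        name
        for name, requirement in REQUIREMENTS.items()
        if not satisfies(installed_versions[name], requirement)
    ]
```

Calling `unmet` with a mapping of package names to installed version strings returns the names that would need upgrading (or, for the `==` pin, changing) before this release can be used.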
Breaking Changes
- Most estimator methods, such as fit and predict, now expect one or two ds-arrays instead of a Dataset.
New Features
- This release introduces the distributed array as the main data structure in dislib. All estimators have been modified to accept ds-arrays instead of Datasets. The Dataset and Subset classes have been removed.
Bug Fixes
- Minor bug fixes in RandomForestClassifier and K-means.
Improvements
- The performance of various algorithms has been improved by using PyCOMPSs COLLECTIONS.
- K-means now accepts an 'init' parameter.
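To illustrate the general idea behind a distributed array, the sketch below splits a two-dimensional list into a grid of rectangular blocks and combines per-block partial results. It is a pure-Python illustration that assumes the array is partitioned by a fixed block size; `split_into_blocks`, `column_sums`, and `block_size` are hypothetical names chosen for this example, and nothing here reproduces the actual ds-array API or uses PyCOMPSs.

```python
def split_into_blocks(rows, block_size):
    """Split a 2-D list into a grid of blocks of at most block_size (rows, cols)."""
    block_rows, block_cols = block_size
    n_cols = len(rows[0]) if rows else 0
    grid = []
    for top in range(0, len(rows), block_rows):
        band = rows[top:top + block_rows]
        grid.append([
            [row[left:left + block_cols] for row in band]
            for left in range(0, n_cols, block_cols)
        ])
    return grid


def column_sums(grid):
    """Sum each column by combining per-block partial results."""
    n_cols = sum(len(block[0]) for block in grid[0])
    totals = [0] * n_cols
    for band in grid:
        offset = 0
        for block in band:
            # Each block's partial sum is independent of the others, so in a
            # distributed setting it could be computed by its own task.
            partial = [sum(col) for col in zip(*block)]
            for i, value in enumerate(partial):
                totals[offset + i] += value
            offset += len(partial)
    return totals


data = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
grid = split_into_blocks(data, block_size=(2, 2))
print(column_sums(grid))
```

The design point is that every block is processed on its own and only the small partial results are merged, which is what makes a blocked layout suitable for task-based parallelism.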
v0.3.0
Dependencies
- PyCOMPSs == 2.5
- Scikit-learn >= 0.19.2
- NumPy >= 1.15.4
- Scipy >= 1.0.0
New Features
- GaussianMixture now supports covariance types 'tied', 'diag', and 'spherical' apart from 'full'.
- dislib now provides PCA and LinearRegression models.
Bug Fixes
- Fixed DBSCAN so that it detects clusters with fewer than min_samples samples, as well as clusters that lie in the intersection of two regions.
Improvements
- The GaussianMixture documentation has been improved.
- Extra tests for GaussianMixture, C-SVM and DBSCAN have been added.
- The performance of K-means, DBSCAN and GaussianMixture has been significantly improved.
- The performance of utils.shuffle has been improved by using PyCOMPSs collections.
- The performance of Dataset has been improved by removing the tracking of duplicates.
v0.2.1
Dependencies
- PyCOMPSs >= 2.4-rc1902
- Scikit-learn >= 0.19.1
- NumPy >= 1.15.4
- Scipy >= 1.0.0
Bug Fixes
- DBSCAN now detects clusters with fewer than min_samples samples in certain situations.
Improvements
- The performance of DBSCAN has been improved.
v0.2.0
Dependencies
- PyCOMPSs == 2.4-rc1902
- Scikit-learn >= 0.19.1
- NumPy >= 1.15.4
- Scipy >= 1.0.0
Breaking Changes
- The predict and fit_predict methods in K-means, DBSCAN and C-SVM now take a Dataset as their argument and do not return anything.
New Features
- The following new algorithms have been implemented:
  - Gaussian mixtures
  - Nearest neighbors
  - Alternating least squares
  - Standard scaler
- Added the following utility methods:
  - resample
  - shuffle
  - as_grid
Bug Fixes
- Numerous bug fixes in DBSCAN.
- Fixed the reproducibility of results in C-SVM and random forests.
- Several other minor bug fixes.
Improvements
- Completely unified the interface of the different algorithms
- Improved the documentation
- Added a way to easily access Dataset samples and labels
- Implemented Dataset's transpose
- Implemented Dataset's apply function
0.1.1
Initial Release
Merge pull request #45 from bsc-wdc/kmeans-fix Kmeans fix