cmccomb/smartcore_vs_linfa


About

linfa and smartcore have emerged as two leading scikit-learn-analogous machine learning frameworks for Rust. Both provide a number of algorithms that form the backbone of machine learning analysis. This repository compares the training times of algorithms in the two frameworks. The algorithms included are:

| Algorithm | Smartcore v2.0.0 | Linfa v5.0.0 | Benchmarked here? |
|---|---|---|---|
| Linear Regression | ✓ | ✓ | ✓ |
| Ridge Regression | ✓ | | |
| LASSO Regression | ✓ | | |
| Decision Tree Regression | ✓ | | |
| Random Forest Regression | ✓ | | |
| Support Vector Regression | ✓ | ✓ | ✓ |
| KNN Regression | ✓ | | |
| Elastic Net Regression | ✓ | ✓ | ✓ |
| Partial Least Squares | | ✓ | |
| Logistic Regression | ✓ | ✓ | ✓ |
| Decision Tree Classification | ✓ | ✓ | ✓ |
| Random Forest Classification | ✓ | | |
| Support Vector Classification | ✓ | ✓ | ✓ |
| KNN Classification | ✓ | | |
| Gaussian Naive Bayes | ✓ | ✓ | ✓ |
| K-Means | ✓ | ✓ | ✓ |
| DBSCAN | ✓ | ✓ | ✓ |
| Hierarchical Clustering | | ✓ | |
| Approximated DBSCAN | | ✓ | |
| Gaussian Mixture Model | | ✓ | |
| PCA | ✓ | ✓ | ✓ |
| ICA | | ✓ | |
| SVD | ✓ | | |
| t-SNE | | ✓ | |
| Diffusion Mapping | | ✓ | |

The full report is available here, but summary violin plots are provided below.
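
A violin plot summarizes a distribution of repeated timing samples. As a rough illustration of how such samples could be gathered, here is a minimal sketch using only the Rust standard library. The names `sample_training_times` and `median` are hypothetical, the `train` closure stands in for a call such as a model's `fit`, and this is not the harness used to produce the report.

```rust
use std::time::{Duration, Instant};

/// Runs `train` `samples` times and returns the wall-clock duration of each run.
/// `train` is a hypothetical stand-in for fitting a model on a fixed dataset.
fn sample_training_times<F: FnMut()>(mut train: F, samples: usize) -> Vec<Duration> {
    (0..samples)
        .map(|_| {
            let start = Instant::now();
            train();
            start.elapsed()
        })
        .collect()
}

/// Median of the collected samples (the upper median for even counts).
/// Returns `None` when no samples were collected.
fn median(mut times: Vec<Duration>) -> Option<Duration> {
    if times.is_empty() {
        return None;
    }
    times.sort();
    Some(times[times.len() / 2])
}
```

Calling `sample_training_times` once per library with the same dataset would give two sample sets whose spread, and not only whose median, can be compared. A dedicated benchmarking harness additionally handles warm-up and outliers, which this sketch does not.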

Considerations Besides Execution Time

In the course of creating this benchmark study, a few additional differences between the libraries emerged.

Documentation

The documentation for smartcore is somewhat more consistent across algorithms, possibly because the library is maintained in a single crate.

Dependencies

While linfa requires a BLAS/LAPACK backend (either openblas, netlib, or intel-mkl), smartcore does not. The backend lets linfa take advantage of some additional optimization, but it limits portability.

Results

Regression

No customization was needed to equate the algorithms.

Classification

The smartcore implementation exposes no tunable parameters, so the linfa settings were changed to match the smartcore defaults:

  • Gradient tolerance set to 1e-8
  • Maximum number of iterations set to 1000
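
To make the role of the two settings above concrete, the following is a minimal standard-library sketch of an iterative classifier fit that stops on either criterion. It assumes the classifier in question is logistic regression, which the text does not state explicitly, and it uses plain batch gradient descent as a stand-in; neither library's actual optimizer is shown. The names `FitResult`, `fit_logistic`, and `learning_rate` are hypothetical.

```rust
/// Outcome of the optimization loop.
#[derive(Debug)]
struct FitResult {
    weights: Vec<f64>,
    iterations: usize,
    converged: bool,
}

fn sigmoid(z: f64) -> f64 {
    1.0 / (1.0 + (-z).exp())
}

/// Batch gradient descent for logistic regression (no intercept, no regularization).
/// Stops when the gradient norm drops below `gradient_tolerance`
/// or after `max_iterations` updates, whichever comes first.
fn fit_logistic(
    x: &[Vec<f64>],
    y: &[f64],
    learning_rate: f64,
    gradient_tolerance: f64,
    max_iterations: usize,
) -> FitResult {
    let n_features = x.first().map_or(0, |row| row.len());
    let mut weights = vec![0.0; n_features];
    for iteration in 0..max_iterations {
        // Gradient of the mean negative log-likelihood.
        let mut gradient = vec![0.0; n_features];
        for (row, &target) in x.iter().zip(y) {
            let z: f64 = row.iter().zip(&weights).map(|(a, w)| a * w).sum();
            let error = sigmoid(z) - target;
            for (g, a) in gradient.iter_mut().zip(row) {
                *g += error * a / x.len() as f64;
            }
        }
        let norm = gradient.iter().map(|g| g * g).sum::<f64>().sqrt();
        if norm < gradient_tolerance {
            return FitResult { weights, iterations: iteration, converged: true };
        }
        for (w, g) in weights.iter_mut().zip(&gradient) {
            *w -= learning_rate * g;
        }
    }
    FitResult { weights, iterations: max_iterations, converged: false }
}
```

With a tolerance as tight as 1e-8, a fit on hard or separable data typically runs until the iteration cap, so both values affect the measured training time. That is why both need to match across libraries before timings are compared.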

Clustering

Since the two implementations use different convergence criteria, the maximum number of iterations was set to the same low value for both, and the linfa algorithm was limited to a single run:

  • Max iterations set to 10
  • Number of runs set to 1
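
The effect of capping iterations and using a single run can be sketched with a small standard-library example. It assumes the clustering algorithm being equated is K-Means, which the text does not name, and implements one run of Lloyd's algorithm on 1-D data from caller-supplied starting centroids. The function `kmeans_single_run` is hypothetical and does not reflect either library's implementation.

```rust
/// Runs Lloyd's algorithm once on 1-D data, starting from `centroids`.
/// Stops after `max_iterations` updates or as soon as the centroids stop changing.
/// Returns the final centroids and the number of iterations performed.
fn kmeans_single_run(
    points: &[f64],
    mut centroids: Vec<f64>,
    max_iterations: usize,
) -> (Vec<f64>, usize) {
    let mut iterations = 0;
    if centroids.is_empty() {
        return (centroids, iterations);
    }
    while iterations < max_iterations {
        iterations += 1;
        let mut sums = vec![0.0; centroids.len()];
        let mut counts = vec![0usize; centroids.len()];
        for &p in points {
            // Assign the point to its nearest centroid.
            let nearest = centroids
                .iter()
                .enumerate()
                .min_by(|a, b| (a.1 - p).abs().total_cmp(&(b.1 - p).abs()))
                .map(|(index, _)| index)
                .expect("centroids is non-empty");
            sums[nearest] += p;
            counts[nearest] += 1;
        }
        // Move each centroid to the mean of its points; keep empty clusters in place.
        let updated: Vec<f64> = centroids
            .iter()
            .zip(sums.iter().zip(&counts))
            .map(|(&old, (&sum, &count))| if count == 0 { old } else { sum / count as f64 })
            .collect();
        if updated == centroids {
            break;
        }
        centroids = updated;
    }
    (centroids, iterations)
}
```

When the cap is lower than the number of iterations either implementation would otherwise need, both perform the same number of update passes regardless of their own stopping rules, so the timings measure per-iteration cost instead of differences in convergence criteria.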

Dimensionality Reduction
