GitHub - cmccomb/smartcore_vs_linfa: Benchmarking the top ML crates available for Rust 🦀

smartcore vs linfa

About

linfa and smartcore have emerged as two leading scikit-learn-analogous machine learning frameworks for Rust. Both provide access to a number of algorithms that form the backbone of machine learning analysis. This repository provides a comparison between the training time of algorithms in these two machine learning frameworks. The algorithms included are:

| Algorithm | Smartcore v0.2.0 | Linfa v0.5.0 | Benchmarked here? |
| --- | --- | --- | --- |
| Linear Regression | ✓ | ✓ | ✓ |
| Ridge Regression | ✓ | | |
| LASSO Regression | ✓ | | |
| Decision Tree Regression | ✓ | | |
| Random Forest Regression | ✓ | | |
| Support Vector Regression | ✓ | ✓ | ✓ |
| KNN Regression | ✓ | | |
| Elastic Net Regression | ✓ | ✓ | ✓ |
| Partial Least Squares | | ✓ | |
| Logistic Regression | ✓ | ✓ | ✓ |
| Decision Tree Classification | ✓ | ✓ | ✓ |
| Random Forest Classification | ✓ | | |
| Support Vector Classification | ✓ | ✓ | ✓ |
| KNN Classification | ✓ | | |
| Gaussian Naive Bayes | ✓ | ✓ | ✓ |
| K-Means | ✓ | ✓ | ✓ |
| DBSCAN | ✓ | ✓ | ✓ |
| Hierarchical Clustering | | ✓ | |
| Approximated DBSCAN | | ✓ | |
| Gaussian Mixture Model | | ✓ | |
| PCA | ✓ | ✓ | ✓ |
| ICA | | ✓ | |
| SVD | ✓ | | |
| t-SNE | | ✓ | |
| Diffusion Mapping | | ✓ | |

The full report is available here, but summary violin plots are provided below.
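To make the measured quantity concrete, here is a minimal sketch of collecting repeated training-time samples, which is the kind of data a violin plot summarizes. It uses only `std::time::Instant` rather than the benchmarking harness used in this repository, and the names `time_training` and `summarize` are hypothetical helpers introduced for illustration.

```rust
use std::time::{Duration, Instant};

/// Call `train` once per sample and record the wall-clock time of each call.
/// `train` is a stand-in for any model-fitting call (an assumption of this sketch).
pub fn time_training<T, F: FnMut() -> T>(mut train: F, samples: usize) -> Vec<Duration> {
    let mut times = Vec::with_capacity(samples);
    for _ in 0..samples {
        let start = Instant::now();
        let model = train();
        times.push(start.elapsed());
        // black_box discourages the compiler from discarding the fitted model.
        std::hint::black_box(model);
    }
    times
}

/// Return (min, median, max) of the samples, or None if there are no samples.
/// For an even number of samples the upper median is used.
pub fn summarize(times: &[Duration]) -> Option<(Duration, Duration, Duration)> {
    if times.is_empty() {
        return None;
    }
    let mut sorted = times.to_vec();
    sorted.sort();
    Some((sorted[0], sorted[sorted.len() / 2], sorted[sorted.len() - 1]))
}
```

Passing a closure that fits a model, for example `time_training(|| fit(&x, &y), 100)` with a hypothetical `fit` function, yields one duration per run. A dedicated benchmarking harness adds warm-up and outlier analysis on top of this basic loop.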

Considerations Besides Execution Time

In the course of creating this benchmark study, a few additional differences between the libraries emerged.

Documentation

The documentation for smartcore is somewhat more consistent across algorithms, possibly because it is maintained in a single crate, whereas linfa is split across several sub-crates.

Dependencies

While linfa requires a BLAS/LAPACK backend (either openblas, netlib, or intel-mkl), smartcore does not. This allows linfa to take advantage of optimized linear algebra routines, but it limits portability.

Results

Regression

Linear Regression

No customization was needed to equate the two implementations.

Elastic Net

Support Vector Regression

Classification

Logistic Regression

The smartcore implementation has no parameters, but the linfa settings were modified to align it with smartcore defaults:

Gradient tolerance set to 1e-8
Maximum number of iterations set to 1000
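The two settings above are stopping criteria: training ends when the gradient norm falls below the tolerance or when the iteration cap is reached, whichever comes first. The sketch below illustrates that logic with plain batch gradient descent on a single feature. It is not the optimizer used by either crate; `fit_logistic` and the fixed `learning_rate` are assumptions made for illustration.

```rust
/// Fit a one-feature logistic regression by batch gradient descent.
/// Stops when the gradient norm is below `gradient_tolerance`
/// or after `max_iterations` steps. Returns (weight, bias, iterations_used).
pub fn fit_logistic(
    x: &[f64],
    y: &[f64],
    gradient_tolerance: f64,
    max_iterations: usize,
) -> (f64, f64, usize) {
    let n = x.len() as f64;
    let learning_rate = 0.5; // hypothetical step size, chosen for this sketch only
    let (mut w, mut b) = (0.0_f64, 0.0_f64);
    for iteration in 0..max_iterations {
        // Gradient of the mean log-loss with respect to w and b.
        let (mut grad_w, mut grad_b) = (0.0, 0.0);
        for (&xi, &yi) in x.iter().zip(y) {
            let p = 1.0 / (1.0 + (-(w * xi + b)).exp());
            grad_w += (p - yi) * xi / n;
            grad_b += (p - yi) / n;
        }
        // First stopping criterion: gradient tolerance.
        if (grad_w * grad_w + grad_b * grad_b).sqrt() < gradient_tolerance {
            return (w, b, iteration);
        }
        w -= learning_rate * grad_w;
        b -= learning_rate * grad_b;
    }
    // Second stopping criterion: iteration cap reached.
    (w, b, max_iterations)
}
```

Calling `fit_logistic(&x, &y, 1e-8, 1000)` mirrors the settings listed above. Aligning both criteria matters for a timing comparison because a looser tolerance or a lower cap lets one implementation stop earlier and appear faster.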

Clustering

K-Means

Since the two implementations use different convergence criteria, the maximum number of iterations was set to the same low value for both, and only one run of the linfa algorithm was permitted:

Max iterations set to 10
Number of runs set to 1
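To show why capping iterations and using a single run makes the work comparable, here is a minimal sketch of Lloyd's algorithm on one-dimensional data. A single run means one set of initial centroids with no restarts, and the iteration cap bounds the work regardless of each library's convergence test. The function `kmeans_1d` and its early-exit rule are assumptions for illustration, not either crate's implementation.

```rust
/// One run of Lloyd's algorithm on 1-D data from fixed initial centroids.
/// Stops early if no centroid changes, otherwise after `max_iterations`.
/// Returns (centroids, iterations_used).
pub fn kmeans_1d(data: &[f64], initial: &[f64], max_iterations: usize) -> (Vec<f64>, usize) {
    assert!(!initial.is_empty(), "at least one initial centroid is required");
    let mut centroids = initial.to_vec();
    for iteration in 0..max_iterations {
        let mut sums = vec![0.0; centroids.len()];
        let mut counts = vec![0usize; centroids.len()];
        // Assignment step: attach each point to its nearest centroid.
        for &x in data {
            let mut nearest = 0;
            for (k, &c) in centroids.iter().enumerate() {
                if (x - c).abs() < (x - centroids[nearest]).abs() {
                    nearest = k;
                }
            }
            sums[nearest] += x;
            counts[nearest] += 1;
        }
        // Update step: move each non-empty centroid to the mean of its points.
        let mut moved = false;
        for k in 0..centroids.len() {
            if counts[k] > 0 {
                let updated = sums[k] / counts[k] as f64;
                if updated != centroids[k] {
                    moved = true;
                }
                centroids[k] = updated;
            }
        }
        if !moved {
            return (centroids, iteration + 1);
        }
    }
    (centroids, max_iterations)
}
```

With `max_iterations` set to 10, as in the settings above, the loop body runs at most ten times per fit, so the measured time reflects the cost per iteration more than differences in when each library decides it has converged.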
