Algorithms
Eleven Rust-native algorithm families ship with zero external dependencies. Five more are available via optional packages.
Rust-native (no dependencies)
| Algorithm | Classification | Regression | Key parameter |
|---|---|---|---|
| random_forest | Yes | Yes | n_estimators |
| extra_trees | Yes | Yes | n_estimators |
| decision_tree | Yes | Yes | max_depth |
| gradient_boosting | Yes | Yes | n_estimators, learning_rate |
| histgradient | Yes | Yes | n_estimators, max_bins |
| adaboost | Yes | — | n_estimators |
| logistic | Yes | — | C |
| linear | — | Yes | — |
| elastic_net | — | Yes | alpha, l1_ratio |
| naive_bayes | Yes | — | — |
| knn | Yes | Yes | n_neighbors |
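To make the task columns easier to query, the table above can be transcribed into a small lookup. This is only an illustrative sketch: the dictionary and the helper `native_algorithms_for` are hypothetical names that do not come from the package, and the package's own `ml.algorithms(task=...)` may be implemented quite differently.

```python
# Capability table transcribed from the Rust-native list above.
# Tuple layout: (supports_classification, supports_regression).
NATIVE_ALGORITHMS = {
    "random_forest": (True, True),
    "extra_trees": (True, True),
    "decision_tree": (True, True),
    "gradient_boosting": (True, True),
    "histgradient": (True, True),
    "adaboost": (True, False),
    "logistic": (True, False),
    "linear": (False, True),
    "elastic_net": (False, True),
    "naive_bayes": (True, False),
    "knn": (True, True),
}


def native_algorithms_for(task):
    """Return the Rust-native algorithm names that support `task`.

    Hypothetical helper for illustration; not part of the package.
    """
    if task not in ("classification", "regression"):
        raise ValueError(f"unknown task: {task!r}")
    column = 0 if task == "classification" else 1
    return [name for name, support in NATIVE_ALGORITHMS.items() if support[column]]


if __name__ == "__main__":
    print(native_algorithms_for("regression"))
```

Reading the table this way shows, for example, that `logistic`, `adaboost`, and `naive_bayes` are classification-only, while `linear` and `elastic_net` are regression-only.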
External (optional packages)
| Algorithm | Classification | Regression | Install |
|---|---|---|---|
| xgboost | Yes | Yes | pip install "mlw[xgboost]" |
| lightgbm | Yes | Yes | pip install "mlw[lightgbm]" |
| catboost | Yes | Yes | pip install "mlw[catboost]" |
| svm | Yes | Yes | Included (linear) / sklearn (nonlinear) |
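Before requesting one of the optional algorithms, it can help to check whether its backing package is importable. The sketch below uses only the standard library. It assumes each extra installs a top-level module with the same name as the algorithm, which holds for the upstream xgboost, lightgbm, and catboost packages. The helper `missing_extras` is a hypothetical name, not a function of the package.

```python
from importlib.util import find_spec

# Optional algorithms from the table above, mapped to the module each extra
# is assumed to install (assumption: module name equals algorithm name).
OPTIONAL_BACKENDS = {
    "xgboost": "xgboost",
    "lightgbm": "lightgbm",
    "catboost": "catboost",
}


def missing_extras(algorithms):
    """Return pip commands for requested optional algorithms that are not installed.

    Hypothetical helper for illustration; algorithms that need no extra
    package are ignored.
    """
    commands = []
    for name in algorithms:
        module = OPTIONAL_BACKENDS.get(name)
        if module is not None and find_spec(module) is None:
            commands.append(f'pip install "mlw[{name}]"')
    return commands


if __name__ == "__main__":
    for command in missing_extras(["random_forest", "xgboost", "lightgbm"]):
        print("missing optional backend, install with:", command)
```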
Auto selection
When algorithm="auto", the package selects an algorithm based on the task and data characteristics. The default is random_forest, a reliable baseline that works well across most problems without tuning.
Usage
# List all available algorithms
ml.algorithms()
# Classification only
ml.algorithms(task="classification")
# Use a specific one
model = ml.fit(s.train, "target", algorithm="xgboost", seed=42)
# Same call in R
model <- ml_fit(s$train, "target", algorithm = "xgboost", seed = 42)
Engine selection
Each algorithm can run on multiple backends:
| Engine | Description |
|---|---|
| "auto" | Uses Rust backend when available, falls back to sklearn/CRAN |
| "ml" | Rust backend (via PyO3). Zero external dependencies. |
| "sklearn" | scikit-learn backend (Python only) |
| "r" | CRAN packages (R only) |
# Force Rust backend
model = ml.fit(s.train, "target", engine="ml", seed=42)
# Force sklearn
model = ml.fit(s.train, "target", engine="sklearn", seed=42)
# R: force Rust backend
model <- ml_fit(s$train, "target", engine = "ml", seed = 42)
# Force CRAN packages
model <- ml_fit(s$train, "target", engine = "r", seed = 42)
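The "auto" row of the engine table describes a fallback: prefer the Rust backend, otherwise use scikit-learn (on the Python side). A minimal sketch of that kind of resolution is shown below. It is not the package's actual logic: the function `resolve_engine` and the module name `_ml_native` are hypothetical placeholders chosen for illustration, and the real availability check may differ.

```python
from importlib.util import find_spec

VALID_ENGINES = ("auto", "ml", "sklearn", "r")


def resolve_engine(engine="auto", rust_module="_ml_native", fallback_module="sklearn"):
    """Resolve an engine name the way the "auto" row describes (Python side only).

    `rust_module` is a hypothetical name for the compiled Rust extension;
    the real module name is not stated in this document.
    """
    if engine not in VALID_ENGINES:
        raise ValueError(f"unknown engine: {engine!r}")
    if engine != "auto":
        # An explicit choice is passed through unchanged.
        return engine
    if find_spec(rust_module) is not None:
        return "ml"
    if find_spec(fallback_module) is not None:
        return "sklearn"
    raise ImportError("neither the Rust backend nor the fallback backend is importable")


if __name__ == "__main__":
    try:
        print(resolve_engine("auto"))
    except ImportError as exc:
        print("no backend available:", exc)
```

Passing an explicit engine skips the availability check entirely, which mirrors the "force" examples above.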