Quickstart

A complete workflow from data to assessed model. Takes about 2 minutes.

1. Load data

Python:

import ml

data = ml.dataset("titanic")
data.head()

R:

library(ml)

data <- ml_dataset("titanic")
head(data)

Output:

   pclass     sex      age  sibsp  parch      fare embarked survived
0       1  female  29.0000      0      0  211.3375        S      yes
1       1    male   0.9167      1      2  151.5500        S      yes
2       1  female   2.0000      1      2  151.5500        S       no
3       1    male  30.0000      1      2  151.5500        S       no
4       1  female  25.0000      1      2  151.5500        S       no

Several datasets ship with the package. Others are available via OpenML.

2. Split

Python:

s = ml.split(data, "survived", seed=42)

R:

s <- ml_split(data, "survived", seed = 42)

Output:

train: 785 rows  (60%)
valid: 262 rows  (20%)
test:  262 rows  (20%)
dev:   1047 rows (train + valid)

Four accessors: .train, .valid, .test (locked), and .dev (train + valid, combined for the final refit). In R, the same fields are reached with $: s$train, s$valid, s$test, s$dev.
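The 60/20/20 numbers above come from a seeded shuffle-and-slice. A minimal sketch in plain Python (illustrative only, not the library's implementation — the function name and return shape are invented here):

```python
import random

def three_way_split(n_rows, seed=42, ratios=(0.6, 0.2, 0.2)):
    """Shuffle row indices with a fixed seed, then slice them into
    train/valid/test partitions. Purely illustrative."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    n_train = round(n_rows * ratios[0])
    n_valid = round(n_rows * ratios[1])
    return {
        "train": idx[:n_train],
        "valid": idx[n_train:n_train + n_valid],
        "test":  idx[n_train + n_valid:],
    }

parts = three_way_split(1309)
print({k: len(v) for k, v in parts.items()})
# {'train': 785, 'valid': 262, 'test': 262}
```

With 1309 rows, rounding 60/20/20 reproduces the 785/262/262 counts shown above, and every row lands in exactly one partition.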

3. Fit and evaluate

Python:

model = ml.fit(s.train, "survived", seed=42)
ml.evaluate(model, s.valid)

R:

model <- ml_fit(s$train, "survived", seed = 42)
ml_evaluate(model, s$valid)

Output:

—— Metrics [classification] ————————
  accuracy:     0.8244
  f1:           0.7579
  precision:    0.8000
  recall:       0.7200
  roc_auc:      0.8647

The default algorithm is random forest. Try others by passing algorithm=.

Python:

model2 = ml.fit(s.train, "survived", algorithm="xgboost", seed=42)
ml.evaluate(model2, s.valid)

R:

model2 <- ml_fit(s$train, "survived", algorithm = "xgboost", seed = 42)
ml_evaluate(model2, s$valid)

Output:

—— Metrics [classification] ————————
  accuracy:     0.8206
  f1:           0.7513
  precision:    0.7978
  recall:       0.7100
  roc_auc:      0.8616

Call evaluate as many times as you want — it only uses validation data.
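As a sanity check on the metric tables above: f1 is the harmonic mean of precision and recall, so it can be verified directly from the reported numbers.

```python
# f1 as the harmonic mean of precision and recall,
# checked against the first validation table above.
precision, recall = 0.8000, 0.7200
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.7579
```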

4. Visualize

Python:

ml.plot(model, data=s.valid, kind="roc")
ml.plot(model, kind="importance")
ml.plot(model, data=s.valid, kind="confusion")
ml.plot(model, data=s.valid, kind="calibration")

R:

ml_plot(model, data = s$valid, kind = "roc")
ml_plot(model, kind = "importance")
ml_plot(model, data = s$valid, kind = "confusion")
ml_plot(model, data = s$valid, kind = "calibration")

All plots return a matplotlib.figure.Figure. Save with fig.savefig("roc.png"). Ten plot kinds are available; see explain and validate.
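Because each plot is an ordinary matplotlib Figure, saving works the same as for any figure. A minimal sketch using matplotlib alone, with a hand-drawn stand-in for the Figure a ml.plot call would return:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so this runs without a display
import matplotlib.pyplot as plt

# Stand-in for the Figure returned by e.g. ml.plot(model, data=s.valid, kind="roc")
fig, ax = plt.subplots()
ax.plot([0, 1], [0, 1], linestyle="--", label="chance")
ax.set_xlabel("false positive rate")
ax.set_ylabel("true positive rate")
ax.set_title("ROC")
ax.legend()

fig.savefig("roc.png", dpi=150)
```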

5. Final refit and assess

Python:

final = ml.fit(s.dev, "survived", algorithm="xgboost", seed=42)
evidence = ml.assess(final, test=s.test)
print(evidence)

R:

final <- ml_fit(s$dev, "survived", algorithm = "xgboost", seed = 42)
evidence <- ml_assess(final, test = s$test)
print(evidence)

Output:

—— Evidence [classification] ———————
  accuracy:     0.7863
  f1:           0.7021
  precision:    0.7500
  recall:       0.6600
  roc_auc:      0.8315
  ⚠ Final. A second assess() on the same hold-out test set raises.

Refit on .dev (train + valid) to train on the most data, then assess on the locked test set. You get one shot: a second assess on the same hold-out raises. That's the whole protocol: split first, evaluate freely, assess once. The types enforce it.
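The one-shot guarantee can be pictured as a wrapper that spends the test set on first use. A hypothetical sketch, assuming nothing about the library's internals (the class name and error message are invented for illustration):

```python
class OneShotTestSet:
    """Holds the locked test split and allows exactly one assessment."""

    def __init__(self, data):
        self._data = data
        self._spent = False

    def consume(self):
        if self._spent:
            raise RuntimeError("hold-out test set already assessed")
        self._spent = True
        return self._data

holdout = OneShotTestSet(["row0", "row1"])
holdout.consume()       # first assess: returns the data
try:
    holdout.consume()   # second assess: raises
except RuntimeError as err:
    print(err)          # hold-out test set already assessed
```

Tracking the spent state in the object itself is what lets the type system, rather than reviewer discipline, enforce the assess-once rule.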

What's next

  • split — ratios, grouped splits, temporal splits
  • screen — compare all algorithms at once
  • tune — hyperparameter optimization
  • Algorithms — all supported families