fit

Train a model. Returns a Model object that tracks its own workflow state — fitted, evaluated, or assessed.

Signature

Python:
ml.fit(data, target, *, algorithm="auto", seed, task="auto", balance=False, engine="auto", **kwargs)

R:
ml_fit(data, target, algorithm = "auto", seed = NULL, task = "auto", balance = FALSE, engine = "auto", ...)

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| data | DataFrame | | Training data (or a SplitResult/CVResult) |
| target | str | | Target column name |
| algorithm | str | "auto" | Algorithm family. See algorithms. |
| seed | int | | Random seed for reproducibility. Required. |
| task | str | "auto" | "classification", "regression", or "auto" (inferred from target). |
| balance | bool | False | Apply class-weight balancing for imbalanced classification. |
| engine | str | "auto" | "auto", "ml" (Rust), "sklearn", or "native". |
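
The table says task="auto" is inferred from the target, but not how. The sketch below shows one plausible heuristic, purely as an assumption: infer_task and its max_classes threshold are hypothetical names for illustration, not part of the package.

```python
from numbers import Real


def infer_task(values, max_classes=20):
    """Hypothetical heuristic: guess the task from the target values."""
    values = list(values)
    # Booleans and non-numeric labels (strings, etc.) are treated as classes.
    if any(isinstance(v, bool) or not isinstance(v, Real) for v in values):
        return "classification"
    # Any non-integer number suggests a continuous target.
    if any(not float(v).is_integer() for v in values):
        return "regression"
    # Integer-valued target: few distinct values look like class labels.
    return "classification" if len(set(values)) <= max_classes else "regression"


print(infer_task(["spam", "ham", "spam"]))  # classification
print(infer_task([199.99, 250.0, 310.5]))   # regression
print(infer_task([0, 1, 1, 0]))             # classification
```

Because any heuristic like this can misread an integer-coded target, passing task explicitly is the safer choice when the intent is known.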

Returns

A Model with attributes:

  • .algorithm — the algorithm used
  • .task — classification or regression
  • .features — feature names from training
  • .scores_ — CV metrics (when fitted on a CVResult)

Examples

Default (random forest)

Python:
model = ml.fit(s.train, "target", seed=42)

R:
model <- ml_fit(s$train, "target", seed = 42)

Specific algorithm

Python:
model = ml.fit(s.train, "price", algorithm="xgboost", seed=42)

R:
model <- ml_fit(s$train, "price", algorithm = "xgboost", seed = 42)

With class balancing

Python:
model = ml.fit(s.train, "fraud", balance=True, seed=42)

R:
model <- ml_fit(s$train, "fraud", balance = TRUE, seed = 42)
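
This page does not state how balance=True derives its weights. A common convention, the one behind scikit-learn's class_weight="balanced", is n_samples / (n_classes * class_count). The sketch below computes that in plain Python; balanced_class_weights is a hypothetical helper, and whether the package uses this exact formula is an assumption.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Illustrative fraud-style target: 95 negatives, 5 positives.
labels = [0] * 95 + [1] * 5
weights = balanced_class_weights(labels)
print(weights)  # the rare class receives the larger weight
```

With these weights every class contributes the same total weight to the loss, so the minority class is not drowned out by the majority.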

Fit on cross-validation result

Python:
cvr = ml.cv(s, folds=5, seed=42)
model = ml.fit(cvr, "target", seed=42)
model.scores_  # per-fold metrics

R:
cvr <- ml_cv(s, folds = 5, seed = 42)
model <- ml_fit(cvr, "target", seed = 42)
model$scores_  # per-fold metrics
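
To make "per-fold metrics" concrete, here is a minimal stdlib sketch of seeded k-fold scoring. It does not use the package: kfold_indices and cv_accuracy are hypothetical helpers, and a majority-class baseline stands in for a real model.

```python
import random


def kfold_indices(n_rows, folds, seed):
    """Shuffle row indices with a fixed seed and deal them into `folds` groups."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    return [idx[i::folds] for i in range(folds)]


def cv_accuracy(labels, folds, seed):
    """Per-fold accuracy of a majority-class baseline (stand-in for a model)."""
    scores = []
    for test_idx in kfold_indices(len(labels), folds, seed):
        held_out = set(test_idx)
        # Train on every row that is not in the held-out fold.
        train = [y for i, y in enumerate(labels) if i not in held_out]
        majority = max(set(train), key=train.count)
        hits = sum(labels[i] == majority for i in test_idx)
        scores.append(hits / len(test_idx))
    return scores


labels = [0] * 70 + [1] * 30
scores = cv_accuracy(labels, folds=5, seed=42)
print(scores)                     # one accuracy value per fold
print(sum(scores) / len(scores))  # mean across folds
```

Fixing the seed makes the fold assignment, and therefore each per-fold score, reproducible across runs.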

Notes

  • When algorithm="auto", the package selects an algorithm based on the task and the size of the data.
  • The Rust engine (engine="ml") provides native implementations for 11 algorithm families with no external dependencies.
  • Pass additional hyperparameters as keyword arguments (through ... in R): ml.fit(..., n_estimators=500, max_depth=8).
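
The Python signature makes seed keyword-only with no default and collects extra keywords in **kwargs. The stand-in below copies that signature to show both behaviors. It is only a stub that trains nothing, and the returned dictionary is an illustrative assumption rather than the package's Model object.

```python
def fit(data, target, *, algorithm="auto", seed, task="auto",
        balance=False, engine="auto", **kwargs):
    """Stand-in with the documented signature; it does not train anything."""
    # Extra keyword arguments land in `kwargs`, where a real implementation
    # could forward them to the chosen algorithm as hyperparameters.
    return {"algorithm": algorithm, "seed": seed, "hyperparameters": kwargs}


result = fit([], "price", algorithm="xgboost", seed=42,
             n_estimators=500, max_depth=8)
print(result["hyperparameters"])  # {'n_estimators': 500, 'max_depth': 8}

try:
    fit([], "price")  # seed omitted
except TypeError as exc:
    print("rejected:", exc)
```

Making seed keyword-only and required means a call cannot silently fall back to an unseeded, non-reproducible fit.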