calibrate

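Adjust a classifier's predicted probabilities so they match observed frequencies: among predictions made at 80% confidence, about 80% should be correct. Supports Platt scaling (a fitted sigmoid) and isotonic regression (fitted with the pool-adjacent-violators algorithm, PAV).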
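
To make "match observed frequencies" concrete, the sketch below bins predictions by probability and compares each bin's mean prediction with its observed positive rate. It is an illustration only, not part of the ml API: the names reliability_table and expected_calibration_error are hypothetical helpers, and equal-width binning is an assumed choice.

```python
def reliability_table(probs, labels, n_bins=5):
    """Group predictions into equal-width probability bins and compare the
    mean predicted probability with the observed positive rate per bin.
    Hypothetical helper for illustration; not part of the ml API."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        # A probability of exactly 1.0 goes into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    rows = []
    for idx, items in enumerate(bins):
        if not items:
            continue  # skip empty bins
        mean_pred = sum(p for p, _ in items) / len(items)
        observed = sum(y for _, y in items) / len(items)
        rows.append((idx, len(items), mean_pred, observed))
    return rows


def expected_calibration_error(probs, labels, n_bins=5):
    """Average |mean prediction - observed rate| over bins, weighted by bin size."""
    n = len(probs)
    return sum(
        count / n * abs(mean_pred - observed)
        for _, count, mean_pred, observed in reliability_table(probs, labels, n_bins)
    )


# Toy data: ten predictions of 0.8, eight of which are correct (well calibrated).
well = expected_calibration_error([0.8] * 10, [1] * 8 + [0] * 2)
# Ten predictions of 0.9, only five of which are correct (overconfident).
over = expected_calibration_error([0.9] * 10, [1] * 5 + [0] * 5)
print(well, over)
```

A well-calibrated model yields a value near zero; the overconfident toy example yields a gap of roughly 0.4.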

Signature

ml.calibrate(model, *, data, method="auto")
# R: calibration available via ml_plot(model, data, kind = "calibration")

Parameters

Parameter  Type       Default     Description
model      Model      (required)  A fitted classification model
data       DataFrame  (required)  Calibration data (typically validation set)
method     str        "auto"      "auto", "platt" (sigmoid), or "isotonic" (PAV)
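
For readers unfamiliar with the "isotonic" option, here is a minimal sketch of isotonic calibration using the pool-adjacent-violators algorithm. It shows the general technique under simple assumptions (binary 0/1 labels, a step-function lookup for new scores) and is not the library's implementation; pav and fit_isotonic are hypothetical names.

```python
import bisect
from itertools import groupby


def pav(values, weights=None):
    """Pool Adjacent Violators: weighted least-squares non-decreasing fit."""
    if weights is None:
        weights = [1.0] * len(values)
    # Each block holds [weighted mean, total weight, number of inputs pooled].
    blocks = []
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Merge backwards while the non-decreasing order is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


def fit_isotonic(scores, labels):
    """Return a function mapping a raw score to a calibrated probability."""
    pairs = sorted(zip(scores, labels))
    xs, means, counts = [], [], []
    # Collapse tied scores into one point weighted by how many labels it holds.
    for score, group in groupby(pairs, key=lambda pair: pair[0]):
        ys = [y for _, y in group]
        xs.append(score)
        means.append(sum(ys) / len(ys))
        counts.append(len(ys))
    fitted = pav(means, counts)

    def calibrate(score):
        # Step function: use the fitted value at the largest known score <= score,
        # falling back to the first fitted value for scores below the range.
        idx = bisect.bisect_right(xs, score) - 1
        return fitted[max(idx, 0)]

    return calibrate


calibrate = fit_isotonic([0.1, 0.2, 0.3, 0.4, 0.5, 0.6], [0, 1, 0, 1, 1, 1])
print([calibrate(s) for s in (0.05, 0.25, 0.9)])
```

The fitted mapping is non-decreasing by construction, so calibration never reorders predictions, although it can introduce ties.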

Returns

A calibrated Model. Predictions from this model have adjusted probabilities.

Examples

calibrated = ml.calibrate(model, data=s.valid)
probs = ml.predict(calibrated, new_data, proba=True)

# Verify calibration
ml.plot(calibrated, data=s.valid, kind="calibration")

# R: visualize calibration
ml_plot(model, data = s$valid, kind = "calibration")

When to calibrate

  • Tree-based models (random forest, XGBoost) often have poorly calibrated probabilities.
  • Logistic regression is typically well-calibrated already.
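  • method="auto" uses Platt scaling for small datasets, isotonic for large ones. Platt scaling fits only two parameters, so it is less prone to overfitting a small calibration set; isotonic regression is more flexible but needs more data.
  • Always calibrate on validation data, never on training data. A model's probabilities on data it was trained on are typically overconfident, so a calibrator fit there learns the wrong mapping.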
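
The "platt" option named in the list above can be sketched as fitting a sigmoid to held-out scores. The version below is a simplified variant under stated assumptions: it minimizes log loss with plain gradient descent and omits the target smoothing of Platt's original method. The names fit_platt and platt_predict, and the learning-rate and iteration defaults, are hypothetical and do not describe how ml.calibrate works internally.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_platt(scores, labels, lr=0.1, n_iter=2000):
    """Fit p = sigmoid(a * score + b) by gradient descent on mean log loss.
    Simplified sketch: no target smoothing, fixed step size."""
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            p = sigmoid(a * s + b)
            grad_a += (p - y) * s  # d(log loss)/da for one sample
            grad_b += (p - y)      # d(log loss)/db for one sample
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def platt_predict(score, a, b):
    """Map a raw score to a calibrated probability."""
    return sigmoid(a * score + b)


# Held-out scores (for example, decision-function values) with 0/1 labels.
val_scores = [-2, -1, -1, 0, 0, 1, 1, 2]
val_labels = [0, 0, 1, 0, 1, 0, 1, 1]
a, b = fit_platt(val_scores, val_labels)
print(a, b, platt_predict(1.5, a, b))
```

The scores passed to fit_platt should come from data the model did not train on, in line with the last point in the list above.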