drift

Detects distribution shift between reference data and new data. Two methods are available: per-feature statistical tests (KS or chi-squared) and an adversarial check, which trains a classifier to distinguish old rows from new ones.

Signature

ml.drift(*, reference, new, method="statistical", threshold=0.05, seed=None)

ml_drift(reference, new, method = "statistical", threshold = 0.05, seed = NULL)

Parameters

Parameter	Type	Default	Description
`reference`	DataFrame	—	Original training data
`new`	DataFrame	—	New incoming data
`method`	str	`"statistical"`	`"statistical"` (KS/chi2) or `"adversarial"` (classifier-based)
`threshold`	float	`0.05`	p-value threshold for statistical method
`seed`	int | None	`None`	Random seed (adversarial method)

Returns

DriftResult with:

.shifted — True if drift detected
.severity — "none", "low", "medium", or "high"
.features_shifted — list of drifted feature names
.features — dict of per-feature p-values
.auc — adversarial AUC (adversarial method only)

Examples

Statistical drift detection

result = ml.drift(reference=s.train, new=new_data)
print(result.shifted)           # True/False
print(result.severity)          # "none", "low", "medium", "high"
print(result.features_shifted)  # ["age", "fare"]

result <- ml_drift(s$train, new_data)
result$shifted
result$severity
result$features_shifted
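
To make the per-feature idea concrete, here is a minimal pure-Python sketch of a two-sample KS check over numeric columns. It is an illustration only, not the library's implementation: `ks_two_sample` and `shifted_features` are hypothetical helper names, the rule "flag a feature when its p-value is below `threshold`" is an assumption, and the chi-squared test for categorical columns is left out. The p-value uses the asymptotic Kolmogorov distribution, so it is approximate for small samples.

```python
import math
from bisect import bisect_right


def ks_two_sample(a, b):
    """Two-sample Kolmogorov-Smirnov test: return (statistic, approximate p-value)."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    # Largest gap between the two empirical CDFs, evaluated at every observed value.
    d = max(abs(bisect_right(a, x) / n - bisect_right(b, x) / m) for x in a + b)
    # Asymptotic Kolmogorov distribution with a small-sample correction.
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.3:  # the series converges slowly here, and the p-value is ~1 anyway
        return d, 1.0
    p = 2 * sum((-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam) for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)


def shifted_features(reference, new, threshold=0.05):
    """Test each shared numeric column; return (p_values, names with p < threshold)."""
    p_values = {name: ks_two_sample(reference[name], new[name])[1]
                for name in reference if name in new}
    flagged = [name for name, p in p_values.items() if p < threshold]
    return p_values, flagged


# Toy data (made up): "age" is unchanged, "fare" is shifted upward by 1.0.
reference = {"age": [i / 100 for i in range(200)], "fare": [i / 100 for i in range(200)]}
new = {"age": list(reference["age"]), "fare": [x + 1.0 for x in reference["fare"]]}
p_values, flagged = shifted_features(reference, new)
print(flagged)  # ['fare']
```

For real workloads, `scipy.stats.ks_2samp` and `scipy.stats.chi2_contingency` are established implementations of the same tests.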

Adversarial drift detection

# If a classifier can distinguish old from new, the data has drifted
result = ml.drift(reference=s.train, new=new_data, method="adversarial", seed=42)
print(result.auc)  # ~0.5 means indistinguishable; values well above 0.5 indicate drift

result <- ml_drift(s$train, new_data, method = "adversarial", seed = 42)
result$auc
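
The adversarial idea can be sketched in plain Python as follows: label reference rows 0 and new rows 1, fit a classifier on half of the shuffled rows, and score the held-out half by AUC. This is a rough illustration under stated assumptions, not the library's implementation. The source does not say which classifier or split `ml.drift` uses, so the logistic regression, the 50/50 split, and the names `auc` and `adversarial_auc` are all hypothetical.

```python
import math
import random


def auc(pos_scores, neg_scores):
    """Chance that a random positive outscores a random negative (ties count half)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))


def adversarial_auc(reference, new, seed=0, epochs=200, lr=0.1):
    """Fit logistic regression to separate reference (0) from new (1); return held-out AUC."""
    rng = random.Random(seed)
    rows = [(x, 0) for x in reference] + [(x, 1) for x in new]
    rng.shuffle(rows)
    half = len(rows) // 2
    train, test = rows[:half], rows[half:]
    dim = len(rows[0][0])
    w, b = [0.0] * dim, 0.0

    def score(x):
        z = b + sum(wi * xi for wi, xi in zip(w, x))
        return max(-30.0, min(30.0, z))  # clamp to keep exp() in range

    for _ in range(epochs):  # plain batch gradient descent on the log loss
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in train:
            err = 1.0 / (1.0 + math.exp(-score(x))) - y
            grad_b += err
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
        b -= lr * grad_b / len(train)
        w = [wj - lr * gj / len(train) for wj, gj in zip(w, grad_w)]

    pos = [score(x) for x, y in test if y == 1]
    neg = [score(x) for x, y in test if y == 0]
    return auc(pos, neg)


# Synthetic data (made up): "moved" has its first feature shifted by 2.
rng = random.Random(42)
ref = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
same = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
moved = [[rng.gauss(2, 1), rng.gauss(0, 1)] for _ in range(200)]
print(adversarial_auc(ref, same))   # expected to land near 0.5
print(adversarial_auc(ref, moved))  # expected to land well above 0.5
```

Scoring on held-out rows matters here: an AUC measured on the training rows would overstate how separable the two datasets are.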
