Performs k-fold cross-validation on a list of models, optionally using model-specific fitting parameters. Supports verbose messages and a progress bar.

benchmark(
  models,
  X,
  y,
  cv = 5L,
  scoring = NULL,
  params = NULL,
  cl = NULL,
  show_progress = FALSE,
  verbose = TRUE
)

Arguments

models

A named list of Model$new(...) objects to benchmark.

X

A data frame or matrix of predictors.

y

A vector of outcomes (factor for classification, numeric for regression).

cv

Integer, number of cross-validation folds (default 5).

scoring

Scoring metric, one of "rmse", "mae", "accuracy", or "f1". If NULL (the default), the metric is chosen automatically based on the task.

params

Optional named list of lists; each sublist contains extra arguments passed to the corresponding model's fit() call. Names must match the names of models.
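
For example, with models named "rf" and "glm", a matching params list could look like the following sketch (the argument values are purely illustrative):

params <- list(
  rf  = list(ntree = 500),
  glm = list(method = "glmnet")
)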

cl

Optional integer, number of clusters to use for parallel processing (default NULL).

show_progress

Logical, whether to show a progress bar (default FALSE).

verbose

Logical, whether to print messages about each model (default TRUE).

Value

A list containing the CV scores for each model.
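
If each element of the returned list is a numeric vector of per-fold scores (an assumption about the structure, not guaranteed by this page), the output of the call in the Examples below could be summarized along these lines:

sapply(results, mean)  # average CV score per model
sapply(results, sd)    # fold-to-fold variability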

Examples

if (FALSE) { # \dontrun{
library(caret)          # provides train() and trainControl()
library(randomForest)

X <- iris[, 1:4]
y <- iris$Species

models <- list(
  glm  = Model$new(caret::train),
  rf   = Model$new(randomForest::randomForest),
  xgb  = Model$new(caret::train)
)

params <- list(
  glm = list(method = "glmnet",
             tuneGrid = data.frame(alpha = 0, lambda = 0.01),
             trControl = trainControl(method = "none")),
  rf  = list(ntree = 150),
  xgb = list(method = "xgbTree",
             tuneGrid = data.frame(nrounds = 150, max_depth = 3, eta = 0.3,
                                   gamma = 0, colsample_bytree = 1,
                                   min_child_weight = 1, subsample = 1),
             trControl = trainControl(method = "none"))
)

results <- benchmark(models, X, y, cv = 5, params = params,
                     show_progress = TRUE, verbose = TRUE)
print(results)
} # }
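
The same setup can be rerun with an explicit metric and parallel workers; cv, scoring, and cl are the documented arguments, while the specific values below are only illustrative:

models <- list(rf = Model$new(randomForest::randomForest))
results <- benchmark(models, X, y,
                     cv = 10L, scoring = "accuracy", cl = 2,
                     params = list(rf = list(ntree = 300)),
                     show_progress = TRUE)
print(results)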