Wraps an mlr3::Learner into a PipeOp.

Returns cross-validated predictions during training as a Task and stores a model of the Learner trained on the whole data in $state. This model is used to create a similar Task during prediction.

The Task gets features depending on the wrapped Learner's $predict_type. If the Learner's $predict_type is "response", a feature <ID>.response is created; for $predict_type "prob", the <ID>.prob.<CLASS> features are created; and for $predict_type "se", the new columns are <ID>.response and <ID>.se. <ID> denotes the $id of the PipeOpLearnerCV object.
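The naming scheme above can be checked directly. The following is a minimal sketch, assuming the mlr3 and mlr3pipelines packages are attached and using the "classif.rpart" Learner on the iris Task, as in the Examples section; the variable names are illustrative only.

```r
library("mlr3")
library("mlr3pipelines")

task = tsk("iris")

# predict_type "response": a single feature <ID>.response
po_resp = po("learner_cv", lrn("classif.rpart"))
out_resp = po_resp$train(list(task))[[1]]
print(out_resp$feature_names)

# predict_type "prob": one feature <ID>.prob.<CLASS> per class level
po_prob = po("learner_cv", lrn("classif.rpart", predict_type = "prob"))
out_prob = po_prob$train(list(task))[[1]]
print(out_prob$feature_names)
```

Here <ID> is "classif.rpart" because the id defaults to the id of the wrapped Learner.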

Inherits the $param_set (and therefore $param_set$values) from the Learner it is constructed from.

PipeOpLearnerCV can be used to create "stacking" or "super learning" Graphs that use the output of one Learner as features for another Learner. Because PipeOpLearnerCV replaces the original input features with predictions, it is often useful to use PipeOpFeatureUnion to bind the prediction Task to the original input Task.
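A stacking Graph of this kind might look as follows. This is a sketch under assumptions, not a recommended setup: the second-level Learner and the id "meta_rpart" are chosen purely for illustration, and the Graph is wrapped in a GraphLearner so that it can be trained and used for prediction like an ordinary Learner.

```r
library("mlr3")
library("mlr3pipelines")

task = tsk("iris")

# Level 0: cross-validated rpart predictions next to the untouched original features
level0 = gunion(list(
  po("learner_cv", lrn("classif.rpart")),
  po("nop")
)) %>>% po("featureunion")

# Level 1: a second Learner trained on original features plus level-0 predictions.
# The id "meta_rpart" is hypothetical; it only avoids a clash with the level-0 id.
stack = level0 %>>% PipeOpLearner$new(lrn("classif.rpart"), id = "meta_rpart")

glrn = GraphLearner$new(stack)
glrn$train(task)
pred = glrn$predict(task)
print(head(pred$response))
```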

Format

R6Class object inheriting from PipeOpTaskPreproc/PipeOp.

Construction

PipeOpLearnerCV$new(learner, id = if (is.character(learner)) learner else learner$id, param_vals = list())
  • learner :: Learner
    Learner to use for cross validation / prediction, or a string identifying a Learner in the mlr3::mlr_learners Dictionary.

  • id :: character(1)
    Identifier of the resulting object, defaulting to the id of the Learner being wrapped.

  • param_vals :: named list
    List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list().
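As an illustration of the constructor arguments, the sketch below sets a custom id and passes both a resampling parameter and a Learner hyperparameter through param_vals. The id "rpart_cv" and the chosen values are arbitrary, and the example assumes that the wrapped Learner's hyperparameters (here rpart's maxdepth) are exposed without a prefix, as the description of the inherited $param_set suggests.

```r
library("mlr3")
library("mlr3pipelines")

po_cv = PipeOpLearnerCV$new(
  lrn("classif.rpart"),
  id = "rpart_cv",                 # hypothetical id; becomes the <ID> prefix of new features
  param_vals = list(
    resampling.folds = 5,          # parameter introduced by PipeOpLearnerCV
    maxdepth = 3                   # hyperparameter of the wrapped rpart Learner
  )
)

print(po_cv$id)
print(po_cv$param_set$values$resampling.folds)
print(po_cv$param_set$values$maxdepth)
```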

Input and Output Channels

PipeOpLearnerCV has one input channel named "input", taking a Task specific to the Learner type given to learner during construction, both during training and prediction.

PipeOpLearnerCV has one output channel named "output", producing a Task specific to the Learner type given to learner during construction, both during training and prediction.

The output is a Task with the same target as the input Task, with features replaced by predictions made by the Learner. During training, this prediction is the out-of-sample prediction made by resample. During prediction, it is the ordinary prediction made on the new data by the Learner that was trained on the training phase data.
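The two phases can be exercised on a simple split of the data. The sketch below assumes mlr3 and mlr3pipelines are attached; the odd/even row split is arbitrary and only serves to show that both phases return a Task with the same target and the same prediction feature.

```r
library("mlr3")
library("mlr3pipelines")

task = tsk("iris")
train_rows = seq(1, 150, by = 2)
test_rows = seq(2, 150, by = 2)

po_cv = po("learner_cv", lrn("classif.rpart"))

# Training: features are replaced by out-of-sample (cross-validated) predictions
train_out = po_cv$train(list(task$clone()$filter(train_rows)))[[1]]

# Prediction: features are replaced by predictions of the model stored in $state
predict_out = po_cv$predict(list(task$clone()$filter(test_rows)))[[1]]

print(train_out$feature_names)
print(predict_out$feature_names)
print(predict_out$target_names)
```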

State

The $state is set to the $state slot of the Learner object, together with the $state elements inherited from the PipeOpTaskPreproc. It is a named list with the inherited members, as well as:

  • model :: any
    Model created by the Learner's $train_internal() function.

  • train_log :: data.table with columns class (character), msg (character)
    Errors logged during training.

  • train_time :: numeric(1)
    Training time, in seconds.

  • predict_log :: NULL | data.table with columns class (character), msg (character)
    Errors logged during prediction.

  • predict_time :: NULL | numeric(1)
    Prediction time, in seconds.

Parameters

The parameters are the parameters inherited from the PipeOpTaskPreproc, as well as the parameters of the Learner wrapped by this object. Besides that, parameters introduced are:

  • resampling.method :: character(1)
    Resampling method to use. Currently only "cv" is supported.

  • resampling.folds :: numeric(1)
    Number of cross validation folds. Initialized to 3.

  • keep_response :: logical(1)
    Only effective during "prob" prediction: Whether to keep response values, if available. Initialized to FALSE.
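Parameters can also be changed after construction through $param_set$values. The following sketch only touches resampling.folds and lists the parameters whose ids start with "resampling"; the exact set of ids printed may differ between package versions.

```r
library("mlr3")
library("mlr3pipelines")

po_cv = po("learner_cv", lrn("classif.rpart", predict_type = "prob"))

# Change the number of cross-validation folds from its initial value of 3
po_cv$param_set$values$resampling.folds = 5

# List the resampling-related parameters of this PipeOp
print(grep("^resampling", po_cv$param_set$ids(), value = TRUE))
print(po_cv$param_set$values$resampling.folds)
```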

Internals

The $state is currently not updated by prediction, so the $state$predict_log and $state$predict_time will always be NULL.
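The behaviour described above can be observed by inspecting $state after each phase. This is a minimal sketch, assuming the "classif.rpart" Learner, whose model is an object of class "rpart".

```r
library("mlr3")
library("mlr3pipelines")

task = tsk("iris")
po_cv = po("learner_cv", lrn("classif.rpart"))

# After training, $state holds the model fitted on the whole training data
invisible(po_cv$train(list(task)))
print(class(po_cv$state$model))
print(po_cv$state$train_time)

# Prediction does not update $state, so the predict entries stay NULL
invisible(po_cv$predict(list(task)))
print(is.null(po_cv$state$predict_log))
print(is.null(po_cv$state$predict_time))
```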

Fields

Fields inherited from PipeOpTaskPreproc/PipeOp, as well as:

  • learner :: Learner
    The Learner wrapped by this object, accessed as $learner in the Examples.

Methods

Methods inherited from PipeOpTaskPreproc/PipeOp.

See also

Other Meta PipeOps: mlr_pipeops_learner

Examples

library("mlr3")
library("mlr3pipelines")

task = tsk("iris")
learner = lrn("classif.rpart")

# Wrap the Learner; its cross-validated predictions become features
lrncv_po = po("learner_cv", learner)
lrncv_po$learner$predict_type = "response"

# "nop" passes the original Task through unchanged
nop = mlr_pipeops$get("nop")

# Bind the predictions to the original features
graph = gunion(list(
  lrncv_po,
  nop
)) %>>% po("featureunion")

graph$train(task)
#> $featureunion.output
#> <TaskClassif:iris> (150 x 6)
#> * Target: Species
#> * Properties: multiclass
#> * Features (5):
#>   - dbl (4): Petal.Length, Petal.Width, Sepal.Length, Sepal.Width
#>   - fct (1): classif.rpart.response
#>

# Switch to probability predictions: one feature per class
graph$pipeops$classif.rpart$learner$predict_type = "prob"

graph$train(task)
#> $featureunion.output
#> <TaskClassif:iris> (150 x 8)
#> * Target: Species
#> * Properties: multiclass
#> * Features (7):
#>   - dbl (7): Petal.Length, Petal.Width, Sepal.Length, Sepal.Width,
#>     classif.rpart.prob.setosa, classif.rpart.prob.versicolor,
#>     classif.rpart.prob.virginica
#>