Wrap a Learner into a PipeOp with Cross-validated Predictions as Features

Wraps an mlr3::Learner into a PipeOp.

Returns cross-validated predictions during training as a Task and stores a model of the Learner trained on the whole data in $state. This model is used to create a similar Task during prediction.

The Task gets features depending on the wrapped Learner's $predict_type. If the Learner's $predict_type is "response", a feature <ID>.response is created; for $predict_type "prob", the <ID>.prob.<CLASS> features are created; and for $predict_type "se", the new columns are <ID>.response and <ID>.se. <ID> denotes the $id of the PipeOpLearnerCV object.
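
The naming scheme above can be checked with a minimal sketch. It assumes that the mlr3pipelines package is attached alongside mlr3, that the rpart package is installed, and that the PipeOp keeps its default id, which is the id of the wrapped Learner.

```r
library("mlr3")
library("mlr3pipelines")

task = tsk("iris")

# predict_type "response" (the default for classif.rpart): one <ID>.response feature
po_response = po("learner_cv", lrn("classif.rpart"))
out_response = po_response$train(list(task))[[1]]
print(out_response$feature_names)

# predict_type "prob": one <ID>.prob.<CLASS> feature per class of the target
po_prob = po("learner_cv", lrn("classif.rpart", predict_type = "prob"))
out_prob = po_prob$train(list(task))[[1]]
print(out_prob$feature_names)
```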

Inherits the $param_set (and therefore $param_set$values) from the Learner it is constructed from.

PipeOpLearnerCV can be used to create "stacking" or "super learning" Graphs that use the output of one Learner as features for another Learner. Because PipeOpLearnerCV erases the original input features, it is often useful to use PipeOpFeatureUnion to bind the prediction Task to the original input Task.
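
A stacking Graph of this kind could look roughly as follows. This is a sketch under assumptions, not a recommended setup: the ids "base_rpart" and "super_rpart" are illustrative names chosen here to avoid an id clash between the two rpart-based PipeOps, and the choice of rpart for both levels is arbitrary.

```r
library("mlr3")
library("mlr3pipelines")

set.seed(1)  # cross-validation folds are drawn at random
task = tsk("iris")

# Level 0: cross-validated class probabilities of a base learner.
base = po("learner_cv", lrn("classif.rpart", predict_type = "prob"),
  id = "base_rpart")

# Level 1: a second learner sees the original features (passed through by
# "nop") together with the base learner's predictions.
graph = gunion(list(base, po("nop"))) %>>%
  po("featureunion") %>>%
  po("learner", lrn("classif.rpart"), id = "super_rpart")

# Treat the whole Graph as a single Learner.
stacked = as_learner(graph)
stacked$train(task)
prediction = stacked$predict(task)
print(prediction)
```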

Format

R6Class object inheriting from PipeOpTaskPreproc/PipeOp.

Construction

PipeOpLearnerCV$new(learner, id = NULL, param_vals = list())

learner :: Learner
Learner to use for cross validation / prediction, or a string identifying a Learner in the mlr3::mlr_learners Dictionary. This argument is always cloned; to access the Learner inside PipeOpLearnerCV by-reference, use $learner.
id :: character(1)
Identifier of the resulting object, internally defaulting to the id of the Learner being wrapped.
param_vals :: named list
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list().
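
The constructor arguments can be illustrated with a short sketch. The id "rpart_cv" and the choice of five folds are illustrative values, not defaults; the sketch assumes that mlr3pipelines is attached and that rpart is installed.

```r
library("mlr3")
library("mlr3pipelines")

learner = lrn("classif.rpart")

po_cv = PipeOpLearnerCV$new(
  learner,
  id = "rpart_cv",                            # illustrative id
  param_vals = list(resampling.folds = 5)     # overrides the initial value
)

# The learner argument was cloned, so changing the original object
# does not affect the Learner inside the PipeOp.
learner$predict_type = "prob"
print(po_cv$learner$predict_type)

# To change the wrapped Learner, go through $learner, which gives
# by-reference access.
po_cv$learner$predict_type = "prob"
print(po_cv$learner$predict_type)

# Without an explicit id, the id of the wrapped Learner is used.
po_default = PipeOpLearnerCV$new(lrn("classif.rpart"))
print(po_default$id)
```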

Input and Output Channels

PipeOpLearnerCV has one input channel named "input", taking a Task specific to the Learner type given to learner during construction; both during training and prediction.

PipeOpLearnerCV has one output channel named "output", producing a Task specific to the Learner type given to learner during construction; both during training and prediction.

The output is a Task with the same target as the input Task, with features replaced by predictions made by the Learner. During training, this prediction is the out-of-sample prediction made by resample. During prediction, it is the ordinary prediction made on the new data by the Learner that was trained on all of the training-phase data.
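
The difference between the two phases can be sketched by training on one half of a Task and predicting on the other. The odd/even row split is an arbitrary choice made for this illustration.

```r
library("mlr3")
library("mlr3pipelines")

set.seed(1)
task = tsk("iris")
train_task = task$clone()$filter(seq(1, 150, by = 2))
test_task = task$clone()$filter(seq(2, 150, by = 2))

po_cv = po("learner_cv", lrn("classif.rpart"))

# Training: each row's feature value is an out-of-sample prediction from resampling.
train_out = po_cv$train(list(train_task))$output

# Prediction: the model fitted on all of train_task predicts the new rows.
predict_out = po_cv$predict(list(test_task))$output

print(train_out$target_names)     # the target is carried over unchanged
print(predict_out$feature_names)  # the original features are replaced
```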

State

The $state is set to the $state slot of the Learner object, together with the $state elements inherited from the PipeOpTaskPreproc. It is a named list with the inherited members, as well as:

model :: any
Model created by the Learner's $.train() function.
train_log :: data.table with columns class (character), msg (character)
Errors logged during training.
train_time :: numeric(1)
Training time, in seconds.
predict_log :: NULL | data.table with columns class (character), msg (character)
Errors logged during prediction.
predict_time :: NULL | numeric(1)
Prediction time, in seconds.

This state is given the class "pipeop_learner_cv_state".
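
A minimal sketch for inspecting the state after training follows. Which additional members appear in the list depends on the installed mlr3 and mlr3pipelines versions, so only the model entry is relied upon here.

```r
library("mlr3")
library("mlr3pipelines")

po_cv = po("learner_cv", lrn("classif.rpart"))
po_cv$train(list(tsk("iris")))

state = po_cv$state
print(names(state))   # inherited members plus the Learner's state elements
print(class(state))

# The model fitted on the complete training data; for classif.rpart this
# is an object of class "rpart".
print(class(state$model))
```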

Parameters

The parameters are the parameters inherited from the PipeOpTaskPreproc, as well as the parameters of the Learner wrapped by this object. Besides that, parameters introduced are:

resampling.method :: character(1)
Which resampling method to use. Currently only "cv" and "insample" are supported. "insample" generates predictions with the model trained on all training data.
resampling.folds :: numeric(1)
Number of cross validation folds. Initialized to 3. Only used for resampling.method = "cv".
keep_response :: logical(1)
Only effective during "prob" prediction: Whether to keep response values, if available. Initialized to FALSE.
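
The resampling parameters and the hyperparameters of the wrapped Learner are set through the same $param_set. The sketch below assumes that the Learner's hyperparameters (here rpart's cp) appear without a prefix inside the PipeOp; the values 5 and 0.05 are arbitrary.

```r
library("mlr3")
library("mlr3pipelines")

set.seed(1)
task = tsk("iris")
po_cv = po("learner_cv", lrn("classif.rpart"))

# Hyperparameter of the wrapped Learner.
po_cv$param_set$values$cp = 0.05

# Out-of-sample predictions from 5-fold cross-validation.
po_cv$param_set$values$resampling.method = "cv"
po_cv$param_set$values$resampling.folds = 5
cv_out = po_cv$train(list(task))$output

# In-sample predictions: the model is trained on all rows and then
# predicts those same rows; resampling.folds is not used.
po_cv$param_set$values$resampling.method = "insample"
insample_out = po_cv$train(list(task))$output

print(head(cv_out$data()))
print(head(insample_out$data()))
```

In-sample predictions tend to look better than the model really is, which is why cross-validated predictions are usually preferred when the output feeds another Learner.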

Internals

The $state is currently not updated by prediction, so the $state$predict_log and $state$predict_time will always be NULL.
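
This behaviour can be observed with a short check, sketched here under the assumption that mlr3pipelines is attached and rpart is installed.

```r
library("mlr3")
library("mlr3pipelines")

task = tsk("iris")
po_cv = po("learner_cv", lrn("classif.rpart"))

po_cv$train(list(task))
po_cv$predict(list(task))

# Prediction did not write timing information back into the state.
print(is.null(po_cv$state$predict_time))
```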

Fields

Fields inherited from PipeOp, as well as:

learner :: Learner
Learner that is being wrapped. Read-only.
learner_model :: Learner
Learner that is being wrapped. This learner contains the model if the PipeOp is trained. Read-only.
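
The difference between the two fields becomes visible once the PipeOp is trained. The following is a sketch, assuming a version of mlr3pipelines that provides $learner_model as described above.

```r
library("mlr3")
library("mlr3pipelines")

po_cv = po("learner_cv", lrn("classif.rpart"))
po_cv$train(list(tsk("iris")))

# $learner is the wrapped Learner itself and carries no fitted model.
print(is.null(po_cv$learner$model))

# $learner_model is the Learner combined with the trained state.
print(class(po_cv$learner_model$model))
```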

Methods

Methods inherited from PipeOpTaskPreproc/PipeOp.

Examples

library("mlr3")
library("mlr3pipelines")  # provides po(), gunion(), mlr_pipeops and %>>%

task = tsk("iris")
learner = lrn("classif.rpart")

lrncv_po = po("learner_cv", learner)
lrncv_po$learner$predict_type = "response"

nop = mlr_pipeops$get("nop")

graph = gunion(list(
  lrncv_po,
  nop
)) %>>% po("featureunion")

graph$train(task)
#> $featureunion.output
#> <TaskClassif:iris> (150 x 6): Iris Flowers
#> * Target: Species
#> * Properties: multiclass
#> * Features (5):
#>   - dbl (4): Petal.Length, Petal.Width, Sepal.Length, Sepal.Width
#>   - fct (1): classif.rpart.response
#> 

graph$pipeops$classif.rpart$learner$predict_type = "prob"

graph$train(task)
#> $featureunion.output
#> <TaskClassif:iris> (150 x 8): Iris Flowers
#> * Target: Species
#> * Properties: multiclass
#> * Features (7):
#>   - dbl (7): Petal.Length, Petal.Width, Sepal.Length, Sepal.Width,
#>     classif.rpart.prob.setosa, classif.rpart.prob.versicolor,
#>     classif.rpart.prob.virginica
#>