
Wrap a Learner into a PipeOp with Cross-validated Predictions as Features
Source: R/PipeOpLearnerCV.R
Wraps an mlr3::Learner into a PipeOp.
Returns cross-validated predictions during training as a Task and stores a model of the
Learner trained on the whole data in $state. This is used to create a similar
Task during prediction.
The features of the resulting Task depend on the encapsulated Learner's
$predict_type. If the Learner's $predict_type is "response", a feature <ID>.response is created;
for $predict_type "prob", the features <ID>.prob.<CLASS> are created; and for $predict_type "se", the new columns
are <ID>.response and <ID>.se. <ID> denotes the $id of the PipeOpLearnerCV object.
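The naming scheme above can be checked with a minimal sketch. It assumes the mlr3 and mlr3pipelines packages (plus rpart) are installed; the choice of the "iris" and "mtcars" tasks and of the "regr.featureless" learner for the "se" case is only for illustration.

```r
library("mlr3")
library("mlr3pipelines")

task = tsk("iris")

# "response" predictions yield a single feature column
po_response = po("learner_cv", lrn("classif.rpart", predict_type = "response"))
out_response = po_response$train(list(task))[[1]]
print(out_response$feature_names)

# "prob" predictions yield one feature column per class
po_prob = po("learner_cv", lrn("classif.rpart", predict_type = "prob"))
out_prob = po_prob$train(list(task))[[1]]
print(out_prob$feature_names)

# "se" predictions (regression) yield a response and a standard error column
po_se = po("learner_cv", lrn("regr.featureless", predict_type = "se"))
out_se = po_se$train(list(tsk("mtcars")))[[1]]
print(out_se$feature_names)
```

In each case the prefix is the $id of the PipeOp, which defaults to the id of the wrapped Learner.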
Inherits the $param_set (and therefore $param_set$values) from the Learner it is constructed from.
PipeOpLearnerCV can be used to create "stacking" or "super learning" Graphs that use the output of one Learner
as a feature for another Learner. Because PipeOpLearnerCV discards the original input features, it is often
useful to use PipeOpFeatureUnion to bind the prediction Task to the original input Task.
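A stacking Graph of this kind might look like the following sketch. The ids "base_rpart" and "super_rpart" are hypothetical names chosen for illustration, and using rpart at both levels is an arbitrary choice rather than a recommendation.

```r
library("mlr3")
library("mlr3pipelines")

task = tsk("iris")

# Base level: cross-validated rpart probabilities alongside the untouched
# original features (passed through by "nop"), bound by "featureunion".
base = gunion(list(
  po("learner_cv", lrn("classif.rpart", predict_type = "prob"), id = "base_rpart"),
  po("nop")
)) %>>% po("featureunion")

# Second level: a learner trained on the original features plus the
# base learner's predictions.
stack = base %>>% po("learner", lrn("classif.rpart"), id = "super_rpart")

# Treat the whole Graph as a single Learner
stacked_learner = as_learner(stack)
stacked_learner$train(task)
prediction = stacked_learner$predict(task)
```

Giving the two wrapped learners distinct ids avoids an id clash inside the Graph, since both would otherwise default to "classif.rpart".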
Format
R6Class object inheriting from PipeOpTaskPreproc/PipeOp.
Construction
learner :: Learner
Learner to use for cross validation / prediction, or a string identifying a Learner in the mlr3::mlr_learners Dictionary. This argument is always cloned; to access the Learner inside PipeOpLearnerCV by reference, use $learner.
id :: character(1)
Identifier of the resulting object, internally defaulting to the id of the Learner being wrapped.
param_vals :: named list
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list().
Input and Output Channels
PipeOpLearnerCV has one input channel named "input", taking a Task specific to the Learner
type given to learner during construction; both during training and prediction.
PipeOpLearnerCV has one output channel named "output", producing a Task specific to the Learner
type given to learner during construction; both during training and prediction.
The output is a Task with the same target as the input Task, with features replaced by predictions made by the Learner.
During training, this prediction is the out-of-sample prediction made by resample(); during prediction, it is the
ordinary prediction made on the new data by a Learner trained on all of the training-phase data.
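The difference between the two phases can be illustrated with a short sketch. The odd/even row split below is a hypothetical one made up for the example; any disjoint split would do.

```r
library("mlr3")
library("mlr3pipelines")

task = tsk("iris")

# Hypothetical split: odd rows for training, even rows as "new" data
train_task = task$clone()$filter(seq(1L, 150L, by = 2L))
new_task = task$clone()$filter(seq(2L, 150L, by = 2L))

lrncv = po("learner_cv", lrn("classif.rpart"))

# Training phase: features are out-of-sample (cross-validated) predictions
train_out = lrncv$train(list(train_task))[[1]]

# Prediction phase: features are predictions of the model fitted on all
# of train_task
pred_out = lrncv$predict(list(new_task))[[1]]

print(train_out$target_names)
print(train_out$feature_names)
print(pred_out$feature_names)
```

Both outputs are Tasks with the same target and the same prediction feature columns, so a downstream Learner sees a consistent layout in both phases.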
State
The $state is set to the $state slot of the Learner object, together with the $state elements inherited from the
PipeOpTaskPreproc. It is a named list with the inherited members, as well as:
model :: any
Model created by the Learner's $.train() function.
train_log :: data.table with columns class (character), msg (character)
Errors logged during training.
train_time :: numeric(1)
Training time, in seconds.
predict_log :: NULL | data.table with columns class (character), msg (character)
Errors logged during prediction.
predict_time :: NULL | numeric(1)
Prediction time, in seconds.
This state is given the class "pipeop_learner_cv_state".
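As a rough illustration, the state can be inspected after training. This sketch only looks at the class and the model element; the exact set of state names may differ between package versions, so none is assumed here.

```r
library("mlr3")
library("mlr3pipelines")

task = tsk("iris")
lrncv = po("learner_cv", lrn("classif.rpart"))

# Before training there is no state yet
state_before = lrncv$state
print(is.null(state_before))

lrncv$train(list(task))

# After training, the state carries the model fitted on the whole data
print(class(lrncv$state))
print(names(lrncv$state))
print(class(lrncv$state$model))
```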
Parameters
The parameters are the parameters inherited from the PipeOpTaskPreproc, as well as the parameters of the Learner wrapped by this object.
Besides that, parameters introduced are:
resampling.method :: character(1)
Which resampling method to use. Currently only "cv" and "insample" are supported. "insample" generates predictions with the model trained on all training data.
resampling.folds :: numeric(1)
Number of cross validation folds. Initialized to 3. Only used for resampling.method = "cv".
keep_response :: logical(1)
Only effective during "prob" prediction: Whether to keep response values, if available. Initialized to FALSE.
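The parameters above can be set through $param_set$values, as in this sketch. The values chosen (5 folds, keep_response = TRUE) are arbitrary illustrations, not recommended settings.

```r
library("mlr3")
library("mlr3pipelines")

task = tsk("iris")
lrncv = po("learner_cv", lrn("classif.rpart", predict_type = "prob"))

# Use 5-fold cross validation and keep the response column next to the
# class probabilities
lrncv$param_set$values$resampling.folds = 5
lrncv$param_set$values$keep_response = TRUE
out_cv = lrncv$train(list(task))[[1]]
print(out_cv$feature_names)

# Switch to in-sample predictions: the model trained on all training data
# predicts on that same data, so resampling.folds is not used
lrncv$param_set$values$resampling.method = "insample"
out_insample = lrncv$train(list(task))[[1]]
print(out_insample$feature_names)
```

In-sample predictions are typically more optimistic than cross-validated ones, which is worth keeping in mind when they feed a downstream Learner.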
Internals
The $state is currently not updated by prediction, so the $state$predict_log and $state$predict_time will always be NULL.
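A small check of this behavior, assuming it holds in the installed version, could look like:

```r
library("mlr3")
library("mlr3pipelines")

task = tsk("iris")
lrncv = po("learner_cv", lrn("classif.rpart"))

lrncv$train(list(task))
invisible(lrncv$predict(list(task)))

# Even after a prediction call, the prediction-related state elements
# are expected to remain NULL
print(is.null(lrncv$state$predict_log))
print(is.null(lrncv$state$predict_time))
```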
Fields
Fields inherited from PipeOp, as well as:
Methods
Methods inherited from PipeOpTaskPreproc/PipeOp.
See also
https://mlr-org.com/pipeops.html
Other Meta PipeOps:
mlr_pipeops_learner,
mlr_pipeops_learner_pi_cvplus,
mlr_pipeops_learner_quantiles
Examples
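library("mlr3")
library("mlr3pipelines")  # provides po(), gunion(), %>>% and mlr_pipeops
task = tsk("iris")
learner = lrn("classif.rpart")
# Wrap the learner; its cross-validated predictions become features
lrncv_po = po("learner_cv", learner)
lrncv_po$learner$predict_type = "response"
# "nop" passes the original features through unchanged
nop = mlr_pipeops$get("nop")
# Bind the prediction feature and the original features into one Task
graph = gunion(list(
  lrncv_po,
  nop
)) %>>% po("featureunion")
graph$train(task)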
#> $featureunion.output
#>
#> ── <TaskClassif> (150x6): Iris Flowers ─────────────────────────────────────────
#> • Target: Species
#> • Target classes: setosa (33%), versicolor (33%), virginica (33%)
#> • Properties: multiclass
#> • Features (5):
#> • dbl (4): Petal.Length, Petal.Width, Sepal.Length, Sepal.Width
#> • fct (1): classif.rpart.response
#>
graph$pipeops$classif.rpart$learner$predict_type = "prob"
graph$train(task)
#> $featureunion.output
#>
#> ── <TaskClassif> (150x8): Iris Flowers ─────────────────────────────────────────
#> • Target: Species
#> • Target classes: setosa (33%), versicolor (33%), virginica (33%)
#> • Properties: multiclass
#> • Features (7):
#> • dbl (7): Petal.Length, Petal.Width, Sepal.Length, Sepal.Width,
#> classif.rpart.prob.setosa, classif.rpart.prob.versicolor,
#> classif.rpart.prob.virginica
#>