Wrap a Learner into a PipeOp with Cross-validated Predictions as Features
Source: R/PipeOpLearnerCV.R
mlr_pipeops_learner_cv.Rd
Wraps an mlr3::Learner into a PipeOp.
Returns cross-validated predictions during training as a Task and stores a model of the Learner trained on the whole data in $state. This is used to create a similar Task during prediction.
The Task gets features depending on the wrapped Learner's $predict_type. If the Learner's $predict_type is "response", a feature <ID>.response is created; for $predict_type "prob", the <ID>.prob.<CLASS> features are created; and for $predict_type "se", the new columns are <ID>.response and <ID>.se. <ID> denotes the $id of the PipeOpLearnerCV object.
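The naming scheme for the "se" case can be checked with a short sketch. It assumes the regression learner "regr.featureless" and the "mtcars" task, both of which ship with mlr3; they are used here only because the featureless learner supports standard-error predictions, not because the documentation prescribes them.

```r
library(mlr3)
library(mlr3pipelines)

# Assumption: "regr.featureless" is chosen only because it is bundled with
# mlr3 and supports predict_type = "se"; any such regression Learner would do.
learner = lrn("regr.featureless", predict_type = "se")
lrncv_po = po("learner_cv", learner)

# PipeOps take and return lists of inputs/outputs, hence list(...) and [[1]].
out = lrncv_po$train(list(tsk("mtcars")))[[1]]

# Feature names are prefixed with the PipeOp's $id, which defaults to the
# id of the wrapped Learner.
print(lrncv_po$id)
print(out$feature_names)
```

The printed feature names should follow the <ID>.response / <ID>.se pattern described above.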
Inherits the $param_set (and therefore $param_set$values) from the Learner it is constructed from.
PipeOpLearnerCV can be used to create "stacking" or "super learning" Graphs that use the output of one Learner as a feature for another Learner. Because the PipeOpLearnerCV erases the original input features, it is often useful to use PipeOpFeatureUnion to bind the prediction Task to the original input Task.
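A complete stacked model might look like the following sketch. The id "rpart_cv" and the choice of "classif.rpart" for both levels are illustrative assumptions; the explicit id is only there so that the two PipeOps do not share an id within one Graph.

```r
library(mlr3)
library(mlr3pipelines)

task = tsk("iris")

# Level-0 learner: its cross-validated predictions become new features.
# "rpart_cv" is a hypothetical id, chosen to avoid clashing with the id of
# the final learner in the same Graph.
level0 = po("learner_cv", lrn("classif.rpart", predict_type = "prob"), id = "rpart_cv")

# Bind the prediction features to the untouched original features, then
# feed the combined Task to a final learner.
graph = gunion(list(level0, po("nop"))) %>>%
  po("featureunion") %>>%
  po("learner", lrn("classif.rpart"))

# Treat the whole Graph as a single Learner.
stack = as_learner(graph)
stack$train(task)
prediction = stack$predict(task)

# Accuracy on the training data, so this is an optimistic estimate.
print(prediction$score(msr("classif.acc")))
```

For an honest performance estimate, the stacked learner would normally be evaluated with its own outer resampling rather than on the training data.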
Format
R6Class object inheriting from PipeOpTaskPreproc/PipeOp.
Construction
learner :: Learner
  Learner to use for cross validation / prediction, or a string identifying a Learner in the mlr3::mlr_learners Dictionary. This argument is always cloned; to access the Learner inside PipeOpLearnerCV by-reference, use $learner.
id :: character(1)
  Identifier of the resulting object, internally defaulting to the id of the Learner being wrapped.
param_vals :: named list
  List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list().
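The construction arguments can be illustrated with a minimal sketch. The id "tree_cv" is a hypothetical name used only for this example.

```r
library(mlr3)
library(mlr3pipelines)

learner = lrn("classif.rpart")

# Without an explicit id, the id of the wrapped Learner is used.
po_default = PipeOpLearnerCV$new(learner)
print(po_default$id)

# Explicit id plus hyperparameter settings passed through param_vals.
# "tree_cv" is a hypothetical id chosen for this example.
po_custom = PipeOpLearnerCV$new(learner, id = "tree_cv",
  param_vals = list(resampling.folds = 5))
print(po_custom$param_set$values$resampling.folds)

# The constructor clones `learner`, so changes must go through $learner
# and leave the original object untouched.
po_custom$learner$predict_type = "prob"
print(learner$predict_type)
print(po_custom$learner$predict_type)
```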
Input and Output Channels
PipeOpLearnerCV has one input channel named "input", taking a Task specific to the Learner type given to learner during construction, both during training and prediction.
PipeOpLearnerCV has one output channel named "output", producing a Task specific to the Learner type given to learner during construction, both during training and prediction.
The output is a task with the same target as the input task, with features replaced by predictions made by the Learner. During training, this prediction is the out-of-sample prediction made by resample; during prediction, it is the ordinary prediction made on the data by a Learner trained on the training phase data.
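The two phases can be exercised directly on the PipeOp, outside of a Graph. This is a sketch using the "iris" task and "classif.rpart" as arbitrary stand-ins; note that a PipeOp's $train() and $predict() take and return lists, one element per channel.

```r
library(mlr3)
library(mlr3pipelines)

task = tsk("iris")
lrncv_po = po("learner_cv", lrn("classif.rpart"))

# Training phase: the features are out-of-sample predictions obtained by
# cross-validation.
train_out = lrncv_po$train(list(task))[[1]]

# Prediction phase: the features are predictions of the model that was
# fitted on all of the training data.
predict_out = lrncv_po$predict(list(task))[[1]]

# Both outputs keep the target and replace the features with predictions.
print(train_out$target_names)
print(train_out$feature_names)
print(predict_out$feature_names)
```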
State
The $state is set to the $state slot of the Learner object, together with the $state elements inherited from the PipeOpTaskPreproc. It is a named list with the inherited members, as well as:
model :: any
  Model created by the Learner's $.train() function.
train_log :: data.table with columns class (character), msg (character)
  Errors logged during training.
train_time :: numeric(1)
  Training time, in seconds.
predict_log :: NULL | data.table with columns class (character), msg (character)
  Errors logged during prediction.
predict_time :: NULL | numeric(1)
  Prediction time, in seconds.
This state is given the class "pipeop_learner_cv_state".
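The state can be inspected after training, as in the following sketch. The exact set of element names may differ between package versions, so the code prints them rather than assuming a fixed list.

```r
library(mlr3)
library(mlr3pipelines)

task = tsk("iris")
lrncv_po = po("learner_cv", lrn("classif.rpart"))
lrncv_po$train(list(task))

state = lrncv_po$state

# Element names and class of the state; these may vary by package version.
print(names(state))
print(class(state))

# The model fitted on the whole training data.
print(class(state$model))

# Prediction does not update the state, so this stays NULL.
lrncv_po$predict(list(task))
print(is.null(lrncv_po$state$predict_time))
```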
Parameters
The parameters are the parameters inherited from the PipeOpTaskPreproc, as well as the parameters of the Learner wrapped by this object. Besides that, parameters introduced are:
resampling.method :: character(1)
  Which resampling method to use. Currently only supports "cv" and "insample". "insample" generates predictions with the model trained on all training data.
resampling.folds :: numeric(1)
  Number of cross-validation folds. Initialized to 3. Only used for resampling.method = "cv".
keep_response :: logical(1)
  Only effective during "prob" prediction: whether to keep response values, if available. Initialized to FALSE.
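The resampling parameters can be set through $param_set$values, as in this sketch. The helper agreement() is a hypothetical function written for this example; it measures how often the generated prediction feature matches the target.

```r
library(mlr3)
library(mlr3pipelines)

task = tsk("iris")
lrncv_po = po("learner_cv", lrn("classif.rpart"))

# Hypothetical helper: share of rows where the prediction feature equals
# the target (compared as character to sidestep factor-level issues).
agreement = function(t) {
  d = t$data()
  mean(as.character(d[["Species"]]) == as.character(d[["classif.rpart.response"]]))
}

# Cross-validated predictions with 5 folds instead of the initial 3.
lrncv_po$param_set$values$resampling.method = "cv"
lrncv_po$param_set$values$resampling.folds = 5
cv_out = lrncv_po$train(list(task))[[1]]

# In-sample predictions: the model sees the rows it predicts on.
lrncv_po$param_set$values$resampling.method = "insample"
insample_out = lrncv_po$train(list(task))[[1]]

print(agreement(cv_out))
print(agreement(insample_out))
```

In-sample predictions tend to agree with the target more often than cross-validated ones, which is the leakage that cross-validation is meant to avoid in stacking.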
Internals
The $state is currently not updated by prediction, so the $state$predict_log and $state$predict_time will always be NULL.
Fields
Fields inherited from PipeOp, as well as:
Methods
Methods inherited from PipeOpTaskPreproc/PipeOp.
See also
https://mlr-org.com/pipeops.html
Other Meta PipeOps: mlr_pipeops_learner, mlr_pipeops_learner_pi_cvplus, mlr_pipeops_learner_quantiles
Examples
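library("mlr3")
library("mlr3pipelines")  # provides po(), gunion(), %>>% and mlr_pipeops

task = tsk("iris")
learner = lrn("classif.rpart")

# Wrap the learner; its cross-validated predictions become features.
lrncv_po = po("learner_cv", learner)
lrncv_po$learner$predict_type = "response"

# "nop" passes the original features through unchanged.
nop = mlr_pipeops$get("nop")

graph = gunion(list(
  lrncv_po,
  nop
)) %>>% po("featureunion")

graph$train(task)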
#> $featureunion.output
#> <TaskClassif:iris> (150 x 6): Iris Flowers
#> * Target: Species
#> * Properties: multiclass
#> * Features (5):
#> - dbl (4): Petal.Length, Petal.Width, Sepal.Length, Sepal.Width
#> - fct (1): classif.rpart.response
#>
graph$pipeops$classif.rpart$learner$predict_type = "prob"
graph$train(task)
#> $featureunion.output
#> <TaskClassif:iris> (150 x 8): Iris Flowers
#> * Target: Species
#> * Properties: multiclass
#> * Features (7):
#> - dbl (7): Petal.Length, Petal.Width, Sepal.Length, Sepal.Width,
#> classif.rpart.prob.setosa, classif.rpart.prob.versicolor,
#> classif.rpart.prob.virginica
#>