Create a bagging learner — mlr_graphs

Creates a Graph that performs bagging for a supplied graph. This is done as follows:

1. Subsample the data using PipeOpSubsample, then apply graph to the subsample.
2. Replicate this step iterations times (in parallel via multiplicities).
3. Average the predictions of the replicated graphs using the averager (note that the averager must be constructed with collect_multiplicity = TRUE).

All input arguments are cloned and have no references in common with the returned Graph.
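The three steps above can be sketched by hand. This is a minimal sketch, assuming that the replication step corresponds to po("replicate") with its reps parameter (an assumption, not stated above) and that po("subsample") accepts frac and replace; the internals of pipeline_bagging() may differ in detail. The names iterations and manual are illustrative only.

```r
library(mlr3)
library(mlr3pipelines)

iterations = 3

# Hand-built equivalent of the three steps (assumed structure, see lead-in):
manual = po("replicate", reps = iterations) %>>%      # step 2: replicate via multiplicities
  po("subsample", frac = 0.7, replace = FALSE) %>>%   # step 1: subsample the rows
  po("learner", lrn("regr.rpart")) %>>%               # step 1: apply the learner
  po("regravg", collect_multiplicity = TRUE)          # step 3: average the predictions

# Wrap the graph as a learner, then train and predict on a small built-in task.
learner = GraphLearner$new(manual)
task = tsk("mtcars")
learner$train(task)
pred = learner$predict(task)
print(names(manual$pipeops))
```

Building the graph by hand like this is mainly useful for seeing where each argument of pipeline_bagging() ends up; for regular use the helper is shorter.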

Usage

pipeline_bagging(
  graph,
  iterations = 10,
  frac = 0.7,
  averager = NULL,
  replace = FALSE
)

Arguments

graph: PipeOp | Graph
A PipeOpLearner or Graph to create a bagging pipeline for. Outputs from the replicated graphs are connected with the averager.
iterations: integer(1)
Number of bagging iterations. Defaults to 10.
frac: numeric(1)
Fraction of rows to keep during subsampling, given as a proportion (e.g. 0.7 keeps 70% of the rows). See PipeOpSubsample for more information. Defaults to 0.7.
averager: PipeOp | Graph
A PipeOp or Graph that averages the predictions from the replicated and subsampled graphs. In the simplest case, po("classifavg") and po("regravg") can be used to perform simple averaging of classification and regression predictions, respectively. If NULL (default), no averager is added to the end of the graph. Note that setting collect_multiplicity = TRUE during construction of the averager is required.
replace: logical(1)
Whether to sample with replacement. Defaults to FALSE.
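For classification, the same arguments apply with po("classifavg") as the averager. The following is a minimal sketch, assuming the iris task and the classif.rpart learner shipped with mlr3; the choices of task, learner, iterations, and frac are illustrative only.

```r
library(mlr3)
library(mlr3pipelines)

task = tsk("iris")

# Bag a classification tree: 5 replicates, each trained on 80% of the rows.
gr = pipeline_bagging(
  po("learner", lrn("classif.rpart")),
  iterations = 5,
  frac = 0.8,
  averager = po("classifavg", collect_multiplicity = TRUE)
)

# Wrap the graph as a learner, train it, and predict on the same task.
learner = GraphLearner$new(gr)
learner$train(task)
pred = learner$predict(task)
print(pred$score(msr("classif.acc")))
```

Predicting on the training data, as done here, only checks that the pipeline runs; use resample() for an honest performance estimate.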

Value

Graph

Examples

# \donttest{
library(mlr3)
library(mlr3pipelines)  # provides po(), ppl(), pipeline_bagging() and GraphLearner
lrn_po = po("learner", lrn("regr.rpart"))
task = mlr_tasks$get("boston_housing")
gr = pipeline_bagging(lrn_po, 3, averager = po("regravg", collect_multiplicity = TRUE))
resample(task, GraphLearner$new(gr), rsmp("holdout"))$aggregate()
#> regr.mse 
#> 17.31492 

# The original bagging method uses bootstrapping, i.e. sampling with replacement.
gr = ppl("bagging", lrn_po, frac = 1, replace = TRUE,
  averager = po("regravg", collect_multiplicity = TRUE))
resample(task, GraphLearner$new(gr), rsmp("holdout"))$aggregate()
#> regr.mse 
#> 11.37488 
# }