Creates a Graph that performs bagging for a supplied graph.
This is done as follows:
- Subsample the data in each step using PipeOpSubsample, afterwards apply graph.
- Replicate this step iterations times (in parallel via multiplicities).
- Average the predictions of the replicated graphs using the averager (note that setting collect_multiplicity = TRUE is required).
All input arguments are cloned and have no references in common with the returned Graph.
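The steps above can be sketched by hand with the individual PipeOps. This is a minimal illustration, assuming mlr3, mlr3pipelines and rpart are installed; the name gr_manual and the chosen values (3 replications, frac = 0.7) are illustrative, and the internals of pipeline_bagging() may differ in detail.

```r
library(mlr3)
library(mlr3pipelines)

# Hand-built bagging graph (illustrative sketch, not the function's source code).
gr_manual = po("replicate", reps = 3) %>>%           # replicate the input as a Multiplicity
  po("subsample", frac = 0.7, replace = FALSE) %>>%  # subsample each replicate independently
  po("learner", lrn("regr.rpart")) %>>%              # fit the learner on every subsample
  po("regravg", collect_multiplicity = TRUE)         # collect and average the predictions

# The PipeOps appear in the graph in the order they were chained.
print(gr_manual$ids())
```

The resulting graph can be wrapped with GraphLearner$new(gr_manual) and used like any other learner.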
Arguments
- graph
PipeOp|Graph
A PipeOpLearner or Graph to create a bagging pipeline for. Outputs from the replicated graphs are connected with the averager.
- iterations
integer(1)
Number of bagging iterations. Defaults to 10.
- frac
numeric(1)
Fraction of rows to keep during subsampling. See PipeOpSubsample for more information. Defaults to 0.7.
- averager
PipeOp|Graph
A PipeOp or Graph that averages the predictions from the replicated and subsampled graphs. In the simplest case, po("classifavg") and po("regravg") can be used to perform simple averaging of classification and regression predictions, respectively. If NULL (default), no averager is added to the end of the graph. Note that setting collect_multiplicity = TRUE during construction of the averager is required.
- replace
logical(1)
Whether to sample with replacement. Default FALSE.
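For classification, the same call can be made with po("classifavg") as the averager. The following is a hedged sketch, assuming mlr3, mlr3pipelines and rpart are installed; the task, the learner and the object names are chosen only for illustration, and training and predicting on the same task is done purely to keep the example short.

```r
library(mlr3)
library(mlr3pipelines)

task_iris = tsk("iris")

# Bagging of a classification tree with all arguments passed by name.
gr_classif = pipeline_bagging(
  graph = po("learner", lrn("classif.rpart")),
  iterations = 5,
  frac = 0.7,
  averager = po("classifavg", collect_multiplicity = TRUE),
  replace = FALSE
)

# Wrap the graph as a learner, then train and predict.
glrn = as_learner(gr_classif)
glrn$train(task_iris)
pred = glrn$predict(task_iris)
```

Here pred holds one aggregated prediction per row of the task, combined across the 5 replicated trees.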
Examples
# \donttest{
library(mlr3)
library(mlr3pipelines)
lrn_po = po("learner", lrn("regr.rpart"))
task = mlr_tasks$get("boston_housing")
gr = pipeline_bagging(lrn_po, 3, averager = po("regravg", collect_multiplicity = TRUE))
resample(task, GraphLearner$new(gr), rsmp("holdout"))$aggregate()
#> regr.mse
#> 17.31492
# The original bagging method uses bootstrapping, i.e. sampling with replacement.
gr = ppl("bagging", lrn_po, frac = 1, replace = TRUE,
  averager = po("regravg", collect_multiplicity = TRUE))
resample(task, GraphLearner$new(gr), rsmp("holdout"))$aggregate()
#> regr.mse
#> 11.37488
# }
