Subsamples a Task to use a fraction of the rows.

Sampling happens only during training phase. Subsampling a Task may be beneficial for training time at possibly (depending on original Task size) negligible cost of predictive performance.

## Format

R6Class object inheriting from PipeOpTaskPreproc/PipeOp.

PipeOpSubsample$new(id = "subsample", param_vals = list())  • id :: character(1) Identifier of the resulting object, default "subsample" • param_vals :: named list List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list(). ## Input and Output Channels Input and output channels are inherited from PipeOpTaskPreproc. The output during training is the input Task with added or removed rows according to the sampling. The output during prediction is the unchanged input. ## State The $state is a named list with the $state elements inherited from PipeOpTaskPreproc. ## Parameters The parameters are the parameters inherited from PipeOpTaskPreproc; however, the affect_columns parameter is not present. Further parameters are: • frac :: numeric(1) Fraction of rows in the Task to keep. May only be greater than 1 if replace is TRUE. Initialized to (1 - exp(-1)) == 0.6321. • stratify :: logical(1) Should the subsamples be stratified by target? Initialized to FALSE. May only be TRUE for TaskClassif input. • replace :: logical(1) Sample with replacement? Initialized to FALSE. ## Internals Uses task$filter() to remove rows. If replace is TRUE and identical rows are added, then the task$row_roles$use can not be used to duplicate rows because of [inaudible]; instead the task$rbind() function is used, and a new data.table is attached that contains all rows that are being duplicated exactly as many times as they are being added. ## Fields Only fields inherited from PipeOpTaskPreproc/PipeOp. ## Methods Only methods inherited from PipeOpTaskPreproc/PipeOp. ## See also https://mlr3book.mlr-org.com/list-pipeops.html Other PipeOps: PipeOpEnsemble, PipeOpImpute, PipeOpTargetTrafo, PipeOpTaskPreprocSimple, PipeOpTaskPreproc, PipeOp, mlr_pipeops_boxcox, mlr_pipeops_branch, mlr_pipeops_chunk, mlr_pipeops_classbalancing, mlr_pipeops_classifavg, mlr_pipeops_classweights, mlr_pipeops_colapply, mlr_pipeops_collapsefactors, mlr_pipeops_colroles, mlr_pipeops_copy, mlr_pipeops_datefeatures, mlr_pipeops_encodeimpact, mlr_pipeops_encodelmer, mlr_pipeops_encode, mlr_pipeops_featureunion, mlr_pipeops_filter, mlr_pipeops_fixfactors, mlr_pipeops_histbin, mlr_pipeops_ica, mlr_pipeops_imputeconstant, mlr_pipeops_imputehist, mlr_pipeops_imputelearner, mlr_pipeops_imputemean, mlr_pipeops_imputemedian, mlr_pipeops_imputemode, mlr_pipeops_imputeoor, mlr_pipeops_imputesample, mlr_pipeops_kernelpca, mlr_pipeops_learner, mlr_pipeops_missind, mlr_pipeops_modelmatrix, mlr_pipeops_multiplicityexply, mlr_pipeops_multiplicityimply, mlr_pipeops_mutate, mlr_pipeops_nmf, mlr_pipeops_nop, mlr_pipeops_ovrsplit, mlr_pipeops_ovrunite, mlr_pipeops_pca, mlr_pipeops_proxy, mlr_pipeops_quantilebin, mlr_pipeops_randomprojection, mlr_pipeops_randomresponse, mlr_pipeops_regravg, mlr_pipeops_removeconstants, mlr_pipeops_renamecolumns, mlr_pipeops_replicate, mlr_pipeops_scalemaxabs, mlr_pipeops_scalerange, mlr_pipeops_scale, mlr_pipeops_select, mlr_pipeops_smote, mlr_pipeops_spatialsign, mlr_pipeops_targetinvert, mlr_pipeops_targetmutate, mlr_pipeops_targettrafoscalerange, mlr_pipeops_textvectorizer, mlr_pipeops_threshold, mlr_pipeops_tunethreshold, mlr_pipeops_unbranch, mlr_pipeops_updatetarget, mlr_pipeops_vtreat, mlr_pipeops_yeojohnson, mlr_pipeops ## Examples library("mlr3") pos = mlr_pipeops$get("subsample", param_vals = list(frac = 0.7, stratify = TRUE))

pos$train(list(tsk("iris"))) #>$output
#> * Target: Species
#> * Properties: multiclass
#> * Features (4):
#>   - dbl (4): Petal.Length, Petal.Width, Sepal.Length, Sepal.Width
#>