Generates a more balanced data set by creating synthetic instances of the minority class using the SMOTE algorithm.
For each minority instance, the algorithm samples new data points based on the `K` nearest neighbors of that instance.
It can only be applied to tasks with numeric features.
See `smotefamily::SMOTE` for details.
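
The interpolation idea described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the code used by `smotefamily::SMOTE`: the function name `smote_sketch` and its argument `n_new` are hypothetical, and details such as the neighbor search and the number of synthetic points per instance may differ in the real implementation.

```r
# Illustrative sketch only: SMOTE-style interpolation between a minority
# point and one of its K nearest minority neighbors.
# `smote_sketch` and `n_new` are hypothetical names, not part of any package.
smote_sketch = function(X, K = 5, n_new = 1) {
  X = as.matrix(X)
  stopifnot(is.numeric(X), nrow(X) > K)
  D = as.matrix(dist(X))  # pairwise Euclidean distances
  out = vector("list", nrow(X) * n_new)
  idx = 1
  for (i in seq_len(nrow(X))) {
    # indices of the K nearest neighbors, excluding the point itself
    nn = setdiff(order(D[i, ]), i)[seq_len(K)]
    for (j in seq_len(n_new)) {
      neighbor = X[nn[sample.int(K, 1)], ]
      gap = runif(1)  # random position on the segment between the two points
      out[[idx]] = X[i, ] + gap * (neighbor - X[i, ])
      idx = idx + 1
    }
  }
  do.call(rbind, out)
}

set.seed(1)
minority = matrix(runif(20), ncol = 2)  # 10 made-up minority points
synthetic = smote_sketch(minority, K = 3, n_new = 2)
dim(synthetic)
```

Because every synthetic point is a convex combination of two existing points, it always lies on the line segment between a minority instance and one of its neighbors. This is also why the approach requires numeric features.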

`R6Class` object inheriting from `PipeOpTaskPreproc`/`PipeOp`.

PipeOpSmote$new(id = "smote", param_vals = list())

`id` :: `character(1)`
Identifier of resulting object, default `"smote"`.

`param_vals` :: named `list`
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default `list()`.

Input and output channels are inherited from `PipeOpTaskPreproc`.

The output during training is the input `Task` with added synthetic rows for the minority class.
The output during prediction is the unchanged input.
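
The difference between training and prediction can be checked by comparing row counts. The following is a small sketch that assumes the `mlr3`, `mlr3pipelines` and `smotefamily` packages are installed. The seed and the conversion of the target to a factor are assumptions added for illustration.

```r
library("mlr3")
library("mlr3pipelines")

set.seed(1)  # arbitrary seed, only for reproducibility of the sketch
data_example = smotefamily::sample_generator(1000, ratio = 0.80)
# Assumption: make sure the target is a factor, as classification tasks expect
data_example$result = factor(data_example$result)
task = TaskClassif$new(id = "example", backend = data_example, target = "result")

pop = po("smote")
trained = pop$train(list(task))[[1]]      # training adds synthetic minority rows
predicted = pop$predict(list(task))[[1]]  # prediction passes the task through

c(input = task$nrow, trained = trained$nrow, predicted = predicted$nrow)
```

The training output should have more rows than the input, while the prediction output should have the same number of rows as the input.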

The `$state` is a named `list` with the `$state` elements inherited from `PipeOpTaskPreproc`.

The parameters are the parameters inherited from `PipeOpTaskPreproc`, as well as:

`K` :: `numeric(1)`
The number of nearest neighbors used for sampling new values. See `SMOTE()`.

`dup_size` :: `numeric`
Desired times of synthetic minority instances over the original number of majority instances. See `SMOTE()`.
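
As a brief sketch of how these parameters might be set, the values can be passed through `param_vals` at construction or changed later through the parameter set. The specific values below (`K = 3`, `dup_size = 1`, `K = 7`) are arbitrary and chosen only for illustration. The example assumes `mlr3pipelines` is installed.

```r
library("mlr3pipelines")

# Set hyperparameters at construction via `param_vals`
pop = PipeOpSmote$new(param_vals = list(K = 3, dup_size = 1))
pop$param_set$values[c("K", "dup_size")]

# Change a hyperparameter on the existing object
pop$param_set$values$K = 7
pop$param_set$values$K
```

Values set this way are the ones handed on to `SMOTE()` when the PipeOp is trained.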

For details see:

Chawla, N., Bowyer, K., Hall, L. and Kegelmeyer, W. 2002. SMOTE: Synthetic minority oversampling technique. Journal of Artificial Intelligence Research. 16, 321-357.

Only fields inherited from `PipeOpTaskPreproc`/`PipeOp`.

Only methods inherited from `PipeOpTaskPreproc`/`PipeOp`.

Other PipeOps: `PipeOpEnsemble`, `PipeOpImpute`, `PipeOpTaskPreproc`, `PipeOp`, `mlr_pipeops_boxcox`, `mlr_pipeops_branch`, `mlr_pipeops_chunk`, `mlr_pipeops_classbalancing`, `mlr_pipeops_classifavg`, `mlr_pipeops_classweights`, `mlr_pipeops_colapply`, `mlr_pipeops_collapsefactors`, `mlr_pipeops_copy`, `mlr_pipeops_encodeimpact`, `mlr_pipeops_encodelmer`, `mlr_pipeops_encode`, `mlr_pipeops_featureunion`, `mlr_pipeops_filter`, `mlr_pipeops_fixfactors`, `mlr_pipeops_histbin`, `mlr_pipeops_ica`, `mlr_pipeops_imputehist`, `mlr_pipeops_imputemean`, `mlr_pipeops_imputemedian`, `mlr_pipeops_imputenewlvl`, `mlr_pipeops_imputesample`, `mlr_pipeops_kernelpca`, `mlr_pipeops_learner`, `mlr_pipeops_missind`, `mlr_pipeops_modelmatrix`, `mlr_pipeops_mutate`, `mlr_pipeops_nop`, `mlr_pipeops_pca`, `mlr_pipeops_quantilebin`, `mlr_pipeops_regravg`, `mlr_pipeops_removeconstants`, `mlr_pipeops_scalemaxabs`, `mlr_pipeops_scalerange`, `mlr_pipeops_scale`, `mlr_pipeops_select`, `mlr_pipeops_spatialsign`, `mlr_pipeops_subsample`, `mlr_pipeops_unbranch`, `mlr_pipeops_yeojohnson`, `mlr_pipeops`

    library("mlr3")
    library("mlr3pipelines")  # provides po() and PipeOpSmote

    # Create example task
    data_example = smotefamily::sample_generator(1000, ratio = 0.80)
    # Ensure the target is a factor, as classification tasks expect
    data_example$result = factor(data_example$result)
    task = TaskClassif$new(id = "example", backend = data_example, target = "result")
    task$data()
    #>       result        X1         X2
    #>    1:      n 0.2866815 0.45899585
    #>    2:      n 0.2682259 0.20055563
    #>    3:      n 0.8959587 0.28225572
    #>    4:      p 0.5282639 0.33736117
    #>    5:      n 0.9983724 0.13857375
    #>   ---
    #>  996:      n 0.2099023 0.88892125
    #>  997:      p 0.6939768 0.49183863
    #>  998:      n 0.9204687 0.05490594
    #>  999:      n 0.8816077 0.30730362
    #> 1000:      n 0.8660524 0.59036143

    # Class counts before oversampling
    table(task$data()$result)
    #>
    #>   n   p
    #> 768 232

    # Generate synthetic data for minority class
    pop = po("smote")
    smotedata = pop$train(list(task))[[1]]$data()
    table(smotedata$result)
    #>
    #>   n   p
    #> 768 696