Centers all numeric features to mean = 0 (if center parameter is TRUE) and scales them by dividing them by their root-mean-square (if scale parameter is TRUE).

The root-mean-square here is defined as sqrt(sum(x^2)/(length(x)-1)). If the center parameter is TRUE, this corresponds to the standard deviation as computed by sd().
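The relationship between this root-mean-square and sd() can be checked in a few lines of base R. The following is a minimal sketch for illustration, not the PipeOp's implementation; the helper name rms and the sample vector x are made up for this example.

```r
# Hypothetical helper: root-mean-square as defined above.
rms = function(x) sqrt(sum(x^2) / (length(x) - 1))

x = c(2, 4, 4, 4, 5, 5, 7, 9)

# After centering, the root-mean-square coincides with sd().
x_centered = x - mean(x)
all.equal(rms(x_centered), sd(x))

# Without centering, the two generally differ.
isTRUE(all.equal(rms(x), sd(x)))

# base::scale() uses the same definition when center = FALSE.
all.equal(
  attr(scale(x, center = FALSE), "scaled:scale"),
  rms(x),
  check.attributes = FALSE
)
```

With center = FALSE and scale = TRUE the divisor is therefore not a standard deviation, which is worth keeping in mind when interpreting the scaled values.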

Format

R6Class object inheriting from PipeOpTaskPreproc/PipeOp.

Construction

PipeOpScale$new(id = "scale", param_vals = list())

• id :: character(1)
  Identifier of resulting object, default "scale".
• param_vals :: named list
  List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list().

Input and Output Channels

Input and output channels are inherited from PipeOpTaskPreproc.

The output is the input Task with all affected numeric features centered and/or scaled.

State

The $state is a named list with the $state elements inherited from PipeOpTaskPreproc, as well as:

• center :: numeric
  The mean or median (depending on robust) of each numeric feature during training, or 0 if center is FALSE. Will be subtracted during the predict phase.
• scale :: numeric
  The value by which features are divided. This is 1 if scale is FALSE. If robust is FALSE, this is the root mean square, defined as sqrt(sum(x^2)/(length(x)-1)), of each feature, possibly after centering. If robust is TRUE, this is the median absolute deviation multiplied by 1.4826 (see stats::mad) of each feature, possibly after centering. This is 1 for features that are constant during training if center is TRUE, to avoid division by zero.

Parameters

The parameters are the parameters inherited from PipeOpTaskPreproc, as well as:

• center :: logical(1)
  Whether to center features, i.e. subtract their mean() from them. Default TRUE.
• scale :: logical(1)
  Whether to scale features, i.e. divide them by sqrt(sum(x^2)/(length(x)-1)). Default TRUE.
• robust :: logical(1)
  Whether to use robust scaling: instead of centering and scaling with the mean and standard deviation, the median and the median absolute deviation (mad) are used. Initialized to FALSE.

Internals

Uses the scale() function for robust = FALSE. For robust = TRUE, it instead subtracts the median and divides by mad.

Methods

Only methods inherited from PipeOpTaskPreproc/PipeOp.
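The robust = TRUE behaviour described above can be sketched in base R. This is only an illustration of the description, not the package source: it assumes centering and scaling are both enabled, that stats::mad is called with its default constant of 1.4826, and that a zero scale is replaced by 1. The function name robust_scale and the sample data are hypothetical.

```r
# Hypothetical helper illustrating robust centering and scaling of one feature.
robust_scale = function(x) {
  center = stats::median(x)
  # stats::mad() multiplies the median absolute deviation by 1.4826 by default.
  scale = stats::mad(x, center = center)
  # Assumption: a constant feature gets scale 1 to avoid division by zero.
  if (scale == 0) scale = 1
  list(values = (x - center) / scale, center = center, scale = scale)
}

x = c(1, 2, 3, 4, 100)  # one extreme outlier
res = robust_scale(x)
res$center  # median of x, not pulled towards the outlier
res$scale   # 1.4826 * median(abs(x - median(x)))
res$values

# A constant feature is centered to 0 and left undivided.
robust_scale(rep(5, 4))$values
```

Because the median and mad are barely affected by the single value 100, the remaining observations keep a sensible spread after scaling, which is the usual motivation for choosing the robust variant.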
See also

Other PipeOps: PipeOpEnsemble, PipeOpImpute, PipeOpTargetTrafo, PipeOpTaskPreprocSimple, PipeOpTaskPreproc, PipeOp, mlr_pipeops_boxcox, mlr_pipeops_branch, mlr_pipeops_chunk, mlr_pipeops_classbalancing, mlr_pipeops_classifavg, mlr_pipeops_classweights, mlr_pipeops_colapply, mlr_pipeops_collapsefactors, mlr_pipeops_colroles, mlr_pipeops_copy, mlr_pipeops_datefeatures, mlr_pipeops_encodeimpact, mlr_pipeops_encodelmer, mlr_pipeops_encode, mlr_pipeops_featureunion, mlr_pipeops_filter, mlr_pipeops_fixfactors, mlr_pipeops_histbin, mlr_pipeops_ica, mlr_pipeops_imputeconstant, mlr_pipeops_imputehist, mlr_pipeops_imputelearner, mlr_pipeops_imputemean, mlr_pipeops_imputemedian, mlr_pipeops_imputemode, mlr_pipeops_imputeoor, mlr_pipeops_imputesample, mlr_pipeops_kernelpca, mlr_pipeops_learner, mlr_pipeops_missind, mlr_pipeops_modelmatrix, mlr_pipeops_multiplicityexply, mlr_pipeops_multiplicityimply, mlr_pipeops_mutate, mlr_pipeops_nmf, mlr_pipeops_nop, mlr_pipeops_ovrsplit, mlr_pipeops_ovrunite, mlr_pipeops_pca, mlr_pipeops_proxy, mlr_pipeops_quantilebin, mlr_pipeops_randomprojection, mlr_pipeops_randomresponse, mlr_pipeops_regravg, mlr_pipeops_removeconstants, mlr_pipeops_renamecolumns, mlr_pipeops_replicate, mlr_pipeops_scalemaxabs, mlr_pipeops_scalerange, mlr_pipeops_select, mlr_pipeops_smote, mlr_pipeops_spatialsign, mlr_pipeops_subsample, mlr_pipeops_targetinvert, mlr_pipeops_targetmutate, mlr_pipeops_targettrafoscalerange, mlr_pipeops_textvectorizer, mlr_pipeops_threshold, mlr_pipeops_tunethreshold, mlr_pipeops_unbranch, mlr_pipeops_updatetarget, mlr_pipeops_vtreat, mlr_pipeops_yeojohnson, mlr_pipeops

Examples

library("mlr3")
library("mlr3pipelines")  # provides po()

task = tsk("iris")
pos = po("scale")

pos$train(list(task))[[1]]$data()
#>        Species Petal.Length Petal.Width Sepal.Length Sepal.Width
#>   1:    setosa   -1.3357516  -1.3110521  -0.89767388  1.01560199
#>   2:    setosa   -1.3357516  -1.3110521  -1.13920048 -0.13153881
#>   3:    setosa   -1.3923993  -1.3110521  -1.38072709  0.32731751
#>   4:    setosa   -1.2791040  -1.3110521  -1.50149039  0.09788935
#>   5:    setosa   -1.3357516  -1.3110521  -1.01843718  1.24503015
#>  ---
#> 146: virginica    0.8168591   1.4439941   1.03453895 -0.13153881
#> 147: virginica    0.7035638   0.9192234   0.55148575 -1.27867961
#> 148: virginica    0.8168591   1.0504160   0.79301235 -0.13153881
#> 149: virginica    0.9301544   1.4439941   0.43072244  0.78617383
#> 150: virginica    0.7602115   0.7880307   0.06843254 -0.13153881

one_line_of_iris = task$filter(13)

one_line_of_iris$data()
#>    Species Petal.Length Petal.Width Sepal.Length Sepal.Width
#> 1:  setosa          1.4         0.1          4.8           3

pos$predict(list(one_line_of_iris))[[1]]$data()
#>    Species Petal.Length Petal.Width Sepal.Length Sepal.Width
#> 1:  setosa    -1.335752   -1.442245    -1.259964  -0.1315388
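
The prediction above reuses the center and scale stored during training instead of recomputing them on the single row. The same idea can be sketched in base R with the built-in iris data and the attributes returned by scale(). This mimics the non-robust case only and is not the PipeOp's own code; the variable names are chosen for this example.

```r
features = iris[, c("Petal.Length", "Petal.Width", "Sepal.Length", "Sepal.Width")]

# "Training": compute and store the per-feature mean and standard deviation.
trained = scale(as.matrix(features))
center = attr(trained, "scaled:center")
scale_by = attr(trained, "scaled:scale")

# "Prediction": apply the stored values to row 13 only.
new_row = as.matrix(features[13, ])
predicted = sweep(sweep(new_row, 2, center, "-"), 2, scale_by, "/")
round(predicted, 6)
```

The result should agree with the corresponding row of the training output, since both use the statistics of all 150 training rows. Scaling the single row on its own would not work: a one-row sample has no spread to divide by.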