Centers all numeric features to mean = 0 (if center parameter is TRUE) and scales them by dividing them by their root-mean-square (if scale parameter is TRUE).

The root-mean-square here is defined as sqrt(sum(x^2)/(length(x)-1)). If the center parameter is TRUE, it is computed on the centered values and therefore equals the standard deviation, sd().
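
The relationship between this root-mean-square and sd() can be checked in base R. The following is a minimal sketch; the helper name rms and the sample vector are illustrative assumptions, not part of the package.

```r
# Root-mean-square as defined above, with the (n - 1) denominator.
# `rms` is a hypothetical helper written for this illustration.
rms <- function(x) sqrt(sum(x^2) / (length(x) - 1))

x <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Without centering, the RMS generally differs from the standard deviation.
rms(x)
sd(x)

# After centering, the two coincide (up to floating-point tolerance).
x_centered <- x - mean(x)
all.equal(rms(x_centered), sd(x))
```

When the data are not centered, the RMS also absorbs the offset of the mean from zero, which is why the two values differ in the first comparison.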

Format

R6Class object inheriting from PipeOpTaskPreproc/PipeOp.

Construction

PipeOpScale$new(id = "scale", param_vals = list())
  • id :: character(1)
    Identifier of resulting object, default "scale".

  • param_vals :: named list
    List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list().
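
As a hedged illustration of the constructor described above, the sketch below overrides one hyperparameter through param_vals. It assumes the mlr3pipelines package is installed; the id "scale_only" is an arbitrary name chosen for this example.

```r
library("mlr3pipelines")

# Construct the PipeOp directly, switching off centering via param_vals.
# The id "scale_only" is an illustrative choice, not a package default.
pos_no_center <- PipeOpScale$new(
  id = "scale_only",
  param_vals = list(center = FALSE)
)

# Inspect the identifier and the hyperparameter value that was set.
pos_no_center$id
pos_no_center$param_set$values$center
```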

Input and Output Channels

Input and output channels are inherited from PipeOpTaskPreproc.

The output is the input Task with all affected numeric features centered and/or scaled.

State

The $state is a named list with the $state elements inherited from PipeOpTaskPreproc, as well as:

  • center :: numeric
    The mean of each numeric feature during training, or 0 if center is FALSE. Will be subtracted during the predict phase.

  • scale :: numeric
    The root mean square, defined as sqrt(sum(x^2)/(length(x)-1)), of each feature during training, or 1 if scale is FALSE. During predict phase, features are divided by this.
    This is 1 for features that are constant during training if center is TRUE, to avoid division-by-zero.
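
The way the stored center and scale values are used can be sketched in base R for a single feature. This is an illustration of the logic described above under stated assumptions, not the package's implementation; fit_scaler and apply_scaler are hypothetical helper names.

```r
# Hypothetical helpers illustrating the state logic; not the package's code.
fit_scaler <- function(x, center = TRUE, scale = TRUE) {
  # Training mean, or 0 when centering is switched off.
  ctr <- if (center) mean(x) else 0
  # Root-mean-square of the (possibly centered) values, or 1 when scaling is off.
  scl <- if (scale) sqrt(sum((x - ctr)^2) / (length(x) - 1)) else 1
  # A feature that is constant during training gets scale 1
  # to avoid division by zero when centering is on.
  if (center && scl == 0) scl <- 1
  list(center = ctr, scale = scl)
}

# Predict phase: subtract the stored center, then divide by the stored scale.
apply_scaler <- function(x, state) (x - state$center) / state$scale

train_values <- c(4.8, 5.0, 5.2)
state <- fit_scaler(train_values)

# New data are transformed with the training statistics, not their own.
apply_scaler(5.4, state)

# A constant training feature yields scale 1.
fit_scaler(c(3, 3, 3))$scale
```

Reusing the training statistics at predict time keeps a single new observation on the same scale as the training data, which is what the example at the end of this page demonstrates with one row of iris.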

Parameters

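The parameters are the parameters inherited from PipeOpTaskPreproc, as well as:

  • center :: logical(1)
    Whether to center features by subtracting their mean.

  • scale :: logical(1)
    Whether to scale features by dividing them by their root-mean-square.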

Internals

Uses the scale() function.
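
For orientation, base R's scale() returns the transformed matrix together with the centering and scaling values as attributes. The sketch below shows those attributes on a small made-up matrix; how PipeOpScale reads them internally is an assumption and is not shown here.

```r
# Small illustrative matrix with two numeric columns.
m <- cbind(a = c(1, 2, 3, 4), b = c(10, 20, 30, 50))

scaled <- scale(m, center = TRUE, scale = TRUE)

# Column means that were subtracted.
attr(scaled, "scaled:center")

# Column scales that were divided by; with centering, these equal sd().
attr(scaled, "scaled:scale")
all.equal(attr(scaled, "scaled:scale"), apply(m, 2, sd))
```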

Methods

Only methods inherited from PipeOpTaskPreproc/PipeOp.

Examples

library("mlr3")
library("mlr3pipelines")  # provides po()

task = tsk("iris")
pos = po("scale")

# Train: learn center and scale from the full task and transform it.
pos$train(list(task))[[1]]$data()
#>        Species Petal.Length Petal.Width Sepal.Length Sepal.Width
#>   1:    setosa   -1.3357516  -1.3110521  -0.89767388  1.01560199
#>   2:    setosa   -1.3357516  -1.3110521  -1.13920048 -0.13153881
#>   3:    setosa   -1.3923993  -1.3110521  -1.38072709  0.32731751
#>   4:    setosa   -1.2791040  -1.3110521  -1.50149039  0.09788935
#>   5:    setosa   -1.3357516  -1.3110521  -1.01843718  1.24503015
#>  ---
#> 146: virginica    0.8168591   1.4439941   1.03453895 -0.13153881
#> 147: virginica    0.7035638   0.9192234   0.55148575 -1.27867961
#> 148: virginica    0.8168591   1.0504160   0.79301235 -0.13153881
#> 149: virginica    0.9301544   1.4439941   0.43072244  0.78617383
#> 150: virginica    0.7602115   0.7880307   0.06843254 -0.13153881

# filter() modifies the task in place, keeping only row 13.
one_line_of_iris = task$filter(13)
one_line_of_iris$data()
#>    Species Petal.Length Petal.Width Sepal.Length Sepal.Width
#> 1:  setosa          1.4         0.1          4.8           3

# Predict: the stored training center and scale are applied to the single row.
pos$predict(list(one_line_of_iris))[[1]]$data()
#>    Species Petal.Length Petal.Width Sepal.Length Sepal.Width
#> 1:  setosa    -1.335752   -1.442245    -1.259964  -0.1315388