Remove constant features from a mlr3::Task. For each feature, calculates the ratio of values which differ from the feature's mode value. All features with a ratio at or below a settable threshold are removed from the task. Missing values can be ignored or treated as a regular value distinct from non-missing values.
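The ratio described above can be sketched in base R. This is a minimal illustration only, not the package's implementation: the helper name `ratio_differing` is hypothetical, the numeric tolerances are left out, and the inclusive comparison is inferred from the default threshold of 0 removing constant features.

```r
# Hypothetical helper, not part of mlr3pipelines: share of values that differ
# from the most frequent value (the mode) of a vector.
ratio_differing = function(x, na_ignore = FALSE) {
  if (na_ignore) {
    x = x[!is.na(x)]  # drop missing values before counting
  }
  if (length(x) == 0) {
    return(0)  # nothing left that could differ from a mode
  }
  # useNA = "ifany" counts NA as its own value when it was not dropped above
  counts = table(x, useNA = "ifany")
  1 - max(counts) / length(x)
}

features = list(
  a = 1:10,
  b = rep(1, 10),
  c = rep(1:2, each = 5),
  d = c(rep(1, 9), NA)
)

threshold = 0  # assumed to correspond to the `ratio` parameter
ratios = vapply(features, ratio_differing, numeric(1))

# Features whose ratio exceeds the threshold are kept
keep = names(ratios)[ratios > threshold]
```

Here `d` is kept when the missing value counts as a distinct value, while `ratio_differing(features$d, na_ignore = TRUE)` is 0, so `d` would count as constant once missing values are ignored.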

Format

R6Class object inheriting from PipeOpTaskPreprocSimple/PipeOpTaskPreproc/PipeOp.

Construction

PipeOpRemoveConstants$new(id = "removeconstants", param_vals = list())

  • id :: character(1)
    Identifier of the resulting object, defaulting to "removeconstants".

  • param_vals :: named list
    List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list().
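As a short sketch of the constructor arguments listed above, the following builds the PipeOp with a custom id and initial hyperparameters. The id "drop_constants" and the chosen values are arbitrary illustrations, and the code assumes the mlr3pipelines package is installed.

```r
library("mlr3pipelines")

# Construct with a custom id; param_vals overrides the construction defaults
op = PipeOpRemoveConstants$new(
  id = "drop_constants",
  param_vals = list(ratio = 0.1, na_ignore = TRUE)
)

op$id
op$param_set$values$ratio
```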

State

$state is a named list with the $state elements inherited from PipeOpTaskPreproc, as well as:

  • features :: character
    Names of features that are kept. Features of types that this PipeOp cannot operate on are always kept.

Parameters

The parameters are the parameters inherited from PipeOpTaskPreproc, as well as:

  • ratio :: numeric(1)
    Ratio of values which must be different from the mode value in order to keep a feature in the task. Default is 0, which means only constant features with exactly one observed level are removed.

  • rel_tol :: numeric(1)
    Relative tolerance within which to consider a numeric feature constant. Set to 0 to disregard relative tolerance. Default is 1e-8.

  • abs_tol :: numeric(1)
    Absolute tolerance within which to consider a numeric feature constant. Set to 0 to disregard absolute tolerance. Default is 1e-8.

  • na_ignore :: logical(1)
    If TRUE, the ratio is calculated after removing all missing values first. Default is FALSE.
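The parameters above can be set at construction through `po()` or changed afterwards through the parameter set. The sketch below assumes mlr3 and mlr3pipelines are installed; the values are chosen only for illustration. With `ratio = 0.5`, feature `c` (half of whose values differ from the mode) would be expected to be dropped together with `b`, provided the threshold comparison is inclusive as the default of 0 suggests.

```r
library("mlr3")
library("mlr3pipelines")

data = data.table::data.table(
  y = runif(10), a = 1:10, b = rep(1, 10), c = rep(1:2, each = 5)
)
task = TaskRegr$new("example", data, target = "y")

# Set a parameter at construction ...
op = po("removeconstants", ratio = 0.5)
# ... or change one afterwards through the parameter set
op$param_set$values$na_ignore = TRUE

# Names of the features that remain after training
kept = op$train(list(task))[[1]]$feature_names
kept
```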

Fields

Fields inherited from PipeOpTaskPreproc/PipeOp.

Methods

Methods inherited from PipeOpTaskPreprocSimple/PipeOpTaskPreproc/PipeOp.

Examples

library("mlr3")
library("mlr3pipelines")  # provides po()

# y is random, so its values differ between runs; b is constant
data = data.table::data.table(y = runif(10), a = 1:10, b = rep(1, 10), c = rep(1:2, each = 5))
task = TaskRegr$new("example", data, target = "y")

po = po("removeconstants")
po$train(list(task = task))[[1]]$data()
#>              y  a c
#>  1: 0.41117738  1 1
#>  2: 0.64798216  2 1
#>  3: 0.31861969  3 1
#>  4: 0.78915578  4 1
#>  5: 0.72116796  5 1
#>  6: 0.49786710  6 2
#>  7: 0.99099822  7 2
#>  8: 0.09698042  8 2
#>  9: 0.41336562  9 2
#> 10: 0.22938504 10 2

# The state records which features were kept
po$state
#> $features
#> [1] "a" "c"
#>
#> $affected_cols
#> [1] "a" "b" "c"
#>
#> $intasklayout
#>    id    type
#> 1:  a integer
#> 2:  b numeric
#> 3:  c integer
#>
#> $outtasklayout
#>    id    type
#> 1:  a integer
#> 2:  c integer
#>