Remove constant features from an mlr3::Task. For each feature, calculates the ratio of values which differ from that feature's mode value. All features with a ratio below a settable threshold are removed from the task. Missing values can be ignored or treated as a regular value distinct from non-missing values.
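The per-feature quantity described above can be sketched in base R. This is only an illustration: `frac_not_mode` is a hypothetical helper, not a function of mlr3pipelines, and treating an all-missing column as constant is an assumption made here for simplicity.

```r
# Hypothetical helper (not part of mlr3pipelines): fraction of values in `x`
# that differ from the most frequent value (the mode).
frac_not_mode = function(x, na_ignore = TRUE) {
  if (na_ignore) {
    x = x[!is.na(x)]  # drop missing values before counting
  }
  if (length(x) == 0) {
    return(0)  # assumption: an empty or all-missing column counts as constant
  }
  # useNA = "ifany" counts NA as its own level when missing values are kept
  counts = table(x, useNA = "ifany")
  1 - max(counts) / length(x)
}

constant  = rep(1, 10)            # a single observed value
two_level = rep(1:2, each = 5)    # two values, evenly split
with_na   = c(1, 1, 1, NA, 1)     # constant apart from one missing value

frac_not_mode(constant)                     # 0
frac_not_mode(two_level)                    # 0.5
frac_not_mode(with_na, na_ignore = TRUE)    # 0
frac_not_mode(with_na, na_ignore = FALSE)   # 0.2
```

A feature whose fraction is small is a candidate for removal; the last two calls show how ignoring missing values makes a column count as constant that would otherwise carry two distinct values.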
Format
R6Class object inheriting from PipeOpTaskPreprocSimple/PipeOpTaskPreproc/PipeOp.
Construction
id :: character(1)
Identifier of the resulting object, defaulting to "removeconstants".
param_vals :: named list
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list().
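A minimal construction sketch follows. It assumes the R6 generator is exported as PipeOpRemoveConstants (the class behind the "removeconstants" key) and that both mlr3 and mlr3pipelines are installed. The id "drop_constants" and the parameter values are illustrative choices only.

```r
library("mlr3")
library("mlr3pipelines")

# Construct through the R6 generator, using the two arguments listed above.
op = PipeOpRemoveConstants$new(
  id = "drop_constants",                               # illustrative id
  param_vals = list(ratio = 0.05, na_ignore = FALSE)   # illustrative values
)

op$id
op$param_set$values[c("ratio", "na_ignore")]

# Short form through the po() dictionary helper; hyperparameters are passed by name.
op2 = po("removeconstants", ratio = 0.05, na_ignore = FALSE)
```

Values given in param_vals replace the initial hyperparameter values; anything not listed keeps the value it is initialized to.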
State
$state is a named list with the $state elements inherited from PipeOpTaskPreproc, as well as:
features :: character()
Names of features that are being kept. Features of types that the Filter cannot operate on are always kept.
Parameters
The parameters are the parameters inherited from PipeOpTaskPreproc, as well as:
ratio :: numeric(1)
Ratio of values which must be different from the mode value in order to keep a feature in the task. Initialized to 0, which means only constant features with exactly one observed level are removed.
rel_tol :: numeric(1)
Relative tolerance within which to consider a numeric feature constant. Set to 0 to disregard relative tolerance. Initialized to 1e-8.
abs_tol :: numeric(1)
Absolute tolerance within which to consider a numeric feature constant. Set to 0 to disregard absolute tolerance. Initialized to 1e-8.
na_ignore :: logical(1)
If TRUE, the ratio is calculated after removing all missing values first, so a column can be "constant" even if some but not all values are NA. Initialized to TRUE.
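The interplay of the two tolerances with na_ignore can be illustrated in base R. This is one plausible reading, not the package's implementation: `is_numerically_constant` is a hypothetical helper, the median is an arbitrary reference value, and the rule "within abs_tol, or within rel_tol relative to the reference" is an assumption. The sketch also leaves out ratio entirely.

```r
# Hypothetical helper (not the package's code): treat a numeric vector as
# constant when every value lies close to a reference value (here the median).
# Assumption: "close" means within abs_tol of the reference, or within
# rel_tol relative to the magnitude of the reference.
is_numerically_constant = function(x, rel_tol = 1e-8, abs_tol = 1e-8,
                                   na_ignore = TRUE) {
  if (anyNA(x)) {
    if (!na_ignore) {
      # NA is treated as a regular value: constant only if every entry is NA
      return(all(is.na(x)))
    }
    x = x[!is.na(x)]
  }
  if (length(x) == 0) {
    return(TRUE)  # assumption: nothing left to differ
  }
  ref = stats::median(x)
  diffs = abs(x - ref)
  all(diffs <= abs_tol | diffs <= rel_tol * abs(ref))
}

is_numerically_constant(c(1, 1 + 1e-10, 1))               # TRUE: within abs_tol
is_numerically_constant(c(1, 1.001, 1))                   # FALSE
is_numerically_constant(c(1e12, 1e12 + 1), abs_tol = 0)   # TRUE: within rel_tol
is_numerically_constant(c(1e12, 1e12 + 1), rel_tol = 0)   # FALSE
is_numerically_constant(c(1, NA, 1))                      # TRUE: NA ignored
is_numerically_constant(c(1, NA, 1), na_ignore = FALSE)   # FALSE
```

Setting either tolerance to 0 leaves only the other one in effect, which mirrors the "set to 0 to disregard" wording in the parameter descriptions.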
Fields
Fields inherited from PipeOpTaskPreproc/PipeOp.
Methods
Methods inherited from PipeOpTaskPreprocSimple/PipeOpTaskPreproc/PipeOp.
See also
https://mlr-org.com/pipeops.html
Other PipeOps:
PipeOp, PipeOpEnsemble, PipeOpImpute, PipeOpTargetTrafo, PipeOpTaskPreproc, PipeOpTaskPreprocSimple, mlr_pipeops, mlr_pipeops_adas, mlr_pipeops_blsmote, mlr_pipeops_boxcox, mlr_pipeops_branch, mlr_pipeops_chunk, mlr_pipeops_classbalancing, mlr_pipeops_classifavg, mlr_pipeops_classweights, mlr_pipeops_colapply, mlr_pipeops_collapsefactors, mlr_pipeops_colroles, mlr_pipeops_copy, mlr_pipeops_datefeatures, mlr_pipeops_decode, mlr_pipeops_encode, mlr_pipeops_encodeimpact, mlr_pipeops_encodelmer, mlr_pipeops_featureunion, mlr_pipeops_filter, mlr_pipeops_fixfactors, mlr_pipeops_histbin, mlr_pipeops_ica, mlr_pipeops_imputeconstant, mlr_pipeops_imputehist, mlr_pipeops_imputelearner, mlr_pipeops_imputemean, mlr_pipeops_imputemedian, mlr_pipeops_imputemode, mlr_pipeops_imputeoor, mlr_pipeops_imputesample, mlr_pipeops_kernelpca, mlr_pipeops_learner, mlr_pipeops_learner_pi_cvplus, mlr_pipeops_learner_quantiles, mlr_pipeops_missind, mlr_pipeops_modelmatrix, mlr_pipeops_multiplicityexply, mlr_pipeops_multiplicityimply, mlr_pipeops_mutate, mlr_pipeops_nearmiss, mlr_pipeops_nmf, mlr_pipeops_nop, mlr_pipeops_ovrsplit, mlr_pipeops_ovrunite, mlr_pipeops_pca, mlr_pipeops_proxy, mlr_pipeops_quantilebin, mlr_pipeops_randomprojection, mlr_pipeops_randomresponse, mlr_pipeops_regravg, mlr_pipeops_renamecolumns, mlr_pipeops_replicate, mlr_pipeops_rowapply, mlr_pipeops_scale, mlr_pipeops_scalemaxabs, mlr_pipeops_scalerange, mlr_pipeops_select, mlr_pipeops_smote, mlr_pipeops_smotenc, mlr_pipeops_spatialsign, mlr_pipeops_subsample, mlr_pipeops_targetinvert, mlr_pipeops_targetmutate, mlr_pipeops_targettrafoscalerange, mlr_pipeops_textvectorizer, mlr_pipeops_threshold, mlr_pipeops_tomek, mlr_pipeops_tunethreshold, mlr_pipeops_unbranch, mlr_pipeops_updatetarget, mlr_pipeops_vtreat, mlr_pipeops_yeojohnson
Examples
library("mlr3")
library("mlr3pipelines")  # provides po()
data = data.table::data.table(y = runif(10), a = 1:10, b = rep(1, 10), c = rep(1:2, each = 5))
task = TaskRegr$new("example", data, target = "y")
po = po("removeconstants")
po$train(list(task = task))[[1]]$data()
#>              y     a     c
#>          <num> <int> <int>
#>  1: 0.84982510     1     1
#>  2: 0.33041754     2     1
#>  3: 0.36344250     3     1
#>  4: 0.76530451     4     1
#>  5: 0.54021206     5     1
#>  6: 0.06709176     6     2
#>  7: 0.02093449     7     2
#>  8: 0.80938970     8     2
#>  9: 0.60619806     9     2
#> 10: 0.81683238    10     2
po$state
#> $features
#> [1] "a" "c"
#>
#> $affected_cols
#> [1] "a" "b" "c"
#>
#> $intasklayout
#> Key: <id>
#>        id    type
#>    <char>  <char>
#> 1:      a integer
#> 2:      b numeric
#> 3:      c integer
#>
#> $outtasklayout
#> Key: <id>
#>        id    type
#>    <char>  <char>
#> 1:      a integer
#> 2:      c integer
#>
#> $outtaskshell
#> Empty data.table (0 rows and 3 cols): y,a,c
#>