Adds features according to expressions given as formulas that may depend on values of other features. This can add new features, or can change existing features.
Format
R6Class object inheriting from PipeOpTaskPreprocSimple/PipeOpTaskPreproc/PipeOp.
Construction
id :: character(1)
Identifier of resulting object, default "mutate".
param_vals :: named list
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list().
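As a brief illustration of the two construction arguments, the sketch below builds the PipeOp through its R6 generator. It assumes the generator is exported as PipeOpMutate by mlr3pipelines; the id "add_ratio" and the feature name Sepal.Ratio are made up for this example.

```r
library("mlr3")
library("mlr3pipelines")

# Construct with a custom id and set the mutation hyperparameter up front.
# "add_ratio" and "Sepal.Ratio" are illustrative names, not package defaults.
pom = PipeOpMutate$new(
  id = "add_ratio",
  param_vals = list(
    mutation = list(Sepal.Ratio = ~ Sepal.Length / Sepal.Width)
  )
)

# The id and the hyperparameter values are available on the object
pom$id
names(pom$param_set$values$mutation)
```

Setting hyperparameters through param_vals at construction should be equivalent to assigning pom$param_set$values$mutation afterwards, as the Examples section does.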
Input and Output Channels
Input and output channels are inherited from PipeOpTaskPreproc.
The output is the input Task with added and/or mutated features according to the mutation parameter.
State
The $state is a named list with the $state elements inherited from PipeOpTaskPreproc.
Parameters
The parameters are the parameters inherited from PipeOpTaskPreproc, as well as:
mutation :: named list of formula
Expressions for new features to create (or present features to change), in the form of formula. Each element of the list is a formula with the name of the element naming the feature to create or change, and the formula expression determining the result. This expression may reference other features, as well as variables visible at the creation of the formula (see examples). Initialized to list().
delete_originals :: logical(1)
Whether to delete original features. Even when this is FALSE, present features may still be overwritten. Initialized to FALSE.
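The effect of delete_originals can be sketched as follows. This is a minimal example under the assumption that the iris task from mlr3 is available; the feature name Sepal.Area is chosen for illustration.

```r
library("mlr3")
library("mlr3pipelines")

pom = po("mutate")
pom$param_set$values$mutation = list(
  Sepal.Area = ~ Sepal.Width * Sepal.Length
)
# Drop the original features and keep only what the formulas create
pom$param_set$values$delete_originals = TRUE

out = pom$train(list(tsk("iris")))[[1]]
out$feature_names
```

With delete_originals set to TRUE, the original iris features should no longer appear among the feature names, while the target column Species is unaffected because it is not a feature.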
Internals
A formula created using the ~ operator always contains a reference to the environment in which
the formula is created. This makes it possible for the ~-expressions to reference both
column names of the Task and variables that are visible in that environment.
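The environment capture described above can be reproduced in base R alone. The sketch below is not the package's implementation; make_formula, offset, and the column x are hypothetical names used only to show how a formula's right-hand side can be evaluated against both a data frame and the formula's own environment.

```r
# A formula remembers the environment in which it was created.
make_formula = function(offset) {
  # `offset` lives only inside this function call
  ~ x + offset
}

frm = make_formula(10)
data = data.frame(x = c(1, 2, 3))

# For a one-sided formula, frm[[2]] is the right-hand side expression.
# Names are looked up in `data` first, then in the formula's environment.
result = eval(frm[[2]], envir = data, enclos = environment(frm))
result
```

Here x is resolved as a column of data, while offset is resolved through environment(frm), even though offset is not visible at the place where eval() is called.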
Note that the formulas in mutation are evaluated sequentially. This allows a formula to use
features that were constructed during evaluation of a previous formula. However, if existing
features are changed, precedence is given to the original ones before the newly constructed ones.
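Sequential evaluation can be sketched in base R as a loop that makes each result available to the formulas after it. This is an assumption-laden illustration, not the package's code: evaluate_sequentially and the columns w and h are hypothetical, and the sketch does not model the precedence rule for changed features, so the example avoids name clashes with existing columns.

```r
# Evaluate a named list of one-sided formulas in order.
# Each result is stored under its name so later formulas can use it.
evaluate_sequentially = function(data, mutation) {
  columns = as.list(data)
  for (name in names(mutation)) {
    frm = mutation[[name]]
    columns[[name]] = eval(frm[[2]], envir = columns, enclos = environment(frm))
  }
  as.data.frame(columns)
}

data = data.frame(w = c(2, 3), h = c(4, 5))
res = evaluate_sequentially(data, list(
  area = ~ w * h,            # built from original columns
  double_area = ~ area * 2   # built from the result of the previous formula
))
res
```

Reversing the order of the two formulas would fail in this sketch, because area would not yet exist when double_area is evaluated.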
Fields
Only fields inherited from PipeOp.
Methods
Only methods inherited from PipeOpTaskPreprocSimple/PipeOpTaskPreproc/PipeOp.
See also
https://mlr-org.com/pipeops.html
Other PipeOps:
PipeOp,
PipeOpEncodePL,
PipeOpEnsemble,
PipeOpImpute,
PipeOpTargetTrafo,
PipeOpTaskPreproc,
PipeOpTaskPreprocSimple,
mlr_pipeops,
mlr_pipeops_adas,
mlr_pipeops_blsmote,
mlr_pipeops_boxcox,
mlr_pipeops_branch,
mlr_pipeops_chunk,
mlr_pipeops_classbalancing,
mlr_pipeops_classifavg,
mlr_pipeops_classweights,
mlr_pipeops_colapply,
mlr_pipeops_collapsefactors,
mlr_pipeops_colroles,
mlr_pipeops_copy,
mlr_pipeops_datefeatures,
mlr_pipeops_decode,
mlr_pipeops_encode,
mlr_pipeops_encodeimpact,
mlr_pipeops_encodelmer,
mlr_pipeops_encodeplquantiles,
mlr_pipeops_encodepltree,
mlr_pipeops_featureunion,
mlr_pipeops_filter,
mlr_pipeops_fixfactors,
mlr_pipeops_histbin,
mlr_pipeops_ica,
mlr_pipeops_imputeconstant,
mlr_pipeops_imputehist,
mlr_pipeops_imputelearner,
mlr_pipeops_imputemean,
mlr_pipeops_imputemedian,
mlr_pipeops_imputemode,
mlr_pipeops_imputeoor,
mlr_pipeops_imputesample,
mlr_pipeops_kernelpca,
mlr_pipeops_learner,
mlr_pipeops_learner_pi_cvplus,
mlr_pipeops_learner_quantiles,
mlr_pipeops_missind,
mlr_pipeops_modelmatrix,
mlr_pipeops_multiplicityexply,
mlr_pipeops_multiplicityimply,
mlr_pipeops_nearmiss,
mlr_pipeops_nmf,
mlr_pipeops_nop,
mlr_pipeops_ovrsplit,
mlr_pipeops_ovrunite,
mlr_pipeops_pca,
mlr_pipeops_proxy,
mlr_pipeops_quantilebin,
mlr_pipeops_randomprojection,
mlr_pipeops_randomresponse,
mlr_pipeops_regravg,
mlr_pipeops_removeconstants,
mlr_pipeops_renamecolumns,
mlr_pipeops_replicate,
mlr_pipeops_rowapply,
mlr_pipeops_scale,
mlr_pipeops_scalemaxabs,
mlr_pipeops_scalerange,
mlr_pipeops_select,
mlr_pipeops_smote,
mlr_pipeops_smotenc,
mlr_pipeops_spatialsign,
mlr_pipeops_subsample,
mlr_pipeops_targetinvert,
mlr_pipeops_targetmutate,
mlr_pipeops_targettrafoscalerange,
mlr_pipeops_textvectorizer,
mlr_pipeops_threshold,
mlr_pipeops_tomek,
mlr_pipeops_tunethreshold,
mlr_pipeops_unbranch,
mlr_pipeops_updatetarget,
mlr_pipeops_vtreat,
mlr_pipeops_yeojohnson
Examples
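library("mlr3")
library("mlr3pipelines")  # provides po()

# Variable from the calling environment, referenced in a formula below
constant = 1
pom = po("mutate")
pom$param_set$values$mutation = list(
  Sepal.Length_plus_constant = ~ Sepal.Length + constant,
  Sepal.Area = ~ Sepal.Width * Sepal.Length,
  Petal.Area = ~ Petal.Width * Petal.Length,
  # Uses the two features created by the formulas above
  Sepal.Area_plus_Petal.Area = ~ Sepal.Area + Petal.Area
)
pom$train(list(tsk("iris")))[[1]]$data()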
#>        Species Petal.Length Petal.Width Sepal.Length Sepal.Width
#>         <fctr>        <num>       <num>        <num>       <num>
#>   1:    setosa          1.4         0.2          5.1         3.5
#>   2:    setosa          1.4         0.2          4.9         3.0
#>   3:    setosa          1.3         0.2          4.7         3.2
#>   4:    setosa          1.5         0.2          4.6         3.1
#>   5:    setosa          1.4         0.2          5.0         3.6
#>  ---
#> 146: virginica          5.2         2.3          6.7         3.0
#> 147: virginica          5.0         1.9          6.3         2.5
#> 148: virginica          5.2         2.0          6.5         3.0
#> 149: virginica          5.4         2.3          6.2         3.4
#> 150: virginica          5.1         1.8          5.9         3.0
#>      Sepal.Length_plus_constant Sepal.Area Petal.Area
#>                           <num>      <num>      <num>
#>   1:                        6.1      17.85       0.28
#>   2:                        5.9      14.70       0.28
#>   3:                        5.7      15.04       0.26
#>   4:                        5.6      14.26       0.30
#>   5:                        6.0      18.00       0.28
#>  ---
#> 146:                        7.7      20.10      11.96
#> 147:                        7.3      15.75       9.50
#> 148:                        7.5      19.50      10.40
#> 149:                        7.2      21.08      12.42
#> 150:                        6.9      17.70       9.18
#>      Sepal.Area_plus_Petal.Area
#>                           <num>
#>   1:                      18.13
#>   2:                      14.98
#>   3:                      15.30
#>   4:                      14.56
#>   5:                      18.28
#>  ---
#> 146:                      32.06
#> 147:                      25.25
#> 148:                      29.90
#> 149:                      33.50
#> 150:                      26.88
