Applies a Box-Cox transformation to numeric features. The lambda parameter
of the transformation is estimated during training, and the same estimate is
then used to transform both the training data and the prediction data.
See bestNormalize::boxcox() for details.
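As a reminder of what is being estimated, the standard Box-Cox transform of a strictly positive value $x$ with parameter $\lambda$ can be written as below. This is the textbook definition rather than something copied from the package source; treating `eps` as a threshold on $|\lambda|$ is an interpretation of the `eps` parameter described in the Parameters section.

```latex
\[
x^{(\lambda)} =
\begin{cases}
\dfrac{x^{\lambda} - 1}{\lambda}, & |\lambda| \ge \mathtt{eps} \\[6pt]
\log x, & |\lambda| < \mathtt{eps}
\end{cases}
\]
```

With standardize set to TRUE, the transformed values are additionally centered and scaled using the mean and standard deviation obtained during training.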
Format
R6Class object inheriting from PipeOpTaskPreproc/PipeOp.
Construction
id::character(1)
Identifier of resulting object, default "boxcox".
param_vals::named list
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list().
Input and Output Channels
Input and output channels are inherited from PipeOpTaskPreproc.
The output is the input Task with all affected numeric features replaced by their transformed versions.
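The train/predict behaviour described above can be sketched as follows. This is a minimal illustration, assuming the mlr3, mlr3pipelines and bestNormalize packages are installed; the odd/even row split and the variable names are arbitrary choices made for the example.

```r
library("mlr3")
library("mlr3pipelines")

task = tsk("iris")
train_ids = seq(1, task$nrow, by = 2)        # odd rows for training (arbitrary split)
test_ids = setdiff(task$row_ids, train_ids)  # remaining rows for prediction

pop = po("boxcox")

# Training estimates one lambda per numeric feature and returns the transformed Task
train_out = pop$train(list(task$clone()$filter(train_ids)))[[1]]

# Prediction reuses the estimates stored in pop$state
test_out = pop$predict(list(task$clone()$filter(test_ids)))[[1]]

# Same feature names as the input; the numeric columns hold transformed values
test_out$feature_names
```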
State
The $state is a named list with the $state elements inherited from PipeOpTaskPreproc,
as well as an element bc: a named list holding one fitted object of class boxcox for each transformed column.
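To inspect the fitted transformations after training, something like the following may be useful. It is a sketch that assumes the per-column objects live under `$state$bc` (as in the example output at the end of this page) and that each one exposes its estimate as a `lambda` element, as objects returned by bestNormalize::boxcox() are expected to.

```r
library("mlr3")
library("mlr3pipelines")

task = tsk("iris")
pop = po("boxcox")
pop$train(list(task))

# One fitted boxcox object per transformed column, keyed by column name
names(pop$state$bc)

# Collect the estimated lambda of every column into a named numeric vector
lambdas = vapply(pop$state$bc, function(fit) fit$lambda, numeric(1))
lambdas
```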
Parameters
The parameters are the parameters inherited from PipeOpTaskPreproc, as well as:
standardize::logical(1)
Whether to center and scale the transformed values to attempt a standard normal distribution. For details see boxcox().
eps::numeric(1)
Tolerance parameter to identify if lambda parameter is equal to zero. For details see boxcox().
lower::numeric(1)
Lower value for estimation of lambda parameter. For details see boxcox().
upper::numeric(1)
Upper value for estimation of lambda parameter. For details see boxcox().
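A short sketch of how these hyperparameters might be set, either at construction through po() or afterwards through the parameter set. The specific values are illustrative only and are not recommendations.

```r
library("mlr3")
library("mlr3pipelines")

# Set hyperparameters at construction time
pop = po("boxcox", standardize = FALSE, lower = -2, upper = 2)

# Or change them later through the parameter set
pop$param_set$values$eps = 0.01

# The search for lambda is now restricted to [lower, upper],
# and the output is left on the unstandardized Box-Cox scale
out = pop$train(list(tsk("iris")))[[1]]
head(out$data())
```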
Internals
Uses the bestNormalize::boxcox function.
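For orientation, the underlying function can also be called directly on a single numeric vector. The sketch below assumes the bestNormalize package is installed and that its predict() method accepts `newdata` and `inverse` arguments; it is not meant to mirror exactly what the PipeOp does internally.

```r
x = iris$Sepal.Length

# Estimate lambda on x and standardize the transformed values
fit = bestNormalize::boxcox(x, standardize = TRUE)
fit$lambda

# Apply the fitted transformation to new values
z = predict(fit, newdata = c(5.0, 6.0))

# Map the transformed values back to the original scale
x_back = predict(fit, newdata = z, inverse = TRUE)
x_back
```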
Fields
Only fields inherited from PipeOp.
Methods
Only methods inherited from PipeOpTaskPreproc/PipeOp.
See also
https://mlr-org.com/pipeops.html
Other PipeOps:
PipeOp,
PipeOpEncodePL,
PipeOpEnsemble,
PipeOpImpute,
PipeOpTargetTrafo,
PipeOpTaskPreproc,
PipeOpTaskPreprocSimple,
mlr_pipeops,
mlr_pipeops_adas,
mlr_pipeops_blsmote,
mlr_pipeops_branch,
mlr_pipeops_chunk,
mlr_pipeops_classbalancing,
mlr_pipeops_classifavg,
mlr_pipeops_classweights,
mlr_pipeops_colapply,
mlr_pipeops_collapsefactors,
mlr_pipeops_colroles,
mlr_pipeops_copy,
mlr_pipeops_datefeatures,
mlr_pipeops_decode,
mlr_pipeops_encode,
mlr_pipeops_encodeimpact,
mlr_pipeops_encodelmer,
mlr_pipeops_encodeplquantiles,
mlr_pipeops_encodepltree,
mlr_pipeops_featureunion,
mlr_pipeops_filter,
mlr_pipeops_fixfactors,
mlr_pipeops_histbin,
mlr_pipeops_ica,
mlr_pipeops_imputeconstant,
mlr_pipeops_imputehist,
mlr_pipeops_imputelearner,
mlr_pipeops_imputemean,
mlr_pipeops_imputemedian,
mlr_pipeops_imputemode,
mlr_pipeops_imputeoor,
mlr_pipeops_imputesample,
mlr_pipeops_info,
mlr_pipeops_isomap,
mlr_pipeops_kernelpca,
mlr_pipeops_learner,
mlr_pipeops_learner_pi_cvplus,
mlr_pipeops_learner_quantiles,
mlr_pipeops_missind,
mlr_pipeops_modelmatrix,
mlr_pipeops_multiplicityexply,
mlr_pipeops_multiplicityimply,
mlr_pipeops_mutate,
mlr_pipeops_nearmiss,
mlr_pipeops_nmf,
mlr_pipeops_nop,
mlr_pipeops_ovrsplit,
mlr_pipeops_ovrunite,
mlr_pipeops_pca,
mlr_pipeops_proxy,
mlr_pipeops_quantilebin,
mlr_pipeops_randomprojection,
mlr_pipeops_randomresponse,
mlr_pipeops_regravg,
mlr_pipeops_removeconstants,
mlr_pipeops_renamecolumns,
mlr_pipeops_replicate,
mlr_pipeops_rowapply,
mlr_pipeops_scale,
mlr_pipeops_scalemaxabs,
mlr_pipeops_scalerange,
mlr_pipeops_select,
mlr_pipeops_smote,
mlr_pipeops_smotenc,
mlr_pipeops_spatialsign,
mlr_pipeops_subsample,
mlr_pipeops_targetinvert,
mlr_pipeops_targetmutate,
mlr_pipeops_targettrafoscalerange,
mlr_pipeops_textvectorizer,
mlr_pipeops_threshold,
mlr_pipeops_tomek,
mlr_pipeops_tunethreshold,
mlr_pipeops_unbranch,
mlr_pipeops_updatetarget,
mlr_pipeops_vtreat,
mlr_pipeops_yeojohnson
Examples
library("mlr3")
library("mlr3pipelines")  # provides po()
task = tsk("iris")
pop = po("boxcox")
task$data()
#> Species Petal.Length Petal.Width Sepal.Length Sepal.Width
#> <fctr> <num> <num> <num> <num>
#> 1: setosa 1.4 0.2 5.1 3.5
#> 2: setosa 1.4 0.2 4.9 3.0
#> 3: setosa 1.3 0.2 4.7 3.2
#> 4: setosa 1.5 0.2 4.6 3.1
#> 5: setosa 1.4 0.2 5.0 3.6
#> ---
#> 146: virginica 5.2 2.3 6.7 3.0
#> 147: virginica 5.0 1.9 6.3 2.5
#> 148: virginica 5.2 2.0 6.5 3.0
#> 149: virginica 5.4 2.3 6.2 3.4
#> 150: virginica 5.1 1.8 5.9 3.0
pop$train(list(task))[[1]]$data()
#> Species Petal.Length Petal.Width Sepal.Length Sepal.Width
#> <fctr> <num> <num> <num> <num>
#> 1: setosa -1.3431567 -1.3850773 -0.8917547 1.01831791
#> 2: setosa -1.3431567 -1.3850773 -1.1812229 -0.08167295
#> 3: setosa -1.4033413 -1.3850773 -1.4845435 0.37307046
#> 4: setosa -1.2832670 -1.3850773 -1.6417967 0.14833599
#> 5: setosa -1.3431567 -1.3850773 -1.0348319 1.22454068
#> ---
#> 146: virginica 0.8174171 1.2930924 1.0385560 -0.08167295
#> 147: virginica 0.7075555 0.9020852 0.6097200 -1.32264877
#> 148: virginica 0.8174171 1.0023867 0.8279148 -0.08167295
#> 149: virginica 0.9269887 1.2930924 0.4976284 0.80781419
#> 150: virginica 0.7625234 0.7998822 0.1485189 -0.08167295
pop$state
#> $bc
#> $bc$Petal.Length
#> Standardized Box Cox Transformation with 150 nonmissing obs.:
#> Estimated statistics:
#> - lambda = 0.931286
#> - mean (before standardization) = 2.58137
#> - sd (before standardization) = 1.627669
#>
#> $bc$Petal.Width
#> Standardized Box Cox Transformation with 150 nonmissing obs.:
#> Estimated statistics:
#> - lambda = 0.6433629
#> - mean (before standardization) = 0.08586719
#> - sd (before standardization) = 0.7857394
#>
#> $bc$Sepal.Length
#> Standardized Box Cox Transformation with 150 nonmissing obs.:
#> Estimated statistics:
#> - lambda = -0.144751
#> - mean (before standardization) = 1.549011
#> - sd (before standardization) = 0.1094848
#>
#> $bc$Sepal.Width
#> Standardized Box Cox Transformation with 150 nonmissing obs.:
#> Estimated statistics:
#> - lambda = 0.2810121
#> - mean (before standardization) = 1.30301
#> - sd (before standardization) = 0.1950175
#>
#>
#> $dt_columns
#> [1] "Petal.Length" "Petal.Width" "Sepal.Length" "Sepal.Width"
#>
#> $affected_cols
#> [1] "Petal.Length" "Petal.Width" "Sepal.Length" "Sepal.Width"
#>
#> $intasklayout
#> Key: <id>
#> id type
#> <char> <char>
#> 1: Petal.Length numeric
#> 2: Petal.Width numeric
#> 3: Sepal.Length numeric
#> 4: Sepal.Width numeric
#>
#> $outtasklayout
#> Key: <id>
#> id type
#> <char> <char>
#> 1: Petal.Length numeric
#> 2: Petal.Width numeric
#> 3: Sepal.Length numeric
#> 4: Sepal.Width numeric
#>
#> $outtaskshell
#> Empty data.table (0 rows and 5 cols): Species,Petal.Length,Petal.Width,Sepal.Length,Sepal.Width
#>
