Skip to contents

Splits numeric features into quantile bins.

Format

R6Class object inheriting from PipeOpTaskPreprocSimple/PipeOpTaskPreproc/PipeOp.

Construction

PipeOpQuantileBin$new(id = "quantilebin", param_vals = list())

  • id :: character(1)
    Identifier of resulting object, default "quantilebin".

  • param_vals :: named list
    List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list().

Input and Output Channels

Input and output channels are inherited from PipeOpTaskPreproc.

The output is the input Task with all affected numeric features replaced by their binned versions.

State

The $state is a named list with the $state elements inherited from PipeOpTaskPreproc, as well as:

  • bins :: list
    List of intervals representing the bins for each numeric feature.

Parameters

The parameters are the parameters inherited from PipeOpTaskPreproc, as well as:

  • numsplits :: numeric(1)
    Number of bins to create. Default is 2.

Internals

Uses the stats::quantile function.

Methods

Only methods inherited from PipeOpTaskPreprocSimple/PipeOpTaskPreproc/PipeOp.

See also

https://mlr-org.com/pipeops.html

Other PipeOps: PipeOp, PipeOpEnsemble, PipeOpImpute, PipeOpTargetTrafo, PipeOpTaskPreproc, PipeOpTaskPreprocSimple, mlr_pipeops, mlr_pipeops_boxcox, mlr_pipeops_branch, mlr_pipeops_chunk, mlr_pipeops_classbalancing, mlr_pipeops_classifavg, mlr_pipeops_classweights, mlr_pipeops_colapply, mlr_pipeops_collapsefactors, mlr_pipeops_colroles, mlr_pipeops_copy, mlr_pipeops_datefeatures, mlr_pipeops_encode, mlr_pipeops_encodeimpact, mlr_pipeops_encodelmer, mlr_pipeops_featureunion, mlr_pipeops_filter, mlr_pipeops_fixfactors, mlr_pipeops_histbin, mlr_pipeops_ica, mlr_pipeops_imputeconstant, mlr_pipeops_imputehist, mlr_pipeops_imputelearner, mlr_pipeops_imputemean, mlr_pipeops_imputemedian, mlr_pipeops_imputemode, mlr_pipeops_imputeoor, mlr_pipeops_imputesample, mlr_pipeops_kernelpca, mlr_pipeops_learner, mlr_pipeops_missind, mlr_pipeops_modelmatrix, mlr_pipeops_multiplicityexply, mlr_pipeops_multiplicityimply, mlr_pipeops_mutate, mlr_pipeops_nmf, mlr_pipeops_nop, mlr_pipeops_ovrsplit, mlr_pipeops_ovrunite, mlr_pipeops_pca, mlr_pipeops_proxy, mlr_pipeops_randomprojection, mlr_pipeops_randomresponse, mlr_pipeops_regravg, mlr_pipeops_removeconstants, mlr_pipeops_renamecolumns, mlr_pipeops_replicate, mlr_pipeops_rowapply, mlr_pipeops_scale, mlr_pipeops_scalemaxabs, mlr_pipeops_scalerange, mlr_pipeops_select, mlr_pipeops_smote, mlr_pipeops_spatialsign, mlr_pipeops_subsample, mlr_pipeops_targetinvert, mlr_pipeops_targetmutate, mlr_pipeops_targettrafoscalerange, mlr_pipeops_textvectorizer, mlr_pipeops_threshold, mlr_pipeops_tunethreshold, mlr_pipeops_unbranch, mlr_pipeops_updatetarget, mlr_pipeops_vtreat, mlr_pipeops_yeojohnson

Examples

library("mlr3")

task = tsk("iris")
pop = po("quantilebin")

task$data()
#>        Species Petal.Length Petal.Width Sepal.Length Sepal.Width
#>         <fctr>        <num>       <num>        <num>       <num>
#>   1:    setosa          1.4         0.2          5.1         3.5
#>   2:    setosa          1.4         0.2          4.9         3.0
#>   3:    setosa          1.3         0.2          4.7         3.2
#>   4:    setosa          1.5         0.2          4.6         3.1
#>   5:    setosa          1.4         0.2          5.0         3.6
#>  ---                                                            
#> 146: virginica          5.2         2.3          6.7         3.0
#> 147: virginica          5.0         1.9          6.3         2.5
#> 148: virginica          5.2         2.0          6.5         3.0
#> 149: virginica          5.4         2.3          6.2         3.4
#> 150: virginica          5.1         1.8          5.9         3.0
pop$train(list(task))[[1]]$data()
#>        Species Petal.Length Petal.Width Sepal.Length Sepal.Width
#>         <fctr>        <ord>       <ord>        <ord>       <ord>
#>   1:    setosa  (-Inf,4.35]  (-Inf,1.3]   (-Inf,5.8]    (3, Inf]
#>   2:    setosa  (-Inf,4.35]  (-Inf,1.3]   (-Inf,5.8]    (-Inf,3]
#>   3:    setosa  (-Inf,4.35]  (-Inf,1.3]   (-Inf,5.8]    (3, Inf]
#>   4:    setosa  (-Inf,4.35]  (-Inf,1.3]   (-Inf,5.8]    (3, Inf]
#>   5:    setosa  (-Inf,4.35]  (-Inf,1.3]   (-Inf,5.8]    (3, Inf]
#>  ---                                                            
#> 146: virginica  (4.35, Inf]  (1.3, Inf]   (5.8, Inf]    (-Inf,3]
#> 147: virginica  (4.35, Inf]  (1.3, Inf]   (5.8, Inf]    (-Inf,3]
#> 148: virginica  (4.35, Inf]  (1.3, Inf]   (5.8, Inf]    (-Inf,3]
#> 149: virginica  (4.35, Inf]  (1.3, Inf]   (5.8, Inf]    (3, Inf]
#> 150: virginica  (4.35, Inf]  (1.3, Inf]   (5.8, Inf]    (-Inf,3]

pop$state
#> $bins
#> $bins$Petal.Length
#> [1] -Inf 4.35  Inf
#> 
#> $bins$Petal.Width
#> [1] -Inf  1.3  Inf
#> 
#> $bins$Sepal.Length
#> [1] -Inf  5.8  Inf
#> 
#> $bins$Sepal.Width
#> [1] -Inf    3  Inf
#> 
#> 
#> $dt_columns
#> [1] "Petal.Length" "Petal.Width"  "Sepal.Length" "Sepal.Width" 
#> 
#> $affected_cols
#> [1] "Petal.Length" "Petal.Width"  "Sepal.Length" "Sepal.Width" 
#> 
#> $intasklayout
#> Key: <id>
#>              id    type
#>          <char>  <char>
#> 1: Petal.Length numeric
#> 2:  Petal.Width numeric
#> 3: Sepal.Length numeric
#> 4:  Sepal.Width numeric
#> 
#> $outtasklayout
#> Key: <id>
#>              id    type
#>          <char>  <char>
#> 1: Petal.Length ordered
#> 2:  Petal.Width ordered
#> 3: Sepal.Length ordered
#> 4:  Sepal.Width ordered
#> 
#> $outtaskshell
#> Empty data.table (0 rows and 5 cols): Species,Petal.Length,Petal.Width,Sepal.Length,Sepal.Width
#>