Extracts statistically independent components from data. Only affects numerical features. See fastICA::fastICA for details.

Format

R6Class object inheriting from PipeOpTaskPreproc/PipeOp.

Construction

PipeOpICA$new(id = "ica", param_vals = list())

  • id :: character(1)
    Identifier of resulting object, default "ica".

  • param_vals :: named list
    List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list().
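As a minimal sketch of the constructor arguments described above, the following creates a PipeOpICA with a custom identifier and one hyperparameter overridden at construction. The id "ica_2comp" and the value n.comp = 2 are illustrative choices, not defaults.

```r
library("mlr3pipelines")

# Construct directly from the R6 class; `id` and `param_vals` are the
# two constructor arguments documented above.
pop = PipeOpICA$new(id = "ica_2comp", param_vals = list(n.comp = 2))

pop$id                        # identifier given at construction
pop$param_set$values$n.comp   # hyperparameter value set via param_vals
```

The same object can usually be obtained more concisely through the po() shorthand used in the Examples section.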

Input and Output Channels

Input and output channels are inherited from PipeOpTaskPreproc.

The output is the input Task with all affected numeric features replaced by independent components.
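The sketch below illustrates the channel behaviour under one assumed setup: the PipeOp is trained on one subset of rows and then applied to another, and both outputs are Tasks whose numeric columns have been replaced by component columns. The 100/50 row split of iris is arbitrary and only serves the illustration.

```r
library("mlr3")
library("mlr3pipelines")

task = tsk("iris")
pop = po("ica")

# Arbitrary split: first 100 rows for training, remaining 50 for prediction
train_task = task$clone()$filter(1:100)
test_task = task$clone()$filter(101:150)

# Training estimates the transformation and returns the transformed Task
train_out = pop$train(list(train_task))[[1]]

# Prediction applies the stored transformation to unseen rows
test_out = pop$predict(list(test_task))[[1]]

test_out$feature_names
test_out$nrow
```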

State

The $state is a named list with the $state elements inherited from PipeOpTaskPreproc, as well as the elements of the list returned by fastICA::fastICA(), with the exception of the $X and $S slots. These are in particular:

  • K :: matrix
    Matrix that projects data onto the first n.comp principal components. See fastICA().

  • W :: matrix
    Estimated un-mixing matrix. See fastICA().

  • A :: matrix
    Estimated mixing matrix. See fastICA().

  • center :: numeric
    The mean of each numeric feature during training.
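To make the roles of these state elements concrete, the following sketch recomputes the components by hand from center, K and W. It assumes the relationship S = X K W given in the fastICA() documentation (with X the centered data and row.norm left at FALSE), and it assumes the output columns are named V1, V2, ... as in the Examples section. It is an illustration of how the slots fit together, not a description of the PipeOp's internal code.

```r
library("mlr3")
library("mlr3pipelines")

task = tsk("iris")
pop = po("ica")
out = pop$train(list(task))[[1]]
st = pop$state

# Numeric training features, in the column order recorded in the state
X = as.matrix(task$data()[, names(st$center), with = FALSE])

# Center with the stored training means, then project (K) and un-mix (W)
S = sweep(X, 2, st$center) %*% st$K %*% st$W

# Components as returned by the PipeOp (column names V1.. are assumed)
comps = as.matrix(out$data()[, paste0("V", seq_len(ncol(S))), with = FALSE])

# Compare the manual result with the PipeOp output, ignoring dimnames
all.equal(unname(S), unname(comps))
```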

Parameters

The parameters are the parameters inherited from PipeOpTaskPreproc, as well as the following parameters based on fastICA():

  • n.comp :: numeric(1)
    Number of components to extract. Default is NULL, which sets it to the number of available numeric columns.

  • alg.typ :: character(1)
    Algorithm type. One of "parallel" (default) or "deflation".

  • fun :: character(1)
    One of "logcosh" (default) or "exp".

  • alpha :: numeric(1)
    In range [1, 2]. Used for negentropy calculation when fun is "logcosh". Default is 1.0.

  • method :: character(1)
    Internal calculation method. "C" (default) or "R". See fastICA().

  • row.norm :: logical(1)
    Logical value indicating whether rows should be standardized beforehand. Default is FALSE.

  • maxit :: numeric(1)
    Maximum number of iterations. Default is 200.

  • tol :: numeric(1)
    Tolerance for convergence, default is 1e-4.

  • verbose :: logical(1)
    Logical value indicating the level of output during the run of the algorithm. Default is FALSE.

  • w.init :: matrix
    Initial un-mixing matrix. See fastICA(). Default is NULL.
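The parameters listed above can be set when creating the PipeOp or changed afterwards through its parameter set. The sketch below uses illustrative values (n.comp = 2, alg.typ = "deflation", maxit = 500); none of them are recommendations.

```r
library("mlr3")
library("mlr3pipelines")

# Set hyperparameters at creation via the po() shorthand
pop = po("ica", n.comp = 2, alg.typ = "deflation")

# Hyperparameters can also be changed after construction
pop$param_set$values$maxit = 500

out = pop$train(list(tsk("iris")))[[1]]

# With n.comp = 2, the four numeric features are replaced by two components
out$feature_names
```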

Internals

Uses the fastICA() function.

Methods

Only methods inherited from PipeOpTaskPreproc/PipeOp.

See also

https://mlr-org.com/pipeops.html

Other PipeOps: PipeOpEnsemble, PipeOpImpute, PipeOpTargetTrafo, PipeOpTaskPreprocSimple, PipeOpTaskPreproc, PipeOp, mlr_pipeops_boxcox, mlr_pipeops_branch, mlr_pipeops_chunk, mlr_pipeops_classbalancing, mlr_pipeops_classifavg, mlr_pipeops_classweights, mlr_pipeops_colapply, mlr_pipeops_collapsefactors, mlr_pipeops_colroles, mlr_pipeops_copy, mlr_pipeops_datefeatures, mlr_pipeops_encodeimpact, mlr_pipeops_encodelmer, mlr_pipeops_encode, mlr_pipeops_featureunion, mlr_pipeops_filter, mlr_pipeops_fixfactors, mlr_pipeops_histbin, mlr_pipeops_imputeconstant, mlr_pipeops_imputehist, mlr_pipeops_imputelearner, mlr_pipeops_imputemean, mlr_pipeops_imputemedian, mlr_pipeops_imputemode, mlr_pipeops_imputeoor, mlr_pipeops_imputesample, mlr_pipeops_kernelpca, mlr_pipeops_learner, mlr_pipeops_missind, mlr_pipeops_modelmatrix, mlr_pipeops_multiplicityexply, mlr_pipeops_multiplicityimply, mlr_pipeops_mutate, mlr_pipeops_nmf, mlr_pipeops_nop, mlr_pipeops_ovrsplit, mlr_pipeops_ovrunite, mlr_pipeops_pca, mlr_pipeops_proxy, mlr_pipeops_quantilebin, mlr_pipeops_randomprojection, mlr_pipeops_randomresponse, mlr_pipeops_regravg, mlr_pipeops_removeconstants, mlr_pipeops_renamecolumns, mlr_pipeops_replicate, mlr_pipeops_scalemaxabs, mlr_pipeops_scalerange, mlr_pipeops_scale, mlr_pipeops_select, mlr_pipeops_smote, mlr_pipeops_spatialsign, mlr_pipeops_subsample, mlr_pipeops_targetinvert, mlr_pipeops_targetmutate, mlr_pipeops_targettrafoscalerange, mlr_pipeops_textvectorizer, mlr_pipeops_threshold, mlr_pipeops_tunethreshold, mlr_pipeops_unbranch, mlr_pipeops_updatetarget, mlr_pipeops_vtreat, mlr_pipeops_yeojohnson, mlr_pipeops

Examples

library("mlr3")
library("mlr3pipelines")  # provides po() and PipeOpICA

task = tsk("iris")
pop = po("ica")

task$data()
#>        Species Petal.Length Petal.Width Sepal.Length Sepal.Width
#>         <fctr>        <num>       <num>        <num>       <num>
#>   1:    setosa          1.4         0.2          5.1         3.5
#>   2:    setosa          1.4         0.2          4.9         3.0
#>   3:    setosa          1.3         0.2          4.7         3.2
#>   4:    setosa          1.5         0.2          4.6         3.1
#>   5:    setosa          1.4         0.2          5.0         3.6
#>  ---                                                            
#> 146: virginica          5.2         2.3          6.7         3.0
#> 147: virginica          5.0         1.9          6.3         2.5
#> 148: virginica          5.2         2.0          6.5         3.0
#> 149: virginica          5.4         2.3          6.2         3.4
#> 150: virginica          5.1         1.8          5.9         3.0
pop$train(list(task))[[1]]$data()
#>        Species         V1         V2          V3          V4
#>         <fctr>      <num>      <num>       <num>       <num>
#>   1:    setosa  0.2618804  1.3930423  0.01858163  0.37322898
#>   2:    setosa  0.3780112  1.3292462 -0.08402269 -0.97539362
#>   3:    setosa -0.3621013  1.3478643 -0.16088078 -0.34866766
#>   4:    setosa -0.8821564  1.2050708  0.31459635 -0.37346222
#>   5:    setosa -0.1972881  1.3733227  0.09785696  0.73767564
#>  ---                                                        
#> 146: virginica  1.2526553 -0.8398994 -2.59510472 -0.34005412
#> 147: virginica  0.7134698 -0.7614934 -1.04626777 -1.37118659
#> 148: virginica  0.3381378 -0.8319335 -0.98242690  0.08514387
#> 149: virginica -1.2518547 -1.0580918 -1.76521840  1.38972597
#> 150: virginica -1.5464571 -0.8768282  0.05313113  0.58218232

pop$state
#> $K
#>            [,1]       [,2]       [,3]      [,4]
#> [1,] -0.4180098  0.3531217  0.2735163  3.118456
#> [2,] -0.1748261  0.1537381  1.9583093 -4.897992
#> [3,] -0.1763375 -1.3373258 -2.0881803 -2.050340
#> [4,]  0.0412425 -1.4871770  2.1451574  2.077869
#> 
#> $W
#>            [,1]        [,2]        [,3]        [,4]
#> [1,] -0.1472535  0.98656218 -0.01579210 -0.06900775
#> [2,] -0.5896973 -0.14339476  0.01308129 -0.79468480
#> [3,] -0.6413565 -0.07174251 -0.59498373  0.47907115
#> [4,] -0.4682257 -0.03140099  0.80347610  0.36633979
#> 
#> $A
#>             [,1]        [,2]        [,3]        [,4]
#> [1,]  0.16013328  0.04299547  0.42593333  0.05597331
#> [2,] -1.74811998 -0.73699926 -0.67128589  0.20879744
#> [3,]  0.07551950 -0.17161440  0.06499724 -0.06705313
#> [4,]  0.09073779  0.05162242  0.21178676  0.37079291
#> 
#> $center
#> Petal.Length  Petal.Width Sepal.Length  Sepal.Width 
#>     3.758000     1.199333     5.843333     3.057333 
#> 
#> $dt_columns
#> [1] "Petal.Length" "Petal.Width"  "Sepal.Length" "Sepal.Width" 
#> 
#> $affected_cols
#> [1] "Petal.Length" "Petal.Width"  "Sepal.Length" "Sepal.Width" 
#> 
#> $intasklayout
#> Key: <id>
#>              id    type
#>          <char>  <char>
#> 1: Petal.Length numeric
#> 2:  Petal.Width numeric
#> 3: Sepal.Length numeric
#> 4:  Sepal.Width numeric
#> 
#> $outtasklayout
#> Key: <id>
#>        id    type
#>    <char>  <char>
#> 1:     V1 numeric
#> 2:     V2 numeric
#> 3:     V3 numeric
#> 4:     V4 numeric
#> 
#> $outtaskshell
#> Empty data.table (0 rows and 5 cols): Species,V1,V2,V3,V4
#>