Collapses levels of features of type `factor` and `ordered`: the rarest levels in the training samples are merged until `target_level_count` levels remain. Levels with a prevalence above `no_collapse_above_prevalence` are retained, however. For `factor` variables, rare levels are collapsed into the next larger level; for `ordered` variables, rare levels are collapsed into the neighbouring level, whichever has fewer samples.
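The collapsing rule for unordered factors can be illustrated in base R. This is a minimal sketch under stated assumptions, not the package's implementation: the helper `collapse_rarest()` and the toy data are hypothetical, ties between equally rare levels are broken by level order, and the `ordered` case (merging into a neighbouring level) is not covered.

```r
# Hypothetical helper, not part of mlr3pipelines: repeatedly merge the rarest
# level of an unordered factor into the next larger level.
collapse_rarest <- function(f, target_level_count = 2L,
                            no_collapse_above_prevalence = 1) {
  stopifnot(is.factor(f), target_level_count >= 1)
  while (nlevels(f) > target_level_count) {
    counts <- sort(table(f))                  # level frequencies, ascending
    prevalence <- counts[[1]] / sum(counts)   # share of the rarest level
    if (prevalence > no_collapse_above_prevalence) break  # rarest level is retained
    rarest <- names(counts)[1]
    next_larger <- names(counts)[2]
    # Assigning an already existing level name merges the two levels.
    levels(f)[levels(f) == rarest] <- next_larger
  }
  f
}

# Toy data: "c" and "d" are rare.
f <- factor(c(rep("a", 5), rep("b", 3), "c", "d"))

collapsed <- collapse_rarest(f)
print(levels(collapsed))   # "a" "b"

# With a prevalence threshold, merging stops once the rarest level is above it.
partial <- collapse_rarest(f, no_collapse_above_prevalence = 0.15)
print(levels(partial))     # "a" "b" "d"
```

In the first call, `"c"` is merged into `"d"`, and the combined level is then merged into `"b"`. In the second call the combined `"d"` level covers 20% of the samples, which is above the threshold, so it is kept.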

Levels not seen during training are not touched during prediction; therefore, it is useful to combine this with `PipeOpFixFactors`.

## Format

`R6Class` object inheriting from `PipeOpTaskPreprocSimple`/`PipeOpTaskPreproc`/`PipeOp`.

## Construction

* `id` :: `character(1)`
  Identifier of resulting object, default `"collapsefactors"`.
* `param_vals` :: named `list`
  List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default `list()`.

## Input and Output Channels

Input and output channels are inherited from `PipeOpTaskPreproc`.

The output is the input `Task` with the rare levels of affected `factor` and `ordered` features collapsed.

## State

The `$state` is a named `list` with the `$state` elements inherited from `PipeOpTaskPreproc`, as well as:

* `collapse_map` :: named `list` of named `list` of `character`
  List of factor level maps. For each factor, `collapse_map` contains a named `list` that indicates which levels of the input task get mapped to which levels of the output task. If `collapse_map` has an entry `feat_1` with an entry `a = c("x", "y")`, it means that levels `"x"` and `"y"` get collapsed to level `"a"` in feature `"feat_1"`.
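To make the nested structure concrete, the following base R sketch builds a `collapse_map`-shaped list by hand and flattens it into a lookup from old level to new level. The map contents are hypothetical; in particular, the identity entry `z = "z"` is an assumption added for illustration and is not taken from the package.

```r
# Hand-written object with the shape described above (hypothetical contents).
collapse_map <- list(
  feat_1 = list(a = c("x", "y"), z = "z")
)

# For each feature, build a named character vector: names are the old levels,
# values are the levels they are collapsed to.
lookup <- lapply(collapse_map, function(level_map) {
  setNames(
    rep(names(level_map), lengths(level_map)),
    unlist(level_map, use.names = FALSE)
  )
})

print(lookup$feat_1)
```

Here `lookup$feat_1[["x"]]` and `lookup$feat_1[["y"]]` are both `"a"`, matching the reading of `a = c("x", "y")` given above.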

## Parameters

The parameters are the parameters inherited from `PipeOpTaskPreproc`, as well as:

* `no_collapse_above_prevalence` :: `numeric(1)`
  Fraction of samples below which factor levels get collapsed. Default is 1, which causes all levels to be collapsed until `target_level_count` remain.
* `target_level_count` :: `integer(1)`
  Number of levels to retain. Default is 2.
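A short usage sketch with these parameters follows. It assumes that the `mlr3` and `mlr3pipelines` packages are installed; the toy data and the task id `"toy"` are made up for illustration, and the names of the resulting levels are not shown because they depend on the package version.

```r
library(mlr3)
library(mlr3pipelines)

# Toy classification data (hypothetical): feat_1 has two rare levels, "c" and "d".
df <- data.frame(
  feat_1 = factor(c(rep("a", 5), rep("b", 3), "c", "d")),
  y = factor(rep(c("pos", "neg"), 5))
)
task <- TaskClassif$new(id = "toy", backend = df, target = "y")

# Keep the default no_collapse_above_prevalence = 1 and reduce to two levels.
op <- po("collapsefactors", target_level_count = 2L)
out <- op$train(list(task))[[1]]

print(levels(out$data()$feat_1))
print(op$state$collapse_map)
```

Setting `no_collapse_above_prevalence` to a value below 1 would protect the more frequent levels, so more than `target_level_count` levels may remain.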

## Internals

Makes use of the fact that `levels(fact_var) = list(target1 = c("source1", "source2"), target2 = "source3")` causes renaming of levels `"source1"` and `"source2"` both to `"target1"`, and also of `"source3"` to `"target2"`.
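The list form of level assignment is plain base R behaviour and can be checked in isolation. In this small sketch the factor `fact_var` and its level names are toy data.

```r
# Toy factor with three original levels.
fact_var <- factor(c("source1", "source2", "source3", "source1"))

# Each list name is a new level; each element names the old levels merged into it.
levels(fact_var) <- list(target1 = c("source1", "source2"), target2 = "source3")

print(levels(fact_var))        # "target1" "target2"
print(as.character(fact_var))  # "target1" "target1" "target2" "target1"
```

The new levels appear in the order of the list names, and the underlying observations are relabelled without changing the length of the factor.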

## Methods

Only methods inherited from `PipeOpTaskPreprocSimple`/`PipeOpTaskPreproc`/`PipeOp`.

## See also

https://mlr3book.mlr-org.com/list-pipeops.html

Other PipeOps: `PipeOpEnsemble`, `PipeOpImpute`, `PipeOpTargetTrafo`, `PipeOpTaskPreprocSimple`, `PipeOpTaskPreproc`, `PipeOp`, `mlr_pipeops_boxcox`, `mlr_pipeops_branch`, `mlr_pipeops_chunk`, `mlr_pipeops_classbalancing`, `mlr_pipeops_classifavg`, `mlr_pipeops_classweights`, `mlr_pipeops_colapply`, `mlr_pipeops_colroles`, `mlr_pipeops_copy`, `mlr_pipeops_datefeatures`, `mlr_pipeops_encodeimpact`, `mlr_pipeops_encodelmer`, `mlr_pipeops_encode`, `mlr_pipeops_featureunion`, `mlr_pipeops_filter`, `mlr_pipeops_fixfactors`, `mlr_pipeops_histbin`, `mlr_pipeops_ica`, `mlr_pipeops_imputeconstant`, `mlr_pipeops_imputehist`, `mlr_pipeops_imputelearner`, `mlr_pipeops_imputemean`, `mlr_pipeops_imputemedian`, `mlr_pipeops_imputemode`, `mlr_pipeops_imputeoor`, `mlr_pipeops_imputesample`, `mlr_pipeops_kernelpca`, `mlr_pipeops_learner`, `mlr_pipeops_missind`, `mlr_pipeops_modelmatrix`, `mlr_pipeops_multiplicityexply`, `mlr_pipeops_multiplicityimply`, `mlr_pipeops_mutate`, `mlr_pipeops_nmf`, `mlr_pipeops_nop`, `mlr_pipeops_ovrsplit`, `mlr_pipeops_ovrunite`, `mlr_pipeops_pca`, `mlr_pipeops_proxy`, `mlr_pipeops_quantilebin`, `mlr_pipeops_randomprojection`, `mlr_pipeops_randomresponse`, `mlr_pipeops_regravg`, `mlr_pipeops_removeconstants`, `mlr_pipeops_renamecolumns`, `mlr_pipeops_replicate`, `mlr_pipeops_scalemaxabs`, `mlr_pipeops_scalerange`, `mlr_pipeops_scale`, `mlr_pipeops_select`, `mlr_pipeops_smote`, `mlr_pipeops_spatialsign`, `mlr_pipeops_subsample`, `mlr_pipeops_targetinvert`, `mlr_pipeops_targetmutate`, `mlr_pipeops_targettrafoscalerange`, `mlr_pipeops_textvectorizer`, `mlr_pipeops_threshold`, `mlr_pipeops_tunethreshold`, `mlr_pipeops_unbranch`, `mlr_pipeops_updatetarget`, `mlr_pipeops_vtreat`, `mlr_pipeops_yeojohnson`, `mlr_pipeops`