Adds a class-dependent sample weights column to a Task, allowing
Learners and Measures to weight observations
differently during training and evaluation.
Weights are assigned per observation based on the target class and can be written
to the "weights_learner" column, the "weights_measure" column, both, or neither.
Only binary classification tasks (TaskClassif) are supported.
Note: By default, all weights are set to 1. To obtain a meaningful effect, the
minor_weight parameter must be adjusted.
See PipeOpClassWeightsEx for an extended version of this PipeOp which can
handle multiclass classification tasks and offers several methods for automatically
determining weights.
Format
R6Class object inheriting from PipeOpTaskPreproc/PipeOp.
Construction
id::character(1)
Identifier of the resulting object, default "classweights".
param_vals::named list
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list().
Input and Output Channels
Input and output channels are inherited from PipeOpTaskPreproc. Instead of a Task, a
TaskClassif is used as input and output during training and prediction.
The output during training is the input Task with an added weights column according to the target class.
The output during prediction is the unchanged input.
State
The $state is a named list with the $state elements inherited from PipeOpTaskPreproc.
Parameters
The parameters are the parameters inherited from PipeOpTaskPreproc; however, the affect_columns parameter is not present. Further parameters are:
minor_weight::numeric(1)
Weight given to samples of the minor class. Major class samples have weight 1. Initialized to 1.
weights_learner::logical(1)
Whether the created weights should be stored as a weights_learner column or not. Initialized to TRUE.
weights_measure::logical(1)
Whether the created weights should be stored as a weights_measure column or not. Initialized to FALSE.
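As an illustration of choosing minor_weight, the following sketch sets it to the ratio of major to minor class size, so that both classes carry the same total weight. This inverse-frequency choice is only one possible heuristic and an assumption of this example, not a recommendation made by the PipeOp itself; the variable names counts, ratio, and pow are hypothetical.

```r
library("mlr3")
library("mlr3pipelines")

task = tsk("spam")

# Class counts of the target; the smaller count belongs to the minor class.
counts = table(task$truth())

# Ratio of major to minor class size (assumed heuristic for this sketch).
ratio = as.numeric(max(counts) / min(counts))

# Pass the hyperparameter at construction time.
pow = po("classweights", minor_weight = ratio)
pow$param_set$values$minor_weight
```

With this setting, the summed weight of the minor class equals the number of major class observations.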
Internals
Adds a .WEIGHTS column to the Task, which is removed from the feature role and mapped to the requested weight roles.
A naming conflict occurs if a column of this name already exists and is not itself a weight column. Pre-existing weight columns
lose their weight column role but remain in the DataBackend of the Task.
The Learner must support weights for this PipeOp to have an effect.
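A minimal sketch of using the PipeOp in front of a Learner, assuming that the rpart package is installed and that "classif.rpart" lists "weights" among its properties. The names learner, graph, and glrn are chosen for this example only.

```r
library("mlr3")
library("mlr3pipelines")

task = tsk("spam")
learner = lrn("classif.rpart")

# Check that the learner declares weight support before relying on it.
"weights" %in% learner$properties

# Chain the PipeOp and the learner, then treat the graph as a single learner.
graph = po("classweights", minor_weight = 2) %>>% learner
glrn = as_learner(graph)

# Weights are added during training only; prediction passes the task through unchanged.
glrn$train(task)
prediction = glrn$predict(task)
prediction$score(msr("classif.acc"))
```

If the property check returns FALSE for a given Learner, the weights column is still added to the Task, but the fitted model should not be expected to change.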
Fields
Only fields inherited from PipeOp.
Methods
Only methods inherited from PipeOpTaskPreproc/PipeOp.
See also
https://mlr-org.com/pipeops.html
Other PipeOps:
PipeOp,
PipeOpEncodePL,
PipeOpEnsemble,
PipeOpImpute,
PipeOpTargetTrafo,
PipeOpTaskPreproc,
PipeOpTaskPreprocSimple,
mlr_pipeops,
mlr_pipeops_adas,
mlr_pipeops_blsmote,
mlr_pipeops_boxcox,
mlr_pipeops_branch,
mlr_pipeops_chunk,
mlr_pipeops_classbalancing,
mlr_pipeops_classifavg,
mlr_pipeops_classweightsex,
mlr_pipeops_colapply,
mlr_pipeops_collapsefactors,
mlr_pipeops_colroles,
mlr_pipeops_copy,
mlr_pipeops_datefeatures,
mlr_pipeops_decode,
mlr_pipeops_encode,
mlr_pipeops_encodeimpact,
mlr_pipeops_encodelmer,
mlr_pipeops_encodeplquantiles,
mlr_pipeops_encodepltree,
mlr_pipeops_featureunion,
mlr_pipeops_filter,
mlr_pipeops_fixfactors,
mlr_pipeops_histbin,
mlr_pipeops_ica,
mlr_pipeops_imputeconstant,
mlr_pipeops_imputehist,
mlr_pipeops_imputelearner,
mlr_pipeops_imputemean,
mlr_pipeops_imputemedian,
mlr_pipeops_imputemode,
mlr_pipeops_imputeoor,
mlr_pipeops_imputesample,
mlr_pipeops_info,
mlr_pipeops_isomap,
mlr_pipeops_kernelpca,
mlr_pipeops_learner,
mlr_pipeops_learner_pi_cvplus,
mlr_pipeops_learner_quantiles,
mlr_pipeops_missind,
mlr_pipeops_modelmatrix,
mlr_pipeops_multiplicityexply,
mlr_pipeops_multiplicityimply,
mlr_pipeops_mutate,
mlr_pipeops_nearmiss,
mlr_pipeops_nmf,
mlr_pipeops_nop,
mlr_pipeops_ovrsplit,
mlr_pipeops_ovrunite,
mlr_pipeops_pca,
mlr_pipeops_proxy,
mlr_pipeops_quantilebin,
mlr_pipeops_randomprojection,
mlr_pipeops_randomresponse,
mlr_pipeops_regravg,
mlr_pipeops_removeconstants,
mlr_pipeops_renamecolumns,
mlr_pipeops_replicate,
mlr_pipeops_rowapply,
mlr_pipeops_scale,
mlr_pipeops_scalemaxabs,
mlr_pipeops_scalerange,
mlr_pipeops_select,
mlr_pipeops_smote,
mlr_pipeops_smotenc,
mlr_pipeops_spatialsign,
mlr_pipeops_splines,
mlr_pipeops_subsample,
mlr_pipeops_targetinvert,
mlr_pipeops_targetmutate,
mlr_pipeops_targettrafoscalerange,
mlr_pipeops_textvectorizer,
mlr_pipeops_threshold,
mlr_pipeops_tomek,
mlr_pipeops_tunethreshold,
mlr_pipeops_unbranch,
mlr_pipeops_updatetarget,
mlr_pipeops_vtreat,
mlr_pipeops_yeojohnson
Examples
library("mlr3")
library("mlr3pipelines") # provides po()
task = tsk("spam")
opb = po("classweights")
# task weights
if ("weights_learner" %in% names(task)) {
  task$weights_learner # recent mlr3-versions
} else {
  task$weights # old mlr3-versions
}
#> NULL
# give observations of the minority class (spam) twice the weight of the majority class
opb$param_set$values$minor_weight = 2
result = opb$train(list(task))[[1L]]
if ("weights_learner" %in% names(result)) {
  result$weights_learner # recent mlr3-versions
} else {
  result$weights # old mlr3-versions
}
#> Key: <row_id>
#>       row_id weight
#>        <int>  <num>
#>    1:      1      2
#>    2:      2      2
#>    3:      3      2
#>    4:      4      2
#>    5:      5      2
#>   ---
#> 4597:   4597      1
#> 4598:   4598      1
#> 4599:   4599      1
#> 4600:   4600      1
#> 4601:   4601      1
