
Class Weights for Sample Weighting - Extended
Source: R/PipeOpClassWeightsEx.R
mlr_pipeops_classweightsex.Rd
Adds a class-dependent sample weights column to a Task, allowing
Learners and Measures to weight observations
differently during training and evaluation.
Weights are assigned per observation based on the target class and can be written
to the "weights_learner" column, the "weights_measure" column, both, or neither.
Binary as well as multiclass classification tasks (TaskClassif) are supported.
Format
R6Class object inheriting from PipeOpTaskPreproc/PipeOp.
Construction
id :: character(1)
Identifier of the resulting object, default "classweightsex".
param_vals :: named list
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list().
Input and Output Channels
Input and output channels are inherited from PipeOpTaskPreproc. Instead of a Task, a
TaskClassif is used as input and output during training and prediction.
The output during training is the input Task with an added weights column according to the target class.
The output during prediction is the unchanged input.
State
The $state is a named list with the $state elements inherited from PipeOpTaskPreproc.
Parameters
The parameters are the parameters inherited from PipeOpTaskPreproc; however, the affect_columns parameter is not present. Further parameters are:
weights_learner :: logical(1)
Whether the created weights should be stored as a weights_learner column or not. Initialized to TRUE.
weights_measure :: logical(1)
Whether the created weights should be stored as a weights_measure column or not. Initialized to FALSE.
weight_method :: character(1)
The method used to determine the sample weights. Available methods are "inverse_class_frequency", "inverse_square_root_of_frequency", "median_frequency_balancing", and "explicit". With "explicit", the mapping hyperparameter must be set. Initialized to "explicit".
mapping :: named numeric
A named numeric vector that specifies a finite weight for each target class in the task. This only has an effect if weight_method is "explicit".
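The data-driven settings of weight_method can be sketched in base R as follows. This is an illustration on a made-up target vector, not the PipeOp's source code: the formula for "inverse_class_frequency" (number of observations divided by class count) is consistent with the example output on this page, while the formulas shown for the other two methods are assumptions inferred from their names and may differ from the actual implementation.

```r
# Toy target; labels and counts are made up for illustration.
target = factor(c(rep("a", 6), rep("b", 3), rep("c", 1)))
tab = table(target)
counts = setNames(as.vector(tab), names(tab))  # a = 6, b = 3, c = 1
n = length(target)

# Per-class weights. Only the first formula agrees with the example output
# on this page; the other two are ASSUMPTIONS based on the method names.
class_weights = list(
  inverse_class_frequency = n / counts,
  inverse_square_root_of_frequency = sqrt(n / counts),
  median_frequency_balancing = median(counts) / counts
)

# Expand the per-class weights to one weight per observation.
obs_weights = lapply(class_weights, function(w) unname(w[as.character(target)]))
str(obs_weights)
```

In every variant the rarest class receives the largest weight; the variants differ only in how strongly the imbalance is compensated.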
Internals
Adds a .WEIGHTS column to the Task, which is removed from the feature role and mapped to the requested weight roles.
A naming conflict occurs if a column named .WEIGHTS already exists and is not already a weight column. Pre-existing weight columns lose their weight
column role, but they remain in the DataBackend of the Task.
When weight_method = "explicit", the mapping must cover every class present in the training data and may not contain additional classes.
The Learner must support weights for this PipeOp to have an effect.
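The requirement on an explicit mapping can be illustrated with a small base R sketch. The helper apply_explicit_mapping() is hypothetical and not part of mlr3pipelines; it only mirrors the rules stated above (named, finite, exactly one entry per class present in the data) and is not the check the PipeOp actually runs.

```r
# Hypothetical helper, written for illustration only.
apply_explicit_mapping = function(mapping, target) {
  classes = unique(as.character(target))
  stopifnot(
    is.numeric(mapping),
    !is.null(names(mapping)),
    all(is.finite(mapping)),
    setequal(names(mapping), classes)  # no missing and no additional classes
  )
  # Look up the weight of each observation by its class label.
  unname(mapping[as.character(target)])
}

target = factor(c("spam", "nonspam", "nonspam", "spam"))
weights = apply_explicit_mapping(c(spam = 2, nonspam = 1), target)
weights

# A mapping that omits a class is rejected.
incomplete = try(apply_explicit_mapping(c(spam = 2), target), silent = TRUE)
inherits(incomplete, "try-error")
```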
Fields
Only fields inherited from PipeOp.
Methods
Only methods inherited from PipeOpTaskPreproc/PipeOp.
See also
https://mlr-org.com/pipeops.html
Other PipeOps:
PipeOp,
PipeOpEncodePL,
PipeOpEnsemble,
PipeOpImpute,
PipeOpTargetTrafo,
PipeOpTaskPreproc,
PipeOpTaskPreprocSimple,
mlr_pipeops,
mlr_pipeops_adas,
mlr_pipeops_blsmote,
mlr_pipeops_boxcox,
mlr_pipeops_branch,
mlr_pipeops_chunk,
mlr_pipeops_classbalancing,
mlr_pipeops_classifavg,
mlr_pipeops_classweights,
mlr_pipeops_colapply,
mlr_pipeops_collapsefactors,
mlr_pipeops_colroles,
mlr_pipeops_copy,
mlr_pipeops_datefeatures,
mlr_pipeops_decode,
mlr_pipeops_encode,
mlr_pipeops_encodeimpact,
mlr_pipeops_encodelmer,
mlr_pipeops_encodeplquantiles,
mlr_pipeops_encodepltree,
mlr_pipeops_featureunion,
mlr_pipeops_filter,
mlr_pipeops_fixfactors,
mlr_pipeops_histbin,
mlr_pipeops_ica,
mlr_pipeops_imputeconstant,
mlr_pipeops_imputehist,
mlr_pipeops_imputelearner,
mlr_pipeops_imputemean,
mlr_pipeops_imputemedian,
mlr_pipeops_imputemode,
mlr_pipeops_imputeoor,
mlr_pipeops_imputesample,
mlr_pipeops_info,
mlr_pipeops_isomap,
mlr_pipeops_kernelpca,
mlr_pipeops_learner,
mlr_pipeops_learner_pi_cvplus,
mlr_pipeops_learner_quantiles,
mlr_pipeops_missind,
mlr_pipeops_modelmatrix,
mlr_pipeops_multiplicityexply,
mlr_pipeops_multiplicityimply,
mlr_pipeops_mutate,
mlr_pipeops_nearmiss,
mlr_pipeops_nmf,
mlr_pipeops_nop,
mlr_pipeops_ovrsplit,
mlr_pipeops_ovrunite,
mlr_pipeops_pca,
mlr_pipeops_proxy,
mlr_pipeops_quantilebin,
mlr_pipeops_randomprojection,
mlr_pipeops_randomresponse,
mlr_pipeops_regravg,
mlr_pipeops_removeconstants,
mlr_pipeops_renamecolumns,
mlr_pipeops_replicate,
mlr_pipeops_rowapply,
mlr_pipeops_scale,
mlr_pipeops_scalemaxabs,
mlr_pipeops_scalerange,
mlr_pipeops_select,
mlr_pipeops_smote,
mlr_pipeops_smotenc,
mlr_pipeops_spatialsign,
mlr_pipeops_subsample,
mlr_pipeops_targetinvert,
mlr_pipeops_targetmutate,
mlr_pipeops_targettrafoscalerange,
mlr_pipeops_textvectorizer,
mlr_pipeops_threshold,
mlr_pipeops_tomek,
mlr_pipeops_tunethreshold,
mlr_pipeops_unbranch,
mlr_pipeops_updatetarget,
mlr_pipeops_vtreat,
mlr_pipeops_yeojohnson
Examples
library("mlr3")
library("mlr3pipelines")  # provides po()
task = tsk("spam")
# Weight each observation by the inverse frequency of its class and
# store the result in both weight roles.
poicf = po("classweightsex", param_vals = list(weights_learner = TRUE, weights_measure = TRUE,
  weight_method = "inverse_class_frequency"))
result = poicf$train(list(task))[[1L]]
if ("weights_learner" %in% names(result)) {
  result$weights_learner  # recent mlr3-versions
} else {
  result$weights  # old mlr3-versions
}
#> Key: <row_id>
#>       row_id   weight
#>        <int>    <num>
#>    1:      1 2.537783
#>    2:      2 2.537783
#>    3:      3 2.537783
#>    4:      4 2.537783
#>    5:      5 2.537783
#>   ---
#> 4597:   4597 1.650287
#> 4598:   4598 1.650287
#> 4599:   4599 1.650287
#> 4600:   4600 1.650287
#> 4601:   4601 1.650287
if ("weights_measure" %in% names(result)) {
  result$weights_measure  # recent mlr3-versions
} else {
  result$weights  # old mlr3-versions
}
#> Key: <row_id>
#>       row_id   weight
#>        <int>    <num>
#>    1:      1 2.537783
#>    2:      2 2.537783
#>    3:      3 2.537783
#>    4:      4 2.537783
#>    5:      5 2.537783
#>   ---
#> 4597:   4597 1.650287
#> 4598:   4598 1.650287
#> 4599:   4599 1.650287
#> 4600:   4600 1.650287
#> 4601:   4601 1.650287
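
The two distinct weights printed above can be reproduced by hand. The sketch below assumes the spam task contains 1813 "spam" and 2788 "nonspam" observations (the counts are not printed on this page, though their sum matches the 4601 rows shown) and divides the total number of observations by each class count.

```r
# Assumed class counts of the spam task; their sum matches the 4601 rows above.
class_counts = c(spam = 1813, nonspam = 2788)
n = sum(class_counts)

# Inverse class frequency: total observations divided by the class count.
icf = n / class_counts
round(icf, 6)
```

Under these assumptions the rounded values agree with the 2.537783 and 1.650287 shown in the output, the larger weight going to the rarer class.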