Base class for handling many "preprocessing" operations that perform essentially the same operation during training and prediction. Instead of implementing a private$.train_task() and a private$.predict_task() operation, only a private$.get_state() and a private$.transform() operation need to be defined, both of which take one argument: a Task.
Alternatively, analogously to the PipeOpTaskPreproc approach of offering private$.train_dt() / private$.predict_dt(), the private$.get_state_dt() and private$.transform_dt() functions may be implemented.
private$.get_state() must not change its input value in-place and must return something that will be written into $state (which must not be NULL). private$.transform() should modify its argument in-place; it is called both during training and prediction.
This inherits from PipeOpTaskPreproc and behaves essentially the same.
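As a minimal sketch of the Task-based variant, the following hypothetical subclass (the class name and id are made up for illustration) records during training which features are not constant, and then keeps only those features during both training and prediction. It assumes that Task$select() is an acceptable way to modify the Task in-place; for real use, the existing mlr_pipeops_removeconstants should be preferred.

```r
library(mlr3)
library(mlr3pipelines)

# Hypothetical subclass, for illustration only.
PipeOpDropConstantDemo = R6::R6Class("PipeOpDropConstantDemo",
  inherit = PipeOpTaskPreprocSimple,
  public = list(
    initialize = function(id = "dropconstant.demo") {
      super$initialize(id = id)
    }
  ),
  private = list(
    # Reads the training Task without modifying it and returns the state.
    .get_state = function(task) {
      dt = task$data(cols = task$feature_names)
      varies = vapply(dt, function(x) length(unique(x)) > 1L, logical(1))
      list(keep = names(dt)[varies])
    },
    # Called during training and prediction; modifies the Task in-place
    # and returns it.
    .transform = function(task) {
      task$select(self$state$keep)
    }
  )
)

# Example data: iris plus one constant feature.
dat = iris
dat$const = 1
task = as_task_classif(dat, target = "Species")

op = PipeOpDropConstantDemo$new()
train_out = op$train(list(task))[[1]]
predict_out = op$predict(list(task))[[1]]
```

Note that private$.get_state() only returns the list; writing it into $state is left to the base class.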
Format
Abstract R6Class inheriting from PipeOpTaskPreproc/PipeOp.
Construction
PipeOpTaskPreprocSimple$new(id, param_set = ps(), param_vals = list(), can_subset_cols = TRUE, packages = character(0), task_type = "Task")
(Construction is identical to PipeOpTaskPreproc.)
id :: character(1)
Identifier of resulting object. See $id slot of PipeOp.
param_set :: ParamSet
Parameter space description. This should be created by the subclass and given to super$initialize().
param_vals :: named list
List of hyperparameter settings, overwriting the hyperparameter settings given in param_set. The subclass should have its own param_vals parameter and pass it on to super$initialize(). Default list().
can_subset_cols :: logical(1)
Whether the affect_columns parameter should be added which lets the user limit the columns that are modified by the PipeOpTaskPreprocSimple. This should generally be FALSE if the operation adds or removes rows from the Task, and TRUE otherwise. Default is TRUE.
packages :: character
Set of all required packages for the PipeOp's private$.train() and private$.predict() methods. See $packages slot. Default is character(0).
task_type :: character(1)
The class of Task that should be accepted as input and will be returned as output. This should generally be a character(1) identifying a type of Task, e.g. "Task", "TaskClassif" or "TaskRegr" (or another subclass introduced by other packages). Default is "Task".
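The constructor arguments above could be wired up in a subclass roughly as follows. This is a sketch under assumptions: the class PipeOpShiftDemo and its offset hyperparameter are hypothetical, and the parameter is declared with ps() and p_dbl() from the paradox package. The subclass creates its own param_set, exposes its own param_vals argument, and passes both on to super$initialize().

```r
library(mlr3)
library(mlr3pipelines)
library(paradox)

# Hypothetical subclass, for illustration only.
PipeOpShiftDemo = R6::R6Class("PipeOpShiftDemo",
  inherit = PipeOpTaskPreprocSimple,
  public = list(
    initialize = function(id = "shift.demo", param_vals = list()) {
      # Parameter space created by the subclass (assumed parameter "offset").
      param_set = ps(offset = p_dbl(tags = c("train", "predict")))
      param_set$values = list(offset = 0)
      super$initialize(id = id, param_set = param_set, param_vals = param_vals,
        can_subset_cols = TRUE, task_type = "Task")
    }
  ),
  private = list(
    # Only numeric features are handed to .transform_dt().
    .select_cols = function(task) {
      ft = task$feature_types
      ft$id[ft$type == "numeric"]
    },
    # Stateless transformation: add the offset to every selected column.
    .transform_dt = function(dt) {
      as.matrix(dt) + self$param_set$values$offset
    }
  )
)

op = PipeOpShiftDemo$new(param_vals = list(offset = 10))
shifted = op$train(list(tsk("iris")))[[1]]
```

Because can_subset_cols is TRUE here, the resulting object should also carry the affect_columns parameter in op$param_set.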
Input and Output Channels
Input and output channels are inherited from PipeOpTaskPreproc.
The output during training and prediction is the Task, modified by private$.transform() or private$.transform_dt().
State
The $state is a named list with the $state elements inherited from PipeOpTaskPreproc.
Parameters
The parameters are the parameters inherited from PipeOpTaskPreproc.
Internals
PipeOpTaskPreprocSimple is an abstract class inheriting from PipeOpTaskPreproc and implementing the private$.train_task() and private$.predict_task() functions. A subclass of PipeOpTaskPreprocSimple may implement the functions private$.get_state() and private$.transform(), or alternatively the functions private$.get_state_dt() and private$.transform_dt() (as well as private$.select_cols(), in the latter case). This works by having the default implementations of private$.get_state() and private$.transform() call private$.get_state_dt() and private$.transform_dt().
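The data.table-based route described above might look like the following sketch. The class PipeOpCenterDemo is hypothetical; it stores the training means of the numeric features in the state and subtracts them in private$.transform_dt(), which is reached through the default private$.get_state() and private$.transform() implementations.

```r
library(mlr3)
library(mlr3pipelines)

# Hypothetical subclass, for illustration only.
PipeOpCenterDemo = R6::R6Class("PipeOpCenterDemo",
  inherit = PipeOpTaskPreprocSimple,
  public = list(
    initialize = function(id = "center.demo") {
      super$initialize(id = id)
    }
  ),
  private = list(
    # Restrict the operation to numeric features.
    .select_cols = function(task) {
      ft = task$feature_types
      ft$id[ft$type == "numeric"]
    },
    # Returns the state (column means); does not assign to $state itself.
    .get_state_dt = function(dt) {
      list(center = vapply(dt, mean, numeric(1)))
    },
    # Same behaviour in training and prediction: subtract the stored means.
    .transform_dt = function(dt) {
      m = as.matrix(dt)
      sweep(m, 2, self$state$center[colnames(m)])
    }
  )
)

task = tsk("iris")
op = PipeOpCenterDemo$new()
trained = op$train(list(task))[[1]]

# Prediction on the first ten rows reuses the means learned during training.
predicted = op$predict(list(task$clone()$filter(1:10)))[[1]]
```

Returning a matrix from private$.transform_dt() relies on the result being convertible with as.data.table.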
Fields
Fields inherited from PipeOp.
Methods
Methods inherited from PipeOpTaskPreproc, as well as:
.get_state(task)
(Task) -> named list
Create something that will be stored in $state during the training phase of PipeOpTaskPreprocSimple. The state can then influence the private$.transform() function. Note that private$.get_state() must return the state, and should not store it in $state. It is not strictly necessary to implement either private$.get_state() or private$.get_state_dt(); if they are not implemented, the state will be stored as list().
This method can optionally be overloaded when inheriting from PipeOpTaskPreprocSimple, together with private$.transform(); alternatively, private$.get_state_dt() (optional) and private$.transform_dt() (and possibly private$.select_cols(), from PipeOpTaskPreproc) can be overloaded.
.transform(task)
(Task) -> Task
Transform the data in task, possibly using the stored $state. task should not be cloned; instead it should be changed in-place. This method is called during both the training and the prediction phase, and should behave essentially the same in either phase. (If this is incongruent with the functionality to be implemented, then the class should inherit from PipeOpTaskPreproc, not from PipeOpTaskPreprocSimple.)
This method can be overloaded when inheriting from PipeOpTaskPreprocSimple, optionally with private$.get_state(); alternatively, private$.get_state_dt() (optional) and private$.transform_dt() (and possibly private$.select_cols(), from PipeOpTaskPreproc) can be overloaded.
.get_state_dt(dt)
(data.table) -> named list
Create something that will be stored in $state during the training phase of PipeOpTaskPreprocSimple. The state can then influence the private$.transform_dt() function. Note that private$.get_state_dt() must return the state, and should not store it in $state. If neither private$.get_state() nor private$.get_state_dt() is overloaded, the state will be stored as list().
This method can optionally be overloaded when inheriting from PipeOpTaskPreprocSimple, together with private$.transform_dt() (and optionally private$.select_cols(), from PipeOpTaskPreproc); alternatively, private$.get_state() (optional) and private$.transform() can be overloaded.
.transform_dt(dt)
(data.table) -> data.table | data.frame | matrix
Transform the data in dt, possibly using the stored $state. A transformed object must be returned that can be converted to a data.table using as.data.table. dt does not need to be copied deliberately; it is possible and encouraged to change it in-place. This method is called during both the training and the prediction phase, and should behave essentially the same in either phase. (If this is incongruent with the functionality to be implemented, then the class should inherit from PipeOpTaskPreproc, not from PipeOpTaskPreprocSimple.)
This method can be overloaded when inheriting from PipeOpTaskPreprocSimple, optionally with private$.get_state_dt() (and optionally private$.select_cols(), from PipeOpTaskPreproc); alternatively, private$.get_state() (optional) and private$.transform() can be overloaded.
See also
https://mlr-org.com/pipeops.html
Other PipeOps: PipeOp, PipeOpEnsemble, PipeOpImpute, PipeOpTargetTrafo, PipeOpTaskPreproc, mlr_pipeops, mlr_pipeops_adas, mlr_pipeops_blsmote, mlr_pipeops_boxcox, mlr_pipeops_branch, mlr_pipeops_chunk, mlr_pipeops_classbalancing, mlr_pipeops_classifavg, mlr_pipeops_classweights, mlr_pipeops_colapply, mlr_pipeops_collapsefactors, mlr_pipeops_colroles, mlr_pipeops_copy, mlr_pipeops_datefeatures, mlr_pipeops_encode, mlr_pipeops_encodeimpact, mlr_pipeops_encodelmer, mlr_pipeops_featureunion, mlr_pipeops_filter, mlr_pipeops_fixfactors, mlr_pipeops_histbin, mlr_pipeops_ica, mlr_pipeops_imputeconstant, mlr_pipeops_imputehist, mlr_pipeops_imputelearner, mlr_pipeops_imputemean, mlr_pipeops_imputemedian, mlr_pipeops_imputemode, mlr_pipeops_imputeoor, mlr_pipeops_imputesample, mlr_pipeops_kernelpca, mlr_pipeops_learner, mlr_pipeops_missind, mlr_pipeops_modelmatrix, mlr_pipeops_multiplicityexply, mlr_pipeops_multiplicityimply, mlr_pipeops_mutate, mlr_pipeops_nmf, mlr_pipeops_nop, mlr_pipeops_ovrsplit, mlr_pipeops_ovrunite, mlr_pipeops_pca, mlr_pipeops_proxy, mlr_pipeops_quantilebin, mlr_pipeops_randomprojection, mlr_pipeops_randomresponse, mlr_pipeops_regravg, mlr_pipeops_removeconstants, mlr_pipeops_renamecolumns, mlr_pipeops_replicate, mlr_pipeops_rowapply, mlr_pipeops_scale, mlr_pipeops_scalemaxabs, mlr_pipeops_scalerange, mlr_pipeops_select, mlr_pipeops_smote, mlr_pipeops_smotenc, mlr_pipeops_spatialsign, mlr_pipeops_subsample, mlr_pipeops_targetinvert, mlr_pipeops_targetmutate, mlr_pipeops_targettrafoscalerange, mlr_pipeops_textvectorizer, mlr_pipeops_threshold, mlr_pipeops_tunethreshold, mlr_pipeops_unbranch, mlr_pipeops_updatetarget, mlr_pipeops_vtreat, mlr_pipeops_yeojohnson
Other mlr3pipelines backend related: Graph, PipeOp, PipeOpTargetTrafo, PipeOpTaskPreproc, mlr_graphs, mlr_pipeops, mlr_pipeops_updatetarget