Materializes the active view of a Task by replacing its
DataBackend with a new backend containing only the
rows and columns currently used by the task.
This can be useful after operations that create virtual task views, such as
Task $filter(), $select(), or $cbind(). In particular, many
PipeOpTaskPreproc operations use Task $cbind() internally,
which can create nested virtual backends. Materializing the view can reduce backend
nesting and may free memory or speed up later data access.
Note that Task $materialize_view() only materializes the currently
active view. Columns without any column role are dropped. If an observation occurs more
than once (duplicates in $row_ids), the resulting backend contains it only once,
but the new task view still contains it multiple times, i.e. duplicates in $row_ids are preserved.
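The backend nesting described above can be observed directly. The following is a minimal sketch, assuming an mlr3 version that provides Task $materialize_view(); the column name extra and the variable nested_class are made up for illustration.

```r
library("mlr3")

task = tsk("iris")

# $cbind() does not copy the data; it wraps the existing backend in a
# virtual backend that combines the old backend with the new column.
# "extra" is a hypothetical column added only for this illustration.
task$cbind(data.frame(extra = seq_len(task$nrow)))
nested_class = class(task$backend)[1]
print(nested_class)

# Assumption: the installed mlr3 version provides Task$materialize_view().
# It replaces the backend with one that holds only the active view.
task$materialize_view()
print(class(task$backend)[1])

# The task view itself is unchanged: same features, same number of rows.
print(task$feature_names)
print(task$nrow)
```

Repeated preprocessing steps would add one such wrapper per $cbind() call, which is the situation where materializing may help.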
Construction
id :: character(1)
Identifier of resulting object. See $id slot of PipeOp.
Input and Output Channels
PipeOpMaterialize has one input channel named "input", taking a Task both during training and prediction.
PipeOpMaterialize has one output channel named "output", producing a Task both during training and prediction.
The output is the input Task with the active view materialized.
State
The $state is left empty (list()).
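The channel layout and the empty state described above can be checked on a constructed object. This is a small sketch that assumes an mlr3pipelines version exporting PipeOpMaterialize; it relies only on the $input, $output, and $state fields inherited from PipeOp.

```r
library("mlr3")
library("mlr3pipelines")

# Assumption: PipeOpMaterialize is exported by the installed mlr3pipelines.
pom = PipeOpMaterialize$new("materialize")

# $input and $output are tables describing the channels
# (name, accepted type during training, accepted type during prediction).
print(pom$input)
print(pom$output)

# Train once so that the state is set; it carries no learned information.
trained = pom$train(list(tsk("iris")))
print(pom$state)
```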
Internals
PipeOpMaterialize calls Task $materialize_view() on a clone of the input task,
both during training and prediction. During training, the internal validation task is also materialized
using $materialize_view(internal_valid_task = TRUE), but not during prediction.
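The behaviour described above can be approximated by hand. The helper materialize_sketch() below is hypothetical and is not the package's implementation; it assumes that Task $materialize_view() accepts internal_valid_task as a logical flag and that a deep clone is an acceptable stand-in for the clone made internally.

```r
library("mlr3")

# Hypothetical helper mirroring the described train/predict behaviour.
# Assumption: internal_valid_task is a logical flag of $materialize_view().
materialize_sketch = function(task, training = TRUE) {
  # Work on a clone so the caller's task keeps its original backend.
  out = task$clone(deep = TRUE)
  # The internal validation task is only materialized during training.
  out$materialize_view(internal_valid_task = training)
  out
}

task = tsk("iris")
task$select("Petal.Length")$filter(1:10)

result = materialize_sketch(task, training = TRUE)

# The clone's backend is reduced to the active view ...
print(result$backend$nrow)
# ... while the input task still points at the full backend.
print(task$backend$nrow)
```

Working on a clone keeps the PipeOp free of side effects on its input, at the cost of holding the original backend and the materialized copy in memory at the same time.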
Fields
Only fields inherited from PipeOp.
Methods
Only methods inherited from PipeOp.
See also
https://mlr-org.com/pipeops.html
Other mlr3pipelines backend related:
Graph,
PipeOp,
PipeOpTargetTrafo,
PipeOpTaskPreproc,
PipeOpTaskPreprocSimple,
mlr_graphs,
mlr_pipeops,
mlr_pipeops_updatetarget
Other PipeOps:
PipeOp,
PipeOpEncodePL,
PipeOpEnsemble,
PipeOpImpute,
PipeOpTargetTrafo,
PipeOpTaskPreproc,
PipeOpTaskPreprocSimple,
mlr_pipeops,
mlr_pipeops_adas,
mlr_pipeops_blsmote,
mlr_pipeops_boxcox,
mlr_pipeops_branch,
mlr_pipeops_chunk,
mlr_pipeops_classbalancing,
mlr_pipeops_classifavg,
mlr_pipeops_classweights,
mlr_pipeops_classweightsex,
mlr_pipeops_colapply,
mlr_pipeops_collapsefactors,
mlr_pipeops_colroles,
mlr_pipeops_copy,
mlr_pipeops_datefeatures,
mlr_pipeops_decode,
mlr_pipeops_encode,
mlr_pipeops_encodeimpact,
mlr_pipeops_encodelmer,
mlr_pipeops_encodeplquantiles,
mlr_pipeops_encodepltree,
mlr_pipeops_featureunion,
mlr_pipeops_filter,
mlr_pipeops_fixfactors,
mlr_pipeops_histbin,
mlr_pipeops_ica,
mlr_pipeops_imputeconstant,
mlr_pipeops_imputehist,
mlr_pipeops_imputelearner,
mlr_pipeops_imputemean,
mlr_pipeops_imputemedian,
mlr_pipeops_imputemode,
mlr_pipeops_imputeoor,
mlr_pipeops_imputesample,
mlr_pipeops_info,
mlr_pipeops_isomap,
mlr_pipeops_kernelpca,
mlr_pipeops_learner,
mlr_pipeops_learner_pi_cvplus,
mlr_pipeops_learner_quantiles,
mlr_pipeops_missind,
mlr_pipeops_modelmatrix,
mlr_pipeops_multiplicityexply,
mlr_pipeops_multiplicityimply,
mlr_pipeops_mutate,
mlr_pipeops_nearmiss,
mlr_pipeops_nmf,
mlr_pipeops_nop,
mlr_pipeops_ovrsplit,
mlr_pipeops_ovrunite,
mlr_pipeops_pca,
mlr_pipeops_proxy,
mlr_pipeops_quantilebin,
mlr_pipeops_randomprojection,
mlr_pipeops_randomresponse,
mlr_pipeops_regravg,
mlr_pipeops_removeconstants,
mlr_pipeops_renamecolumns,
mlr_pipeops_replicate,
mlr_pipeops_rowapply,
mlr_pipeops_scale,
mlr_pipeops_scalemaxabs,
mlr_pipeops_scalerange,
mlr_pipeops_select,
mlr_pipeops_smote,
mlr_pipeops_smotenc,
mlr_pipeops_spatialsign,
mlr_pipeops_splines,
mlr_pipeops_subsample,
mlr_pipeops_targetinvert,
mlr_pipeops_targetmutate,
mlr_pipeops_targettrafoscalerange,
mlr_pipeops_textvectorizer,
mlr_pipeops_threshold,
mlr_pipeops_tomek,
mlr_pipeops_tunethreshold,
mlr_pipeops_unbranch,
mlr_pipeops_updatetarget,
mlr_pipeops_vtreat,
mlr_pipeops_yeojohnson
Examples
library("mlr3")
library("mlr3pipelines")  # provides PipeOpMaterialize
task = tsk("iris")
task$select("Petal.Length")$filter(1:10)
task$backend$colnames
#> [1] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width" "Species"
#> [6] "..row_id"
task$backend$nrow
#> [1] 150
pom = PipeOpMaterialize$new("materialize")
materialized = pom$train(list(task))[[1]]
materialized$backend$colnames
#> [1] "..row_id" "Petal.Length" "Species"
materialized$backend$nrow
#> [1] 10
