A Selector function is used by different PipeOps, most prominently PipeOpSelect and many PipeOps inheriting from PipeOpTaskPreproc, to determine the subset of a Task's features to operate on.

Although a Selector is an ordinary function and can be written by hand, it is preferable to use the Selector constructors shown here. Each constructor can be called with its arguments to create a Selector, which can then be passed to the selector parameter of PipeOpSelect or to the affect_columns parameter of many PipeOpTaskPreprocs. See those pages for examples of this usage.
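A minimal sketch of how a constructed Selector might be handed to a PipeOp, assuming the mlr3 and mlr3pipelines packages are installed. The iris task and the variable names are chosen for illustration only; "select" and "scale" are the dictionary keys of PipeOpSelect and PipeOpScale.

```r
library("mlr3")
library("mlr3pipelines")

task = tsk("iris")

# PipeOpSelect: keep only the features whose names start with "Sepal".
po_select = po("select", selector = selector_grep("^Sepal"))
selected = po_select$train(list(task))[[1]]
selected$feature_names

# PipeOpScale (a PipeOpTaskPreproc): scale only the "Petal" columns.
# Features not matched by affect_columns pass through unchanged.
po_scale = po("scale", affect_columns = selector_grep("^Petal"))
scaled = po_scale$train(list(task))[[1]]
scaled$feature_names
```

In the first case the Selector decides which features survive; in the second it only decides which features the operation touches, so the resulting task still contains all four features.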

## Usage

selector_all()

selector_none()

selector_type(types)

selector_grep(pattern, ignore.case = FALSE, perl = FALSE, fixed = FALSE)

selector_name(feature_names, assert_present = FALSE)

selector_invert(selector)

selector_intersect(selector_x, selector_y)

selector_union(selector_x, selector_y)

selector_setdiff(selector_x, selector_y)

selector_missing()

selector_cardinality_greater_than(min_cardinality)

## Arguments

types

(character)
Type of feature to select

pattern

(character(1))
Pattern passed to grep() and matched against the feature names.

ignore.case

(logical(1))
Whether to ignore case when matching; passed to grep().

perl

(logical(1))
Whether to use Perl-compatible regular expressions; passed to grep().

fixed

(logical(1))
Whether to treat pattern as a fixed string instead of a regular expression; passed to grep().

feature_names

(character)
Select features by exact name match.

assert_present

(logical(1))
Throw an error if feature_names are not all present in the task being operated on.

selector

(Selector)
Selector to invert.

selector_x

(Selector)
First Selector to query.

selector_y

(Selector)
Second Selector to query.

min_cardinality

(integer)
Minimum number of levels required to be selected.
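The effect of assert_present can be sketched as follows, assuming mlr3 and mlr3pipelines are installed. The feature name "not_a_feature" is a hypothetical name that does not exist in the iris task, and the variable names are illustrative.

```r
library("mlr3")
library("mlr3pipelines")

task = tsk("iris")
wanted = c("Sepal.Length", "not_a_feature")  # second name is deliberately absent

# Default (assert_present = FALSE): names missing from the task are ignored.
lenient = selector_name(wanted)
lenient_result = lenient(task)
lenient_result

# assert_present = TRUE: the Selector raises an error when applied to a task
# that lacks one of the names. The error is captured here instead of aborting.
strict = selector_name(wanted, assert_present = TRUE)
strict_result = tryCatch(strict(task), error = function(e) e)
inherits(strict_result, "error")
```

Note that the check happens when the Selector is applied to a task, not when selector_name() is called, because only then are the task's feature names known.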

## Value

function: A Selector function that takes a Task and returns the feature names to be processed.

## Functions

• selector_all: selector_all selects all features.

• selector_none: selector_none selects none of the features.

• selector_type: selector_type selects features according to type. Legal types are listed in mlr_reflections\$task_feature_types.

• selector_grep: selector_grep selects features with names matching the grep() pattern.

• selector_name: selector_name selects features with names matching exactly the names listed.

• selector_invert: selector_invert inverts a given Selector: It always selects the features that would be dropped by the other Selector, and drops the features that would be kept.

• selector_intersect: selector_intersect selects the intersection of two Selectors: Only features selected by both Selectors are selected in the end.

• selector_union: selector_union selects the union of two Selectors: Features selected by either Selector are selected in the end.

• selector_setdiff: selector_setdiff selects the setdiff of two Selectors: Features selected by selector_x are selected, unless they are also selected by selector_y.

• selector_missing: selector_missing selects features with missing values.

• selector_cardinality_greater_than: selector_cardinality_greater_than selects categorical features with cardinality greater than a given threshold.
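The set-style combinators listed above can be illustrated with a short sketch on the iris task, assuming mlr3 and mlr3pipelines are installed. The two base Selectors and all variable names are illustrative choices.

```r
library("mlr3")
library("mlr3pipelines")

task = tsk("iris")

sel_sepal = selector_grep("^Sepal")  # names starting with "Sepal"
sel_width = selector_grep("Width$")  # names ending in "Width"

# Selected by both Selectors.
both = selector_intersect(sel_sepal, sel_width)(task)

# Selected by at least one of the Selectors.
either = selector_union(sel_sepal, sel_width)(task)

# Selected by sel_sepal, unless also selected by sel_width.
sepal_only = selector_setdiff(sel_sepal, sel_width)(task)

# Everything that sel_sepal would drop.
not_sepal = selector_invert(sel_sepal)(task)

list(both = both, either = either, sepal_only = sepal_only, not_sepal = not_sepal)
```

Because every combinator returns a Selector again, the results can be nested further, for example selector_invert(selector_union(sel_sepal, sel_width)).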

## Details

A Selector is a function that has one input argument (commonly named task). The function is called with the Task that a PipeOp is operating on. The return value of the function must be a character vector that is a subset of the feature names present in the Task.

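For example, a Selector that selects all columns is function(task) task\$feature_names (this is the selector_all()-Selector).

A Selector that selects all columns that have names shorter than four letters would be function(task) task\$feature_names[nchar(task\$feature_names) < 4].

A Selector that selects only the column "Sepal.Length" (as in the iris task), if present, is function(task) intersect(task\$feature_names, "Sepal.Length").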

It is preferable to use the Selector construction functions like selector_type, selector_grep etc. if possible, instead of writing custom Selectors.
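To illustrate this recommendation, the following sketch compares a hand-written Selector with an equivalent one built from the constructors, assuming mlr3 and mlr3pipelines are installed. The selection rule (numeric features whose names end in "Width") and the names custom_selector and built_selector are hypothetical and chosen only for this example.

```r
library("mlr3")
library("mlr3pipelines")

task = tsk("iris")

# Hand-written Selector: one argument (the task), returns a character vector
# that is a subset of the task's feature names.
custom_selector = function(task) {
  ft = task$feature_types  # table with columns "id" and "type"
  ft$id[ft$type == "numeric" & grepl("Width$", ft$id)]
}

# The same rule expressed with the Selector constructors.
built_selector = selector_intersect(
  selector_type("numeric"),
  selector_grep("Width$")
)

custom_result = custom_selector(task)
built_result = built_selector(task)
setequal(custom_result, built_result)
```

Both Selectors pick the same features here; the constructed version is shorter and states its intent through the constructor names.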

## See also

Other Selectors: mlr_pipeops_select

## Examples

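library("mlr3")
library("mlr3pipelines")

# The two tasks are inferred from the outputs below: iris and Boston housing.
# The task key "boston_housing" is an assumption and may differ between mlr3 versions.
iris_task = tsk("iris")
bh_task = tsk("boston_housing")

sela = selector_all()
sela(iris_task)
#> [1] "Petal.Length" "Petal.Width"  "Sepal.Length" "Sepal.Width"
sela(bh_task)
#>  [1] "age"     "b"       "chas"    "cmedv"   "crim"    "dis"     "indus"
#>  [8] "lat"     "lon"     "lstat"   "nox"     "ptratio" "rad"     "rm"
#> [15] "tax"     "town"    "tract"   "zn"

self = selector_type("factor")
self(iris_task)
#> character(0)
self(bh_task)
#> [1] "chas" "town"

selg = selector_grep("a.*i")
selg(iris_task)
#> [1] "Petal.Width" "Sepal.Width"
selg(bh_task)
#> [1] "ptratio"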

selgi = selector_invert(selg)