
Base class for filters. Predefined filters are stored in the dictionary mlr_filters. A Filter calculates a score for each feature of a task. Important features get a large value and unimportant features get a small value. Note that filter scores may also be negative.
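
As a brief illustration of the dictionary lookup described above, the following sketch lists the available keys and retrieves one predefined filter. It assumes the mlr3 and mlr3filters packages are installed; the "variance" key is used only as an example.

```r
library(mlr3)
library(mlr3filters)

# Keys of all predefined filters stored in the dictionary
keys = mlr_filters$keys()
head(keys)

# Retrieve a filter by key; flt() is the short form of mlr_filters$get()
filter = flt("variance")

# Inspect some of the fields documented on this page
filter$id
filter$task_types
filter$feature_types
filter$properties
```

No scores are computed at this point; the filter only holds its metadata until $calculate() is called on a task.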

Details

Some filters support partial scoring of the feature set: If nfeat is not NULL, only the best nfeat features are guaranteed to get a score. Additional features may be ignored for computational reasons, and then get a score value of NA.

Public fields

id

(character(1))
Identifier of the object. Used in tables, plot and text output.

label

(character(1))
Label for this object. Can be used in tables, plot and text output instead of the ID.

task_types

(character())
Set of supported task types, e.g. "classif" or "regr". Can be set to the scalar value NA to allow any task type.

For a complete list of possible task types (depending on the loaded packages), see mlr_reflections$task_types$type.

task_properties

(character())
Required task properties, see mlr3::Task.

param_set

(paradox::ParamSet)
Set of hyperparameters.

feature_types

(character())
Feature types of the filter.

packages

(character())
Packages this filter relies on.

man

(character(1))
String in the format [pkg]::[topic] pointing to a manual page for this object. Defaults to NA, but can be set by child classes.

scores

Stores the calculated filter score values as a named numeric vector. The vector is sorted in decreasing order with possible NA values last. The more important the feature, the higher the score. Tied values (this includes NA values) appear in a random, non-deterministic order.
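
To make the structure of this field concrete, here is a small sketch that computes scores on an example task and inspects them. The choice of the "iris" task and the "anova" filter is an assumption made for illustration only.

```r
library(mlr3)
library(mlr3filters)

task = tsk("iris")
filter = flt("anova")
filter$calculate(task)

# Named numeric vector: names are feature names, values are scores,
# sorted in decreasing order
filter$scores

# The same information as a table with one row per feature
as.data.table(filter)
```

Because ties are ordered randomly, code that depends on a reproducible ranking should not rely on the position of tied features.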

Active bindings

properties

(character())
Properties of the filter. Currently, only "missings" is supported. A filter has the property "missings" if and only if it can handle missing values in the features gracefully. Otherwise, an error is raised if missing values are detected.

hash

(character(1))
Hash (unique identifier) for this object.

phash

(character(1))
Hash (unique identifier) for this partial object, excluding some components which are varied systematically during tuning (parameter values) or feature selection (feature names).

Methods


Method new()

Create a Filter object.

Usage

Filter$new(
  id,
  task_types,
  task_properties = character(),
  param_set = ps(),
  feature_types = character(),
  packages = character(),
  label = NA_character_,
  man = NA_character_
)

Arguments

id

(character(1))
Identifier for the filter.

task_types

(character())
Types of task the filter can operate on, e.g. "classif" or "regr". Can be set to scalar NA to allow any task type.

task_properties

(character())
Required task properties, see mlr3::Task. Must be a subset of mlr_reflections$task_properties.

param_set

(paradox::ParamSet)
Set of hyperparameters.

feature_types

(character())
Feature types the filter operates on. Must be a subset of mlr_reflections$task_feature_types.

packages

(character())
Set of required packages. Note that these packages will be loaded via requireNamespace(), and are not attached.

label

(character(1))
Label for the new instance.

man

(character(1))
String in the format [pkg]::[topic] pointing to a manual page for this object. The referenced help page can be opened via method $help().
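
The constructor is normally called from the initialize() method of a subclass. The following is a minimal sketch under stated assumptions, not code from the package: FilterRange, its id "range", and its scoring rule (the spread between the largest and smallest value of each numeric feature) are hypothetical and chosen only to show how the arguments above fit together with private$.calculate().

```r
library(mlr3)
library(mlr3filters)

# Hypothetical filter for illustration; not part of mlr3filters
FilterRange = R6::R6Class("FilterRange",
  inherit = Filter,
  public = list(
    initialize = function() {
      super$initialize(
        id = "range",
        task_types = NA_character_,            # scalar NA: any task type
        feature_types = c("integer", "numeric"),
        label = "Range (illustrative)"
      )
    }
  ),
  private = list(
    # Must return a numeric vector named with feature names
    .calculate = function(task, nfeat) {
      data = task$data(cols = task$feature_names)
      vapply(data, function(x) diff(range(x)), numeric(1))
    }
  )
)

filter = FilterRange$new()
filter$calculate(tsk("iris"))
filter$scores
```

The subclass only returns raw scores; sorting and the handling of unscored features are left to the public $calculate() method of the base class.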


Method format()

Format helper for Filter class

Usage

Filter$format(...)

Arguments

...

(ignored).


Method print()

Printer for Filter class

Usage

Filter$print()


Method help()

Opens the corresponding help page referenced by field $man.

Usage

Filter$help()


Method calculate()

Calculates the filter score values for the provided mlr3::Task and stores them in field scores. nfeat determines the minimum number of features to score (see details), and defaults to the number of features in task. Loads required packages and then calls private$.calculate() of the respective subclass.

This private method is expected to return a numeric vector, uniquely named with (a subset of) feature names. The returned vector may have missing values. Features with missing values as well as features with no calculated score are automatically ranked last, in a random order. If the task has no rows, each feature gets the score NA.

Usage

Filter$calculate(task, nfeat = NULL)

Arguments

task

(mlr3::Task)
mlr3::Task to calculate the filter scores for.

nfeat

(integer())
The minimum number of features to calculate filter scores for.
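
A short sketch of the nfeat argument follows, assuming the "iris" task and the "variance" filter as examples. Whether a filter actually skips features beyond nfeat depends on the filter; many simply score every feature, so the only guarantee checked here is that at least nfeat scores are not NA.

```r
library(mlr3)
library(mlr3filters)

task = tsk("iris")
filter = flt("variance")

# Request scores for at least the 2 best features
filter$calculate(task, nfeat = 2)

# Every feature of the task appears in the result; unscored ones would be NA
filter$scores
n_scored = sum(!is.na(filter$scores))
n_scored
```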


Method clone()

The objects of this class are cloneable with this method.

Usage

Filter$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.
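
As an illustration of why a deep clone can matter, the sketch below copies a filter and checks that the copy carries its own parameter set. It assumes the usual R6 behaviour that fields holding R6 objects, such as param_set, are copied by a deep clone and shared by a shallow one; the "variance" filter is used only as an example.

```r
library(mlr3)
library(mlr3filters)

filter = flt("variance")
copy = filter$clone(deep = TRUE)

# The deep clone holds a separate ParamSet object, so changing
# hyperparameters on the copy leaves the original untouched
shares_param_set = identical(copy$param_set, filter$param_set)
shares_param_set

# Scores calculated on the copy are stored on the copy only
copy$calculate(tsk("iris"))
copy$scores
```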