Base class for filters. Predefined filters are stored in the dictionary mlr_filters. A Filter calculates a score for each feature of a task. Important features get a large value and unimportant features get a small value. Note that filter scores may also be negative.
Details
Some filters support partial scoring of the feature set:
If nfeat is not NULL, only the best nfeat features are guaranteed to
get a score. Additional features may be ignored for computational reasons,
and then get a score value of NA.
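As a minimal sketch of partial scoring, the snippet below asks a filter for only the two best features. It assumes that the mlr3, mlr3filters and praznik packages are installed, and that the "jmim" filter is one of the filters that honor nfeat; which filters support partial scoring is not stated in this section.

```r
library(mlr3)
library(mlr3filters)

# The JMIM filter relies on the praznik package, so only run if it is available.
if (requireNamespace("praznik", quietly = TRUE)) {
  task = tsk("iris")      # classification task with 4 features
  filter = flt("jmim")    # assumed here to support partial scoring

  # Request scores for at least the 2 best features.
  filter$calculate(task, nfeat = 2)

  # Features that were skipped carry the score NA and are listed last.
  print(filter$scores)
  print(sum(!is.na(filter$scores)))
}
```

With a filter that does not support partial scoring, nfeat may simply have no effect and all features receive a score.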
See also
Other Filter:
mlr_filters,
mlr_filters_anova,
mlr_filters_auc,
mlr_filters_boruta,
mlr_filters_carscore,
mlr_filters_carsurvscore,
mlr_filters_cmim,
mlr_filters_correlation,
mlr_filters_disr,
mlr_filters_find_correlation,
mlr_filters_importance,
mlr_filters_information_gain,
mlr_filters_jmi,
mlr_filters_jmim,
mlr_filters_kruskal_test,
mlr_filters_mim,
mlr_filters_mrmr,
mlr_filters_njmim,
mlr_filters_performance,
mlr_filters_permutation,
mlr_filters_relief,
mlr_filters_selected_features,
mlr_filters_univariate_cox,
mlr_filters_variance
Public fields
- id
- (character(1))
 Identifier of the object. Used in tables, plot and text output.
- label
- (character(1))
 Label for this object. Can be used in tables, plot and text output instead of the ID.
- task_types
- (character())
 Set of supported task types, e.g. "classif" or "regr". Can be set to the scalar value NA to allow any task type. For a complete list of possible task types (depending on the loaded packages), see mlr_reflections$task_types$type.
- task_properties
- (character())
 Task properties required by the filter, see mlr3::Task.
- feature_types
- (character())
 Feature types of the filter.
- packages
- (character())
 Packages which this filter is relying on.
- man
- (character(1))
 String in the format [pkg]::[topic] pointing to a manual page for this object. Defaults to NA, but can be set by child classes.
- scores
- Stores the calculated filter score values as a named numeric vector. The vector is sorted in decreasing order with possible NA values last. The more important the feature, the higher the score. Tied values (this includes NA values) appear in a random, non-deterministic order.
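The public fields can be inspected on any predefined filter. The following sketch assumes that mlr3 and mlr3filters are installed and uses the "variance" filter and the "mtcars" task purely as examples; the printed values depend on the installed package versions.

```r
library(mlr3)
library(mlr3filters)

filter = flt("variance")   # retrieve a predefined filter from mlr_filters

# Static meta information
print(filter$id)
print(filter$task_types)
print(filter$feature_types)
print(filter$packages)

# Scores are only filled in after calculate() has been called.
task = tsk("mtcars")       # regression task with 10 numeric features
filter$calculate(task)

# Named numeric vector, sorted in decreasing order
print(filter$scores)
```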
Active bindings
- param_set
- (paradox::ParamSet) 
 Set of hyperparameters.
- properties
- (character())
 Properties of the filter. Currently, only "missings" is supported. A filter has the property "missings" if and only if it can handle missing values in the features gracefully. Otherwise, an assertion error is raised if missing values are detected.
- hash
- (character(1))
 Hash (unique identifier) for this object.
- phash
- (character(1))
 Hash (unique identifier) for this partial object, excluding some components which are varied systematically during tuning (parameter values) or feature selection (feature names).
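To illustrate the active bindings, the sketch below creates the same filter twice with different hyperparameter values. It assumes that the "correlation" filter exposes a `method` parameter accepting "pearson" and "spearman"; the comparison of the hashes reflects the description above rather than a guaranteed result.

```r
library(mlr3)
library(mlr3filters)

# Two instances of the same filter that differ only in a hyperparameter value
f1 = flt("correlation", method = "pearson")
f2 = flt("correlation", method = "spearman")

print(f1$param_set$values)   # current hyperparameter values
print(f1$properties)         # contains "missings" if missing values are handled

# hash is meant to identify the full object, so it should differ here.
print(identical(f1$hash, f2$hash))

# phash excludes parameter values, so it should be the same for both.
print(identical(f1$phash, f2$phash))
```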
Methods
Method new()
Create a Filter object.
Arguments
- id
- (character(1))
 Identifier for the filter.
- task_types
- (character())
 Types of the task the filter can operate on, e.g. "classif" or "regr". Can be set to scalar NA to allow any task type.
- task_properties
- (character())
 Required task properties, see mlr3::Task. Must be a subset of mlr_reflections$task_properties.
- param_set
- (paradox::ParamSet) 
 Set of hyperparameters.
- feature_types
- (character())
 Feature types the filter operates on. Must be a subset of mlr_reflections$task_feature_types.
- packages
- (character())
 Set of required packages. Note that these packages will be loaded via requireNamespace(), and are not attached.
- label
- (character(1))
 Label for the new instance.
- man
- (character(1))
 String in the format [pkg]::[topic] pointing to a manual page for this object. The referenced help page can be opened via method $help().
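A custom filter is typically written as an R6 subclass that forwards these arguments to the parent constructor and implements the private `.calculate()` method. The class below is a hypothetical example, not part of mlr3filters: `FilterDistinct`, its id and its scoring rule (number of distinct values per feature) are invented for illustration, and the R6, paradox, mlr3 and mlr3filters packages are assumed to be installed.

```r
library(mlr3)
library(mlr3filters)

# Hypothetical filter: scores each feature by its number of distinct values.
FilterDistinct = R6::R6Class("FilterDistinct",
  inherit = Filter,
  public = list(
    initialize = function() {
      super$initialize(
        id = "distinct",
        task_types = NA_character_,            # scalar NA: any task type
        task_properties = character(),         # no special task properties required
        param_set = paradox::ps(),             # no hyperparameters
        feature_types = c("integer", "numeric"),
        packages = character(),                # no additional packages needed
        label = "Number of distinct values",
        man = NA_character_                    # no manual page
      )
    }
  ),
  private = list(
    # Must return a numeric vector named with (a subset of) the feature names.
    .calculate = function(task, nfeat) {
      data = task$data(cols = task$feature_names)
      vapply(data, function(x) as.numeric(length(unique(x))), numeric(1))
    }
  )
)

f = FilterDistinct$new()
f$calculate(tsk("mtcars"))
print(f$scores)
```

Passing every constructor argument explicitly keeps the sketch independent of the defaults of the installed version.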
Method calculate()
Calculates the filter score values for the provided mlr3::Task and
stores them in field scores. nfeat determines the minimum number of
features to score (see details), and defaults to the number
of features in task. Loads required packages and then calls
private$.calculate() of the respective subclass.
This private method is expected to return a numeric vector, uniquely named
with (a subset of) feature names. The returned vector may have missing
values.
Features with a missing score as well as features with no calculated
score are automatically ranked last, in a random order.
If the task has no rows, each feature gets the score NA.
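The ranking rules can be sketched with a toy filter whose `.calculate()` returns scores for only a subset of the features, one of them NA. `FilterDemo` and its hard-coded scores are hypothetical and exist only to show how `calculate()` post-processes the returned vector; R6, paradox, data.table, mlr3 and mlr3filters are assumed to be installed.

```r
library(mlr3)
library(mlr3filters)

# Hypothetical filter returning fixed scores for three mtcars features only.
FilterDemo = R6::R6Class("FilterDemo",
  inherit = Filter,
  public = list(
    initialize = function() {
      super$initialize(
        id = "demo",
        task_types = NA_character_,
        task_properties = character(),
        param_set = paradox::ps(),
        feature_types = c("integer", "numeric"),
        packages = character(),
        label = "Demo filter",
        man = NA_character_
      )
    }
  ),
  private = list(
    .calculate = function(task, nfeat) {
      # "cyl" gets a missing score, all other features are not scored at all.
      c(hp = 1, wt = 3, cyl = NA_real_)
    }
  )
)

f = FilterDemo$new()
f$calculate(tsk("mtcars"))

# Scored features come first in decreasing order; features with NA or
# without a score follow in random order.
print(f$scores)

# Tabular view of the same information
print(data.table::as.data.table(f))
```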
Arguments
- task
- (mlr3::Task) 
 mlr3::Task to calculate the filter scores for.
- nfeat
- (integer())
 The minimum number of features to calculate filter scores for.
