Area under the (ROC) curve filter, computed analogously to mlr3measures::auc(). Missing feature values are removed before the AUC is calculated. If the AUC is undefined for the input, it is set to 0.5 (the value of a random classifier). The final filter score is the absolute difference between the AUC and 0.5, so a feature that separates the classes in either direction receives a high score.
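The scoring rule described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: `auc_score()` is a hypothetical helper, and it computes the AUC with the rank-based (Mann-Whitney) formula rather than by calling mlr3measures::auc().

```r
# Hypothetical helper, NOT part of mlr3filters: scores one numeric feature `x`
# against a two-level factor `truth` as |AUC - 0.5|.
auc_score = function(x, truth, positive = levels(truth)[1L]) {
  keep = !is.na(x)                  # remove missing feature values first
  x = x[keep]
  is_pos = truth[keep] == positive
  n_pos = sum(is_pos)
  n_neg = sum(!is_pos)
  if (n_pos == 0L || n_neg == 0L) {
    auc = 0.5                       # AUC undefined: use the random-classifier value
  } else {
    # Rank-based (Mann-Whitney) AUC; ties receive average ranks
    r = rank(x, ties.method = "average")
    auc = (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
  }
  abs(auc - 0.5)
}

truth = factor(c("pos", "pos", "pos", "neg", "neg", "neg"), levels = c("pos", "neg"))
x_good = c(0.9, 0.8, NA, 0.2, 0.1, 0.3)  # separates the classes perfectly
x_flip = -x_good                         # same information, reversed direction
one_class = factor(rep("pos", 3), levels = c("pos", "neg"))

score_good = auc_score(x_good, truth)
score_flip = auc_score(x_flip, truth)
score_undefined = auc_score(c(1, 2, 3), one_class)
```

In this sketch, `x_good` and `x_flip` receive the same score because only the distance of the AUC from 0.5 matters, while the single-class input falls back to an AUC of 0.5 and therefore a score of 0.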

References

For a benchmark of filter methods:

Bommert A, Sun X, Bischl B, Rahnenführer J, Lang M (2020). “Benchmark for filter methods for feature selection in high-dimensional classification data.” Computational Statistics & Data Analysis, 143, 106839. doi:10.1016/j.csda.2019.106839.

Super class

mlr3filters::Filter -> FilterAUC

Methods

Method new()

Create a FilterAUC object.

Usage

FilterAUC$new()


Method clone()

The objects of this class are cloneable with this method.

Usage

FilterAUC$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.

Examples

task = mlr3::tsk("pima")
filter = flt("auc")
filter$calculate(task)
head(as.data.table(filter), 3)
#>    feature     score
#> 1: glucose 0.2927906
#> 2: insulin 0.2316288
#> 3:    mass 0.1870358

if (mlr3misc::require_namespaces(c("mlr3pipelines", "rpart"), quietly = TRUE)) {
  library("mlr3pipelines")
  task = mlr3::tsk("spam")

  # Note: the value of `filter.frac` is chosen arbitrarily here and should be tuned.

  graph = po("filter", filter = flt("auc"), filter.frac = 0.5) %>>%
    po("learner", mlr3::lrn("classif.rpart"))

  graph$train(task)
}
#> $classif.rpart.output
#> NULL
#>