Benchmark suites

This is a brief showcase of OpenML benchmark suites, which were introduced by Bischl et al. (2019). Benchmark suites standardize the datasets and splits to be used in an experiment or paper. They are fully integrated into OpenML and simplify sharing both the experimental setup and the results.

# License: BSD 3-Clause

import openml

OpenML-CC18

As an example we have a look at the OpenML-CC18, a suite of 72 classification datasets from OpenML that were carefully selected to be usable by many algorithms and to represent datasets commonly used in machine learning research. These are all datasets available in mid-2018 that satisfy a large set of clear requirements for thorough yet practical benchmarking:

  1. the number of observations is between 500 and 100,000, to focus on medium-sized datasets;

  2. the number of features does not exceed 5,000, to keep the runtime of the algorithms low;

  3. the target attribute has at least two classes, with no class having fewer than 20 observations;

  4. the ratio of the minority class to the majority class is above 0.05, to eliminate highly imbalanced datasets that require special treatment for both algorithms and evaluation measures.

A full description can be found in the OpenML benchmarking docs.
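The selection criteria above can be sketched as a simple filter. The helper below is hypothetical and not part of the openml API; it only restates the four rules in code:

```python
def satisfies_cc18_criteria(n_obs, n_features, class_counts):
    """Check the CC18-style selection criteria listed above.

    class_counts: per-class observation counts, e.g. [600, 400].
    """
    if not (500 <= n_obs <= 100_000):  # medium-sized datasets only
        return False
    if n_features > 5_000:  # keep algorithm runtimes low
        return False
    # at least two classes, each with at least 20 observations
    if len(class_counts) < 2 or min(class_counts) < 20:
        return False
    # minority/majority ratio above 0.05 to exclude extreme imbalance
    if min(class_counts) / max(class_counts) <= 0.05:
        return False
    return True


print(satisfies_cc18_criteria(1000, 20, [600, 400]))  # True
print(satisfies_cc18_criteria(1000, 20, [990, 10]))   # False: a class has < 20 observations
```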

In this example we’ll focus on how to use benchmark suites in practice.

Downloading benchmark suites

suite = openml.study.get_suite(99)
print(suite)

Out:

OpenML Benchmark Suite
======================
ID..............: 99
Name............: OpenML Benchmarking Suites and the OpenML-CC18
Status..........: in_preparation
Main Entity Type: task
Study URL.......: https://www.openml.org/s/99
# of Data.......: 72
# of Tasks......: 72
Creator.........: https://www.openml.org/u/1
Upload Time.....: 2019-02-21 18:47:13

Downloading a benchmark suite does not download the included tasks and datasets; the suite object only contains the IDs of the tasks that constitute it.

Tasks can then be accessed via

tasks = suite.tasks
print(tasks)

Out:

[3, 6, 11, 12, 14, 15, 16, 18, 22, 23, 28, 29, 31, 32, 37, 43, 45, 49, 53, 219, 2074, 2079, 3021, 3022, 3481, 3549, 3560, 3573, 3902, 3903, 3904, 3913, 3917, 3918, 7592, 9910, 9946, 9952, 9957, 9960, 9964, 9971, 9976, 9977, 9978, 9981, 9985, 10093, 10101, 14952, 14954, 14965, 14969, 14970, 125920, 125922, 146195, 146800, 146817, 146819, 146820, 146821, 146822, 146824, 146825, 167119, 167120, 167121, 167124, 167125, 167140, 167141]

and iterated over for benchmarking. For speed, we iterate over only the first three tasks:

for task_id in tasks[:3]:
    task = openml.tasks.get_task(task_id)
    print(task)

Out:

OpenML Classification Task
==========================
Task Type Description: https://www.openml.org/tt/1
Task ID..............: 3
Task URL.............: https://www.openml.org/t/3
Estimation Procedure.: crossvalidation
Target Feature.......: class
# of Classes.........: 2
Cost Matrix..........: Available
OpenML Classification Task
==========================
Task Type Description: https://www.openml.org/tt/1
Task ID..............: 6
Task URL.............: https://www.openml.org/t/6
Estimation Procedure.: crossvalidation
Target Feature.......: class
# of Classes.........: 26
Cost Matrix..........: Available
OpenML Classification Task
==========================
Task Type Description: https://www.openml.org/tt/1
Task ID..............: 11
Task URL.............: https://www.openml.org/t/11
Estimation Procedure.: crossvalidation
Target Feature.......: class
# of Classes.........: 3
Cost Matrix..........: Available