List, Download, and Upload Suites
How to list, download and upload benchmark suites.
import uuid
import numpy as np
import openml
Listing suites
- Use the output_format parameter to select the output type
- The default gives a dict, but we'll use a dataframe to obtain an easier-to-work-with data structure
suites = openml.study.list_suites(status="all")
print(suites.head(n=10))
Downloading suites
This is done based on the suite ID.
suite = openml.study.get_suite(99)
print(suite)
Suites also feature a description:
print(suite.description)
Suites are a container for tasks:
print(suite.tasks)
And we can use the task listing functionality to learn more about them:
tasks = openml.tasks.list_tasks()
Using @ in pd.DataFrame.query accesses variables outside of the current dataframe.
tasks = tasks.query("tid in @suite.tasks")
print(tasks.describe().transpose())
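To see what @ does in isolation, here is a minimal sketch on a toy dataframe. Only the tid column name is taken from the tutorial; the second column, its values, and the wanted_ids list are made up for illustration and stand in for the real task listing and suite.tasks.

```python
import pandas as pd

# Toy stand-in for the task listing; the values are illustrative only.
tasks = pd.DataFrame(
    {
        "tid": [1, 2, 3, 4, 5],
        "NumberOfInstances": [100, 250, 50, 1000, 75],
    }
)

# Hypothetical list of task IDs, playing the role of suite.tasks.
wanted_ids = [2, 4, 5]

# "@wanted_ids" refers to the Python variable, while "tid" refers to the column.
selected = tasks.query("tid in @wanted_ids")
print(selected)
```

Without the @ prefix, query would look for a column named wanted_ids rather than the variable.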
We'll use the test server for the rest of this tutorial.
openml.config.start_using_configuration_for_example()
Uploading suites
Uploading suites is as simple as uploading any other kind of OpenML entity. The only reason this example needs so much code is that we upload some random data.
We'll take a random subset of 20 of the tasks available on the test server:
all_tasks = list(openml.tasks.list_tasks()["tid"])
task_ids_for_suite = sorted(np.random.choice(all_tasks, replace=False, size=20))
# The study needs a machine-readable and unique alias. To obtain this,
# we simply generate a random uuid.
alias = uuid.uuid4().hex
new_suite = openml.study.create_benchmark_suite(
    name="Test-Suite",
    description="Test suite for the Python tutorial on benchmark suites",
    task_ids=task_ids_for_suite,
    alias=alias,
)
new_suite.publish()
print(new_suite)
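The sampling step above raises an error if the server lists fewer than 20 tasks, because sampling without replacement cannot draw more items than exist. The following sketch shows one possible way to guard against that and to make the draw reproducible. The helper sample_task_ids and its seed parameter are hypothetical and not part of openml; the task list is a stand-in so the snippet runs without a server connection.

```python
import uuid

import numpy as np


def sample_task_ids(all_tasks, size=20, seed=None):
    """Return a sorted random subset of task IDs as plain Python ints."""
    rng = np.random.default_rng(seed)
    # Sampling without replacement fails if size exceeds the population,
    # so cap the requested size at the number of available tasks.
    size = min(size, len(all_tasks))
    chosen = rng.choice(all_tasks, size=size, replace=False)
    return sorted(int(tid) for tid in chosen)


# Stand-in for list(openml.tasks.list_tasks()["tid"]).
all_tasks = list(range(1, 13))
task_ids_for_suite = sample_task_ids(all_tasks, size=20, seed=0)

# uuid4().hex is a 32-character hexadecimal string, different on every call.
alias = uuid.uuid4().hex

print(task_ids_for_suite)
print(alias)
```

Converting the sampled values to plain int is a defensive choice here, since NumPy returns its own integer type.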
openml.config.stop_using_configuration_for_example()
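If the upload fails partway through, the stop call above is never reached and the session keeps pointing at the test server. One way to avoid that is to pair the start and stop calls in a context manager. This is a sketch only: example_configuration is a hypothetical helper, not an openml function, and the lambdas are stand-ins so the snippet runs offline.

```python
from contextlib import contextmanager


@contextmanager
def example_configuration(start, stop):
    """Call start() on entry and stop() on exit, even if the body raises."""
    start()
    try:
        yield
    finally:
        stop()


# Stand-ins that only record the calls. With openml installed, one would pass
# openml.config.start_using_configuration_for_example and
# openml.config.stop_using_configuration_for_example instead.
calls = []

try:
    with example_configuration(
        lambda: calls.append("start"), lambda: calls.append("stop")
    ):
        raise RuntimeError("simulated upload failure")
except RuntimeError:
    pass

print(calls)
```

The stop callable runs even though the body raised, which is the behaviour one would want around the publish step.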