`openml.datasets`.list_datasets

Return a list of all datasets that are on OpenML. Supports large numbers of results.

Parameters:

data_id: list, optional: A list of data ids, to specify which datasets should be listed.
offset: int, optional: The number of datasets to skip, starting from the first.
size: int, optional: The maximum number of datasets to show.
status: str, optional: One of {active, in_preparation, deactivated}. By default only active datasets are returned; datasets with another status can be requested explicitly.
tag: str, optional
output_format: str, optional (default='dict'): Decides the format of the output.
- If 'dict', the output is a dict of dicts.
- If 'dataframe', the output is a pandas DataFrame.
kwargs: dict, optional: Legal filter operators (keys in the dict): data_name, data_version, number_instances, number_features, number_classes, number_missing_values.
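
The offset and size parameters suggest that large listings can be read page by page. The following is a minimal sketch of that idea, not part of the library: `iter_dataset_pages` and `fake_fetch` are hypothetical names, and the loop assumes the dict output format and that an empty or short page marks the end of the listing. A stand-in callable is used so the sketch runs without network access.

```python
from typing import Any, Callable, Dict, Iterator

Page = Dict[int, Dict[str, Any]]


def iter_dataset_pages(
    fetch: Callable[..., Page],
    page_size: int = 1000,
    **filters: Any,
) -> Iterator[Page]:
    """Yield successive pages from a list_datasets-style callable.

    Assumption: `fetch` accepts `offset` and `size` keyword arguments and
    returns a dict keyed by dataset ID (the 'dict' output format).
    """
    offset = 0
    while True:
        page = fetch(offset=offset, size=page_size, **filters)
        if not page:
            break  # nothing left to list
        yield page
        if len(page) < page_size:
            break  # a short page means this was the last one
        offset += page_size


def fake_fetch(offset: int = 0, size: int = 10, **filters: Any) -> Page:
    """Stand-in for the real listing call: 25 made-up dataset IDs."""
    ids = range(1, 26)
    return {i: {"did": i} for i in ids[offset:offset + size]}


pages = list(iter_dataset_pages(fake_fetch, page_size=10))
print([len(page) for page in pages])
```

With the library installed, `openml.datasets.list_datasets` could presumably be passed as `fetch`, together with filters such as `status="active"` or `tag=...`; that usage is untested here.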

Returns:

datasets: dict of dicts, or dataframe

If output_format='dict'
A mapping from dataset ID to dict. Every dataset is represented by a dictionary containing the following information:
- dataset id
- name
- format
- status
If qualities are calculated for the dataset, some of these are also returned.

If output_format='dataframe'
Each row maps to a dataset. The columns contain the following information:
- dataset id
- name
- format
- status
If qualities are calculated for the dataset, some of these are also included as columns.
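
As an illustration of the dict-of-dicts shape described above, here is a small sketch that filters such a mapping by status. The sample is hand-written, not real OpenML output. The key names `did` and `NumberOfInstances` are assumptions about how the dataset id and one quality are spelled, and `names_by_status` is a hypothetical helper.

```python
from typing import Any, Dict, List

# Hand-written sample in the documented shape (dataset ID -> dict).
# Values are invented for illustration only.
sample: Dict[int, Dict[str, Any]] = {
    1: {"did": 1, "name": "example_a", "format": "ARFF", "status": "active"},
    2: {"did": 2, "name": "example_b", "format": "ARFF", "status": "deactivated"},
    3: {
        "did": 3,
        "name": "example_c",
        "format": "Sparse_ARFF",
        "status": "active",
        "NumberOfInstances": 150,  # a quality; present only if calculated
    },
}


def names_by_status(datasets: Dict[int, Dict[str, Any]], status: str) -> List[str]:
    """Return the sorted names of all datasets with the given status."""
    return sorted(
        entry["name"]
        for entry in datasets.values()
        if entry.get("status") == status
    )


active = names_by_status(sample, "active")
print(active)

# Qualities may be missing, so read them with .get() rather than indexing.
sizes = {did: entry.get("NumberOfInstances") for did, entry in sample.items()}
print(sizes)
```

Because qualities are only returned when they have been calculated, reading them with `.get()` avoids a `KeyError` for datasets that lack them.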
