openml.tasks.OpenMLClassificationTask

class openml.tasks.OpenMLClassificationTask(task_type_id: TaskType, task_type: str, data_set_id: int, target_name: str, estimation_procedure_id: int = 1, estimation_procedure_type: str | None = None, estimation_parameters: dict[str, str] | None = None, evaluation_measure: str | None = None, data_splits_url: str | None = None, task_id: int | None = None, class_labels: list[str] | None = None, cost_matrix: np.ndarray | None = None)

OpenML Classification object.

Parameters:
task_type_id : TaskType

ID of the Classification task type.

task_type : str

Name of the Classification task type.

data_set_id : int

ID of the OpenML dataset associated with the Classification task.

target_name : str

Name of the target variable.

estimation_procedure_id : int, default=1

ID of the estimation procedure for the Classification task.

estimation_procedure_type : str, default=None

Type of the estimation procedure.

estimation_parameters : dict, default=None

Estimation parameters for the Classification task.

evaluation_measure : str, default=None

Name of the evaluation measure.

data_splits_url : str, default=None

URL of the data splits for the Classification task.

task_id : int, default=None

ID of the Classification task (if it already exists on OpenML).

class_labels : list of str, default=None

A list of class labels (for classification tasks).

cost_matrix : np.ndarray, default=None

A cost matrix (for classification tasks).

download_split() → OpenMLSplit

Download the OpenML split for a given task.

property estimation_parameters: dict[str, str] | None

Return the estimation parameters for the task.

get_X_and_y(dataset_format: Literal['dataframe', 'array'] = 'array') → tuple[np.ndarray | pd.DataFrame | scipy.sparse.spmatrix, np.ndarray | pd.Series | pd.DataFrame | None]

Get data associated with the current task.

Parameters:
dataset_format : str

Data structure of the returned data. See openml.datasets.OpenMLDataset.get_data() for possible options.

Returns:
tuple - X and y
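
As a minimal sketch of how the returned pair might be used, the helper below counts how often each target value occurs. The function name `class_counts` is hypothetical and not part of OpenML; it only assumes an object that exposes `get_X_and_y()` with the signature documented above.

```python
from collections import Counter


def class_counts(task):
    """Count how often each target value occurs in the task's data.

    `task` is assumed to be any object exposing `get_X_and_y()` as
    documented above, e.g. an OpenMLClassificationTask.
    """
    # X holds the features, y the target values (y may be None).
    _X, y = task.get_X_and_y(dataset_format="dataframe")
    if y is None:
        raise ValueError("The task returned no target values.")
    # Iterating over y yields one target value per row.
    return Counter(y)
```

With a real task, one would typically obtain `task` first, for example through `openml.tasks.get_task(task_id)`, which needs network access to the OpenML server.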

get_dataset() → OpenMLDataset

Download dataset associated with task.

get_split_dimensions() → tuple[int, int, int]

Get the (repeats, folds, samples) of the split for a given task.

get_train_test_split_indices(fold: int = 0, repeat: int = 0, sample: int = 0) → tuple[np.ndarray, np.ndarray]

Get the indices of the train and test splits for a given task.
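
The split dimensions and the per-split indices can be combined to walk over every predefined split of a task. The generator below is a minimal sketch under that assumption; `iter_splits` is a hypothetical helper name, and it relies only on `get_split_dimensions()` and `get_train_test_split_indices()` as documented here.

```python
from itertools import product


def iter_splits(task):
    """Yield ((repeat, fold, sample), train_indices, test_indices).

    `task` is assumed to expose `get_split_dimensions()` and
    `get_train_test_split_indices()` as documented above.
    """
    n_repeats, n_folds, n_samples = task.get_split_dimensions()
    # Visit every (repeat, fold, sample) combination exactly once.
    for repeat, fold, sample in product(
        range(n_repeats), range(n_folds), range(n_samples)
    ):
        train_idx, test_idx = task.get_train_test_split_indices(
            fold=fold, repeat=repeat, sample=sample
        )
        yield (repeat, fold, sample), train_idx, test_idx
```

The yielded index arrays can then be used to select rows from the `X` and `y` returned by `get_X_and_y()`.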

property id: int | None

Return the OpenML ID of this task.

open_in_browser() → None

Opens the OpenML web page corresponding to this object in your default browser.

property openml_url: str | None

The URL of the object on the server, if it was uploaded, else None.

publish() → OpenMLBase

Publish the object on the OpenML server.

push_tag(tag: str) → None

Annotates this entity with a tag on the server.

Parameters:
tag : str

Tag to attach to the task.

remove_tag(tag: str) → None

Removes a tag from this entity on the server.

Parameters:
tag : str

Tag to remove from the task.
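
As an illustration of how the two tagging methods might be combined, the sketch below swaps one tag for another. `replace_tag` is a hypothetical helper, not an OpenML function; it assumes only the `push_tag()` and `remove_tag()` methods documented above. Both calls modify the entity on the server, so a real run would need valid OpenML credentials.

```python
def replace_tag(entity, old_tag, new_tag):
    """Swap `old_tag` for `new_tag` on an OpenML entity.

    `entity` is assumed to expose `push_tag()` and `remove_tag()`
    as documented above, e.g. an OpenMLClassificationTask.
    """
    # Add the new tag first, so the entity is never left untagged
    # if removing the old tag fails.
    entity.push_tag(new_tag)
    entity.remove_tag(old_tag)
```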

classmethod url_for_id(id_: int) → str

Return the OpenML URL for the object of the class entity with the given id.