openml.runs.run_model_on_task

openml.runs.run_model_on_task(model: Any, task: int | str | OpenMLTask, avoid_duplicate_runs: bool = True, flow_tags: list[str] | None = None, seed: int | None = None, add_local_measures: bool = True, upload_flow: bool = False, return_flow: bool = False, dataset_format: Literal['array', 'dataframe'] = 'dataframe', n_jobs: int | None = None) → OpenMLRun | tuple[OpenMLRun, OpenMLFlow]

Run the model on the dataset defined by the task.

Parameters:
model: sklearn model

A model that implements fit(X, y) and predict(X). All supervised scikit-learn estimators follow this definition of a model (https://scikit-learn.org/stable/tutorial/statistical_inference/supervised_learning.html).

task: OpenMLTask or int or str

Task to perform, or the ID of that task. If the first argument is an OpenMLTask, this argument may be the model instead.

avoid_duplicate_runs: bool, optional (default=True)

If True, the run raises an error if the setup/task combination is already present on the server. This feature requires an internet connection.

flow_tags: List[str], optional (default=None)

A list of tags that the flow should have at creation.

seed: int, optional (default=None)

Models that are not seeded will get this seed.

add_local_measures: bool, optional (default=True)

Determines whether to calculate a set of evaluation measures locally, to later verify server behaviour.

upload_flow: bool (default=False)

If True, upload the flow to OpenML if it does not exist yet. If False, do not upload the flow to OpenML.

return_flow: bool (default=False)

If True, returns the OpenMLFlow generated from the model in addition to the OpenMLRun.

dataset_format: str (default='dataframe')

If 'array', the dataset is passed to the model as a NumPy array. If 'dataframe', the dataset is passed to the model as a pandas DataFrame.

n_jobs: int (default=None)

The number of processes/threads across which to distribute the evaluation asynchronously. If None or 1, the evaluation is treated as synchronous and processed sequentially. If -1, the job uses as many cores as are available.
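
The n_jobs rules described above can be sketched as a small function. This is only an illustration of the documented cases, not the library's implementation: the name effective_workers is hypothetical, and rejecting values the description does not cover (0, or negatives other than -1) is an assumption of this sketch.

```python
import os
from typing import Optional


def effective_workers(n_jobs: Optional[int]) -> int:
    """Map an n_jobs value to a worker count (hypothetical helper)."""
    if n_jobs is None or n_jobs == 1:
        # Documented: synchronous, processed sequentially.
        return 1
    if n_jobs == -1:
        # Documented: use as many cores as are available.
        # os.cpu_count() can return None, so fall back to one worker.
        return os.cpu_count() or 1
    if n_jobs > 1:
        # An explicit number of processes/threads.
        return n_jobs
    # Assumption of this sketch: other values are not described, so reject them.
    raise ValueError(f"n_jobs={n_jobs!r} is not covered by the description")
```

For example, effective_workers(None) and effective_workers(1) both give a single sequential worker, while effective_workers(-1) depends on the machine.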

Returns:
run: OpenMLRun

Result of the run.

flow: OpenMLFlow (optional, only if return_flow is True)

Flow generated from the model.
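
A minimal usage sketch follows, assuming the openml package is installed and the OpenML server is reachable when the helper is called. The wrapper name run_and_get_flow and its task_id argument are hypothetical, introduced only for illustration. openml.tasks.get_task is used to fetch the task first, and the keyword arguments are the ones listed above.

```python
from typing import Any


def run_and_get_flow(model: Any, task_id: int):
    """Run `model` on an OpenML task locally and return (run, flow).

    Hypothetical wrapper for illustration; nothing is uploaded.
    """
    # Imported here so that defining the helper needs neither the
    # package nor a network connection; calling it needs both.
    import openml

    # Download the task definition (splits, target, dataset reference).
    task = openml.tasks.get_task(task_id)

    # return_flow=True makes the function return a (run, flow) tuple
    # instead of the run alone.
    run, flow = openml.runs.run_model_on_task(
        model,
        task,
        avoid_duplicate_runs=False,  # skip the duplicate check on the server
        upload_flow=False,           # keep the flow local
        return_flow=True,
    )
    return run, flow
```

The model passed in would typically be a supervised scikit-learn estimator, for instance a DecisionTreeClassifier instance, and task_id would be the ID of an existing task on the server.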