openml.runs.run_flow_on_task

openml.runs.run_flow_on_task(flow: OpenMLFlow, task: OpenMLTask, avoid_duplicate_runs: bool = True, flow_tags: list[str] | None = None, seed: int | None = None, add_local_measures: bool = True, upload_flow: bool = False, dataset_format: Literal['array', 'dataframe'] = 'dataframe', n_jobs: int | None = None) -> OpenMLRun

Run the model provided by the flow on the dataset defined by task.

Takes the flow and repeat information into account. The Flow may optionally be published.

Parameters:
flow: OpenMLFlow

A flow wraps a machine learning model together with relevant information. The model must provide the methods fit(X, y) and predict(X); all supervised estimators in scikit-learn follow this definition of a model.

task: OpenMLTask

Task to perform. If the first argument is an OpenMLTask, this argument may be an OpenMLFlow instead; that is, the two may be passed in swapped order.

avoid_duplicate_runs: bool, optional (default=True)

If True, the call raises an error if the setup/task combination is already present on the server. This check requires an internet connection.

flow_tags: List[str], optional (default=None)

A list of tags that the flow should have at creation.

seed: int, optional (default=None)

Models that are not seeded will get this seed.

add_local_measures: bool, optional (default=True)

Determines whether to calculate a set of evaluation measures locally, to later verify server behaviour.

upload_flow: bool (default=False)

If True, upload the flow to OpenML if it does not exist yet. If False, do not upload the flow to OpenML.

dataset_format: str (default='dataframe')

If 'array', the dataset is passed to the model as a numpy array. If 'dataframe', the dataset is passed to the model as a pandas dataframe.

n_jobs: int (default=None)

The number of processes/threads across which the evaluation is distributed asynchronously. If None or 1, the evaluation is synchronous and processed sequentially. If -1, the job uses as many cores as are available.
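
The n_jobs rules above can be sketched as a small mapping from the argument to a worker count. This is only an illustration of the documented behaviour, not the library's code: effective_workers is a hypothetical helper, and rejecting values the description does not cover (0 and negatives other than -1) is an assumption made here.

```python
import os


def effective_workers(n_jobs):
    """Hypothetical helper: worker count implied by n_jobs as documented."""
    if n_jobs is None or n_jobs == 1:
        # Synchronous, sequential evaluation.
        return 1
    if n_jobs == -1:
        # As many cores as are available; fall back to 1 if undetectable.
        return os.cpu_count() or 1
    if n_jobs < 1:
        # Assumption: values not covered by the description are rejected.
        raise ValueError(f"unsupported n_jobs value: {n_jobs}")
    return n_jobs
```

With this sketch, effective_workers(None) and effective_workers(1) both give 1, while effective_workers(4) gives 4.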

Returns:
runOpenMLRun

Result of the run.
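
Example:

A minimal usage sketch, under several assumptions: the scikit-learn extension is available, so that openml.extensions.get_extension_by_model can find it; the caller supplies a valid supervised task ID; and a network connection is available to download the task. The wrapper function run_tree_on_task is hypothetical and exists only so that nothing contacts the server until it is called.

```python
def run_tree_on_task(task_id):
    """Sketch: convert a scikit-learn model to a flow and run it on a task."""
    # Imports are deferred so that merely defining this function requires
    # neither the packages nor a network connection.
    import openml
    from sklearn.tree import DecisionTreeClassifier

    # Download the task definition (dataset, target, estimation procedure).
    task = openml.tasks.get_task(task_id)

    # Wrap the model in a flow via the extension registered for it.
    model = DecisionTreeClassifier()
    extension = openml.extensions.get_extension_by_model(
        model, raise_if_no_extension=True
    )
    flow = extension.model_to_flow(model)

    # Run locally; the flow is not uploaded and duplicates are not checked.
    return openml.runs.run_flow_on_task(
        flow,
        task,
        avoid_duplicate_runs=False,
        upload_flow=False,
        seed=1,        # applied to models that are not already seeded
        n_jobs=None,   # sequential evaluation
    )
```

The returned OpenMLRun is not sent to the server by this call.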