flows
openml.flows
#
OpenMLFlow
#
OpenMLFlow(name: str, description: str, model: object, components: dict, parameters: dict, parameters_meta_info: dict, external_version: str, tags: list, language: str, dependencies: str, class_name: str | None = None, custom_name: str | None = None, binary_url: str | None = None, binary_format: str | None = None, binary_md5: str | None = None, uploader: str | None = None, upload_date: str | None = None, flow_id: int | None = None, extension: Extension | None = None, version: str | None = None)
Bases: OpenMLBase
OpenML Flow. Stores machine learning models.
Flows should not be generated manually, but by the function openml.flows.create_flow_from_model. Using this helper function ensures that all relevant fields are filled in.
Implements openml.implementation.upload.xsd (https://github.com/openml/openml/blob/master/openml_OS/views/pages/api_new/v1/xsd/openml.implementation.upload.xsd).
Parameters#
name : str
Name of the flow. Is used together with the attribute external_version as a unique identifier of the flow.
description : str
Human-readable description of the flow (free text).
model : object
ML model which is described by this flow.
components : OrderedDict
Mapping from component identifier to an OpenMLFlow object. Components
are usually subfunctions of an algorithm (e.g. kernels), base learners
in ensemble algorithms (decision tree in adaboost) or building blocks
of a machine learning pipeline. Components are modeled as independent
flows and can be shared between flows (different pipelines can use
the same components).
parameters : OrderedDict
Mapping from parameter name to the parameter default value. The
parameter default value must be of type str, so that the respective
toolbox plugin can take care of casting the parameter default value to
the correct type.
parameters_meta_info : OrderedDict
Mapping from parameter name to dict. Stores additional information
for each parameter. Required keys are data_type and description.
external_version : str
Version number of the software the flow is implemented in. Is used
together with the attribute name as a unique identifier of the flow.
tags : list
List of tags. Created on the server by other API calls.
language : str
Natural language the flow is described in (not the programming
language).
dependencies : str
A list of dependencies necessary to run the flow. This field should
contain all libraries the flow depends on. To allow reproducibility
it should also specify the exact version numbers.
class_name : str, optional
The development language name of the class which is described by this
flow.
custom_name : str, optional
Custom name of the flow given by the owner.
binary_url : str, optional
URL from which the binary can be downloaded. Added by the server.
Ignored when uploaded manually. Will not be used by the Python API
because binaries aren't compatible across machines.
binary_format : str, optional
Format in which the binary code was uploaded. Will not be used by the
Python API because binaries aren't compatible across machines.
binary_md5 : str, optional
MD5 checksum to check if the binary code was correctly downloaded. Will
not be used by the Python API because binaries aren't compatible across
machines.
uploader : str, optional
OpenML user ID of the uploader. Filled in by the server.
upload_date : str, optional
Date the flow was uploaded. Filled in by the server.
flow_id : int, optional
Flow ID. Assigned by the server.
extension : Extension, optional
The extension for a flow (e.g., sklearn).
version : str, optional
OpenML version of the flow. Assigned by the server.
Source code in openml/flows/flow.py
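As a rough illustration of the constraints on parameters and parameters_meta_info described above (string-valued defaults, and the required data_type and description keys), the following sketch checks two plain dictionaries. The helper check_flow_parameters and the example parameter names are hypothetical and not part of openml; the real class may validate its inputs differently.

```python
from collections import OrderedDict

REQUIRED_META_KEYS = {"data_type", "description"}


def check_flow_parameters(parameters, parameters_meta_info):
    """Hypothetical helper (not part of openml): return a list of problems."""
    problems = []
    for name, default in parameters.items():
        # Defaults are stored as strings; the toolbox plugin casts them later.
        if not isinstance(default, str):
            problems.append(f"{name}: default must be str, got {type(default).__name__}")
        missing = REQUIRED_META_KEYS - set(parameters_meta_info.get(name, {}))
        if missing:
            problems.append(f"{name}: missing meta keys {sorted(missing)}")
    return problems


# Illustrative values only.
parameters = OrderedDict([("max_depth", "3"), ("criterion", "gini")])
parameters_meta_info = OrderedDict(
    [
        ("max_depth", {"data_type": "int", "description": "Maximum tree depth."}),
        ("criterion", {"data_type": "str", "description": "Split quality measure."}),
    ]
)
print(check_flow_parameters(parameters, parameters_meta_info))  # prints []
```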
openml_url
property
#
The URL of the object on the server, if it was uploaded, else None.
from_filesystem
classmethod
#
from_filesystem(input_directory: str | Path) -> OpenMLFlow
Read a flow from an XML in input_directory on the filesystem.
Source code in openml/flows/flow.py
get_structure
#
Returns for each sub-component of the flow the path of identifiers that should be traversed to reach this component. The resulting dict maps a key (identifying a flow by either its id, name or fullname) to the parameter prefix.
Parameters#
key_item : str
The flow attribute that will be used to identify flows in the structure. Allowed values: {flow_id, name}.
Returns#
dict[str, List[str]]
The flow structure.
Source code in openml/flows/flow.py
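The mapping returned by get_structure can be pictured with a small stand-in class. This is a minimal sketch under the assumption that each flow holds its sub-flows in a components mapping, as described for OpenMLFlow; MiniFlow and the example names are hypothetical, and the code is not the library's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class MiniFlow:
    """Hypothetical stand-in for a flow; not the openml class."""

    name: str
    flow_id: int
    components: dict = field(default_factory=dict)

    def get_structure(self, key_item):
        if key_item not in ("flow_id", "name"):
            raise ValueError("key_item should be 'flow_id' or 'name'")
        structure = {}
        for identifier, sub_flow in self.components.items():
            # Prefix every path found below this component with its identifier.
            for key, path in sub_flow.get_structure(key_item).items():
                structure[key] = [identifier, *path]
        structure[getattr(self, key_item)] = []  # the flow itself needs no prefix
        return structure


tree = MiniFlow("DecisionTree", 3)
boost = MiniFlow("AdaBoost", 2, {"base_estimator": tree})
pipeline = MiniFlow("Pipeline", 1, {"classifier": boost})
print(pipeline.get_structure("name"))
# {'DecisionTree': ['classifier', 'base_estimator'], 'AdaBoost': ['classifier'], 'Pipeline': []}
```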
get_subflow
#
get_subflow(structure: list[str]) -> OpenMLFlow
Returns a subflow from the tree of dependencies.
Parameters#
structure : list[str]
A list of strings, indicating the location of the subflow.
Returns#
OpenMLFlow
The OpenMLFlow that corresponds to the structure.
Source code in openml/flows/flow.py
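Resolving a structure list amounts to walking the component tree one identifier at a time. The sketch below shows that idea on a hypothetical MiniFlow stand-in with a free function get_subflow; it assumes sub-flows live in a components mapping and is not the library's own code.

```python
from dataclasses import dataclass, field


@dataclass
class MiniFlow:
    """Hypothetical stand-in for a flow; not the openml class."""

    name: str
    components: dict = field(default_factory=dict)


def get_subflow(flow, structure):
    """Follow component identifiers one level at a time."""
    if not structure:
        raise ValueError("structure must contain at least one identifier")
    current = flow
    for identifier in structure:
        if identifier not in current.components:
            raise ValueError(f"{current.name} has no component {identifier!r}")
        current = current.components[identifier]
    return current


tree = MiniFlow("DecisionTree")
boost = MiniFlow("AdaBoost", {"base_estimator": tree})
pipeline = MiniFlow("Pipeline", {"classifier": boost})
print(get_subflow(pipeline, ["classifier", "base_estimator"]).name)  # prints DecisionTree
```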
open_in_browser
#
Opens the OpenML web page corresponding to this object in your default browser.
Source code in openml/base.py
publish
#
publish(raise_error_if_exists: bool = False) -> OpenMLFlow
Publish this flow to the OpenML server.
Raises a PyOpenMLError if the flow exists on the server, but self.flow_id does not match the flow id known to the server.
Parameters#
raise_error_if_exists : bool, optional (default=False)
If True, raise PyOpenMLError if the flow exists on the server. If False, update the local flow to match the server flow.
Returns#
self : OpenMLFlow
Source code in openml/flows/flow.py
push_tag
#
remove_tag
#
to_filesystem
#
Write a flow to the filesystem as XML to output_directory.
Source code in openml/flows/flow.py
url_for_id
classmethod
#
Return the OpenML URL for the object of the class entity with the given id.
assert_flows_equal
#
assert_flows_equal(flow1: OpenMLFlow, flow2: OpenMLFlow, ignore_parameter_values_on_older_children: str | None = None, ignore_parameter_values: bool = False, ignore_custom_name_if_none: bool = False, check_description: bool = True) -> None
Check equality of two flows.
Two flows are equal if all of their keys that are not set by the server are equal, as well as all of their parameters and components.
Parameters#
flow1 : OpenMLFlow
flow2 : OpenMLFlow
ignore_parameter_values_on_older_children : str, optional
If set to OpenMLFlow.upload_date, ignores parameters in a child
flow if its upload date predates the upload date of the parent flow.
ignore_parameter_values : bool
Whether to ignore parameter values when comparing flows.
ignore_custom_name_if_none : bool
Whether to ignore the custom name field if either flow has custom_name equal to None.
check_description : bool
Whether to compare the flow descriptions; set to False to ignore them.
Source code in openml/flows/functions.py
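The equality rule above (compare everything except server-set keys) can be sketched on plain dictionaries. The helper flows_differ is hypothetical, the set of server-set keys is an assumption taken from the parameter descriptions of OpenMLFlow on this page, and the sketch does not recurse into components as the real function does.

```python
# Assumption: keys described on this page as filled in, assigned, or added by the server.
SERVER_SET_KEYS = {"flow_id", "uploader", "upload_date", "version", "binary_url"}


def flows_differ(flow1, flow2, ignore_parameter_values=False, check_description=True):
    """Hypothetical helper on plain dicts; returns the names of differing keys."""
    skipped = set(SERVER_SET_KEYS)
    if not check_description:
        skipped.add("description")
    differing = []
    for key in sorted((set(flow1) | set(flow2)) - skipped):
        left, right = flow1.get(key), flow2.get(key)
        if key == "parameters" and ignore_parameter_values:
            # Compare only the parameter names, not their values.
            left, right = set(left or {}), set(right or {})
        if left != right:
            differing.append(key)
    return differing


# Illustrative values only.
local = {"name": "Tree", "external_version": "sklearn==1.4.0",
         "description": "A tree", "parameters": {"max_depth": "3"}, "flow_id": None}
server = {"name": "Tree", "external_version": "sklearn==1.4.0",
          "description": "A tree", "parameters": {"max_depth": "5"}, "flow_id": 42}

print(flows_differ(local, server))                                # prints ['parameters']
print(flows_differ(local, server, ignore_parameter_values=True))  # prints []
```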
delete_flow
#
Delete flow with id flow_id from the OpenML server.
You can only delete flows which you uploaded and which are not linked to runs.
Parameters#
flow_id : int
OpenML id of the flow.
Returns#
bool
True if the deletion was successful. False otherwise.
Source code in openml/flows/functions.py
flow_exists
#
Retrieves the flow id.
A flow is uniquely identified by name + external_version.
Parameters#
name : string
Name of the flow.
external_version : string
Version information associated with flow.
Returns#
flow_exist : int or bool
Flow id iff exists, False otherwise.
Notes#
see www.openml.org/api_docs/#!/flow/get_flow_exists_name_version
Source code in openml/flows/functions.py
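Because the documented return value is either an integer id or False, callers need some care when branching on it. The sketch below uses a hypothetical local stand-in, flow_exists_stub, with a made-up registry and flow id; it only mirrors the return contract stated above and makes no server call.

```python
# Made-up registry keyed by (name, external_version); the id is illustrative only.
PUBLISHED = {("sklearn.tree.DecisionTreeClassifier", "sklearn==1.4.0"): 1234}


def flow_exists_stub(name, external_version):
    """Hypothetical local stand-in: flow id if the pair is known, else False."""
    return PUBLISHED.get((name, external_version), False)


def describe(result):
    # `False == 0` in Python, so test identity instead of truthiness or equality.
    if result is False:
        return "not on the server"
    return f"published as flow {result}"


print(describe(flow_exists_stub("sklearn.tree.DecisionTreeClassifier", "sklearn==1.4.0")))
print(describe(flow_exists_stub("sklearn.tree.DecisionTreeClassifier", "sklearn==0.24.2")))
```

The second call differs only in external_version, which is enough to make it a different flow under the name + external_version rule.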
get_flow
#
get_flow(flow_id: int, reinstantiate: bool = False, strict_version: bool = True) -> OpenMLFlow
Download the OpenML flow for a given flow ID.
Parameters#
flow_id : int
The OpenML flow id.
reinstantiate : bool
Whether to reinstantiate the flow to a model instance.
strict_version : bool, default=True
Whether to fail if version requirements are not fulfilled.
Returns#
flow : OpenMLFlow
The flow.
Source code in openml/flows/functions.py
get_flow_id
#
get_flow_id(model: Any | None = None, name: str | None = None, exact_version: bool = True) -> int | bool | list[int]
Retrieves the flow id for a model or a flow name.
Provide either a model or a name to this function. Depending on the input, it does one of the following:
- model and exact_version == True: This helper function first queries for the necessary extension. Second, it uses that extension to convert the model into a flow. Third, it executes flow_exists to potentially obtain the flow id under which the flow is published on the server.
- model and exact_version == False: This helper function first queries for the necessary extension. Second, it uses that extension to convert the model into a flow. Third, it calls list_flows and filters the returned values based on the flow name.
- name: Ignores exact_version and calls list_flows, then filters the returned values based on the flow name.
Parameters#
model : object
Any model. Must provide either model or name.
name : str
Name of the flow. Must provide either model or name.
exact_version : bool
Whether to return the flow id of the exact version or all flow ids where the name
of the flow matches. This is only taken into account for a model where a version number
is available (requires model to be set).
Returns#
int or bool, or list[int]
The flow id if the flow exists, False otherwise; a list of flow ids if exact_version is False.
Source code in openml/flows/functions.py
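The three cases described for get_flow_id can be sketched as a small dispatch function. Everything here is hypothetical: resolve_flow_id is not part of openml, the extension lookup and server calls are injected as plain callables, and the sketch assumes that exactly one of model or name must be given.

```python
def resolve_flow_id(to_flow, flow_exists, list_flows, model=None, name=None, exact_version=True):
    """Hypothetical sketch of the three documented branches."""
    if (model is None) == (name is None):
        raise ValueError("Provide exactly one of `model` or `name`.")
    if model is not None:
        # Stand-in for "query the extension and convert the model into a flow".
        flow_name, external_version = to_flow(model)
        if exact_version:
            return flow_exists(flow_name, external_version)
    else:
        flow_name = name  # exact_version is ignored in this branch
    return [row["id"] for row in list_flows() if row["name"] == flow_name]


# Stub collaborators with made-up data, so the sketch runs without a server.
rows = [{"id": 1, "name": "Tree"}, {"id": 2, "name": "Tree"}, {"id": 3, "name": "SVC"}]
to_flow = lambda model: ("Tree", "v2")
flow_exists = lambda n, v: 2 if (n, v) == ("Tree", "v2") else False
list_flows = lambda: rows

print(resolve_flow_id(to_flow, flow_exists, list_flows, model=object()))                       # prints 2
print(resolve_flow_id(to_flow, flow_exists, list_flows, model=object(), exact_version=False))  # prints [1, 2]
print(resolve_flow_id(to_flow, flow_exists, list_flows, name="SVC"))                           # prints [3]
```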
list_flows
#
list_flows(offset: int | None = None, size: int | None = None, tag: str | None = None, uploader: str | None = None) -> DataFrame
Return a list of all flows which are on OpenML. (Supports a large number of results.)
Parameters#
offset : int, optional
The number of flows to skip, starting from the first.
size : int, optional
The maximum number of flows to return.
tag : str, optional
The tag to include.
uploader : str, optional
Restrict the results to flows from this uploader.
Returns#
flows : dataframe
Each row maps to a flow. Each column contains the following information:
- flow id
- full name
- name
- version
- external version
- uploader
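The offset and size parameters lend themselves to paging through results in fixed-size chunks. The following is a minimal sketch of that pattern using a hypothetical fetch_stub over made-up rows in place of the real list_flows call; iter_pages is likewise a hypothetical helper.

```python
def iter_pages(fetch, size):
    """Yield successive pages until an empty one is returned."""
    offset = 0
    while True:
        page = fetch(offset=offset, size=size)
        if not page:
            return
        yield page
        offset += len(page)


# Made-up rows standing in for the server-side listing.
ALL_ROWS = [{"id": i, "name": f"flow-{i}"} for i in range(1, 8)]


def fetch_stub(offset=None, size=None):
    """Hypothetical stand-in that honours offset and size like a paged listing."""
    start = offset or 0
    stop = None if size is None else start + size
    return ALL_ROWS[start:stop]


pages = list(iter_pages(fetch_stub, size=3))
print([len(page) for page in pages])  # prints [3, 3, 1]
```

With a DataFrame result, the emptiness check would be `page.empty` rather than `not page`, since a DataFrame has no unambiguous truth value.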