openml.extensions.sklearn.SklearnExtension
-
class
openml.extensions.sklearn.SklearnExtension
Connect scikit-learn to OpenML-Python.
-
classmethod
can_handle_flow(flow: 'OpenMLFlow') → bool
Check whether a given flow describes a scikit-learn estimator.
This is done by parsing the external_version field.
- Parameters
- flow: OpenMLFlow
- Returns
- bool
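As an illustration of the external_version check, the sketch below tests a version string for a scikit-learn entry. The helper name looks_like_sklearn_flow and the comma-separated "package==version" format are assumptions made for this example, not the library's confirmed implementation.

```python
from typing import Optional


def looks_like_sklearn_flow(external_version: Optional[str]) -> bool:
    """Hypothetical helper: guess whether a flow targets scikit-learn.

    Assumes external_version is a comma-separated list of
    "package==version" entries, e.g. "openml==0.10.2,sklearn==0.21.3".
    """
    if external_version is None:
        return False
    entries = [entry.strip() for entry in external_version.split(",")]
    return any(entry.startswith("sklearn==") for entry in entries)


print(looks_like_sklearn_flow("openml==0.10.2,sklearn==0.21.3"))
print(looks_like_sklearn_flow("Weka_3.9.0"))
```

In practice the check is made through SklearnExtension.can_handle_flow(flow) on a flow object rather than on a raw string.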
-
classmethod
can_handle_model(model: Any) → bool
Check whether a model is an instance of sklearn.base.BaseEstimator.
- Parameters
- model: Any
- Returns
- bool
-
compile_additional_information(self, task: 'OpenMLTask', additional_information: List[Tuple[int, int, Any]]) → Dict[str, Tuple[str, str]]
Compile the additional information provided by the extension during the runs into a final set of files.
- Parameters
- task: OpenMLTask
The task the model was run on.
- additional_information: List[Tuple[int, int, Any]]
A list of (fold, repetition, additional information) tuples obtained during training.
- Returns
- files: Dict[str, Tuple[str, str]]
A dictionary of files with their file name and contents.
-
create_setup_string(self, model: Any) → str
Create a string which can be used to reinstantiate the given model.
- Parameters
- model: Any
- Returns
- str
-
flow_to_model(self, flow: 'OpenMLFlow', initialize_with_defaults: bool = False) → Any
Initialize a scikit-learn model based on a flow.
- Parameters
- flow: mixed
the object to deserialize (can be flow object, or any serialized parameter value that is accepted by)
- initialize_with_defaults: bool, optional (default=False)
If this flag is set, the hyperparameter values of the flow are ignored and a flow with its defaults is returned.
- Returns
- mixed
-
get_version_information(self) → List[str]
List versions of libraries required by the flow.
Libraries listed are Python, scikit-learn, numpy and scipy.
- Returns
- List
-
instantiate_model_from_hpo_class(self, model: Any, trace_iteration: openml.runs.trace.OpenMLTraceIteration) → Any
Instantiate a base_estimator which can be searched over by the hyperparameter optimization model.
- Parameters
- model: Any
A hyperparameter optimization model which defines the model to be instantiated.
- trace_iteration: OpenMLTraceIteration
Describes the hyperparameter settings to instantiate.
- Returns
- Any
-
is_estimator(self, model: Any) → bool
Check whether the given model is a scikit-learn estimator.
This function is only required for backwards compatibility and will be removed in the near future.
- Parameters
- model: Any
- Returns
- bool
-
model_to_flow(self, model: Any) → 'OpenMLFlow'
Transform a scikit-learn model to a flow for uploading it to OpenML.
- Parameters
- model: Any
- Returns
- OpenMLFlow
-
obtain_parameter_values(self, flow: 'OpenMLFlow', model: Any = None) → List[Dict[str, Any]]
Extracts all parameter settings required for the flow from the model.
If no explicit model is provided, the parameters will be extracted from flow.model instead.
- Parameters
- flow: OpenMLFlow
OpenMLFlow object (containing flow ids, i.e., it has to be downloaded from the server)
- model: Any, optional (default=None)
The model from which to obtain the parameter values. Must match the flow signature. If None, use the model specified in OpenMLFlow.model.
- Returns
- list
A list of dicts, where each dict has the following entries:
- oml:name (str): The OpenML parameter name
- oml:value (mixed): A representation of the parameter value
- oml:component (int): The flow id to which the parameter belongs
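The shape of the returned list can be sketched with plain dictionaries. The helper to_parameter_entries is hypothetical, and JSON-encoding the value is an assumption about how a parameter value might be represented; the documentation above only describes it as "mixed".

```python
import json
from typing import Any, Dict, List


def to_parameter_entries(params: Dict[str, Any], flow_id: int) -> List[Dict[str, Any]]:
    """Hypothetical helper: build entries shaped like the documented return value."""
    return [
        {
            "oml:name": name,
            "oml:value": json.dumps(value),  # assumed representation of the value
            "oml:component": flow_id,
        }
        for name, value in sorted(params.items())
    ]


entries = to_parameter_entries({"max_depth": 3, "criterion": "gini"}, flow_id=42)
for entry in entries:
    print(entry)
```

For a composite model such as a pipeline, entries belonging to different subcomponents would carry different oml:component values.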
-
seed_model(self, model: Any, seed: Union[int, NoneType] = None) → Any
Set the random state of all the unseeded components of a model and return the seeded model.
Required so that all seed information can be uploaded to OpenML for reproducible results.
Models that are already seeded keep their seed. In this case, only integer seeds are allowed (an exception is raised when a RandomState was used as the seed).
- Parameters
- model: sklearn model
The model to be seeded
- seed: int
The seed to initialize the RandomState with. Unseeded subcomponents will be seeded with a random number from the RandomState.
- Returns
- Any
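The seeding rule described for seed_model can be sketched on a plain parameter dictionary. The function seed_params is a hypothetical stand-in, and random.Random replaces the RandomState mentioned above so that the example needs only the standard library. The parameter naming convention ("random_state", "<component>__random_state"), the value range and the exception type are likewise assumptions.

```python
import random
from typing import Any, Dict, Optional


def seed_params(params: Dict[str, Any], seed: Optional[int] = None) -> Dict[str, Any]:
    """Hypothetical stand-in for the seeding rule, applied to a parameter dict.

    Unseeded random_state parameters (value None) receive a number drawn
    from an RNG initialized with `seed`; integer seeds that are already
    set are kept; any other seed type is rejected.
    """
    rng = random.Random(seed)  # stand-in for the RandomState in the text
    seeded = dict(params)
    for name in sorted(params):
        if name != "random_state" and not name.endswith("__random_state"):
            continue  # not a seed parameter
        value = params[name]
        if value is None:
            seeded[name] = rng.randint(0, 2**16 - 1)
        elif not isinstance(value, int):
            raise ValueError(f"{name} must be an integer seed, got {type(value).__name__}")
    return seeded


params = {"random_state": None, "tree__random_state": 7, "n_estimators": 10}
first = seed_params(params, seed=1)
second = seed_params(params, seed=1)
print(first == second)
```

Because every unseeded component draws from the same seeded generator, repeating the call with the same seed yields the same parameter values, which is what makes the uploaded run reproducible.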
-
classmethod