Collaborative Machine Learning in Python

Welcome to the documentation of the OpenML Python API, a connector to the collaborative machine learning platform OpenML. The OpenML Python package lets you use datasets and tasks from OpenML together with scikit-learn and share the results online.


import openml
from sklearn import impute, tree, pipeline

# Define a scikit-learn classifier or pipeline
clf = pipeline.Pipeline(
    steps=[
        ('imputer', impute.SimpleImputer()),
        ('estimator', tree.DecisionTreeClassifier())
    ]
)
# Download the OpenML task for the German credit card dataset with 10-fold
# cross-validation.
task = openml.tasks.get_task(32)
# Run the scikit-learn model on the task.
run = openml.runs.run_model_on_task(clf, task)
# Publish the experiment on OpenML (optional, requires an API key, which
# you can get by signing up on OpenML).
run.publish()
print(f'View the run online: {run.openml_url}')

You can find more examples in our Examples Gallery.
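
Publishing a run requires an API key. The following is a minimal sketch of one way to supply it without hard-coding it in a script. The environment variable name OPENML_API_KEY and the helper names read_api_key and configure_openml are assumptions chosen for illustration; the only OpenML-specific call used is the assignment to openml.config.apikey.

```python
import os


def read_api_key(var_name="OPENML_API_KEY"):
    """Return the API key stored in an environment variable, or None if unset or empty."""
    key = os.environ.get(var_name, "").strip()
    return key or None


def configure_openml(var_name="OPENML_API_KEY"):
    """Set openml.config.apikey from the environment; return True if a key was set."""
    key = read_api_key(var_name)
    if key is None:
        return False
    # Imported lazily so this helper can be defined without the package installed.
    import openml
    openml.config.apikey = key
    return True
```

Calling configure_openml() before run.publish() lets a script skip publishing when it returns False, which keeps the upload step optional.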

How to get OpenML for Python

You can install the OpenML package via pip:

pip install openml

For more advanced installation information, please see the Installation & Set up section.
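
To confirm that the installation succeeded, one option is to query the installed package metadata. This is a small sketch using only the Python standard library; the helper name installed_version is hypothetical and not part of the OpenML package.

```python
from importlib import metadata


def installed_version(package="openml"):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("openml is not installed; run `pip install openml` first.")
else:
    print(f"openml {version} is installed.")
```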


Further information


Contributions to the OpenML package are highly appreciated. The package currently has only a 1/4 position dedicated to its development, so all possible help is needed to extend and maintain the package, create new examples, and improve usability. Please see the Contributing page for more information.

Citing OpenML-Python

If you use OpenML-Python in a scientific publication, we would appreciate a reference to the following paper:

OpenML-Python: an extensible Python API for OpenML, Feurer et al., arXiv:1911.02490.

Bibtex entry:

    author    = {Matthias Feurer and Jan N. van Rijn and Arlind Kadra and Pieter Gijsbers and Neeratyoy Mallik and Sahithya Ravi and Andreas Müller and Joaquin Vanschoren and Frank Hutter},
    title     = {OpenML-Python: an extensible Python API for OpenML},
    journal   = {arXiv:1911.02490},
    year      = {2019},