OpenML
Installation
The OpenML package is available for several programming languages and integrates with many machine learning libraries.
- Python/sklearn repository
pip install openml
- PyTorch repository
pip install openml-pytorch
- TensorFlow repository
pip install openml-tensorflow
- R repository
install.packages("mlr3oml")
- Julia repository
using Pkg;Pkg.add("OpenML")
- Rust repository
Install from source
- .NET repository
Install-Package openMl
You can find detailed guides for the different libraries in the top menu.
Authentication
OpenML is entirely open and you do not need an account to access data (rate limits apply). However, signing up on the OpenML website is free and easy, and an account is required to upload new resources to OpenML and to manage them online.
API authentication happens via an API key, which you can find in your profile after logging in to openml.org.
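As a minimal sketch of how the key might be passed to the Python client: the helper below reads the key from an environment variable and assigns it to `openml.config.apikey`. The variable name `OPENML_API_KEY` and both function names are hypothetical choices for this illustration, not part of OpenML.

```python
import os


def read_api_key(env_var: str = "OPENML_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    The variable name is a hypothetical choice for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} to the API key from your OpenML profile")
    return key


def configure_openml(env_var: str = "OPENML_API_KEY") -> None:
    """Hand the key to the openml package for calls that need authentication."""
    import openml  # imported here so read_api_key works without the package

    openml.config.apikey = read_api_key(env_var)
```

Keeping the key in the environment rather than in source code avoids committing it to version control by accident.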
Minimal Example
The credit-g dataset can be loaded directly into a pandas dataframe. OpenML can automatically load all datasets, separate the data X from the labels y, and give you useful dataset metadata (e.g. feature names and which features are categorical).
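The loading step could look roughly like the sketch below, which uses the `openml` Python package. It assumes that dataset ID 31 refers to credit-g and that the dataset declares a default target attribute. The function names are illustrative only.

```python
def load_credit_g(dataset_id: int = 31):
    """Download a dataset and split it into features X and labels y.

    Assumption: ID 31 is the credit-g dataset on openml.org.
    """
    import openml  # requires `pip install openml`

    dataset = openml.datasets.get_dataset(dataset_id)
    X, y, categorical_indicator, attribute_names = dataset.get_data(
        target=dataset.default_target_attribute
    )
    return X, y, categorical_indicator, attribute_names


def categorical_features(categorical_indicator, attribute_names):
    """List the feature names that the metadata flags as categorical."""
    return [
        name
        for name, is_categorical in zip(attribute_names, categorical_indicator)
        if is_categorical
    ]
```

Calling `categorical_features` on the last two values returned by `load_credit_g` gives the names of the categorical columns.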
Get a task for supervised classification on credit-g.
Tasks specify how a dataset should be used, e.g. including train and test splits.
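A hedged sketch of working with a task, assuming task ID 31 is the supervised classification task on credit-g and that the data comes back as pandas objects. The function name `first_fold_split` is hypothetical.

```python
def first_fold_split(task_id: int = 31):
    """Return train/test portions of a task's data for the first fold.

    Assumption: task 31 is supervised classification on credit-g.
    """
    import openml  # requires `pip install openml`

    task = openml.tasks.get_task(task_id)
    dataset = task.get_dataset()
    X, y, _, _ = dataset.get_data(target=task.target_name)

    # The task, not the user, defines which rows belong to which split.
    train_idx, test_idx = task.get_train_test_split_indices(fold=0)
    return X.iloc[train_idx], X.iloc[test_idx], y.iloc[train_idx], y.iloc[test_idx]
```

Because the splits are stored with the task, everyone who evaluates a model on it uses the same train and test rows.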
Use an OpenML benchmarking suite to get a curated list of machine learning tasks.
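Fetching a suite might look like the following sketch. It assumes the suite alias `OpenML-CC18`; any other suite alias or ID could be substituted. The function name and the `limit` parameter are hypothetical conveniences.

```python
def suite_task_ids(suite_alias="OpenML-CC18", limit=None):
    """Return the task IDs in a benchmarking suite, optionally truncated.

    Assumption: "OpenML-CC18" is used as the example suite alias.
    """
    import openml  # requires `pip install openml`

    suite = openml.study.get_suite(suite_alias)
    task_ids = list(suite.tasks)
    return task_ids if limit is None else task_ids[:limit]
```

Each returned ID can then be passed to `openml.tasks.get_task` to download the corresponding task.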
You can now benchmark your models easily across many datasets at once. Training and evaluating a model on a task is called a run.
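A benchmarking loop could be sketched as follows, assuming scikit-learn is installed and using a k-nearest-neighbours classifier purely as a placeholder model. The function name `run_knn_on_tasks` is hypothetical.

```python
def run_knn_on_tasks(task_ids, n_neighbors=5):
    """Train and evaluate a kNN classifier on each task, one run per task."""
    import openml  # requires `pip install openml`
    from sklearn.neighbors import KNeighborsClassifier

    runs = []
    for task_id in task_ids:
        task = openml.tasks.get_task(task_id)
        # A fresh, untrained model for every task.
        clf = KNeighborsClassifier(n_neighbors=n_neighbors)
        # Trains and evaluates the model on the splits defined by the task.
        run = openml.runs.run_model_on_task(clf, task)
        runs.append(run)
    return runs
```

Datasets with categorical or missing values may need a preprocessing pipeline in front of the classifier, so a bare kNN model is unlikely to work on every task in a suite.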
You can now publish your experiment on OpenML so that others can build on it.
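Publishing could be sketched as below. This assumes `run` is a run object produced by the `openml` Python package and that an API key has already been set in `openml.config.apikey`, since uploads require authentication. The wrapper name `publish_run` is hypothetical.

```python
def publish_run(run):
    """Upload a finished run to OpenML and return the URL of its page.

    Assumption: an API key has already been configured for the openml package.
    """
    published = run.publish()
    return published.openml_url
```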
Learning more about OpenML
Next, check out the 10-minute tutorial and the short description of OpenML concepts.