OpenML Open Source
OpenML is an open source project, hosted on GitHub. We welcome everybody to help improve OpenML, and make it more useful for everyone.
To integrate your own machine learning tools with OpenML, check out the available APIs.
We always love to welcome new contributors, and will gladly help you in any way possible.
GitHub repositories
You can find relevant code in the corresponding GitHub repositories. Please also post issues in the relevant issue tracker.
- OpenML Core - The website, web services, and API.
- Evaluation Engine - Evaluate models, analyse datasets, and much more.
- Java API - The Java API and Java-based plugins.
- R API - The OpenML R package.
- Python API - The Python API.
Database snapshots
Everything uploaded to OpenML is available to the community. The nightly snapshot of the public database, provided as a gzipped SQL file, contains all experiment runs and evaluations, as well as links to datasets, implementations, and result files. You can also download the database schema.
If you want to work on the website locally, you'll also need the schema for the 'private' database with non-public information.
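Before importing a snapshot into a database server, it can help to check which tables it defines. The following is a minimal sketch, not an official OpenML tool: it assumes the snapshot is a gzipped text file of SQL statements in which each table is introduced by a `CREATE TABLE` statement at the start of a line, as typical dump tools write it. The function name `list_tables` and the file name used in the note below are hypothetical.

```python
import gzip
import re

# Matches lines such as "CREATE TABLE `run` (" or
# "CREATE TABLE IF NOT EXISTS dataset (", with or without quoting.
CREATE_TABLE = re.compile(
    r'^\s*CREATE\s+TABLE\s+(?:IF\s+NOT\s+EXISTS\s+)?[`"]?(\w+)[`"]?',
    re.IGNORECASE,
)


def list_tables(dump_path):
    """Return the table names defined in a gzipped SQL dump, in file order."""
    tables = []
    # Read line by line so that a large dump never has to fit in memory.
    with gzip.open(dump_path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            match = CREATE_TABLE.match(line)
            if match:
                tables.append(match.group(1))
    return tables
```

For example, `list_tables("snapshot.sql.gz")` would return the table names found in a locally downloaded file of that (hypothetical) name. The sketch only inspects the file; loading the data still requires importing the dump into a database server.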
Legacy Resources
OpenML is always evolving, but we keep hosting the resources that were used in prior publications so that others may still build on them.
- The experiment database used in Vanschoren et al. (2012) Experiment databases. Machine Learning 87(2), pp. 127-158. You'll need to import this database (we used MySQL) to run queries. The database structure is described in the paper. Note that most of the experiments in this database have since been rerun using OpenML, with newer algorithm implementations, and are stored in much more detail.
- The Exposé ontology used in the same paper, and described in more detail here and here. Exposé is used in designing our databases, and we aim to use it to export all OpenML data as Linked Open Data.