OpenML-Python: an extensible Python API for OpenML


Matthias Feurer, University of Freiburg, Freiburg, Germany
Jan N. van Rijn, Leiden University, Leiden, Netherlands
Arlind Kadra, University of Freiburg, Freiburg, Germany
Pieter Gijsbers, Eindhoven University of Technology, Eindhoven, Netherlands
Neeratyoy Mallik, University of Freiburg, Freiburg, Germany
Sahithya Ravi, Eindhoven University of Technology, Eindhoven, Netherlands
Andreas Müller, Columbia University, New York, USA
Joaquin Vanschoren, Eindhoven University of Technology, Eindhoven, Netherlands
Frank Hutter, University of Freiburg, Freiburg & Bosch Center for Artificial Intelligence, Germany

OpenML is an online platform for open science collaboration in machine learning, used to share datasets and results of machine learning experiments. In this paper we introduce OpenML-Python, a client API for Python, opening up the OpenML platform for a wide range of Python-based tools. It provides easy access to all datasets, tasks and experiments on OpenML from within Python. It also provides functionality to conduct machine learning experiments, upload the results to OpenML, and reproduce results that are stored on OpenML. Furthermore, it comes with a scikit-learn plugin and a plugin mechanism to easily integrate other machine learning libraries written in Python into the OpenML ecosystem. Source code and documentation are available at




Keywords: Python, Collaborative Science, Meta-Learning, Reproducible Research

1 Introduction

OpenML is a collaborative online machine learning (ML) platform, meant for sharing and building on prior empirical machine learning research (Vanschoren et al., 2014).

It goes beyond open data repositories such as UCI (Dua and Graff, 2017), PMLB (Olson et al., 2018), the ‘datasets’ submodules in scikit-learn (Pedregosa et al., 2011) and tensorflow (Abadi et al., 2016), and closed-source data sharing platforms: OpenML also collects millions of shared experiments on these datasets, linked to the exact ML pipelines and hyperparameter settings, and includes comprehensive logging and uploading functionality that can be accessed programmatically via a REST API. However, sharing ML experiments adds significant complexity to most people’s workflows.

OpenML-Python is a seamless integration of OpenML into the popular Python ML ecosystem that takes away this complexity by providing easy programmatic access to all OpenML data and automating the sharing of new experiments. (Other clients already exist for R (Casalicchio et al., 2017) and Java (van Rijn, 2016).) In this paper, we introduce OpenML-Python’s core design, showcase its extensibility to new ML libraries, and give code examples for several common research tasks.

2 Use cases for the OpenML-Python API

OpenML-Python allows for easy dataset and experiment sharing by handling all communication with OpenML’s REST API. In this section, we briefly describe how the package can be used in several common machine learning tasks and highlight recent uses.

Working with datasets. OpenML-Python can retrieve the thousands of datasets on OpenML (all of them, or specific subsets) in a unified format, retrieve meta-data describing them, and search through them with filters. Datasets are converted from OpenML’s internal format into numpy, scipy or pandas data structures, which are standard for ML in Python. To facilitate contributions from the community, it allows people to upload new datasets in only two function calls, and to define new tasks on them (combinations of a dataset, train/test split and target attribute).

Publishing and retrieving results. Sharing empirical results allows anyone to search and download them in order to reproduce and reuse them in their own research. One goal of OpenML is to simplify the comparison of new algorithms and implementations to existing approaches by comparing to the results on OpenML. To this end we also provide an interface for integrating new machine learning libraries with OpenML and we have already integrated scikit-learn. OpenML-Python can then be used to set up and conduct machine learning experiments for a given task and flow (an ML pipeline including hyperparameters and random states), and publish reproducible results.

Use cases in published works. OpenML-Python has already been used to scale up studies with hundreds of consistently formatted datasets (Feurer et al., 2015; Fusi et al., 2018), supply large amounts of meta-data for meta-learning (Perrone et al., 2018), answer questions about algorithms such as hyperparameter importance (van Rijn and Hutter, 2018) and facilitate large-scale comparisons of algorithms (Strang et al., 2018).

3 High-level Design of OpenML-Python

The OpenML platform is organized around several entity types which describe different aspects of a machine learning study. It hosts datasets, tasks that define how models should be evaluated on them, flows that record the structure and other details of ML pipelines, and runs that record the experiments evaluating specific flows on certain tasks. For instance, an experiment (run) shared on OpenML can show how a random forest (flow) performs on ‘iris’ (dataset) if evaluated with 10-fold cross-validation (task), and how to reproduce that result. In OpenML-Python, all these entities are represented by classes, each defined in their own submodule. This implements a natural mapping from OpenML concepts to Python objects. While OpenML is an online platform, we facilitate offline usage as well.

Plugins. To allow users to automatically run and share machine learning experiments with different libraries through the same OpenML-Python interface, we designed a plugin interface that standardizes the interaction between machine learning library code and OpenML-Python. We also created a plugin for scikit-learn (Pedregosa et al., 2011), as it is one of the most popular Python machine learning libraries. This plugin can be used for any library which follows the scikit-learn API (Buitinck et al., 2013).

A plugin’s responsibility is to convert between a library’s models and OpenML flows, interact with the library’s training interface, and format predictions. For example, the scikit-learn plugin can convert an OpenMLFlow to an Estimator (including hyperparameter settings), train models and produce predictions for a task, and create an OpenMLRun object to upload the predictions to the OpenML server. The plugin also handles advanced procedures, such as scikit-learn’s random search or grid search and uploading its traces (hyperparameters and scores of each model evaluated during search).
We are working on more plugins, and anyone can contribute their own using the scikit-learn plugin implementation as a reference.

SVM hyperparameter contour plot generated by the code in Figure 1.

import openml; import numpy as np
import matplotlib.pyplot as plt
df = openml.evaluations.list_evaluations_setups(
    'predictive_accuracy', flows=[8353], tasks=[6],
    output_format='dataframe', parameters_in_separate_columns=True,
)  # Choose an SVM flow (e.g. 8353), and the dataset 'letter' (task 6).
hp_names = ['sklearn.svm.classes.SVC(16)_C', 'sklearn.svm.classes.SVC(16)_gamma']
df[hp_names] = df[hp_names].astype(float).apply(np.log10)
C, gamma, score = df[hp_names[0]], df[hp_names[1]], df['value']
cntr = plt.tricontourf(C, gamma, score, levels=12, cmap='RdBu_r')
plt.colorbar(cntr, label='accuracy')
plt.xlim((min(C), max(C))); plt.ylim((min(gamma), max(gamma)))
plt.xlabel('C (log10)', size=16); plt.ylabel('gamma (log10)', size=16)
plt.title('SVM performance landscape', size=20)
Figure 1: Code for retrieving the predictive accuracy of an SVM classifier on the ‘letter’ dataset and creating a contour plot with the results.

4 Examples

We show two example uses of OpenML-Python to demonstrate its API’s simplicity. First, we show how to retrieve results and evaluations from the OpenML server in Figure 1 (generating the plot on the right). Second, in Figure 2 we show how to conduct experiments on a benchmark suite (Bischl et al., 2019). Further examples, including how to create datasets and tasks and how OpenML-Python was used in previous publications, can be found in the online documentation; we host the project and its code examples on GitHub.

import openml
import sklearn.tree, sklearn.impute, sklearn.pipeline

# obtain a benchmark suite
benchmark_suite = openml.study.get_suite('OpenML-CC18')
clf = sklearn.pipeline.Pipeline(steps=[
    ('imputer', sklearn.impute.SimpleImputer()),
    ('estimator', sklearn.tree.DecisionTreeClassifier()),
])  # build a sklearn classifier
for task_id in benchmark_suite.tasks:  # iterate over all tasks
    task = openml.tasks.get_task(task_id)  # download the OpenML task
    run = openml.runs.run_model_on_task(clf, task)  # run classifier on splits
    # run.publish()  # upload the run to the server, optional
Figure 2: Training and evaluating a decision tree classifier from scikit-learn on each task of the OpenML-CC18 benchmark suite (Bischl et al., 2019).

5 Project development

The project has been set up for development through community effort from different research groups, and has received contributions from numerous individuals. The package is developed publicly on GitHub, which also provides an issue tracker for bug reports, feature requests and usage questions. To ensure a coherent and robust code base, we use continuous integration for Windows and Linux as well as automated type and style checking. Documentation is also rendered on continuous integration servers and consists of a mix of tutorials, examples and API documentation.

For ease of use and stability, we use well-known and established third-party packages where needed. For instance, we build documentation using the popular sphinx Python documentation generator, use an extension to automatically compile examples into documentation and Jupyter notebooks, and employ standard open-source packages for scientific computing such as numpy, scipy (Virtanen et al., 2019), and pandas (McKinney, 2010). The package is written in Python 3 and open-sourced under a 3-Clause BSD License.

6 Conclusion

OpenML-Python allows easy interaction with OpenML from within Python. It makes it easy for people to share and reuse the data, meta-data, and empirical results which are generated as part of an ML study. This allows for better reproducibility, simpler benchmarking and easier collaboration on ML projects. Our software is shipped with a scikit-learn plugin and has a plugin mechanism to easily integrate other ML libraries written in Python.


MF, NM and FH acknowledge funding by the Robert Bosch GmbH. AK, JvR and FH acknowledge funding by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme under grant no. 716721. JV and PG acknowledge funding by the Data Driven Discovery of Models (D3M) program run by DARPA and the Air Force Research Laboratory. The authors also thank Bilge Celik, Victor Gal and everyone listed at for their contributions.

