Random Forest Baseline¶
Let's try evaluating the `RandomForest` baseline, which uses scikit-learn's random forest.
Running the Benchmark¶
Linux / MacOS¶
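On Linux and MacOS, the benchmark can be run locally. A minimal invocation (shown here as a sketch, assuming the `runbenchmark.py` script and the `randomforest` framework name used throughout this page):

```bash
python runbenchmark.py randomforest
```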
Windows¶
As noted above, we need to install the AutoML frameworks (and baselines) in a container. Add `-m docker` to the command.
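For example, the invocation above would then become (again a sketch, based on the `-m docker` flag described on this page):

```bash
python runbenchmark.py randomforest -m docker
```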
Important
Future examples will only show invocations without `-m docker`, but Windows users will need to run in a non-local mode.
Results¶
After running the command, there will be a lot of output to the screen that reports on what is currently happening. After a few minutes, the final results are shown.
The `result` column denotes the performance of the framework on the test data as measured by the metric listed in the `metric` column. The `result` column always reports performance such that higher is better; metrics that are normally "lower is better" are converted, as indicated by the `neg_` prefix.
While running the command, the AutoML benchmark performed the following steps:
- It created a new virtual environment for the Random Forest experiment. This environment can be found in `frameworks/randomforest/venv` and will be re-used when you perform other experiments with `RandomForest`.
- It downloaded datasets from OpenML, complete with a "task definition" which specifies the cross-validation folds.
- It evaluated `RandomForest` on each (task, fold)-combination in a separate subprocess, where:
    - The framework (`RandomForest`) is initialized.
    - The training data is passed to the framework for training.
    - The test data is passed to the framework to make predictions on.
    - The predictions are passed back to the main process.
- The predictions are evaluated and reported on. They are printed to the console and stored in the `results` directory. There you will find:
    - `results/results.csv`: a file with all results from all benchmarks conducted on your machine.
    - `results/randomforest.test.test.local.TIMESTAMP`: a directory with more information about the run, such as logs, predictions, and possibly other artifacts.
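To take a quick look at these artifacts from a shell (a sketch; the paths are the ones described above, and the timestamped directory name will differ on your machine):

```bash
# List the artifacts produced by the run.
ls results/

# Peek at the aggregated results file.
head results/results.csv
```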
Docker Mode
When using docker mode (with `-m docker`), a Docker image is built that contains the virtual environment. Otherwise, it functions much the same way as local mode.
Important Parameters¶
As you can see from the results above, the default behavior is to execute a short test benchmark. However, we can specify a different benchmark, provide different constraints, and even run the experiment in a container or on AWS. There are many parameters for the `runbenchmark.py` script, but the most important ones are:
Framework (required)¶
- The AutoML framework or baseline to evaluate; the name is not case-sensitive. See integrated frameworks for a list of supported frameworks. In the example above, the benchmarked framework was `randomforest`.
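Since the framework name is not case-sensitive, the following invocations are equivalent (using the baseline from this page):

```bash
python runbenchmark.py randomforest
python runbenchmark.py RandomForest
```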
Benchmark (optional, default='test')¶
- The benchmark suite is the dataset or set of datasets to evaluate the framework on. These can be defined on OpenML as a study or task (formatted as `openml/s/X` or `openml/t/Y`, respectively) or in a local file. The default is a short evaluation on two folds of `iris`, `kc2`, and `cholesterol`.
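For example, to evaluate the baseline on a single OpenML task instead of the default benchmark (a sketch; the task ID is purely illustrative):

```bash
# Replace 59 with the OpenML task ID you want to benchmark on.
python runbenchmark.py randomforest openml/t/59
```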
Constraints (optional, default='test')¶
- The constraints applied to the benchmark, as defined by default in `constraints.yaml`. These include time constraints, memory constraints, the number of available CPU cores, and more. The default constraint is `test` (2 folds for 10 min each).

Constraints are not enforced!
These constraints are forwarded to the AutoML framework if possible but, except for runtime constraints, are generally not enforced. It is advised to benchmark in an environment that mimics the given constraints.

Constraints can be overridden by the benchmark
A benchmark definition can override constraints on a task level. This is useful if you want to define a benchmark which has different constraints for different tasks. The default "test" benchmark does this to limit runtime to 60 seconds instead of 600 seconds, which is useful for getting quick results on its small datasets. For more information, see defining a benchmark.
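For example, to run with a different constraint definition (a sketch; `1h4c` is assumed to be defined in your `constraints.yaml`, and the positional argument order is assumed to match the parameter order listed here):

```bash
# The constraint is the third positional argument, after framework and benchmark.
python runbenchmark.py randomforest test 1h4c
```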
Mode (optional, default='local')¶
- The benchmark can be run in four modes:
    - `local`: install a local virtual environment and run the benchmark on your machine.
    - `docker`: create a Docker image with the virtual environment and run the benchmark in a container on your machine. If a local or remote image already exists, that will be used instead. Requires Docker.
    - `singularity`: create a Singularity image with the virtual environment and run the benchmark in a container on your machine. Requires Singularity.
    - `aws`: run the benchmark on AWS EC2 instances. It is possible to run directly on the instance or have the EC2 instance run in `docker` mode. Requires valid AWS credentials to be configured; for more information, see Running on AWS.
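As a sketch of how the mode is selected (the `-m` flag is the one used earlier for `docker`; `aws` is assumed to use the same flag):

```bash
# Run in a Docker container instead of a local virtual environment.
python runbenchmark.py randomforest -m docker

# Run on AWS EC2 instances (requires configured AWS credentials).
python runbenchmark.py randomforest -m aws
```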
For a full list of parameters available, run:
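Assuming the standard `--help` flag of the script:

```bash
python runbenchmark.py --help
```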