# Summary

The goal of this project is to provide users of the `mlr` package with a way of visualizing what happens during the tuning process that identifies the best hyperparameters for given data. This will enable users to assess the impact of different parameters, and give authors of learning methods pointers to which parameters have an impact in practice and how to improve their approaches.

# Description

Many machine learning algorithms have numerous parameters that need to be set in order to achieve optimal performance on a given data set. Doing this manually is a tedious and error-prone task. The `mlr` package implements not only an interface to dozens of different learning algorithms in R, but also a set of generic hyperparameter optimization methods -- given a learner, its parameters, and data, it will automatically identify the best parameter setting for the particular case.

While good parameter settings can be determined efficiently, `mlr` currently provides no means of visualizing this process. The user is given a result without much explanation of how it was arrived at. Understanding what happens during tuning is not only interesting from the user's point of view, but also crucial for linking the result back to the behaviour of the machine learning algorithm on the data. Such understanding can inform improvements to the particular approach.

This project will create visualizations of hyperparameter tuning for `mlr`. It will allow plotting a hyperparameter against a scoring function, showing the effect of tuning that hyperparameter. It will furthermore include support for plotting multiple hyperparameters and scoring functions, along with ablation analysis (a method for identifying the most important parameters).
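To illustrate the tuning process whose results the visualizations would cover, here is a minimal sketch using `mlr`'s tuning interface; the learner (`classif.rpart`), parameter (`cp`), and data set are illustrative choices, not prescribed by the project:

```r
# Minimal tuning sketch: grid search over rpart's complexity parameter.
# Learner, parameter, and data set here are illustrative assumptions.
library(mlr)

task <- makeClassifTask(data = iris, target = "Species")
lrn <- makeLearner("classif.rpart")

# Search space: a single numeric hyperparameter, evaluated on a grid
ps <- makeParamSet(makeNumericParam("cp", lower = 0.001, upper = 0.1))
ctrl <- makeTuneControlGrid(resolution = 10L)
rdesc <- makeResampleDesc("CV", iters = 3L)

res <- tuneParams(lrn, task = task, resampling = rdesc,
                  par.set = ps, control = ctrl)

res$x         # best parameter setting found
res$opt.path  # the optimization path this project would visualize
```

Each evaluated configuration and its cross-validated performance is recorded in `res$opt.path`, which is the raw material for the plots this project proposes.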
# Technical Details

The path taken from the starting parameter configuration to the end result is stored in an optimization path data structure that is part of the `ParamHelpers` package. This data structure should contain all the necessary information, but may need to be extended to accommodate more detail. The plotting should use `ggplot2`/`ggvis`, in line with the other visualizations in `mlr`. Providing interactive functionality, e.g. through `shiny`, would be desirable.

# Skills Required

Applicants should have:

- Experience using or developing in R, and with development tools such as git.
- Experience with visualization methods.
- A background in computer science or engineering will be beneficial.

# Test

Implement a simple visualization that plots the points on an optimization path with respect to the achieved performance. The [`mlr` tutorial](https://mlr-org.github.io/mlr-tutorial/devel/html/tune/index.html) gives details on how to get started. See also [Visualizing Hyperparameter Optimization by Mason](https://github.com/MasonGallo/model-optimization/blob/master/hyperparam_opt/visualising_hyperparameter_optimization.md).

# Mentors

Bernd Bischl (bernd_bischl@gmx.net) is one of the primary authors of `mlr` and `ParamHelpers` and has mentored for GSoC before.

Lars Kotthoff (larsko@cs.ubc.ca) is one of the primary authors of `mlr` and has mentored for GSoC before.

# References

- [`mlr` package](https://github.com/mlr-org/mlr)
- [`ParamHelpers` package](https://github.com/berndbischl/ParamHelpers)
- [Ablation Analysis](http://www.cs.ubc.ca/labs/beta/Projects/Ablation/)
- [sk-modelcurve](https://github.com/MasonGallo/sk-modelcurve)
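As a starting point for the test task described above, a minimal sketch that converts a `ParamHelpers` optimization path to a data frame and plots evaluated configurations against performance with `ggplot2`; the tuning setup is an illustrative assumption, and `mmce.test.mean` is the column name `mlr` gives the default misclassification-error measure in the path:

```r
# Sketch for the test: plot optimization-path points against performance.
# The tuning setup is illustrative; any opt.path from tuneParams works.
library(mlr)
library(ggplot2)

task <- makeClassifTask(data = iris, target = "Species")
lrn <- makeLearner("classif.rpart")
ps <- makeParamSet(makeNumericParam("cp", lower = 0.001, upper = 0.1))
res <- tuneParams(lrn, task = task,
                  resampling = makeResampleDesc("CV", iters = 3L),
                  par.set = ps,
                  control = makeTuneControlGrid(resolution = 10L))

# The opt.path converts to a data frame with one row per evaluated
# configuration: parameter values, performance, and iteration number (dob)
op <- as.data.frame(res$opt.path)

ggplot(op, aes(x = cp, y = mmce.test.mean)) +
  geom_point() +
  geom_line() +
  labs(x = "cp (complexity parameter)",
       y = "mean misclassification error (CV)")
```

The same data frame also carries the iteration column `dob`, which a richer visualization could map to colour to show how the search progressed over time.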