About Me

Hi! My name is Janek.

I’m a Data Scientist/Statistician/Researcher/Open Source Developer/…

Officially, I am a Ph.D. student at the Ludwig Maximilians University, in the Computational Statistics working group.


Meetups and Datageeks

I (co-)organize two meetups in Munich: the Munich Datageeks and the Applied R user group. We recently founded a nonprofit organization in Munich, the Munich Datageeks e.V. Our aim is to promote data science in Munich and to connect industry, community, research and government. If you are interested in helping to organize any of the above groups by investing some of your precious spare time, feel free to contact me.


Most of my research is in predictive modeling and general machine learning, especially gradient boosting (tree-based, model-based and distributional), automatic machine learning, Bayesian optimization and machine learning pipeline configuration.

You can find out more about my research on my

  1. University webpage
  2. Google Scholar profile
  3. ResearchGate profile


Journal Articles

J. Thomas, A. Mayr, B. Bischl, M. Schmid, A. Smith, and B. Hofner. Gradient boosting for distributional regression: faster tuning and improved variable selection via noncyclical updates. Statistics and Computing, pages 1–15, 2017.

J. Thomas, T. Hepp, A. Mayr, and B. Bischl. Probing for sparse and fast variable selection with model-based boosting. 2017.

Conference Articles (Peer Reviewed)

H. Kotthaus, J. Richter, A. Lang, J. Thomas, B. Bischl, P. Marwedel, J. Rahnenführer, and M. Lang. RAMBO: Resource-aware model-based optimization with scheduling for heterogeneous runtimes and a comparison with asynchronous model-based optimization. In International Conference on Learning and Intelligent Optimization, pages 180–195. Springer, 2017.

M. Rietzler, F. Geiselhart, J. Thomas, and E. Rukzio. FusionKit: a generic toolkit for skeleton, marker and rigid-body tracking. In Proceedings of the 8th ACM SIGCHI Symposium on Engineering Interactive Computing Systems, pages 73–84. ACM, 2016.

Technical Reports (not Peer Reviewed)

B. Bischl, J. Richter, J. Bossek, D. Horn, J. Thomas, and M. Lang. mlrMBO: A modular framework for model-based optimization of expensive black-box functions. arXiv preprint arXiv:1703.03373, 2017.

J. Schiffner, B. Bischl, M. Lang, J. Richter, Z. M. Jones, P. Probst, F. Pfisterer, M. Gallo, D. Kirchhoff, T. Kühn, J. Thomas, and L. Kotthoff. mlr tutorial, 2016.

Open Source Software

I actively contribute to open source development. Most of my projects are R-based and can be found on GitHub.

Project       Description
mlr           General machine learning framework in R, unified interface for a lot of common machine learning tasks.
mlrMBO        Model-based optimization / Bayesian optimization toolbox.
gamboostLSS   Framework for boosting distributional regression models.
autoxgboost   Automatic tuning and fitting of gradient boosting models utilizing both mlr and mlrMBO. Still in a pretty early stage.
hyperbandr    General and extendable implementation of the hyperband algorithm using R6.
compboost     Model-based boosting framework in C++ exposed to R.

We also have a blog for mlr (and mlrMBO), where updates and projects are collected.


The content of this website is licensed under Creative Commons BY-NC-SA 3.0. The code of the templates and everything not directly specified is licensed under MIT.