Optuna – A Hyperparameter Optimization Framework

gillesjacobs · on April 6, 2024

I have extensively tested Optuna and Weights and Biases (WandB) for hyperparameter tuning on multiple task-specific transformer models back in 2020.

Optuna lost out by a long mile back then in feature parity and dashboarding. Optuna did not have hyperband optimization which was and still is one of the best search algos for hyperopt. It looks like it is possible to implement hyperband yourself now, but in the loosely coupled architecture between Sampler and Pruner it's a bit baroque [1].

Anyway back then it was clear WandB was the far superior choice for features, ease of use, experiment tracking and dashboarding. We went with WandB for our lab.

Could be Optuna caught up, but WandB has seen significant development too. Looking at their dashboard docs, it looks meagre compared to what you can do with WandB.

1. https://tech.preferred.jp/en/blog/how-we-implement-hyperband...

bionhoward · on April 6, 2024

I didn’t even know WandB did hyperparameter optimization, I figured it was a neural network visualizer based on 2 minute papers. Didn’t seem like many alternatives out there to Optuna with TPE + SQL storage persistence in conditional continuous & discrete spaces.

Anyway, it’s doable to make a multi objective decide_to_prune function with Optuna, here’s an example https://github.com/optuna/optuna/issues/3450#issuecomment-19...

tkellogg · on April 6, 2024

looks like they have it now

https://optuna.readthedocs.io/en/stable/reference/generated/...

greensh · on April 13, 2024

There is also mlflow in combination with optuna which is great to visualize and dashboard.

3abiton · on April 6, 2024

When did you do this test? I am curious about the current performance gap.

gillesjacobs · on April 7, 2024

Back in 2020, so that's 4 years of active development on both tools since then. Both tools have different scopes though (cf my other comment in the thread).

zwaps · on April 6, 2024

sadly, w&B means you have to upload it to the cloud, which is not possible in every case :(

gillesjacobs · on April 6, 2024

You need to upload your time series (loss, performance metrics) to the cloud. Not your weights or models.

Reliance on cloud services is a legitimate worry though for privacy, IP, process control, reliability, etc.

The comparison between Optuna and WandB was not apples to apples. Optuna is completely self-hosted and local. It also focuses on hyperopt narrowly with flexible design unlike WandB that now assumes to be capture a large part of cloud-based MLOPS workflow.

It would be more fair to compare Optuna to Hyperopt. And I think Optuna was the better choice there, but I did simple PoCing and have no strong opinions.

michaelmior · on April 7, 2024

You don't need to upload any time series data. You can just upload your the final metrics at the end of the run. Admittedly though, W&B loses a lot of its value in that case since you can't really get any insight into runs in progress.

michaelmior · on April 6, 2024

What is the "it" you're referring to? You don't need to upload your model or the weights. You do need to upload the hyperparameters you're optimizing and your target values, but those seem unlikely to be sensitive. (Although I'm sure there are still some legitimate reasons why someone might not want to do so.)

ansgri · on April 6, 2024

Used it for general blackbox optimization some 3 years ago, switched to it from comparatively ancient NOMAD [1]. Worked well and easy enough to suggest it as a default choice for similar problems at the time.

[1] https://www.gerad.ca/en/software/nomad/

nickpsecurity · on April 6, 2024

Another thing you can do is try to optimize hyper parameters with techniques that have fewer or easier hyper parameters. I found several papers where people were using either simulated annealing or differential evolution to optimize the NN’s themselves or the hyperparameters. Some claimed to get good results but with higher, computational cost for that component.

I think even a simple NN with few layers could probably pull it off if you already had categorized the types of data you were training the main model with.

nurettin · on April 7, 2024

Optuna provides a batch API which lets you test a bunch of parameters in parallel. TPE and NSGAII can be used for solving different blackbox optimization problems. They have a pretty dashboard and something to look at when optimizer is running. Overall great library.

If you are in C++ world, I suggest giving nlopt a try. Note that ESCH and ISRES depend on nlopt::srand.

jgalt212 · on April 6, 2024

I'd be curious to check this out. Our shop has a strong bias for models with the least number of hyperparameters.

I will pass along to our ML people.

bbstats · on April 6, 2024

HEBO >>>> everything else