Correct me if I'm wrong, but it seems the bayesian_optimization.py optimizer in ...

morelandjs · on Feb 28, 2021

I haven't looked at his source, but usually there's a white noise term in the covariance structure of the Gaussian process regressor that adds some statistical uncertainty at each evaluation point. So even after it evaluates a point in parameter space, the GPR is still somewhat uncertain about the value of the objective function at that point. It should therefore balance exploration against exploitation with that statistical uncertainty taken into account.
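To make the white noise point concrete, here is a minimal NumPy sketch. It is not taken from the library under discussion; the RBF kernel, unit length scale, six evenly spaced samples, and the `noise_var` values are all assumptions chosen for illustration. It compares the posterior standard deviation at already-sampled points with and without a noise term on the diagonal of the covariance matrix.

```python
import numpy as np

def rbf(a, b, length_scale=1.0):
    # Squared-exponential kernel with unit prior variance (an assumed choice).
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def posterior_std(x_train, x_query, noise_var):
    # GP posterior std of the latent function at x_query.
    # noise_var is the white noise term added to the diagonal of K.
    # The posterior variance does not depend on the observed y values.
    K = rbf(x_train, x_train) + noise_var * np.eye(len(x_train))
    K_s = rbf(x_query, x_train)
    var = 1.0 - np.einsum("ij,ji->i", K_s, np.linalg.solve(K, K_s.T))
    return np.sqrt(np.clip(var, 0.0, None))  # clip tiny negative round-off

x_train = np.arange(6, dtype=float)  # hypothetical sampled points

# Query exactly at the sampled points.
std_noise_free = posterior_std(x_train, x_train, noise_var=1e-10)  # jitter only
std_noisy = posterior_std(x_train, x_train, noise_var=0.1)

print("noise-free:", std_noise_free)
print("with noise:", std_noisy)
```

With only numerical jitter, the standard deviation at sampled points collapses to essentially zero; with a noise term it stays clearly positive, so an acquisition function can still assign value to re-sampling those points.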

plaidfuji · on Feb 28, 2021

I would be very disappointed if that were the case. No, it looks like it’s set up to capture variance. The BO algorithm wraps an “Expected Improvement Optimizer”:

https://github.com/SimonBlanke/Gradient-Free-Optimizers/blob...

This selects new points based on both the model’s mean estimate and its variance. See around line 58.

protoplaid · on Feb 28, 2021

line 62: exp_imp[sigma == 0.0] = 0.0

I'm afraid it never samples a point more than once, since it estimates already-sampled points as having zero variance, and therefore zero expected improvement.
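A rough sketch of the effect, assuming the standard closed-form expected improvement for maximization. This is not the library's actual code; the function name, the `xi` parameter, and the sample numbers are hypothetical. Only the final masking line mirrors the `exp_imp[sigma == 0.0] = 0.0` statement quoted above.

```python
import math
import numpy as np

def expected_improvement(mu, sigma, best, xi=0.0):
    # Closed-form EI for maximization (assumed convention).
    mu = np.atleast_1d(np.asarray(mu, dtype=float))
    sigma = np.atleast_1d(np.asarray(sigma, dtype=float))
    improvement = mu - best - xi
    # Avoid dividing by zero; those entries are overwritten below.
    safe_sigma = np.where(sigma > 0.0, sigma, 1.0)
    z = improvement / safe_sigma
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    exp_imp = improvement * cdf + sigma * pdf
    exp_imp[sigma == 0.0] = 0.0  # same masking as the quoted line
    return exp_imp

# Two candidates with the same predicted mean as the best value so far:
# the first was already sampled (sigma = 0), the second keeps some uncertainty.
ei = expected_improvement(mu=[1.0, 1.0], sigma=[0.0, 0.3], best=1.0)
print(ei)
```

The already-sampled candidate gets an expected improvement of exactly zero, so it cannot win the acquisition step against any candidate with positive expected improvement, however noisy its single measurement was.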

IMHO that's wrong. The variance of a single sample should be infinite (classical statistics), or similar to the variance of nearby points (Bayesian + model), or some predefined prior (not a great idea... I'd prefer some automatic method). But not zero.

plaidfuji · on Feb 28, 2021

Ah, good catch. So in the event the GPR predicts zero variance, the optimizer says EI is zero and thus won’t sample that point again. That may depend on the settings of the GPR. If I’m not mistaken, there are ways for a GPR to model noise and not collapse to zero variance at every sampled point.

Anyway, I guess I stand by my original suggestion that BO is the best tool for gradient-free optimization with slow and noisy function evaluations, but to my knowledge nobody has built a way to dial in the hyperparameters automatically. And there are quite a few. Entire companies exist for this purpose; SigOpt comes to mind.