What is TPESample?
understand TPE as a search algorithm that uses kernel density estimation. This implementation can be found in Optuna and Hyperopt, with a class called TPESampler in Optuna. So, what kind of algorithm is kernel density estimation?
Kernel Density Estimation
Kernel density estimation is a non-parametric method for estimating probability density functions. Kernels (typically smooth functions like Gaussian functions) are placed around sample data points, and by summing these kernels across all data points, the overall probability density is estimated.
Implementation in Optuna
As you can see in Optuna Documantation,
https://optuna.readthedocs.io/en/stable/reference/samplers/generated/optuna.samplers.TPESampler.html
It says that
On each trial, for each parameter, TPE fits one Gaussian Mixture Model (GMM) l(x)
to the set of parameter values associated with the best objective values, and another GMM g(x)
to the remaining parameter values. It chooses the parameter value x
that maximizes the ratio l(x)/g(x)
.
They use Gaussian Mixture Model (GMM), not Kernel Density Estimation (KDE).
I could not understand exact reason, but what I can say is that it looks completely different method is chosen in their implementation, because the GMM is completely parametric model, and KDE is non-parametric. Therefore, I am not sure if we can regard the TPESampler as TPE.