Sampler Settings

You can select the Bayesian sampling algorithm (the engine) with set_engine:

sim.set_engine('reddemcee')

Each engine is managed through two distinct dictionaries. engine_config is passed as named arguments to the sampler itself, and run_config sets options within EMPEROR to interact with the sampler.
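To make the split concrete, here is a minimal sketch of how two such dictionaries could be consumed. ToySampler and run_engine are hypothetical stand-ins written for this illustration, not EMPEROR's or any sampler's actual API; only the names engine_config and run_config come from the text above.

```python
class ToySampler:
    """Hypothetical stand-in for a sampler; not a real EMPEROR or reddemcee class."""

    def __init__(self, **kwargs):
        # Everything in engine_config arrives here as named arguments.
        self.options = kwargs


def run_engine(engine_config, run_config):
    """Hypothetical driver showing which dictionary goes where."""
    sampler = ToySampler(**engine_config)   # engine_config -> sampler constructor
    burnin = run_config.get('burnin', 0.0)  # run_config -> read by the driver itself
    return sampler, burnin


sampler, burnin = run_engine({'progress': True}, {'burnin': 0.5})
print(sampler.options, burnin)
```

The point of the sketch is only the direction of flow: one dictionary is unpacked into the sampler, the other is read by the code that drives it.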

reddemcee

An Adaptive Parallel Tempering (APT) MCMC algorithm, based on the excellent emcee.

Chain tempering has been shown to be necessary to efficiently sample highly multi-modal posteriors. Instead of sampling the posterior itself, an artificially dampened posterior is sampled. The dampening factor is the inverse temperature \(\beta\).
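In symbols, and assuming the common convention in which only the likelihood is tempered (the text here does not state which convention reddemcee follows), the distribution sampled at inverse temperature \(\beta\) is

\[
p_\beta(\theta \mid D) \;\propto\; \mathcal{L}(\theta)^{\beta}\,\pi(\theta), \qquad 0 \le \beta \le 1,
\]

where \(\mathcal{L}\) is the likelihood, \(\pi\) the prior, \(\theta\) the parameters and \(D\) the data (symbols introduced here for illustration). Under this convention \(\beta = 1\) recovers the posterior and \(\beta = 0\) the prior.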

The benefit is that samplers at different temperatures can build proposal densities based on chains at other temperatures. Since the walkers in the hotter chains are less constrained, they are less likely to get stuck in an isolated high-probability region. This increases confidence that the cold chain (\(\beta = 1\)) members have sampled the actual maximum of the posterior, rather than being trapped in a region of high probability that is not the global maximum.
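The exchange of information between temperatures is usually implemented as a position swap between two chains. The sketch below shows the textbook parallel-tempering swap test, assuming likelihood tempering; swap_accepted is a name chosen here, and reddemcee's own swap scheme may differ in its details.

```python
import math
import random


def swap_accepted(beta_i, beta_j, logl_i, logl_j, rng=random):
    """Standard parallel-tempering swap test between two chains.

    beta_i, beta_j: inverse temperatures of the two chains.
    logl_i, logl_j: log-likelihoods of their current positions.
    """
    log_ratio = (beta_i - beta_j) * (logl_j - logl_i)
    # Accept with probability min(1, exp(log_ratio)).
    return log_ratio >= 0 or rng.random() < math.exp(log_ratio)


# A cold chain (beta=1) always accepts a better position found by a hotter chain.
print(swap_accepted(1.0, 0.5, logl_i=-10.0, logl_j=-2.0))
```

This is how a good region discovered by a hot, weakly constrained chain can migrate down the ladder to the cold chain.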

Another benefit of this method is that, with multiple chains at different temperatures, one can approximate the Bayesian evidence through thermodynamic integration.
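Thermodynamic integration estimates the log-evidence as the integral of the mean log-likelihood over \(\beta\) from 0 to 1. The following is a minimal sketch using the trapezoidal rule; log_evidence_ti is a hypothetical helper, the numbers are made up, and reddemcee may use a different quadrature or error estimate.

```python
def log_evidence_ti(betas, mean_logls):
    """Trapezoidal estimate of log Z = integral_0^1 <log L>_beta d(beta)."""
    # Sort by beta so the ladder can be given hot-to-cold or cold-to-hot.
    pairs = sorted(zip(betas, mean_logls))
    total = 0.0
    for (b0, l0), (b1, l1) in zip(pairs, pairs[1:]):
        total += 0.5 * (l0 + l1) * (b1 - b0)
    return total


# Toy numbers (made up for illustration): mean log-likelihood per temperature.
betas = [1.0, 0.5, 0.0]
mean_logls = [-1.0, -2.0, -3.0]
print(log_evidence_ti(betas, mean_logls))  # -2.0
```

The estimate only covers the full integral if the ladder spans \(\beta = 0\) to \(\beta = 1\), and its accuracy depends on how finely the ladder resolves the region where the mean log-likelihood changes quickly.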

More information on reddemcee's options can be found in reddemcee's documentation.

| engine_config | type | description |
|---|---|---|
| setup | list | [ntemps, nwalkers, nsweeps, nsteps] |
| betas (opt) | list | The specific temperature ladder to start with. |
| moves (opt) | list | Moves used by the sampler. |
| tsw_history (opt) | bool | Saves the temperature swap rate. |
| smd_history (opt) | bool | Saves the swap mean distance. |
| adapt_mode | str | Selects the ladder adaptation scheme. |
| adapt_tau | float | Ladder adaptation decay timescale. |
| adapt_nu | float | Ladder adaptation decay rate. |
| progress | bool | Whether to display the progress bar. |

As an example, after setting the engine, you could change the setup values [ntemps, nwalkers, nsweeps, nsteps] used in the run, the temperature adaptation rate, and the initial inverse temperatures:

sim.engine_config['setup'] = [5, 200, 1000, 1]
sim.engine_config['adapt_nu'] = 0.5
sim.engine_config['betas'] = [1.0, 0.624, 0.3673, 0.3414, 0.]
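If you would rather generate a starting ladder than type it by hand, a geometric spacing is a common choice. The helper below is a hypothetical sketch, not part of EMPEROR or reddemcee; the spacing ratio and the choice to pin the hottest chain at 0 (mirroring the example above) are assumptions.

```python
def geometric_betas(ntemps, ratio=0.5, include_zero=True):
    """Hypothetical helper: betas = ratio**k for k = 0..ntemps-1, coldest first."""
    betas = [ratio ** k for k in range(ntemps)]
    if include_zero and ntemps > 1:
        betas[-1] = 0.0  # hottest chain at beta = 0, as in the example above
    return betas


ntemps = 5
betas = geometric_betas(ntemps)
assert len(betas) == ntemps  # assumed: one beta per temperature in 'setup'
print(betas)  # [1.0, 0.5, 0.25, 0.125, 0.0]
```

The resulting list could then be assigned to sim.engine_config['betas'] in the same way as the hand-written ladder.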

EMPEROR can also run in batches. After each batch, it checks whether a pre-determined convergence criterion on the auto-correlation time is met. Once this criterion is met, it stops adapting its ladder and samples one final time. Options pertaining to this mode start with 'adaptation' in the following list:

| run_config | type | description |
|---|---|---|
| adaptation_batches | int | Batches where the ladder adaptation is done. |
| adaptation_nsweeps | int | Length of each adaptation batch. |
| adaptation_tol | int | Minimum chain length in tau units. |
| adaptation_tau_diff | int | Difference in estimated tau between batches. |
| burnin | float | Drops the first part of the chain. |
| thin | int | Thins the samples. |
| logger_level | str | 'ERROR', 'CRITICAL', 'DEBUG' |
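The exact stopping rule is not spelled out here. As a rough illustration only, the sketch below combines the two tau-related options in one plausible way: the chain must be at least tol autocorrelation times long, and tau must have changed by no more than tau_diff since the previous batch. The function name, the use of an absolute difference, the cumulative chain length, and the way the two tests are combined are all assumptions, not EMPEROR's implementation.

```python
def batch_converged(chain_length, tau, tau_prev, tol, tau_diff):
    """Hypothetical stopping test evaluated after each adaptation batch."""
    if tau_prev is None:  # first batch: nothing to compare against
        return False
    long_enough = chain_length >= tol * tau       # cf. adaptation_tol
    tau_stable = abs(tau - tau_prev) <= tau_diff  # cf. adaptation_tau_diff
    return long_enough and tau_stable


# Made-up autocorrelation-time estimates after successive 500-sweep batches.
taus = [80.0, 45.0, 41.0, 40.0]
tau_prev = None
for batch, tau in enumerate(taus, start=1):
    if batch_converged(batch * 500, tau, tau_prev, tol=40, tau_diff=2):
        print(f"converged after batch {batch}")
        break
    tau_prev = tau
```

With these made-up numbers the test first passes after the fourth batch, once the chain is long enough and the tau estimate has settled.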

Suppose we want to run EMPEROR in batches, using a maximum of 12 batches, each 500 sweeps long. After the stopping criterion has been met, the chain runs for an additional 1000 sweeps, and we discard half of those as burn-in, leaving a total of 100,000 samples for the cold chain (\(200 \cdot 1000 \cdot 0.5\)). To do so, we simply add:

sim.run_config['adaptation_batches'] = 12
sim.run_config['adaptation_nsweeps'] = 500
sim.run_config['burnin'] = 0.5  # fraction of the iterations (niter) to discard
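The sample count quoted above can be checked with a few lines of arithmetic. cold_chain_samples is a hypothetical helper that follows the formula in the text; it assumes nsteps = 1 and no thinning, as in the example setup.

```python
def cold_chain_samples(nwalkers, nsweeps, burnin):
    """Samples kept in the cold chain after dropping the burn-in fraction."""
    return int(nwalkers * nsweeps * (1.0 - burnin))


print(cold_chain_samples(nwalkers=200, nsweeps=1000, burnin=0.5))  # 100000
```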

dynesty

Although the APT method is highly recommended for broad searches in multi-modal phase-spaces, dynesty is available as an alternative Bayesian posterior sampling engine. It uses Dynamic Nested Sampling (DNS), a generalisation of the Standard Nested Sampling (SNS) algorithm in which the number of live-points (akin to MCMC walkers) varies to improve sampling efficiency.

dynesty's sampler options can be found in its documentation. In this case, engine_config is used when setting up the sampler, while run_config is used when running it. Some useful options are:

| engine_config | type | description |
|---|---|---|
| nlive | int | Number of live-points. |
| queue_size | int | Used for parallelization. |
| bound | str | 'none', 'single', 'multi', 'balls', 'cubes' |
| sample | str | 'auto', 'unif', 'rwalk', 'slice', 'rslice' |

For example, to use the dynamic nested sampler with 1500 live-points, multiprocessing, the slice sampling method, and a maximum of 100,000 likelihood calls:

sim.set_engine('dynesty_dynamic')

sim.engine_config['nlive'] = 1500
sim.engine_config['queue_size'] = sim.cores__  # for multiprocessing
sim.engine_config['sample'] = 'slice'  # 'auto', 'unif', 'rwalk', 'slice', 'rslice'

sim.run_config['maxcall'] = 100000