Sampler Settings

You can select which Bayesian Sampling algorithm–the engine–by using set_engine:

sim.set_engine('reddemcee')

Each engine is managed through two distinct dictionaries. engine_config is passed as named arguments to the sampler itself, and run_config sets options within EMPEROR to interact with the sampler.

reddemcee

An Adaptative Parallel Tempering MCMC algorithm, based on the excellent emcee.

Chain tempering has been shown to be necessary to efficiently sample highly multi-modal posteriors, where instead of sampling the posterior of the distribution, an artificially dampened posterior is sampled. The dampening factor, is the inverse temperature \(\beta\).

The benefit from this is that now the samplers at different temperatures can build proposal densities that are based on chains with other temperatures, and since the walkers in the hotter chains are less constrained, they are less likely to get stuck in regions of the posterior that are much higher than others, bringing confidence to the fact that cold chain (\(\beta\)=1) members have sampled the actual maximum of the posterior and have not gotten trapped in a region of high probability that is not the global maximum.

Another benefit from this method is that with multiple chains at different temperatures, one is able to approximate the Bayesian Evidence, through thermodynamic integration.

More information on reddemcee's options can be found on reddemcee's documentation.

engine_config	type	description
setup	list	[ntemps, nwalkers, nsweeps, nsteps]
betas	(opt) list	The specific temperature ladder to start with.
moves	(opt) list	moves used by the sampler
tsw_history	(opt) bool	Saves the temperature swap rate.
smd_history	(opt) bool	Saves the swap mean distance.
adapt_mode	str	Uses different adaptation schemes.
adapt_tau	float	Ladder adaptation decay timescale.
adapt_nu	float	Ladder adaptation decay rate.
progress	bool	Wheter to display the progress bar.

As an example, after setting the engine, you could change the number of [ntemps, nwalkers, nsweeps, nsteps] to use in the run, the temperature adaptation rate, and the initial inverse temperatures:

sim.engine_config['setup'] = [5, 200, 1000, 1]
sim.engine_config['adapt_nu'] = 0.5
sim.engine_config['betas'] = [1.0, 0.624, 0.3673, 0.3414, 0.]

EMPEROR also has the option to run in batches. After a batch is done, it will check if a pre-determined convergence criteria over the auto-correlation time is met. When this criteria is met, it will cease to adapt it's ladder, and sample a final time. Options pertaining this mode start with 'adaptation' in the following list:

run_config	type	description
adaptation_batches	int	Batches where the ladder adaptation is done.
adaptation_nsweeps	int	Length of each adaptation batch.
adaptation_tol	int	Minimum chain length in tau units.
adaptation_tau_diff	int	Difference in estimated tau between batches.
burnin	float	Drops the first part of the chain.
thin	int	Thins the samples.
logger_level	str	'ERROR', 'CRITICAL', 'DEBUG'

Let's picture we want to run emperor in batches. We will use a maximum of 12 batches, each of length 500. After the stopping criteria has been met, we run the chain for an additional 1000 sweeps, and we burn-in half of those, leaving us with a total of 100,000 samples for the cold-chain (\(200 \cdot 1000 \cdot 0.5\)), we simply add:

sim.run_config['adaptation_batches'] = 12
sim.run_config['adaptation_nsweeps'] = 500
sim.run_config['burnin'] = 0.5  # it is in niter

dynesty

Although the APT method is highly recommended for broad searches in multi-modal phase-spaces, dynesty is an alternative Bayesian posterior sampling engine which uses DNS, Dynamic Nested Sampling, a generalisation of the Standard Nested Sampling (SNS) algorithm where the live-points (akin to MCMC walkers) vary in number to improve sampling efficiency.

dynesty's sampler options can be found on its documentation. In this case, engine_config is used when setting up the sampler, while run_config when running the sampler. Some useful options are:

engine_config	type	description
nlive	int	N livepoints
queue_size	int	for parallelization
bound	str	'none', 'single', 'multi', 'balls', 'cubes'
sample	str	'rwalk', 'auto', 'unif', 'rwalk', 'slice', 'rslice'

For example, if we wanted to use the dynamic nested sampler, with 1500 live-points, multiprocessing, utilising the slice sampling method, with a maximum number of likelihood calls of 100,000:

sim.set_engine('dynesty_dynamic')

sim.engine_config['nlive'] = 1500
sim.engine_config['queue_size'] = sim.cores__  # for multiprocessing
sim.engine_config['sample'] = 'slice'  # 'auto', 'unif', 'rwalk', 'slice', 'rslice'

sim.run_config['maxcall'] = 100000