Why does t0 estimate well in PMwG?

There has been substantial research and debate over the non-decision time parameter of evidence accumulation models. For LBA and diffusion models, there is often a discrepancy, where diffusion models show longer t0 times, with the LBA known to poorly estimate this parameter. Researchers are aware of this, however, the issue is largely ignored. Research into this parameter has used neural measurement data and muscular activity to provide useful estimates of t0 (i.e. the upper and lower limits of the parameter) to give a sense of how accurate this estimation is.

Recently, following a paper by Gunawan et al., (2020), a new method of Bayesian hierarchical estimation was outlined which shows more reliable parameter estimation. Following several modelling exercises using this method (known as Particle Metropolis within Gibbs - PMwG), it was found that t0 estimates were much improved from previous DE-MCMC sampling methods. “Improved” being that estimates of t0 often centered around 0.01 seconds, whereas in PMwG, this was closer to 0.1 seconds - much more reasonable and in line with literature. So maybe it wasn’t so much a problem with the model, but rather a problem with the model estimation. In this blog post, i explore why this might be.

First, parameter recovery

First of all, I’ll make sure this holds in a parameter recovery exercise

  • after all, we would like to know if the LBA is still doing a not-so-great job of figuring out t0. In this simulation exercise, I show t0 recovering at 3 different values for a single t0 LBA model (i.e. one t0 for all simulated conditions) and 3 lots of t0 values for a three t0 model (i.e. each condition has a t0 value). In the code below, small, medium and large refer to the input t0 parameters (small = fast t0, large = slow t0).

Single t0

  t0
Actual.t0 0.2000000
Recovered 0.2254268

Large t0

  t0
Actual.t0 0.1000000
Recovered 0.1715356

Medium t0

  t0
Actual.t0 0.0300000
Recovered 0.0836326

Small t0

Large

Medium

t0 varies with conds

  t0
Actual.t01 0.1500000
Actual.t02 0.2000000
Actual.t03 0.2500000
Recovered.t01 0.1227897
Recovered.t02 0.1991755
Recovered.t03 0.1987277

Large t0

  t0
Actual.t01 0.0500000
Actual.t02 0.1000000
Actual.t03 0.1500000
Recovered.t01 0.0733236
Recovered.t02 0.1138871
Recovered.t03 0.1758666

Medium t0

  t0
Actual.t01 0.0300000
Actual.t02 0.0600000
Actual.t03 0.0900000
Recovered.t01 0.0507706
Recovered.t02 0.0972373
Recovered.t03 0.1111606

Small t0

So, it looks like t0 recovers relatively well, but maybe overestimates smaller values. This means that the LBA may still not perfectly estimate actual t0 values, but could also come from the variance in the individual subject synthetic parameters. One thing is for sure though, t0 is recovered at reasonable values compared to old DE-MCMC. So what could help this estimation method?

Log Transform

One answer to this question is the log transformation of the parameter vector. This was proposed in Gunawan et al., (2020) so that values drawn from PMwG, which are on the real number line, could be used with the LBA

  • which requires positive-definite values. Hence, when using the LBA in PMwG, we take the exponent of the proposal parameters to calculate the likelihood - where we return the log of the likelihood.

Essentially by doing the log transform of the parameters, all particle values are sampled from the real number line. This makes it starightforward to specify their joint distribution as a multivariate normal (with full covariance matrix structure). Previously, we would’ve assumed that the prior joint distribution of the parameters was an uncorrelated, truncated (at zero) univariate normal. In this new approach, with the log transform (which in practice is just taking the exponential of the values on the real line), there are two key advantages; increasing sampling efficiency (sampling from the real line) and using more informed prior knowledge. further, previous methods could lead to overconfidence in the precision of estimation, and underestimation of the magnitide of individual differences - which could be key in estimating t0. With the covariance matrix able to be estimated, this links to the next section, but also, the log transform is necessary to better estimate this structure.

To test the log transform, I use the Forstmann et al., (2008) dataset reported in Gunawan et al., (2020) for a 3 threshold LBA. Using the PMwG sampler, I fit the model twice, using varying likelihood functions - one which takes the exponent of the proposed values and one which returns bad likelihoods for proposed values below 0 - i.e. method one is the log transformed way, method two is the untransformed (but protected against negative values) way.

The results are shown below.

  t0
Log Transformed 0.1207189
Truncated values 0.1004302

t0 estimated values on the exponential scale (normal way) and on the real number line (logged)

alpha values

covariance matrix

In this section, I investigate whether the covariance matrix could be a main cause for better estimates of t0. As mentioned earlier, PMwG samples from the real line, making it easy to specify the parameters (particles) distribution as an unconstrained multivariate normal distribution with full covariance structure. Using the particle approach has an advantage over other MCMC methods as we can jointly estimate the density of parameters, which enables the covariance matrix to be informed, which then constrains proposed particles.

To test whether the covariance matrix is a main cause for more accurate t0 estimation, I again ran the above model of the Forstmann et al., (2008) data set (with the log transform) over two different iterations. The first iteration ran PMwG as the R package defaults too. In the second iteration, I changed the v-half parameter, which is the hyperparameter on Σ prior (Half-t degrees of freedom). We tend to use v-half = 1. But to constrain the covariance matrix, I set v-half to 1000, essentially rendering the matrix useless.

The following shows the results of this change.

  t0
Exponential with Covariance 0.1207189
Without Covariance 0.1058109

t0 estimated with and without the covariance structure (using a high v-half)

alpha values

Evidently, removing the restraint from the covariance matrix leads to less reliable sampling and worse t0 estimates. Evidently, it is not directly one reason that t0 estimates more reliably in PMwG, but a combination of things.

a comment on hierarchical shrinkage

Evidently, although the sampler does a good job in recovering most values of t0, we still see some hierarchical shrinkage, especially with larger t0 values. This is to be expected with hierarchical Bayesian sampling models, however, should still be considered when reporting results.

References

Forstmann, B. U., Dutilh, G., Brown, S., Neumann, J., Von Cramon, D. Y., Ridderinkhof, K. R., & Wagenmakers, E. J. (2008). Striatum and pre-SMA facilitate decision-making under time pressure. Proceedings of the National Academy of Sciences, 105(45), 17538-17542.

Gunawan, D., Hawkins, G. E., Tran, M. N., Kohn, R., & Brown, S. D. (2020). New estimation approaches for the hierarchical Linear Ballistic Accumulator model. Journal of Mathematical Psychology, 96, 102368.

Read More

Joint Modelling in PMwG

Joint Modelling

Cognitive modelling has become ubiquitous within cognitive psychology, with greater computational resources and more efficient techniques enabling the uptake of modelling for a variety of tasks. This is exemplified in decision-making modelling literature, where models of decision-making have progressed rapidly since being first proposed. In recent years, such advances in modelling techniques has led to a new type of modelling technique; joint modelling.

Joint models come in several flavours, however, the main idea underpinning them is that they use data from multiple sources. This could be in the form of two behavioural tasks or neural data, which can help to better inform or constrain the model. PMwG is perfectly setup for joint modelling applications, due to the easily estimated covariance matrix which can act as a linking function, or can be used in covariate models. This blog post will show a three brief tutorials of how to conduct joint modelling exercises within PMwG, with a focus on data structure and writing the log-likelihood function.

Read More

PMWG version 0.2.0 released

The second CRAN release of the PMWG package is now available for download and includes some new functionality, improvements in the sampling stage and some minor bug fixes. I’ll cover the major points in this post, but further details are available in the changelog for the package.

Read More